Linked Data can be generated by applying mapping rules to existing (semi-)structured data. Creating these rules manually is costly for users, so (semi-)automatic approaches have been developed to assist them. Although these approaches provide promising results, they do not use the knowledge provided by examples of the desired Linked Data in use cases where such examples are available, resulting in Linked Data that might not be as desired. This in turn requires manual updates of the rules. These examples can in certain cases be easy to create and offer valuable knowledge relevant for the mapping process, such as which data corresponds to entities and attributes, how this data is annotated and modeled, and how different entities are linked to each other. In this paper, we introduce a semi-automatic approach to create rules based on examples for both the existing data and corresponding Linked Data. Furthermore, we made the approach available via the RMLEditor, making it readily accessible for users through a graphical user interface. The proposed approach provides a first attempt to generate a complete Linked Dataset based on user-provided examples, by creating an initial set of rules for the users.
https://pieterheyvaert.com/research/publications/heyvaert_ld4ie_2017/
2. Semantic Web technologies rely on Linked Data, but not all data is accessible as Linked Data.
databases
XML files
Solutions to provide access exist, but results are not always as desired because limited knowledge is used:
data schema
ontology
6. Input data
id | title | author
0 | Harry Potter and The Sorcerer’s Stone | J.K. Rowling
1 | Homo Deus | Yuval Noah Harari
{
"authors": [{
"id": "jkr",
"name": "J.K. Rowling",
"country": "UK",
"birthdate": "1965-07-31"
},{
"id": "ynh",
"name": "Yuval Noah Harari",
"country": "Israel",
"birthdate": "1976-04-24"
}]
}
7. Desired Linked Data
book:0 a schema:Book;
schema:title "Harry Potter and The Sorcerer’s Stone"@en;
schema:author author:jkr.
book:1 a schema:Book;
schema:title "Homo Deus"@en;
schema:author author:ynh.
author:jkr a foaf:Person;
foaf:name "J.K. Rowling";
foaf:country "UK";
schema:birthdate "1965-07-31"^^xsd:date.
author:ynh a foaf:Person;
foaf:name "Yuval Noah Harari";
foaf:country "Israel";
schema:birthdate "1976-04-24"^^xsd:date.
8. Apply rules to generate Linked Data
[Diagram: original data → rules → Linked Data]
rules state how to generate RDF terms and triples using data and ontologies
9. Linked Data example available
book:1 a schema:Book;
schema:title "Homo Deus"@en;
schema:author author:ynh.
author:ynh a foaf:Person;
foaf:name "Yuval Noah Harari";
foaf:country "Israel";
schema:birthdate "1976-04-24"^^xsd:date.
10. Use example to create rules
[Diagram: a sample of the original data and an example of the Linked Data are used to create the rules]
11. Linked Data example aligns with sample of original data
id | title | author
1 | Homo Deus | Yuval Noah Harari
{
"id": "ynh",
"name": "Yuval Noah Harari",
"country": "Israel",
"birthdate": "1976-04-24"
}
book:1 a schema:Book;
schema:title "Homo Deus"@en;
schema:author author:ynh.
author:ynh a foaf:Person;
foaf:name "Yuval Noah Harari";
foaf:country "Israel";
schema:birthdate "1976-04-24"^^xsd:date.
12. Alignment with original data and create rules
book:1 a schema:Book;
schema:title "Homo Deus"@en;
schema:author author:ynh.
author:ynh a foaf:Person;
foaf:name "Yuval Noah Harari";
foaf:country "Israel";
schema:birthdate "1976-04-24"^^xsd:date.
id | title | author
1 | Homo Deus | Yuval Noah Harari
{
"id": "ynh",
"name": "Yuval Noah Harari",
"country": "Israel",
"birthdate": "1976-04-24"
}
id
rule: IRI is “book” + value from column “id”
13. Alignment with original data and create rules
[Same Linked Data example and data sample as on slide 12]
Highlighted: title
rule: literal uses value from column “title”
14. Alignment with original data and create rules
[Same Linked Data example and data sample as on slide 12]
Highlighted: title property
rule: predicate is schema:title
15. Alignment with original data and create rules
[Same Linked Data example and data sample as on slide 12]
Highlighted: type
rule: type of a book is schema:Book
16. Alignment with original data and create rules
[Same Linked Data example and data sample as on slide 12]
Highlighted: other entity
rule: a book is related to its author
17. All rules
IRI is “book” + value from column “id”
Literal uses value from column “title”
Predicate is schema:title
Type of a book is schema:Book
A book is related to its author
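As a rough illustration of how these five rules could be applied to the input data, the following Python sketch generates the triples for every book. It is not the RMLEditor implementation: the function name `books_to_triples`, the tuple representation of triples, and the comma-separated rendering of the table are assumptions made for this example. The `@en` language tag is taken from the desired Linked Data, and the link to the author is resolved by matching names.

```python
import csv
import io
import json

# Input data from the slides; the comma-separated layout is an assumption.
BOOKS_CSV = """id,title,author
0,Harry Potter and The Sorcerer’s Stone,J.K. Rowling
1,Homo Deus,Yuval Noah Harari
"""
AUTHORS_JSON = """{"authors": [
  {"id": "jkr", "name": "J.K. Rowling"},
  {"id": "ynh", "name": "Yuval Noah Harari"}
]}"""


def books_to_triples(books_csv, authors_json):
    """Apply the five rules to every CSV row (hypothetical helper)."""
    authors = json.loads(authors_json)["authors"]
    author_id_by_name = {a["name"]: a["id"] for a in authors}
    triples = []
    for row in csv.DictReader(io.StringIO(books_csv)):
        # Rule: IRI is "book" + value from column "id".
        book = "book:" + row["id"]
        # Rule: type of a book is schema:Book.
        triples.append((book, "rdf:type", "schema:Book"))
        # Rules: predicate is schema:title, literal uses column "title".
        triples.append((book, "schema:title", '"%s"@en' % row["title"]))
        # Rule: a book is related to its author (joined on the name).
        author_id = author_id_by_name.get(row["author"])
        if author_id is not None:
            triples.append((book, "schema:author", "author:" + author_id))
    return triples


for triple in books_to_triples(BOOKS_CSV, AUTHORS_JSON):
    print(triple)
```

Each book row yields three triples here; rows whose author name has no counterpart in the JSON source simply get no `schema:author` triple.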
18. Apply rules to generate all Linked Data
[Diagram: the rules created from the sample and the example are applied to the original data to generate the Linked Data]
19. Linked Data might not be as desired
Rules are prone to errors when created manually
Wrong use of ontology classes, properties, and datatypes
Wrong alignments with original data
Especially when dealing with
large and complex data sources
multiple data sources at the same time
22. Solutions to reduce manual effort when creating rules
Semi-automatic: users provide feedback
Automatic: no user interaction required
23. Current solutions use limited knowledge
Only work with
data schemas
data values
ontologies
Do not consider knowledge embedded in
query workload of Linked Data
Linked Data examples
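31. Align with data sources
[Figure: the Linked Data example drawn as a graph. book:1 (schema:Book) has schema:title "Homo Deus"@en and is linked via schema:author to author:ynh (foaf:Person), which has foaf:name "Yuval Noah Harari", foaf:country "Israel", and schema:birthdate "1976-04-24"^^xsd:date. The two data sources of the sample are shown next to the graph.]
CSV
id | title | author
1 | Homo Deus | Yuval Noah Harari
JSON
{
"id": "ynh",
"name": "Yuval Noah Harari",
"country": "Israel",
"birthdate": "1976-04-24"
}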
32. Align with data sources
[Figure: the same example graph next to the CSV and JSON data sources; one node of the graph is now labeled CSV]
33. Align with data sources
[Figure: the same example graph next to the CSV and JSON data sources; two nodes of the graph are now labeled CSV]
34. Align with data sources
[Figure: the same example graph next to the CSV and JSON data sources; every node is now labeled with the data sources that contain its value. Three nodes match the CSV source and four match the JSON source, with one node matching both.]
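The alignment shown on these slides can be sketched as a value lookup: each node of the example graph is compared against the values in every data source. The Python below is only an illustration under stated assumptions: the names `align`, `SOURCES`, and `EXAMPLE_NODES` are hypothetical, matching is by exact string equality, and IRIs are matched on the part after the prefix.

```python
# One record per data source, taken from the sample on the slides.
SOURCES = {
    "CSV": {"id": "1", "title": "Homo Deus", "author": "Yuval Noah Harari"},
    "JSON": {"id": "ynh", "name": "Yuval Noah Harari",
             "country": "Israel", "birthdate": "1976-04-24"},
}

# Nodes of the example graph with the value used for matching. Using the
# local part of an IRI (after the prefix) is an assumption of this sketch.
EXAMPLE_NODES = {
    "book:1": "1",
    '"Homo Deus"@en': "Homo Deus",
    "author:ynh": "ynh",
    '"Yuval Noah Harari"': "Yuval Noah Harari",
    '"Israel"': "Israel",
    '"1976-04-24"^^xsd:date': "1976-04-24",
}


def align(nodes, sources):
    """Return, per node, the sorted names of the sources containing its value."""
    alignment = {}
    for node, value in nodes.items():
        alignment[node] = sorted(
            name for name, record in sources.items()
            if value in record.values()
        )
    return alignment


ALIGNMENT = align(EXAMPLE_NODES, SOURCES)
for node, matched in ALIGNMENT.items():
    print(node, "->", matched)
```

With the sample data, only the author's name is found in both sources, which is consistent with the single node carrying both labels in the figure.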
35. Select best data source
[Figure: the example graph with every node labeled CSV, JSON, or both]
for each subgraph with an entity
36. Select best data source
[Figure: the example graph with every node labeled CSV, JSON, or both; the schema:Book subgraph is considered]
only CSV data source
Selected: CSV
37. Select best data source
[Figure: the example graph with every node labeled CSV, JSON, or both; the foaf:Person subgraph is considered]
CSV data source matches 1 node
JSON data source matches all nodes
Selected: JSON
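One way to read the selection step is as a count per subgraph: the data source that matches the most nodes wins. The sketch below assumes exactly that criterion; the function name `select_source`, the tie-breaking by name, and the node-to-source assignments (inferred from the sample values) are assumptions of this example.

```python
from collections import Counter

# Data sources matched by each node, per entity subgraph.
BOOK_SUBGRAPH = {"book:1": ["CSV"], "Homo Deus": ["CSV"]}
PERSON_SUBGRAPH = {
    "author:ynh": ["JSON"],
    "Yuval Noah Harari": ["CSV", "JSON"],
    "Israel": ["JSON"],
    "1976-04-24": ["JSON"],
}


def select_source(subgraph):
    """Pick the source that matches the most nodes of one subgraph."""
    counts = Counter(
        source for matched in subgraph.values() for source in matched
    )
    if not counts:
        return None
    # Highest count first; ties are broken by name (an arbitrary choice).
    return min(counts, key=lambda source: (-counts[source], source))


print(select_source(BOOK_SUBGRAPH))
print(select_source(PERSON_SUBGRAPH))
```

For the book subgraph only CSV is a candidate; for the person subgraph JSON matches four nodes against one for CSV.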
40. Create rules for entity
[Figure: entity book:1 of type schema:Book, aligned with the CSV data source]
IRI is “book” + id
type is schema:Book
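The entity rules above can be derived by looking for the record field whose value equals the local part of the example IRI. The following Python sketch shows that idea; `entity_rule`, the `{field}` template notation, and the dictionary layout of the rule are hypothetical choices for this illustration, not the rule format used by the RMLEditor.

```python
def entity_rule(example_iri, rdf_type, record):
    """Derive an IRI template and a type rule from one example entity."""
    prefix, _, local = example_iri.partition(":")
    for field, value in record.items():
        if str(value) == local:
            # The IRI is the prefix followed by the value of the matched field.
            return {"iri_template": prefix + ":{" + field + "}",
                    "type": rdf_type}
    return None  # no field explains the IRI; user input would be needed


CSV_ROW = {"id": "1", "title": "Homo Deus", "author": "Yuval Noah Harari"}
BOOK_RULE = entity_rule("book:1", "schema:Book", CSV_ROW)
print(BOOK_RULE)
```

Applying the template to another row, for instance with `BOOK_RULE["iri_template"].format(id="0")`, yields the IRI of the other book.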
41. Create rules for attribute
[Figure: book:1 (schema:Book) with attribute schema:title "Homo Deus"@en, aligned with the CSV data source]
use predicate schema:title
literal uses value from column title
language of the title is English
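A possible way to obtain the three attribute rules is to split the example literal into its value and its language tag or datatype, and then look the value up in the sample record. The Python below is a sketch under those assumptions; `attribute_rule`, the regular expression, and the rule dictionary are hypothetical and cover only simple Turtle literals without escaped quotes.

```python
import re

# Simple Turtle literal: "value", optionally followed by @lang or ^^datatype.
LITERAL = re.compile(
    r'^"(?P<value>.*)"(?:@(?P<lang>[A-Za-z-]+)|\^\^(?P<datatype>\S+))?$'
)


def attribute_rule(predicate, literal, record):
    """Derive predicate, column, and language/datatype from one example."""
    match = LITERAL.match(literal)
    if match is None:
        return None
    for field, value in record.items():
        if str(value) == match.group("value"):
            return {
                "predicate": predicate,
                "column": field,
                "language": match.group("lang"),
                "datatype": match.group("datatype"),
            }
    return None  # the literal's value does not occur in the record


CSV_ROW = {"id": "1", "title": "Homo Deus", "author": "Yuval Noah Harari"}
TITLE_RULE = attribute_rule("schema:title", '"Homo Deus"@en', CSV_ROW)
print(TITLE_RULE)
```

The same helper handles typed literals such as the birthdate in the example, where the datatype is recorded instead of a language.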
42. Create rules for interlinked entities
[Figure: book:1 (schema:Book, CSV data source) linked via schema:author to author:ynh (foaf:Person, JSON data source)]
use predicate schema:author
join condition: names match
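The join condition "names match" can be found by comparing the sample records of the two data sources field by field. The sketch below assumes exact string equality and returns every pair of fields that share a value; the function name `join_condition` is hypothetical.

```python
def join_condition(child_record, parent_record):
    """Return (child_field, parent_field) pairs whose values are equal."""
    return [
        (child_field, parent_field)
        for child_field, child_value in child_record.items()
        for parent_field, parent_value in parent_record.items()
        if str(child_value) == str(parent_value)
    ]


CSV_ROW = {"id": "1", "title": "Homo Deus", "author": "Yuval Noah Harari"}
JSON_OBJECT = {"id": "ynh", "name": "Yuval Noah Harari",
               "country": "Israel", "birthdate": "1976-04-24"}

JOIN = join_condition(CSV_ROW, JSON_OBJECT)
print(JOIN)
```

With the sample data the only shared value is the author's name, so the book's `author` column is joined with the `name` field of the JSON source. When several pairs are returned, user feedback may be needed to pick the intended one.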
48. Discussion
Advantages
Use knowledge embedded in Linked Data examples
Minimize errors and user interaction
Approach can be combined with other approaches
Disadvantages
Linked Data example is required
User action might still be required for special cases
49. Recap
Use cases can have Linked Data example available.
Example contains knowledge to create rules.
We introduced an approach that uses this knowledge.
This approach can be combined with other approaches.