The SPARQL Anything project
Enrico Daga and Luigi Asprino
The Web Conference - Developers Track
22/04/2021 - online @enridaga
Background
• Semantic Web developers have always been concerned with methods to “lift” legacy content to RDF:
• Targeting specific types/formats: SPARQL Microservices [Michel, 2019], Tarql, Any23, JSON2RDF, CSV2RDF
• Mapping languages of several types (e.g. RML, ShExML): high learning demands. [Dimou, 2014] [García-González, 2020]
• SPARQL Generate: learning demands, difficult to extend to other formats. [Lefrançois, 2017]
• Solutions transfer data source complexity to the user (e.g. knowing XPath for XML, JSONPath for JSON, …)
• End-user development [Lieberman, 2006]. Many SPARQL users fall into the category of end-user developers. In a recent survey, 42% of SPARQL users were from non-IT areas, including social sciences and the humanities, business and economics, and biomedical, engineering or physical sciences.
SPICE
Social Cohesion, Participation and Inclusion
through Cultural Engagement
Polifonia
Digital Harmoniser of Musical Cultural Heritage
Cultural Heritage Knowledge Graphs:
sources in different formats × multiple / unknown ontologies = duplication of effort!
https://spice.kmi.open.ac.uk/
http://spice-h2020.eu
https://polifonia-project.eu/
This project has received funding from the European
Union’s Horizon 2020 research and innovation
programme
Knowledge Graph Construction
Composite process:
• Observe: the data source (e.g. a CSV file)
• Map: develop mappings to a target ontology
• Triplify: run the mappings and evaluate the result
• (many iterations)
KG construction is a twofold job:
• perform a syntax/structure conversion (e.g. from CSV to RDF)
• project semantics onto the data (applying a domain ontology)
Concept
… twofold job:
• perform a syntax/structure conversion -> Re-engineering
• We want to solve this problem once and for all
• project semantics onto the data (applying a domain ontology) -> Re-modelling
• We leave this to the end user, powered by SPARQL 1.1
• Approach: design a single RDF facade for any data format
• Re-engineering
• Focus on the syntax and the meta-model (data structure)
• Leave the data as-is as much as possible!
• apply the least possible “ontological commitment”
https://en.wikipedia.org/wiki/Facade_pattern
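The split between the two jobs can be sketched in plain Python. This is only an illustration of the separation of concerns, not how SPARQL Anything is implemented: the function names, the trimmed four-column sample, and the schema.org target properties are all assumptions made for the example. In the project itself, the second step is meant to be written as a SPARQL 1.1 query.

```python
import csv
import io

# Hypothetical target vocabulary, chosen only for illustration.
SCHEMA = "https://schema.org/"


def reengineer(csv_text):
    """Step 1, format-specific and written once: CSV text -> generic rows.

    Each row becomes a dict keyed by column name; values stay plain strings,
    so no domain meaning is attached yet.
    """
    return list(csv.DictReader(io.StringIO(csv_text)))


def remodel(rows):
    """Step 2, domain-specific and left to the user: rows -> domain triples."""
    triples = []
    for row in rows:
        subject = row["url"]
        triples.append((subject, SCHEMA + "name", row["name"]))
        if row["yearOfBirth"]:  # skip empty cells
            triples.append((subject, SCHEMA + "birthDate", row["yearOfBirth"]))
    return triples


# A trimmed version of the Tate artist row used later in the slides.
sample = (
    "id,name,yearOfBirth,url\n"
    '10093,"Abakanowicz, Magdalena",1930,'
    "http://www.tate.org.uk/art/artists/magdalena-abakanowicz-10093\n"
)
domain_triples = remodel(reengineer(sample))
```

Only `remodel` knows anything about artists; `reengineer` would be reused unchanged for any other CSV file.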
An RDF Facade?
Problem Space
• CSV
• JSON
• HTML
• XML
• Binary (JPEG, PNG, …)
• Text
Solution Space
• https://www.w3.org/TR/rdf11-concepts/
• https://www.w3.org/TR/rdf-schema/
rdf:type, rdf:Property, rdfs:label,
rdfs:Resource, rdfs:Class, rdf:Bag,
rdfs:Container, rdf:List, RDF Dataset,
Graph, …
Facade-X: (to be filled by picking and mixing from the solution space)
Oops! We are facing the same old problem … only this time we don’t care about the content (domain) and we only focus on the format and data structure (meta-model)
CSV
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:Root a rdfs:Class .
id,name,gender,dates,yearOfBirth,yearOfDeath,placeOfBirth,placeOfDeath,url
10093,"Abakanowicz, Magdalena",Female,born 1930,1930,,Polska,,http://www.tate.org.uk/art/artists/magdalena-abakanowicz-10093
…
https://github.com/tategallery/collection/blob/master/artist_data.csv
[ a fx:root ;
rdf:_1 [ xyz:dates "born 1930" ;
xyz:gender "Female" ;
xyz:id "10093" ;
xyz:name "Abakanowicz, Magdalena" ;
xyz:placeOfBirth "Polska" ;
xyz:placeOfDeath "" ;
xyz:url "http://www.tate.org.uk/art/artists/magdalena-abakanowicz-10093" ;
xyz:yearOfBirth "1930" ;
xyz:yearOfDeath ""
] ;
csv.headers=true|false
[ a fx:root ;
rdf:_1 [ rdf:_1 "id" ;
rdf:_2 "name" ;
rdf:_3 "gender" ;
rdf:_4 "dates" ;
rdf:_5 "yearOfBirth" ;
rdf:_6 "yearOfDeath" ;
rdf:_7 "placeOfBirth" ;
rdf:_8 "placeOfDeath" ;
rdf:_9 "url"
] ;
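The effect of `csv.headers` on the shape of the facade can be mimicked with a few lines of Python. This is a rough sketch under stated assumptions: nested dicts stand in for blank nodes, `csv_to_facade` and its `headers` parameter are hypothetical names, and the sample is a three-column cut of the Tate row above.

```python
import csv
import io

XYZ = "http://sparql.xyz/facade-x/data/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def csv_to_facade(csv_text, headers=True):
    """Return a nested dict mimicking the Facade-X shape shown above.

    The root maps rdf:_1, rdf:_2, ... to one container per row. With
    headers=True the first line supplies the xyz: property names; otherwise
    every line is a row and its cells are keyed by rdf:_1, rdf:_2, ...
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    names = rows.pop(0) if headers and rows else None
    root = {}
    for i, row in enumerate(rows, start=1):
        if names:
            container = {XYZ + name: cell for name, cell in zip(names, row)}
        else:
            container = {RDF + f"_{j}": cell for j, cell in enumerate(row, start=1)}
        root[RDF + f"_{i}"] = container
    return root


sample = 'id,name,gender\n10093,"Abakanowicz, Magdalena",Female\n'
with_headers = csv_to_facade(sample, headers=True)
without_headers = csv_to_facade(sample, headers=False)
```

With `headers=False` the header line is just another row, which is why the second Turtle excerpt starts with a container holding "id", "name", "gender", and so on.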
JSON
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:Root a rdfs:Class .
xsd:int a rdfs:Datatype.
xsd:string a rdfs:Datatype.
xsd:boolean a rdfs:Datatype.
xsd:decimal a rdfs:Datatype.
xsd:float a rdfs:Datatype.
xsd:double a rdfs:Datatype.
https://github.com/tategallery/collection/artworks/t/023/t02319-9205.json
[ a fx:root ;
xyz:acno "T02319" ;
xyz:acquisitionYear "1978"^^<http://www.w3.org/2001/XMLSchema#int> ;
xyz:all_artists "Kazimir Malevich" ;
xyz:catalogueGroup [] ;
xyz:classification "painting" ;
xyz:contributorCount "1"^^<http://www.w3.org/2001/XMLSchema#int> ;
…
{
"acno": "T02319",
"acquisitionYear": 1978,
"all_artists": "Kazimir Malevich",
"catalogueGroup": {},
"classification": "painting",
"contributorCount": 1,
"contributors": [
{
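A possible way to read the JSON mapping is as a recursive walk: object keys become `xyz:` properties, array items become `rdf:_n` slots, and scalars become typed literals. The sketch below follows that reading. Several details are assumptions rather than facts from the slides: the helper names, the choice of `xsd:double` for floats, dropping `null` values, and requiring the top-level value to be an object or array. It only uses keys that are visible in the excerpt above.

```python
import itertools
import json

XYZ = "http://sparql.xyz/facade-x/data/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD = "http://www.w3.org/2001/XMLSchema#"

_ids = itertools.count()


def _blank():
    """Mint a fresh blank-node label."""
    return f"_:b{next(_ids)}"


def _literal(value):
    """Map a JSON scalar to a (lexical form, datatype IRI) pair."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return (str(value).lower(), XSD + "boolean")
    if isinstance(value, int):
        return (str(value), XSD + "int")
    if isinstance(value, float):
        return (repr(value), XSD + "double")  # assumption: floats -> xsd:double
    return (str(value), XSD + "string")


def walk(value, node, triples):
    """Append triples describing a JSON object or array rooted at `node`."""
    if isinstance(value, dict):
        items = [(XYZ + key, v) for key, v in value.items()]
    else:  # a list
        items = [(RDF + f"_{i}", v) for i, v in enumerate(value, start=1)]
    for predicate, v in items:
        if isinstance(v, (dict, list)):
            child = _blank()
            triples.append((node, predicate, child))
            walk(v, child, triples)
        elif v is not None:  # assumption: JSON null yields no triple
            triples.append((node, predicate, _literal(v)))


def json_to_facade(text):
    """Parse JSON text (object or array at the top level) into triples."""
    triples = []
    root = _blank()
    walk(json.loads(text), root, triples)
    return root, triples


sample = (
    '{"acno": "T02319", "acquisitionYear": 1978, '
    '"catalogueGroup": {}, "classification": "painting"}'
)
root, triples = json_to_facade(sample)
```

The empty `catalogueGroup` object produces a single triple pointing at a fresh blank node, matching `xyz:catalogueGroup []` in the Turtle excerpt.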
DOM (HTML, XML, …)
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:Root a rdfs:Class .
xsd:int a rdfs:Datatype.
xsd:string a rdfs:Datatype.
xsd:boolean a rdfs:Datatype.
xsd:decimal a rdfs:Datatype.
xsd:float a rdfs:Datatype.
xsd:double a rdfs:Datatype.
rdf:type rdf:type rdf:Property
https://imma.ie/artists/
[ a fx:root , xhtml:div ;
xhtml:id "az-group" ;
rdf:_1 [ a xhtml:div ;
rdf:_1 [ a xhtml:h4 ;
rdf:_1 "A" ;
<https://html.spec.whatwg.org/#innerHTML>
"A" ;
<https://html.spec.whatwg.org/#innerText>
"A"
] ;
…
html.selector=#az-group
@prefix xhtml: <http://www.w3.org/1999/xhtml#> .
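The DOM case follows the same idea: the element name gives the type, attributes become properties, and child nodes (including text) fill the `rdf:_n` slots in document order. The sketch below approximates this with Python's standard `html.parser`. It is deliberately limited and makes several assumptions: the class and helper names are hypothetical, every element must be explicitly closed (void elements such as a bare `<br>` would confuse the stack), there is no equivalent of `html.selector`, and the `innerHTML` / `innerText` properties from the excerpt are not reproduced.

```python
from html.parser import HTMLParser

XHTML = "http://www.w3.org/1999/xhtml#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class FacadeBuilder(HTMLParser):
    """Build nested dicts: element -> type, attributes -> properties,
    children and text -> an ordered list (position n stands for rdf:_n)."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = {
            "a": XHTML + tag,
            "attributes": {XHTML + name: value for name, value in attrs},
            "children": [],
        }
        if self._stack:
            self._stack[-1]["children"].append(node)
        elif self.root is None:
            self.root = node
        self._stack.append(node)

    def handle_endtag(self, tag):
        # Assumes well-formed input: the closing tag matches the innermost element.
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            self._stack[-1]["children"].append(text)


def slots(node):
    """Expose a node's children as rdf:_1, rdf:_2, ... keys."""
    return {RDF + f"_{i}": child for i, child in enumerate(node["children"], start=1)}


builder = FacadeBuilder()
builder.feed('<div id="az-group"><div><h4>A</h4></div></div>')
facade = builder.root
```

Here `facade` is the outer `div`; following `rdf:_1` twice reaches the `h4`, whose own `rdf:_1` slot holds the text "A".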
Binary and Text
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:Root a rdfs:Class .
xsd:int a rdfs:Datatype.
xsd:string a rdfs:Datatype.
xsd:boolean a rdfs:Datatype.
xsd:decimal a rdfs:Datatype.
xsd:float a rdfs:Datatype.
xsd:double a rdfs:Datatype.
xsd:base64Binary a rdfs:Datatype.
rdf:type rdf:type rdf:Property
[ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> "/9j/
4AAQSkZJRgABAQEASABIAAD/
4QmsRXhpZgAASUkqAAgAAAALAA8BAgAGAAAAkgAAABABAgAOAAAAmAAAABIBAw
ABAAAAAQAAABoBBQABAAAApgAAABsBBQABAAAArgAAACgBAwABAAAAAgAAADEB
AgALAAAAtgAAADIBAgAUAAAAwgAAABMCAwABAAAAAgAAAGmHBAABAAAA1gAAAC
WIBAABAAAA0gMAAOQDAABDYW5vbgBDYW5vbiBFT1MgNDBEAEgAAAABAAAASAAA
AAEAAABHSU1QIDIuNC41AAAyMDA4OjA3OjMxIDEwOjM4OjExAB4Am…"^^<http://www.w3.org/2001/XMLSchema#base64Binary> ] .
bin.encoding # BASE64
txt.regex # tokenise into a sequence
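Both cases reduce to a container with one or more `rdf:_n` slots. A small Python sketch, assuming Base64 as the binary encoding and interpreting `txt.regex` as "each regex match becomes one slot" (that interpretation, and the function names, are assumptions made for this example):

```python
import base64
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD_BASE64 = "http://www.w3.org/2001/XMLSchema#base64Binary"


def binary_to_facade(data):
    """Wrap raw bytes as a single Base64-typed literal in slot rdf:_1."""
    encoded = base64.b64encode(data).decode("ascii")
    return {RDF + "_1": (encoded, XSD_BASE64)}


def text_to_facade(text, regex=None):
    """Without a regex the whole text goes into rdf:_1; with one, each
    match becomes the next slot, giving a sequence of tokens."""
    if regex is None:
        parts = [text]
    else:
        parts = [match.group(0) for match in re.finditer(regex, text)]
    return {RDF + f"_{i}": part for i, part in enumerate(parts, start=1)}


# The first four bytes of a JPEG file, enough to see the familiar "/9j/" prefix.
jpeg_head = binary_to_facade(b"\xff\xd8\xff\xe0")
whole = text_to_facade("Hello World!")
tokens = text_to_facade("Hello World!", regex=r"\w+")
```

The `whole` result has the same shape as the "Hello World!" facade shown on this slide, while `tokens` holds "Hello" and "World" in two consecutive slots.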
https://imma.ie/collection/freeing-the-voice/
Hello World!
[ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> "Hello World!" ] .
https://sparql-anything.cc/
https://github.com/SPARQL-Anything/showcase-tate
Assumption: SPARQL 1.1 CONSTRUCT queries will be enough to design mappings (the re-modelling phase)
https://github.com/SPARQL-Anything/showcase-imma
https://imma.ie/collection/freeing-the-memory/
Preliminary feedback
• From 27 users with diverse SPARQL expertise
• Requirements rated essential or very important:
• the system should minimise the languages or syntaxes needed
• mappings should be easy to read and interpret
• the system must be easy to learn for a Semantic Web practitioner
• the system should support new types of data sources without changes to the mapping language
• How easy is this code to understand (comparing equivalent mappings)?
• (a) RML
• (b) SPARQL Generate
• (c) SPARQL Anything
Benefits
• Transform / Query resources having heterogeneous formats
• Low learning demands (plain SPARQL 1.1)
• Minimise complexity of the mappings
• A single, consistent abstraction for any data format
• Enable data exploration in the absence of a domain ontology
• Integrate with a typical Semantic Web engineering workflow
• Flexible and adaptable (Facade-X can be extended, if needed)
• Easy to extend:
• new transformers just need to return the facade
• no major changes to the user experience
Challenges
• No commitment on the internal machinery! (It is a gift and a curse …)
• Current version v0.1.1 (we started Nov 2020):
• implemented on top of Apache Jena ARQ
• limited to files
• loads the triples in-memory and then performs the query
• A triple filtering strategy reduces the in-memory dataset
• Very large files require very large memory
• Next: develop strategies to cope with large resources (e.g. slicing)
• Next: develop query-rewriting strategies, eventually rewriting mappings into efficient, iterator-based transformers (mapping translation [Corcho, 2020])
• Next: relational databases, NoSQL (e.g. MongoDB)
• Reuse existing approaches (e.g. OBDA) but hide complexity from the user
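The slides do not describe how the triple filtering works. One plausible reading is that triples are checked against the query's triple patterns as they are produced, and only those that could match are kept in memory. The sketch below illustrates that idea only; the pattern representation (`None` for a variable) and the function names are assumptions, not the project's actual mechanism.

```python
XYZ = "http://sparql.xyz/facade-x/data/"


def matches(triple, pattern):
    """A pattern is an (s, p, o) tuple in which None stands for a variable."""
    return all(want is None or want == got for got, want in zip(triple, pattern))


def filtered(triples, patterns):
    """Lazily yield only the triples that some query pattern could match."""
    for triple in triples:
        if any(matches(triple, pattern) for pattern in patterns):
            yield triple


# Facade triples for one CSV row, written out by hand for the example.
produced = [
    ("_:row1", XYZ + "id", "10093"),
    ("_:row1", XYZ + "name", "Abakanowicz, Magdalena"),
    ("_:row1", XYZ + "gender", "Female"),
    ("_:row1", XYZ + "yearOfBirth", "1930"),
]

# A query that only mentions ?s xyz:name ?n and ?s xyz:yearOfBirth ?y.
query_patterns = [
    (None, XYZ + "name", None),
    (None, XYZ + "yearOfBirth", None),
]
kept = list(filtered(produced, query_patterns))
```

Because `filtered` is a generator, it could sit between a transformer and the in-memory dataset without materialising the discarded triples.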
Get in touch!
SPARQL Anything is under active development
https://github.com/SPARQL-Anything/sparql.anything
enrico.daga@open.ac.uk
@enridaga
www.enridaga.net
References
• Daga, E., Asprino, L., Mulholland, P., Gangemi, A.: Facade-X: an opinionated approach to SPARQL Anything (submitted). In: SEMANTiCS 2021: 17th International Conference on Semantic Systems (2021)
• Daga, E., Meroño-Peñuela, A., Motta, E.: Sequential linked data: the state of affairs. Semantic Web (2021)
• Warren, P., Mulholland, P.: Using SPARQL – the practitioners’ viewpoint. In: European Knowledge Acquisition Workshop. pp. 485–500. Springer (2018)
• Corcho, O., Priyatna, F., Chaves-Fraga, D.: Towards a new generation of ontology based data access. Semantic Web 11(1), 153–160 (2020)
• Michel, F., Faron-Zucker, C., Corby, O., Gandon, F.: Enabling automatic discovery and querying of web APIs at web scale using linked data standards. In: Companion Proceedings of The 2019 World Wide Web Conference. pp. 883–892 (2019)
• Dimou, A., Vander Sande, M., Colpaert, P., Verborgh, R., Mannens, E., Van de Walle, R.: RML: a generic language for integrated RDF mappings of heterogeneous data. In: 7th Workshop on Linked Data on the Web (2014)
• García-González, H., Boneva, I., Staworko, S., Labra-Gayo, J.E., Lovelle, J.M.C.: ShExML: improving the usability of heterogeneous data mapping languages for first-time users. PeerJ Computer Science 6, e318 (2020)
• Ko, A.J., Abraham, R., Beckwith, L., Blackwell, A., Burnett, M., Erwig, M., Scaffidi, C., Lawrance, J., Lieberman, H., Myers, B., et al.: The state of the art in end-user software engineering. ACM Computing Surveys (CSUR) 43(3), 1–44 (2011)
• Lefrançois, M., Zimmermann, A., Bakerally, N.: A SPARQL extension for generating RDF from heterogeneous formats. In: European Semantic Web Conference. pp. 35–50. Springer (2017)
• Lieberman, H., Paternò, F., Klann, M., Wulf, V.: End-user development: An emerging paradigm. In: End User Development, pp. 1–8. Springer (2006)
• Cyganiak, R.: Tarql (SPARQL for tables): Turn CSV into RDF using SPARQL syntax. Technical Report (2015). http://tarql.github.io

Invezz.com - Grow your wealth with trading signalsInvezz1
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 

Kürzlich hochgeladen (20)

Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 

The SPARQL Anything project

  • 1. The SPARQL Anything project Enrico Daga and Luigi Asprino The Web Conference - Developers Track 22/04/2021 - online @enridaga
  • 2. Background • Semantic Web developers have always been concerned with methods to “lift” legacy content to RDF: • Tools targeting specific types/formats: SPARQL Microservices [Michel, 2019], Tarql, Any23, JSON2RDF, CSV2RDF • Mapping languages of several types (e.g. RML, ShExML): high learning demands. [Dimou, 2014] [García-González, 2020] • SPARQL Generate: learning demands, difficult to extend to other formats. [Lefrançois, 2017] • These solutions transfer data source complexity to the user (e.g. knowing XPath for XML, JsonPath for JSON, …) • End-user development [Lieberman, 2006]: many SPARQL users fall into the category of end-user developers. In a recent survey, 42% of SPARQL users are from non-IT areas, including social sciences and the humanities, business and economics, and biomedical, engineering or physical sciences.
  • 3. SPICE Social Cohesion, Participation and Inclusion through Cultural Engagement Polifonia Digital Harmoniser of Musical Cultural Heritage - Cultural Heritage Knowledge Graphs - Sources in different formats x Multiple / unknown ontologies = Duplication of effort!!! https://spice.kmi.open.ac.uk/ http://spice-h2020.eu https://polifonia-project.eu/ This project has received funding from the European Union’s Horizon 2020 research and innovation programme
  • 4. Knowledge Graph Construction Composite process: • Observe: the data source (e.g. a CSV file) • Map: develop mappings to a target ontology • Triplify: run the mappings and evaluate the result • (many iterations) KG construction is a twofold job: • perform a syntax/structure conversion (e.g. from CSV to RDF) • project semantics onto the data (applying a domain ontology)
  • 5. Concept … twofold job: • perform a syntax/structure conversion -> Re-engineering • We want to solve this problem once and for all • project semantics onto the data (applying a domain ontology) -> Re-modelling • We leave this to the end user, powered by SPARQL 1.1 • Approach: design a single RDF facade for any data format • Re-engineering • Focus on the syntax and the meta-model (data structure) • Leave data as much as possible as-it-is! • apply the least possible “ontological commitment” https://en.wikipedia.org/wiki/Facade_pattern
  • 6. An RDF Facade? Problem Space • CSV • JSON • HTML • XML • Binary (JPEG, PNG, …) • Text Solution Space • https://www.w3.org/TR/rdf11-concepts/ • https://www.w3.org/TR/rdf-schema/ rdf:type, rdf:Property, rdfs:label, rdfs:Resource, rdfs:Class, rdf:Bag, rdfs:Container, rdf:List, RDF Dataset, Graph, … Facade-X: (to be filled by picking and mixing from the solution space) Oops! We are facing the same old problem … only this time we don’t care about the content (domain); we focus only on the format and data structure (meta-model)
  • 7. CSV Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:Root a rdfs:Class . Source (https://github.com/tategallery/collection/blob/master/artist_data.csv): id,name,gender,dates,yearOfBirth,yearOfDeath,placeOfBirth,placeOfDeath,url 10093,"Abakanowicz, Magdalena",Female,born 1930,1930,,Polska,,http://www.tate.org.uk/art/artists/magdalena-abakanowicz-10093 … With csv.headers=true, column names become properties: [ a fx:root ; rdf:_1 [ xyz:dates "born 1930" ; xyz:gender "Female" ; xyz:id "10093" ; xyz:name "Abakanowicz, Magdalena" ; xyz:placeOfBirth "Polska" ; xyz:placeOfDeath "" ; xyz:url "http://www.tate.org.uk/art/artists/magdalena-abakanowicz-10093" ; xyz:yearOfBirth "1930" ; xyz:yearOfDeath "" ] ; With csv.headers=false, the header line is just another row and cells are addressed by position: [ a fx:root ; rdf:_1 [ rdf:_1 "id" ; rdf:_2 "name" ; rdf:_3 "gender" ; rdf:_4 "dates" ; rdf:_5 "yearOfBirth" ; rdf:_6 "yearOfDeath" ; rdf:_7 "placeOfBirth" ; rdf:_8 "placeOfDeath" ; rdf:_9 "url" ] ;
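
The CSV facade on slide 7 can be approximated in a few lines of Python. This is only a sketch of the mapping shown on the slide, not the project's implementation (which the slides say is built on Apache Jena ARQ). The function name `csv_to_facade`, the blank-node labels `_:root` and `_:rowN`, and the tuple representation of triples are assumptions made for illustration; column names are appended to the `xyz:` namespace without any escaping.

```python
import csv
import io

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FX_ROOT = "http://sparql.xyz/facade-x/ns/root"
XYZ = "http://sparql.xyz/facade-x/data/"


def csv_to_facade(text, headers=True):
    """Return (subject, predicate, object) tuples mimicking the slide's CSV facade.

    Hypothetical helper: blank nodes are plain strings such as '_:row1'.
    """
    rows = list(csv.reader(io.StringIO(text)))
    triples = [("_:root", RDF + "type", FX_ROOT)]
    # With headers, the first line names the columns and is not emitted as a row.
    names = rows[0] if headers and rows else None
    data = rows[1:] if headers else rows
    for i, row in enumerate(data, start=1):
        node = f"_:row{i}"
        # Each row is the i-th member of the root container.
        triples.append(("_:root", f"{RDF}_{i}", node))
        for j, value in enumerate(row, start=1):
            if names and j <= len(names):
                predicate = XYZ + names[j - 1]  # named column
            else:
                predicate = f"{RDF}_{j}"  # positional fallback
            triples.append((node, predicate, value))
    return triples


sample = 'id,name\n10093,"Abakanowicz, Magdalena"\n'
for triple in csv_to_facade(sample, headers=True):
    print(triple)
```

Calling the same function with `headers=False` yields two rows addressed purely by `rdf:_n`, which mirrors the second output on the slide.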
  • 8. JSON Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:Root a rdfs:Class . xsd:int a rdfs:Datatype. xsd:string a rdfs:Datatype. xsd:boolean a rdfs:Datatype. xsd:decimal a rdfs:Datatype. xsd:float a rdfs:Datatype. xsd:double a rdfs:Datatype. Source (https://github.com/tategallery/collection/artworks/t/023/t02319-9205.json): { "acno": "T02319", "acquisitionYear": 1978, "all_artists": "Kazimir Malevich", "catalogueGroup": {}, "classification": "painting", "contributorCount": 1, "contributors": [ { … Facade: [ a fx:root ; xyz:acno "T02319" ; xyz:acquisitionYear "1978"^^<http://www.w3.org/2001/XMLSchema#int> ; xyz:all_artists "Kazimir Malevich" ; xyz:catalogueGroup [] ; xyz:classification "painting" ; xyz:contributorCount "1"^^<http://www.w3.org/2001/XMLSchema#int> ; …
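
The JSON facade follows the same idea: objects become nodes with `xyz:` properties, arrays become containers with `rdf:_n` members, and primitives become literals. The sketch below is an illustration under assumptions, not the tool's code: `json_to_facade` and the `_:bN` labels are hypothetical, typed literals are modelled as `(lexical form, datatype)` pairs, floats are mapped to `xsd:double`, and JSON `null` values are simply skipped. The `example` key inside `contributors` is made up, since the slide truncates that part of the file.

```python
import json
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD = "http://www.w3.org/2001/XMLSchema#"
FX_ROOT = "http://sparql.xyz/facade-x/ns/root"
XYZ = "http://sparql.xyz/facade-x/data/"


def json_to_facade(text):
    """Return (subject, predicate, object) tuples for a JSON object or array."""
    ids = count(1)
    triples = []

    def visit(value):
        # Containers get a fresh blank-node label; primitives become literals.
        if isinstance(value, dict):
            node = f"_:b{next(ids)}"
            for key, child in value.items():
                obj = visit(child)
                if obj is not None:
                    triples.append((node, XYZ + key, obj))
            return node
        if isinstance(value, list):
            node = f"_:b{next(ids)}"
            for i, child in enumerate(value, start=1):
                obj = visit(child)
                if obj is not None:
                    triples.append((node, f"{RDF}_{i}", obj))
            return node
        if isinstance(value, bool):  # must precede int: bool is a subclass of int
            return ("true" if value else "false", XSD + "boolean")
        if isinstance(value, int):
            return (str(value), XSD + "int")
        if isinstance(value, float):
            return (str(value), XSD + "double")  # assumption for this sketch
        return value  # a plain string, or None for JSON null (skipped by callers)

    root = visit(json.loads(text))
    triples.insert(0, (root, RDF + "type", FX_ROOT))
    return triples


sample = """{"acno": "T02319", "acquisitionYear": 1978,
             "catalogueGroup": {}, "contributors": [{"example": "value"}]}"""
for triple in json_to_facade(sample):
    print(triple)
```

Keeping typed literals as pairs keeps the sketch free of any RDF library; a real pipeline would hand these values to an RDF API instead.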
  • 9. DOM (HTML, XML, …) Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. @prefix xhtml: <http://www.w3.org/1999/xhtml#> . rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:Root a rdfs:Class . xsd:int a rdfs:Datatype. xsd:string a rdfs:Datatype. xsd:boolean a rdfs:Datatype. xsd:decimal a rdfs:Datatype. xsd:float a rdfs:Datatype. xsd:double a rdfs:Datatype. rdf:type rdf:type rdf:Property Source: https://imma.ie/artists/ with html.selector=#az-group: [ a fx:root , xhtml:div ; xhtml:id "az-group" ; rdf:_1 [ a xhtml:div ; rdf:_1 [ a xhtml:h4 ; rdf:_1 "A" ; <https://html.spec.whatwg.org/#innerHTML> "A" ; <https://html.spec.whatwg.org/#innerText> "A" ] ; …
  • 10. Binary and Text Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:Root a rdfs:Class . xsd:int a rdfs:Datatype. xsd:string a rdfs:Datatype. xsd:boolean a rdfs:Datatype. xsd:decimal a rdfs:Datatype. xsd:float a rdfs:Datatype. xsd:double a rdfs:Datatype. xsd:base64Binary a rdfs:Datatype. rdf:type rdf:type rdf:Property Options: bin.encoding # BASE64 txt.regex # tokenise into a sequence Binary source (https://imma.ie/collection/freeing-the-voice/): [ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> "/9j/4AAQSkZJRgABAQEASABIAAD/4QmsRXhpZgAASUkqAAgAAAALAA8BAgAGAAAAkgAAABABAgAOAAAAmAAAABIBAwABAAAAAQAAABoBBQABAAAApgAAABsBBQABAAAArgAAACgBAwABAAAAAgAAADEBAgALAAAAtgAAADIBAgAUAAAAwgAAABMCAwABAAAAAgAAAGmHBAABAAAA1gAAACWIBAABAAAA0gMAAOQDAABDYW5vbgBDYW5vbiBFT1MgNDBEAEgAAAABAAAASAAAAAEAAABHSU1QIDIuNC41AAAyMDA4OjA3OjMxIDEwOjM4OjExAB4Am…"^^<http://www.w3.org/2001/XMLSchema#base64Binary> ] . Text source: Hello World! [ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> "Hello World!" ] .
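
For text and binary sources the facade is a container with one or more `rdf:_n` slots. The sketch below illustrates that shape under assumptions: `text_to_facade` and `binary_to_facade` are hypothetical helpers, the `regex` argument stands in for the `txt.regex` option and is assumed to turn each match into one slot, and Base64 is assumed as the binary encoding, as the slide's `bin.encoding # BASE64` note suggests.

```python
import base64
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def text_to_facade(text, regex=None):
    """One slot holding the whole text, or one slot per regex match."""
    if regex is None:
        parts = [text]
    else:
        parts = [m.group(0) for m in re.finditer(regex, text)]
    return [("_:root", f"{RDF}_{i}", part) for i, part in enumerate(parts, start=1)]


def binary_to_facade(data):
    """A single slot holding the Base64 form of the bytes as a typed literal."""
    literal = base64.b64encode(data).decode("ascii")
    return [("_:root", RDF + "_1", (literal, XSD + "base64Binary"))]


print(text_to_facade("Hello World!"))
print(text_to_facade("Hello World!", regex=r"\w+"))
print(binary_to_facade(b"\xff\xd8\xff\xe0"))  # first bytes of a JPEG file
```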
  • 12. https://github.com/SPARQL-Anything/showcase-tate Assumption: SPARQL 1.1 CONSTRUCT queries will be enough to design mappings (the re-modelling phase)
  • 14. Preliminary feedback • From 27 users, diverse SPARQL expertise • Essential or very important • the system should minimise the languages or syntaxes needed • mappings should be easy to read and interpret • the system must be easy to learn for a Semantic Web practitioner • the system is able to support new types of data sources without changes to the mapping language • How easy is this code to understand (comparing equivalent mappings)? • (a) RML • (b) SPARQL Generate • (c) SPARQL Anything
  • 15. Benefits • Transform / Query resources having heterogeneous formats • Low learning demands (plain SPARQL 1.1) • Minimise complexity of the mappings • A single+consistent abstraction for any data format • Enable data exploration in the absence of a domain ontology • Integrate with a typical Semantic Web engineering workflow • Flexible and adaptable (Facade-X can be extended, if needed) • Easy to extend: • new transformers just need to return the facade • no major changes to the user experience
  • 16. Challenges • No commitment to the internal machinery! (It is a gift and a curse …) • Current version v0.1.1 (we started Nov 2020): • implemented on top of Apache Jena ARQ • limited to files • loads the triples in-memory and then performs the query • A triple filtering strategy reduces the in-memory dataset • Very large files require very large memory • Next: develop strategies to cope with large resources (e.g. slicing) • Next: develop query-rewriting strategies, eventually rewriting mappings into efficient, iterator-based transformers (mapping translation [Corcho 2020]) • Next: relational databases, NoSQL (e.g. MongoDB) • Reuse existing approaches (e.g. OBDA) but hide complexity from the user
  • 17. Get in touch! SPARQL Anything is under active development https://github.com/SPARQL-Anything/sparql.anything enrico.daga@open.ac.uk @enridaga www.enridaga.net
  • 18. References • Daga, E., Asprino, L., Mulholland, P., Gangemi, A.: Facade-X: an opinionated approach to SPARQL Anything (submitted). In: SEMANTiCS 2021: 17th International Conference on Semantic Systems (2021) • Daga, E., Meroño-Peñuela, A., Motta, E.: Sequential linked data: the state of affairs. Semantic Web (2021) • Warren, P., Mulholland, P.: Using SPARQL – the practitioners’ viewpoint. In: European Knowledge Acquisition Workshop. pp. 485–500. Springer (2018) • Corcho, O., Priyatna, F., Chaves-Fraga, D.: Towards a new generation of ontology based data access. Semantic Web 11(1), 153–160 (2020) • Michel, F., Faron-Zucker, C., Corby, O., Gandon, F.: Enabling automatic discovery and querying of web APIs at web scale using linked data standards. In: Companion Proceedings of The 2019 World Wide Web Conference. pp. 883–892 (2019) • Dimou, A., Vander Sande, M., Colpaert, P., Verborgh, R., Mannens, E., Van de Walle, R.: RML: a generic language for integrated RDF mappings of heterogeneous data. In: 7th Workshop on Linked Data on the Web (2014) • García-González, H., Boneva, I., Staworko, S., Labra-Gayo, J.E., Lovelle, J.M.C.: ShExML: improving the usability of heterogeneous data mapping languages for first-time users. PeerJ Computer Science 6, e318 (2020) • Ko, A.J., Abraham, R., Beckwith, L., Blackwell, A., Burnett, M., Erwig, M., Scaffidi, C., Lawrance, J., Lieberman, H., Myers, B., et al.: The state of the art in end-user software engineering. ACM Computing Surveys (CSUR) 43(3), 1–44 (2011) • Lefrançois, M., Zimmermann, A., Bakerally, N.: A SPARQL extension for generating RDF from heterogeneous formats. In: European Semantic Web Conference. pp. 35–50. Springer (2017) • Lieberman, H., Paternò, F., Klann, M., Wulf, V.: End-user development: An emerging paradigm. In: End user development, pp. 1–8. Springer (2006) • Cyganiak, R.: Tarql (SPARQL for tables): Turn CSV into RDF using SPARQL syntax. Technical Report. http://tarql.github.io (2015)