Ontology-Based Data Access Mapping Generation
via Data, Schema, Query, and Mapping Knowledge
Pieter Heyvaert
pheyvaer.heyvaert@ugent.be
Semantic Web technologies rely on Linked Data
querying
visualizations
publishing
But not all data is accessible as Linked Data
databases
XML files
JSON files
Solutions to provide access exist
manual: completely done by the user
semi-automatic: users provide feedback
automatic: no user interaction required
But they have limitations
limited to specific use cases
limited support for complex use cases
PhD’s goal: improve access to Linked Data
Overview
problem
current solutions
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
How do we provide access?
non-Linked Data → ? → Linked Data
table: authors
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
Apply mappings on non-Linked Data
non-Linked Data → mapping → Linked Data
mapping: rules to generate RDF terms and triples using data and ontologies
Apply mappings on non-Linked Data
non-Linked Data → mapping → Linked Data
table: authors
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
mapping:
rule: create URL from id
rule: name is value for ex:fullname
rule: if genre is ‘fiction’, class is ex:FictionAuthor; else class is ex:NonFictionAuthor
Linked Data:
ex:0 a ex:FictionAuthor .
ex:0 ex:fullname "J.K. Rowling" .
ex:1 a ex:NonFictionAuthor .
ex:1 ex:fullname "George Orwell" .
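The three rules above can be sketched in a few lines of Python. This is only an illustration of what a mapping does, not the mapping language or processor used in the PhD; the function name `map_author` and the in-memory representation of the table are assumptions made for this example.

```python
# The authors table from the slide, as a list of rows.
authors = [
    {"id": 0, "name": "J.K. Rowling", "genre": "fiction"},
    {"id": 1, "name": "George Orwell", "genre": "non-fiction"},
]

def map_author(row):
    """Apply the three rules from the slide to one row (hypothetical helper)."""
    # rule: create URL from id (written as a prefixed name, as on the slide)
    subject = f"ex:{row['id']}"
    # rule: the class depends on the value of genre, not on the table
    if row["genre"] == "fiction":
        cls = "ex:FictionAuthor"
    else:
        cls = "ex:NonFictionAuthor"
    return [
        (subject, "a", cls),
        # rule: name is value for ex:fullname
        (subject, "ex:fullname", f'"{row["name"]}"'),
    ]

triples = [t for row in authors for t in map_author(row)]
for s, p, o in triples:
    print(f"{s} {p} {o} .")
```

Running the sketch prints the same four triples as listed on the slide.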
Mappings need to be created
from scratch (single-scenario use case): mapping A
by reusing previous mappings (multi-scenario use case): mapping B, mapping C
(Semi-)automatic methods are preferred
mapping creation: manual vs. (semi-)automatic
Still a number of challenges left
dealing with complex data (schemas)
not all techniques work on single-scenario use cases
Dealing with complex data (schemas)
e.g., when the class of an entity does not depend on the table, but on a value
rule: if genre is ‘fiction’,
class is ex:FictionAuthor
else
class is ex:NonFictionAuthor
id name genre
0 J.K. Rowling fiction
1 George Orwell non-fiction
table: authors
Not all techniques work on single-scenario use cases
because they rely on readily available previous mappings
multi-scenario: the mapping of scenario A is reused for scenario B
single-scenario: no previous mapping is available to reuse for scenario B
Overview
problem
current solutions
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Current solutions
What knowledge is used?
How is this knowledge used?
What knowledge is not used?
What do current solutions use?
knowledge from the mapping process
existing knowledge outside the mapping process
Knowledge from mapping process is used
data
data schema
ontologies
not all elements are required
Existing knowledge is used
data
data schemas
mappings
ontologies
Linked Data
not all elements are required
How is all this knowledge used?
data schema + existing ontology
data + existing mapping
Data schema + existing ontology
1. data schema → new ontology
2. match new ontology with existing ontology
3. match → mapping
Data + existing mapping
1. data → classes, properties
2. existing mapping → classes, properties, model
3. classes, properties, model → mapping
These methods are not combined
only a single method is used
combining multiple methods has not been explored
What knowledge do current solutions not use?
not all knowledge from previous mappings
neglect query workload
Not all knowledge from previous mappings is used
data transformations
to lowercase
substring
conditions: if-else rules
Query workload is neglected
queries to be executed on the non-existing Linked Dataset
queries contain knowledge
model
used ontologies
annotations
select * where {
?s a ex:FictionAuthor .
?s ex:fullname ?n .
}
id name genre
0 J.K. Rowling fiction
1 George Orwell non-fiction
table: authors
ontology to use: http://example.com
model + annotations: ex:FictionAuthor
ex:fullname
How can we use queries?
Overview
problem
current solutions
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Research questions
discover existing knowledge
use discovered knowledge
Question 1: how can we discover existing knowledge that is relevant?
existing: mappings, ontologies, (Linked) Data, query workload, data schema
Question 2: how can we use the discovered knowledge to generate a new mapping?
existing: mappings, ontologies, (Linked) Data, query workload, data schema
mapping process: data, data schema, ontologies, query workload
result: mapping
Overview
problem statement
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Hypotheses
improve quality
decrease task complexity
Hypothesis 1: using existing knowledge improves
the quality of a new single-scenario mapping.
quality → fitness for use
Hypothesis 2: using existing knowledge
decreases the task complexity of the mapping process.
Liu and Li developed a model to measure task complexity.
5 characteristics that influence the task’s performance
Task complexity has 5 characteristics
input: e.g., data, ontologies, user feedback
output: Linked Data, mapping
process: steps, user actions
duration: time to complete task
presentation: user interface
Overview
problem statement
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Two aspects need to be tackled
discover existing knowledge
use knowledge
both can be tackled separately
Discover existing knowledge
infer knowledge from mapping process where possible
find relevant other existing knowledge via similarity metrics
Infer knowledge from mapping process
e.g., infer data schema from data
e.g., infer ontology from queries
Infer data schema from data
id name genre
0 J.K. Rowling fiction
1 George Orwell non-fiction
table: authors
table: authors
columns: id, name, genre
id: index, integer
name: string
genre: string (‘fiction’ or ‘non-fiction’)
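A minimal sketch of how such a schema description could be inferred from the rows alone. The functions `infer_type` and `infer_schema` are hypothetical names chosen for this example, and the type detection (integer or string) is deliberately simplistic; it is not the analysis tool mentioned later in the deck.

```python
# Rows of the authors table, with values as strings (as read from a CSV file).
rows = [
    {"id": "0", "name": "J.K. Rowling", "genre": "fiction"},
    {"id": "1", "name": "George Orwell", "genre": "non-fiction"},
]

def infer_type(values):
    """Return 'integer' if every value parses as an int, else 'string'."""
    try:
        for value in values:
            int(value)
    except ValueError:
        return "string"
    return "integer"

def infer_schema(rows):
    """Infer column names, types, uniqueness, and distinct values from the data."""
    schema = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        distinct = sorted(set(values))
        schema[column] = {
            "type": infer_type(values),
            "unique": len(distinct) == len(values),
            "distinct": distinct,
        }
    return schema

schema = infer_schema(rows)
for column, info in schema.items():
    print(column, info)
```

With only two rows every column looks unique, including genre. Inferred knowledge of this kind is only as reliable as the sample it is inferred from.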
Infer ontology from queries
select * where {
?s a ex:FictionAuthor .
?s ex:fullname ?n .
}
http://example.com
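As a rough sketch, the classes and properties that a query expects can be read from its triple patterns. The regular expression below only handles simple `subject predicate object .` patterns like the ones on the slide; it is an assumption made for illustration, not a SPARQL parser, and `terms_from_query` is a hypothetical helper.

```python
import re

query = """
select * where {
  ?s a ex:FictionAuthor .
  ?s ex:fullname ?n .
}
"""

def terms_from_query(q):
    """Collect classes and properties from simple triple patterns in a query."""
    classes, properties = set(), set()
    # Only look at the text between the outermost braces.
    body = q[q.index("{") + 1 : q.rindex("}")]
    for _subject, predicate, obj in re.findall(r"(\S+)\s+(\S+)\s+(\S+)\s*\.", body):
        if predicate in ("a", "rdf:type"):
            classes.add(obj)           # object of a type statement is a class
        elif not predicate.startswith("?"):
            properties.add(predicate)  # any other fixed predicate is a property
    return classes, properties

classes, properties = terms_from_query(query)
print("classes:", classes)
print("properties:", properties)
```

The namespace behind the `ex:` prefix then points to the ontology to use (http://example.com on the slide).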
Find relevant existing knowledge via similarity metrics
mapping process ↔ existing mapping
1. determine similarity
2. consider in mapping process
mapping process:
table: authors
columns: id, name, genre
id: index, integer, unique
name: string
genre: string (‘fiction’ or ‘non-fiction’)
existing:
table: author
columns: id, fullname, genres
id: index, integer
fullname: string
genres: string
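One possible way to compare the two schemas above is a string similarity over column names. Which metrics to use is an open question of the PhD, so the choice here (Python's `difflib.SequenceMatcher`, averaged over the best match per column) is purely an assumption for illustration, as are the function names.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Similarity of two column names in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def schema_similarity(columns_a, columns_b):
    """Average, over the columns of A, of the best-matching column name in B."""
    if not columns_a or not columns_b:
        return 0.0
    best = [max(name_similarity(a, b) for b in columns_b) for a in columns_a]
    return sum(best) / len(best)

new_schema = ["id", "name", "genre"]            # table: authors
existing_schema = ["id", "fullname", "genres"]  # table: author

score = schema_similarity(new_schema, existing_schema)
print(f"schema similarity: {score:.2f}")
```

A score close to 1 would suggest that the existing mapping for table author is worth considering in the mapping process for table authors. This name-based metric ignores types, values, and ontologies, which is why combining several metrics is part of the research question.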
Similarity metrics on individual elements or combinations of elements
metrics on data schema, ontologies, data, and query workload
PhD:
Which metrics do we use?
How do we combine the different metrics?
Two aspects need to be tackled
discover existing knowledge
use knowledge
Use knowledge
work with existing methods, e.g.:
data schema + existing ontology
data + existing mappings
PhD:
how do we include new knowledge?
how do we combine these methods?
Overview
problem statement
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Preliminary Results
RMLEditor
RMLWorkbench
mapping generation approaches
hierarchical data analysis
RMLEditor eases the creation of mappings
GUI so domain experts can create mappings
users can view the data, mappings, and RDF triples
usable by both non-SW and SW experts
PhD: present mappings to get feedback during mapping process
RMLWorkbench eases generation and publication
graphical user interface so domain experts can administer
Linked Data generation
publication workflow
PhD: manage elements of the mapping generation process
Identified mapping generation approaches
data-driven
schema-driven
model-driven
result-driven
PhD:
provides insights on how users work
this can be applied when developing a (semi-)automatic approach
Developed tool for data analysis on hierarchical data
efficient discovery of unique identifiers in hierarchical data
PhD: to infer knowledge within the mapping process
Overview
problem
current solutions
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Evaluation Plan
mapping quality
task complexity
Evaluate mapping quality
existing benchmark RODI
great for tabular data
no support for other formats, such as hierarchical data formats
Evaluate task complexity via 5 characteristics
input: e.g., data, ontologies, user feedback
output: Linked Data, mapping
process: steps, user actions
duration: time to complete task
presentation: user interface
Current evaluations are limited to a single aspect
only duration
only number of user actions
only precision and recall
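For reference, precision and recall of generated triples against a reference set can be computed as sketched below. The generated set is made up for this example (one author gets the wrong class), and `precision_recall` is a hypothetical helper. As the slide notes, this measures only one aspect and says nothing about duration or user actions.

```python
def precision_recall(generated, reference):
    """Precision and recall of generated triples against reference triples."""
    generated, reference = set(generated), set(reference)
    correct = generated & reference
    precision = len(correct) / len(generated) if generated else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall

# Reference: the four triples from the authors example.
reference = [
    ("ex:0", "a", "ex:FictionAuthor"),
    ("ex:0", "ex:fullname", '"J.K. Rowling"'),
    ("ex:1", "a", "ex:NonFictionAuthor"),
    ("ex:1", "ex:fullname", '"George Orwell"'),
]

# Hypothetical output of a mapping that assigns the wrong class to ex:1.
generated = [
    ("ex:0", "a", "ex:FictionAuthor"),
    ("ex:0", "ex:fullname", '"J.K. Rowling"'),
    ("ex:1", "a", "ex:FictionAuthor"),
    ("ex:1", "ex:fullname", '"George Orwell"'),
]

precision, recall = precision_recall(generated, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```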
Roundup
improve single-scenario mappings by discovering and using existing knowledge
Which similarity metrics do we use for discovery?
How do we use and combine
the different methods and knowledge?

Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 

Ontology-Based Data Access Mapping Generation using Data, Schema, Query, and Mapping Knowledge

  • 1. Ontology-Based Data Access Mapping Generation via Data, Schema, Query, and Mapping Knowledge Pieter Heyvaert pheyvaer.heyvaert@ugent.be
  • 2. Semantic Web technologies rely on Linked Data querying visualizations publishing
  • 3. But not all data is accessible as Linked Data databases XML files JSON files
  • 4. Solutions to provide access exist manual: completely done by the user semi-automatic: users provide feedback automatic: no user interaction required
  • 5. But they have limitations limited to specific use cases limited support for complex use cases
  • 6. PhD’s goal: improve access to Linked Data
  • 7. Overview problem current solutions research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 8. Overview problem current solutions research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 9. How do we provide access? non-Linked Data → ? → Linked Data
  • 10. How do we provide access? non-Linked Data → ? → Linked Data. Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction).
  • 11. Apply mappings on non-Linked Data: non-Linked Data → mapping → Linked Data. Mapping: rules to generate RDF terms and triples using data and ontologies.
  • 12. Apply mappings on non-Linked Data: non-Linked Data → mapping → Linked Data. Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction). Rule: create URL from id. Rule: name is value for ex:fullname. Rule: if genre is ‘fiction’, class is ex:FictionAuthor, else class is ex:NonFictionAuthor.
  • 13. Apply mappings on non-Linked Data: non-Linked Data → mapping → Linked Data. Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction). Resulting triples: ex:0 a ex:FictionAuthor . ex:0 ex:fullname ‘J.K. Rowling’ . ex:1 a ex:NonFictionAuthor . ex:1 ex:fullname ‘George Orwell’ .
  • 14. Mappings need to be created from scratch (single-scenario use case) mapping A by reusing previous mappings (multi-scenario use case) mapping B mapping C mapping
  • 15. (Semi-)automatic methods are preferred mapping manual (semi-)automatic
  • 16. Still a number of challenges left: dealing with complex data (schemas); not all techniques work on single-scenario use cases
  • 17. Dealing with complex data (schemas): e.g., when the class of an entity does not depend on the table, but on a value. Rule: if genre is ‘fiction’, class is ex:FictionAuthor, else class is ex:NonFictionAuthor. Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction).
  • 18. Not all techniques work on single-scenario use cases, because they rely on readily available previous mappings. Multi: the mapping of scenario A results in reuse for scenario B. Single: scenario B only, so what results in reuse?
  • 19. Overview problem current solutions research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 20. Current solutions What knowledge is used? How is this knowledge used? What knowledge is not used?
  • 21. What do current solutions use? knowledge from the mapping process existing knowledge outside the mapping process
  • 22. Knowledge from mapping process is used data data schema ontologies not all elements are required
  • 23. Existing knowledge is used data data schemas mappings ontologies Linked Data not all elements are required
  • 24. How is all this knowledge used? data schema + existing ontology data + existing mapping
  • 25. Data schema + existing ontology data schema new ontology 1
  • 26. Data schema + existing ontology data schema existing ontology new ontology match 1 2 2
  • 27. Data schema + existing ontology data schema existing ontology new ontology match mapping 1 2 2 3
  • 28. Data + existing mapping data classes properties 1
  • 29. Data + existing mapping data existing mapping classes properties classes properties model 1 2 2 2
  • 30. Data + existing mapping data existing mapping classes mapping properties classes properties model 1 2 2 2 3 3 3
  • 31. These methods are not combined only a single method is used combining multiple methods has not been explored
  • 32. What knowledge do current solutions not use? not all knowledge from previous mappings neglect query workload
  • 33. Not all knowledge from previous mappings is used data transformations to lowercase substring conditions: if-else rules
  • 34. Query workload is neglected: queries to be executed on the not-yet-existing Linked Dataset. Queries contain knowledge: model, used ontologies, annotations
  • 35. How can we use queries? Query: select * where { ?s a ex:FictionAuthor . ?s ex:fullname ?n . } Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction). Ontology to use: http://example.com. Model + annotations: ex:FictionAuthor, ex:fullname.
  • 36. Overview problem current solutions research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 37. Research questions discover existing knowledge use discovered knowledge
  • 38. Question 1: how can we discover existing knowledge that is relevant? ?mappings ontologies (Linked) Data query workload data schema existing mapping
  • 39. Question 2: how can we use the discovered knowledge to generate a new mapping? mapping mappings ontologies (Linked) Data query workload data data schema ontologies query workload data schema existing mapping process
  • 40. Overview problem statement research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 42. Hypothesis 1: using existing knowledge improves the quality of a new single-scenario mapping. quality → fitness for use
  • 43. Hypothesis 2: using existing knowledge decreases the task complexity of the mapping process. Liu and Li developed a model to measure task complexity: 5 characteristics that influence the task’s performance
  • 44. Task complexity has 5 characteristics. input: e.g., data, ontologies, user feedback; output: Linked Data, mapping; process: steps, user actions; duration: time to complete task; presentation: user interface
  • 45. Overview problem statement research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 46. Two aspects need to be tackled discover existing knowledge use knowledge both can be tackled separately
  • 47. Discover existing knowledge infer knowledge from mapping process where possible find relevant other existing knowledge via similarity metrics
  • 48. Infer knowledge from mapping process e.g., infer data schema from data e.g., infer ontology from queries
  • 49. Infer data schema from data. Table authors (id, name, genre): (0, J.K. Rowling, fiction), (1, George Orwell, non-fiction). Inferred schema: table: authors; columns: id, name, genre; id: index, integer; name: string; genre: string (‘fiction’ or ‘non-fiction’)
  • 50. Infer ontology from queries select * where { ?s a ex:FictionAuthor . ?s ex:fullname ?n . } http://example.com
  • 51. Find relevant existing knowledge via similarity metrics mapping process mapping 1. determine similarity 2. consider in mapping process existing table: authors columns: id, name, genre id: index, integer, unique name: string genre: string (‘fiction’ or ‘non-fiction’) table: author columns: id, fullname, genres id: index, integer fullname: string genres: string
  • 52. Similarity metrics on different/combination of elements metrics on data schema, ontologies, data, and query workload PhD: Which metrics do we use? How do we combine the different metrics?
  • 53. Two aspects need to be tackled discover existing knowledge use knowledge
  • 54. Use knowledge work with existing methods, e.g.: data schema + existing ontology data + existing mappings PhD: how do we include new knowledge? how do we combine these methods?
  • 55. Overview problem statement research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 56. Preliminary Results RMLEditor RMLWorkbench mapping generation approaches hierarchical data analysis
  • 57. RMLEditor eases the creation of mappings GUI so domain experts can create mappings users can view the data, mappings, and RDF triples usable by both non-SW and SW experts PhD: present mappings to get feedback during mapping process
  • 58. RMLWorkbench eases generation and publication graphical user interface so domain experts can administer Linked Data generation publication workflow PhD: manage elements of the mapping generation process
  • 59. Identified mapping generation approaches: data-driven, schema-driven, model-driven, result-driven. PhD: provides insights on how users work; this can be applied when developing a (semi-)automatic approach
  • 60. Developed tool for data analysis on hierarchical data efficient discovery of unique identifiers in hierarchical data PhD: to infer knowledge within the mapping process
  • 61. Overview problem current solutions research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 63. Evaluate mapping quality existing benchmark RODI great for tabular data no support for other formats, such as hierarchical data formats
  • 64. Evaluate task complexity via 5 characteristics. input: e.g., data, ontologies, user feedback; output: Linked Data, mapping; process: steps, user actions; duration: time to complete task; presentation: user interface
  • 65. Current evaluations are limited to a single aspect: only duration, only number of user actions, or only precision and recall
  • 66. Roundup: improve single-scenario mappings by discovering and using existing knowledge. What similarity metrics do we use for discovery? How do we use and combine the different methods and knowledge?
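
The worked example from slides 12 and 13 (three rules applied to the authors table) can be sketched in Python. This is only a minimal illustration under stated assumptions: the function names (subject, author_class, map_row, map_table) are hypothetical, the ex: prefix is left unexpanded, and no actual mapping language or engine from the presentation is used.

```python
# Illustrative sketch only: function names and the row layout are assumptions,
# not the API of any mapping language or tool mentioned in the slides.

AUTHORS = [
    {"id": 0, "name": "J.K. Rowling", "genre": "fiction"},
    {"id": 1, "name": "George Orwell", "genre": "non-fiction"},
]


def subject(row):
    """Rule: create the URL (here a prefixed name) from the id column."""
    return f"ex:{row['id']}"


def author_class(row):
    """Rule: the class depends on a value (genre), not on the table."""
    if row["genre"] == "fiction":
        return "ex:FictionAuthor"
    return "ex:NonFictionAuthor"


def map_row(row):
    """Apply all three rules to one row and return its triples."""
    s = subject(row)
    return [
        (s, "a", author_class(row)),
        # Rule: name is the value for ex:fullname (kept as a quoted literal).
        (s, "ex:fullname", f"'{row['name']}'"),
    ]


def map_table(rows):
    """Apply the mapping to every row of the table."""
    return [triple for row in rows for triple in map_row(row)]


if __name__ == "__main__":
    for s, p, o in map_table(AUTHORS):
        print(f"{s} {p} {o} .")
```

Running the script prints one line per triple, matching the four triples listed on slide 13 (with straight instead of curly quotes). The class rule is the kind of value-dependent condition that slide 17 names as a challenge for (semi-)automatic approaches.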