Data Dependency Management in
Heterogeneous and Dynamic DIS

                             Giorgio Orsi
                           orsi@elet.polimi.it



                                 Ph. Day
                              June 26th 2008

 Politecnico di Milano
 Dipartimento di Elettronica e Informazione
Motivations
• Heterogeneous, Independent, Dynamic and Mobile Data Sources.
   • Heterogeneity of models and technologies.
   • Systems designed independently.
   • The real world is not static and changes rapidly.
   • Users and data sources move, and the engineer cannot
     follow them around all day to solve their problems.


• We need on-the-fly, integrated access to relevant information.




                                                             PHDAY '08
Ontologies to the rescue
   Possible Solution:
   [Diagram; labels: Domain Ontology, Ontology-Based Data Integration,
    Context-aware Access, Data Source Ontologies, CA-DL Mappings, Data Sources]
Tasks and Challenges
Tasks:
• (User-driven) Automatic Schema Extraction (ROSEX).
• (User-driven) Lightweight Automatic Data Integration (X-SOM).
• Cross-Model, Distributed Query processing (SPARQL-Explorer).
• Context-Aware Data Filtering (CADD Tool).


Challenges:
• Cognitive support for HD-DIS modeling
  (What you see is what you get).
• On-the-fly tailoring of relevant data and smart caching
  (What you get is what you need).
• Technology gaps: Mobile devices, data streams, sensor
  networks.
Tools
Extraction:
• Focus on data models that are actually in use (e.g., relational, XML, RDF) and
  natural language.
• Output: ontological representations of the data sources.
• Data structures obtained through reverse engineering of best practices in
  design or through data-mining.


Data Integration:
• Current solutions (DL-Lite, EL++) are largely theoretical and
  are far from being real systems (Kripke frames are not user-friendly).
• Keep the data where they currently are; use ontologies to get them out!
• Use data dependencies to optimize query plans.
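
One way to read the last bullet is join elimination. The sketch below is illustrative only and is not taken from the talk: the `Join` and `ForeignKey` types, the `prune_redundant_joins` helper, and the `Order`/`Customer` relations are all hypothetical. It assumes the dependency is a non-null foreign key that references a key of the target relation, so every source row matches exactly one target row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Join:
    """Equi-join left.left_attr = right.right_attr (hypothetical plan node)."""
    left: str
    left_attr: str
    right: str
    right_attr: str


@dataclass(frozen=True)
class ForeignKey:
    """Assumed dependency: src.src_attr is non-null and references the key dst.dst_attr."""
    src: str
    src_attr: str
    dst: str
    dst_attr: str


def prune_redundant_joins(joins, needed_relations, foreign_keys):
    """Drop joins that a foreign key proves cannot change the query result.

    A join is dropped only when (1) a foreign key covers it, (2) the
    referenced relation contributes no output or filter attributes, and
    (3) no other join in the plan touches the referenced relation.
    """
    fk_set = {(fk.src, fk.src_attr, fk.dst, fk.dst_attr) for fk in foreign_keys}
    kept = []
    for join in joins:
        covered = (join.left, join.left_attr, join.right, join.right_attr) in fk_set
        unused = join.right not in needed_relations
        isolated = not any(
            other is not join and join.right in (other.left, other.right)
            for other in joins
        )
        if not (covered and unused and isolated):
            kept.append(join)
    return kept


# Hypothetical plan: Order joined with Customer on the foreign key.
plan = [Join("Order", "customer_id", "Customer", "id")]
fks = [ForeignKey("Order", "customer_id", "Customer", "id")]

# The query only needs Order attributes, so the join can be skipped.
only_orders = prune_redundant_joins(plan, {"Order"}, fks)
# The query also needs Customer attributes, so the join must stay.
with_customer = prune_redundant_joins(plan, {"Order", "Customer"}, fks)
```

In a distributed setting, skipping such a join would also avoid contacting the source that holds the referenced relation.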


Data Tailoring:
• Satisfying all the constraints is neither practical nor really needed.
• The system/user context determines the subset of constraints and data to
  be considered.
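
As a rough illustration of context-driven tailoring, the sketch below models a context as a set of tags and activates only the constraints tagged for it. This is an assumption made for the example, not the CADD Tool's actual model; the `active_constraints` and `tailor` helpers, the tags, and the sample records are all hypothetical.

```python
def active_constraints(constraints, context):
    """Return the constraints whose context tags intersect the current context."""
    return [c for c in constraints if c["contexts"] & context]


def tailor(records, constraints, context):
    """Keep the records that satisfy every constraint active in this context."""
    checks = [c["check"] for c in active_constraints(constraints, context)]
    return [r for r in records if all(check(r) for check in checks)]


# Hypothetical constraints, each tagged with the contexts in which it applies.
constraints = [
    {"contexts": {"mobile"}, "check": lambda r: r["size_kb"] <= 100},
    {"contexts": {"tourist"}, "check": lambda r: r["category"] == "museum"},
    {"contexts": {"desktop"}, "check": lambda r: r["size_kb"] <= 10_000},
]

# Hypothetical data items.
records = [
    {"name": "Brera", "category": "museum", "size_kb": 80},
    {"name": "Duomo video", "category": "church", "size_kb": 5_000},
    {"name": "Triennale", "category": "museum", "size_kb": 400},
]

# A tourist on a mobile device: only the "mobile" and "tourist" constraints apply.
mobile_tourist = tailor(records, constraints, {"mobile", "tourist"})
# A desktop user: only the "desktop" size limit applies.
desktop = tailor(records, constraints, {"desktop"})
```

Constraints outside the current context are never evaluated, which is the point of the slide: the context selects both the constraints and the data worth considering.
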
And now...




    Questions
      ( where: ~□(question => answer) )




