1. Data Dependency Management in
Heterogeneous and Dynamic DIS
Giorgio Orsi
orsi@elet.polimi.it
Ph. Day
June 26th 2008
Politecnico di Milano
Dipartimento di Elettronica e Informazione
2. Motivations
• Heterogeneous, Independent, Dynamic and Mobile Data Sources.
• Heterogeneity of models and technologies.
• Systems designed independently.
• The real world is not static and changes rapidly.
• Users and data sources move, and engineers cannot
follow them around all day to solve their problems.
• We need on-the-fly, integrated access to relevant information.
PHDAY '08
3. Ontologies to the rescue
Possible Solution: Ontology-Based Data Integration and Context-aware Data Access
[Figure labels: Domain Ontology, Data Source Ontologies, CA-DL, Mappings, Data Sources]
4. Tasks and Challenges
Tasks:
• (User-driven) Automatic Schema Extraction (ROSEX).
• (User-driven) Lightweight Automatic Data Integration (X-SOM).
• Cross-Model, Distributed Query processing (SPARQL-Explorer).
• Context-Aware Data Filtering (CADD Tool).
Challenges:
• Cognitive support to HD-DIS modeling
(What you see is what you get).
• On-the-fly tailoring of relevant data and smart caching
(What you get is what you need).
• Technology gaps: Mobile devices, data streams, sensor
networks.
5. Tools
Extraction:
• Focus on data models in actual use (e.g., relational, XML, RDF) and
natural language.
• Output: ontological representations of the data sources.
• Data structures obtained by reverse engineering design best practices
or through data mining.
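As an illustration of the "Output" bullet above, here is a minimal sketch of turning a relational schema into ontology-style triples. It is not the ROSEX algorithm. The schema dictionary layout, the `schema_to_triples` function, and the table and column names are all hypothetical and serve only to show the idea: one class per table, one datatype property per plain column, and one object property per foreign key.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def schema_to_triples(tables: Dict[str, dict]) -> List[Triple]:
    """Map a (hypothetical) relational schema description to RDF-style triples."""
    triples: List[Triple] = []
    for table, spec in tables.items():
        # Each table becomes a class.
        triples.append((table, "rdf:type", "owl:Class"))
        foreign_keys = spec.get("foreign_keys", {})
        for column in spec["columns"]:
            prop = f"{table}.{column}"
            triples.append((prop, "rdfs:domain", table))
            if column in foreign_keys:
                # A foreign key links two classes: object property.
                triples.append((prop, "rdf:type", "owl:ObjectProperty"))
                triples.append((prop, "rdfs:range", foreign_keys[column]))
            else:
                # A plain column holds a literal value: datatype property.
                triples.append((prop, "rdf:type", "owl:DatatypeProperty"))
    return triples


# Hypothetical two-table schema used only for illustration.
example_schema = {
    "Customer": {"columns": ["id", "name"]},
    "Order": {
        "columns": ["id", "customer_id"],
        "foreign_keys": {"customer_id": "Customer"},
    },
}
triples = schema_to_triples(example_schema)
```

A real extractor would also have to handle keys, cardinalities, and naming conflicts, none of which this sketch covers.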
Data Integration:
• Current solutions (DL-Lite, EL++) are largely theoretical and
far from being real systems (Kripke frames are not user-friendly).
• Keep the data where they currently are, use ontologies to get them out!
• Use data dependencies to optimize query plans.
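One way to read "use data dependencies to optimize query plans" is join elimination: if a join follows a non-null foreign key to a parent key and the query needs nothing from the parent table, the join cannot change the result and can be dropped. The sketch below assumes a deliberately tiny query model (selected columns plus equi-joins, no filters). `ForeignKey`, `Query`, and `eliminate_redundant_joins` are hypothetical names, not part of any tool mentioned in these slides.

```python
from dataclasses import dataclass
from typing import List, Tuple

Join = Tuple[str, str, str, str]  # (child_table, child_col, parent_table, parent_col)


@dataclass(frozen=True)
class ForeignKey:
    child: str
    child_col: str
    parent: str
    parent_col: str       # assumed to be a key (unique) of the parent table
    not_null: bool = True


@dataclass
class Query:
    select: List[Tuple[str, str]]  # (table, column) pairs in the output
    joins: List[Join]


def eliminate_redundant_joins(query: Query, fks: List[ForeignKey]) -> Query:
    """Drop joins that a non-null foreign key guarantees are lossless and unused."""
    usable = {(f.child, f.child_col, f.parent, f.parent_col) for f in fks if f.not_null}
    selected_tables = {table for table, _ in query.select}
    kept: List[Join] = []
    for i, join in enumerate(query.joins):
        parent = join[2]
        # The parent must not be needed by any other join in the query.
        used_elsewhere = any(
            parent in (other[0], other[2])
            for j, other in enumerate(query.joins)
            if j != i
        )
        if join in usable and parent not in selected_tables and not used_elsewhere:
            continue  # every child row matches exactly one parent row: join is a no-op
        kept.append(join)
    return Query(select=list(query.select), joins=kept)


fk = ForeignKey("Order", "customer_id", "Customer", "id")
join = ("Order", "customer_id", "Customer", "id")

only_orders = Query(select=[("Order", "id")], joins=[join])
with_names = Query(select=[("Order", "id"), ("Customer", "name")], joins=[join])

optimized_orders = eliminate_redundant_joins(only_orders, [fk])
optimized_names = eliminate_redundant_joins(with_names, [fk])
```

In the first query the join disappears because only `Order` columns are requested. In the second it stays because `Customer.name` is part of the output.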
Data Tailoring:
• Satisfying all the constraints is neither practical nor really needed.
• The system/user context determines the subset of constraints and data to
be considered.
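The idea that the context selects which constraints apply can be sketched as follows. This is an assumption-laden toy, not the CADD Tool: `Constraint`, `active_constraints`, `tailor`, the context names, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

Record = Dict[str, object]


@dataclass(frozen=True)
class Constraint:
    name: str
    contexts: FrozenSet[str]          # contexts in which the constraint is enforced
    check: Callable[[Record], bool]   # True when the record satisfies it


def active_constraints(constraints: List[Constraint], context: str) -> List[Constraint]:
    """Keep only the constraints that matter in the given context."""
    return [c for c in constraints if context in c.contexts]


def tailor(records: List[Record], constraints: List[Constraint], context: str) -> List[Record]:
    """Return the records that satisfy every constraint active in this context."""
    active = active_constraints(constraints, context)
    return [r for r in records if all(c.check(r) for c in active)]


constraints = [
    Constraint("priced", frozenset({"customer"}), lambda r: r["price"] is not None),
    Constraint("in_stock", frozenset({"customer", "warehouse"}), lambda r: r["stock"] > 0),
]

records = [
    {"item": "pen", "price": 1.5, "stock": 10},
    {"item": "ink", "price": None, "stock": 3},
    {"item": "pad", "price": 2.0, "stock": 0},
]

customer_view = tailor(records, constraints, "customer")
warehouse_view = tailor(records, constraints, "warehouse")
audit_view = tailor(records, constraints, "audit")
```

A context with no associated constraints (here `"audit"`) sees all the data, while richer contexts see progressively smaller, more relevant subsets.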