Soft Cardinality Constraints on XML Data
How Exceptions Prove the Business Rule
Emir Muñoz
Fujitsu Ireland Ltd.
Joint work with F. Ferrarotti, S. Hartmann, S. Link, M. Marin
@ Nanjing, China, 14th October 2013
Contribution
• Introduce the definition of soft cardinality
constraints over XML data.
• Efficient low-degree polynomial time decision
algorithm for the implication problem.
• Empirical evaluation of soft cardinality
constraints on real XML data.

Emir M. - WISE, Nanjing, China, 14th October 2013

Outline
1. Introduction
2. Soft Cardinality Constraints
3. The Implication Problem
4. Performance Evaluation
5. Conclusion

Introduction
Concepts

• Cardinality constraints:
– Capture information about the frequency with
which certain data items occur in a particular
context.

• Soft cardinality constraints:
– Constraints which need to be satisfied on average
only, and thus permit violations in a controlled
manner.
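
The contrast between the two notions can be sketched in a few lines. This is only an illustration under one literal reading of "satisfied on average", namely that the mean of the observed counts must lie within the bounds; the formal semantics in the paper may differ. The function names `satisfies_hard` and `satisfies_soft` are hypothetical.

```python
from statistics import mean


def satisfies_hard(counts, lo, hi):
    """Hard reading: every single count must lie within [lo, hi]."""
    return all(lo <= c <= hi for c in counts)


def satisfies_soft(counts, lo, hi):
    """Assumed soft reading: only the average count must lie within [lo, hi].

    An empty list of counts is treated as trivially satisfying the bound.
    """
    return not counts or lo <= mean(counts) <= hi


# Number of research teams per scientist (invented data).
teams_per_scientist = [2, 3, 4, 5]

print(satisfies_hard(teams_per_scientist, 2, 4))
print(satisfies_soft(teams_per_scientist, 2, 4))
```

With this data the scientist in 5 teams breaks the hard bound, while the average of 3.5 stays inside (2, 4), so the exception is tolerated under the assumed soft reading.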

Introduction
Example (1/2)

[Figure: project within a research institute; visible labels: support, research]

Introduction
Example (2/2)

• Some cardinality constraints:
– Every scientist is a member of 2, 3, or 4 research
teams.
– Every technician can work in up to 4 different
support teams.
– A project cannot have more than one manager.
– In every team, there should be two employees for
each expertise level.

• Exceptions are likely: for example, a scientist may
work in 5 or more research teams.
• Such rules are therefore better expressed as soft
constraints.

Soft Cardinality Constraints
Definition

• Expressiveness comes from the ability to specify soft
upper bounds (soft-max) as well as soft lower
bounds (soft-min) on the number of nodes.
• soft-card(Q, (Q´, {Q1,…, Qk})) = (soft-min, soft-max)
– Q: context path
– Q´: target path
– Q1,…, Qk: field paths
• With some sources of intractability
(slide annotation: soft-min = 1)

Soft Cardinality Constraints
Examples

• Every scientist is a member of 2, 3, or 4 research
teams.
– soft-card(ε, (_.RTeam.Sci, {id})) = (2, 4)

• Every technician can work in up to 4 different
support teams.
– soft-card(ε, (_.STeam.Tech, {id})) = (1, 4)

• A project cannot have more than one manager.
– soft-card(_, (Manager, Ø)) = (1, 1)

• In every team, there should be two employees
for each expertise level.
– soft-card(_._, (_, {Expertise.S})) = (2, 2)
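
To make the first example concrete, the sketch below counts team memberships per scientist on a toy document. Everything in it is an assumption made for illustration: the XML layout, `id` being an attribute, the wildcard `_` mapped to `*` in ElementTree paths, and the "average within bounds" reading of soft satisfaction. It is not the validation procedure from the paper.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from statistics import mean

# Toy document, invented for illustration: one project, five research teams.
DOC = """
<Institute>
  <Project>
    <RTeam><Sci id="s1"/><Sci id="s2"/><Sci id="s3"/></RTeam>
    <RTeam><Sci id="s1"/><Sci id="s2"/><Sci id="s3"/></RTeam>
    <RTeam><Sci id="s1"/><Sci id="s3"/></RTeam>
    <RTeam><Sci id="s1"/></RTeam>
    <RTeam><Sci id="s1"/></RTeam>
  </Project>
</Institute>
"""


def memberships(root):
    """Count Sci nodes reached via _.RTeam.Sci from the root, grouped by id."""
    # "*" stands in for the single-label wildcard "_" of the constraint.
    return Counter(sci.get("id") for sci in root.findall("*/RTeam/Sci"))


root = ET.fromstring(DOC.strip())
counts = memberships(root)
soft_min, soft_max = 2, 4

# Scientists whose individual count falls outside the bounds.
violators = sorted(k for k, c in counts.items() if not soft_min <= c <= soft_max)

# Assumed soft reading: the average count must lie within the bounds.
soft_ok = soft_min <= mean(counts.values()) <= soft_max

print(dict(counts))
print(violators)
print(soft_ok)
```

Here scientist `s1` belongs to five teams and is reported as an exception, while the average over all scientists remains within (2, 4).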

The Implication Problem
Definition and Algorithm
• Let Σ ∪ {φ} be a finite set of (soft) constraints.
• We say that Σ finitely implies φ, denoted by Σ ⊨f φ,
if every finite XML tree T that satisfies all σ ∈ Σ
also satisfies φ.
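
Implication asks whether every finite document that satisfies a given set of constraints must also satisfy a further constraint. The sketch below covers only the simplest sufficient case, where a given constraint has the same paths and a bounds interval contained in that of the candidate. It is not the decision algorithm from the paper; the class `SoftCard` and the function `implied_by_weakening` are hypothetical names, and a `False` result means "not established by this rule", not "not implied".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftCard:
    """Hypothetical record for soft-card(Q, (Q', {Q1,...,Qk})) = (min, max)."""
    context: str
    target: str
    fields: frozenset
    soft_min: int
    soft_max: int


def implied_by_weakening(sigma, phi):
    """Sufficient test only: some constraint in sigma has identical paths
    and bounds that fit inside the (wider) bounds of phi."""
    return any(
        s.context == phi.context
        and s.target == phi.target
        and s.fields == phi.fields
        and phi.soft_min <= s.soft_min
        and s.soft_max <= phi.soft_max
        for s in sigma
    )


sigma = [SoftCard("ε", "_.RTeam.Sci", frozenset({"id"}), 2, 4)]
wider = SoftCard("ε", "_.RTeam.Sci", frozenset({"id"}), 1, 5)
narrower = SoftCard("ε", "_.RTeam.Sci", frozenset({"id"}), 3, 4)

print(implied_by_weakening(sigma, wider))
print(implied_by_weakening(sigma, narrower))
```

The rule is sound for any reading in which satisfaction means that a count, or an average of counts, lies within the stated interval: a value inside (2, 4) is necessarily inside (1, 5).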

Performance Evaluation
Configuration

• We compare the performance against XML
Keys
• Machine: Intel Core i7 2.8 GHz with 4 GB RAM
• Documents:
– 321gone, yahoo (auction data)
– dblp (bibliographic information on CS)
– nasa (astronomical data)
– SigmodRecord (articles from SIGMOD Record)
– mondial (world geographic db)

Performance Evaluation
Results
[Figure: results in comparison with previous XML keys; panels: Expressivity, Time]

Conclusion
• We introduced an expressive class of soft
cardinality constraints, sufficiently flexible to
boost XML applications such as data exchange
and integration.
• Slight extensions result in the intractability of the
associated implication problem.
• We give an axiomatization for this new class.
• We present an empirical performance test that
indicates its efficient application in real use cases.

Discussion
• Questions & Answers
– Soft Cardinality Constraints on XML Data

THANKS!
Emir Muñoz
emir@emunoz.org
