SlideShare ist ein Scribd-Unternehmen logo
1 von 24
Downloaden Sie, um offline zu lesen
Ilias Tachmazidis1,2, Grigoris Antoniou1,2,3, Giorgos Flouris2, Spyros Kotoulas4

University of Crete
2 Foundation for Research and Technolony, Hellas (FORTH)
3 University of Huddersfield
4 Smarter Cities Technology Centre, IBM Research, Ireland
1
Motivation
Background

◦ Defeasible Logic
◦ MapReduce Framework
◦ RDF

Multi-Argument Implementation over RDF
Experimental Evaluation
Future Work
Huge data set coming from

◦ the Web, government authorities, scientific
databases, sensors and more

Defeasible logic

◦ is suitable for encoding commonsense knowledge
and reasoning
◦ avoids triviality of inference due to low-quality data

Defeasible logic has low complexity

◦ The consequences of a defeasible theory D can be
computed in O(N) time, where N is the number of
symbols in D
Reasoning is performed in the presence of
defeasible rules
Defeasible logic has been implemented for
in-memory reasoning, however, it was not
applicable for huge data sets
Solution: scalability/parallelization using the
MapReduce framework
Facts

◦ e.g. bird(eagle)

Strict Rules

◦ e.g. bird(X) → animal(X)

Defeasible Rules

◦ e.g. bird(X) ⇒ flies(X)
Defeaters

◦ e.g. brokenWing(X) ↝ ¬ flies(X)

Priority Relation (acyclic relation on the set of
rules)

◦ e.g.

r: bird(X) ⇒ flies(X)

r’: brokenWing(X) ⇒ ¬ flies(X)
r’ > r
Inspired by similar primitives in LISP and
other functional languages
Operates exclusively on <key, value> pairs
Input and Output types of a MapReduce job:
◦
◦
◦
◦

Input: <k1, v1>
Map(k1,v1) → list(k2,v2)
Reduce(k2, list (v2)) → list(k3,v3)
Output: list(k3,v3)
Provides an infrastructure that takes care of
◦ distribution of data
◦ management of fault tolerance
◦ results collection

For a specific problem

◦ developer writes a few routines which are following
the general interface
Rule sets can be divided into two categories:
◦ Stratified
◦ Non-stratified

Predicate Dependency Graph
Consider the following rule set:

◦ r1: X sentApplication A, A completeFor D ⇒ X acceptedBy D.

◦ r2: X hasCerticate C, C notValidFor D ⇒ X ¬acceptedBy D.
◦ r3: X acceptedBy D, D subOrganizationOf U ⇒
X studentOfUniversity U.
◦ r1 > r2.

Both acceptedBy and ¬acceptedBy are
represented by acceptedBy
Superiority relation is not part of the graph
Initial pass:

◦ Transform facts into <fact, (+Δ, +∂)> pairs

No reasoning needs to be performed for the
lowest stratum (stratum 0)
For each stratum from 1 to N
◦ Pass1: Calculate fired rules
◦ Pass2: Perform defeasible reasoning
INPUT
Literals in multiple files

MAP phase Input
<position in file, literal and knowledge>

File01
‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

<key, < John sentApplication App, (+Δ, +∂)>>

<John sentApplication App, (+Δ, +∂)> 
<App completeFor Dep, (+Δ, +∂)> 

< key, < App completeFor Dep, (+Δ, +∂)>>
< key, < John hasCerticate Cert, (+Δ, +∂)>>

File02
‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
<John hasCerticate Cert, (+Δ, +∂)>
<Cert notValidFor Dep, (+Δ, +∂)>
<Dep subOrganizationOf Univ, (+Δ, +∂)>

< key, < Cert notValidFor Dep, (+Δ, +∂)>>
< key, < Dep subOrganizationOf Univ, (+Δ, +∂)>>
<App,(John,sentApplication,+Δ,+∂)>
<App,(Dep,completeFor,+Δ,+∂)>
<Cert, (John,hasCerticate,+Δ,+∂)>
<Cert, (Dep,notValidFor,+Δ,+∂)>

Grouping/Sorting

MAP phase Output
<matchingArgValue, 
(Non‐MatchingArgValue,
Predicate, knowledge)>

Reduce phase Input
<matchingArgValue, 
List(Non‐MatchingArgValue,
Predicate, knowledge)>
<App, <(John,sentApplication,+Δ,+∂),
(Dep,completeFor,+Δ,+∂)>>
<Cert,<(John,hasCerticate,+Δ,+∂),
(Dep,notValidFor,+Δ,+∂)>>
Reduce phase Output 
(Final Output)
<literal and knowledge>
<John acceptedBy Dep, (+∂, r1)>
<John acceptedBy Dep,(¬, +∂,r2)>
INPUT
Literals in multiple files

MAP phase Input
<position in file, literal and knowledge>

File01
‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

<key, < John sentApplication App, (+Δ, +∂)>>

<John sentApplication App, (+Δ, +∂)> 
<App completeFor Dep, (+Δ, +∂)> 
<John hasCerticate Cert, (+Δ, +∂)>
<Cert notValidFor Dep, (+Δ, +∂)>
<Dep subOrganizationOf Univ, (+Δ, +∂)>

< key, < App completeFor Dep, (+Δ, +∂)>>

File02
‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
<John acceptedBy Dep, (+∂, r1)>
<John acceptedBy Dep, (¬, +∂, r2)>

< key, < John hasCerticate Cert, (+Δ, +∂)>>
< key, < Cert notValidFor Dep, (+Δ, +∂)>>
<key, < Dep subOrganizationOf Univ, 
(+Δ,+∂)>>
< key, < John acceptedBy Dep, (+∂, r1)>>
< key, < John acceptedBy Dep, (¬, +∂, r2)>>
< Dep subOrganizationOf Univ, (+Δ,+∂)>

< John acceptedBy Dep, (+∂, r1)>
< John acceptedBy Dep, (¬, +∂, r2)>

Grouping/Sorting

MAP phase Output
<literal, knowledge>

Reduce phase Input
<literal, list(knowledge)>
< Dep subOrganizationOf Univ, 
(+Δ,+∂)>
< John acceptedBy Dep, <(+∂, r1), 
(¬, +∂, r2)>>
Reduce phase Output 
(Final Output)
<Conclusions after reasoning>
No output
< John acceptedBy Dep, (+∂)>
LUBM (up to 1B)
Custom defeasible ruleset
IBM Hadoop Cluster v1.3 (Apache Hadoop
0.20.2)
40-core server
XIV storage SAN
Challenges of Non-Stratified Rule Sets
An efficient mechanism is need for –Δ and -∂

◦ all the available information for the literal must be
processed by a single node causing:
main memory insufficiency
skewed load balancing

Storing conclusions for +/–Δ and +/-∂ is not
feasible
◦ Consider the cartesian product of X, Y, Z for
X Predicate1 Y, Y Predicate2 Z.
Run extensive experiments to test the
efficiency of multi-argument defeasible logic
Applications on real datasets, with lowquality data
More complex knowledge representation
methods such as:
◦ Answer-Set programming
◦ Ontology evolution, diagnosis and repair

AI Planning
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Weitere ähnliche Inhalte

Ähnlich wie Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

DATA INTEGRITY AUDITING WITHOUT PRIVATE KEY STORAGE FOR SECURE CLOUD STORAGE
DATA INTEGRITY AUDITING WITHOUT PRIVATE KEY STORAGE  FOR SECURE CLOUD STORAGEDATA INTEGRITY AUDITING WITHOUT PRIVATE KEY STORAGE  FOR SECURE CLOUD STORAGE
DATA INTEGRITY AUDITING WITHOUT PRIVATE KEY STORAGE FOR SECURE CLOUD STORAGE
IJTRET-International Journal of Trendy Research in Engineering and Technology
 
91 Conf Presentation
91 Conf Presentation91 Conf Presentation
91 Conf Presentation
Ryohei Suzuki
 
CS3114_09212011.ppt
CS3114_09212011.pptCS3114_09212011.ppt
CS3114_09212011.ppt
Arumugam90
 

Ähnlich wie Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce (20)

Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
 
DATA INTEGRITY AUDITING WITHOUT PRIVATE KEY STORAGE FOR SECURE CLOUD STORAGE
DATA INTEGRITY AUDITING WITHOUT PRIVATE KEY STORAGE  FOR SECURE CLOUD STORAGEDATA INTEGRITY AUDITING WITHOUT PRIVATE KEY STORAGE  FOR SECURE CLOUD STORAGE
DATA INTEGRITY AUDITING WITHOUT PRIVATE KEY STORAGE FOR SECURE CLOUD STORAGE
 
Big datascienceh2oandr
Big datascienceh2oandrBig datascienceh2oandr
Big datascienceh2oandr
 
Big Data Science with H2O in R
Big Data Science with H2O in RBig Data Science with H2O in R
Big Data Science with H2O in R
 
Ieeepro techno solutions ieee java project - generalized approach for data
Ieeepro techno solutions  ieee java project - generalized approach for dataIeeepro techno solutions  ieee java project - generalized approach for data
Ieeepro techno solutions ieee java project - generalized approach for data
 
Ieeepro techno solutions ieee dotnet project - generalized approach for data
Ieeepro techno solutions  ieee dotnet project - generalized approach for dataIeeepro techno solutions  ieee dotnet project - generalized approach for data
Ieeepro techno solutions ieee dotnet project - generalized approach for data
 
Ieeepro techno solutions ieee java project - generalized approach for data
Ieeepro techno solutions  ieee java project - generalized approach for dataIeeepro techno solutions  ieee java project - generalized approach for data
Ieeepro techno solutions ieee java project - generalized approach for data
 
Ieeepro techno solutions ieee java project - generalized approach for data
Ieeepro techno solutions  ieee java project - generalized approach for dataIeeepro techno solutions  ieee java project - generalized approach for data
Ieeepro techno solutions ieee java project - generalized approach for data
 
91 Conf Presentation
91 Conf Presentation91 Conf Presentation
91 Conf Presentation
 
Chapter 10 ds
Chapter 10 dsChapter 10 ds
Chapter 10 ds
 
Chapter 2 ds
Chapter 2 dsChapter 2 ds
Chapter 2 ds
 
[系列活動] 資料探勘速遊
[系列活動] 資料探勘速遊[系列活動] 資料探勘速遊
[系列活動] 資料探勘速遊
 
GlobalLogic Webinar: Massive aggregations with Spark and Hadoop
GlobalLogic Webinar: Massive aggregations with Spark and HadoopGlobalLogic Webinar: Massive aggregations with Spark and Hadoop
GlobalLogic Webinar: Massive aggregations with Spark and Hadoop
 
Decision tree learning
Decision tree learningDecision tree learning
Decision tree learning
 
Evaluation of Uncertain Location
Evaluation of Uncertain Location Evaluation of Uncertain Location
Evaluation of Uncertain Location
 
Ijsws14 423 (1)-paper-17-normalization of data in (1)
Ijsws14 423 (1)-paper-17-normalization of data in (1)Ijsws14 423 (1)-paper-17-normalization of data in (1)
Ijsws14 423 (1)-paper-17-normalization of data in (1)
 
Paper talk (presented by Prof. Ludaescher), WORKS workshop, 2010
Paper talk (presented by Prof. Ludaescher), WORKS workshop, 2010Paper talk (presented by Prof. Ludaescher), WORKS workshop, 2010
Paper talk (presented by Prof. Ludaescher), WORKS workshop, 2010
 
CS3114_09212011.ppt
CS3114_09212011.pptCS3114_09212011.ppt
CS3114_09212011.ppt
 
Quick tour all handout
Quick tour all handoutQuick tour all handout
Quick tour all handout
 
User biglm
User biglmUser biglm
User biglm
 

Mehr von PlanetData Network of Excellence

A Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoA Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about Trentino
PlanetData Network of Excellence
 
Access Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract ModelsAccess Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract Models
PlanetData Network of Excellence
 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF Datasets
PlanetData Network of Excellence
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
PlanetData Network of Excellence
 
Heuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQLHeuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQL
PlanetData Network of Excellence
 

Mehr von PlanetData Network of Excellence (20)

Dl2014 slides
Dl2014 slidesDl2014 slides
Dl2014 slides
 
A Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoA Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about Trentino
 
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching NetworksOn Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
 
Towards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory SensingTowards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory Sensing
 
Pay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching NetworksPay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching Networks
 
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
 
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchLinking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMS
 
CLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data ArchitectureCLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data Architecture
 
Data and Knowledge Evolution
Data and Knowledge Evolution  Data and Knowledge Evolution
Data and Knowledge Evolution
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
 
Access Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract ModelsAccess Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract Models
 
Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?
 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF Datasets
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
 
Heuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQLHeuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQL
 
Adaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsAdaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of Endpoints
 
Building a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data CloudBuilding a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data Cloud
 

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce