SlideShare ist ein Scribd-Unternehmen logo
1 von 16
Downloaden Sie, um offline zu lesen
DBpedia (in) ALIGNED
From DBpedia to DBpedia+
Dimitris Kontokostas
AKSW Group, Leipzig University
DBpedia Association
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2007
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2008
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2009
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2010
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2011
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2014
February 9th 2015 / 3rd DBpedia Meeting in Dublin
RDF Stats (2014 release)
3B facts (only 580M facts in English)
● DBpedia En: 4.58M Things / 4.22M typed
● 125 Localized versions: 38.3M Things
● 50M links to other datasets
Many more stats @:
dbpedia.org/Datasets2014/DatasetStatistics
February 9th 2015 / 3rd DBpedia Meeting in Dublin
Dev Stats
DBpedia Information Extraction Framework
● Java/Scala based framework
○ Old PHP-based framework
● 5.1K Commits
● 52K lines of code (100K/1M AT)
● 71 total contributors
Many more stats @:
www.openhub.net/p/dbpedia
February 9th 2015 / 3rd DBpedia Meeting in Dublin
Aligning Problem
Lot’s of code & a lot more data
● Wikipedia evolves over time
○ Infobox Templates change, merge, deleted
○ New formatting templates
○ Structural differences per language edition
● Code should adapt to all the changes
○ hard at this (data) scale
February 9th 2015 / 3rd DBpedia Meeting in Dublin
Unit-testing to the rescue?
● Software & Data testing
● Straightforward for software (since 70’s)
● Preliminary for (RDF) data
○ RDFUnit, SPIN, OWL, PelletICV, ShEx,...
■ W3C Data Shapes WG
Data testing++
● Generation: manual, (Semi)automatic, ...
● Linking: data & software tests
February 9th 2015 / 3rd DBpedia Meeting in Dublin
RDFUnit
http://rdfunit.aksw.org
February 9th 2015 / 3rd DBpedia Meeting in Dublin
UT feedback loop
Data verification and feedback at different
data extraction stages
● Three main points of failure in DBpedia:
○ Code
○ Infobox mappings
○ Wikipedia (!!!)
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia+
Workflow
February 9th 2015 / 3rd DBpedia Meeting in Dublin
Additional feedback
We are looking into:
● Reporting
● Statistics
● Inter-Wikipedia cross-checking
● ML techniques
February 9th 2015 / 3rd DBpedia Meeting in Dublin
Thank you & Questions?
ALIGNED
Aligned, Quality-centric Software and Data
Engineering

Weitere ähnliche Inhalte

Was ist angesagt?

Managing and Consuming Completeness Information for Wikidata Using COOL-WD
Managing and Consuming Completeness Information for Wikidata Using COOL-WDManaging and Consuming Completeness Information for Wikidata Using COOL-WD
Managing and Consuming Completeness Information for Wikidata Using COOL-WDFariz Darari
 
Informal presentation about RES
Informal presentation about RESInformal presentation about RES
Informal presentation about RESChristophe Guéret
 
FastReport VCL6 Nuremberg 2018
FastReport VCL6 Nuremberg 2018FastReport VCL6 Nuremberg 2018
FastReport VCL6 Nuremberg 2018Fast Reports
 
Session 03 acquiring data
Session 03 acquiring dataSession 03 acquiring data
Session 03 acquiring databodaceacat
 
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016Sergio Fernández
 
Scalable Web Data Management using RDF
Scalable Web Data Management using RDF  Scalable Web Data Management using RDF
Scalable Web Data Management using RDF Navid Sedighpour
 
Brett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4jBrett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4jBrett Ragozzine
 
WG5: A data wrangling experiment
WG5: A data wrangling experimentWG5: A data wrangling experiment
WG5: A data wrangling experimentWARCnet
 
Open Legislation Spring 2011 Talk 1
Open Legislation Spring 2011 Talk 1Open Legislation Spring 2011 Talk 1
Open Legislation Spring 2011 Talk 1GraylinKim
 
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of Data
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of DatadipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of Data
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of DataeXascale Infolab
 
2014 10-11 Wikidata talk London WMF UK
2014 10-11 Wikidata talk London WMF UK2014 10-11 Wikidata talk London WMF UK
2014 10-11 Wikidata talk London WMF UKMagnus Manske
 
RDM Jargon Busting Session: Demystifying Commonly Used Terms
RDM Jargon Busting Session: Demystifying Commonly Used TermsRDM Jargon Busting Session: Demystifying Commonly Used Terms
RDM Jargon Busting Session: Demystifying Commonly Used TermsDigitalLibraryServices
 
Eighth openCypher Implementers Group Meeting: Status Update
Eighth openCypher Implementers Group Meeting: Status UpdateEighth openCypher Implementers Group Meeting: Status Update
Eighth openCypher Implementers Group Meeting: Status UpdateopenCypher
 
Steam Learn: An introduction to Redis
Steam Learn: An introduction to RedisSteam Learn: An introduction to Redis
Steam Learn: An introduction to Redisinovia
 
Dirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz ProjectDirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz Projectmbruemmer
 

Was ist angesagt? (19)

Managing and Consuming Completeness Information for Wikidata Using COOL-WD
Managing and Consuming Completeness Information for Wikidata Using COOL-WDManaging and Consuming Completeness Information for Wikidata Using COOL-WD
Managing and Consuming Completeness Information for Wikidata Using COOL-WD
 
Informal presentation about RES
Informal presentation about RESInformal presentation about RES
Informal presentation about RES
 
FastReport VCL6 Nuremberg 2018
FastReport VCL6 Nuremberg 2018FastReport VCL6 Nuremberg 2018
FastReport VCL6 Nuremberg 2018
 
DBpedia Viewer - LDOW 2014
DBpedia Viewer - LDOW 2014DBpedia Viewer - LDOW 2014
DBpedia Viewer - LDOW 2014
 
Wikidata & dbpedia
Wikidata & dbpediaWikidata & dbpedia
Wikidata & dbpedia
 
Session 03 acquiring data
Session 03 acquiring dataSession 03 acquiring data
Session 03 acquiring data
 
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
 
Scalable Web Data Management using RDF
Scalable Web Data Management using RDF  Scalable Web Data Management using RDF
Scalable Web Data Management using RDF
 
Brett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4jBrett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4j
 
WG5: A data wrangling experiment
WG5: A data wrangling experimentWG5: A data wrangling experiment
WG5: A data wrangling experiment
 
Open Legislation Spring 2011 Talk 1
Open Legislation Spring 2011 Talk 1Open Legislation Spring 2011 Talk 1
Open Legislation Spring 2011 Talk 1
 
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of Data
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of DatadipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of Data
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of Data
 
2014 10-11 Wikidata talk London WMF UK
2014 10-11 Wikidata talk London WMF UK2014 10-11 Wikidata talk London WMF UK
2014 10-11 Wikidata talk London WMF UK
 
RDM Jargon Busting Session: Demystifying Commonly Used Terms
RDM Jargon Busting Session: Demystifying Commonly Used TermsRDM Jargon Busting Session: Demystifying Commonly Used Terms
RDM Jargon Busting Session: Demystifying Commonly Used Terms
 
Eighth openCypher Implementers Group Meeting: Status Update
Eighth openCypher Implementers Group Meeting: Status UpdateEighth openCypher Implementers Group Meeting: Status Update
Eighth openCypher Implementers Group Meeting: Status Update
 
Data quality in Real Estate
Data quality in Real EstateData quality in Real Estate
Data quality in Real Estate
 
Steam Learn: An introduction to Redis
Steam Learn: An introduction to RedisSteam Learn: An introduction to Redis
Steam Learn: An introduction to Redis
 
Dirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz ProjectDirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz Project
 
Wikidata
WikidataWikidata
Wikidata
 

Andere mochten auch

DBpedia i18n - Amsterdam Meeting (30/01/2014)
DBpedia i18n - Amsterdam Meeting (30/01/2014)DBpedia i18n - Amsterdam Meeting (30/01/2014)
DBpedia i18n - Amsterdam Meeting (30/01/2014)Dimitris Kontokostas
 
8th DBpedia meeting / California 2016
8th DBpedia meeting /  California 20168th DBpedia meeting /  California 2016
8th DBpedia meeting / California 2016Dimitris Kontokostas
 
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)Dimitris Kontokostas
 
Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFDimitris Kontokostas
 
NLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology ConstraintsNLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology ConstraintsDimitris Kontokostas
 
Semantically enhanced quality assurance in the jurion business use case
Semantically enhanced quality assurance in the jurion  business use caseSemantically enhanced quality assurance in the jurion  business use case
Semantically enhanced quality assurance in the jurion business use caseDimitris Kontokostas
 
Assessing and Refining Mappings to RDF to Improve Dataset Quality
Assessing and Refining Mappings to RDF to Improve Dataset QualityAssessing and Refining Mappings to RDF to Improve Dataset Quality
Assessing and Refining Mappings to RDF to Improve Dataset Qualityandimou
 
Missingbot DBpedia Meeting Dublin 2015
Missingbot DBpedia Meeting Dublin 2015Missingbot DBpedia Meeting Dublin 2015
Missingbot DBpedia Meeting Dublin 2015SemanticCode
 
D bpedia association meeting dublin wkg
D bpedia association meeting dublin wkgD bpedia association meeting dublin wkg
D bpedia association meeting dublin wkgWolters Kluwer Germany
 
DBpedia as Gaeilge Chapter
DBpedia as Gaeilge ChapterDBpedia as Gaeilge Chapter
DBpedia as Gaeilge ChapterBianca Pereira
 
Linking Implicit entities - DBpedia Meetup
Linking Implicit entities - DBpedia MeetupLinking Implicit entities - DBpedia Meetup
Linking Implicit entities - DBpedia MeetupSujan Perera
 
20140130 metadata vocabularies_and_cultural_heritage_final
20140130 metadata vocabularies_and_cultural_heritage_final20140130 metadata vocabularies_and_cultural_heritage_final
20140130 metadata vocabularies_and_cultural_heritage_finalGerard Kuys
 
Using DBpedia for Spotting and Disambiguating Entities
Using DBpedia for Spotting and Disambiguating EntitiesUsing DBpedia for Spotting and Disambiguating Entities
Using DBpedia for Spotting and Disambiguating EntitiesJulien PLU
 
20150209 improving the_d_bpedia_ontology_v2
20150209 improving the_d_bpedia_ontology_v220150209 improving the_d_bpedia_ontology_v2
20150209 improving the_d_bpedia_ontology_v2Gerard Kuys
 
Pundit at 3rd DBpedia Community Meeting 2015
Pundit at 3rd DBpedia Community Meeting 2015Pundit at 3rd DBpedia Community Meeting 2015
Pundit at 3rd DBpedia Community Meeting 2015Net7
 
DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016Sebastian Hellmann
 
DBpedia in the Japanese LOD cloud
DBpedia in the Japanese LOD cloudDBpedia in the Japanese LOD cloud
DBpedia in the Japanese LOD cloudFumihiro Kato
 
Enriching Cultural Heritage Data with DBpedia
Enriching Cultural Heritage Data with DBpediaEnriching Cultural Heritage Data with DBpedia
Enriching Cultural Heritage Data with DBpediaAntoine Isaac
 

Andere mochten auch (20)

DBpedia i18n - Amsterdam Meeting (30/01/2014)
DBpedia i18n - Amsterdam Meeting (30/01/2014)DBpedia i18n - Amsterdam Meeting (30/01/2014)
DBpedia i18n - Amsterdam Meeting (30/01/2014)
 
8th DBpedia meeting / California 2016
8th DBpedia meeting /  California 20168th DBpedia meeting /  California 2016
8th DBpedia meeting / California 2016
 
DBpedia ♥ Commons
DBpedia ♥ CommonsDBpedia ♥ Commons
DBpedia ♥ Commons
 
DBpedia past, present & future
DBpedia past, present & futureDBpedia past, present & future
DBpedia past, present & future
 
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
 
Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDF
 
NLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology ConstraintsNLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology Constraints
 
Semantically enhanced quality assurance in the jurion business use case
Semantically enhanced quality assurance in the jurion  business use caseSemantically enhanced quality assurance in the jurion  business use case
Semantically enhanced quality assurance in the jurion business use case
 
Assessing and Refining Mappings to RDF to Improve Dataset Quality
Assessing and Refining Mappings to RDF to Improve Dataset QualityAssessing and Refining Mappings to RDF to Improve Dataset Quality
Assessing and Refining Mappings to RDF to Improve Dataset Quality
 
Missingbot DBpedia Meeting Dublin 2015
Missingbot DBpedia Meeting Dublin 2015Missingbot DBpedia Meeting Dublin 2015
Missingbot DBpedia Meeting Dublin 2015
 
D bpedia association meeting dublin wkg
D bpedia association meeting dublin wkgD bpedia association meeting dublin wkg
D bpedia association meeting dublin wkg
 
DBpedia as Gaeilge Chapter
DBpedia as Gaeilge ChapterDBpedia as Gaeilge Chapter
DBpedia as Gaeilge Chapter
 
Linking Implicit entities - DBpedia Meetup
Linking Implicit entities - DBpedia MeetupLinking Implicit entities - DBpedia Meetup
Linking Implicit entities - DBpedia Meetup
 
20140130 metadata vocabularies_and_cultural_heritage_final
20140130 metadata vocabularies_and_cultural_heritage_final20140130 metadata vocabularies_and_cultural_heritage_final
20140130 metadata vocabularies_and_cultural_heritage_final
 
Using DBpedia for Spotting and Disambiguating Entities
Using DBpedia for Spotting and Disambiguating EntitiesUsing DBpedia for Spotting and Disambiguating Entities
Using DBpedia for Spotting and Disambiguating Entities
 
20150209 improving the_d_bpedia_ontology_v2
20150209 improving the_d_bpedia_ontology_v220150209 improving the_d_bpedia_ontology_v2
20150209 improving the_d_bpedia_ontology_v2
 
Pundit at 3rd DBpedia Community Meeting 2015
Pundit at 3rd DBpedia Community Meeting 2015Pundit at 3rd DBpedia Community Meeting 2015
Pundit at 3rd DBpedia Community Meeting 2015
 
DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016
 
DBpedia in the Japanese LOD cloud
DBpedia in the Japanese LOD cloudDBpedia in the Japanese LOD cloud
DBpedia in the Japanese LOD cloud
 
Enriching Cultural Heritage Data with DBpedia
Enriching Cultural Heritage Data with DBpediaEnriching Cultural Heritage Data with DBpedia
Enriching Cultural Heritage Data with DBpedia
 

Ähnlich wie DBpedia (in) ALIGNED: From DBpedia to DBpedia

DBpedia 2014: Highlights and Issues of the New Release
DBpedia 2014: Highlights and Issues of the New ReleaseDBpedia 2014: Highlights and Issues of the New Release
DBpedia 2014: Highlights and Issues of the New ReleaseVolha Bryl
 
Linked open data, its realization
Linked open data, its realizationLinked open data, its realization
Linked open data, its realizationSeonho Kim
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Anja Jentzsch
 
On-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the CloudOn-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the CloudMarin Dimitrov
 
Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?Oscar Corcho
 
Linked data presentation for libraries (COMO)
Linked data presentation for libraries (COMO)Linked data presentation for libraries (COMO)
Linked data presentation for libraries (COMO)robin fay
 
Welcome to Consuming Linked Data tutorial WWW2010
Welcome to Consuming Linked Data tutorial WWW2010Welcome to Consuming Linked Data tutorial WWW2010
Welcome to Consuming Linked Data tutorial WWW2010Juan Sequeda
 
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...Creating and Utilizing Linked Open Statistical Data for the Development of Ad...
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...Evangelos Kalampokis
 
SSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow TutorialSSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow TutorialSSSW
 
Lider Reference Model ld4lt session March, 3rd, 2015
Lider Reference Model ld4lt session  March, 3rd, 2015Lider Reference Model ld4lt session  March, 3rd, 2015
Lider Reference Model ld4lt session March, 3rd, 2015Sebastian Hellmann
 
Advanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptxAdvanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptxAnshika865276
 
Integrating Hadoop in Your Existing DW and BI Environment
Integrating Hadoop in Your Existing DW and BI EnvironmentIntegrating Hadoop in Your Existing DW and BI Environment
Integrating Hadoop in Your Existing DW and BI EnvironmentCloudera, Inc.
 
Tutorial Data Management and workflows
Tutorial Data Management and workflowsTutorial Data Management and workflows
Tutorial Data Management and workflowsSSSW
 
(Enterprise) Linked Data Platform a new standard to manage LOD
(Enterprise) Linked Data Platform a new standard to manage LOD(Enterprise) Linked Data Platform a new standard to manage LOD
(Enterprise) Linked Data Platform a new standard to manage LODDiego Valerio Camarda
 
RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4Marin Dimitrov
 
Leveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for EconomicsLeveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for EconomicsJoachim Neubert
 

Ähnlich wie DBpedia (in) ALIGNED: From DBpedia to DBpedia (20)

KEDL DBpedia 2019
KEDL DBpedia  2019KEDL DBpedia  2019
KEDL DBpedia 2019
 
DBpedia 2014: Highlights and Issues of the New Release
DBpedia 2014: Highlights and Issues of the New ReleaseDBpedia 2014: Highlights and Issues of the New Release
DBpedia 2014: Highlights and Issues of the New Release
 
Linked open data, its realization
Linked open data, its realizationLinked open data, its realization
Linked open data, its realization
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)
 
On-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the CloudOn-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the Cloud
 
Sebastian Hellmann
Sebastian HellmannSebastian Hellmann
Sebastian Hellmann
 
Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?
 
Linked data presentation for libraries (COMO)
Linked data presentation for libraries (COMO)Linked data presentation for libraries (COMO)
Linked data presentation for libraries (COMO)
 
Linked Data
Linked DataLinked Data
Linked Data
 
Welcome to Consuming Linked Data tutorial WWW2010
Welcome to Consuming Linked Data tutorial WWW2010Welcome to Consuming Linked Data tutorial WWW2010
Welcome to Consuming Linked Data tutorial WWW2010
 
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...Creating and Utilizing Linked Open Statistical Data for the Development of Ad...
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...
 
SSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow TutorialSSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow Tutorial
 
Lider Reference Model ld4lt session March, 3rd, 2015
Lider Reference Model ld4lt session  March, 3rd, 2015Lider Reference Model ld4lt session  March, 3rd, 2015
Lider Reference Model ld4lt session March, 3rd, 2015
 
Advanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptxAdvanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptx
 
4-Managing CrossRef DOIs
4-Managing CrossRef DOIs4-Managing CrossRef DOIs
4-Managing CrossRef DOIs
 
Integrating Hadoop in Your Existing DW and BI Environment
Integrating Hadoop in Your Existing DW and BI EnvironmentIntegrating Hadoop in Your Existing DW and BI Environment
Integrating Hadoop in Your Existing DW and BI Environment
 
Tutorial Data Management and workflows
Tutorial Data Management and workflowsTutorial Data Management and workflows
Tutorial Data Management and workflows
 
(Enterprise) Linked Data Platform a new standard to manage LOD
(Enterprise) Linked Data Platform a new standard to manage LOD(Enterprise) Linked Data Platform a new standard to manage LOD
(Enterprise) Linked Data Platform a new standard to manage LOD
 
RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4
 
Leveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for EconomicsLeveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
 

Kürzlich hochgeladen

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Kürzlich hochgeladen (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

DBpedia (in) ALIGNED: From DBpedia to DBpedia

  • 1. DBpedia (in) ALIGNED From DBpedia to DBpedia+ Dimitris Kontokostas AKSW Group, Leipzig University DBpedia Association
  • 2. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2007
  • 3. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2008
  • 4. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2009
  • 5. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2010
  • 6. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2011
  • 7. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2014
  • 8. February 9th 2015 / 3rd DBpedia Meeting in Dublin RDF Stats (2014 release) 3B facts (only 580M facts in English) ● DBpedia En: 4.58M Things / 4.22M typed ● 125 Localized versions: 38.3M Things ● 50M links to other datasets Many more stats @: dbpedia.org/Datasets2014/DatasetStatistics
  • 9. February 9th 2015 / 3rd DBpedia Meeting in Dublin Dev Stats DBpedia Information Extraction Framework ● Java/Scala based framework ○ Old PHP-based framework ● 5.1K Commits ● 52K lines of code (100K/1M AT) ● 71 total contributors Many more stats @: www.openhub.net/p/dbpedia
  • 10. February 9th 2015 / 3rd DBpedia Meeting in Dublin Aligning Problem Lot’s of code & a lot more data ● Wikipedia evolves over time ○ Infobox Templates change, merge, deleted ○ New formatting templates ○ Structural differences per language edition ● Code should adapt to all the changes ○ hard at this (data) scale
  • 11. February 9th 2015 / 3rd DBpedia Meeting in Dublin Unit-testing to the rescue? ● Software & Data testing ● Straightforward for software (since 70’s) ● Preliminary for (RDF) data ○ RDFUnit, SPIN, OWL, PelletICV, ShEx,... ■ W3C Data Shapes WG Data testing++ ● Generation: manual, (Semi)automatic, ... ● Linking: data & software tests
  • 12. February 9th 2015 / 3rd DBpedia Meeting in Dublin RDFUnit http://rdfunit.aksw.org
  • 13. February 9th 2015 / 3rd DBpedia Meeting in Dublin UT feedback loop Data verification and feedback at different data extraction stages ● Three main points of failure in DBpedia: ○ Code ○ Infobox mappings ○ Wikipedia (!!!)
  • 14. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia+ Workflow
  • 15. February 9th 2015 / 3rd DBpedia Meeting in Dublin Additional feedback We are looking into: ● Reporting ● Statistics ● Inter-Wikipedia cross-checking ● ML techniques
  • 16. February 9th 2015 / 3rd DBpedia Meeting in Dublin Thank you & Questions? ALIGNED Aligned, Quality-centric Software and Data Engineering