Gathering Alternative Surface Forms
for DBpedia Entities
Volha Bryl
University of Mannheim, Germany; Springer Nature
Christian Bizer, Heiko Paulheim
University of Mannheim, Germany
NLP & DBpedia @ ISWC 2015, Bethlehem, USA, October 11, 2015
Why you need Surface Forms
• The surface forms (SFs) of an entity are the strings by which it can be
referred to: synonyms, alternative names, etc.
• Used to support many NLP tasks: co-reference resolution, entity
linking, disambiguation
Surface Forms for DBpedia Entities, Bryl, Bizer, Paulheim
“Billionaire Elon Musk has spelled out how he plans to
create temporary suns over Mars in order to heat the
Red Planet. Dismissing earlier comments that he
intended to nuke the planet’s surface, he says he wants
to create aerial explosions to heat it up.”
To link the three mentions, a machine needs to know that “Red Planet” is
an alternative name for Mars, and that Mars can be referred to just by its
type, “planet”.
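A minimal sketch of how a surface-form dictionary could support this kind of linking. The dictionary contents, the function names, and the naive substring matching are illustrative assumptions, not the method used in the talk:

```python
from collections import defaultdict

# Hypothetical surface-form dictionary: entity -> strings that may refer to it.
SURFACE_FORMS = {
    "Mars": ["Mars", "Red Planet", "the Red Planet", "planet"],
}

def build_index(surface_forms):
    """Invert entity -> SFs into lower-cased SF -> set of candidate entities."""
    index = defaultdict(set)
    for entity, forms in surface_forms.items():
        for form in forms:
            index[form.lower()].add(entity)
    return index

def find_mentions(text, index):
    """Return (surface form, candidate entities) for every SF found in the text.

    Uses naive case-insensitive substring matching, so overlapping SFs
    ("planet" inside "red planet") are all reported.
    """
    lowered = text.lower()
    return [(sf, sorted(entities))
            for sf, entities in sorted(index.items())
            if sf in lowered]

index = build_index(SURFACE_FORMS)
text = "create temporary suns over Mars in order to heat the Red Planet"
for sf, candidates in find_mentions(text, index):
    print(sf, "->", candidates)
```

In real data a type-like SF such as "planet" would map to many entities, so a linker would still need a disambiguation step.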
Surface Forms from Wiki(DB)pedia
• Some of Wikipedia’s (hence, DBpedia’s) crowd-sourced content looks
very much like surface forms
• Page titles
• Redirects
• Account for alternative names, word forms (e.g. plurals), closely related words,
abbreviations, alternative spellings, likely misspellings, subtopics
• Disambiguation pages
• There are 10+ Bethlehems in the US, according to
https://en.wikipedia.org/wiki/Bethlehem_(disambiguation)
• Anchor texts of links between wiki pages
Rendered: Named after the Roman god of war, it is often referred to as the “Red
Planet”...
Source: Named after the [[Mars (mythology)|Roman god of war]], it is
often referred to as the "Red Planet"...
• …additionally, we use anchor texts of links from external pages to Wikipedia
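The anchor-text idea can be sketched with a regular expression over wiki markup. This is an illustrative simplification, not the DBpedia extractor: the pattern and the function name are assumptions, and templates, file links, and nested links are ignored.

```python
import re

# Matches [[Target]] and [[Target|anchor text]]; a "#Section" suffix on the
# target is dropped. Simplified: templates and nested links are not handled.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]+))?\]\]")

def extract_anchor_pairs(wikitext):
    """Return (target page, anchor text) pairs found in a wikitext string."""
    pairs = []
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip()
        # Without a pipe, the page title itself is the visible anchor text.
        anchor = (match.group(2) or target).strip()
        pairs.append((target, anchor))
    return pairs

source = ('Named after the [[Mars (mythology)|Roman god of war]], it is '
          'often referred to as the "Red Planet"')
print(extract_anchor_pairs(source))
# [('Mars (mythology)', 'Roman god of war')]
```

Here the anchor text "Roman god of war" becomes a candidate surface form for the target page "Mars (mythology)".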
Surface Forms from Wiki(DB)pedia
• Not a new idea
• BabelNet, DBpedia Spotlight, … [see our paper for more links]
[Figure: Mars in BabelNet]
• Problem: Quality
• Quality is not only a problem: it has also never been
assessed or addressed
• Reason 1: good quality of Wikipedia content is taken for granted
• Reason 2: hopes are that NLP algorithms won’t be influenced by noise
• Problem: Quality – Why?
• By adding a redirect or an anchor text for an internal Wikipedia link, a Wikipedia
editor might mean not only “same as” or “also known as”, but also “related to”,
“contains”, etc.
• Both variants serve the purpose of pointing to the correct wiki page
Solution: Focus on Quality
• Step 1: Extract
• We extract SFs from Wikipedia labels, redirects, disambiguations, and anchor
texts of internal wiki-links
• Step 2: Evaluate
• We create a gold standard to evaluate SF quality
• Step 3: Filter
• We implement three filters to improve SF quality
• Bonus: More SFs
• We extract SFs from anchor texts of Wiki links found in the Common Crawl
2014 corpus
• All datasets are available at
http://data.dws.informatik.uni-mannheim.de/dbpedia/nlp2014/
SFs Dataset Statistics
• LRD = Labels, Redirects, Disambiguations
• Extracted from DBpedia dumps
• WAT = Wikipedia Anchor Texts
• Extracted by a new DBpedia extractor (based on PageLinksExtractor)
Gold Standard
• Manual annotation, 1 annotator, 2 subsets
• Popular subset: manually selected 34 popular entities of different types
• Denmark, Berlin, Apple Inc., Animal Farm, Michael Jackson, Star Wars, Diego
Maradona, Mars, etc.
• ~82 SFs per entity, linked from other Wiki pages 813,736 times
• Random subset: randomly selected 81 entities each having at least 5 SFs
• Andy Zaltzman, Bell AH-1 SuperCobra, Biarritz, Castellum, Firefox (film), Kipchak
languages, ParisTech, Psychokinesis, etc.
• ~13 SFs per entity, linked from other Wiki pages 14,760 times
Available at http://data.dws.informatik.uni-mannheim.de/dbpedia/nlp2014/gold/
• Types of annotation
• correct (“the eternal city” for Rome)
• contained (“Google Japan” for Google), contains (“Turkey” for Istanbul)
• type of (“the city” for Rome)
• partial (“Diego” for Diego Maradona)
• related (“Google Blog” for Google)
• wrong (“during World War I” for United States)
Evaluation: How many correct SFs?
• SFs extracted from labels, redirects, disambiguations
• correct, popular subset: 66.8%
• correct, random subset: 86.6%
• SFs extracted from Wikipedia anchor texts
• correct, popular subset: 38.5%
• correct, random subset: 70.7%
• Combined dataset
• correct, popular subset: 45.7%
• correct, random subset: 75%
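Percentages like these can be read as the share of gold-standard annotations labelled "correct". A minimal sketch, assuming the annotations are available as (entity, surface form, label) triples; the triple layout and the function name are assumptions, and the four sample rows are the examples from the annotation types above, not the actual gold standard:

```python
from collections import Counter

def label_shares(annotations):
    """Return the share of each label among (entity, sf, label) triples."""
    counts = Counter(label for _, _, label in annotations)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Illustrative sample only.
sample = [
    ("Rome", "the eternal city", "correct"),
    ("Rome", "the city", "type of"),
    ("Google", "Google Blog", "related"),
    ("United States", "during World War I", "wrong"),
]

shares = label_shares(sample)
print(f"correct: {shares['correct']:.1%}")  # correct: 25.0%
```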
(1) Filtering: String Patterns
• Data analysis shows that wrong SFs follow recognizable patterns
• URLs: contain .com or .net (“Berlin-china.net” for Berlin)
• of-phrases, with exceptions for “city of”, “state of”, and the like (“Issues of
Toronto” for Toronto)
• in-phrases (“Historical sites in Berlin” for Berlin)
• and-phrases (“Tom Cruise and Katie Holmes” for Tom Cruise)
• list-of (“List of Toronto MPs and MPPs” for Toronto)
• Increase in precision
• popular subset: 1.33%
• popular subset, LRD only: 3.75%
• random subset: less than 1%
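A rough sketch of what such a pattern-based filter might look like. The regular expressions and the exception list are assumptions that approximate the pattern families named above; the slides do not give the exact rules:

```python
import re

# Assumed regexes, one per pattern family from the slide.
URL_PATTERN = re.compile(r"\.(com|net)\b", re.IGNORECASE)
LIST_OF_PATTERN = re.compile(r"^list of\b", re.IGNORECASE)
IN_PATTERN = re.compile(r"\bin\b", re.IGNORECASE)
AND_PATTERN = re.compile(r"\band\b", re.IGNORECASE)
OF_PATTERN = re.compile(r"\bof\b", re.IGNORECASE)
# Hypothetical whitelist standing in for "city of, state of, and the like".
OF_EXCEPTIONS = re.compile(r"^(the )?(city|state) of\b", re.IGNORECASE)

def is_suspicious(sf):
    """Return True if the surface form matches one of the 'wrong SF' patterns."""
    if URL_PATTERN.search(sf) or LIST_OF_PATTERN.search(sf):
        return True
    if IN_PATTERN.search(sf) or AND_PATTERN.search(sf):
        return True
    if OF_PATTERN.search(sf) and not OF_EXCEPTIONS.search(sf):
        return True
    return False

candidates = ["Berlin-china.net", "Issues of Toronto", "City of Toronto",
              "Historical sites in Berlin", "Tom Cruise and Katie Holmes",
              "List of Toronto MPs and MPPs", "Berlin"]
print([sf for sf in candidates if not is_suspicious(sf)])
# ['City of Toronto', 'Berlin']
```

Rules this crude also drop legitimate names that contain "and" or "in" as separate words, which is one reason to measure the effect against a gold standard.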
(2) Filtering: Wikidata
• Observation: some SFs are entities on their own in other languages
• E.g. “Neckarau”, a district of Mannheim, redirects to Mannheim in the English
Wikipedia, but has its own page in the German Wikipedia
• Implementation: use the DBpedia-Wikidata dumps released in May 2015
• Check whether an SF exactly matches or is close (by Levenshtein distance) to any
of the labels of Wikidata entities that have no English Wikipedia page but do have
pages in other languages
• Increase in precision
• 0.5% compared to pattern-based filtering
• 1.5% for SF extracted only from LRD
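The label check can be sketched with a plain dynamic-programming Levenshtein distance. The distance threshold, the case folding, and the function names are assumptions; the slides do not state how "close" was defined:

```python
def levenshtein(a, b):
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def is_separate_entity(sf, non_english_labels, max_distance=1):
    """True if the SF matches, or nearly matches, a label of an entity that
    has no English Wikipedia page. max_distance=1 is an assumed threshold."""
    sf_lower = sf.lower()
    return any(levenshtein(sf_lower, label.lower()) <= max_distance
               for label in non_english_labels)

labels = {"Neckarau"}  # illustrative label set, not taken from the dumps
print(is_separate_entity("Neckarau", labels))  # True
print(is_separate_entity("Mannheim", labels))  # False
```

Comparing every SF against every label is quadratic; at dump scale an exact-match lookup followed by a restricted fuzzy search would likely be needed.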
(3) Filtering: Frequency Scores
• For SFs extracted from anchor texts, frequencies are available,
so TF-IDF scores can be computed
• Determining the threshold: values from 1.0 to 8.0 evaluated with a step of 0.2
• Two thresholds selected, those with the highest F1 values: 1.8 and 2.6
• Threshold 0 (no filtering) used as baseline
• Increase in precision
• 20% for popular subset, 10% for random subset
* Filtering done on the dataset to which the pattern- and Wikidata-based filters have already been applied
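The threshold search can be sketched as a sweep that keeps the value with the highest F1 on the gold standard. The input layout (a list of (score, is_correct) pairs), the "keep if score >= threshold" rule, and the function names are assumptions; how the TF-IDF scores themselves were computed is not given in the slides:

```python
def f1_at_threshold(scored, threshold):
    """F1 of the SFs kept at a threshold; scored is a list of (score, is_correct)."""
    kept = [is_correct for score, is_correct in scored if score >= threshold]
    true_positives = sum(kept)
    total_correct = sum(is_correct for _, is_correct in scored)
    if true_positives == 0 or total_correct == 0:
        return 0.0
    precision = true_positives / len(kept)
    recall = true_positives / total_correct
    return 2 * precision * recall / (precision + recall)

def sweep_thresholds(scored):
    """Return the threshold in 1.0, 1.2, ..., 8.0 with the highest F1.

    Thresholds are built from integers to avoid floating-point drift;
    ties go to the lowest threshold.
    """
    thresholds = [round(1.0 + 0.2 * i, 1) for i in range(36)]
    return max(thresholds, key=lambda t: f1_at_threshold(scored, t))

# Toy data: low-scoring SFs are wrong, high-scoring ones are correct.
scored = [(0.5, False), (1.1, False), (1.5, True), (3.0, True), (9.0, True)]
print(sweep_thresholds(scored))  # 1.2
```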
SFs from Common Crawl
• Common Crawl (CC) is the largest publicly available web corpus
• Extraction done on the Winter 2014 CC corpus, in the context of the Web
Data Commons project
• http://webdatacommons.org/: extracting and providing for public download
various types of structured data from CC
• Data required a lot of cleaning
• 3M SFs added to our LRD&WAT corpus
• No annotated gold standard: left for future work
• Available at
http://data.dws.informatik.uni-mannheim.de/dbpedia/nlp2014/lrd-cc/
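Collecting anchor texts of links from external pages to Wikipedia can be sketched with Python's standard HTML parser. This is an illustration only, not the Web Data Commons pipeline: the class name, the restriction to en.wikipedia.org, and the whitespace normalisation are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse

class WikipediaAnchorCollector(HTMLParser):
    """Collect (page title, anchor text) pairs for links to English Wikipedia."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._target = None  # title of the Wikipedia link currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        parsed = urlparse(href)
        if parsed.netloc == "en.wikipedia.org" and parsed.path.startswith("/wiki/"):
            self._target = unquote(parsed.path[len("/wiki/"):])
            self._text = []

    def handle_data(self, data):
        if self._target is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._target is not None:
            # Collapse runs of whitespace inside the anchor text.
            anchor = " ".join("".join(self._text).split())
            if anchor:
                self.pairs.append((self._target, anchor))
            self._target = None

def extract_pairs(html):
    collector = WikipediaAnchorCollector()
    collector.feed(html)
    collector.close()
    return collector.pairs

page = ('<p>See <a href="https://en.wikipedia.org/wiki/Mars">the  Red Planet</a> '
        'and <a href="https://example.org/">other</a>.</p>')
print(extract_pairs(page))  # [('Mars', 'the Red Planet')]
```

Anchors such as "here" or "Wikipedia" would pass through this sketch unchanged, which illustrates why web-crawled SFs need cleaning.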
Conclusion and Future Work
• Main message
• the quality of Wikipedia-based surface forms is often overlooked!
• Contributions
• Gold standard SFs, made available
• 3 filtering strategies: precision improved by > 20% for popular Wikipedia
entities, by > 10% for random entities
• Extracted SFs from the Common Crawl corpus
• All data publicly available
• Future work directions
• Task-based evaluation of the resource, further work on the gold standard
17Surface Forms for DBpedia Entities, Bryl, Bizer, Paulheim

Weitere ähnliche Inhalte

Was ist angesagt?

Another RDF Encoding Form
Another RDF Encoding FormAnother RDF Encoding Form
Another RDF Encoding FormJakob .
 
Contexts and Importing in RDF
Contexts and Importing in RDFContexts and Importing in RDF
Contexts and Importing in RDFJie Bao
 
DBpedia Citation Challenge. (Not only) Polish Citations in Wikipedia: analysi...
DBpedia Citation Challenge. (Not only) Polish Citations in Wikipedia: analysi...DBpedia Citation Challenge. (Not only) Polish Citations in Wikipedia: analysi...
DBpedia Citation Challenge. (Not only) Polish Citations in Wikipedia: analysi...Krzysztof Wecel
 
RDA: thinking globally, acting globally
RDA: thinking globally, acting globallyRDA: thinking globally, acting globally
RDA: thinking globally, acting globallyGordon Dunsire
 
OWL: Yet to arrive on the Web of Data?
OWL: Yet to arrive on the Web of Data?OWL: Yet to arrive on the Web of Data?
OWL: Yet to arrive on the Web of Data?Aidan Hogan
 
Changing Data: Implementing Primo for the Tri University Group of Libraries (...
Changing Data: Implementing Primo for the Tri University Group of Libraries (...Changing Data: Implementing Primo for the Tri University Group of Libraries (...
Changing Data: Implementing Primo for the Tri University Group of Libraries (...Alison Hitchens
 
Bio ontologies and semantic technologies
Bio ontologies and semantic technologiesBio ontologies and semantic technologies
Bio ontologies and semantic technologiesProf. Wim Van Criekinge
 
FedX - Optimization Techniques for Federated Query Processing on Linked Data
FedX - Optimization Techniques for Federated Query Processing on Linked DataFedX - Optimization Techniques for Federated Query Processing on Linked Data
FedX - Optimization Techniques for Federated Query Processing on Linked Dataaschwarte
 
Hack U Barcelona 2011
Hack U Barcelona 2011Hack U Barcelona 2011
Hack U Barcelona 2011Peter Mika
 
Lecture linked data cloud & sparql
Lecture linked data cloud & sparqlLecture linked data cloud & sparql
Lecture linked data cloud & sparqlDhavalkumar Thakker
 
Federated Query Formulation and Processing Through BioFed
Federated Query Formulation and Processing Through BioFedFederated Query Formulation and Processing Through BioFed
Federated Query Formulation and Processing Through BioFedMuhammad Saleem
 
Federated SPARQL query processing over the Web of Data
Federated SPARQL query processing over the Web of DataFederated SPARQL query processing over the Web of Data
Federated SPARQL query processing over the Web of DataMuhammad Saleem
 
Efficient source selection for sparql endpoint federation
Efficient source selection for sparql endpoint federationEfficient source selection for sparql endpoint federation
Efficient source selection for sparql endpoint federationMuhammad Saleem
 
Aidan's PhD Viva
Aidan's PhD VivaAidan's PhD Viva
Aidan's PhD VivaAidan Hogan
 
Linking the Open Data? by Petko Valtchev
Linking the Open Data? by Petko ValtchevLinking the Open Data? by Petko Valtchev
Linking the Open Data? by Petko ValtchevTrudat
 
Bringing It All Together: Mapping Continuing Resources Vocabularies for Linke...
Bringing It All Together: Mapping Continuing Resources Vocabularies for Linke...Bringing It All Together: Mapping Continuing Resources Vocabularies for Linke...
Bringing It All Together: Mapping Continuing Resources Vocabularies for Linke...NASIG
 
Querying Linked Data on Android
Querying Linked Data on AndroidQuerying Linked Data on Android
Querying Linked Data on AndroidEUCLID project
 
Federated SPARQL Query Processing ISWC2015 Tutorial
Federated SPARQL Query Processing ISWC2015 TutorialFederated SPARQL Query Processing ISWC2015 Tutorial
Federated SPARQL Query Processing ISWC2015 TutorialMuhammad Saleem
 
Information-rich programming in F# (ML Workshop 2012)
Information-rich programming in F# (ML Workshop 2012)Information-rich programming in F# (ML Workshop 2012)
Information-rich programming in F# (ML Workshop 2012)Tomas Petricek
 

Was ist angesagt? (20)

Another RDF Encoding Form
Another RDF Encoding FormAnother RDF Encoding Form
Another RDF Encoding Form
 
Contexts and Importing in RDF
Contexts and Importing in RDFContexts and Importing in RDF
Contexts and Importing in RDF
 
DBpedia Citation Challenge. (Not only) Polish Citations in Wikipedia: analysi...
DBpedia Citation Challenge. (Not only) Polish Citations in Wikipedia: analysi...DBpedia Citation Challenge. (Not only) Polish Citations in Wikipedia: analysi...
DBpedia Citation Challenge. (Not only) Polish Citations in Wikipedia: analysi...
 
RDA: thinking globally, acting globally
RDA: thinking globally, acting globallyRDA: thinking globally, acting globally
RDA: thinking globally, acting globally
 
OWL: Yet to arrive on the Web of Data?
OWL: Yet to arrive on the Web of Data?OWL: Yet to arrive on the Web of Data?
OWL: Yet to arrive on the Web of Data?
 
Changing Data: Implementing Primo for the Tri University Group of Libraries (...
Changing Data: Implementing Primo for the Tri University Group of Libraries (...Changing Data: Implementing Primo for the Tri University Group of Libraries (...
Changing Data: Implementing Primo for the Tri University Group of Libraries (...
 
Memento 101
Memento 101Memento 101
Memento 101
 
Bio ontologies and semantic technologies
Bio ontologies and semantic technologiesBio ontologies and semantic technologies
Bio ontologies and semantic technologies
 
FedX - Optimization Techniques for Federated Query Processing on Linked Data
FedX - Optimization Techniques for Federated Query Processing on Linked DataFedX - Optimization Techniques for Federated Query Processing on Linked Data
FedX - Optimization Techniques for Federated Query Processing on Linked Data
 
Hack U Barcelona 2011
Hack U Barcelona 2011Hack U Barcelona 2011
Hack U Barcelona 2011
 
Lecture linked data cloud & sparql
Lecture linked data cloud & sparqlLecture linked data cloud & sparql
Lecture linked data cloud & sparql
 
Federated Query Formulation and Processing Through BioFed
Federated Query Formulation and Processing Through BioFedFederated Query Formulation and Processing Through BioFed
Federated Query Formulation and Processing Through BioFed
 
Federated SPARQL query processing over the Web of Data
Federated SPARQL query processing over the Web of DataFederated SPARQL query processing over the Web of Data
Federated SPARQL query processing over the Web of Data
 
Efficient source selection for sparql endpoint federation
Efficient source selection for sparql endpoint federationEfficient source selection for sparql endpoint federation
Efficient source selection for sparql endpoint federation
 
Aidan's PhD Viva
Aidan's PhD VivaAidan's PhD Viva
Aidan's PhD Viva
 
Linking the Open Data? by Petko Valtchev
Linking the Open Data? by Petko ValtchevLinking the Open Data? by Petko Valtchev
Linking the Open Data? by Petko Valtchev
 
Bringing It All Together: Mapping Continuing Resources Vocabularies for Linke...
Bringing It All Together: Mapping Continuing Resources Vocabularies for Linke...Bringing It All Together: Mapping Continuing Resources Vocabularies for Linke...
Bringing It All Together: Mapping Continuing Resources Vocabularies for Linke...
 
Querying Linked Data on Android
Querying Linked Data on AndroidQuerying Linked Data on Android
Querying Linked Data on Android
 
Federated SPARQL Query Processing ISWC2015 Tutorial
Federated SPARQL Query Processing ISWC2015 TutorialFederated SPARQL Query Processing ISWC2015 Tutorial
Federated SPARQL Query Processing ISWC2015 Tutorial
 
Information-rich programming in F# (ML Workshop 2012)
Information-rich programming in F# (ML Workshop 2012)Information-rich programming in F# (ML Workshop 2012)
Information-rich programming in F# (ML Workshop 2012)
 

Andere mochten auch

Open Education Challenge 2014: exploiting Linked Data in Educational Applicat...
Open Education Challenge 2014: exploiting Linked Data in Educational Applicat...Open Education Challenge 2014: exploiting Linked Data in Educational Applicat...
Open Education Challenge 2014: exploiting Linked Data in Educational Applicat...Stefan Dietze
 
Evaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and TweetsEvaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and TweetsMarieke van Erp
 
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataIntroduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataSören Auer
 
DBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataDBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataSebastian Hellmann
 
LDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked DataLDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked DataOlaf Hartig
 
Fast Approximate A-box Consistency Checking using Machine Learning
Fast Approximate  A-box Consistency Checking using Machine LearningFast Approximate  A-box Consistency Checking using Machine Learning
Fast Approximate A-box Consistency Checking using Machine LearningHeiko Paulheim
 
Applying Linked Open Data to Public Procurement
Applying Linked Open Data to Public ProcurementApplying Linked Open Data to Public Procurement
Applying Linked Open Data to Public ProcurementJindřich Mynarz
 
Exploiting the query structure for efficient join ordering in SPARQL queries
Exploiting the query structure for efficient join ordering in SPARQL queriesExploiting the query structure for efficient join ordering in SPARQL queries
Exploiting the query structure for efficient join ordering in SPARQL queriesLuiz Henrique Zambom Santana
 
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...Data Mining with Background Knowledge from the Web - Introducing the RapidMin...
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...Heiko Paulheim
 
A Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud
A Provenance assisted Roadmap for Life Sciences Linked Open Data CloudA Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud
A Provenance assisted Roadmap for Life Sciences Linked Open Data CloudSyed Muhammad Ali Hasnain
 
Unsupervised Extraction of Attributes and Their Values from Product Description
Unsupervised Extraction of Attributes and Their Values from Product DescriptionUnsupervised Extraction of Attributes and Their Values from Product Description
Unsupervised Extraction of Attributes and Their Values from Product DescriptionRakuten Group, Inc.
 
FedViz: A Visual Interface for SPARQL Queries Formulation and Execution
FedViz: A Visual Interface for SPARQL Queries Formulation and ExecutionFedViz: A Visual Interface for SPARQL Queries Formulation and Execution
FedViz: A Visual Interface for SPARQL Queries Formulation and ExecutionSyed Muhammad Ali Hasnain
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Olaf Hartig
 
RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031kwangsub kim
 
Querying Linked Data with SPARQL
Querying Linked Data with SPARQLQuerying Linked Data with SPARQL
Querying Linked Data with SPARQLOlaf Hartig
 
The Future is Federated
The Future is FederatedThe Future is Federated
The Future is FederatedRuben Verborgh
 
Julien Gonçalves: Named entity recognition and disambiguation using an iterat...
Julien Gonçalves: Named entity recognition and disambiguation using an iterat...Julien Gonçalves: Named entity recognition and disambiguation using an iterat...
Julien Gonçalves: Named entity recognition and disambiguation using an iterat...Semantic Web Company
 

Andere mochten auch (20)

DBpedia InsideOut
DBpedia InsideOutDBpedia InsideOut
DBpedia InsideOut
 
Open Education Challenge 2014: exploiting Linked Data in Educational Applicat...
Open Education Challenge 2014: exploiting Linked Data in Educational Applicat...Open Education Challenge 2014: exploiting Linked Data in Educational Applicat...
Open Education Challenge 2014: exploiting Linked Data in Educational Applicat...
 
Evaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and TweetsEvaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and Tweets
 
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataIntroduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
 
Linked Data Fragments
Linked Data FragmentsLinked Data Fragments
Linked Data Fragments
 
NLP todo
NLP todoNLP todo
NLP todo
 
DBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataDBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of Data
 
LDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked DataLDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked Data
 
Fast Approximate A-box Consistency Checking using Machine Learning
Fast Approximate  A-box Consistency Checking using Machine LearningFast Approximate  A-box Consistency Checking using Machine Learning
Fast Approximate A-box Consistency Checking using Machine Learning
 
Applying Linked Open Data to Public Procurement
Applying Linked Open Data to Public ProcurementApplying Linked Open Data to Public Procurement
Applying Linked Open Data to Public Procurement
 
Exploiting the query structure for efficient join ordering in SPARQL queries
Exploiting the query structure for efficient join ordering in SPARQL queriesExploiting the query structure for efficient join ordering in SPARQL queries
Exploiting the query structure for efficient join ordering in SPARQL queries
 
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...Data Mining with Background Knowledge from the Web - Introducing the RapidMin...
Data Mining with Background Knowledge from the Web - Introducing the RapidMin...
 
A Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud
A Provenance assisted Roadmap for Life Sciences Linked Open Data CloudA Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud
A Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud
 
Unsupervised Extraction of Attributes and Their Values from Product Description
Unsupervised Extraction of Attributes and Their Values from Product DescriptionUnsupervised Extraction of Attributes and Their Values from Product Description
Unsupervised Extraction of Attributes and Their Values from Product Description
 
FedViz: A Visual Interface for SPARQL Queries Formulation and Execution
FedViz: A Visual Interface for SPARQL Queries Formulation and ExecutionFedViz: A Visual Interface for SPARQL Queries Formulation and Execution
FedViz: A Visual Interface for SPARQL Queries Formulation and Execution
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
 
RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031
 
Querying Linked Data with SPARQL
Querying Linked Data with SPARQLQuerying Linked Data with SPARQL
Querying Linked Data with SPARQL
 
The Future is Federated
The Future is FederatedThe Future is Federated
The Future is Federated
 
Julien Gonçalves: Named entity recognition and disambiguation using an iterat...
Julien Gonçalves: Named entity recognition and disambiguation using an iterat...Julien Gonçalves: Named entity recognition and disambiguation using an iterat...
Julien Gonçalves: Named entity recognition and disambiguation using an iterat...
 

Ähnlich wie Gathering Alternative Surface Forms for DBpedia Entities

DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinDBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinAnja Jentzsch
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Anja Jentzsch
 
Annotating Scholarly Resources
Annotating Scholarly ResourcesAnnotating Scholarly Resources
Annotating Scholarly ResourcesRobert Sanderson
 
New Directions in Information Organization: A Linked Data Model with BIBFRAME
New Directions in Information Organization: A Linked Data Model with BIBFRAMENew Directions in Information Organization: A Linked Data Model with BIBFRAME
New Directions in Information Organization: A Linked Data Model with BIBFRAMESharonYang
 
Schema.org - An Extending Influence
Schema.org - An Extending InfluenceSchema.org - An Extending Influence
Schema.org - An Extending InfluenceRichard Wallis
 
Intro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsIntro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsJon Voss
 
WikiAsp: A Dataset for Multi-domain Aspect-based Summarization
WikiAsp: A Dataset for Multi-domain Aspect-based SummarizationWikiAsp: A Dataset for Multi-domain Aspect-based Summarization
WikiAsp: A Dataset for Multi-domain Aspect-based SummarizationHiroaki Hayashi
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsKorea Sdec
 
Learning Conflict Resolution Strategies for Cross-Language Wikipedia Data Fusion
Learning Conflict Resolution Strategies for Cross-Language Wikipedia Data FusionLearning Conflict Resolution Strategies for Cross-Language Wikipedia Data Fusion
Learning Conflict Resolution Strategies for Cross-Language Wikipedia Data FusionVolha Bryl
 
Type Inference on Noisy RDF Data
Type Inference on Noisy RDF DataType Inference on Noisy RDF Data
Type Inference on Noisy RDF DataHeiko Paulheim
 
Thesis Proposal: User Application Profiles for Publishing Linked Data in HTM...
Thesis Proposal: User Application Profiles for Publishing Linked Data in  HTM...Thesis Proposal: User Application Profiles for Publishing Linked Data in  HTM...
Thesis Proposal: User Application Profiles for Publishing Linked Data in HTM...Sean Petiya
 
Schema.org - Extending Benefits
Schema.org - Extending BenefitsSchema.org - Extending Benefits
Schema.org - Extending BenefitsRichard Wallis
 
Machine-In-The-Loop for Knowledge Discovery
Machine-In-The-Loop for Knowledge DiscoveryMachine-In-The-Loop for Knowledge Discovery
Machine-In-The-Loop for Knowledge Discoveryodsc
 
Importing life science at a into Neo4j
Importing life science at a into Neo4jImporting life science at a into Neo4j
Importing life science at a into Neo4jSimon Jupp
 
ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+ ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+ Prateek Jain
 
The web of interlinked data and knowledge stripped
The web of interlinked data and knowledge strippedThe web of interlinked data and knowledge stripped
The web of interlinked data and knowledge strippedSören Auer
 

Ähnlich wie Gathering Alternative Surface Forms for DBpedia Entities (20)

DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinDBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)
 
Annotating Scholarly Resources
Annotating Scholarly ResourcesAnnotating Scholarly Resources
Annotating Scholarly Resources
 
NLP & DBpedia
 NLP & DBpedia NLP & DBpedia
NLP & DBpedia
 
New Directions in Information Organization: A Linked Data Model with BIBFRAME
New Directions in Information Organization: A Linked Data Model with BIBFRAMENew Directions in Information Organization: A Linked Data Model with BIBFRAME
New Directions in Information Organization: A Linked Data Model with BIBFRAME
 
Schema.org - An Extending Influence
Schema.org - An Extending InfluenceSchema.org - An Extending Influence
Schema.org - An Extending Influence
 
Intro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsIntro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & Museums
 
Gathering Alternative Surface Forms for DBpedia Entities

  • 1. Gathering Alternative Surface Forms for DBpedia Entities Volha Bryl University of Mannheim, Germany  Springer Nature Christian Bizer, Heiko Paulheim University of Mannheim, Germany NLP & DBpedia @ ISWC 2015, Bethlehem, USA, October 11, 2015
  • 2. Why you need Surface Forms
      • The surface forms (SFs) of an entity are the strings by which it can be referred to: synonyms, alternative names, etc.
      • Used to support many NLP tasks: co-reference resolution, entity linking, disambiguation
  • 3. Why you need Surface Forms (example)
      “Billionaire Elon Musk has spelled out how he plans to create temporary suns over Mars in order to heat the Red Planet. Dismissing earlier comments that he intended to nuke the planet’s surface, he says he wants to create aerial explosions to heat it up.”
      • To link the three mentions, your machine should know that “Red Planet” is an alternative name for Mars, and that Mars can be referred to just by its type: “planet”
  • 4. Surface Forms from Wiki(DB)pedia
      • Some of Wikipedia’s (hence, DBpedia’s) crowd-sourced content looks quite like surface forms
      • Page titles
      • Redirects
          • Account for alternative names, word forms (e.g. plurals), closely related words, abbreviations, alternative spellings, likely misspellings, subtopics
      • Disambiguation pages
          • There are 10+ Bethlehems in the US, according to https://en.wikipedia.org/wiki/Bethlehem_(disambiguation)
      • Anchor texts of links between wiki pages
          • Rendered: Named after the Roman god of war, it is often referred to as the “Red Planet”...
          • Source: Named after the [[Mars (mythology)|Roman god of war]], it is often referred to as the "Red Planet"
      • Additionally, we use anchor texts of links from external pages to Wikipedia
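As a rough illustration of the anchor-text source described on slide 4, the sketch below pulls (target page, anchor text) pairs out of a wikitext string. It is not the DBpedia extractor mentioned later in the talk; the regular expression and the function name `anchor_surface_forms` are assumptions made for this example, and real wikitext (templates, nested links, namespaces) needs a proper parser.

```python
import re
from collections import Counter

# Simplified pattern for [[Target]] and [[Target|anchor text]] links.
# An optional "#section" part after the target is skipped.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]*))?\]\]")


def anchor_surface_forms(wikitext):
    """Count (target page, anchor text) pairs found in a wikitext string."""
    counts = Counter()
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip()
        # A link without a pipe uses the page title itself as its anchor.
        anchor = (match.group(2) or target).strip()
        if target and anchor:
            counts[(target, anchor)] += 1
    return counts


sample = (
    "Named after the [[Mars (mythology)|Roman god of war]], "
    'it is often referred to as the "Red Planet".'
)
pairs = anchor_surface_forms(sample)
for (target, anchor), n in pairs.items():
    print(f"{anchor!r} -> {target!r} ({n}x)")
```

The counts matter later: the frequency-based filter on slide 15 relies on how often each anchor text is used for a page.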
  • 5. Surface Forms from Wiki(DB)pedia
      • Not a new idea
          • BabelNet, DBpedia Spotlight, … [see our paper for more links]
      [Figure: Mars in BabelNet]
  • 6. Surface Forms from Wiki(DB)pedia
      • Problem: Quality
          • It is not only that quality is a problem; it has also never been assessed or addressed
          • Reason 1: good quality of Wikipedia content is taken for granted
          • Reason 2: hopes are that NLP algorithms won’t be influenced by noise
  • 7. Surface Forms from Wiki(DB)pedia
      • Problem: Quality. Why?
          • When adding a redirect or the anchor text of an internal Wikipedia link, an editor might mean not only “same as” or “also known as”, but also “related to”, “contains”, etc.
          • Both variants serve the purpose of pointing to the correct wiki page
  • 8. Solution: Focus on Quality
      • Step 1: Extract
          • We extract SFs from Wikipedia labels, redirects, disambiguations, and anchor texts of internal wiki-links
      • Step 2: Evaluate
          • We create a gold standard to evaluate SF quality
      • Step 3: Filter
          • We implement three filters to improve SF quality
      • Bonus: More SFs
          • We extract SFs from anchor texts of Wiki links found in the Common Crawl 2014 corpus
      • All datasets are available at http://data.dws.informatik.uni-mannheim.de/dbpedia/nlp2014/
  • 9. SFs Dataset Statistics
      • LRD = Labels, Redirects, Disambiguations
          • Extracted from DBpedia dumps
      • WAT = Wikipedia Anchor Texts
          • Extracted by a new DBpedia extractor (based on PageLinksExtractor)
  • 10. Gold Standard
      • Manual annotation, 1 annotator, 2 subsets
      • Popular subset: 34 manually selected popular entities of different types
          • Denmark, Berlin, Apple Inc., Animal Farm, Michael Jackson, Star Wars, Diego Maradona, Mars, etc.
          • ~82 SFs per entity, linked from other Wiki pages 813,736 times
      • Random subset: 81 randomly selected entities, each having at least 5 SFs
          • Andy Zaltzman, Bell AH-1 SuperCobra, Biarritz, Castellum, Firefox (film), Kipchak languages, ParisTech, Psychokinesis, etc.
          • ~13 SFs per entity, linked from other Wiki pages 14,760 times
      • Available at http://data.dws.informatik.uni-mannheim.de/dbpedia/nlp2014/gold/
  • 11. Gold Standard
      • Types of annotation
          • correct (“the eternal city” for Rome)
          • contained (“Google Japan” for Google), contains (“Turkey” for Istanbul)
          • type of (“the city” for Rome)
          • partial (“Diego” for Diego Maradona)
          • related (“Google Blog” for Google)
          • wrong (“during World War I” for United States)
  • 12. Evaluation: How many correct SFs?
      • SFs extracted from labels, redirects, disambiguations
          • correct, popular subset: 66.8%
          • correct, random subset: 86.6%
      • SFs extracted from Wikipedia anchor texts
          • correct, popular subset: 38.5%
          • correct, random subset: 70.7%
      • Combined dataset
          • correct, popular subset: 45.7%
          • correct, random subset: 75%
  • 13. (1) Filtering: String Patterns
      • Data analysis → wrong SFs follow recognizable patterns
          • URLs: contain .com or .net (“Berlin-china.net” for Berlin)
          • of-phrases, with exceptions for “city of”, “state of”, and the like (“Issues of Toronto” for Toronto)
          • in-phrases (“Historical sites in Berlin” for Berlin)
          • and-phrases (“Tom Cruise and Katie Holmes” for Tom Cruise)
          • list-of (“List of Toronto MPs and MPPs” for Toronto)
      • Increase in precision
          • popular subset: 1.33%
          • popular subset, LRD only: 3.75%
          • random subset: less than 1%
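The patterns on slide 13 can be sketched as a small rule-based check. This is a minimal sketch and not the authors' implementation: the function name `matches_wrong_pattern`, the order of the rules, and the exception list `OF_EXCEPTIONS` (the slide only says “city of, state of, and the like”) are assumptions made for illustration.

```python
import re

# Hypothetical exception list; the talk names only "city of" and "state of".
OF_EXCEPTIONS = ("city of", "state of")


def matches_wrong_pattern(surface_form):
    """Return the name of the first suspicious pattern a surface form matches, or None."""
    sf = surface_form.lower().strip()
    if ".com" in sf or ".net" in sf:
        return "url"
    if sf.startswith("list of "):
        return "list-of"
    if re.search(r"\bof\b", sf) and not any(exc in sf for exc in OF_EXCEPTIONS):
        return "of-phrase"
    if re.search(r"\bin\b", sf):
        return "in-phrase"
    if re.search(r"\band\b", sf):
        return "and-phrase"
    return None


candidates = [
    "Berlin-china.net",
    "Issues of Toronto",
    "Historical sites in Berlin",
    "Tom Cruise and Katie Holmes",
    "List of Toronto MPs and MPPs",
    "City of Toronto",
    "Red Planet",
]
kept = [sf for sf in candidates if matches_wrong_pattern(sf) is None]
print(kept)
```

Rules this crude also reject legitimate names that happen to contain “in” or “and”, which is consistent with the modest precision gains reported on the slide and with the need for further filters.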
  • 14. (2) Filtering: Wikidata
      • Observation: some SFs are entities in their own right in other languages
          • E.g. “Neckarau”, a district of Mannheim, redirects to Mannheim in the English Wikipedia but has its own page in the German Wikipedia
      • Implementation: use the DBpedia-Wikidata dumps released in May 2015
          • Check whether an SF exactly matches, or is close (by Levenshtein distance) to, any label of a Wikidata entity that has no English Wikipedia page but has pages in other languages
      • Increase in precision
          • 0.5% compared to pattern-based filtering
          • 1.5% for SFs extracted only from LRD
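The matching step on slide 14 can be illustrated with a plain edit-distance check. This is a sketch under stated assumptions: the label set `foreign_only_labels` stands in for labels of Wikidata entities without an English page, and `max_distance=1` is an arbitrary choice, since the slide does not state what distance counts as “close”.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def is_separate_entity(surface_form, foreign_only_labels, max_distance=1):
    """True if the surface form matches, or nearly matches, a label of an
    entity that exists only in non-English Wikipedias (hypothetical input)."""
    sf = surface_form.lower()
    return any(
        levenshtein(sf, label.lower()) <= max_distance
        for label in foreign_only_labels
    )


# Toy label set for illustration only.
foreign_only_labels = {"Neckarau"}
print(is_separate_entity("Neckarau", foreign_only_labels))
print(is_separate_entity("Quadratestadt", foreign_only_labels))
```

A surface form flagged this way would be dropped as an SF of the redirect target, because it names a distinct entity rather than an alternative name.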
  • 15. (3) Filtering: Frequency Scores
      • For SFs extracted from anchor texts, frequencies are available → TF-IDF scores
      • Determining the threshold: values from 1.0 to 8.0 evaluated with a step of 0.2
          • Two thresholds selected, those with the highest F1 values: 1.8 and 2.6
          • Threshold 0 (no filtering) used as baseline
      • Increase in precision
          • 20% for popular subset, 10% for random subset
      • Note: filtering is done on the dataset to which the pattern- and Wikidata-based filters are already applied
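The threshold search on slide 15 amounts to a sweep over candidate values, scoring each against the gold standard. The sketch below assumes that SFs scoring at or above the threshold are kept and that F1 is computed against the set of SFs annotated as correct; the scores and gold labels are made-up toy values, not figures from the talk.

```python
def f1_at_threshold(scored, gold_correct, threshold):
    """F1 of the SFs kept at a threshold, measured against the gold 'correct' set."""
    kept = {sf for sf, score in scored.items() if score >= threshold}
    true_positives = len(kept & gold_correct)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(kept)
    recall = true_positives / len(gold_correct)
    return 2 * precision * recall / (precision + recall)


def sweep(scored, gold_correct, start=1.0, stop=8.0, step=0.2):
    """Evaluate every threshold from start to stop (inclusive) in the given steps."""
    n_steps = int(round((stop - start) / step))
    thresholds = [round(start + i * step, 1) for i in range(n_steps + 1)]
    return [(t, f1_at_threshold(scored, gold_correct, t)) for t in thresholds]


# Toy scores and gold annotations for illustration only.
scored = {"Red Planet": 5.1, "Mars": 7.9, "Mars rover": 1.4, "Life on Mars": 0.9}
gold_correct = {"Red Planet", "Mars"}

results = sweep(scored, gold_correct)
best = max(results, key=lambda pair: pair[1])
print(f"best threshold: {best[0]}, F1: {best[1]:.2f}")
```

On real data the F1 curve typically has several near-equal peaks, which may be why the talk reports two selected thresholds rather than one.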
  • 16. SFs from Common Crawl
      • Common Crawl (CC) is the largest publicly available web corpus
      • Extraction done on the Winter 2014 CC corpus, in the context of the Web Data Commons project
          • http://webdatacommons.org/: extracting and providing for public download various types of structured data from CC
      • Data required a lot of cleaning
      • 3M SFs added to our LRD&WAT corpus
      • No annotated gold standard: left for future work
      • Available at http://data.dws.informatik.uni-mannheim.de/dbpedia/nlp2014/lrd-cc/
  • 17. Conclusion and Future Work
      • Main message
          • The quality of Wikipedia-based surface forms is often overlooked!
      • Contributions
          • Gold standard SFs, made available
          • 3 filtering strategies: precision improved by > 20% for popular Wikipedia entities and by > 10% for random entities
          • Extracted SFs from the Common Crawl corpus
          • All data publicly available
      • Future work directions
          • Task-based evaluation of the resource, further work on the gold standard