The YAGO-SUMO integration incorporates millions of entities from YAGO, which is based on Wikipedia and WordNet, into the Suggested Upper Merged Ontology (SUMO), a highly axiomatized formal upper ontology. With the combined force of the two ontologies, an enormous, unprecedented corpus of formalized world knowledge is available for automated processing and reasoning, providing information about millions of entities such as people, cities, organizations, and companies.
Compared to the original YAGO, more advanced reasoning is possible thanks to the axiomatic knowledge contributed by SUMO. A reasoner can conclude, for example, that a child of a human must also be a human and cannot be born before its parents, or that two people sharing the same parents must be siblings.
YAGO-SUMO: Integrating YAGO into the Suggested Upper Merged Ontology
Integrating YAGO into the Suggested Upper Merged Ontology
G. de Melo (1), F. Suchanek (1), A. Pease (2)
(1) Max Planck Institute for Informatics, Germany
(2) Articulate Software, USA
2008-11-03
G. de Melo, F. Suchanek, A. Pease: Integrating YAGO into the Suggested Upper Merged Ontology
Outline
1 Introduction
Ontologies and KBs
SUMO
Extending Ontologies
YAGO
2 Approach
Incorporation
Class Information
Statements
3 Conclusion
Ongoing Work
Summary
Introduction
Ontologies/KBs: provide background knowledge for intelligent applications
Schism: formal ontologies vs. large KBs
formal ontologies: complex axioms (e.g. in FOL), but quite small
large-scale KBs (e.g. based on Wikipedia): only simple facts
Goal: Large-scale formal ontology
combine the best of both worlds!
SUMO
Suggested Upper Merged Ontology
large formal ontology (20,000 terms, 70,000 axioms)
axiomatization of general and domain-specific concepts
for applications requiring basic “common sense”
open source
origins: IEEE standard upper ontology group
core owned by IEEE (basically Public Domain), portions GPL
e.g.: OpenCyc doesn’t include axioms of commercial Cyc
peer review, community of experts and users
formal verification with ATP systems
based on KIF rather than e.g. OWL
OWL without additional rules is not very expressive
KIF variant standardized as ISO/IEC IS 24707:2007 (Common Logic)
SUMO Example
(=>
  (and
    (parent ?CHILD ?PARENT)
    (subclass ?CLASS Organism)
    (instance ?PARENT ?CLASS))
  (instance ?CHILD ?CLASS))
This implies, for example, that a child of a Human is also a Human.
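To make the effect of this rule concrete, here is a minimal Python sketch that applies it by naive forward chaining. It is not how SUMO's theorem provers work; the function names and the sample facts (Alice, Bob, Carol, the small subclass chain) are assumptions made up for illustration.

```python
# Toy forward chaining for the single SUMO rule above.
# Every fact used with these functions is an illustrative assumption.

def subclasses_of(subclass_facts, root):
    """Return `root` plus every class that is a (transitive) subclass of it."""
    found = {root}
    changed = True
    while changed:
        changed = False
        for child, parent in subclass_facts:
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def infer_instances(parent_facts, instance_facts, subclass_facts):
    """Apply the rule until no new (entity, class) pair can be derived.

    parent_facts holds (child, parent) pairs, mirroring (parent ?CHILD ?PARENT).
    """
    organism_classes = subclasses_of(subclass_facts, "Organism")
    derived = set(instance_facts)
    changed = True
    while changed:
        changed = False
        for child, parent in parent_facts:
            for entity, cls in list(derived):
                if entity == parent and cls in organism_classes:
                    if (child, cls) not in derived:
                        derived.add((child, cls))
                        changed = True
    return derived


subclass_facts = [("Human", "Primate"), ("Primate", "Organism")]
parent_facts = [("Alice", "Bob"), ("Carol", "Alice")]  # (child, parent)
derived = infer_instances(parent_facts, {("Bob", "Human")}, subclass_facts)
```

With these sample facts, class membership propagates from Bob to his child Alice and then to her child Carol. A parent whose class is not below Organism would not trigger the rule.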
SUMO
additional domain ontologies, e.g. for geography, finance, transportation
however, SUMO is mainly an upper ontology
not enough instances and ground facts, e.g. people, cities, books
Extending Ontologies: Possible Approaches
Manual work: slow process, low coverage; Semantic Wikis not yet accepted enough
Information extraction from corpora / the Web: low accuracy; not canonical / in line with upper ontology
Import from existing databases: feasible, but not universal enough
YAGO
combine entities and facts from Wikipedia with an upper ontology
excellent coverage: around 2 million entities, millions of facts about them
high quality: e.g. birth dates of people, location of cities
original YAGO: WordNet for the upper level
WordNet is mainly a lexical knowledge base
e.g. hyponymic relationships do not strictly imply subsumptions
lack of formal axioms
New goal: integrate with SUMO
so the class information actually is meaningful
Incorporation
Idea: most Wikipedia articles become new entities
Semi-automatic matching: although SUMO contains only few instances, some degree of overlap exists
use weighted string similarity measure
additional manual validation
→ equivalence table
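The slides do not say which weighted measure was used. As a rough sketch of the matching step, the following Python code combines a character-level ratio from `difflib` with token overlap and proposes candidate pairs for manual validation. The weights, the threshold, and the function names are assumptions chosen for illustration only.

```python
import re
from difflib import SequenceMatcher


def tokens(name):
    """Split 'UnitedStates' or 'United_States' into lowercase word tokens."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name.replace("_", " "))
    return spaced.lower().split()


def similarity(wiki_title, sumo_term, w_char=0.5, w_token=0.5):
    """Weighted mix of character similarity and token overlap (assumed weights)."""
    a, b = tokens(wiki_title), tokens(sumo_term)
    char_score = SequenceMatcher(None, " ".join(a), " ".join(b)).ratio()
    union = set(a) | set(b)
    token_score = len(set(a) & set(b)) / len(union) if union else 0.0
    return w_char * char_score + w_token * token_score


def propose_matches(wiki_titles, sumo_terms, threshold=0.8):
    """Return (title, term, score) candidates that still need manual validation."""
    candidates = []
    for title in wiki_titles:
        best = max(sumo_terms, key=lambda term: similarity(title, term))
        score = similarity(title, best)
        if score >= threshold:
            candidates.append((title, best, score))
    return candidates
```

Pairs that pass the threshold are only proposals. The equivalence table would contain those that survive the manual check.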
Entity Generation: produce a new unique term name for each Wikipedia article not listed in the equivalence table, subject to the following desiderata:
prevent clashes with SUMO or other entities
conciseness
abide by KIF syntax (Wikipedia uses Unicode)
must be a proper entity (not: “List of ...”)
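One way to satisfy these desiderata is sketched below in Python. The concrete conventions (CamelCase names, dropping diacritics, a numeric suffix on clashes, an "N" prefix for titles starting with a digit) are assumptions for illustration and are not taken from the slides.

```python
import re
import unicodedata


def make_term(title, taken):
    """Return a fresh ASCII term name for a Wikipedia title, or None to skip it.

    `taken` is the set of names already used by SUMO or by other entities;
    it is updated in place.
    """
    if title.startswith("List of "):
        return None  # not a proper entity
    # Strip diacritics, then drop any characters that are still non-ASCII.
    ascii_title = (unicodedata.normalize("NFKD", title)
                   .encode("ascii", "ignore").decode("ascii"))
    words = re.findall(r"[A-Za-z0-9]+", ascii_title)
    if not words:
        return None
    term = "".join(w[0].upper() + w[1:] for w in words)
    if term[0].isdigit():
        term = "N" + term  # assumed convention: names start with a letter
    candidate, n = term, 2
    while candidate in taken:
        candidate = f"{term}_{n}"  # assumed suffix scheme for clashes
        n += 1
    taken.add(candidate)
    return candidate
```

For example, the title "Hervey de Stanton" would become `HerveyDeStanton`, and a second article that produces an already used name would receive a numbered suffix.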
Class Information
YAGO: From Wikipedia to WordNet
goal: each entity should have class membership information
use Wikipedia category system, however cannot use it directly
categorization not transitive
members of subcategories often unrelated to parent category
first link categories to WordNet, then map to SUMO
requirement: distinguish thematic categories from categories encoding class membership
check WordNet for premodifier + headword or headword only
disambiguate using frequency information
result: relationship to WordNet-derived class
e.g. “American singers of German origin” becomes linked as a subclass to the WordNet-derived class Person
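The lookup step can be sketched as follows in Python. This is a simplification under stated assumptions: the head noun group is taken to be everything before the first preposition, plural stripping is very rough, and `TOY_WORDNET` with its sense identifiers and frequency counts is an invented stand-in for WordNet. The sketch returns the chosen sense only; reaching a broader class such as Person would require following hypernym links afterwards.

```python
PREPOSITIONS = {"of", "in", "from", "by", "with", "at", "for", "on"}

# Invented stand-in for WordNet: lemma -> list of (sense id, frequency count).
TOY_WORDNET = {
    "singer": [("singer.n.01", 40), ("singer.n.02", 3)],
    "capital city": [("capital_city.n.01", 5)],
    "city": [("city.n.01", 90)],
}


def singular(word):
    """Very rough plural stripping; a real system would use a lemmatizer."""
    if word.endswith("ies"):
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def link_category(category, wordnet=TOY_WORDNET):
    """Return the most frequent sense for the category's head, or None."""
    words = category.lower().split()
    # Keep only the words before the first preposition.
    for i, word in enumerate(words):
        if word in PREPOSITIONS:
            words = words[:i]
            break
    if not words:
        return None
    head = singular(words[-1])
    premodifier = words[-2] if len(words) > 1 else None
    # Try "premodifier + headword" first, then the headword alone.
    lookups = ([f"{premodifier} {head}"] if premodifier else []) + [head]
    for lemma in lookups:
        senses = wordnet.get(lemma)
        if senses:
            return max(senses, key=lambda sense: sense[1])[0]
    return None
```

Trying the longer phrase first prefers a more specific class when one exists, and the frequency comparison picks the dominant sense when the lemma is ambiguous.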
Voting Procedure
problem:
regular polysemy: Wikipedia articles simultaneously cover several metonymically related senses
e.g. Brown University is both a College and a GroupOfPeople
will cause inconsistencies when the axioms are added
solution:
look at top-level branches for each proposed class (locations, artifacts, abstract entities, etc.)
voting procedure to determine most salient branch (ties broken arbitrarily)
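A minimal Python sketch of such a vote is given below. The branch labels and the mapping from classes to branches are invented for illustration and do not reflect the actual SUMO hierarchy. Each proposed class casts one vote for its branch, and only the classes in the winning branch are kept.

```python
from collections import Counter

# Invented mapping from proposed classes to top-level branches.
TOP_LEVEL_BRANCH = {
    "College": "organization",
    "University": "organization",
    "GroupOfPeople": "group",
    "Campus": "location",
}


def vote(proposed_classes, branch_of=TOP_LEVEL_BRANCH):
    """Return (winning branch, classes kept) for one entity."""
    votes = Counter(branch_of[c] for c in proposed_classes if c in branch_of)
    if not votes:
        return None, []
    # On a tie, most_common returns the branch that was counted first.
    winner, _ = votes.most_common(1)[0]
    kept = [c for c in proposed_classes if branch_of.get(c) == winner]
    return winner, kept
```

Calling `vote(["College", "GroupOfPeople", "University"])` keeps the two organization classes and discards `GroupOfPeople`, so the entity no longer belongs to two conflicting branches.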
From WordNet to SUMO
in further cases, the mappings yield a property or relation
→ create new WordNet-based class, add axioms of the form
(=>
  (instance ?ENTITY Guitarist)
  (property ?ENTITY Musician))
Then recursively move up WordNet’s class hierarchy, adding parent classes, until a genuine parent class in SUMO is available.
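The upward walk can be sketched in Python as follows. The hypernym links, the WordNet-to-SUMO mapping table, and the naming of new classes by capitalizing the lemma are all assumptions for illustration; only the shape of the generated axioms follows the slide.

```python
# Invented toy data: hypernym links and WordNet-to-SUMO mappings.
HYPERNYM = {"guitarist": "instrumentalist", "instrumentalist": "person"}
# kind is "class" for a genuine SUMO class, "attribute" for a property.
SUMO_MAPPING = {
    "guitarist": ("Musician", "attribute"),
    "instrumentalist": ("Musician", "attribute"),
    "person": ("Human", "class"),
}


def climb(synset):
    """Return (new_classes, axioms) created while walking up from `synset`."""
    new_classes, axioms = [], []
    current = synset
    while current is not None:
        term, kind = SUMO_MAPPING.get(current, (None, None))
        if kind == "class":
            # A genuine SUMO class: attach the last new class and stop.
            if new_classes:
                axioms.append(f"(subclass {new_classes[-1]} {term})")
            return new_classes, axioms
        new_class = current.capitalize()  # assumed naming scheme
        if new_classes:
            axioms.append(f"(subclass {new_classes[-1]} {new_class})")
        new_classes.append(new_class)
        if kind == "attribute":
            axioms.append(
                f"(=> (instance ?ENTITY {new_class}) (property ?ENTITY {term}))"
            )
        current = HYPERNYM.get(current)
    return new_classes, axioms
```

For the toy data, `climb("guitarist")` creates the classes `Guitarist` and `Instrumentalist` and ends with a `subclass` link to `Human`, the first genuine class found on the way up.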
Statements
Information Extraction
YAGO uses manual rules and heuristics to extract information about entities from Wikipedia pages
mainly based on categories and infoboxes, not on article text, e.g. geographical location, spouse, etc.
manual rewriting rules to express facts using SUMO’s terms
sample evaluation: for each relation, at least 95% of the statements are accurate
SUMO Integration
mapping rules
new relations added to SUMO when necessary
incl. additional rules for reasoning
(instance establishedOnDate BinaryRelation)
(domain establishedOnDate 1 Agent)
(domain establishedOnDate 2 TimeInterval)
(=> (establishedOnDate ?OBJ ?TIME)
    (exists (?FOUNDING)
        (and (instance ?FOUNDING Founding)
             (result ?FOUNDING ?OBJ)
             (overlapsTemporally (WhenFn ?FOUNDING) ?TIME))))
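A mapping rule of this kind can be pictured as a small rewriting table. The Python sketch below is an assumption about its general shape, not the project's actual rule set: the table entries, the argument-swap flag, and the year handling are hypothetical.

```python
# Hypothetical rewriting table:
# source relation -> (SUMO relation, swap arguments?, second argument is a year?)
REWRITE = {
    "establishedOnDate": ("establishedOnDate", False, True),
    "isMarriedTo": ("spouse", False, False),
    # (parent ?CHILD ?PARENT) lists the child first, so arguments are swapped.
    "hasChild": ("parent", True, False),
}


def to_kif(subject, relation, obj):
    """Rewrite one (subject, relation, object) fact into a KIF statement."""
    rule = REWRITE.get(relation)
    if rule is None:
        return None  # no rule: the fact is left out
    sumo_relation, swap, is_year = rule
    if is_year:
        obj = f"(YearFn {int(obj)})"
    first, second = (obj, subject) if swap else (subject, obj)
    return f"({sumo_relation} {first} {second})"
```

For instance, `to_kif("BrownUniversity", "establishedOnDate", "1764")` yields `(establishedOnDate BrownUniversity (YearFn 1764))`, to which the reasoning rule above can then be applied.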
Statements with Literals
proper encoding of literals with units:
e.g. (MeasureFn 3.0 SquareMeter)
date ranges are recast
(exists (?DAYNO ?MONTHNO ?YEARNO)
  (and
    (birthdate HerveyDeStanton
      (DayFn ?DAYNO
        (MonthFn ?MONTHNO
          (YearFn ?YEARNO))))
    (greaterThanOrEqualTo ?YEARNO 1270)
    (lessThanOrEqualTo ?YEARNO 1279)))
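Statements like these can be generated mechanically. The Python sketch below assumes, purely for illustration, that an imprecise year is written with `#` placeholders such as `127#`; the function names are hypothetical as well.

```python
def year_range(pattern):
    """Expand '#' placeholders: '127#' -> (1270, 1279), '1270' -> (1270, 1270)."""
    low = int(pattern.replace("#", "0"))
    high = int(pattern.replace("#", "9"))
    return low, high


def birthdate_range_kif(term, year_pattern):
    """Build the existential KIF statement for an imprecise birth year."""
    low, high = year_range(year_pattern)
    return "\n".join([
        "(exists (?DAYNO ?MONTHNO ?YEARNO)",
        "  (and",
        f"    (birthdate {term}",
        "      (DayFn ?DAYNO (MonthFn ?MONTHNO (YearFn ?YEARNO))))",
        f"    (greaterThanOrEqualTo ?YEARNO {low})",
        f"    (lessThanOrEqualTo ?YEARNO {high})))",
    ])


def measure_kif(value, unit):
    """Encode a numeric literal together with its unit."""
    return f"(MeasureFn {float(value)} {unit})"
```

Calling `birthdate_range_kif("HerveyDeStanton", "127#")` reproduces the shape of the statement above, and `measure_kif(3, "SquareMeter")` gives `(MeasureFn 3.0 SquareMeter)`.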
Additional Grounding
statements of the form
(representsInLanguage "Immanuel Kant" ImmanuelKant EnglishLanguage)
produce a greater level of formal grounding of the semantics of term names
when names are ambiguous, providing such symbolic strings for multiple languages can further reduce the range of possible interpretations
classes are better-specified due to their extensional characterization
Summary
SUMO: axiomatic representation of common sense knowledge
but lack of simple encyclopedic facts
YAGO methodology: add entities and statements about them from Wikipedia
semi-automatic techniques, basic amount of manual work
→ formal ontology with around two million entities and several million statements and axioms
SUMO is catapulted from an upper-level ontology to a full-fledged all-purpose KB
Open source, available online: http://www.demelo.org/yagosumo/