OTTHO (On the Tip of my THOught) is an information-seeking system designed to solve a language game that demands knowledge covering a broad range of topics, such as movies, politics, literature, history, proverbs, and popular culture. OTTHO implements a knowledge infusion process that provides the background knowledge needed for a deeper understanding of the items it deals with. The process consists of two steps: 1) extracting and modeling relationships between words found in several knowledge sources; 2) reasoning on the induced models in order to generate new knowledge. OTTHO extracts knowledge from sources such as a dictionary, news, Wikipedia, and various unstructured repositories, and builds a memory of linguistic knowledge and world facts. Starting from external stimuli (e.g. words) that depend on the task to be accomplished, the reasoning mechanism retrieves specific pieces of knowledge from that memory. Beyond solving a language game, OTTHO has potential for more practical applications: it could be used to implement an alternative paradigm for associative information retrieval, as well as for computational advertising and recommender systems.
1. OTTHO: An Artificial Player for a Complex Language Game
Giovanni Semeraro, Pasquale Lops, Marco de Gemmis, Pierpaolo Basile
Semantic Web Access and Personalization research group
http://www.di.uniba.it/~swap
Popularize Artificial Intelligence: AI*IA Workshop and Prize for celebrating the 100th anniversary of Alan Turing's birth
Rome, 15th June, 2012
2. Knowledge Infusion (KI): Motivation
Humans typically have the linguistic and cultural
experience to comprehend the meaning of a text
• abstraction from words to concepts
• recall associations between concepts by exploiting
background knowledge (associative retrieval)
3. Knowledge Infusion
How can these capabilities be realized in machines?
Knowledge Infusion (KI) = The process of providing a system
with the background knowledge which allows a deeper
understanding of the information it deals with
• which knowledge sources?
• which reasoning strategies?
KI implemented in the domain of language games
• fundamental role of word meanings and reasoning capabilities
OTTHO: On the Tip of my THOught [Sem09, Sem11]
• an artificial player based on KI for the Guillotine game
[Sem09] G. Semeraro, P. Lops, P. Basile, and M. de Gemmis. On the Tip of my Thought: Playing the Guillotine Game. In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI 2009), 1543-1548, Morgan Kaufmann, 2009.
[Sem11] G. Semeraro, M. de Gemmis, P. Lops, and P. Basile. Knowledge Infusion from Open Knowledge Sources: an Artificial Player for a Language Game. IEEE Intelligent Systems. In press.
4. The game
TRY!
SIN: APPLE is the symbol of the original sin in the Book of Genesis
NEWTON: Isaac Newton discovered gravity by means of an APPLE
DOCTOR: “an APPLE a day keeps the doctor away” is a proverb
PIE: APPLE pie is a fruit cake
NEW YORK: New York City is called “the Big APPLE”
5. Knowledge Infusion: an NLP-AI task
NLP techniques process the unstructured information stored in several
(open) knowledge sources
• the memory of the system
Spreading Activation [And83] as the reasoning mechanism
• the brain of the system
Cultural and Linguistic
Background Knowledge
[And83] J. R. Anderson. A Spreading Activation Theory of Memory. Journal of Verbal Learning and Verbal
Behavior, 22:261–295, 1983.
6. Knowledge Sources
• Encyclopedia: the Italian version of Wikipedia
• Dictionary: the De Mauro Paravia Italian on-line dictionary
• Movies: descriptions of Italian movies crawled from IMDb
• Books: crawled from the web
• Songs: crawled from the web
• Proverbs and Aphorisms: the Italian version of Wikiquote
• Compound forms: groups of words that often go together and have a specific meaning, e.g. “artificial intelligence”; crawled from the web
7. Encoding a Knowledge Source as
Cognitive Unit Repository
Information in long term memory of human
beings is encoded as Cognitive Units – ACT
theory [And83]
Cognitive Unit (CU) = textual description of a
concept
• HEAD = words identifying the concept represented by the CU
• BODY = words describing the concept
• [HEAD | BODY]
8. Encoding a Knowledge Source as
Cognitive Unit Repository
HEAD: Artificial 0.77, Intelligence 1.22
BODY: AI 1.22, intelligence 1.10, computer 0.99, engineering 0.65, machine 0.55, mind 0.49, …
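The [HEAD | BODY] encoding can be sketched as a small data structure. This is only an illustration: the class name `CognitiveUnit` and its `render` method are hypothetical and are not taken from OTTHO; the words and weights are the ones shown on the slide.

```python
from dataclasses import dataclass


@dataclass
class CognitiveUnit:
    """A [HEAD | BODY] pair; each side maps a word to its weight."""
    head: dict
    body: dict

    def render(self) -> str:
        # Print the unit in the slide's "[HEAD | BODY]" notation.
        def side(words):
            return " ".join(f"{w} {wt:.2f}" for w, wt in words.items())
        return f"[{side(self.head)} | {side(self.body)}]"


# The example unit from the slide.
cu = CognitiveUnit(
    head={"artificial": 0.77, "intelligence": 1.22},
    body={"AI": 1.22, "intelligence": 1.10, "computer": 0.99,
          "engineering": 0.65, "machine": 0.55, "mind": 0.49},
)
print(cu.render())
```

Keeping HEAD and BODY as separate word-to-weight maps preserves the distinction between the words that identify a concept and the words that describe it.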
9. CU repositories can be queried
Query: Machine Intelligence
[Figure: the query is run against the Cognitive Units; the relevant CUs are returned with relevance scores (0.85, 0.52, 0.46, ...). One of the CUs shown is [artificial 0.77 intelligence 1.22 | AI 1.22 intelligence 1.10 computer 0.99 engineering 0.65 machine 0.55 mind 0.49 ...].]
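A minimal sketch of such a query follows. The slides do not state how the relevance score is computed, so cosine similarity between a binary query vector and the CU's word weights is used here purely as a stand-in. The function names, the merging of HEAD and BODY into one weight map, and the two extra units `CU_sewing` and `CU_cake` are all hypothetical.

```python
import math


def relevance(query_words, cu_weights):
    """Cosine similarity between a binary query vector and a CU's weights
    (an assumed scoring function, not the one used by OTTHO)."""
    dot = sum(cu_weights.get(w, 0.0) for w in query_words)
    norm_cu = math.sqrt(sum(v * v for v in cu_weights.values()))
    norm_q = math.sqrt(len(query_words))
    return dot / (norm_cu * norm_q) if norm_cu and norm_q else 0.0


def search(query, repository, top_k=3):
    """Return (score, name) pairs for the CUs matching the query, best first."""
    words = set(query.lower().split())
    scored = [(relevance(words, weights), name)
              for name, weights in repository.items()]
    hits = [pair for pair in scored if pair[0] > 0]
    return sorted(hits, key=lambda pair: -pair[0])[:top_k]


# Toy repository: the first unit uses the slide's weights, the others are made up.
REPOSITORY = {
    "CU_ai": {"artificial": 0.77, "intelligence": 1.22, "ai": 1.22,
              "computer": 0.99, "engineering": 0.65, "machine": 0.55,
              "mind": 0.49},
    "CU_sewing": {"sewing": 1.0, "machine": 1.2, "needle": 0.8},
    "CU_cake": {"cake": 1.0, "sugar": 0.7},
}

results = search("Machine Intelligence", REPOSITORY)
for score, name in results:
    print(f"{name}: {score:.2f}")
```

Units that share no word with the query score zero and are dropped, so only relevant CUs reach the ranked list.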
10. What does OTTHO know about clues?
[Figure: the five clues (CLUE#1 to CLUE#5) are submitted to the knowledge repository (Wikipedia, Dictionary, Movies, Wikiquote, ...); the retrieved knowledge feeds the Spreading Activation Net, which produces the candidate solutions list (SOL-WORD1, SOL-WORD2, ...).]
11. Building the Spreading Activation
Network - SAN
Nodes represent CUs or words associated with
CUs
Links labeled with weights
• Link: association between a CU and words
• Weight: strength of the association
SAN populated by running n expansion phases
starting from clues
12. SAN for 2 clues and 2 knowledge
sources
Clues: Newton, Sin
OTTHO knowledge repository: Wikipedia, Dictionary
Retrieved CUs with their relevance scores:
CU14 = [isaac 1.34 newton 1.55 | gravitation 1.66 apple 1.52] (relevance score 0.92)
CU16 = [newton 1.55 | unit 0.77 force 0.65 mechanics 0.35] (relevance score 0.75)
CU7 = [newton 1.87 | unit 1.02 force 0.75] (relevance score 0.72)
CU2 = [sin 1.93 | Christianity 1.62 Genesis 1.53 apple 1.45] (relevance score 0.65)
CU24 = [sin 1.54 | transgression 0.54 divine 0.45 law 0.44] (relevance score 0.55)
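The step from clues to CUs to words can be sketched as follows, using the five CUs of this example. Several choices are assumptions made for illustration: `retrieve` is a hypothetical lookup that matches a word against CU heads, the clue-to-CU link reuses the relevance score, and the CU-to-word link reuses the BODY weight. The slides do not specify how OTTHO derives the link weights.

```python
from collections import defaultdict

# name -> (HEAD, BODY, relevance score), copied from the example.
CUS = {
    "CU14": ({"isaac": 1.34, "newton": 1.55},
             {"gravitation": 1.66, "apple": 1.52}, 0.92),
    "CU16": ({"newton": 1.55},
             {"unit": 0.77, "force": 0.65, "mechanics": 0.35}, 0.75),
    "CU7": ({"newton": 1.87}, {"unit": 1.02, "force": 0.75}, 0.72),
    "CU2": ({"sin": 1.93},
            {"Christianity": 1.62, "Genesis": 1.53, "apple": 1.45}, 0.65),
    "CU24": ({"sin": 1.54},
             {"transgression": 0.54, "divine": 0.45, "law": 0.44}, 0.55),
}


def retrieve(word):
    """Hypothetical lookup: names of the CUs whose HEAD contains the word."""
    return [name for name, (head, _, _) in CUS.items() if word in head]


def build_san(clues, phases=1):
    """Expand the network for a number of phases, starting from the clues."""
    san = defaultdict(dict)  # node -> {neighbour: link weight}
    frontier, seen = list(clues), set()
    for _ in range(phases):
        next_frontier = []
        for word in frontier:
            if word in seen:
                continue
            seen.add(word)
            for name in retrieve(word):
                _, body, score = CUS[name]
                san[word][name] = score          # word -> CU link
                for body_word, weight in body.items():
                    san[name][body_word] = weight  # CU -> word link
                    next_frontier.append(body_word)
        frontier = next_frontier  # words reached now seed the next phase
    return san


san = build_san(["newton", "sin"])
print(sorted(san["newton"]))
```

With more than one phase, the words reached in one phase become the starting points of the next, which is how the network grows beyond the CUs directly tied to the clues.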
13. Spreading over the SAN
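[Figure: Spreading Activation Network for the clues newton and sin. The clue nodes are linked to the CU nodes CU7, CU16, CU14, CU2 and CU24, which are linked by weighted edges to the word nodes unit, force, mechanics, isaac, gravitation, apple, Christianity, Genesis, transgression, divine and law.]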
16. Spreading over the SAN
[Figure: final state of the Spreading Activation Network for the clues newton and sin, with CU nodes CU7, CU16, CU14, CU2 and CU24 and word nodes unit, force, mechanics, isaac, gravitation, apple, Christianity, Genesis, transgression, divine and law.]
STOP: the labels of the most “active” nodes are included in the Candidate Solutions List (CSL)
CSL = [apple, unit, gravitation, force, Christianity]
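The spreading step can be sketched on the two-clue example as follows. This is a simplified illustration, not OTTHO's algorithm: the link weights are taken directly from the relevance scores and BODY weights of the example CUs, and the `decay` factor, the number of steps and the function names are assumptions. The edge weights drawn in the slide figure are different and cannot be recovered from the text.

```python
from collections import defaultdict

# clue -> CU uses the relevance score, CU -> word uses the BODY weight (assumption).
EDGES = {
    "newton": {"CU14": 0.92, "CU16": 0.75, "CU7": 0.72},
    "sin": {"CU2": 0.65, "CU24": 0.55},
    "CU14": {"gravitation": 1.66, "apple": 1.52},
    "CU16": {"unit": 0.77, "force": 0.65, "mechanics": 0.35},
    "CU7": {"unit": 1.02, "force": 0.75},
    "CU2": {"Christianity": 1.62, "Genesis": 1.53, "apple": 1.45},
    "CU24": {"transgression": 0.54, "divine": 0.45, "law": 0.44},
}


def spread(edges, clues, decay=0.8, steps=2):
    """Push activation from the clues along weighted links, step by step."""
    activation = defaultdict(float)
    frontier = {clue: 1.0 for clue in clues}
    for _ in range(steps):
        reached = defaultdict(float)
        for node, act in frontier.items():
            for target, weight in edges.get(node, {}).items():
                reached[target] += act * weight * decay
        for node, act in reached.items():
            activation[node] += act  # contributions from several paths add up
        frontier = reached
    return activation


def candidate_solutions(edges, clues, size=5):
    """Labels of the most active word nodes (nodes without outgoing links)."""
    activation = spread(edges, clues)
    words = [node for node in activation if node not in edges]
    return sorted(words, key=lambda node: -activation[node])[:size]


csl = candidate_solutions(EDGES, ["newton", "sin"])
print(csl)
```

On this toy data "apple" ranks first because it receives activation from both clues, through CU14 and CU2. The order of the remaining candidates depends on the assumed weights and need not match the list on the slide.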
17. Conclusion
Knowledge Infusion modeled as associative
retrieval
• knowledge representation based on Cognitive
Units
• reasoning process performed by Spreading
Activation
Try OTTHO during the demo session!