Jarrar © 2014
Mustafa Jarrar
Sina Institute, University of Birzeit
mjarrar@birzeit.edu
www.jarrar.info
Lecture Notes on WordNet
University of Birzeit, Palestine
Fall Semester, 2014
WordNet, EuroWordNet, and Global WordNet
Watch this lecture and download the slides from
http://jarrar-courses.blogspot.com/2011/11/artificial-intelligence-fall-2011.html
Reading
Everything in these slides + everything I say
[MBC93] George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross,
and Katherine Miller: Introduction to WordNet: An On-line Lexical
Database. International Journal of Lexicography, Vol. 3, Nr. 4, pages
235-244 (1990). http://wordnetcode.princeton.edu/5papers.pdf
[GGO02] Aldo Gangemi, Nicola Guarino, Alessandro Oltramari,
Stefano Borgo: Cleaning-up WordNet's Top-Level. In Proc. of the 1st
International WordNet Conference (2002).
http://citeseer.ist.psu.edu/viewdoc/download;jsessionid=C9962DFEDD793F3F839426B774BC9BAF?doi=10.1.1.11.4064&rep=rep1&type=pdf
WordNet and Global WordNet
• Part 1: The English WordNet
• Part 2: Euro WordNet
• Part 3: Global WordNet
Lecture Keywords:
WordNet, Global WordNet, Thesaurus, Linguistic Ontology, Lexical Semantics, Semantics,
Meaning, Synset, Concept, Synonymy, Polysemy, Hyponymy, Meronymy, Antonymy,
Multilinguality, Classification of Meanings, Part-Whole Relations
What is WordNet?
• In 1985 a group of psychologists and linguists at Princeton University
started to develop a “mental lexicon”.
• You may also call it an “electronic dictionary”, a “mental dictionary”, an English
“semantic network”, a “hyperdimensional thesaurus”, etc.
• Includes the most frequent words (nouns, adjectives, adverbs, verbs).
• Organized by meaning: words in close proximity are semantically similar.
• Can be used by humans and machines: both can browse WordNet and find words that
are meaningfully related to their queries.
• Available online and for download: http://wordnet.princeton.edu
WordNet: Synonymy
WordNet gives information about two fundamental, universal
properties of human language: polysemy and synonymy.
• English words are grouped (roughly) into sets of synonyms.
• Each set of synonyms is called a synset and is given a unique
SynsetID to identify it.
• Each synset expresses a distinct meaning/concept.
Figure (synsets and their glosses):
{Bureau, Dresser, Chest of Drawers}: Furniture with drawers for keeping clothes
{Table, Tabular Array}: A set of data arranged in rows and columns
{Categorization, Classification}: A group of people or things arranged…
{Contents, TableOfContents}: A list of divisions…
{Furniture, Piece of furniture, Article of furniture}: Furnishings that make a room….
{work table}: A table designed…
SynsetIDs shown in the figure: 08283156, 06501650, 07955878, 03410635, 03018908, 04615793
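The synset idea can be sketched as a small data structure. This is only an illustration: the class name `Synset`, the helper functions, and the IDs `s1` to `s3` are made up for the example and are not real WordNet SynsetIDs; the words and glosses are the ones shown on the slide.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    synset_id: str   # placeholder ID, NOT a real WordNet SynsetID
    words: tuple     # the synonymous word forms
    gloss: str       # short gloss, copied from the slide


# Synsets and glosses copied from the slide; the IDs are hypothetical.
SYNSETS = [
    Synset("s1", ("bureau", "dresser", "chest of drawers"),
           "Furniture with drawers for keeping clothes"),
    Synset("s2", ("table", "tabular array"),
           "A set of data arranged in rows and columns"),
    Synset("s3", ("furniture", "piece of furniture", "article of furniture"),
           "Furnishings that make a room..."),
]


def build_index(synsets):
    """Map every word form to the synsets that contain it."""
    index = defaultdict(list)
    for synset in synsets:
        for word in synset.words:
            index[word].append(synset)
    return index


def synonyms(index, word):
    """Return the other word forms that share a synset with `word`."""
    key = word.lower()
    found = set()
    for synset in index.get(key, []):
        found.update(w for w in synset.words if w != key)
    return sorted(found)


INDEX = build_index(SYNSETS)
print(synonyms(INDEX, "Dresser"))
```

Looking a word up goes through the index to its synsets, so synonyms are simply the other members of those sets.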
Exercise
List the different meanings of the words:
Table, Array, Matrix, Bureau
WordNet: Polysemy
• Each word form-meaning pair is unique.
• A word that appears in n synsets is n-fold polysemous.
• For example: “Table” here is two-fold polysemous
Figure (synsets and their glosses):
{Periodic Table}: a tabular arrangement of the chemical elem…
{Matrix}: A rectangular array of quantities …
{Arrangement}: An orderly grouping (of things or…
{Bureau, Dresser, Chest of Drawers}: Furniture with drawers for keeping clothes
{Table, Tabular Array}: A set of data arranged in rows and columns
{Categorization, Classification}: A group of people or things arranged…
{Array}: An orderly arrangement
{Calendar}: A tabular array of the days..
{Contents, TableOfContents}: A list of divisions…
{Furniture, Piece of furniture, Article of furniture}: Furnishings that make a room….
{Table}: A piece of furniture having a smooth …
{Desk}: A piece of furniture with a writing surface…
{Booth}: A table (in a restaurant or bar) surrounded by two…
{River}: A large natural stream of ...
{Stream}: A natural body of running water…
{Nile}: The world's longest..
{work table}: A table designed…
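Counting polysemy is then just counting synset memberships. A minimal sketch, assuming synsets are modelled as plain Python sets of lower-cased word forms (the function name `polysemy` is made up for the example); only a few synsets from the figure are included.

```python
# A few synsets from the figure, as sets of lower-cased word forms.
SYNSETS = [
    {"table", "tabular array"},
    {"table"},
    {"periodic table"},
    {"work table"},
    {"array"},
    {"desk"},
]


def polysemy(word, synsets):
    """Number of synsets that contain `word`, i.e. the n in 'n-fold polysemous'."""
    key = word.lower()
    return sum(1 for synset in synsets if key in synset)


print(polysemy("Table", SYNSETS))
```

Note that "periodic table" and "work table" are different word forms, so they do not add to the count for "table".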
WordNet: Glosses
A short gloss is provided for each synset.
Glosses are examples of contexts for many word-sense pairs, telling us
how words with specific senses are used in context.
WordNet: Statistics
155,287 word forms, grouped into
117,659 synsets
Part of speech   Word forms   Synsets
noun                117,798    82,115
verb                 11,529    13,767
adjective            21,479    18,156
adverb                4,481     3,621
Total               155,287   117,659
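The totals can be checked with a few lines of arithmetic. The figures are the ones in the table; the variable names are only for this example.

```python
# (word forms, synsets) per part of speech, as given in the table.
STATS = {
    "noun": (117_798, 82_115),
    "verb": (11_529, 13_767),
    "adjective": (21_479, 18_156),
    "adverb": (4_481, 3_621),
}

total_forms = sum(forms for forms, _ in STATS.values())
total_synsets = sum(synsets for _, synsets in STATS.values())

# Share of all synsets that are noun synsets (roughly 70%).
noun_share = STATS["noun"][1] / total_synsets

print(total_forms, total_synsets)
print(f"{noun_share:.1%} of the synsets are noun synsets")
```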
WordNet Semantic Relations
Synsets are interconnected with semantic relations, forming a large
semantic network (graph).
Such relations include:
• Hyponymy, also called the “is a” relation, or sub/superordinate relation.
• Meronymy, also called the “part of” relation.
Figure (additional synsets and their glosses):
{Container}: Any object that can be used ..
{Drawer}: A boxlike container in a..
{shelf}: A support that consists…
{Support}: Any device that bears..
WordNet Relations: Hyponymy
• A synset {x, x′, …} is a hyponym of the synset {y, y′, …} if native English
speakers accept sentences like "x is a (kind of) y". E.g., Table/Tabular Array
is a kind of Array, Array is a kind of Arrangement, …
• Hyponymy is transitive and asymmetrical, so it generates a hierarchical
semantic structure: a hyponym inherits all the features of the more
generic concept and adds at least one feature that distinguishes it from its
superordinate. [2]
The WordNet hierarchy is about 16 levels deep.
Top-Level Nouns (25 unique beginners):
{act, action, activity}      {natural object}
{animal, fauna}              {natural phenomenon}
{artifact}                   {person, human being}
{attribute, property}        {plant, flora}
{body, corpus}               {possession}
{cognition, knowledge}       {process}
{communication}              {quantity, amount}
{event, happening}           {relation}
{feeling, emotion}           {shape}
{food}                       {state, condition}
{group, collection}          {substance}
{location, place}            {time}
{motive}
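Transitivity and asymmetry of hyponymy can be sketched by walking "is a kind of" links upward. This is a minimal sketch under two assumptions: each synset is named by a single word, and each synset has exactly one hypernym (WordNet itself allows several). The names `HYPERNYM`, `hypernym_chain` and `is_a` are made up for the example.

```python
# Direct hypernym ("is a kind of") links from the example on this slide.
HYPERNYM = {
    "table": "array",
    "array": "arrangement",
}


def hypernym_chain(synset, hypernym=HYPERNYM):
    """Follow hypernym links upward until a top-level synset is reached."""
    chain = []
    while synset in hypernym:
        synset = hypernym[synset]
        chain.append(synset)
    return chain


def is_a(x, y):
    """True if y is a direct or indirect hypernym of x (transitivity)."""
    return y in hypernym_chain(x)


print(hypernym_chain("table"))
print(is_a("table", "arrangement"), is_a("arrangement", "table"))
```

The length of the chain plus one is the depth of a synset in the hierarchy, which is how a figure like "about 16 levels" can be measured.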
WordNet Relations: Meronymy
• A synset {x, x′, …} is a meronym of the synset {y, y′, …} if native English
speakers accept sentences like "y has an x (as a part)" or "an x is a part of y".
E.g., Finger is part of Hand, Hand is part of Arm, Arm is part of Body.
• Meronymy is a transitive (with qualification) and asymmetrical relation, and
forms a part hierarchy.
• Synsets may have multiple hypernyms.
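The part hierarchy can be sketched the same way, this time walking downward from a whole to its parts. The sketch treats meronymy as fully transitive, which ignores the "with qualification" caveat above; the names `PART_OF` and `parts_of` are made up for the example.

```python
# Direct "part of" links from the example above: part -> whole.
PART_OF = {
    "finger": "hand",
    "hand": "arm",
    "arm": "body",
}


def parts_of(whole, part_of=PART_OF):
    """Collect the direct and indirect parts of `whole`, nearest parts first."""
    result = []
    for part, owner in part_of.items():
        if owner == whole:
            result.append(part)
            result.extend(parts_of(part, part_of))
    return result


print(parts_of("body"))
```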
Exercise
Find the hyponyms and meronyms of this synset
{car, auto, automobile, machine, motorcar}
WordNet Relations: Another Example
Figure (relations of the synset {car, auto, automobile, machine, motorcar}):
Hypernyms, most general first: {conveyance, transport}, {vehicle}, {motor vehicle, automotive vehicle}
Hyponyms: {cruiser, squad car, patrol car, police car, prowl car}, {cab, taxi, hack, taxicab}
Meronyms shown: {bumper}, {car door}, {car window}, {car mirror}, {armrest}, {doorlock}, {hinge, flexible joint}
Hyponymy and meronymy relations are:
• transitive
• directed
[1]
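The car example can be sketched as a graph with directed, labeled edges. This is an illustration only: each synset is named by its first word form, the relation labels `hypernym` and `has_part` are made up for the example, and only the parts listed directly under the car synset are included.

```python
from collections import defaultdict

# Directed, labeled edges: (source synset, relation, target synset).
EDGES = [
    ("car", "hypernym", "motor vehicle"),
    ("motor vehicle", "hypernym", "vehicle"),
    ("vehicle", "hypernym", "conveyance"),
    ("cab", "hypernym", "car"),
    ("cruiser", "hypernym", "car"),
    ("car", "has_part", "bumper"),
    ("car", "has_part", "car door"),
    ("car", "has_part", "car window"),
    ("car", "has_part", "car mirror"),
]

OUTGOING = defaultdict(list)   # (source, relation) -> targets
INCOMING = defaultdict(list)   # (target, relation) -> sources
for source, relation, target in EDGES:
    OUTGOING[(source, relation)].append(target)
    INCOMING[(target, relation)].append(source)


def hypernyms(synset):
    return list(OUTGOING.get((synset, "hypernym"), []))


def hyponyms(synset):
    # Hyponymy is the inverse direction of the stored hypernym edges.
    return list(INCOMING.get((synset, "hypernym"), []))


def meronyms(synset):
    return list(OUTGOING.get((synset, "has_part"), []))


print(hypernyms("car"), hyponyms("car"), meronyms("car"))
```

Because the edges are directed, asking for the hypernyms of "motor vehicle" does not return "car"; that only appears when the edge is followed in the inverse direction.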
WordNet Relations: Antonymy
• The antonym of a word x is sometimes not-x, but not always. For example, rich and poor are antonyms, but to say that someone is not rich does not imply that they must be poor; many people consider themselves neither rich nor poor.
• Antonymy, which seems to be a simple symmetric relation, is actually quite complex, yet speakers of English have little difficulty recognizing antonyms when they see them. For example, the meanings {rise, ascend} and {fall, descend} may be conceptual opposites, but they are not antonyms; [rise/fall] are antonyms and so are [ascend/descend], but most people hesitate and look thoughtful when asked if rise and descend, or ascend and fall, are antonyms.
• Antonymy is a lexical relation between word forms, not a semantic relation between word meanings. Some call it a semantic relation between words [MBC93].
Figure (synsets and their glosses):
{Fall, Come Down, Go Down, Descend}: Move downward and lower, but not necessarily all the way
{Set, Go down, Go Under}: (astronomy) disappear beyond the horizon
{Ascend, Come up, Rise, Uprise}: (astronomy) come up, of celestial bodies
{Ascend, Go up}: Travel up
{Rise, Uprise, Come up, Go up, Move up, Lift}: Move upward
{Ascend, Move up, Rise}: Move to a better position in life …
{Hot}: Used of physical heat; having..
{Cold}: Having a low or inadequate..
{New}: Unaffected by use or exposure
{New}: Not of long duration; having..
{Worn}: Affected by wear; damaged by …
{Young, Immature}: in an early period of life…
{Old}: Of long duration
{Old}: having lived for a relatively
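Because antonymy holds between word forms, a sketch would store it per word pair rather than per synset. The pairs below are the ones named in the text; the names `ANTONYM_PAIRS` and `are_antonyms` are made up for the example.

```python
# Antonymy is stored between word forms, not between synsets.
ANTONYM_PAIRS = {
    frozenset({"rise", "fall"}),
    frozenset({"ascend", "descend"}),
    frozenset({"rich", "poor"}),
}

# The two synsets from the text: conceptual opposites, but not antonyms as sets.
SYNSET_UP = {"rise", "ascend"}
SYNSET_DOWN = {"fall", "descend"}


def are_antonyms(a, b):
    """Symmetric lookup: the order of the two words does not matter."""
    return frozenset({a.lower(), b.lower()}) in ANTONYM_PAIRS


print(are_antonyms("rise", "fall"), are_antonyms("rise", "descend"))
```

"rise" and "descend" sit in opposite synsets, yet the lookup rejects them, which mirrors the hesitation described above.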
WordWeb
http://wordweb.info/free/
A nice and intuitive
interface for WordNet
Other WordNet Relations
• Although the main interest of WordNet was in specifying semantic
relations, other lexical/morphological relations between word forms
were also added.
• For example: stems, singular-plural, verb tenses, etc.
Why do we need WordNet?
• Word sense disambiguation,
• Information retrieval,
• Automatic text classification,
• Automatic text summarization,
• Machine translation
• ….etc.
Is WordNet a Thesaurus?
Yes:
• it groups together meaningfully related words
No:
• WordNet labels the relations
• The relations are limited
• Related words are linked to specific concepts (disambiguated);
a thesaurus entry is a “bag of words”
• Many words linked in WordNet do not co-occur in the same
thesaurus entry
• WordNet allows one to measure and quantify the semantic
similarity or distance among words and concepts
[Fellbaum]
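One simple way to quantify semantic distance is to count the links on the shortest path between two synsets. This is a sketch of that one idea, not necessarily the measure meant above; it reuses the hypernym links of the car example from these slides, names each synset by its first word form, and the function names are made up for the example.

```python
from collections import deque

# Hypernym links from the car example (child -> parent).
HYPERNYM = {
    "cab": "car",
    "cruiser": "car",
    "car": "motor vehicle",
    "motor vehicle": "vehicle",
    "vehicle": "conveyance",
}


def neighbours(node):
    """Synsets one hypernym or hyponym link away from `node`."""
    up = [HYPERNYM[node]] if node in HYPERNYM else []
    down = [child for child, parent in HYPERNYM.items() if parent == node]
    return up + down


def path_distance(a, b):
    """Fewest links between a and b (breadth-first search), or None if unconnected."""
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, distance = queue.popleft()
        if node == b:
            return distance
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, distance + 1))
    return None


print(path_distance("cab", "cruiser"), path_distance("cab", "vehicle"))
```

A plain thesaurus entry offers no such path, since its words are not attached to positions in a labeled graph.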
Is WordNet an Ontology?
Meaning (called Ontological Precision):
WordNet: based on what native speakers roughly agree on
Ontology: based on scientific and philosophical findings
Classification:
WordNet: based on what native speakers roughly agree on (Student IsA Person)
Ontology: based on strict formal methodologies (Student IsA Role)
Formal Specification:
WordNet: logically vague
Ontology: strictly formal
→ I like to use WordNet as a linguistic ontology, though it needs a lot of cleaning!
→ Linguistic ontologies are difficult to build, but they are immune to changes.
Part 2: Euro WordNet
EURO WordNet
• The development of a multilingual database with WordNets for several
European languages.
• Funded by the European Commission, DG XIII, LE2-4003 and LE4-8328
• March 1996 - September 1999 (2.5 Million EURO)
http://www.hum.uva.nl/~ewn
http://www.illc.uva.nl/EuroWordNet/finalresults-ewn.html
• Languages covered:
EuroWordNet-1 (LE2-4003): English, Dutch, Spanish, Italian
EuroWordNet-2 (LE4-8328): German, French, Czech, Estonian.
• Size of vocabulary:
EuroWordNet-1: 30,000 concepts, 50,000 word meanings.
EuroWordNet-2: 15,000 concepts, 25,000 word meanings.
• Type of vocabulary:
the most frequent words of the languages;
all concepts needed to relate more specific concepts.
[1]
EURO WordNet Model
Figure: the EuroWordNet model [1]
Link types:
I = Language Independent link
II = Link from Language Specific to Inter-Lingual-Index
III = Language Dependent Link
Inter-Lingual-Index: ILI-record {drive}
Ontology: 2ndOrderEntity, Location, Dynamic
Domains: Traffic (Air, Road)
Lexical Items Table (Italian): cavalcare, andare, muoversi, guidare
Lexical Items Table (Dutch): bewegen, gaan, rijden, berijden
Lexical Items Table (English): drive, ride, move, go
Lexical Items Table (Spanish): cabalgar, jinetear, conducir, mover, transitar
The Multilingual Design
• Inter-Lingual-Index: unstructured fund of concepts to provide an
efficient mapping across the languages;
• Index-records are mainly based on WordNet synsets and consist of
synonyms, glosses and source references;
• Various types of complex equivalence relations are distinguished;
• Equivalence relations from synsets to index records: not on a word-to-word
basis;
• Indirect matching of synsets linked to the same index items;
[1]
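Indirect matching through the index can be sketched as follows. Everything here is illustrative: the record identifiers such as `ili:drive`, the language codes, and the exact links between words and records are assumptions made for the example, not data taken from EuroWordNet.

```python
# Each wordnet links its synsets to ILI records.
# The record IDs and the links are assumptions made for illustration.
ILI_LINKS = {
    "en": {"drive": "ili:drive", "ride": "ili:ride"},
    "nl": {"rijden": "ili:drive", "berijden": "ili:ride"},
    "it": {"guidare": "ili:drive", "cavalcare": "ili:ride"},
    "es": {"conducir": "ili:drive", "cabalgar": "ili:ride"},
}


def equivalents(language, synset, target_language, links=ILI_LINKS):
    """Synsets of `target_language` linked to the same ILI record as `synset`."""
    record = links.get(language, {}).get(synset)
    if record is None:
        return []
    return [s for s, r in links.get(target_language, {}).items() if r == record]


print(equivalents("it", "guidare", "nl"))
```

No Italian-to-Dutch link is stored anywhere; the match falls out of both synsets pointing at the same index record.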
EURO WordNet Model
• WordNets are unique language-specific structures:
➢ same organizational principles: synset structure and same set of
semantic relations.
➢ different lexicalizations
➢ differences in synonymy and homonymy:
"decoration" in English versus "versiersel/versiering" in Dutch
"bank" in English (money/river) versus "bank" in Dutch
(money/furniture)
• BUT also different relations for similar synsets
[1]
Some Downsides of the EuroWordNet Model
• Construction is not done uniformly
• Coverage differs
• Not all wordnets can communicate with one another, i.e., they are linked
to different versions of the English WordNet
• Proprietary rights restrict free access and usage
• A lot of semantics is duplicated
• Complex and obscure equivalence relations due to linguistic
differences between English and other languages
[1]
Part 3: Global WordNet
From EuroWordNet to Global WordNet
• EuroWordNet ended in 1999
• Global Wordnet Association was founded in 2000 to maintain the
framework: http://www.globalwordnet.org
• Currently, wordnets exist for more than 50 languages, including:
Arabic, Bantu, Basque, Chinese, Bulgarian, Estonian, Hebrew, Icelandic,
Japanese, Kannada, Korean, Latvian, Nepali, Persian, Romanian, Sanskrit,
Tamil, Thai, Turkish, Zulu...
• Many languages are genetically and typologically unrelated
→ The Arabic WordNet extension was not successful; this will be explained
later.
[1]
Global WordNet Model
• Construct separate wordnets for each language
• Contributors from each language encode the same core set of concepts
plus culture/language-specific ones
• Synsets (concepts) are mapped cross-linguistically via an ontology
instead of just the English WordNet
[1]
Discussion
What would be a good database schema to store WordNet? And Global
WordNet?
What is the difference between a synset and a concept?
How precise is hyponymy? And what is the difference between
hyponymy and subclass/subset?
References
[1] Piek Vossen: Lecture Notes on The Global Wordnet Grid: anchoring languages to universal
meaning.
http://www.authorstream.com/Presentation/Stentore-40555-WN-EWN-GWA-Koszalin-Global-Wordnet-Grid-anchoring-languages-universal-meaning-kosz-Entertainment-ppt-powerpoint/
[2] Lyons, John. Semantics. Vol. 1. Cambridge: Cambridge UP, 1977. Print.

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 

Jarrar: WordNet And Global WordNets

  • 1. WordNet, EuroWordNet, and Global WordNet. Lecture Notes on WordNet, University of Birzeit, Palestine, Fall Semester, 2014. Mustafa Jarrar, Sina Institute, University of Birzeit, mjarrar@birzeit.edu, www.jarrar.info. Jarrar © 2014
  • 2. Watch this lecture and download the slides from http://jarrar-courses.blogspot.com/2011/11/artificial-intelligence-fall-2011.html
  • 3. Reading: everything in these slides + everything I say
      [MBC93] George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine Miller: Introduction to WordNet: An On-line Lexical Database. International Journal of Lexicography, Vol. 3, Nr. 4, pages 235-244 (1990). http://wordnetcode.princeton.edu/5papers.pdf
      [GGO02] Aldo Gangemi, Nicola Guarino, Alessandro Oltramari, Stefano Borgo: Cleaning-up WordNet's Top-Level. In Proc. of the 1st International WordNet Conference (2002). http://citeseer.ist.psu.edu/viewdoc/download;jsessionid=C9962DFEDD793F3F839426B774BC9BAF?doi=10.1.1.11.4064&rep=rep1&type=pdf
  • 4. WordNet and Global WordNet
      • Part 1: The English WordNet
      • Part 2: Euro WordNet
      • Part 3: Global WordNet
      Lecture Keywords: WordNet, Global WordNet, Thesaurus, Linguistic Ontology, Lexical Semantics, Semantics, Meaning, Synset, Concept, Multilingualism, Synonymy, Polysemy, Hyponymy, Meronymy (part-whole relations), Antonymy, Classification of Meanings
  • 5. What is WordNet?
      • In 1985 a group of psychologists and linguists at Princeton University started to develop a “mental lexicon”.
      • You may also call it an “electronic dictionary”, a “mental dictionary”, a “semantic network”, or a hyperdimensional thesaurus for English.
      • Includes the most frequent words (nouns, adjectives, adverbs, verbs).
      • Organized by meaning: words in close proximity are semantically similar.
      • Can be used by humans and machines: both can browse WordNet and find words that are meaningfully related to their queries.
      • Available online and for download: http://wordnet.princeton.edu
  • 6. WordNet: Synonymy
      WordNet gives information about two fundamental, universal properties of human language: polysemy and synonymy.
      • English words are grouped (roughly) into sets of synonyms.
      • Each set of synonyms is called a synset and is given a unique SynsetID to identify it.
      • Each synset expresses a distinct meaning/concept.
      Example synsets shown in the figure, with their (truncated) glosses:
      {Bureau, Dresser, Chest of Drawers} Furniture with drawers for keeping clothes
      {Table, Tabular Array} A set of data arranged in rows and columns
      {Categorization, Classification} A group of people or things arranged…
      {Contents, TableOfContents} A list of divisions…
      {Furniture, Piece of furniture, Article of furniture} Furnishings that make a room…
      {work table} A table designed…
      SynsetIDs shown in the figure (their pairing with the synsets is not recoverable from this transcript): 08283156, 06501650, 07955878, 03410635, 03018908, 04615793
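The synset idea on this slide can be sketched in a few lines of Python. This is only an illustration: the `Synset` class, its field names, and the IDs `S1` to `S3` are hypothetical placeholders, not WordNet's actual storage format or its real SynsetIDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    synset_id: str   # placeholder ID (assumption), not a real WordNet SynsetID
    words: tuple     # word forms that share this meaning
    gloss: str       # short description of the concept


# Three synsets taken from the slide; glosses are copied as shown there.
SYNSETS = [
    Synset("S1", ("bureau", "dresser", "chest of drawers"),
           "Furniture with drawers for keeping clothes"),
    Synset("S2", ("table", "tabular array"),
           "A set of data arranged in rows and columns"),
    Synset("S3", ("categorization", "classification"),
           "A group of people or things arranged..."),
]


def synonyms(word, synsets=SYNSETS):
    """Return the other word forms of every synset that contains `word`."""
    word = word.lower()
    found = set()
    for synset in synsets:
        if word in synset.words:
            found.update(w for w in synset.words if w != word)
    return sorted(found)
```

For instance, `synonyms("Dresser")` looks up the one synset containing "dresser" and returns its remaining members.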
  • 7. Exercise: List the different meanings of the words Table, Array, Matrix, and Bureau.
  • 8. WordNet: Polysemy
      • Each word form-meaning pair is unique.
      • A word that appears in n synsets is n-fold polysemous.
      • For example, “Table” here is two-fold polysemous: it appears in {Table, Tabular Array} and in {Table}.
      Example synsets shown in the figure, with their (truncated) glosses:
      {Periodic Table} a tabular arrangement of the chemical elem…
      {Matrix} A rectangular array of quantities…
      {Arrangement} An orderly grouping (of things or…
      {Bureau, Dresser, Chest of Drawers} Furniture with drawers for keeping clothes
      {Table, Tabular Array} A set of data arranged in rows and columns
      {Categorization, Classification} A group of people or things arranged…
      {Array} An orderly arrangement
      {Calendar} A tabular array of the days…
      {Contents, TableOfContents} A list of divisions…
      {Furniture, Piece of furniture, Article of furniture} Furnishings that make a room…
      {Table} A piece of furniture having a smooth…
      {Desk} A piece of furniture with a writing surface…
      {Booth} A table (in a restaurant or bar) surrounded by two…
      {River} A large natural stream of…
      {Stream} A natural body of running water…
      {Nile} The world's longest…
      {work table} A table designed…
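Counting polysemy as defined above amounts to inverting the synset table into a word index. The following Python sketch assumes a toy dictionary of synsets; the IDs `S1` to `S4` and the function names are made up for illustration.

```python
from collections import defaultdict

# Toy synsets from the slide; the IDs are placeholders (assumption).
SYNSETS = {
    "S1": ["table", "tabular array"],               # data in rows and columns
    "S2": ["table"],                                # piece of furniture
    "S3": ["array"],                                # an orderly arrangement
    "S4": ["bureau", "dresser", "chest of drawers"],
}


def build_index(synsets):
    """Map each word form to the set of synset IDs it appears in."""
    index = defaultdict(set)
    for synset_id, words in synsets.items():
        for word in words:
            index[word].add(synset_id)
    return index


def polysemy(word, index):
    """A word that appears in n synsets is n-fold polysemous."""
    return len(index.get(word, ()))
```

With this index, "table" is found in two synsets, while "dresser" is found in one.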
  • 9. WordNet: Glosses
      A short gloss is provided for each synset. Glosses are examples of contexts for many word-sense pairs, telling us how words with specific senses are used in context.
      [Figure: the example synset network with a gloss next to each synset, e.g. {Array} An orderly arrangement; {Table, Tabular Array} A set of data arranged in rows and columns.]
  • 10. WordNet: Statistics
      155,287 word forms, grouped into 117,659 synsets.

                     Word forms   Synsets
      noun              117,798    82,115
      verb               11,529    13,767
      adjective          21,479    18,156
      adverb              4,481     3,621
      Total             155,287   117,659
  • 11. WordNet Semantic Relations
      Synsets are interconnected with semantic relations, forming a large semantic network (graph). Such relations are:
      • Hyponymy, also called the “is a” relation, or sub/superordinate.
      • Meronymy, also called the “part of” relation.
      [Figure: the example synset network, extended with {Container} Any object that can be used…, {Drawer} A boxlike container in a…, {shelf} A support that consists…, and {Support} Any device that bears…]
  • 12. WordNet Relations: Hyponymy
      • A synset {x, x′, …} is a hyponym of the synset {y, y′, …} if native English speakers accept sentences like “x is a (kind of) y”. E.g., Table/Tabular Array is a kind of Array, Array is a kind of Arrangement, …
      • Hyponymy is transitive and asymmetrical. Because hyponymy generates a hierarchical semantic structure, a hyponym inherits all the features of the more generic concept and adds at least one feature that distinguishes it from its superordinate.
      [Figure: the example synset network, showing the chain {Table, Tabular Array}, {Array}, {Arrangement}.]
  • 13. WordNet Relations: Hyponymy (continued)
      • Hyponymy is transitive and asymmetrical; a hyponym inherits all the features of the more generic concept and adds at least one feature that distinguishes it from its superordinate. [2]
      • The WordNet hierarchy is about 16 levels deep.
      • Top-level nouns (25 unique beginners): {act, action, activity}, {animal, fauna}, {artifact}, {attribute, property}, {body, corpus}, {cognition, knowledge}, {communication}, {event, happening}, {feeling, emotion}, {food}, {group, collection}, {location, place}, {motive}, {natural object}, {natural phenomenon}, {person, human being}, {plant, flora}, {possession}, {process}, {quantity, amount}, {relation}, {shape}, {state, condition}, {substance}, {time}
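Transitivity and asymmetry of hyponymy can be illustrated by walking up a small hypernym table. The sketch below uses only the chain stated on the slides (Table/Tabular Array, Array, Arrangement); the dictionary layout and function names are assumptions made for this example.

```python
# Direct hypernyms ("is a kind of") for the chain given on the slide.
# A list is used because a synset may have more than one hypernym.
HYPERNYM = {
    "table, tabular array": ["array"],
    "array": ["arrangement"],
    "arrangement": [],
}


def all_hypernyms(synset, hypernym=HYPERNYM):
    """Collect every ancestor of `synset` by following hypernym links."""
    seen = []
    stack = list(hypernym.get(synset, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.append(parent)
            stack.extend(hypernym.get(parent, []))
    return seen


def is_a(x, y, hypernym=HYPERNYM):
    """True if x is a (direct or indirect) hyponym of y."""
    return y in all_hypernyms(x, hypernym)
```

Because the closure follows links in one direction only, `is_a("table, tabular array", "arrangement")` holds while the reverse does not.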
  • 14. WordNet Relations: Meronymy
      • A synset {x, x′, …} is a meronym of the synset {y, y′, …} if native English speakers accept sentences like “y has an x (as a part)” or “an x is a part of y”. E.g., Finger is part of Hand, Hand is part of Arm, Arm is part of Body.
      • Meronymy is a transitive (with qualification) and asymmetrical relation, and forms a part hierarchy.
      • Synsets may have multiple hypernyms.
      [Figure: the example synset network, including {Container}, {Drawer}, {shelf}, and {Support}.]
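The part hierarchy from the Finger/Hand/Arm/Body example can be followed upward in the same spirit. This is a simplified sketch: it assumes each part has a single whole, and it ignores the "with qualification" caveat on transitivity; the names `PART_OF` and `wholes` are hypothetical.

```python
# "x is part of y" links from the slide's example (one whole per part: assumption).
PART_OF = {"finger": "hand", "hand": "arm", "arm": "body"}


def wholes(part, part_of=PART_OF):
    """Return the chain of wholes that contain `part`, nearest first."""
    chain = []
    current = part
    # Stop at the top of the hierarchy, or if a link would repeat (cycle guard).
    while current in part_of and part_of[current] not in chain:
        current = part_of[current]
        chain.append(current)
    return chain
```

Asymmetry shows up as an empty result when asking for the wholes of the topmost item, "body".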
  • 15. Exercise: Find the hyponyms and meronyms of this synset: {car, auto, automobile, machine, motorcar}
  • 16. WordNet Relations: Another Example
      Synset: {car, auto, automobile, machine, motorcar}
      • Hyper(o)nyms shown in the figure: {conveyance, transport}, {vehicle}, {motor vehicle, automotive vehicle}
      • Hyponyms shown in the figure: {cruiser, squad car, patrol car, police car, prowl car}, {cab, taxi, hack, taxicab}
      • Meronyms shown in the figure: {bumper}, {car door}, {car window}, {car mirror}, {armrest}, {doorlock}, {hinge, flexible joint}
      Hyponymy and meronymy relations are transitive and directed. [1]
  • 17. WordNet Relations: Antonymy
      • The antonym of a word x is sometimes not-x, but not always. For example, rich and poor are antonyms, but to say that someone is not rich does not imply that they must be poor; many people consider themselves neither rich nor poor.
      • Antonymy, which seems to be a simple symmetric relation, is actually quite complex, yet speakers of English have little difficulty recognizing antonyms when they see them. For example, the meanings {rise, ascend} and {fall, descend} may be conceptual opposites, but they are not antonyms; [rise/fall] are antonyms and so are [ascend/descend], but most people hesitate and look thoughtful when asked if rise and descend, or ascend and fall, are antonyms.
      • Antonymy is a lexical relation between word forms, not a semantic relation between word meanings. Some call it a semantic relation between words [MBC93].
      Example synsets shown in the figure, with their (truncated) glosses:
      {Fall, Come Down, Go Down, Descend} Move downward and lower, but not necessarily all the way
      {Set, Go down, Go Under} (astronomy) disappear beyond the horizon
      {Ascend, Come up, Rise, Uprise} (astronomy) come up, of celestial bodies
      {Ascend, Go up} Travel up
      {Rise, Uprise, Come up, Go up, Move up, Lift} Move upward
      {Ascend, Move up, Rise} Move to a better position in life…
      {Hot} Used of physical heat; having…
      {Cold} Having a low or inadequate…
      {New} Unaffected by use or exposure
      {New} Not of long duration; having…
      {Old} Of long duration
      {Worn} Affected by wear; damaged by…
      {Young, Immature} in an early period of life…
      {Old} having lived for a relatively
  • 18. WordWeb: a nice and intuitive interface for WordNet. http://wordweb.info/free/
  • 19. Other WordNet Relations
      • Although the main interest of WordNet was in specifying semantic relations, other lexical/morphological relations between word forms were also added.
      • For example: stems, singular-plural, verb tenses, etc.
  • 20. Why do we need WordNet?
      • Word sense disambiguation
      • Information retrieval
      • Automatic text classification
      • Automatic text summarization
      • Machine translation
      • etc.
  • 21. Is WordNet a Thesaurus?
      Yes:
      • It groups together meaningfully related words.
      No:
      • WordNet labels the relations.
      • The relations are limited.
      • Related words are linked to specific concepts (disambiguated); a thesaurus is a “bag of words”.
      • Many words linked in WordNet do not co-occur in the same thesaurus entry.
      • WordNet allows one to measure and quantify the semantic similarity or distance among words and concepts. [Fellbaum]
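One simple way to quantify semantic distance, as the last point suggests, is to count the links on the shortest path between two synsets. The Python sketch below is one possible measure, not the measure WordNet itself prescribes. The edge list is a toy graph: the Table/Array/Arrangement links come from the slides, while linking Matrix to Array is an assumption based on its gloss ("a rectangular array of quantities").

```python
from collections import deque

# (hyponym, hypernym) pairs; the "matrix" edge is an assumption.
EDGES = [("table", "array"), ("matrix", "array"), ("array", "arrangement")]


def build_graph(edges):
    """Build an undirected adjacency map from hyponym/hypernym pairs."""
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def path_distance(a, b, graph):
    """Number of links on the shortest path from a to b (breadth-first search)."""
    if a not in graph or b not in graph:
        return None
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def path_similarity(a, b, graph):
    """Turn a distance into a score in (0, 1]; identical synsets score 1."""
    dist = path_distance(a, b, graph)
    return None if dist is None else 1 / (1 + dist)
```

In this toy graph, "table" and "matrix" are two links apart (via "array"), so they score lower than "table" and "array".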
  • 22. Is WordNet an Ontology?
      • Meaning (called Ontological Precision):
          WordNet: based on what native speakers roughly agree on.
          Ontology: based on scientific and philosophical findings.
      • Classification:
          WordNet: based on what native speakers roughly agree on (Student IsA Person).
          Ontology: based on strict formal methodologies (Student IsA Role).
      • Formal Specification:
          WordNet: logically vague.
          Ontology: strictly formal.
      • I like to use WordNet as a linguistic ontology, though it needs lots of cleaning!
      • Linguistic ontologies are difficult to build, but they are immune to changes.
  • 23. WordNet and Global WordNet
      • Part 1: The English WordNet
      • Part 2: Euro WordNet
      • Part 3: Global WordNet
  • 24. EuroWordNet
      • The development of a multilingual database with wordnets for several European languages.
      • Funded by the European Commission, DG XIII, LE2-4003 and LE4-8328.
      • March 1996 - September 1999 (2.5 million euro). http://www.hum.uva.nl/~ewn http://www.illc.uva.nl/EuroWordNet/finalresults-ewn.html
      • Languages covered:
          EuroWordNet-1 (LE2-4003): English, Dutch, Spanish, Italian.
          EuroWordNet-2 (LE4-8328): German, French, Czech, Estonian.
      • Size of vocabulary:
          EuroWordNet-1: 30,000 concepts, 50,000 word meanings.
          EuroWordNet-2: 15,000 concepts, 25,000 word meanings.
      • Type of vocabulary: the most frequent words of the languages, and all concepts needed to relate more specific concepts.
      [1]
  • 25. EuroWordNet Model
      Link types: I = language-independent link; II = link from a language-specific wordnet to the Inter-Lingual-Index; III = language-dependent link.
      [Figure: four language-specific lexical item tables connected through the Inter-Lingual-Index (ILI-record {drive}), which is in turn linked to an ontology (2OrderEntity, Location, Dynamic) and to domains (Traffic, Air, Road). Lexical items shown: Italian: cavalcare, andare, muoversi, guidare; Dutch: bewegen, gaan, rijden, berijden; English: drive, ride, move, go; Spanish: cabalgar, jinetear, conducir, mover, transitar.]
      [1]
  • 26. The Multilingual Design
      • Inter-Lingual-Index: an unstructured fund of concepts to provide an efficient mapping across the languages.
      • Index records are mainly based on WordNet synsets and consist of synonyms, glosses, and source references.
      • Various types of complex equivalence relations are distinguished.
      • Equivalence relations go from synsets to index records, not on a word-to-word basis.
      • Indirect matching of synsets linked to the same index items.
      [1]
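Indirect matching through the Inter-Lingual-Index can be sketched as a lookup via a shared record rather than a word-to-word table. Everything structural here is an assumption for illustration: the record names `ili:drive` and `ili:move`, the flat dictionary, and the assignment of each word to a record (the slides list these words but the transcript does not state which record each one links to). Real EuroWordNet links attach synsets, not single words, to index records.

```python
# (language, word) -> ILI record. Record names and assignments are assumptions.
ILI_LINKS = {
    ("en", "drive"): "ili:drive",
    ("nl", "rijden"): "ili:drive",
    ("it", "guidare"): "ili:drive",
    ("es", "conducir"): "ili:drive",
    ("en", "move"): "ili:move",
    ("nl", "bewegen"): "ili:move",
}


def equivalents(lang, word, target_lang, links=ILI_LINKS):
    """Find target-language words that link to the same ILI record as `word`."""
    record = links.get((lang, word))
    if record is None:
        return []
    return sorted(
        other_word
        for (other_lang, other_word), other_record in links.items()
        if other_lang == target_lang and other_record == record
    )
```

No Dutch-Spanish table is needed: `equivalents("nl", "rijden", "es")` finds its match only because both words point at the same index record.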
  • 27. EuroWordNet Model
      • Wordnets are unique language-specific structures:
          • same organizational principles: synset structure and the same set of semantic relations;
          • different lexicalizations;
          • differences in synonymy and homonymy: "decoration" in English versus "versiersel/versiering" in Dutch; "bank" in English (money/river) versus "bank" in Dutch (money/furniture).
      • BUT also different relations for similar synsets.
      [1]
  • 28. Some Downsides of the EuroWordNet Model
      • Construction is not done uniformly.
      • Coverage differs.
      • Not all wordnets can communicate with one another, because they are linked to different versions of the English WordNet.
      • Proprietary rights restrict free access and usage.
      • A lot of semantics is duplicated.
      • Complex and obscure equivalence relations arise from linguistic differences between English and other languages.
      [1]
  • 29. WordNet and Global WordNet
      • Part 1: The English WordNet
      • Part 2: Euro WordNet
      • Part 3: Global WordNet
  • 31. From EuroWordNet to Global WordNet
      • EuroWordNet ended in 1999.
      • The Global Wordnet Association was founded in 2000 to maintain the framework: http://www.globalwordnet.org
      • Currently, wordnets exist for more than 50 languages, including: Arabic, Bantu, Basque, Chinese, Bulgarian, Estonian, Hebrew, Icelandic, Japanese, Kannada, Korean, Latvian, Nepali, Persian, Romanian, Sanskrit, Tamil, Thai, Turkish, Zulu...
      • Many of these languages are genetically and typologically unrelated.
      • The Arabic WordNet extension was not successful; this will be explained later.
      [1]
  • 32. Global WordNet Model
      • Construct separate wordnets for each language.
      • Contributors from each language encode the same core set of concepts, plus culture/language-specific ones.
      • Synsets (concepts) are mapped cross-linguistically via an ontology instead of just the English WordNet.
      [1]
  • 33. Discussion
      • What would be a good database schema to store WordNet? And Global WordNet?
      • What is the difference between a synset and a concept?
      • How precise is hyponymy? And what is the difference between hyponymy and subclass/subset?
  • 34. References
      [1] Piek Vossen: Lecture Notes on The Global Wordnet Grid: anchoring languages to universal meaning. http://www.authorstream.com/Presentation/Stentore-40555-WN-EWN-GWA-Koszalin-Global-Wordnet-Grid-anchoring-languages-universal-meaning-kosz-Entertainment-ppt-powerpoint/
      [2] Lyons, John. Semantics. Vol. 1. Cambridge: Cambridge UP, 1977. Print.