Lecture 1: Semantic Analysis in Language Technology
1. Semantic Analysis in Language Technology
Lecture 1: Introduction
Course Website: http://stp.lingfil.uu.se/~santinim/sais/sais_fall2013.htm
MARINA SANTINI
PROGRAM: COMPUTATIONAL LINGUISTICS AND LANGUAGE TECHNOLOGY
DEPT OF LINGUISTICS AND PHILOLOGY
UPPSALA UNIVERSITY, SWEDEN
12 NOV 2013
2. Acknowledgements
Thanks to Mats Dahllöf for the many slides I borrowed from his previous course and for structuring such interesting and comprehensive content.
5. Check the website regularly and make sure to refresh the page:
we are building up this course together, so this page will be continuously updated!
6. About the Course
Introduction to Semantics in Language Technology and NLP.
Focus on methods used in Language Technology and NLP to perform the following tasks:
Sentiment Analysis (SA)
Information Extraction (IE)
Word Sense Disambiguation (WSD)
Predicate-Argument Structure (PAS) extraction
7. Intended Learning Outcomes
In order to pass the course, a student must be able to describe systems that perform the following tasks, apply them to authentic linguistic data, and evaluate the results:
1. detect and extract attitudes and opinions from text, i.e. Sentiment Analysis (SA);
2. use semantic analysis in the context of Information Extraction (IE);
3. disambiguate instances of polysemous lemmas, i.e. Word Sense Disambiguation (WSD);
4. use robust methods to extract the Predicate-Argument Structure (PAS).
8. Compulsory Readings
1. Bing Liu (2012). Sentiment Analysis and Opinion Mining. Morgan & Claypool.
2. Richard Johansson and Pierre Nugues (2008). Dependency-based Syntactic-Semantic Analysis with PropBank and NomBank. CoNLL 2008: Proceedings of the 12th Conference on Computational Natural Language Learning.
3. Daniel Jurafsky and James H. Martin (2009). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Second Edition, Pearson Education.
4. Daniel Gildea and Daniel Jurafsky (2002). Automatic Labeling of Semantic Roles. Computational Linguistics 28(3), 245-288.
5. Martha Palmer, Daniel Gildea, and Paul Kingsbury (2005). The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics 31(1), 71-106.
6. Additional suggested readings will be listed at the end of each lecture.
9. Demos & Tutorials
This list will be continuously updated, also with your contributions…
10. Assignments and Examination
Four Assignments:
1. Essay writing: independent study of a system, an approach, or a field within semantics-oriented language technology. The study will be presented both as a written essay and an oral presentation. The essay work will also include a feedback step where the work of another group is reviewed.
2. Assignment on Predicate-Argument Structure (PAS)
3. Assignment on Sentiment Analysis (SA)
4. Assignment on Word Sense Disambiguation (WSD)
General Info:
No lab sessions; supervision by email.
Essay and assignments must be submitted to santinim@stp.lingfil.uu.se
Examination:
A written report is submitted for each assignment.
All four assignments are necessary to pass the course.
Grade G will be given to students who pass each assignment. Grade VG will be given to those who pass the essay assignment and at least one of the other ones with distinction.
11. IMPORTANT!
Start thinking about a topic you are interested in for
your essay writing assignment!
12. Practical Organization
45 min + 15 min break
Lectures on the course webpage and on SlideShare
Email all your questions to me: santinim@stp.lingfil.uu.se
IMPORTANT:
Send an email to santinim@stp.lingfil.uu.se, so that I can make sure that I have all the correct email addresses. If you do not get an acknowledgement of receipt, please give me a shout!
13. Interaction and Cooperation
Communicate with me and with your classmates to exchange ideas if you have problems understanding notions, concepts, or practical implementations.
Recommendation: share your knowledge with your peers and let off steam.
Cheating is not permitted.
14. Semantics in Language Technology Overview
SEMANTICS IN LANGUAGE TECHNOLOGY
APPLICATIONS
LEXICAL SEMANTICS
REPRESENTATION OF MEANING
SUMMARY
16. Logic and Semantics
Aristotelian logic – important ever since.
Syllogisms, e.g.:
Premise: No reptiles have fur.
Premise: All snakes are reptiles.
Conclusion: No snakes have fur.
Modern logic developed in the late 19th century – more general and systematic.
Formal semantics in linguistics and philosophy is based on logic (20th century).
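To make this concrete: such a syllogism can be checked mechanically once it is expressed in first-order logic. A minimal Python sketch, assuming the NLTK library is installed (the predicate names are our own):

from nltk.sem import Expression
from nltk.inference.resolution import ResolutionProver

read = Expression.fromstring
premise1 = read('all x.(reptile(x) -> -fur(x))')    # No reptiles have fur.
premise2 = read('all x.(snake(x) -> reptile(x))')   # All snakes are reptiles.
conclusion = read('all x.(snake(x) -> -fur(x))')    # No snakes have fur.

# A resolution prover confirms that the conclusion follows from the premises.
print(ResolutionProver().prove(conclusion, [premise1, premise2]))  # True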
17. Formal and Computational Semantics
Computational semantics “is the study of how to
automate the process of constructing and reasoning
with meaning representations of natural language
expressions.” (Wikipedia).
Early systems rule-based, most famous example:
“Montague grammar” (1970). Sophisticated
mechanisms for translation of English into a very
rich logic.
Language technology: Recent interest in data-driven
and machine learning-based methods.
18. Semantics in NLP
NLP semantics is typically more limited in scope than NL
semantics as analysed in linguistics and philosophy.
NLP applications often handle semantic aspects without
having explicitly semantic components, e.g. in machine
translation.
Other aspects of language – morphology, syntax, etc. –
can be seen as support systems for semantics: The
purpose of language lies in the use of expressions as
carriers of semantic meaning. And that is what many
NLP systems have to respect, e.g. MT, retrieval,
classification, etc.
19. Semantics and Truth (i)
Semantics, meanings and states of affairs:
What a sentence means: a structure involving (lexical) concepts and relations among them. It can be articulated as a semantic representation.
E.g. I ate a turkey sandwich. in predicate logic (one possible event-based rendering):
∃e ∃x (Eating(e) ∧ Eater(e, Speaker) ∧ Eaten(e, x) ∧ TurkeySandwich(x))
A sentence – and the semantic representation of a sentence – is also the representation of a possible state of affairs.
20. Semantics and Truth (ii)
Correspondence theory of truth: the content of a sentence is true if it corresponds to an actual state of affairs; otherwise, it is false.
Ignoring philosophical complications, in many cases we can extract knowledge from texts.
E.g. Warmer climate entails increased release of carbon dioxide by inland lakes. (From a uu.se press release.)
Related issue: Which texts should we trust?
Many sentences are difficult to formalize in logic. (Modality, conditionality, vague quantification, tense, etc.)
22. Formalizing Meaning
Linguistic content has – at least to a certain degree – a logical
structure that can be formalized by means of logical calculi –
meaning representations.
The representation languages should be simple and
unambiguous – in contrast to complex and ambiguous NL.
Logical calculi come with accounts of logical inference. They
are useful for reasoning-based applications.
Meaning formalization faces far-reaching conceptual and
computational difficulties.
23. Compositionality
Linguistic content is compositional: Simple
expressions have a given (lexical) meaning; the
meaning of complex expressions is determined by
the meanings of their constituents.
People produce and understand new phrases and
sentences all the time. (NLP must also deal with
these.)
Compositionality is studied in detail in
compositional syntax-driven semantics. Work in
this field is typically about hand-coded rule systems
for small fragments of NL.
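To see the idea at work, here is a toy compositional interpreter (our own sketch, not a serious grammar): lexical meanings are functions, and the meaning of John sees Mary is computed by applying them to each other, bottom-up.

# Lexical meanings: names denote individuals; a transitive verb denotes a
# function from an object meaning to a function from a subject meaning
# to a logical form.
lexicon = {
    'John': 'john',
    'Mary': 'mary',
    'sees': lambda obj: lambda subj: 'see(%s, %s)' % (subj, obj),
}

def interpret(subject, verb, obj):
    # Compose word meanings following the syntax: (subject (verb object)).
    return lexicon[verb](lexicon[obj])(lexicon[subject])

print(interpret('John', 'sees', 'Mary'))  # see(john, mary)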
28. First-Order Predicate Logic (i)
“flexible, well-understood, and computationally tractable approach to the representation of knowledge [and] meaning” (J&M 2009: 589)
expressive
verifiability against a knowledge base (related to database languages)
inference
model-theoretic semantics
29. First-Order Predicate Logic (ii)
Boolean operators: negation and connectives
Existential/universal quantification
Individual constants
Predicates (taking a number of arguments)
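These ingredients can be tried out directly. A minimal sketch of model-theoretic evaluation with NLTK's semantics module (assumes NLTK is installed; the model and all names are invented for the demo):

from nltk.sem import Valuation, Model, Assignment

val = Valuation([
    ('kim', 'k'),              # individual constant
    ('snake', {'s1', 's2'}),   # unary predicate = a set of entities
    ('likes', {('k', 's1')}),  # binary predicate = a set of pairs
])
dom = val.domain               # {'k', 's1', 's2'}
model = Model(dom, val)
g = Assignment(dom)

# Connectives and quantifiers are evaluated against the model:
print(model.evaluate('snake(kim)', g))                           # False
print(model.evaluate('exists x.(snake(x) & likes(kim, x))', g))  # True
print(model.evaluate('all x.(snake(x) -> likes(kim, x))', g))    # False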
30. When to assume compositionality?
31. Multi-Word Expressions
MWEs (a.k.a. multi-word units or MUs) are lexical units encompassing a wide range of linguistic phenomena, such as idioms (e.g. kick the bucket = to die), collocations (e.g. cream tea = a small meal eaten in Britain, with small cakes and tea), regular compounds (cosmetic surgery), graphically unstable compounds (e.g. self-contained <> self contained <> selfcontained – all graphical variants have a huge number of hits in Google), light verbs (e.g. do a revision vs. revise), lexical bundles (e.g. in my opinion), etc. While easily mastered by native speakers, the correct interpretation of MWEs remains challenging both for non-native speakers and for language technology (LT), due to their complex and often unpredictable nature.
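One common LT technique for spotting collocation-type MWEs is to rank word pairs by association strength. A minimal sketch with NLTK, using its bundled Genesis corpus as a stand-in for real data:

import nltk
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

nltk.download('genesis')
words = nltk.corpus.genesis.words('english-web.txt')

finder = BigramCollocationFinder.from_words(words)
finder.apply_freq_filter(3)  # ignore very rare pairs
# Rank bigrams by pointwise mutual information (PMI).
print(finder.nbest(BigramAssocMeasures().pmi, 10))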
32. Cross-linguality
Use Case: Information Access
In multi-ethnic societies, like the Swedish society, it is common that many non-native speakers use public websites – e.g. the Arbetsförmedlingen or Pensionsmyndigheten websites – to access information that is vital to their living and integration in the host country. National regulations are often accompanied by special terminology and new coinages. For instance, the Swedish expression /egenremiss/ (14,900 hits, Google.se April 2013) – or alternatively, as an MWE, /egen remiss/ (8,210 hits, Google.se April 2013) – denotes a referral to a specialist doctor written by patients themselves. This expression is made up of two common Swedish words, /egen/ 'own (adj)' and /remiss/ 'referral'. It is a recent expression (probably coined around 2010) and not yet recorded in any official dictionary, nor in Wiktionary or other multilingual online lexical resources. However, it is very frequent in query logs belonging to a Swedish public health service website. When trying to implement a cross-lingual search based on the automatic translation of query logs, it turned out that none of the existing multilingual lexical resources contained this expression.
33. Use Case: Personal Use & Text Understanding
The use of expressions that are marked for style, genre, domain, or register (and/or other textual categories), or the use of expressions which are misspelled or idiomatic for some textual category, is beyond the competence of a novice reader or a non-native speaker. Additionally, in a web search or in social networks, one cannot tell if the texts one reads are good or bad the way a first-language reader can. When readers/users read a language they do not know at all, they can use automatic translation, online dictionaries, or other lexical resources. However, what they cannot determine well is the *type* of text they are reading. They cannot tell if the text is verbose, terse, formal, informal, stupid, funny, bad, or good. For instance, the phrase "es ist zum Kotzen" (roughly, 'it makes you sick') signals that this is vernacular and unrefined text, as well as a controversial expression. The phrase "isch alle", instead, signals that this line in the text is spoken by a Berliner.
34. Semantics vs Pragmatics/Discourse (i)
What does a word, a phrase, a text segment mean as an
NL expression? (“Linguistic meaning” – semantics.)
Conventional, static, systemic aspect of meaning.
What does the author intend to convey by means of a
word, a phrase, a text segment? (“Speaker meaning” –
pragmatics/discourse.)
Contextual, dynamic aspect of meaning.
The two aspects depend on each other, of course.
38. Semantics-oriented NLP applications
Machine translation: The translation of a text segment should mean the same as the original (to emphasize linguistic meaning) or should convey the same content (to emphasize speaker meaning).
Information extraction extracts components of the information conveyed by a text.
Question answering is extraction – combined with inference – of an answer to a given question.
Text classification, in typical cases, relates to the meanings of the texts being classified.
39. Semantics and Generation
Generation: semantic representation → NL. Less challenging than analysis – the structure of the input is under control. Needed in e.g. dialogue systems.
Interlingua – semantic representation in machine translation:
Analysis: source language → interlingua.
Generation: interlingua → target language.
Would be economical if many languages are involved. The idea has not proved very successful so far.
40. Reference
Reference is very important – what statements are about.
Referring expressions are very common.
Reference is a discourse phenomenon.
Resolving reference is a crucial step in e.g.
extraction, e.g. in sentiment analysis
translation, e.g. to get agreement right
English it vs French il/elle vs Swedish den/det.
42. Kinds of Referring Expressions
Indefinite noun phrases. E.g. a book. Introduce new entities.
Pronouns. E.g. he. Typically coreferent with a previous referring expression (antecedent).
Names. E.g. Bill Gates.
Demonstratives. E.g. this room.
Other definite noun phrases. E.g. the first chapter. Reference to a somehow-known entity, often previously mentioned.
43. Named Entity Recognition (NER)
To identify expressions being used as names. (What characterizes a “name”?)
Also to identify what kind of name it is: e.g. of a person, or a place, or a stretch of time, or a chemical compound, or a gene, etc.
“State-of-the-art NER systems for English produce near-human performance. For example, the best system entering MUC-7 scored 93.39% of F-measure while human annotators scored 97.60% and 96.95%” (Wikipedia).
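A quick way to experiment with NER is NLTK's off-the-shelf chunker. A minimal sketch (assumes NLTK and the listed data packages are installed; accuracy is well below the MUC-7 figures quoted above):

import nltk
for pkg in ('punkt', 'averaged_perceptron_tagger', 'maxent_ne_chunker', 'words'):
    nltk.download(pkg)

sentence = 'Bill Gates founded Microsoft in Albuquerque in 1975.'
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)  # part-of-speech tagging first
tree = nltk.ne_chunk(tagged)   # then chunk and label named entities
print(tree)  # PERSON, ORGANIZATION, GPE, ... labels on the recognized names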
44. Anaphora and Deixis Resolution
Pronouns (they), pronominal adverbs (there, then), and definite NPs refer to entities by means of contextually given information.
E.g. by referring to previously mentioned referents – anaphora.
E.g. by reference based on the participants, time, and place of the discourse – deixis (e.g. I, you, here, yesterday).
Anaphora and deixis resolution is a much more challenging task than NER. The reference of name-like graphical words is much more predictable. Compare Barack Obama and he.
45. Sentiment Analysis – an extraction task
What views do people express in blogs and reviews? That's interesting for politicians and marketing people.
Opinions are often expressed in a personal and informal way.
E.g. Peter bought me a Baileys marzipan chocolate thing which I washed down with Gluehwein and that, in combination with the bright lights and cheery faces really made me feel warm inside! (From a blog post.)
Sentiment analysis: to extract the referent of a “sentiment” and the polarity positive–negative associated with it.
E.g. Baileys marzipan chocolate – positive.
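The simplest baseline is lexicon-based polarity scoring. A deliberately naive sketch (the mini-lexicon is invented, and a real system must also find what the sentiment is about):

POLARITY = {'warm': 1, 'cheery': 1, 'bright': 1, 'great': 1,
            'bad': -1, 'awful': -1, 'boring': -1}

def polarity(text):
    # Sum word-level polarity scores and map the total to a label.
    score = sum(POLARITY.get(w.strip('!,.').lower(), 0) for w in text.split())
    return 'positive' if score > 0 else 'negative' if score < 0 else 'neutral'

print(polarity('the bright lights and cheery faces made me feel warm inside!'))
# -> positive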
47. Lexical Concepts
Words are often grammatically simple, but carry a structured conceptual content. Definitions “unpack” the content of concepts:
friend – a person whom one knows well, is loyal to, etc.
turkey – a kind of animal, a bird, etc.
sandwich – a kind of food item, contains bread, etc.
eat – a relation (holding in/of an event) between an organism and a food item, the food is chewed and ingested, etc.
51. Synonymy
Synonymy holds between two words (word tokens) which express the same or similar concepts.
Unsupervised detection of synonymy can be based on “The Distributional Hypothesis: words with similar distributions have similar meanings.” The Distributional Hypothesis in linguistics is the theory that words that occur in the same contexts tend to have similar meanings. The underlying idea that "a word is characterized by the company it keeps" was popularized by Firth.
“Random Indexing” is a method here. (“a high-dimensional model can be projected into a space of lower dimensionality without compromising distance metrics if the resulting dimensions are chosen appropriately”)
Synonymy knowledge is useful in e.g. translation, text classification, and information extraction. Also “query expansion” in retrieval.
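A tiny Random Indexing sketch (our own illustration, plain NumPy): every word gets a sparse random index vector, and a word's context vector is the sum of the index vectors of its neighbours. Words with similar distributions then end up with similar context vectors.

import numpy as np

rng = np.random.default_rng(0)
DIM, NONZERO = 300, 6  # dimensionality and number of +/-1 entries

def index_vector():
    v = np.zeros(DIM)
    positions = rng.choice(DIM, NONZERO, replace=False)
    v[positions] = rng.choice([-1, 1], NONZERO)
    return v

corpus = 'the cat sat on the mat the dog sat on the rug'.split()
index = {w: index_vector() for w in set(corpus)}
context = {w: np.zeros(DIM) for w in index}

for i, w in enumerate(corpus):  # a window of one word to each side
    for j in (i - 1, i + 1):
        if 0 <= j < len(corpus):
            context[w] += index[corpus[j]]

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

print(cos(context['cat'], context['dog']))  # relatively high: shared contexts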
56. Word Senses
Discerning word senses (for a lemma) –
lexicographical task, matter of sophisticated
linguistic judgements.
Theoretical principles. Practical purpose.
Different dictionaries make different analyses.
English: WordNet – a standard resource.
57. Senses of day in WordNet, for instance (i)
[Slide shows a screenshot of the WordNet senses of day]
58. Senses of day in WordNet, for instance (ii)
[Slide shows a screenshot of the WordNet senses of day, continued]
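The same listing can be reproduced programmatically with NLTK's WordNet interface (assumes NLTK and its wordnet data are installed):

import nltk
nltk.download('wordnet')
from nltk.corpus import wordnet as wn

for synset in wn.synsets('day'):
    print(synset.name(), '-', synset.definition())
# day.n.01 - time for Earth to make a complete rotation on its axis
# ... WordNet lists around ten noun senses for 'day'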
59. Word Sense Disambiguation (WSD)
A distributional hypothesis for WSD: words representing the same sense have more similar distributions than words representing different senses.
I.e. distribution similarity implies sense similarity.
We can use this for supervised learning of WSD.
This requires data in the form of a sense-tagged corpus (based on a given sense inventory, e.g. the one given by WordNet).
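A minimal supervised WSD sketch along these lines, using scikit-learn (the sense-tagged examples are an invented stand-in for a real corpus such as SemCor):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A tiny stand-in for a sense-tagged corpus: context sentences + sense labels.
contexts = [
    'he deposited the money in the bank account',
    'the bank raised its interest rates',
    'they sat on the bank of the river fishing',
    'the boat drifted toward the grassy bank',
]
senses = ['bank/finance', 'bank/finance', 'bank/river', 'bank/river']

# Bag-of-context-words features + a linear classifier.
clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit(contexts, senses)
print(clf.predict(['she opened an account at the bank']))  # expected: ['bank/finance']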
60. Manual Sense-Tagging
More difficult than typical grammatical tagging.
As we saw in the day example, senses and their distinctions can be quite subtle. Definitions and examples are often far from obvious.
Expensive: requires competent people and standardised procedures.
Quality measure: inter-annotator agreement. E.g.: “Cohen's kappa coefficient is a statistical measure of inter-rater agreement or inter-annotator agreement for qualitative (categorical) items. It is generally thought to be a more robust measure than simple percent agreement calculation since κ takes into account the agreement occurring by chance.”
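For concreteness, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance given each annotator's label distribution. A small self-contained example with invented sense tags:

from collections import Counter

def cohen_kappa(ann1, ann2):
    n = len(ann1)
    p_o = sum(a == b for a, b in zip(ann1, ann2)) / n  # observed agreement
    c1, c2 = Counter(ann1), Counter(ann2)
    p_e = sum(c1[label] * c2[label] for label in c1) / (n * n)  # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Two annotators sense-tag eight tokens of 'day':
a1 = ['d1', 'd1', 'd2', 'd2', 'd3', 'd1', 'd2', 'd3']
a2 = ['d1', 'd1', 'd2', 'd3', 'd3', 'd1', 'd2', 'd2']
print(cohen_kappa(a1, a2))  # ~0.619: good agreement beyond chance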
62. Conclusions (i)
Logic-based semantics is a theoretical foundation for
NLP semantics, but implemented systems are
typically more coarse-grained and of a more limited
scope.
Meaning depends both on literal content and
contextual information. This is a challenge for most
NLP tasks.
Most NLP applications have to be highly sensitive to
semantics.
63. Conclusions (ii)
Finding and interpreting names and other referential expressions is a central issue for NLP semantics.
Disambiguation of polysemous lexical tokens is also a central issue for NLP semantics.
Accessing the content of lexical tokens is also useful.
Meaning representation involves predicate-argument structure, which captures a basic aspect of NL compositionality.
64. Start thinking about a topic of interest for your essay writing! Tell me your thoughts next time…
65. Suggested Readings
Term Logic (Wikipedia)
Predicate Logic (Wikipedia)
Jurafsky and Martin (2009):
Ch. 17 ”Representation of Meaning”
Ch. 18 ”Computational Semantics”
Ch. 19 ”Lexical Semantics”
Ch. 20 ”Computational Lexical Semantics”
Clark et al. (2010):
Ch. 15 ”Computational Semantics”
Indurkhya and Damerau (2010):
Ch. 5 ”Semantic Analysis”
66. This is the end… Thanks for your attention!