1. Natural Language Processing
Daniel Dahlmeier
NUS Graduate School for Integrative Sciences and Engineering
danielhe@comp.nus.edu.sg
CSTalks 2 November 2011
2. Acknowledgments
Examples and figures from Michael Collins’ lecture notes:
http://www.cs.columbia.edu/~mcollins.
Some other figures are from Wikipedia: http://www.wikipedia.org.
The rest I randomly found on the web.
3. Examples
What is NLP?
Background
NLP tasks
Why is it hard?
Related Stuff
Conclusion
Google Translate
4. Examples
IBM’s Watson computer wins at Jeopardy!
5. Examples
Siri
6. What is Natural Language Processing?
Natural Language Processing (NLP) or Computational Linguistics
Language processing that goes beyond a “bag of words” representation.
Example
Translate from one language into the other.
Answer natural language questions.
Parse the syntactic/semantic structure of a sentence.
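To make the “bag of words” limitation concrete, here is a minimal sketch in Python. The function name bag_of_words and the two example sentences are illustrative assumptions, not part of the talk. It shows that two sentences with opposite meanings collapse to the same representation once word order is thrown away.

```python
from collections import Counter

def bag_of_words(sentence):
    """Return word counts, ignoring case and word order."""
    return Counter(sentence.lower().split())

# Two sentences with opposite meanings (hypothetical examples).
a = bag_of_words("Dog bites man")
b = bag_of_words("Man bites dog")

# The bags are identical, so who bit whom is lost.
print(a == b)
```

Anything that depends on structure, such as translation or parsing, needs more than these counts.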
The other NLP
NLP = neuro-linguistic programming.
7. Background(s): Artificial Intelligence
Talk to your computer
Dave: Hello, HAL. Do you read me, HAL?
HAL: Affirmative, Dave. I read you.
Dave: Open the pod bay doors, HAL.
HAL: I’m sorry, Dave. I’m afraid I can’t do that.
The computer needs to ...
Understand the user: Natural Language Understanding.
Generate a well-formed reply: Natural Language Generation.
8. Background(s): Artificial Intelligence (cont.)
Turing Test
Experimenter C talks to two parties, A and B, via a terminal.
If C cannot distinguish which party is a computer and which is a
human, we should consider the computer to be intelligent.
Natural language is deeply intertwined with intelligence.
9. Background(s): Linguistics
Generative Linguistics
Humans can produce and understand an infinite number of
sentences by means of a finite set of rules.
Language is produced through a generative, recursive process in the
human brain.
The principles that underlie this process are universal to all
languages (universal grammar).
10. Background(s): the Web
“We are drowning in information but starved for knowledge.”
by Edward Osborne Wilson
Too much text to read...
Wikipedia: over 3.7 million articles (English).
PubMed: over 20 million citations.
WWW: billions of pages, trillions of words.
11. Part-of-speech Tagging
Input: a sentence.
Output: a part-of-speech tag sequence, e.g., noun, verb, adjective,...
Example
Profits/N soared/V at/P Boeing/N Co./N ,/, easily/ADV topping/V
forecasts/N on/P Wall/N Street/N ./.
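The word/TAG notation in the example can be read mechanically. The following Python sketch only parses the tagged sentence above; it is not a tagger, and the helper name parse_tagged is an assumption made for illustration.

```python
def parse_tagged(text):
    """Split 'word/TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # Split on the last slash so that ',/,' and './.' are handled too.
        word, tag = token.rsplit("/", 1)
        pairs.append((word, tag))
    return pairs

tagged = ("Profits/N soared/V at/P Boeing/N Co./N ,/, easily/ADV topping/V "
          "forecasts/N on/P Wall/N Street/N ./.")
pairs = parse_tagged(tagged)

# Collect every word tagged as a noun.
nouns = [word for word, tag in pairs if tag == "N"]
print(nouns)
```

A real tagger has to predict the tag for each word from context; this sketch assumes the tags are already given.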
12. Named-entity recognition
Input: a sentence.
Output: a named-entity tag sequence in BIO format over types such as
PERSON and ORGANIZATION, with O for everything else.
Example
Profits/O soared/O at/O Boeing/B-ORG Co./I-ORG ,/O easily/O
topping/O forecasts/O on/O Wall/O Street/O ./O
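As a rough illustration of how BIO tags encode multi-word entities, the Python sketch below groups the tagged tokens from the example into entity spans. The function name bio_to_entities and the decision to drop a stray I- tag that has no matching B- tag are assumptions of this sketch, not something the talk specifies.

```python
def bio_to_entities(tagged_tokens):
    """Group (word, tag) pairs in BIO format into (type, text) entities."""
    entities = []
    words, etype = [], None
    for word, tag in tagged_tokens:
        if tag.startswith("B-"):
            # A B- tag always starts a new entity; close any open one first.
            if words:
                entities.append((etype, " ".join(words)))
            words, etype = [word], tag[2:]
        elif tag.startswith("I-") and words and tag[2:] == etype:
            # An I- tag of the same type continues the open entity.
            words.append(word)
        else:
            # O (or an I- tag with nothing to continue) closes any open entity.
            if words:
                entities.append((etype, " ".join(words)))
            words, etype = [], None
    if words:
        entities.append((etype, " ".join(words)))
    return entities

sentence = ("Profits/O soared/O at/O Boeing/B-ORG Co./I-ORG ,/O easily/O "
            "topping/O forecasts/O on/O Wall/O Street/O ./O")
tokens = [tuple(tok.rsplit("/", 1)) for tok in sentence.split()]
print(bio_to_entities(tokens))
```

For the example sentence this yields a single ORG entity covering "Boeing Co.".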
13. Word Sense Disambiguation
Input: a sentence.
Output: the sense of each word in the sentence.
Example
I/sense1 can/sense1 can/sense2 a/sense1 can/sense3 .
14. Parsing
Input: a sentence.
Output: the syntactic tree structure of the sentence.
Example
Boeing is located in Seattle.
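One common way to hold a syntactic tree in code is as nested tuples of the form (label, child, ...). The Python sketch below encodes one plausible analysis of the example sentence; the labels (S, NP, VP, PP, N, V, P) and the exact bracketing are assumptions for illustration, since the tree figure itself is not reproduced here.

```python
# One plausible phrase-structure analysis (assumed, not taken from the slide).
tree = ("S",
        ("NP", ("N", "Boeing")),
        ("VP", ("V", "is"),
               ("VP", ("V", "located"),
                      ("PP", ("P", "in"),
                             ("NP", ("N", "Seattle"))))))

def bracketed(node):
    """Render a tree in the usual bracketed notation."""
    if isinstance(node, str):
        return node
    label, *children = node
    return "(" + label + " " + " ".join(bracketed(c) for c in children) + ")"

def leaves(node):
    """Return the words of the sentence, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words

print(bracketed(tree))
print(leaves(tree))
```

The parsing task is the reverse direction: given only the words, recover the tree.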
15. Machine Translation
Input: a sentence in language F.
Output: the translated sentence in language E.
Example
Input: Syriens Präsident Baschar al-Assad hat den Westen davor
gewarnt, sich in die Angelegenheiten seines Landes einzumischen.
Output: Syrian President Bashar al-Assad has warned the West against
interfering in the affairs of his country.
16. Why is it hard? (example from L. Lee)
“At last, a computer that understands you like your mother”
17. Ambiguity of Natural Language
“At last, a computer that understands you like your mother”
This could mean...
1. It understands you as well as your mother understands you.
2. It understands (that) you like your mother.
3. It understands you as well as it understands your mother.
Readings 1 and 3: does this mean well, or poorly?
18. Ambiguity at the Acoustic Level
“At last, a computer that understands you like your mother”
This sounds like...
1. “... a computer that understands you like your mother.”
2. “... a computer that understands you lie cured mother.”
19. Ambiguity at the Syntactic (structure) Level
“At last, a computer that understands you like your mother”
20. Ambiguity at the Syntactic (structure) Level
“List all flights on Tuesday.”
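The ambiguity in this request is a question of where “on Tuesday” attaches. The Python sketch below writes out two candidate structures as nested tuples; the labels and bracketings are illustrative assumptions, not a reproduction of the original figure. In the first, the phrase modifies “flights” (flights that depart on Tuesday); in the second, it modifies “List” (do the listing on Tuesday).

```python
# Shared prepositional phrase (labels are assumed for illustration).
pp = ("PP", ("P", "on"), ("NP", ("N", "Tuesday")))

# Reading 1: "on Tuesday" attaches to the noun phrase "all flights".
noun_attach = ("VP", ("V", "List"),
               ("NP", ("NP", ("DT", "all"), ("N", "flights")), pp))

# Reading 2: "on Tuesday" attaches to the verb "List".
verb_attach = ("VP", ("V", "List"),
               ("NP", ("DT", "all"), ("N", "flights")),
               pp)

def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words

# Same word sequence, different structure: that is the ambiguity.
print(leaves(noun_attach) == leaves(verb_attach))
print(noun_attach != verb_attach)
```

A parser has to choose between such structures, usually with the help of context or statistics.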
21. Ambiguity at the Semantic (meaning) Level
Definition of “mother”
1. a woman who has given birth to a child
2. a stringy slimy substance consisting of yeast cells and bacteria; is
added to cider or wine to produce vinegar.
More ambiguity
They put money in the bank (= buried in mud?).
I saw her duck with a telescope (= a duck carrying a telescope?).
22. Ambiguity at the Discourse (multi-clause) Level
Anaphora resolution
Alice says they’ve built a computer that understands you like your
mother.
But she ...
... doesn’t know any details (Alice)
... doesn’t understand me at all (my mother)
23. Related Stuff
Machine Learning
Machine learning is what made large-scale, open-domain NLP applications possible.
Information Retrieval
Both need to “understand” language.
Linguistics
Interested in the nature of language.
Psychology / Cognitive Science
Both interested in human cognitive capabilities.
24. Conclusion
What I have told you...
What NLP is about.
Some NLP tasks that people work on.
Why it’s not that easy.
What I haven’t told you
How do you solve all these problems?
How well does it work?
What is left to be done?
25. Would you like to know more?
NLP courses at NUS
CS4248: Natural Language Processing
CS6207: Advanced Natural Language Processing
Books
Jurafsky and Martin, Speech and Language Processing (2nd Edition)