Charlie Greenbacker, founder and co-organizer of the DC NLP meetup group, provides a "crash course" in Natural Language Processing techniques and applications.
2. Agenda
• Introduction & Motivation
• Famous Examples
• Basics
• Major Task Areas
• Protips
• Resources
3. Introduction & Motivation
By "NLP" we mean...
Natural Language Processing
(#NLProc)
aka Computational Linguistics, Text Analytics, etc.
not Neuro-linguistic Programming! (#NLP)
4. Introduction & Motivation
Natural Language Processing is...
Using computers to process (i.e., analyze,
understand, generate, etc.) natural human
languages (e.g., English, Chinese, Klingon).
Hello, world! 你好，世界！
5. That sounds hard... why should I care?
• Most of the knowledge created by humans
is unstructured text (information overload)
• Need some way to make sense of it all
• Enable quantitative analysis of text data
Introduction
& Motivation
6. Famous Examples
Siri (Apple, SRI, Nuance)
Speech Recognition/Generation
IBM Watson
Question Answering
Google Translate
Machine Translation
16. Major Task Areas
Assistive Technologies
• Text simplification
• Predictive text input
• Alternative interfaces
17. Major Task Areas
NLG + Automatic Summarization
• Generating text from data
• Extractive summarization
• Abstractive summarization
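To make "extractive summarization" concrete, here is a minimal sketch, assuming a crude regex sentence splitter and plain word-frequency scoring (both are illustrative choices, not something the slides prescribe). The function name `summarize` is hypothetical. It picks existing sentences out of the text rather than writing new ones, which is what separates extractive from abstractive approaches.

```python
import re
from collections import Counter


def summarize(text, n=1):
    """Return the n highest-scoring sentences, kept in their original order."""
    # Crude sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Count how often each lowercased word occurs in the whole text.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        # Average corpus frequency of the words in this sentence.
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    top = sorted(sentences, key=score, reverse=True)[:n]
    # Emit the selected sentences in document order, not score order.
    return [s for s in sentences if s in top]


if __name__ == "__main__":
    print(summarize("Cats sleep. Cats eat fish. Dogs bark.", n=2))
```

Real systems usually add stop-word removal, better sentence segmentation, and redundancy control; this sketch leaves all of that out.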
18. Major Task Areas
Machine Translation
• From source to target, and back!
• Single terms work... sometimes
• Idioms, metaphors, cultural references
19. Major Task Areas
Sentiment Analysis
• Polarity, intensity, direction
• "Easy" for movie/product reviews
• "Impossible" for nearly anything else
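A rough sketch of polarity scoring, assuming a tiny hand-made word list (the `LEXICON` contents and the `polarity` function are hypothetical, for illustration only). It works on a plain review sentence and fails on sarcasm, which hints at why the task gets so much harder outside of reviews.

```python
# Hypothetical toy lexicon; real systems use much larger resources or trained models.
LEXICON = {"great": 1, "good": 1, "love": 1, "bad": -1, "boring": -1, "terrible": -1}


def polarity(text):
    """Label text as positive, negative, or neutral by summing word scores."""
    # Strip basic punctuation and lowercase each whitespace-separated token.
    score = sum(LEXICON.get(w.strip(".,!?").lower(), 0) for w in text.split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(polarity("Great movie, I love it!"))   # plain review language
    print(polarity("Oh great, another delay."))  # sarcasm: the word list is fooled
```

This only covers polarity; intensity and direction (who feels what about whom) need more than word counting.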
20. Protips
• Domain adaptation
(retrain your models, social media != news)
• Assume everything is in beta
(error rates compound, translate last,
consult the research literature)
• Evaluation is essential
(human judges, "gold standard" data,
cross-validation, appropriate metrics)
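As one example of "appropriate metrics," the sketch below compares system output against "gold standard" labels using precision, recall, and F1. The function name and the `"pos"`/`"neg"` label values are assumptions for illustration; which metric is appropriate depends on the task.

```python
def precision_recall_f1(gold, predicted, positive="pos"):
    """Compute precision, recall, and F1 for one class against gold labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)  # true positives
    fp = sum(1 for g, p in pairs if g != positive and p == positive)  # false positives
    fn = sum(1 for g, p in pairs if g == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = ["pos", "pos", "neg", "neg"]
    predicted = ["pos", "neg", "pos", "neg"]
    print(precision_recall_f1(gold, predicted))
```

Accuracy alone can mislead when one class dominates, which is one reason to report precision and recall separately.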
22. Resources
(books)
Natural Language
Processing with Python
Bird, Klein, and Loper
Speech and Language
Processing
Jurafsky and Martin
Foundations of Statistical
Natural Language Processing
Manning and Schütze