Natural Language Processing
Unit 1 – Introduction
Anantharaman Narayana Iyer
narayana dot anantharaman at gmail dot com
7th Aug 2015
Topics
• Motivation: Why NLP?
• Course Outline
• Grading Policy
What are the opportunities for NLP?
NLP is a hugely important topic for both industry and academia
Trends that accelerate NLP research
• Availability of web and social data
• Mobile devices as a source of data
• Need for natural language based I/O for
new devices
• ML techniques, e.g. deep learning
• Increasing availability of datasets on the open
web, e.g. Freebase, DBpedia
Motivation
• Google Search Engine
• Intelligently responding to the
query, e.g. “Where is India Gate?”
• Predicting next word for
autocompletion
• Ability to do spelling corrections
• Segmenting words that may be
joined without space
• Ranking the search results
• Google Translate
• Gmail
• E.g. understanding the contents of an e-mail
through NLP and alerting the user
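The bullet on segmenting words that are joined without a space can be made concrete with a small sketch. The vocabulary below is a toy word list assumed purely for illustration (it is not what any search engine uses), and `segmentations` is a hypothetical helper that enumerates every way to split a string into known words.

```python
from functools import lru_cache

# Toy vocabulary, assumed for illustration only.
VOCAB = {"where", "is", "india", "gate", "in", "dia"}

def segmentations(text, vocab=frozenset(VOCAB)):
    """Return every way to split `text` into words taken from `vocab`."""
    @lru_cache(maxsize=None)
    def split(start):
        if start == len(text):
            return [[]]  # exactly one way to segment the empty suffix
        results = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in vocab:
                # Extend every segmentation of the remaining suffix.
                for rest in split(end):
                    results.append([word] + rest)
        return results
    return split(0)

for words in segmentations("whereisindiagate"):
    print(" ".join(words))
```

With this vocabulary the query has two segmentations ("where is india gate" and "where is in dia gate"), so even this small task runs into ambiguity. A practical system would presumably rank the candidates, for example with a language model.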
Speech/NLP
• What technologies
are involved here?
- Continuous Speech Recognition
- Keyword Spotting
- Text to speech
- Speech in Speech out systems
- Speaker identification
- Novel applications (to be explained on the board)
Disambiguation
• Consider an example below.
• We would like to collect tweets on a subject
(Say Rahul Gandhi) and analyse the
sentiment
• We can search Twitter using the
Search API with the keywords: “Rahul Gandhi”
• This might miss tweets that have only the
term Rahul and not Gandhi.
• If we instead search for the separate terms
[“Rahul”, “Gandhi”], we may get results that
match any Rahul (e.g. Rahul Dravid or KL
Rahul)
• We can do an intelligent tweet search
using NLP techniques
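A minimal sketch of one such technique, filtering by context words, is given below. It makes no Twitter API call; the tweets are made up, and the cue list and the `mentions_target` helper are assumptions chosen by hand for this illustration rather than a recommended method.

```python
import re

# Hypothetical context cues for the intended person, hand-picked for this sketch.
CONTEXT_CUES = {"gandhi", "congress", "parliament", "election", "rally"}

def tokenize(text):
    """Lowercase a tweet and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def mentions_target(tweet, name="rahul", cues=CONTEXT_CUES):
    """True if the tweet contains `name` and at least one context cue."""
    tokens = set(tokenize(tweet))
    return name in tokens and bool(tokens & cues)

tweets = [  # made-up examples
    "Rahul Gandhi addressed a rally today",
    "Rahul spoke in parliament about farmers",
    "KL Rahul scores a century",
    "Rahul Dravid named coach",
]
for tweet in tweets:
    print(mentions_target(tweet), tweet)
```

The first two tweets are kept and the two cricket tweets are dropped. A hand-written cue list is brittle, so a trained classifier or a named-entity disambiguation model would be the more likely choice in practice.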
Summarization
• The challenge we face is not the lack of
information but the overload.
• Summarization is a core technology that
can help address information overload
• Related Problems:
• How do we validate the quality and correctness of
information?
• Summarizing multimedia
• How do we summarize social data, where:
• Data may have less signal, more noise!
• Data may be biased
• Data may not be factual
• Repetitive
• Can we autogenerate a (set of) Tweet(s)
from a news article?
Answer Evaluation
• Answer evaluation is a core
challenge for online
education systems.
• Wouldn’t it be nice if
questions could be both
descriptive and
objective?
• Can there be an automated
answer evaluation system
that doesn’t require peer
evaluation?
Sentiment Analysis
• Measurement of pulse of people
from social media
• Can measure sentiment towards
a brand, product or event.
• Crowded space but not a fully
solved problem due to inherent
challenges in Natural Language
Processing
• Can we build a sentiment
analyser using RNNs and
evaluate the performance?
Plagiarism Detection
Dialog Systems
• Dialog systems that can be deployed
commercially?
• Natural Language Processing
• Natural language generation
Can we build an NLG library and make it open source?
Demo
• http://www.manifestation.com/neurotoys/eliza.php3
Course Structure
• Foundational
• Emerging
• Applications
Course Positioning
• Classical NLP techniques (such as language models, MaxEnt
classifiers, HMMs, CRFs etc.) have proven effective in
addressing problems like part-of-speech tagging, text
classification, information retrieval etc. However, they are
inadequate when dealing with problems that involve more
semantics.
• Modern approaches (such as deep learning) hold a lot of
promise in addressing problems involving semantics. They
have also been shown to match or outperform
classical techniques on typical NLP tasks.
• Internationally acclaimed courses like those offered by Dan
Jurafsky, Christopher Manning and Michael Collins on Coursera,
and also those offered at Stanford, are strong in the
traditional topics and somewhat light when discussing
emerging topics.
• The recent course by Socher at Stanford is heavy on
recurrent-network-based approaches but assumes that the
student is already reasonably familiar with traditional NLP.
• Our course takes the best of both worlds and backs it up
with intense hands-on work.
Key Topics
• Foundational
• Words, sentences: Tokenization, regular expressions, challenges of ambiguity, edit distance,
spelling corrections, string similarity, tf, tf-idf
• Stemming, Lemmatization
• Language models, smoothing, applications to speech, metrics
• Tagging problems: Viterbi Algorithm (HMM), POS, NER tagging, SRL
• Parsing: PCFG, CKY algorithm
• Information Retrieval, Information Extraction, Word Sense disambiguation, Summarization,
Q&A systems, Dialogue Systems
• Natural Language Generation
• Emerging Approaches:
• Deep Learning and Vector Space approaches to: Word representation, Sentence and text
compositionality, LM, Parsing, Q&A Systems
• Applications:
• Modern approaches to many exciting applications including speech
Course Grading Policy
• Unit Evaluations (3 out of 5): 30%
• Lab sessions (2 out of 5): 10%
• T1: 15%
• Final Exam: 3 days, 6 to 8 hours per day of product development (will
be run like a hackathon, with a 90-minute objective-type written test
on day 1): 15% (for the test) + 25% (for the hands-on work)
• Attendance: 5%
Challenges: Why is NLP hard?
The central challenge of Natural Language Processing is ambiguity, and
it exists at every level or stage of NLP.
Poets and writers thrive on ambiguity in language semantics, while
most of us abhor ambiguity!
Can NLP understand poetry or, better still, generate it?
That seems to be the ultimate test!
Another challenge is representation: how do we represent words,
sentences and large texts? How do we model real-world knowledge?
One prayer, 25 interpretations! (Ref: Raghuvamsha
by Kalidasa)
Vagarthaviva sampriktau vagarthapratipattaye | Jagatah pitarau
vande parvathiparameshwarau || (Raghuvamsha 1.1)
• Common Meaning: I pray to the parents of the world, Lord Shiva and
Mother Parvathi, who are as inseparable as speech and its meaning, to
gain knowledge of speech and its meaning.
Ambiguity – some examples
• Homophones: Words with the same pronunciation but different meanings
• Peace, piece: A spoken sentence like “The PM attended the peace summit” is ambiguous at the term “peace”, as
a speech-to-text system might transcribe it as “piece”
• Knew, new
• Weak, week
• Word boundary
• It’s all ready, looking great!
• It’s already looking great!
• Syntactic Ambiguity: Arises when the same input has more than one parse tree
• Phrase boundary
• Ananth created the presentation with video from web: ‘with video’ can attach to the verb (Ananth created the presentation, using video) or to
the noun (Ananth created the ‘presentation with video’)
• Semantic level ambiguity: Many ways to interpret a sentence
• John and Susan are married (to each other? Separately?)
• Ram had a smooth sailing.
• Prices have gone through the roof
• India says it can’t accept the proposal
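The syntactic ambiguity example can be checked mechanically. The sketch below counts parse trees with a CKY chart (CKY is listed under the course's parsing topics); the tiny grammar and lexicon are assumptions written for this one sentence, with "from web" dropped to keep the grammar small.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form, assumed for illustration.
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "ananth": ["NP"], "video": ["NP"], "the": ["Det"],
    "presentation": ["N"], "created": ["V"], "with": ["P"],
}

def count_parses(tokens, root="S"):
    """Count the distinct parse trees for `tokens` using a CKY chart."""
    n = len(tokens)
    # chart[(i, j)][X] = number of ways symbol X derives tokens[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, tok in enumerate(tokens):
        for tag in LEXICON.get(tok, []):
            chart[(i, i + 1)][tag] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY:
                    ways = chart[(i, k)][left] * chart[(k, j)][right]
                    if ways:
                        chart[(i, j)][parent] += ways
    return chart[(0, n)][root]

sentence = "ananth created the presentation with video".split()
print(count_parses(sentence))
```

Under this grammar the sentence has two parses, one for each attachment of "with video". Without the prepositional phrase the count drops to one.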
Representation: Text, Images, Audio, Video
• What are the distinguishing characteristics of text data and what are the unique challenges?
• Text is made of words, images of pixels, audio of sampled and digitized signal values, video of
image frames in sequence
• How do we represent a piece of text in the computer?
• Let’s do a simple exercise: What thoughts and emotions cross your mind when you hear
the following words?
• Kalam
• Brilliant
• Pleasant
• Destruction
• Perfume
• Code
• Test
• Run
• Signal
• Words can be used in different contexts and the context is key to interpreting the
meaning of the word
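One of the simplest answers to the representation question is a bag-of-words count vector. The sketch below is only an illustration of that idea; the sentences are made up and the helper names `build_vocab` and `bag_of_words` are hypothetical.

```python
from collections import Counter

def build_vocab(sentences):
    """Map each distinct lowercase word to an integer index, in sorted order."""
    words = sorted({w for s in sentences for w in s.lower().split()})
    return {w: i for i, w in enumerate(words)}

def bag_of_words(sentence, vocab):
    """Return a count vector for `sentence` over the words in `vocab`."""
    counts = Counter(sentence.lower().split())
    # dicts preserve insertion order, so positions follow the vocab indices
    return [counts.get(w, 0) for w in vocab]

sentences = [  # made-up examples using the word "run" in different senses
    "run the test",
    "the test run failed",
    "a morning run",
]
vocab = build_vocab(sentences)
print(vocab)
for s in sentences:
    print(bag_of_words(s, vocab), s)
```

The word "run" lands in the same vector position in all three sentences, whether it is a command, a test execution or a jog, and word order is discarded entirely. This is the limitation that context-aware word representations try to address.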

Weitere ähnliche Inhalte

Was ist angesagt?

Natural language processing
Natural language processingNatural language processing
Natural language processingHansi Thenuwara
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Saurabh Kaushik
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introductionRobert Lujo
 
UCU NLP Summer Workshops 2017 - Part 2
UCU NLP Summer Workshops 2017 - Part 2UCU NLP Summer Workshops 2017 - Part 2
UCU NLP Summer Workshops 2017 - Part 2Yuriy Guts
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language ProcessingMichel Bruley
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Saurabh Kaushik
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingYasir Khan
 
Lecture 1: Semantic Analysis in Language Technology
Lecture 1: Semantic Analysis in Language TechnologyLecture 1: Semantic Analysis in Language Technology
Lecture 1: Semantic Analysis in Language TechnologyMarina Santini
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)VenkateshMurugadas
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingMariana Soffer
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Mustafa Jarrar
 
Recent Advances in NLP
  Recent Advances in NLP  Recent Advances in NLP
Recent Advances in NLPAnuj Gupta
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language ProcessingDavid Rostcheck
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language ProcessingPranav Gupta
 
Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLPSatyam Saxena
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingTed Xiao
 
Natural Language Processing in Alternative and Augmentative Communication
Natural Language Processing in Alternative and Augmentative CommunicationNatural Language Processing in Alternative and Augmentative Communication
Natural Language Processing in Alternative and Augmentative CommunicationDivya Sugumar
 

Was ist angesagt? (20)

Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introduction
 
UCU NLP Summer Workshops 2017 - Part 2
UCU NLP Summer Workshops 2017 - Part 2UCU NLP Summer Workshops 2017 - Part 2
UCU NLP Summer Workshops 2017 - Part 2
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language Processing
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Lecture 1: Semantic Analysis in Language Technology
Lecture 1: Semantic Analysis in Language TechnologyLecture 1: Semantic Analysis in Language Technology
Lecture 1: Semantic Analysis in Language Technology
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Nlp
NlpNlp
Nlp
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing
 
Recent Advances in NLP
  Recent Advances in NLP  Recent Advances in NLP
Recent Advances in NLP
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
NLP
NLPNLP
NLP
 
Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLP
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language Processing
 
Natural Language Processing in Alternative and Augmentative Communication
Natural Language Processing in Alternative and Augmentative CommunicationNatural Language Processing in Alternative and Augmentative Communication
Natural Language Processing in Alternative and Augmentative Communication
 

Andere mochten auch

Practical Natural Language Processing
Practical Natural Language ProcessingPractical Natural Language Processing
Practical Natural Language ProcessingJaganadh Gopinadhan
 
Deep Learning Primer - a brief introduction
Deep Learning Primer - a brief introductionDeep Learning Primer - a brief introduction
Deep Learning Primer - a brief introductionananth
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processingrohitnayak
 
Natural language processing
Natural language processingNatural language processing
Natural language processingYogendra Tamang
 
Natural language processing
Natural language processingNatural language processing
Natural language processingprashantdahake
 
Natural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlpNatural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlpananth
 
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
Deep Learning For Practitioners,  lecture 2: Selecting the right applications...Deep Learning For Practitioners,  lecture 2: Selecting the right applications...
Deep Learning For Practitioners, lecture 2: Selecting the right applications...ananth
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingRishikese MR
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language ProcessingOverview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processingananth
 
L05 language model_part2
L05 language model_part2L05 language model_part2
L05 language model_part2ananth
 
Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1ananth
 
Deep Learning For Speech Recognition
Deep Learning For Speech RecognitionDeep Learning For Speech Recognition
Deep Learning For Speech Recognitionananth
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRUananth
 
Finalpresentation
FinalpresentationFinalpresentation
FinalpresentationAndrea Hill
 
Natural Language Processing
Natural Language Processing Natural Language Processing
Natural Language Processing Adarsh Saxena
 
Natural Language Processing glossary for Coders
Natural Language Processing glossary for CodersNatural Language Processing glossary for Coders
Natural Language Processing glossary for CodersAravind Mohanoor
 

Andere mochten auch (20)

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Practical Natural Language Processing
Practical Natural Language ProcessingPractical Natural Language Processing
Practical Natural Language Processing
 
Deep Learning Primer - a brief introduction
Deep Learning Primer - a brief introductionDeep Learning Primer - a brief introduction
Deep Learning Primer - a brief introduction
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
NLP
NLPNLP
NLP
 
Natural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlpNatural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlp
 
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
Deep Learning For Practitioners,  lecture 2: Selecting the right applications...Deep Learning For Practitioners,  lecture 2: Selecting the right applications...
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language ProcessingOverview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processing
 
L05 language model_part2
L05 language model_part2L05 language model_part2
L05 language model_part2
 
Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1
 
Deep Learning For Speech Recognition
Deep Learning For Speech RecognitionDeep Learning For Speech Recognition
Deep Learning For Speech Recognition
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
Nlp
NlpNlp
Nlp
 
Finalpresentation
FinalpresentationFinalpresentation
Finalpresentation
 
Natural Language Processing
Natural Language Processing Natural Language Processing
Natural Language Processing
 
Natural Language Processing glossary for Coders
Natural Language Processing glossary for CodersNatural Language Processing glossary for Coders
Natural Language Processing glossary for Coders
 
ADO.NET Introduction
ADO.NET IntroductionADO.NET Introduction
ADO.NET Introduction
 

Ähnlich wie Natural Language Processing: L01 introduction

Introduction to NLP.pptx
Introduction to NLP.pptxIntroduction to NLP.pptx
Introduction to NLP.pptxbuivantan_uneti
 
Natural Language Processing for development
Natural Language Processing for developmentNatural Language Processing for development
Natural Language Processing for developmentAravind Reddy
 
Natural Language Processing for development
Natural Language Processing for developmentNatural Language Processing for development
Natural Language Processing for developmentAravind Reddy
 
Natural language processing and search
Natural language processing and searchNatural language processing and search
Natural language processing and searchNathan McMinn
 
NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2NOVA DATASCIENCE
 
Text analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATEText analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATEDiana Maynard
 
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...hajinouha0
 
Nlp presentation
Nlp presentationNlp presentation
Nlp presentationSurya Sg
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)Kuppusamy P
 
Building NLP solutions for Davidson ML Group
Building NLP solutions for Davidson ML GroupBuilding NLP solutions for Davidson ML Group
Building NLP solutions for Davidson ML Groupbotsplash.com
 
Addis Ababa University.pptx
Addis Ababa University.pptxAddis Ababa University.pptx
Addis Ababa University.pptxBelay Alemayehu
 
Natural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptxNatural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptxSHIBDASDUTTA
 
NLP,expert,robotics.pptx
NLP,expert,robotics.pptxNLP,expert,robotics.pptx
NLP,expert,robotics.pptxAmanBadesra1
 
GATE: a text analysis tool for social media
GATE: a text analysis tool for social mediaGATE: a text analysis tool for social media
GATE: a text analysis tool for social mediaDiana Maynard
 
Open Creativity Scoring Tutorial
Open Creativity Scoring TutorialOpen Creativity Scoring Tutorial
Open Creativity Scoring TutorialDenisDumas2
 
introduction to natural language processing(NLP).ppt
introduction to natural language processing(NLP).pptintroduction to natural language processing(NLP).ppt
introduction to natural language processing(NLP).pptTemesgenTolcha2
 
Introduction to nlp
Introduction to nlpIntroduction to nlp
Introduction to nlpAmaan Shaikh
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...RajkiranVeluri
 
Beyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPBeyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPMENGSAYLOEM1
 
Technology that enhances classroom learning
Technology that enhances classroom learningTechnology that enhances classroom learning
Technology that enhances classroom learningCarrie Davenport
 

Ähnlich wie Natural Language Processing: L01 introduction (20)

Introduction to NLP.pptx
Introduction to NLP.pptxIntroduction to NLP.pptx
Introduction to NLP.pptx
 
Natural Language Processing for development
Natural Language Processing for developmentNatural Language Processing for development
Natural Language Processing for development
 
Natural Language Processing for development
Natural Language Processing for developmentNatural Language Processing for development
Natural Language Processing for development
 
Natural language processing and search
Natural language processing and searchNatural language processing and search
Natural language processing and search
 
NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2
 
Text analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATEText analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATE
 
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...
 
Nlp presentation
Nlp presentationNlp presentation
Nlp presentation
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)
 
Building NLP solutions for Davidson ML Group
Building NLP solutions for Davidson ML GroupBuilding NLP solutions for Davidson ML Group
Building NLP solutions for Davidson ML Group
 
Addis Ababa University.pptx
Addis Ababa University.pptxAddis Ababa University.pptx
Addis Ababa University.pptx
 
Natural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptxNatural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptx
 
NLP,expert,robotics.pptx
NLP,expert,robotics.pptxNLP,expert,robotics.pptx
NLP,expert,robotics.pptx
 
GATE: a text analysis tool for social media
GATE: a text analysis tool for social mediaGATE: a text analysis tool for social media
GATE: a text analysis tool for social media
 
Open Creativity Scoring Tutorial
Open Creativity Scoring TutorialOpen Creativity Scoring Tutorial
Open Creativity Scoring Tutorial
 
introduction to natural language processing(NLP).ppt
introduction to natural language processing(NLP).pptintroduction to natural language processing(NLP).ppt
introduction to natural language processing(NLP).ppt
 
Introduction to nlp
Introduction to nlpIntroduction to nlp
Introduction to nlp
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...
 
Beyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPBeyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLP
 
Technology that enhances classroom learning
Technology that enhances classroom learningTechnology that enhances classroom learning
Technology that enhances classroom learning
 

Mehr von ananth

Generative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variantsGenerative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variantsananth
 
Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architecturesananth
 
Foundations: Artificial Neural Networks
Foundations: Artificial Neural NetworksFoundations: Artificial Neural Networks
Foundations: Artificial Neural Networksananth
 
Overview of Convolutional Neural Networks
Overview of Convolutional Neural NetworksOverview of Convolutional Neural Networks
Overview of Convolutional Neural Networksananth
 
Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models ananth
 
An Overview of Naïve Bayes Classifier
An Overview of Naïve Bayes Classifier An Overview of Naïve Bayes Classifier
An Overview of Naïve Bayes Classifier ananth
 
Mathematical Background for Artificial Intelligence
Mathematical Background for Artificial IntelligenceMathematical Background for Artificial Intelligence
Mathematical Background for Artificial Intelligenceananth
 
Search problems in Artificial Intelligence
Search problems in Artificial IntelligenceSearch problems in Artificial Intelligence
Search problems in Artificial Intelligenceananth
 
Introduction to Artificial Intelligence
Introduction to Artificial IntelligenceIntroduction to Artificial Intelligence
Introduction to Artificial Intelligenceananth
 
Machine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision TreesMachine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision Treesananth
 
Machine Learning Lecture 2 Basics
Machine Learning Lecture 2 BasicsMachine Learning Lecture 2 Basics
Machine Learning Lecture 2 Basicsananth
 
Introduction To Applied Machine Learning
Introduction To Applied Machine LearningIntroduction To Applied Machine Learning
Introduction To Applied Machine Learningananth
 
MaxEnt (Loglinear) Models - Overview
MaxEnt (Loglinear) Models - OverviewMaxEnt (Loglinear) Models - Overview
MaxEnt (Loglinear) Models - Overviewananth
 
An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)ananth
 
L06 stemmer and edit distance
L06 stemmer and edit distanceL06 stemmer and edit distance
L06 stemmer and edit distanceananth
 
L05 word representation
L05 word representationL05 word representation
L05 word representationananth
 

Mehr von ananth (16)

Generative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variantsGenerative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variants
 
Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architectures
 

Natural Language Processing: L01 introduction

  • 1. Natural Language Processing Unit 1 – Introduction Anantharaman Narayana Iyer narayana dot anantharaman at gmail dot com 7th Aug 2015
  • 2. Topics • Motivation: Why NLP? • Course Outline • Grading Policy
  • 3. What are the opportunities for NLP?
  • 4. NLP is a hugely important topic for both industry and academia
  • 5. Trends that accelerate NLP research • Availability of web and social data • Mobile devices as a source of data • Need for natural-language-based I/O for new devices • ML techniques, e.g. deep learning • Increasing availability of datasets on the open web, e.g. Freebase, DBpedia
  • 6. Motivation • Google Search Engine • Intelligently responding to the query, e.g. "Where is India Gate?" • Predicting the next word for autocompletion • Ability to do spelling corrections • Segmenting words that may be joined without a space • Ranking the search results • Google Translate • Gmail • E.g. understand the contents of an e-mail through NLP and alert the user
  • 7. Speech/NLP • What technologies are involved here? - Continuous Speech Recognition - Keyword Spotting - Text to speech - Speech in Speech out systems - Speaker identification - Novel applications (to be explained on the board)
  • 8. Disambiguation • Consider the example below. • We would like to collect tweets on a subject (say, Rahul Gandhi) and analyse the sentiment • We can search Twitter through the Search API with the keywords: “Rahul Gandhi” • This might miss tweets that have only the term Rahul and not Gandhi. • If we instead search for the separate terms [“Rahul”, “Gandhi”], we may get results that match any Rahul (e.g. Rahul Dravid or KL Rahul) • We can do an intelligent tweet search using NLP techniques
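
The trade-off described on the Disambiguation slide can be sketched in a few lines of Python. This is a toy illustration, not the method used in the course: the tweets are made up, no Twitter API is called, and the context word list `POLITICS_CONTEXT` is an assumption chosen only to make the point.

```python
# Hypothetical sample tweets; no Twitter API call is made here.
TWEETS = [
    "Rahul Gandhi addressed a rally in Amethi today",
    "Rahul spoke in parliament about the land bill",
    "Rahul Dravid was a wall at the crease",
    "KL Rahul scores a century in the test match",
    "Reading about Gandhi and the freedom movement",
]

# Assumed context words hinting at the politician (illustrative only).
POLITICS_CONTEXT = {"rally", "parliament", "congress", "bill", "election", "amethi"}


def tokens(text):
    """Lowercase whitespace tokenization; enough for this toy example."""
    return text.lower().split()


def phrase_search(tweets, phrase):
    """Return tweets containing the exact phrase (case-insensitive)."""
    return [t for t in tweets if phrase.lower() in t.lower()]


def any_term_search(tweets, terms):
    """Return tweets containing at least one of the terms."""
    wanted = {term.lower() for term in terms}
    return [t for t in tweets if wanted & set(tokens(t))]


def context_search(tweets, name, context):
    """Keep tweets that mention the name and share a word with the context set."""
    return [
        t for t in tweets
        if name.lower() in tokens(t) and context & set(tokens(t))
    ]


phrase_hits = phrase_search(TWEETS, "Rahul Gandhi")       # precise, but misses tweet 2
any_hits = any_term_search(TWEETS, ["Rahul", "Gandhi"])   # matches every Rahul
context_hits = context_search(TWEETS, "Rahul", POLITICS_CONTEXT)

for label, hits in [("phrase", phrase_hits), ("any term", any_hits), ("context", context_hits)]:
    print(label, len(hits))
```

A hand-written context list is the crudest possible disambiguator; it only stands in for the NLP techniques the slide refers to, such as a trained classifier or named-entity disambiguation.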
  • 9. Summarization • The challenge we face is not the lack of information but the overload. • Summarization is a core technology that can help address information overload • Related Problems: • How to validate the quality and correctness of information? • Summarizing multimedia • How do we summarize social data, where: • Data may have less signal, more noise! • Data may be biased • Data may not be factual • Data may be repetitive • Can we autogenerate a (set of) Tweet(s) from a news article?
  • 10. Answer Evaluation • Answer evaluation is a core challenge for online education systems. • Wouldn’t it be nice if questions could be descriptive as well as objective? • Can there be an automated answer evaluation system that doesn’t require peer evaluation?
  • 11. Sentiment Analysis • Measuring the pulse of the people from social media • Can measure sentiment towards a brand, a product or an event. • A crowded space, but not a fully solved problem, due to the inherent challenges of Natural Language Processing • Can we build a sentiment analyser using RNNs and evaluate its performance?
  • 13. Dialog Systems • Dialog systems that can be deployed commercially? • Natural Language Processing • Natural language generation • Can we build an NLG library and make it open source?
  • 15. Course Structure • Foundational • Emerging • Applications
  • 16. Course Positioning • Classical NLP techniques (such as language models, MaxEnt classifiers, HMMs, CRFs etc.) have proven effective for problems like part-of-speech tagging, text classification and information retrieval. However, they are inadequate for problems that involve more semantics. • Modern approaches (such as deep learning) hold a lot of promise for problems involving semantics. They have also been shown to produce results better than or equal to classical techniques on typical NLP tasks. • Internationally acclaimed courses, like those offered by Dan Jurafsky, Christopher Manning and Michael Collins on Coursera and those offered at Stanford, are strong in the traditional topics and somewhat light on emerging topics. • The recent course by Socher at Stanford is heavy on recurrent-network-based approaches but assumes that the student is already reasonably familiar with traditional NLP. • Our course takes the best of both worlds and backs it up with intense hands-on work.
  • 17. Key Topics • Foundational • Words, sentences: Tokenization, regular expressions, challenges of ambiguity, edit distance, spelling corrections, string similarity, tf, tf-idf • Stemming, Lemmatization • Language models, smoothing, applications to speech, metrics • Tagging problems: Viterbi Algorithm (HMM), POS, NER tagging, SRL • Parsing: PCFG, CKY algorithm • Information Retrieval, Information Extraction, Word Sense Disambiguation, Summarization, Q&A Systems, Dialogue Systems • Natural Language Generation • Emerging Approaches: • Deep Learning and Vector Space approaches to: Word representation, Sentence and text compositionality, LM, Parsing, Q&A Systems • Applications: • Modern approaches to many exciting applications including speech
  • 18. Course Grading Policy • Unit Evaluations (3 out of 5): 30% • Lab sessions (2 out of 5): 10% • T1: 15% • Final Exam: 3 days, 6 to 8 hours per day of product development (will be run like a hackathon, with a 90-minute objective-type written test on day 1): 15% (for the test) + 25% (for hands-on work) • Attendance: 5%
  • 19. Challenges: Why is NLP hard? • The central challenge of Natural Language Processing is ambiguity, and it exists at every level or stage of NLP • Poets and writers thrive on ambiguity in language semantics, while most of us abhor ambiguity! • Can NLP understand poetry or, better still, generate it? That seems to be the ultimate! • Another challenge is representation: How to represent words? Sentences? Large text? How to model real-world knowledge?
  • 20. One prayer, 25 interpretations! (Ref: Raghuvamsa by Kalidasa) Vagarthaviva sampriktau vagarthah pratipattaye | Jagatah pitarau vande parvathiparameshwarau || – Raghuvamsha 1.1 • Common Meaning: I pray to the parents of the world, Lord Shiva and Mother Parvathi, who are as inseparable as speech and its meaning, to gain knowledge of speech and its meaning.
  • 21. Ambiguity – some examples • Homophones: Words with the same pronunciation but different meanings • Peace, piece: A spoken sentence like “The PM attended the peace summit” is ambiguous at the term “peace”, as a speech-to-text system might transcribe it as “piece” • Knew, new • Weak, week • Word boundary • It’s all ready, looking great! • It’s already looking great! • Syntactic Ambiguity: Arises due to different parse trees for the same input • Phrase boundary • Ananth created the presentation with video from web: ‘with video’ can attach to the verb, as in “Ananth created the presentation, ‘with video’”, or to the noun, as in “Ananth created the ‘presentation with video’” • Semantic level ambiguity: Many ways to interpret a sentence • John and Susan are married (to each other? Separately?) • Ram had a smooth sailing. • Prices have gone through the roof • India says it can’t accept the proposal
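
Word-boundary ambiguity also shows up in written text whenever spaces are missing. The slide's example is a spoken one; as a written analogue, the sketch below enumerates every way to split an unspaced string into known words. The string "nowhere" and the tiny vocabulary are assumptions picked for illustration, not material from the slides.

```python
def segmentations(text, vocabulary):
    """Return every way to split `text` into words drawn from `vocabulary`."""
    if not text:
        return [[]]  # one valid segmentation of the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in vocabulary:
            # Recursively segment the remainder and prepend the matched word.
            for rest in segmentations(text[end:], vocabulary):
                results.append([prefix] + rest)
    return results


# Hypothetical toy vocabulary.
VOCAB = {"no", "now", "here", "where", "nowhere"}

splits = segmentations("nowhere", VOCAB)
for words in splits:
    print(" ".join(words))
```

All three readings are valid word sequences, so choosing among them needs extra information, for example a language model that scores each candidate in its sentence context.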
  • 22. Representation: Text, Images, Audio, Video • What are the distinguishing characteristics of text data and what are the unique challenges? • Text is made of words, images of pixels, audio of sampled and digitized signals, video of image frames in motion • How do we represent a piece of text in the computer? • Let’s do a simple exercise: What thoughts and emotions cross your mind when you hear the following words? • Kalam • Brilliant • Pleasant • Destruction • Perfume • Code • Test • Run • Signal • Words can be used in different contexts, and the context is key to interpreting the meaning of a word