MaxEnt Model
Overview
Anantharaman Narayana Iyer
30 Jan 2015
MaxEnt Classifier
• MaxEnt is a powerful model that is equivalent to (multinomial) logistic regression
• Many NLP problems can be reformulated as classification problems
• E.g. Language Modelling, Tagging Problems
• MaxEnt is widely used for various text processing tasks.
• Task is to estimate the probability of a class given the context
• The term context may refer to a single word or group of words
• A large text corpus contains information on the co-occurrence of
classes and specific contexts
Problem: MaxEnt (refer to the paper by Ratnaparkhi)
• Let p(a, b) be the probability of class a occurring with context b
• Given the sparsity of words in b and also limited training data, it is not
possible to completely specify p(a, b)
• Given the sparse evidence about a’s and b’s our goal is to estimate
the probability model p(a, b)
MaxEnt principle
Representing Evidence
• One way to represent evidence is to encode useful facts as features
and to impose constraints on the values of those feature expectations
• A feature is a binary valued function (indicator function):
• f_j : ε → {0, 1}
• Given k features the constraints have the form:
• Expectation value of the model for the feature fj = Observed Expectation
value for the feature fj
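• $$\sum_{x \in \varepsilon} p(x)\, f_j(x) = \sum_{x \in \varepsilon} \tilde{p}(x)\, f_j(x)$$ where p is the model distribution and p̃ is the observed (empirical) distribution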
• The principle of maximum entropy requires:
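As a sketch of the usual statement of the principle (the symbols P, H and p* are notation assumed here, following the common formulation in the maximum entropy literature): among all distributions that satisfy the k feature constraints, choose the one with the highest entropy.

$$P = \Big\{\, p \;\Big|\; \sum_{x \in \varepsilon} p(x)\, f_j(x) = \sum_{x \in \varepsilon} \tilde{p}(x)\, f_j(x), \quad j = 1, \dots, k \Big\}$$

$$H(p) = -\sum_{x \in \varepsilon} p(x) \log p(x)$$

$$p^{*} = \arg\max_{p \in P} H(p)$$

Intuitively, p* agrees with the observed evidence on every feature and assumes nothing beyond it.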
Motivating Problems for Log-linear Models
• Language Model: Given the context (that is, words w1, w2, …, wi-1 ) predict the word wi
• Consider the examples:
• A natural number (i.e. 1, 2, 3, 4, 5, 6, etc.) is called a prime number (or a prime) if it has exactly two positive divisors, 1
and the number itself. Natural numbers greater than 1 that are not prime are called composite.
• Asked about the speculation that he may be inducted into the Cabinet, Parrikar said, “I can comment on it only after
meeting the Prime Minister. Let the Prime Minister who has invited me comment”
• Prime Minister Narendra Modi is likely to expand his Cabinet on Sunday, according to Times Now
• “The prime focus of this release of our product is to simplify the user interface”
• N-gram models
• Use the context of the previous (n-1) words to predict the nth word
• A trigram model uses the 2 previous words
• Sometimes the accuracy can be improved if other features of the input are taken into consideration, as opposed to
using only a very limited context
• The n-gram LM techniques are not flexible enough to include additional features, such as the total length of the sentence,
the presence of certain specific words, the identity of the author etc. Note: One might include extra features like the author’s
name and compute conditional probabilities, but such extensions to the conventional trigram approach quickly become
unwieldy
• Log-linear models can be used to include the additional features and improve the performance
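To make the n-gram baseline concrete, here is a minimal sketch of a maximum-likelihood trigram estimate, P(w3 | w1, w2) = count(w1, w2, w3) / count(w1, w2). The function names and the toy corpus are assumptions made for illustration, and no smoothing is applied.

```python
from collections import Counter


def train_trigram(tokens):
    """Count each trigram and the two-word context that precedes its last word."""
    trigram_counts = Counter()
    context_counts = Counter()
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        trigram_counts[(w1, w2, w3)] += 1
        context_counts[(w1, w2)] += 1
    return trigram_counts, context_counts


def trigram_prob(w1, w2, w3, trigram_counts, context_counts):
    """Unsmoothed estimate of P(w3 | w1, w2); returns 0.0 for an unseen context."""
    context = context_counts[(w1, w2)]
    if context == 0:
        return 0.0
    return trigram_counts[(w1, w2, w3)] / context


# Toy corpus (hypothetical) built around the word "prime"
corpus = "the prime minister said the prime focus is the prime minister".split()
tri, ctx = train_trigram(corpus)
p_minister = trigram_prob("the", "prime", "minister", tri, ctx)
p_focus = trigram_prob("the", "prime", "focus", tri, ctx)
print(p_minister, p_focus)
```

The only evidence this model can use is the pair of preceding words. Adding the author, sentence length or other cues would mean conditioning on ever larger contexts, which is the inflexibility the bullets above describe.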
The general problem
• We have an input domain X
• For example: A sequence of words
• There is a finite label set Y
• For example: The space of all possible words – that is the vocabulary
• Our goal is to determine P(y|x) for any x, y where x is in the input
space and y is in the space of labels
• For example: Given an input sentence (that is x, a sequence of words),
determine the next word in the sequence - that is P(wi | w1, …, wi-1)
Feature Vector
• A feature is a function fk(x, y) ∈ ℝ
• Often the features used in Log-linear models for typical NLP
applications are binary functions that are also called indicator
functions: fk(x, y) ∈ {0, 1}
• If we have m features then the feature vector is f(x, y) ∈ ℝ^m
• The number and choice of features are up to the system developer,
who designs them using intuition about the problem space being
addressed.
Features in Log-Linear Models
• Features are elementary pieces of evidence that link aspects
of what we observe, x, with a label y that we want to predict (Ref: C.
Manning)
• A feature is a function with a bounded real value:
f: X × Y → ℝ
• Example:
• Consider a sentence: “Gandhi was born on 2 October 1869 in Porbandar”
• f1(x, y) = [y = PERSON and wi is capitalized and wi+1 ∈ {“was”, “is”} and wi+2 is a VERB]
• f2(x, y) = [y = LOCATION and wi is capitalized and wi+1 ∈ {“was”, “is”} and wi+2 is a VERB]
• f3(x, y) = [y = DATE and wi is tagged CD and wi-1 = “on” and wi-2 is a VERB]
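The three indicator features above can be sketched in code as follows. This is an illustrative sketch only: the coarse tags in SENTENCE are hand-assigned, and the helper names (word, tag, feature_vector) are hypothetical. The input x is represented as the token list together with the position i of the word being labelled.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, coarse tag); tags are hand-assigned for illustration

SENTENCE: List[Token] = [
    ("Gandhi", "NOUN"), ("was", "VERB"), ("born", "VERB"), ("on", "ADP"),
    ("2", "CD"), ("October", "NOUN"), ("1869", "CD"), ("in", "ADP"),
    ("Porbandar", "NOUN"),
]


def word(tokens, i):
    """Word at position i, or "" when i falls outside the sentence."""
    return tokens[i][0] if 0 <= i < len(tokens) else ""


def tag(tokens, i):
    """Tag at position i, or "" when i falls outside the sentence."""
    return tokens[i][1] if 0 <= i < len(tokens) else ""


def _capitalized_before_copula_and_verb(tokens, i):
    # Context shared by f1 and f2: capitalized word, then "was"/"is", then a verb
    return (word(tokens, i)[:1].isupper()
            and word(tokens, i + 1) in ("was", "is")
            and tag(tokens, i + 2) == "VERB")


def f1(tokens, i, y):
    return int(y == "PERSON" and _capitalized_before_copula_and_verb(tokens, i))


def f2(tokens, i, y):
    return int(y == "LOCATION" and _capitalized_before_copula_and_verb(tokens, i))


def f3(tokens, i, y):
    return int(y == "DATE" and tag(tokens, i) == "CD"
               and word(tokens, i - 1) == "on" and tag(tokens, i - 2) == "VERB")


def feature_vector(tokens, i, y):
    """Stack the three indicator features into a vector for (x, y)."""
    return [f(tokens, i, y) for f in (f1, f2, f3)]


print(feature_vector(SENTENCE, 0, "PERSON"))    # [1, 0, 0]
print(feature_vector(SENTENCE, 0, "LOCATION"))  # [0, 1, 0]
print(feature_vector(SENTENCE, 4, "DATE"))      # [0, 0, 1]
```

Note that f1 and f2 fire on the same context for different labels. Which label wins for “Gandhi” depends on the weights the model assigns to each feature, not on the features themselves.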
Feature Vector Representations
• Consider the examples:
• A natural number (i.e. 1, 2, 3, 4, 5, 6, etc.) is called
a prime number (or a prime) if it has exactly two
positive divisors, 1 and the number itself.
Natural numbers greater than 1 that are
not prime are called composite.
• Asked about the speculation that he may be inducted
into the Cabinet, Parrikar said, “I can comment on it
only after meeting the Prime Minister. Let the Prime
Minister who has invited me comment”
• Prime Minister Narendra Modi is likely to expand his
Cabinet on Sunday, according to Times Now
• “The prime focus of this release of our product is to
simplify the user interface”
• Exercise:
• What are the possible features we may consider for
representing the Trigram LM problem?
• How do we extend this set of trigram features into a
more powerful set of features?
Parameter Vector
• Given the feature vector f(x, y) ∈ ℝ^m we can define the parameter
vector v ∈ ℝ^m
• Each (x, y) is mapped to a score which is the dot product of the
parameter vector and the feature vector:
$$v \cdot f(x, y) = \sum_{k=1}^{m} v_k\, f_k(x, y)$$
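As a small sketch of this score, the dot product can be computed directly from two equal-length lists. The weights and feature values below are made up for illustration.

```python
def score(v, f):
    """Dot product of the parameter vector v and the feature vector f(x, y)."""
    if len(v) != len(f):
        raise ValueError("parameter and feature vectors must have the same length m")
    return sum(vk * fk for vk, fk in zip(v, f))


v = [1.5, -0.5, 2.0]   # hypothetical parameter vector, m = 3
f_xy = [1, 0, 1]       # hypothetical binary feature vector f(x, y)
print(score(v, f_xy))  # 3.5
```

With binary features the score is simply the sum of the weights of the features that fire for the pair (x, y).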
Log-linear model - definition
• Let the input domain be X and the label space be Y
• Our goal is to determine P(y|x)
• A feature is a function: f: X × Y → ℝ
• We have m features that constitute a feature vector: f(x, y) ∈ ℝ^m
• We also have the parameter vector: v ∈ ℝ^m
• We define the log-linear model as:
$$p(y \mid x; v) = \frac{e^{\,v \cdot f(x, y)}}{\sum_{y' \in Y} e^{\,v \cdot f(x, y')}}$$
Reference: Coursera notes by Prof. Michael Collins
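The definition above can be sketched in a few lines: score every label with v · f(x, y), exponentiate, and normalize over the label set. The feature functions, weights and labels below are hypothetical, and subtracting the maximum score before exponentiating is a numerical-stability choice added here that does not change the resulting probabilities.

```python
import math


def log_linear_probs(x, labels, features, v):
    """Return p(y | x; v) for every y in labels under a log-linear model."""
    scores = {
        y: sum(vk * float(fk(x, y)) for vk, fk in zip(v, features))
        for y in labels
    }
    # Shift by the maximum score so exp() cannot overflow; the ratio is unchanged
    max_score = max(scores.values())
    exp_scores = {y: math.exp(s - max_score) for y, s in scores.items()}
    z = sum(exp_scores.values())  # the normalizer, summed over all y' in Y
    return {y: e / z for y, e in exp_scores.items()}


# Hypothetical language-model setup: x is the preceding context, y the next word
features = [
    lambda x, y: y == "minister" and x[-1] == "prime",
    lambda x, y: y == "number" and x[-1] == "prime",
    lambda x, y: y == "focus" and x[-2] == "the",
]
v = [2.0, 1.0, 0.0]  # made-up weights, one per feature
labels = ["minister", "number", "focus"]

probs = log_linear_probs(("the", "prime"), labels, features, v)
print(probs)
```

A feature with weight 0 has no effect on the score, so “focus” receives the same probability it would get if no feature fired for it at all.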

Weitere ähnliche Inhalte

Was ist angesagt?

Generative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variantsGenerative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variantsananth
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningSujit Pal
 
An Overview of Naïve Bayes Classifier
An Overview of Naïve Bayes Classifier An Overview of Naïve Bayes Classifier
An Overview of Naïve Bayes Classifier ananth
 
Machine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision TreesMachine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision Treesananth
 
Mathematical Background for Artificial Intelligence
Mathematical Background for Artificial IntelligenceMathematical Background for Artificial Intelligence
Mathematical Background for Artificial Intelligenceananth
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRUananth
 
Machine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptMachine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptbutest
 
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Deep Learning Enabled Question Answering System to Automate Corporate HelpdeskDeep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Deep Learning Enabled Question Answering System to Automate Corporate HelpdeskSaurabh Saxena
 
Session-Based Recommendations with Recurrent Neural Networks (Balazs Hidasi, ...
Session-Based Recommendations with Recurrent Neural Networks(Balazs Hidasi, ...Session-Based Recommendations with Recurrent Neural Networks(Balazs Hidasi, ...
Session-Based Recommendations with Recurrent Neural Networks (Balazs Hidasi, ...hyunsung lee
 
Recurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas MikolovRecurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas MikolovBhaskar Mitra
 
Talk@rmit 09112017
Talk@rmit 09112017Talk@rmit 09112017
Talk@rmit 09112017Shuai Zhang
 
Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Alexandros Karatzoglou
 
H transformer-1d paper review!!
H transformer-1d paper review!!H transformer-1d paper review!!
H transformer-1d paper review!!taeseon ryu
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsRoelof Pieters
 
Generating Natural-Language Text with Neural Networks
Generating Natural-Language Text with Neural NetworksGenerating Natural-Language Text with Neural Networks
Generating Natural-Language Text with Neural NetworksJonathan Mugan
 
Machine Learning in NLP
Machine Learning in NLPMachine Learning in NLP
Machine Learning in NLPVijay Ganti
 
Lecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language TechnologyLecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language TechnologyMarina Santini
 

Was ist angesagt? (20)

Generative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variantsGenerative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variants
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep Learning
 
An Overview of Naïve Bayes Classifier
An Overview of Naïve Bayes Classifier An Overview of Naïve Bayes Classifier
An Overview of Naïve Bayes Classifier
 
Machine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision TreesMachine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision Trees
 
Mathematical Background for Artificial Intelligence
Mathematical Background for Artificial IntelligenceMathematical Background for Artificial Intelligence
Mathematical Background for Artificial Intelligence
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
Machine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptMachine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.ppt
 
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Deep Learning Enabled Question Answering System to Automate Corporate HelpdeskDeep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk
 
NLP Bootcamp
NLP BootcampNLP Bootcamp
NLP Bootcamp
 
Session-Based Recommendations with Recurrent Neural Networks (Balazs Hidasi, ...
Session-Based Recommendations with Recurrent Neural Networks(Balazs Hidasi, ...Session-Based Recommendations with Recurrent Neural Networks(Balazs Hidasi, ...
Session-Based Recommendations with Recurrent Neural Networks (Balazs Hidasi, ...
 
Recurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas MikolovRecurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas Mikolov
 
Talk@rmit 09112017
Talk@rmit 09112017Talk@rmit 09112017
Talk@rmit 09112017
 
Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial
 
H transformer-1d paper review!!
H transformer-1d paper review!!H transformer-1d paper review!!
H transformer-1d paper review!!
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word Embeddings
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Test for AI model
Test for AI modelTest for AI model
Test for AI model
 
Generating Natural-Language Text with Neural Networks
Generating Natural-Language Text with Neural NetworksGenerating Natural-Language Text with Neural Networks
Generating Natural-Language Text with Neural Networks
 
Machine Learning in NLP
Machine Learning in NLPMachine Learning in NLP
Machine Learning in NLP
 
Lecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language TechnologyLecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language Technology
 

Andere mochten auch

Principle of Maximum Entropy
Principle of Maximum EntropyPrinciple of Maximum Entropy
Principle of Maximum EntropyJiawang Liu
 
Max Entropy
Max EntropyMax Entropy
Max Entropyjianingy
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language ProcessingOverview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processingananth
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vecananth
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningRahul Jain
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysisharit66
 
Stanford - Statistical Learning
Stanford - Statistical LearningStanford - Statistical Learning
Stanford - Statistical LearningRavi Sankar Varma
 
Hierarchichal species distributions model and Maxent
Hierarchichal species distributions model and MaxentHierarchichal species distributions model and Maxent
Hierarchichal species distributions model and Maxentrichardchandler
 
Inferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSOInferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSOtuxette
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment AnalysisAnkur Tyagi
 
Stanford Statistical Learning
Stanford Statistical LearningStanford Statistical Learning
Stanford Statistical LearningKurt Holst
 
Sentiment Analysis Using Machine Learning
Sentiment Analysis Using Machine LearningSentiment Analysis Using Machine Learning
Sentiment Analysis Using Machine LearningNihar Suryawanshi
 
L05 word representation
L05 word representationL05 word representation
L05 word representationananth
 
A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learningtuxette
 
Deep Learning Primer - a brief introduction
Deep Learning Primer - a brief introductionDeep Learning Primer - a brief introduction
Deep Learning Primer - a brief introductionananth
 
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...Decision and Policy Analysis Program
 
Introduction to Statistical Machine Learning
Introduction to Statistical Machine LearningIntroduction to Statistical Machine Learning
Introduction to Statistical Machine Learningmahutte
 
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...Toshiki Sakai
 

Andere mochten auch (20)

Principle of Maximum Entropy
Principle of Maximum EntropyPrinciple of Maximum Entropy
Principle of Maximum Entropy
 
Max Entropy
Max EntropyMax Entropy
Max Entropy
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language ProcessingOverview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processing
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vec
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
MaxEnt 2009 talk
MaxEnt 2009 talkMaxEnt 2009 talk
MaxEnt 2009 talk
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Stanford - Statistical Learning
Stanford - Statistical LearningStanford - Statistical Learning
Stanford - Statistical Learning
 
Hierarchichal species distributions model and Maxent
Hierarchichal species distributions model and MaxentHierarchichal species distributions model and Maxent
Hierarchichal species distributions model and Maxent
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
 
Inferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSOInferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSO
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Stanford Statistical Learning
Stanford Statistical LearningStanford Statistical Learning
Stanford Statistical Learning
 
Sentiment Analysis Using Machine Learning
Sentiment Analysis Using Machine LearningSentiment Analysis Using Machine Learning
Sentiment Analysis Using Machine Learning
 
L05 word representation
L05 word representationL05 word representation
L05 word representation
 
A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learning
 
Deep Learning Primer - a brief introduction
Deep Learning Primer - a brief introductionDeep Learning Primer - a brief introduction
Deep Learning Primer - a brief introduction
 
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...
 
Introduction to Statistical Machine Learning
Introduction to Statistical Machine LearningIntroduction to Statistical Machine Learning
Introduction to Statistical Machine Learning
 
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
 

Ähnlich wie MaxEnt Classifier Overview

Text Processing Framework for Hindi
Text Processing Framework for HindiText Processing Framework for Hindi
Text Processing Framework for HindiUtsav Chokshi
 
Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesValentina Paunovic
 
Scaling Quality on Quora Using Machine Learning
Scaling Quality on Quora Using Machine LearningScaling Quality on Quora Using Machine Learning
Scaling Quality on Quora Using Machine LearningVo Viet Anh
 
Introduction to Artificial Intelligence...pptx
Introduction to Artificial Intelligence...pptxIntroduction to Artificial Intelligence...pptx
Introduction to Artificial Intelligence...pptxMMCOE, Karvenagar, Pune
 
Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLPSatyam Saxena
 
Representation Learning of Text for NLP
Representation Learning of Text for NLPRepresentation Learning of Text for NLP
Representation Learning of Text for NLPAnuj Gupta
 
Discovering Real-World Usage for a Multimodal Math Search Interface
Discovering Real-World Usage for a Multimodal Math Search InterfaceDiscovering Real-World Usage for a Multimodal Math Search Interface
Discovering Real-World Usage for a Multimodal Math Search InterfaceKeita (Del Valle) Wangari
 
NLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPNLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPAnuj Gupta
 
ARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptxARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptxRuchitaMaaran
 
Reference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural NetworkReference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural NetworkSaurav Jha
 
Introduction to programming with python
Introduction to programming with pythonIntroduction to programming with python
Introduction to programming with pythonPorimol Chandro
 
Lec01-Algorithems - Introduction and Overview.pdf
Lec01-Algorithems - Introduction and Overview.pdfLec01-Algorithems - Introduction and Overview.pdf
Lec01-Algorithems - Introduction and Overview.pdfMAJDABDALLAH3
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMlucenerevolution
 
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...DrkhanchanaR
 
Module 3,4.pptx
Module 3,4.pptxModule 3,4.pptx
Module 3,4.pptxSandeepR95
 
Practical Natural language processing
Practical Natural language processing Practical Natural language processing
Practical Natural language processing Kim Ming Teh
 
Tokenization and how to use it from scratch
Tokenization and how to use it from scratchTokenization and how to use it from scratch
Tokenization and how to use it from scratchMahmoud Yasser
 

Ähnlich wie MaxEnt Classifier Overview (20)

Text Processing Framework for Hindi
Text Processing Framework for HindiText Processing Framework for Hindi
Text Processing Framework for Hindi
 
Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositories
 
Scaling Quality on Quora Using Machine Learning
Scaling Quality on Quora Using Machine LearningScaling Quality on Quora Using Machine Learning
Scaling Quality on Quora Using Machine Learning
 
NLP from scratch
NLP from scratch NLP from scratch
NLP from scratch
 
Introduction to Artificial Intelligence...pptx
Introduction to Artificial Intelligence...pptxIntroduction to Artificial Intelligence...pptx
Introduction to Artificial Intelligence...pptx
 
Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLP
 
Representation Learning of Text for NLP
Representation Learning of Text for NLPRepresentation Learning of Text for NLP
Representation Learning of Text for NLP
 
Discovering Real-World Usage for a Multimodal Math Search Interface
Discovering Real-World Usage for a Multimodal Math Search InterfaceDiscovering Real-World Usage for a Multimodal Math Search Interface
Discovering Real-World Usage for a Multimodal Math Search Interface
 
NLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPNLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLP
 
PL Lecture 03 - Types
PL Lecture 03 - TypesPL Lecture 03 - Types
PL Lecture 03 - Types
 
ARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptxARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptx
 
Reference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural NetworkReference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural Network
 
LLM.pdf
LLM.pdfLLM.pdf
LLM.pdf
 
Introduction to programming with python
Introduction to programming with pythonIntroduction to programming with python
Introduction to programming with python
 
Lec01-Algorithems - Introduction and Overview.pdf
Lec01-Algorithems - Introduction and Overview.pdfLec01-Algorithems - Introduction and Overview.pdf
Lec01-Algorithems - Introduction and Overview.pdf
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
 
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
 
Module 3,4.pptx
Module 3,4.pptxModule 3,4.pptx
Module 3,4.pptx
 
Practical Natural language processing
Practical Natural language processing Practical Natural language processing
Practical Natural language processing
 
Tokenization and how to use it from scratch
Tokenization and how to use it from scratchTokenization and how to use it from scratch
Tokenization and how to use it from scratch
 

Mehr von ananth

Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architecturesananth
 
Overview of Convolutional Neural Networks
Overview of Convolutional Neural NetworksOverview of Convolutional Neural Networks
Overview of Convolutional Neural Networksananth
 
Search problems in Artificial Intelligence
Search problems in Artificial IntelligenceSearch problems in Artificial Intelligence
Search problems in Artificial Intelligenceananth
 
Introduction to Artificial Intelligence
Introduction to Artificial IntelligenceIntroduction to Artificial Intelligence
Introduction to Artificial Intelligenceananth
 
Deep Learning For Speech Recognition
Deep Learning For Speech RecognitionDeep Learning For Speech Recognition
Deep Learning For Speech Recognitionananth
 
Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1ananth
 
An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)ananth
 
Natural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlpNatural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlpananth
 
Natural Language Processing: L02 words
Natural Language Processing: L02 wordsNatural Language Processing: L02 words
Natural Language Processing: L02 wordsananth
 
Natural Language Processing: L01 introduction
Natural Language Processing: L01 introductionNatural Language Processing: L01 introduction
Natural Language Processing: L01 introductionananth
 

Mehr von ananth (10)

Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architectures
 
Overview of Convolutional Neural Networks
Overview of Convolutional Neural NetworksOverview of Convolutional Neural Networks
Overview of Convolutional Neural Networks
 
Search problems in Artificial Intelligence
Search problems in Artificial IntelligenceSearch problems in Artificial Intelligence
Search problems in Artificial Intelligence
 
Introduction to Artificial Intelligence
Introduction to Artificial IntelligenceIntroduction to Artificial Intelligence
Introduction to Artificial Intelligence
 
Deep Learning For Speech Recognition
Deep Learning For Speech RecognitionDeep Learning For Speech Recognition
Deep Learning For Speech Recognition
 
Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1
 
An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)
 
Natural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlpNatural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlp
 
Natural Language Processing: L02 words
Natural Language Processing: L02 wordsNatural Language Processing: L02 words
Natural Language Processing: L02 words
 
Natural Language Processing: L01 introduction
Natural Language Processing: L01 introductionNatural Language Processing: L01 introduction
Natural Language Processing: L01 introduction
 

MaxEnt Classifier Overview

  • 2. MaxEnt Classifier
      • This is a powerful model that is equivalent to logistic regression
      • Many NLP problems can be reformulated as classification problems
      • E.g. language modelling, tagging problems
      • MaxEnt is widely used for various text processing tasks
      • The task is to estimate the probability of a class given the context
      • The term context may refer to a single word or a group of words
      • A large text corpus contains information on the co-occurrence of classes and specific contexts
  • 3. Problem: MaxEnt (refer to the paper by Ratnaparkhi)
      • Let p(a, b) be the probability of class a occurring with context b
      • Given the sparsity of words in b and the limited training data, it is not possible to completely specify p(a, b)
      • Given the sparse evidence about the a’s and b’s, our goal is to estimate the probability model p(a, b)
  • 5. Representing Evidence
      • One way to represent evidence is to encode useful facts as features and to impose constraints on the values of those feature expectations
      • A feature is a binary-valued function (indicator function):
      • $f_i : \varepsilon \to \{0, 1\}$
      • Given k features, the constraints have the form:
      • Expected value of feature $f_j$ under the model = observed expected value of feature $f_j$
      • $\sum_{x \in \varepsilon} p(x) f_j(x) = \sum_{x \in \varepsilon} \tilde{p}(x) f_j(x)$ for $j = 1, \dots, k$, where $\tilde{p}$ is the observed (empirical) distribution
      • The principle of maximum entropy requires:
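In Ratnaparkhi's formulation, which these slides refer to, the maximum entropy requirement is usually stated as below. The symbols $P$ (the set of distributions that satisfy the feature constraints), $H$ (entropy), and $p^*$ (the selected model) are notation assumed here; $\tilde{p}$ denotes the observed (empirical) distribution.

```latex
P = \{\, p \mid E_p f_j = E_{\tilde{p}} f_j,\quad j = 1, \dots, k \,\}

H(p) = -\sum_{x \in \varepsilon} p(x) \log p(x)

p^* = \arg\max_{p \in P} H(p)
```

In words: among all distributions whose feature expectations match the observed ones, choose the one with the highest entropy, that is, the one that assumes nothing beyond the stated evidence.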
  • 6. Motivating Problems for Log-linear Models
      • Language model: given the context (that is, words w1, w2, …, wi-1), predict the word wi
      • Consider the examples:
      • A natural number (i.e. 1, 2, 3, 4, 5, 6, etc.) is called a prime number (or a prime) if it has exactly two positive divisors, 1 and the number itself. Natural numbers greater than 1 that are not prime are called composite.
      • Asked about the speculation that he may be inducted into the Cabinet, Parrikar said, “I can comment on it only after meeting the Prime Minister. Let the Prime Minister who has invited me comment”
      • Prime Minister Narendra Modi is likely to expand his Cabinet on Sunday, according to Times Now
      • “The prime focus of this release of our product is to simplify the user interface”
      • N-gram models
      • Use the context of the previous (n-1) words to predict the nth word
      • A trigram model uses the 2 previous words
      • Sometimes the accuracy can be improved if other features of the input are taken into consideration, as opposed to using only a very limited context
      • N-gram LM techniques are not flexible enough to include additional features, such as the total length of the sentence, the presence of certain specific words, the identity of the author, etc. Note: one might include extra features such as the author’s name and compute conditional probabilities, but such extensions to the conventional trigram approach quickly become unwieldy
      • Log-linear models can be used to include the additional features and improve the performance
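The trigram approach described above can be sketched as a count-based (maximum likelihood) estimate. This is a minimal illustration, not the method used in the slides: the function names `train_trigram` and `trigram_prob` and the toy token sequence are assumptions, and no smoothing is applied.

```python
from collections import Counter


def train_trigram(tokens):
    """Count each trigram and the two-word context that precedes its last word."""
    trigram_counts = Counter()
    context_counts = Counter()
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        trigram_counts[(w1, w2, w3)] += 1
        context_counts[(w1, w2)] += 1
    return trigram_counts, context_counts


def trigram_prob(w3, w1, w2, trigram_counts, context_counts):
    """Unsmoothed estimate of P(w3 | w1, w2) = count(w1, w2, w3) / count(w1, w2)."""
    context = context_counts[(w1, w2)]
    if context == 0:
        return 0.0  # unseen context: the unsmoothed estimate has no evidence
    return trigram_counts[(w1, w2, w3)] / context


# Toy corpus (hypothetical), echoing the "prime" examples on the slide.
tokens = "the prime minister met the prime minister of the prime focus".split()
tri, ctx = train_trigram(tokens)
p_minister = trigram_prob("minister", "the", "prime", tri, ctx)
p_focus = trigram_prob("focus", "the", "prime", tri, ctx)
```

The only evidence this estimator can use is the two preceding words. Adding sentence length or author identity would mean conditioning on ever larger tuples, which is the unwieldiness the slide mentions.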
  • 7. The general problem
      • We have an input domain X
      • For example: a sequence of words
      • There is a finite label set Y
      • For example: the space of all possible words, that is, the vocabulary
      • Our goal is to determine P(y|x) for any x, y where x is in the input space and y is in the space of labels
      • For example: given an input sentence (that is x, a sequence of words), determine the next word in the sequence, that is P(wi | w1, …, wi-1)
  • 8. Feature Vector
      • A feature is a function $f_k(x, y) \in \mathbb{R}$
      • Often the features used in log-linear models for typical NLP applications are binary functions, also called indicator functions: $f_k(x, y) \in \{0, 1\}$
      • If we have m features, then the feature vector is $f(x, y) \in \mathbb{R}^m$
      • The number and choice of features for a given input are arbitrary. The system developer can design them using intuition about the problem space being addressed.
  • 9. Features in Log-Linear Models
      • Features are elementary pieces of evidence that link aspects of what we observe, x, with a label y that we want to predict (Ref: C. Manning)
      • A feature is a function with a bounded real value: $f : X \times Y \to \mathbb{R}$
      • Example:
      • Consider a sentence: “Gandhi was born on 2 October 1869 in Porbandar”
      • f1(x, y) = [y = PERSON and wi is capitalized and wi+1 = (“was” | “is”) and wi+2 is tagged VERB]
      • f2(x, y) = [y = LOCATION and wi is capitalized and wi+1 = (“was” | “is”) and wi+2 is tagged VERB]
      • f3(x, y) = [y = DATE and wi is tagged CD and wi-1 = “on” and wi-2 is tagged VERB]
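The three indicator features above can be sketched in Python as follows. This assumes that the input x is a triple of tokens, part-of-speech tags, and the position i being labelled; that representation, the hand-assigned tags, and the helper names are illustrative assumptions, not part of the slides.

```python
def is_capitalized(word):
    """True when the word starts with an upper-case letter."""
    return bool(word) and word[0].isupper()


def followed_by_copula_and_verb(tokens, tags, i):
    """True when w(i+1) is "was" or "is" and w(i+2) is tagged VERB."""
    return (
        i + 2 < len(tokens)
        and tokens[i + 1] in ("was", "is")
        and tags[i + 2] == "VERB"
    )


def f1(x, y):
    tokens, tags, i = x
    return int(y == "PERSON" and is_capitalized(tokens[i])
               and followed_by_copula_and_verb(tokens, tags, i))


def f2(x, y):
    tokens, tags, i = x
    return int(y == "LOCATION" and is_capitalized(tokens[i])
               and followed_by_copula_and_verb(tokens, tags, i))


def f3(x, y):
    tokens, tags, i = x
    return int(y == "DATE" and tags[i] == "CD" and i >= 2
               and tokens[i - 1] == "on" and tags[i - 2] == "VERB")


def feature_vector(x, y):
    """Stack the indicator features into a vector f(x, y)."""
    return [f1(x, y), f2(x, y), f3(x, y)]


tokens = "Gandhi was born on 2 October 1869 in Porbandar".split()
# Hand-assigned tags for illustration only (not the output of a real tagger).
tags = ["NNP", "VERB", "VERB", "IN", "CD", "NNP", "CD", "IN", "NNP"]

person_at_0 = feature_vector((tokens, tags, 0), "PERSON")
location_at_0 = feature_vector((tokens, tags, 0), "LOCATION")
date_at_4 = feature_vector((tokens, tags, 4), "DATE")
```

Note that f1 and f2 fire on exactly the same context and differ only in the label y. Which label wins for "Gandhi" is decided by the weights attached to each feature, not by the features themselves.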
  • 10. Feature Vector Representations
      • Consider the examples:
      • A natural number (i.e. 1, 2, 3, 4, 5, 6, etc.) is called a prime number (or a prime) if it has exactly two positive divisors, 1 and the number itself. Natural numbers greater than 1 that are not prime are called composite.
      • Asked about the speculation that he may be inducted into the Cabinet, Parrikar said, “I can comment on it only after meeting the Prime Minister. Let the Prime Minister who has invited me comment”
      • Prime Minister Narendra Modi is likely to expand his Cabinet on Sunday, according to Times Now
      • “The prime focus of this release of our product is to simplify the user interface”
      • Exercise:
      • What are the possible features we may consider for representing the trigram LM problem?
      • How do we extend this set of trigram features into a more powerful set of features?
  • 11. Parameter Vector
      • Given the feature vector $f(x, y) \in \mathbb{R}^m$, we can define the parameter vector $v \in \mathbb{R}^m$
      • Each (x, y) is mapped to a score, which is the dot product of the parameter vector and the feature vector: $v \cdot f(x, y) = \sum_{k=1}^{m} v_k f_k(x, y)$
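As a small worked instance of the score, the sketch below computes the dot product for one (x, y) pair. The weight values and the feature vector are made up for illustration, and `score` is a hypothetical helper name.

```python
def score(v, f):
    """Dot product of parameter vector v and feature vector f(x, y)."""
    if len(v) != len(f):
        raise ValueError("parameter and feature vectors must have the same length m")
    return sum(v_k * f_k for v_k, f_k in zip(v, f))


v = [1.5, -0.5, 2.0]   # illustrative weights, one per feature
f_xy = [1, 0, 1]       # indicator features that fired for some (x, y)
s = score(v, f_xy)     # 1.5 * 1 + (-0.5) * 0 + 2.0 * 1
```

With indicator features, the score is simply the sum of the weights of the features that fire.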
  • 12. Log-linear model - definition
      • Let X be the input domain and Y the label space
      • Our goal is to determine P(y|x)
      • A feature is a function: $f : X \times Y \to \mathbb{R}$
      • We have m features that constitute a feature vector: $f(x, y) \in \mathbb{R}^m$
      • We also have the parameter vector: $v \in \mathbb{R}^m$
      • We define the log-linear model as: $p(y \mid x; v) = \dfrac{e^{v \cdot f(x, y)}}{\sum_{y' \in Y} e^{v \cdot f(x, y')}}$
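The definition above can be sketched directly in Python. The toy feature function, label set, and weights below are assumptions made for illustration. Subtracting the largest score before exponentiating is a common numerical-stability step that is not in the slides and leaves the resulting probabilities unchanged.

```python
import math


def log_linear_probs(x, labels, v, feature_vector):
    """Return p(y | x; v) for every y in labels under a log-linear model."""
    scores = [
        sum(v_k * f_k for v_k, f_k in zip(v, feature_vector(x, y)))
        for y in labels
    ]
    max_score = max(scores)  # shift scores so that exp() cannot overflow
    exps = [math.exp(s - max_score) for s in scores]
    z = sum(exps)            # normalizer: the sum over all y' in Y
    return {y: e / z for y, e in zip(labels, exps)}


def toy_features(x, y):
    """Hypothetical 3-dimensional indicator feature vector for a single word x."""
    capitalized = x[:1].isupper()
    return [
        int(y == "PERSON" and capitalized),
        int(y == "LOCATION" and capitalized),
        int(y == "OTHER"),
    ]


labels = ["PERSON", "LOCATION", "OTHER"]
v = [2.0, 1.0, 0.5]  # illustrative weights, not trained values
probs = log_linear_probs("Gandhi", labels, v, toy_features)
```

Because every label shares the same denominator, the probabilities sum to 1 and the label with the highest score $v \cdot f(x, y)$ receives the highest probability.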
  • 13. Reference: Coursera notes by Prof. Michael Collins