Under the guidance of
Dr. Manish Shrivastava
By:
Bhuvnesh Pratap Singh (2006017)
Surya Prakash Rai (2006064)
Contents in the Store
What is Dependency Parsing?
Problem Definition
Motivation
Objective
Conceptual Tour
Exploring Algorithms
Methodology
Results
Scope for further work
Q & A
What is Parsing?
Parsing is the process of deducing the syntactic structure of a
string. It is a prerequisite for many natural language processing
tasks and is used in applications such as Information Extraction
and Machine Translation.
Dependency Parsing?
Dependency parsing is a way of parsing in which a sentence is
parsed by relating each word to the other words in the sentence
that depend on it.
In this terminology, a dependency relation holds between a Head and a
Dependent.
Alternative terms in the literature are Governor and Regent for head,
and Modifier for dependent.
Problem Definition
To date, four major dependency parsing algorithms have been
proposed, viz. Covington projective, Covington non-projective, Nivre
arc-eager, and Nivre arc-standard. The problem to be worked on
is the comparison of these different dependency parsing algorithms
in terms of accuracy and time complexity.
Motivation
In some machine translation and natural language processing systems,
human languages are parsed by computer programs. Human
sentences are not easily parsed by programs, as there is substantial
ambiguity in the structure of human language.
It is difficult to prepare formal rules to describe informal behaviour, even
though it is clear that some rules are being followed.
Moreover, the task of dependency parsing becomes highly
important in the presence of the free word order
languages around us, for example our own native language,
Hindi.
Objective
First, understand the present major dependency parsing algorithms, and
then implement them on a common platform so that an exhaustive
comparison can be drawn between these algorithms on the basis of
the time taken during the learning and testing phases, and the
accuracy shown over the testing/validation data. The data used
for this purpose is English data in the CoNLL format.
Conceptual Tour
Major concepts:
Dependency Grammar
Dependency Tree
The Notion of Dependency
Types of Dependencies
Types of Dependency parsing
Dependency Grammar
The tradition of dependency grammar is based on the assumption
that syntactic structure consists of lexical elements linked by binary
asymmetrical relations called Dependencies.
Dependency Tree
The Notion of Dependency
The fundamental notion of dependency is based on the idea that the
syntactic structure of a sentence consists of binary asymmetrical
relations between the words of the sentence.
Tesnière (1959) said:
The sentence is an organized whole, the constituent elements of
which are words. Every word that belongs to a sentence ceases by
itself to be isolated as in the dictionary. Between the word and its
neighbors, the mind perceives connections, the totality of which
forms the structure of the sentence. The structural connections
establish dependency relations between the words. Each
connection in principle unites a superior term and an inferior term.
Criteria for identifying a syntactic relation
between a head H and a dependent D in a
construction C
H determines the syntactic category of C and can often replace C
H determines the semantic category of C; D gives semantic
representation
H is obligatory; D may be optional
H selects D and determines whether D is obligatory or optional
The form of D depends on H
Types of Dependencies
Endocentric Constructions
Exocentric Constructions
Endocentric constructions
In an endocentric construction the Head can replace the whole
without disrupting the syntactic structure.
“Economic news had little effect on [financial] markets”
Exocentric Constructions
In an exocentric construction it is not possible for the Head to replace
the whole without disrupting the syntactic structure.
“Economic news had little [effect] on financial markets”
Types of Dependency Parsing
Grammar-driven dependency parsing
Data-driven dependency parsing
Data-driven dependency parsing
The methodology is based on three essential components:
1. Deterministic parsing algorithms for building dependency graphs
2. History-based feature models for predicting the next parser action
3. Discriminative machine learning to map histories to parser actions
Architecture for Data Driven Dependency
Parsing
The architecture consists of three main components:
Parser
Guide
Learner
Feature Representation
Exploring the different Dependency
Parsing Algorithms
Assumptions for Parsing
Unity
-Single tree with unique root
Uniqueness
-each word has only one head
One word at a time
Single left to right pass
Projectivity
-No crossing branches
Projective
Non Projective
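For illustration, the projectivity assumption ("no crossing branches") can be checked on a head map. The helper below and its index convention are our own, not taken from any parser package:

```python
def is_projective(heads):
    """Return True if no two dependency arcs cross.

    `heads` maps each word position (1-based) to its head position,
    with 0 standing for the artificial root. Hypothetical helper,
    written only to illustrate the no-crossing-branches condition.
    """
    arcs = [(min(h, d), max(h, d)) for d, h in heads.items()]
    for (l1, r1) in arcs:
        for (l2, r2) in arcs:
            # Two arcs cross when exactly one endpoint of the second
            # lies strictly inside the span of the first.
            if l1 < l2 < r1 < r2:
                return False
    return True

print(is_projective({1: 2, 2: 0, 3: 2}))        # no crossing arcs: True
print(is_projective({1: 3, 2: 4, 3: 0, 4: 3}))  # arcs (1,3), (2,4) cross: False
```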
Description of Algorithms Used
Parsing Algorithm
-Covington (projective, non-projective)
-Nivre (arc-eager, arc-standard)
Learning Algorithm
-SVM (LIBSVM)
Covington Algorithm
There are basically two parsing strategies:
1. Brute-force search:
Examine each pair of words in the entire sentence, linking them as
head-to-dependent or dependent-to-head if the grammar permits.
For n words there are n(n−1) pairs.
If backtracking is allowed, the complexity increases further.
2. Exhaustive left-to-right search:
Accept words one by one, starting at the beginning of the sentence, and
try linking each word as head or dependent of every previous word.
Non Projective Covington Algorithm
ESH Algorithm:
Given an n-word sentence:
[1] for i := 1 to n do
[2] begin
[3] for j := i − 1 down to 1 do
[4] begin
[5] If the grammar permits,
link word j as head of word i;
[6] If the grammar permits,
link word j as dependent of word i
[7] end
[8] end
ESD Algorithm:
Given an n-word sentence:
[1] for i := 1 to n do
[2] begin
[3] for j := i − 1 down to 1 do
[4] begin
[5] If the grammar permits,
link word j as dependent of word i
[6] If the grammar permits,
link word j as head of word i;
[7] end
[8] end
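A minimal Python sketch of the ESH loop, with the grammar check replaced by a hypothetical `permits(head, dep)` predicate of our own (not part of any real library):

```python
def esh(words, permits):
    """Exhaustive left-to-right search, head-first (ESH sketch).

    `permits(head, dep)` is an assumed grammar predicate. Returns
    the (head, dependent) links made, as 0-based index pairs.
    """
    links = []
    for i in range(len(words)):          # accept words one at a time
        for j in range(i - 1, -1, -1):   # scan earlier words, right to left
            if permits(words[j], words[i]):
                links.append((j, i))     # word j heads word i
            if permits(words[i], words[j]):
                links.append((i, j))     # word i heads word j
    return links

# Toy grammar: only the verb "chased" may head another word.
permits = lambda head, dep: head == "chased"
print(esh(["dogs", "chased", "cats"], permits))  # [(1, 0), (1, 2)]
```

ESD is identical except that the two grammar checks inside the inner loop are performed in the opposite order.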
These algorithms are inefficient: they can violate unity, uniqueness,
and projectivity. Using a specific principle to enforce uniqueness
gives three variations:
Algorithm ESHU
[1] for i := 1 to n do /*given n word sentence*/
[2] begin
[3] for j := i − 1 down to 1 do
[4] begin
[5] If no word has been
linked as head of word i, then
[6] if the grammar permits,
link word j as head of word i;
[7] If word j is not a dependent
of some other word, then
[8] if the grammar permits,
link word j as dependent of word i
[9] end
[10] end
Algorithm ESDU
[1] for i := 1 to n do/* Given an n-word sentence*/
[2] begin
[3] for j := i − 1 down to 1 do
[4] begin
[5] If word j is not a dependent
of some other word, then
[6] if the grammar permits,
link word j as dependent of word i
[7] If no word has been
linked as head of word i, then
[8] if the grammar permits,
link word j as head of word i;
[9] end
[10] end
Algorithm LSU
Headlist := [] /*List of words that have no head yet*/
Wordlist := [] /*All words encountered so far*/
while (!end-of-sentence)
W := next input word;
for each D in Headlist
if HEAD?(W,D)
LINK(W,D);
delete D from Headlist;
end
for each H in Wordlist
if HEAD?(H,W)
LINK(H,W);
terminate this for each loop;
end
if no head for W was found then
Headlist := W + Headlist;
end
Wordlist := W + Wordlist;
end
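The LSU loop above can be sketched in Python; `head_of(h, w)` stands in for the HEAD? grammar test and is an assumption for illustration, not a real API:

```python
def lsu(words, head_of):
    """LSU sketch: one left-to-right pass over the sentence.

    `head_of(h, w)` is an assumed grammar predicate ("is word h the
    head of word w?"); indices are 0-based. Returns the list of
    (head, dependent) links made.
    """
    headlist = []   # indices of words that still lack a head
    wordlist = []   # all indices seen so far, most recent first
    links = []
    for w in range(len(words)):
        # W may take any word currently on Headlist as its dependent
        for d in list(headlist):
            if head_of(w, d):
                links.append((w, d))
                headlist.remove(d)
        # look for a head of W among the previous words
        for h in wordlist:
            if head_of(h, w):
                links.append((h, w))
                break
        else:                       # no head found for W yet
            headlist.insert(0, w)
        wordlist.insert(0, w)
    return links

head_of = lambda h, w: (h, w) in {(1, 0), (1, 2)}   # toy grammar
print(lsu(["dogs", "chased", "cats"], head_of))     # [(1, 0), (1, 2)]
```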
Projective Covington Algorithm
/*we have two list head and word*/
Headlist := []; (Words that do not yet have heads)
Wordlist := []; (All words encountered so far)
repeat
(Accept a word and add it to Wordlist)
W := the next word to be parsed;
Wordlist := W + Wordlist;
(Look for dependents of W; they can only be
consecutive elements of Headlist
starting with the most recently added)
Contd…
for D := each element of Headlist,
starting with the first
begin
if D can depend on W then
begin
link D as dependent of W;
delete D from Headlist
end
else
terminate this for loop
end;
(Look for the head of W; the search starts
from the word immediately preceding W
and climbs its chain of heads)
Contd…
H := the word immediately preceding W
in the input string;
loop
if W can depend on H then
begin
link W as dependent of H;
terminate the loop
end;
if H is independent then terminate the loop;
H := the head of H
end loop;
if no head for W was found then
Headlist := W + Headlist;
until all words have been parsed.
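Putting the pseudocode above together, a hedged Python sketch of the projective Covington parser; `can_depend(d, h)` is our own assumed grammar predicate:

```python
def covington_projective(words, can_depend):
    """Projective Covington sketch, following the pseudocode above.

    `can_depend(d, h)` is an assumed grammar predicate ("may word d
    depend on word h?"); indices are 0-based. Returns a map from
    dependent index to head index.
    """
    headlist = []                 # headless words, most recent first
    head = {}                     # dependent index -> head index
    for w in range(len(words)):
        # dependents of W can only be a prefix of Headlist
        while headlist and can_depend(headlist[0], w):
            head[headlist.pop(0)] = w
        # head of W: climb the head chain from the preceding word
        h = w - 1
        while h >= 0:
            if h == w:            # climbed back to W itself; stop
                break
            if can_depend(w, h):
                head[w] = h
                break
            if h not in head:     # h is independent; stop climbing
                break
            h = head[h]
        if w not in head:
            headlist.insert(0, w)
    return head

can_depend = lambda d, h: h == 1   # toy grammar: the verb heads all
print(covington_projective(["dogs", "chased", "cats"], can_depend))
# {0: 1, 2: 1}
```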
Nivre’s Algorithm
Configuration: C = 〈S, I, A〉
S = Stack
I = Input (remaining)
A = Arc relation (current)
Initialization:
〈nil, W, ∅〉
Termination:
〈S, nil, A〉 for any S, A
Acceptance:
〈S, nil, A〉 if (W, A) is connected
Transitions
• Left-Arc (LA):
〈wi|S, wj|I, A〉 → 〈S, wj|I, A ∪ {(wj, wi)}〉
if ¬∃a : a ∈ A ∧ dep(a) = wi
• Right-Arc (RA):
〈wi|S, wj|I, A〉 → 〈wj|wi|S, I, A ∪ {(wi, wj)}〉
if ¬∃a : a ∈ A ∧ dep(a) = wj
• Reduce (RE):
〈wi|S, I, A〉 → 〈S, I, A〉
if ∃a : a ∈ A ∧ dep(a) = wi
• Shift (SH):
〈S, wi|I, A〉 → 〈wi|S, I, A〉
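The four transitions can be sketched as follows. The `oracle` callback is an assumption standing in for the learned guide (in MaltParser this choice is what the SVM predicts); the scripted action sequence below is a toy example:

```python
def arc_eager(words, oracle):
    """Nivre arc-eager sketch driven by an assumed `oracle`.

    `oracle(stack, buffer, arcs)` must return "LA", "RA", "RE", or
    "SH". Arcs are (head, dependent) pairs over 0-based indices.
    """
    stack, buffer, arcs = [], list(range(len(words))), set()
    has_head = set()
    while buffer:
        action = oracle(stack, buffer, arcs)
        if action == "LA":        # next input word heads the stack top
            assert stack and stack[-1] not in has_head
            arcs.add((buffer[0], stack[-1]))
            has_head.add(stack.pop())
        elif action == "RA":      # stack top heads the next input word
            assert stack and buffer[0] not in has_head
            arcs.add((stack[-1], buffer[0]))
            has_head.add(buffer[0])
            stack.append(buffer.pop(0))
        elif action == "RE":      # pop a word that already has a head
            assert stack and stack[-1] in has_head
            stack.pop()
        else:                     # SH: shift the next word onto the stack
            stack.append(buffer.pop(0))
    return arcs

# Scripted transitions for "dogs chased cats" (chased heads both nouns).
script = iter(["SH", "LA", "SH", "RA"])
print(sorted(arc_eager(["dogs", "chased", "cats"],
                       lambda s, b, a: next(script))))  # [(1, 0), (1, 2)]
```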
An illustrative Example
Methodology
As a group we first studied the art of dependency parsing; the
next task was to explore the various algorithms, for which we
made our time and accuracy analysis. After going through these
steps we created the LIBSVM learning models (LIBSVM is a
machine learning package for support vector machines with different
kernels) for each of these algorithms, with learning data from a
Treebank given as input for the learning phase on the MaltParser
platform.
Results
Algorithms Covered
The Input data Treebank
Learning data: WSJ_Train.conll
Testing data: WSJ_Test.conll
No. of tokens in WSJ_Train.conll: 81651
No. of tokens in WSJ_Test.conll: 16320
Learning Time Trade Offs
Parsing Time Trade offs
Accuracy trade off
We have described the accuracy trade-off in terms of Precision,
where
Precision = (No. of correct parses) / (No. of gold standard parses)
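With made-up counts (not the project's actual results, which appear in the tables), the precision formula above works out as:

```python
# Precision = correct parses / gold-standard parses.
# The counts below are hypothetical, purely to illustrate the formula.
correct = 14500
gold_standard = 16320            # e.g. one decision per test token
precision = correct / gold_standard
print(round(precision, 4))       # 0.8885
```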
The original Testing data in Conll format
The output data by Covington projective
Accuracy Trade offs
Scope for further work
Comparison between these algorithms for Hindi language data is quite
captivating, but that depends on the ease of availability of quality
manually annotated Treebank data. In case a comprehensive Hindi
Treebank becomes available, the same comparison can be carried out
on Hindi, which is something to look into.
Thank you ☺
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 

Dependency Parsing Algorithms Analysis - Major Project

  • 1. Under the guidance of Dr. Manish Shrivastava By: Bhuvnesh Pratap Singh (2006017), Surya Prakash Rai (2006064)
  • 2. Contents in the Store What is Dependency Parsing? Problem Definition Motivation Objective Conceptual Tour Exploring Algorithms Methodology Results Scope for further work Q & A
  • 3. What is Parsing? Parsing is the process of deducing the syntactic structure of a string. It is a prerequisite for many natural language processing tasks, and is used in applications such as information extraction and machine translation.
  • 4. Dependency Parsing? Dependency parsing is a way of parsing in which a sentence is analysed by relating each word to the other words in the sentence that depend on it.
  • 5. In this terminology, a dependency relation holds between a Head and a Dependent. Alternative terms in the literature are Governor and Regent for head, and Modifier for dependent.
  • 6. Problem Definition To date, four major dependency parsing algorithms have been proposed, viz. Covington projective, Covington non-projective, Nivre arc-eager and Nivre arc-standard. The problem to be worked on is the comparison of these different dependency parsing algorithms in terms of accuracy and time complexity.
  • 7. Motivation In some machine translation and natural language processing systems, human languages are parsed by computer programs. Human sentences are not easily parsed by programs, as there is substantial ambiguity in the structure of human language. It is difficult to prepare formal rules to describe informal behaviour, even though it is clear that some rules are being followed.
  • 8. Moreover, the task of dependency parsing becomes highly imperative in the light of the free word order languages around us, for example our own native language Hindi.
  • 9. Objective First, understand the present major dependency parsing algorithms, then implement them on a common platform so that an exhaustive comparison can be drawn between these algorithms on the basis of the time taken during the learning and testing phases, and the accuracy shown over the testing/validation data, where the data used for the purpose is English data in the CoNLL format.
  • 10. Conceptual Tour Major concepts: Dependency Grammar, Dependency Tree, The Notion of Dependency, Types of Dependencies, Types of Dependency Parsing
  • 11. Dependency Grammar The tradition of dependency grammar is based on the assumption that syntactic structure consists of lexical elements linked by binary asymmetrical relations called Dependencies.
  • 13. The Notion of Dependency The fundamental notion of dependency is based on the idea that the syntactic structure of a sentence consists of binary asymmetrical relations between the words of the sentence.
  • 14. Tesnière said (1959): "The sentence is an organized whole, the constituent elements of which are words. Every word that belongs to a sentence ceases by itself to be isolated as in the dictionary. Between the word and its neighbors, the mind perceives connections, the totality of which forms the structure of the sentence. The structural connections establish dependency relations between the words. Each connection in principle unites a superior term and an inferior term."
  • 15. Criteria for identifying a syntactic relation between a head H and a dependent D in a construction C: H determines the syntactic category of C and can often replace C; H determines the semantic category of C, while D gives the semantic representation; H is obligatory, D may be optional; H selects D and determines whether D is obligatory or optional; the form of D depends on H.
  • 16. Types of Dependencies Endocentric Constructions, Exocentric Constructions
  • 17. Endocentric Constructions In an endocentric construction the Head can replace the whole without disrupting the syntactic structure. "Economic news had little effect on [financial] markets"
  • 18. Exocentric Constructions In an exocentric construction it is not possible for the Head to replace the whole without disrupting the syntactic structure. "Economic news had little [effect] on financial markets"
  • 19. Types of Dependency Parsing Grammar-driven dependency parsing, Data-driven dependency parsing
  • 20. Data-driven dependency parsing The methodology is based on three essential components: 1. Deterministic parsing algorithms for building dependency graphs 2. History-based feature models for predicting the next parser action 3. Discriminative machine learning to map histories to parser actions
  • 21. Architecture for Data-Driven Dependency Parsing The architecture consists of three main components: Parser, Guide, Learner
  • 24. Exploring the different Dependency Parsing Algorithms
  • 25. Assumptions for Parsing Unity: a single tree with a unique root; Uniqueness: each word has only one head; One word at a time; Single left-to-right pass; Projectivity: no crossing branches
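The projectivity assumption above can be checked mechanically: a dependency tree is projective precisely when no two arcs cross. A minimal Python sketch, assuming a hypothetical head-map input (1-based word positions, 0 for the artificial root); this helper is illustrative and not part of MaltParser:

```python
def is_projective(heads):
    """Return True if no two dependency arcs cross.

    `heads` maps each word position (1-based) to the position of its
    head (0 denotes the artificial root) -- an assumed input format.
    """
    arcs = [(min(d, h), max(d, h)) for d, h in heads.items()]
    for i, j in arcs:
        for k, l in arcs:
            # (i, j) and (k, l) cross when exactly one endpoint of
            # (k, l) lies strictly inside the span (i, j)
            if i < k < j < l:
                return False
    return True

# a simple projective chain: 1 <- 2 <- 3 (root)
print(is_projective({1: 2, 2: 3, 3: 0}))        # True
# arcs 1->3 and 2->4 cross: non-projective
print(is_projective({1: 3, 2: 4, 3: 0, 4: 3}))  # False
```

The quadratic pairwise check is enough for illustration; real parsers test this incrementally as arcs are added.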
  • 27. Description of Algorithms Used Parsing Algorithms: Covington (projective, non-projective), Nivre (arc-eager, arc-standard). Learning Algorithm: SVM (LIBSVM)
  • 28. Covington Algorithm There are basically two parsing strategies: 1. Brute-force search: examine each pair of words in the entire sentence, linking them as head-to-dependent or dependent-to-head if the grammar permits. With n words there are n(n−1) pairs; if backtracking is allowed, the complexity increases further. 2. Exhaustive left-to-right search: accept words one by one starting at the beginning of the sentence, and try linking each word as head or dependent of every previous word.
  • 29. Non Projective Covington Algorithm ESH Algorithm: Given an n-word sentence: [1] for i := 1 to n do [2] begin [3] for j := i − 1 down to 1 do [4] begin [5] If the grammar permits, link word j as head of word i; [6] If the grammar permits, link word j as dependent of word i [7] end [8] end
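The ESH loop above can be sketched directly in Python. Here `permits_link(head_word, dep_word)` stands in for the "grammar permits" check and is a hypothetical oracle, not a real API:

```python
def esh_parse(words, permits_link):
    """Exhaustive left-to-right ESH search (illustrative sketch).

    For each new word i, scan the earlier words j from right to left,
    first trying j as head of i, then j as dependent of i.  Returns
    (head, dependent) index pairs; like the raw ESH pseudocode it
    enforces neither uniqueness nor projectivity.
    """
    links = []
    for i in range(len(words)):
        for j in range(i - 1, -1, -1):
            if permits_link(words[j], words[i]):  # link j as head of i
                links.append((j, i))
            if permits_link(words[i], words[j]):  # link j as dependent of i
                links.append((i, j))
    return links

# toy grammar: "had" may govern "news"
grammar = {("had", "news")}
print(esh_parse(["news", "had"], lambda h, d: (h, d) in grammar))  # [(1, 0)]
```

The ESD variant of the next slide differs only in trying the dependent link before the head link inside the inner loop.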
  • 30. ESD Algorithm: Given an n-word sentence: [1] for i := 1 to n do [2] begin [3] for j := i − 1 down to 1 do [4] begin [5] If the grammar permits, link word j as dependent of word i [6] If the grammar permits, link word j as head of word i; [7] end [8] end
  • 31. These algorithms are inefficient and violate unity, uniqueness and projectivity. A specific principle is used to enforce uniqueness, giving three variations.
  • 32. Algorithm ESHU [1] for i := 1 to n do /*given an n-word sentence*/ [2] begin [3] for j := i − 1 down to 1 do [4] begin [5] If no word has been linked as head of word i, then [6] if the grammar permits, link word j as head of word i; [7] If word j is not a dependent of some other word, then [8] if the grammar permits, link word j as dependent of word i [9] end [10] end
  • 33. Algorithm ESDU [1] for i := 1 to n do /*given an n-word sentence*/ [2] begin [3] for j := i − 1 down to 1 do [4] begin [5] If word j is not a dependent of some other word, then [6] if the grammar permits, link word j as dependent of word i [7] If no word has been linked as head of word i, then [8] if the grammar permits, link word j as head of word i; [9] end [10] end
  • 34. Algorithm LSU Headlist := [] /*Contains the list of words that have no head*/ Wordlist := [] /*All words encountered so far*/ while (!end-of-sentence) W := next input word; for each D in Headlist if HEAD?(W,D) LINK(W,D); delete D from Headlist; end for each H in Wordlist if HEAD?(H,W) LINK(H,W);
  • 35. terminate this for each loop; end if no head for W was found then Headlist := W + Headlist; end Wordlist := W + Wordlist; end
  • 36. Projective Covington Algorithm /*we have two lists, Headlist and Wordlist*/ Headlist := []; (Words that do not yet have heads) Wordlist := []; (All words encountered so far) repeat (Accept a word and add it to Wordlist) W := the next word to be parsed; Wordlist := W + Wordlist; (Look for dependents of W; they can only be consecutive elements of Headlist starting with the most recently added)
  • 37. Contd… for D := each element of Headlist, starting with the first begin if D can depend on W then begin link D as dependent of W; delete D from Headlist end else terminate this for loop end; (Look for the head of W; the search starts from the word immediately preceding W)
  • 38. Contd… H := the word immediately preceding W in the input string; loop if W can depend on H then begin link W as dependent of H; terminate the loop end; if H is independent then terminate the loop; H := the head of H end loop; if no head for W was found then Headlist := W + Headlist; until all words have been parsed.
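The three slides above can be condensed into one Python sketch. `can_depend(d, h)` is a hypothetical grammar oracle over word indices (not a real API), and the head search climbs the head chain exactly as in the pseudocode:

```python
def covington_projective(n, can_depend):
    """List-based projective Covington parse (illustrative sketch).

    `n` is the sentence length; `can_depend(d, h)` says whether the
    word at index d may depend on the word at index h.  Returns
    head[i] for every word (None = independent, i.e. the root).
    """
    head = [None] * n
    headlist = []                 # headless words, most recent first
    for w in range(n):
        # dependents of w: consecutive headless words, newest first
        while headlist and can_depend(headlist[0], w):
            head[headlist.pop(0)] = w
        # head of w: climb the head chain from the preceding word
        h = w - 1 if w > 0 else None
        while h is not None and h != w:
            if can_depend(w, h):
                head[w] = h
                break
            h = head[h]           # None once an independent word is hit
        if head[w] is None:
            headlist.insert(0, w)
    return head

# "the cat sleeps": the depends on cat, cat depends on sleeps
oracle = lambda d, h: (d, h) in {(0, 1), (1, 2)}
print(covington_projective(3, oracle))  # [1, 2, None]
```

Restricting dependents to the front of Headlist and heads to the head chain of the preceding word is what keeps the parse projective.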
  • 39. Nivre’s Algorithm Configuration: C = 〈S, I, A〉 S = Stack I = Input (remaining) A = Arc relation (current) Initialization: 〈nil, W, ∅〉 Termination: 〈S, nil, A〉 for any S, A Acceptance: 〈S, nil, A〉 if (W, A) is connected
  • 40. Transitions • Left-Arc (LA): 〈wi|S, wj|I, A〉 → 〈S, wj|I, A ∪ {(wj, wi)}〉 if ¬∃a : a ∈ A ∧ dep(a) = wi • Right-Arc (RA): 〈wi|S, wj|I, A〉 → 〈wj|wi|S, I, A ∪ {(wi, wj)}〉 if ¬∃a : a ∈ A ∧ dep(a) = wj • Reduce (RE): 〈wi|S, I, A〉 → 〈S, I, A〉 if ∃a : a ∈ A ∧ dep(a) = wi • Shift (SH): 〈S, wi|I, A〉 → 〈wi|S, I, A〉
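The four transitions above drive a simple loop. A Python sketch follows, with the guide reduced to any callable returning "LA", "RA", "RE" or "SH" (in the project that role is played by the SVM-based guide; the scripted oracle below is purely for illustration):

```python
def nivre_arc_eager(n, guide):
    """Nivre arc-eager transition loop (illustrative sketch).

    `guide(stack, buffer, arcs)` picks the next transition; `arcs`
    collects (head, dependent) index pairs.  Transitions whose
    preconditions fail fall back to Shift.
    """
    stack, buffer, arcs = [], list(range(n)), set()
    has_head = set()
    while buffer:
        t = guide(stack, buffer, arcs)
        if t == "LA" and stack and stack[-1] not in has_head:
            arcs.add((buffer[0], stack[-1]))   # head w_j, dependent w_i
            has_head.add(stack.pop())
        elif t == "RA" and stack and buffer[0] not in has_head:
            arcs.add((stack[-1], buffer[0]))   # head w_i, dependent w_j
            has_head.add(buffer[0])
            stack.append(buffer.pop(0))
        elif t == "RE" and stack and stack[-1] in has_head:
            stack.pop()
        else:                                  # Shift
            stack.append(buffer.pop(0))
    return arcs

# scripted guide for "the cat sleeps" (the <- cat <- sleeps)
script = iter(["SH", "LA", "SH", "LA", "SH"])
print(nivre_arc_eager(3, lambda s, b, a: next(script)))  # {(1, 0), (2, 1)}
```

Each word is shifted and popped at most once, which is why the algorithm runs in linear time.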
  • 42–58. (figure-only slides; no text content)
  • 59. Methodology As a group, we first studied the art of dependency parsing; the next task was to explore the various algorithms, for which we made our time and accuracy analysis. After going through these steps we created LIBSVM learning models (LIBSVM is a machine learning package for support vector machines with different kernels) for each of these algorithms, with learning data from a treebank given as input to the learning phase on the MaltParser platform.
  • 61. The Input Data Treebank Learning data: WSJ_Train.conll Testing data: WSJ_Test.conll No. of tokens in WSJ_Train.conll: 81651 No. of tokens in WSJ_Test.conll: 16320
  • 63–65. (figure-only slides; no text content)
  • 66. Accuracy trade-off We have described the accuracy trade-off in terms of Precision, where Precision = (No. of correct parses) / (Gold-standard parses)
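Computed per token over head indices, the slide's precision formula amounts to an unlabeled attachment score. A small sketch (a hypothetical helper, not a MaltParser API):

```python
def attachment_precision(gold_heads, pred_heads):
    """Precision as on the slide: correct parses over gold-standard
    parses, computed here per token by comparing predicted head
    indices against the gold-standard heads."""
    correct = sum(1 for g, p in zip(gold_heads, pred_heads) if g == p)
    return correct / len(gold_heads)

print(attachment_precision([2, 0, 2], [2, 0, 1]))  # 2 of 3 heads correct
```

In a CoNLL-format file the gold heads come from the HEAD column of the test data and the predicted heads from the parser's output file.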
  • 67. The original testing data in CoNLL format
  • 68. The output data by Covington projective
  • 71. Scope for further work Comparison between these algorithms on Hindi language data is quite captivating, but that depends on the availability of quality manually annotated treebank data. Should a comprehensive Hindi treebank become available, the same comparison can be carried out on Hindi, which is something to look into.
  • 72. Thank you ☺