Transition-based Semantic Role Labeling
                    Using Predicate Argument Clustering

                          Workshop on Relational Models of Semantics


                                Jinho D. Choi & Martha Palmer
                               University of Colorado at Boulder
                                        June 23rd, 2011



Thursday, June 23, 2011
Dependency-based SRL
             •       Semantic role labeling
                   -      Task of identifying arguments of each predicate and labeling
                          them with semantic roles in relation to the predicate.

             •       Dependency-based semantic role labeling
                   -      Advantages over constituent-based semantic role labeling.
                          •   Dependency parsing is faster (2.29 milliseconds / sentence).

                          •   Dependency structure is more similar to predicate argument
                              structure.

                   -      Labels headwords instead of phrases.
                           •   Can still recover the original semantic chunks most of the time
                               (Choi and Palmer, LAW 2010).


Dependency-based SRL
             •       Constituent-based vs. dependency-based SRL
               [Figure: constituent tree for "He opened the door with his foot at ten",
                i.e., (S (NP He) (VP opened (NP the door) (PP with (NP his foot))
                (PP at (NP ten)))), with phrases labeled Agent = He, Theme = the door,
                Instrument = with his foot, Temporal = at ten; below, the same
                headwords arranged as a dependency tree.]

Dependency-based SRL
             •       Constituent-based vs. dependency-based SRL
               [Figure: dependency tree rooted at "opened" — syntactic version:
                SBJ = He, OBJ = door (the door), ADV = with (with his foot),
                TMP = at (at ten); semantic version: ARG0 = He, ARG1 = the door,
                ARG2 = with his foot, TMP = at ten.]


Motivations
             •       Do argument identification and classification need to be
                     in separate steps?
                   -      They may require two different feature sets.

                    -      Training them as a pipeline takes less time than training
                           them as a joint-inference task.

                    -      We have seen advantages of treating them as a joint-inference
                           task in dependency parsing; why not in SRL?




Transition-based SRL
             •       Dependency parsing vs. dependency-based SRL
                   -      Both try to find relations between word pairs.

                   -      Dep-based SRL is a special kind of dep. parsing.
                          •   It restricts the search only to top-down relations between
                              predicate (head) and argument (dependent) pairs.

                          •   It allows multiple predicates for each argument.

             •       Transition-based SRL algorithm
                    -      Top-down, bidirectional search → more suitable for SRL.

                   -      Easier to develop a joint-inference system between
                          dependency parsing and semantic role labeling.



Transition-based SRL
             •       Parsing states

                   -      (λ1, λ2, p, λ3, λ4, A)

                   -      p - index of the current predicate candidate.

                   -      λ1 - indices of lefthand-side argument candidates.

                   -      λ4 - indices of righthand-side argument candidates.

                    -      λ2, λ3 - indices of processed tokens.

                    -      A - labeled arcs with semantic roles.

             •       Initialization: ([ ], [ ], 1, [ ], [2, ..., n], ∅)

             •       Termination: (λ1, λ2, ∄, [ ], [ ], A)
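As a concrete illustration, the state tuple above can be sketched in Python; the class, field, and helper names below are ours, not from the authors' implementation.

```python
# A minimal sketch of the parsing state (lambda_1, lambda_2, p, lambda_3,
# lambda_4, A); names are illustrative only.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class State:
    l1: list                 # lambda_1: left-hand argument candidates
    l2: list                 # lambda_2: processed tokens (left side)
    p: Optional[int]         # index of the current predicate candidate
    l3: list                 # lambda_3: processed tokens (right side)
    l4: list                 # lambda_4: right-hand argument candidates
    A: set = field(default_factory=set)   # (pred, arg, role) arcs

def initial_state(n: int) -> State:
    # initialization: ([ ], [ ], 1, [ ], [2, ..., n], empty arc set)
    return State([], [], 1, [], list(range(2, n + 1)))

def is_terminal(s: State) -> bool:
    # termination: (lambda_1, lambda_2, no predicate, [ ], [ ], A)
    return s.p is None and not s.l3 and not s.l4
```

For a 6-token sentence, `initial_state(6)` makes token 1 the first predicate candidate and tokens 2..6 its right-hand argument candidates.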


Transition-based SRL
             •       Transitions

                   -      No-Pred - finds the next predicate candidate.

                   -      No-Arc← - rejects the lefthand-side argument candidate.

                   -      No-Arc→ - rejects the righthand-side argument candidate.

                   -      Left-Arc← - accepts the lefthand-side argument candidate.

                   -      Right-Arc→ - accepts the righthand-side argument candidate.
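A minimal, self-contained sketch of how four of these transitions could manipulate the state; the function names and dict layout are ours, and the bookkeeping for No-Pred (and the Shift used in the worked example) is omitted since the slides do not spell it out.

```python
# Illustrative transition operations on a dict-based state
# (l1, l2, p, l3, l4, A); assumptions: l1 keeps its nearest candidate
# last, l4 keeps its nearest candidate first.

def make_state(n):
    return {"l1": [], "l2": [], "p": 1, "l3": [],
            "l4": list(range(2, n + 1)), "A": set()}

def no_arc_left(s):              # reject the nearest left-hand candidate
    s["l2"].insert(0, s["l1"].pop())

def no_arc_right(s):             # reject the nearest right-hand candidate
    s["l3"].append(s["l4"].pop(0))

def left_arc(s, role):           # accept the left-hand candidate
    s["A"].add((s["p"], s["l1"][-1], role))
    no_arc_left(s)

def right_arc(s, role):          # accept the right-hand candidate
    s["A"].add((s["p"], s["l4"][0], role))
    no_arc_right(s)

# E.g., for "John1 wants2 to3 buy4 a5 car6", once No-Pred has made
# wants (2) the predicate with John (1) in lambda_1:
s = make_state(6)
s["l1"], s["p"], s["l4"] = [1], 2, [3, 4, 5, 6]
left_arc(s, "A0")                # John <- wants
right_arc(s, "A1")               # wants -> to
```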




               Transition sequence for "John1 wants2 to3 buy4 a5 car6"
               (arcs: John ← wants (A0), wants → to (A1), John ← buy (A0), buy → car (A1)):

                 For predicate "wants":              For predicate "buy":
                 •  No-Pred                          •  No-Pred
                 •  Left-Arc : John ← wants          •  No-Arc x 2
                 •  Right-Arc : wants → to           •  Left-Arc : John ← buy
                 •  No-Arc x 3                       •  No-Arc
                 •  Shift                            •  Right-Arc : buy → car

               [Figure: contents of λ1, λ2, λ3, λ4, and A during the derivation.]
Features
             •       Baseline features
                   -      N-gram and binary features
                          (similar to ones in Johansson and Nugues, EMNLP 2008).

                    -      Structural features, e.g., for "John wants to buy" with the
                           dependency tree  wants -SBJ-> PRP:John,
                           wants -OPRD-> TO:to -IM-> VB:buy:
                           •   Subcategorization of "wants":  SBJ ← V → OPRD
                           •   Path from "John" to "buy":
                               PRP ↑ LCA ↓ TO ↓ VB,  SBJ ↑ LCA ↓ OPRD ↓ IM
                           •   Depth from "John" to "buy":  1 ↑ LCA ↓ 2
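The path feature can be illustrated with a short sketch on the example tree above; the tree encoding (`heads`, `labels`) and the function names are our assumptions, not the authors' code.

```python
# Hedged sketch of the dependency-label path feature between an
# argument candidate and a predicate, via their lowest common ancestor.

def ancestors(i, heads):
    path = [i]
    while heads[i] != 0:     # 0 denotes the artificial root
        i = heads[i]
        path.append(i)
    return path

def path_feature(a, b, heads, labels):
    """Label path a ↑ ... ↑ LCA ↓ ... ↓ b, e.g. 'SBJ ↑ LCA ↓ OPRD ↓ IM'."""
    pa, pb = ancestors(a, heads), ancestors(b, heads)
    lca = next(n for n in pa if n in pb)
    up = [labels[n] for n in pa[:pa.index(lca)]]
    down = [labels[n] for n in reversed(pb[:pb.index(lca)])]
    return " ↑ ".join(up) + " ↑ LCA ↓ " + " ↓ ".join(down)

# tokens: 0=ROOT, 1=John, 2=wants, 3=to, 4=buy
heads  = {1: 2, 2: 0, 3: 2, 4: 3}
labels = {1: "SBJ", 2: "ROOT", 3: "OPRD", 4: "IM"}
print(path_feature(1, 4, heads, labels))   # SBJ ↑ LCA ↓ OPRD ↓ IM
```

The same walk over POS tags instead of labels yields the PRP ↑ LCA ↓ TO ↓ VB variant.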


Features
             •       Dynamic features
                   -      Derived from previously identified arguments.

                    -      Previously identified argument label of w_arg.

                           [Figure: "John1 wants2 to3 buy4 a5 car6" with arcs
                            John ← wants (A0), wants → to (A1),
                            John ← buy (A0), buy → car (A1).]

                    -      Label of the very last predicted numbered argument of w_pred.

                    -      These features can narrow down the scope of expected
                           arguments of w_pred.



Experiments
             •       Corpora
                   -      CoNLL’09 English data.

                   -      In-domain task: the Wall Street Journal.

                   -      Out-of-domain task: the Brown corpus.

             •       Input to our semantic role labeler
                   -      Automatically generated dependency trees.

                   -      Used our open-source dependency parser, ClearParser.

             •       Machine learning algorithm
                   -      Liblinear L2-L1 SVM.


Experiments
             •       Results
                   -      AI - Argument Identification.

                   -      AC - Argument Classification.

                                            In-domain              Out-of-domain
                              Task       P      R      F1       P      R      F1
                  Baseline    AI       92.57  88.44  90.46    90.96  81.57  86.01
                              AI+AC    87.20  83.31  85.21    77.11  69.14  72.91
                  +Dynamic    AI       92.38  88.76  90.54    90.90  82.25  86.36
                              AI+AC    87.33  83.91  85.59    77.41  70.05  73.55
                  JN'08       AI+AC    88.46  83.55  85.93    77.67  69.63  73.43



Summary
             •       Introduced a transition-based SRL algorithm, showing
                     near state-of-the-art results.
                   -      No need to design separate systems for argument
                          identification and classification.

                    -      Makes it easier to develop a joint-inference system between
                           dependency parsing and semantic role labeling.

             •       Future work
                    -      Several techniques designed to improve transition-based
                           parsing can be applied (e.g., dynamic programming, k-best
                           ranking).

                   -      We can apply more features, such as clustering information,
                          to improve labeling accuracy.


Predicate Argument Clustering
             •       Verb clusters can help the statistical models generalize.
                    -      Clustering verbs using bag-of-words or syntactic structure.

                   -      Clustering verbs using predicate argument structure.

             •       Self-learning clustering
                   -      Cluster verbs in the test data using automatically generated
                          predicate argument structures.

                   -      Cluster verbs in the training data using the verb clusters
                          found in the test data as seeds.

                   -      Re-run our semantic role labeler on the test data using the
                          clustering information.


Predicate Argument Clustering
             •       Vector representation
                    -      Features: semantic role labels, and semantic role labels
                           + word lemmas.

                           Figure 2: Projecting the predicate argument structure of
                           each verb into vector space.

                            Verb   A0   A1   ...   john:A0   to:A1   car:A1   ...
                            want    1    1   ...      1        1        0     ...
                            buy     1    1   ...      1        0        1     ...

                    -      Some features carry higher confidence, or are more important,
                           than others; e.g., ARG0 and ARG1 are generally predicted with
                           higher confidence than modifiers, and nouns give more
                           important information than some other grammatical categories.
                           Instead of binary values, we assign each existing feature a
                           value computed by the following equations:

                           s(l_j | v_i) = 1 / (1 + exp(−score(l_j | v_i)))

                           s(m_j, l_j) = exp( w · count(m_j, l_j) / Σ_∀k count(m_k, l_k) )
                           (w = 1 if m_j is a noun)

                           v_i is the current verb, l_j is the j'th label of v_i, and
                           m_j is l_j's corresponding lemma. score(l_j | v_i) is the
                           score of l_j being a correct argument label of v_i; this is
                           always 1 for training data and is provided by our semantic
                           role labeler for test data.
Predicate Argument Clustering
             •       Clustering verbs in the test data
                   -      K-best hierarchical agglomerative clustering.
                          •   Merges k-best pairs at each iteration.

                          •   Uses a threshold to dynamically determine the top k clusters.

                   -      We set another threshold for early break-out.

             •       Clustering verbs in the training data
                   -      K-means clustering.
                          •   Starts with centroids estimated from the clusters found in the test
                              data.

                          •   Uses a threshold to filter out verbs not close enough to any
                              cluster.


Experiments
             •       Results

                                            In-domain              Out-of-domain
                              Task       P      R      F1       P      R      F1
                  Baseline    AI       92.57  88.44  90.46    90.96  81.57  86.01
                              AI+AC    87.20  83.31  85.21    77.11  69.14  72.91
                  +Dynamic    AI       92.38  88.76  90.54    90.90  82.25  86.36
                              AI+AC    87.33  83.91  85.59    77.41  70.05  73.55
                  +Cluster    AI       92.62  88.90  90.72    90.87  82.43  86.44
                              AI+AC    87.43  83.92  85.64    77.47  70.28  73.70
                  JN'08       AI+AC    88.46  83.55  85.93    77.67  69.63  73.43




Conclusion
             •       Introduced a self-learning clustering technique with the
                     potential to improve labeling accuracy in new domains.
                    -      Need to try it on large-scale data to see a clear impact of
                           the clustering.

                   -      Can also be improved by using different features or
                          clustering algorithms.


             •       ClearParser open-source project
                   -      http://code.google.com/p/clearparser/




Acknowledgements
             •       We gratefully acknowledge the support of the National
                      Science Foundation Grants CISE-IIS-RI-0910992, Richer
                     Representations for Machine Translation, a subcontract
                     from the Mayo Clinic and Harvard Children’s Hospital
                     based on a grant from the ONC, 90TR0002/01, Strategic
                     Health Advanced Research Project Area 4: Natural
                     Language Processing, and a grant from the Defense
                     Advanced Research Projects Agency (DARPA/IPTO)
                     under the GALE program, DARPA/CMO Contract No.
                     HR0011-06-C-0022, subcontract from BBN, Inc. Any
                     opinions, findings, and conclusions or recommendations
                     expressed in this material are those of the authors and
                     do not necessarily reflect the views of the National
                     Science Foundation.


The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

Transition-based Semantic Role Labeling Using Predicate Argument Clustering

  • 1. Transition-based Semantic Role Labeling Using Predicate Argument Clustering
       Workshop on Relational Models of Semantics
       Jinho D. Choi & Martha Palmer
       University of Colorado at Boulder
       June 23rd, 2011
       Thursday, June 23, 2011
  • 2. Dependency-based SRL
       - Semantic role labeling: the task of identifying the arguments of each
         predicate and labeling them with semantic roles in relation to that
         predicate.
       - Dependency-based SRL has advantages over constituent-based SRL:
         • Dependency parsing is faster (2.29 milliseconds / sentence).
         • Dependency structure is more similar to predicate argument structure.
       - It labels headwords instead of phrases, yet can still recover the
         original semantic chunks most of the time (Choi and Palmer, LAW 2010).
  • 3. Dependency-based SRL: constituent-based vs. dependency-based SRL
       [Diagram: for "He opened the door with his foot at ten", the constituent
       tree assigns Agent (NP), Theme (NP), Instrument (PP), and Temporal (PP)
       to phrases, while the dependency tree attaches the headwords "He",
       "door", "with", and "at" directly to the predicate.]
  • 4. Dependency-based SRL: constituent-based vs. dependency-based SRL
       [Diagram: the syntactic dependency tree labels the children of "opened"
       as SBJ (He), OBJ (door), ADV (with), and TMP (at); the semantic
       dependency structure labels them ARG0 (He), ARG1 (the door), ARG2 (with
       his foot), and TMP (at ten).]
  • 5. Motivations
       - Do argument identification and classification need to be in separate
         steps?
         • They may require two different feature sets.
         • Training them in a pipeline takes less time than as a
           joint-inference task.
         • We have seen the advantages of treating them as a joint-inference
           task in dependency parsing; why not in SRL?
  • 6. Transition-based SRL
       - Dependency parsing vs. dependency-based SRL:
         • Both try to find relations between word pairs.
         • Dependency-based SRL is a special kind of dependency parsing: it
           restricts the search to top-down relations between predicate (head)
           and argument (dependent) pairs, and it allows multiple predicates
           for each argument.
       - Transition-based SRL algorithm:
         • Top-down, bidirectional search → more suitable for SRL.
         • Easier to develop a joint-inference system between dependency
           parsing and semantic role labeling.
  • 7. Transition-based SRL: parsing states
       - A state is a tuple (λ1, λ2, p, λ3, λ4, A):
         • p - index of the current predicate candidate.
         • λ1 - indices of lefthand-side argument candidates.
         • λ4 - indices of righthand-side argument candidates.
         • λ2, λ3 - indices of processed tokens.
         • A - labeled arcs with semantic roles.
       - Initialization: ([ ], [ ], 1, [ ], [2, ..., n], ∅)
       - Termination: (λ1, λ2, ∄, [ ], [ ], A)
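The state tuple above can be sketched as a small data structure. This is an illustrative sketch, not the authors' implementation; the class and field names (`SRLState`, `lmd1`, `pred`, etc.) are assumptions.

```python
from typing import List, Optional, Set, Tuple

class SRLState:
    """Minimal sketch of the (λ1, λ2, p, λ3, λ4, A) parsing state."""

    def __init__(self, n: int):
        # Initialization: ([ ], [ ], 1, [ ], [2, ..., n], ∅) for n tokens.
        self.lmd1: List[int] = []                     # lefthand argument candidates
        self.lmd2: List[int] = []                     # processed left tokens
        self.pred: Optional[int] = 1                  # current predicate candidate
        self.lmd3: List[int] = []                     # processed right tokens
        self.lmd4: List[int] = list(range(2, n + 1))  # righthand candidates
        self.arcs: Set[Tuple[int, int, str]] = set()  # (pred, arg, role) arcs

    def is_terminal(self) -> bool:
        # Termination: (λ1, λ2, ∄, [ ], [ ], A) - no predicate candidate left
        # and no unprocessed righthand tokens.
        return self.pred is None and not self.lmd3 and not self.lmd4
```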
  • 8. Transition-based SRL: transitions
       - No-Pred - finds the next predicate candidate.
       - No-Arc← - rejects the lefthand-side argument candidate.
       - No-Arc→ - rejects the righthand-side argument candidate.
       - Left-Arc← - accepts the lefthand-side argument candidate.
       - Right-Arc→ - accepts the righthand-side argument candidate.
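The transitions above can be sketched on a simplified state: λ2/λ3 are omitted, No-Pred is told the next predicate index directly (predicate identification is a separate decision), and all names are assumptions rather than the authors' code.

```python
def initial(n, first_pred):
    """All other tokens start as argument candidates of the first predicate."""
    toks = list(range(1, n + 1))
    i = toks.index(first_pred)
    return {"l1": toks[:i], "p": first_pred, "l4": toks[i + 1:], "A": set()}

def left_arc(s, label):
    s["A"].add((s["p"], s["l1"].pop(), label))    # accept nearest left candidate

def right_arc(s, label):
    s["A"].add((s["p"], s["l4"].pop(0), label))   # accept nearest right candidate

def no_arc_left(s):
    s["l1"].pop()                                 # reject nearest left candidate

def no_arc_right(s):
    s["l4"].pop(0)                                # reject nearest right candidate

def no_pred(s, n, next_pred):
    """Move to the next predicate; every other token becomes a candidate
    again, since one argument may attach to multiple predicates."""
    toks = list(range(1, n + 1))
    i = toks.index(next_pred)
    s["l1"], s["p"], s["l4"] = toks[:i], next_pred, toks[i + 1:]
```

Replayed on "John1 wants2 to3 buy4 a5 car6", a sequence of these transitions yields the four gold arcs (2,1,A0), (2,3,A1), (4,1,A0), (4,6,A1).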
  • 9. Example: John1 wants2 to3 buy4 a5 car6, where "wants" takes A0 (John)
       and A1 (to buy ...), and "buy" takes A0 (John) and A1 (car).
       [Table of intermediate (λ1, λ2, λ3, λ4, A) states omitted.]
       Transition sequence:
       • No-Pred (predicate candidate becomes "wants")
       • Left-Arc: John ← wants
       • Right-Arc: wants → to
       • No-Arc × 3
       • No-Pred (predicate candidate becomes "buy")
       • No-Arc × 2
       • Left-Arc: John ← buy
       • No-Arc
       • Right-Arc: buy → car
       • Shift
  • 10. Features
       - Baseline features:
         • N-gram and binary features (similar to those in Johansson and
           Nugues, EMNLP 2008).
         • Structural features, e.g. for "John wants to buy":
           - Subcategorization of "wants": SBJ ← V → OPRD
           - Path from "John" to "buy" (POS tags): PRP ↑ LCA ↓ TO ↓ VB
           - Path from "John" to "buy" (dependency labels): SBJ ↑ LCA ↓ OPRD ↓ IM
           - Depth from "John" to "buy": 1 ↑ LCA ↓ 2
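The path and depth features above can be computed by walking both tokens up to their lowest common ancestor (LCA) in the dependency tree. The sketch below is an assumed illustration (function names and the head/tag encoding are mine, not the authors').

```python
def ancestors(node, head):
    """Chain from node up to the root (inclusive), following head links."""
    chain = [node]
    while head.get(node) is not None:
        node = head[node]
        chain.append(node)
    return chain

def pos_path(arg, pred, head, tag):
    """POS-tag path from argument to predicate via their LCA."""
    up = ancestors(arg, head)
    down = ancestors(pred, head)
    lca = next(a for a in up if a in down)        # first shared ancestor
    path = " ↑ ".join([tag[n] for n in up[:up.index(lca)]] + ["LCA"])
    for n in reversed(down[:down.index(lca)]):
        path += " ↓ " + tag[n]
    return path

def depth_path(arg, pred, head):
    """Distances from argument and predicate up to their LCA."""
    up = ancestors(arg, head)
    down = ancestors(pred, head)
    lca = next(a for a in up if a in down)
    return f"{up.index(lca)} ↑ LCA ↓ {down.index(lca)}"
```

On a toy tree for "John wants to buy" (John and to attached to wants, buy attached to to), the path from "John" to "buy" comes out as "PRP ↑ LCA ↓ TO ↓ VB" and the depth as "1 ↑ LCA ↓ 2", matching the slide.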
  • 11. Features: dynamic features
       - Derived from previously identified arguments:
         • The previously identified argument label of warg.
         • The label of the very last predicted numbered argument of wpred.
       - Example: John1 wants2 to3 buy4 a5 car6, where both "wants" and "buy"
         take A0 (John) and A1.
       - These features can narrow down the scope of expected arguments of
         wpred.
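The two dynamic features above can be read off the arcs predicted so far. This is an assumed sketch; in particular, the test for a "numbered" argument label (A0-A9) is my interpretation.

```python
def dynamic_features(arcs, pred, arg):
    """Return the two dynamic features for the pair (w_pred, w_arg).

    arcs: ordered list of (predicate, argument, label) triples built so far.
    """
    # 1) Previously identified argument label of w_arg (by any predicate).
    prev_arg_label = next((l for p, a, l in arcs if a == arg), None)
    # 2) Label of the very last predicted numbered argument of w_pred.
    numbered = [l for p, a, l in arcs
                if p == pred and len(l) == 2 and l[0] == "A" and l[1].isdigit()]
    last_numbered = numbered[-1] if numbered else None
    return prev_arg_label, last_numbered
```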
  • 12. Experiments
       - Corpora: CoNLL'09 English data.
         • In-domain task: the Wall Street Journal.
         • Out-of-domain task: the Brown corpus.
       - Input to our semantic role labeler: automatically generated dependency
         trees, produced by our open-source dependency parser, ClearParser.
       - Machine learning algorithm: Liblinear L2-L1 SVM.
  • 13. Experiments: results (AI - argument identification, AC - argument
       classification)

                            In-domain              Out-of-domain
       Model      Task      P      R      F1       P      R      F1
       Baseline   AI        92.57  88.44  90.46    90.96  81.57  86.01
                  AI+AC     87.20  83.31  85.21    77.11  69.14  72.91
       +Dynamic   AI        92.38  88.76  90.54    90.90  82.25  86.36
                  AI+AC     87.33  83.91  85.59    77.41  70.05  73.55
       JN'08      AI+AC     88.46  83.55  85.93    77.67  69.63  73.43
  • 14. Summary
       - Introduced a transition-based SRL algorithm showing near
         state-of-the-art results:
         • No need to design separate systems for argument identification and
           classification.
         • Makes it easier to develop a joint-inference system between
           dependency parsing and semantic role labeling.
       - Future work:
         • Several techniques designed to improve transition-based parsing can
           be applied (e.g., dynamic programming, k-best ranking).
         • More features, such as clustering information, can be applied to
           improve labeling accuracy.
  • 15. Predicate Argument Clustering
       - Verb clusters can give more generalization to the statistical models:
         • Clustering verbs using bag-of-words or syntactic structure.
         • Clustering verbs using predicate argument structure.
       - Self-learning clustering:
         • Cluster verbs in the test data using automatically generated
           predicate argument structures.
         • Cluster verbs in the training data using the verb clusters found in
           the test data as seeds.
         • Re-run our semantic role labeler on the test data using the
           clustering information.
  • 16. Predicate Argument Clustering: vector representation
       [Figure 2: projecting the predicate argument structure of each verb
       into vector space. Each verb (e.g., "want", "buy") becomes a vector
       over semantic role labels (A0, A1, ...) and lemma:label pairs
       (john:A0, to:A1, car:A1, ...).]
       - Vector representation: semantic role labels + word lemmas.
       - Some features matter more than others; e.g., ARG0 and ARG1 are
         generally predicted with higher confidence than modifiers, and nouns
         give more important information than some other grammatical
         categories. Instead of binary features, each existing feature is
         assigned a value computed by the following equations:

         s(l_j | v_i) = 1 / (1 + exp(-score(l_j | v_i)))
         s(m_j, l_j)  = exp( count(m_j, l_j) / Σ_k count(m_k, l_k) )

       - Here v_i is the current verb, l_j is the j'th label of v_i, and m_j
         is l_j's corresponding lemma. score(l_j | v_i) is a score of l_j
         being a correct argument label of v_i; it is always 1 for training
         data and is provided by our semantic role labeler for the test data.
         s(m_j, l_j) reflects the likelihood of m_j co-occurring with l_j.
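As a sketch, assuming the sigmoid form for label features and the exponentiated count ratio for lemma:label features described above:

```python
import math

def s_label(score):
    """Weight for a semantic-role-label feature l_j of verb v_i:
    s(l_j | v_i) = 1 / (1 + exp(-score(l_j | v_i)))."""
    return 1.0 / (1.0 + math.exp(-score))

def s_lemma(counts, m, l):
    """Weight for a lemma:label feature, emphasizing frequent co-occurrences:
    s(m_j, l_j) = exp(count(m_j, l_j) / sum_k count(m_k, l_k)).

    counts: dict mapping (lemma, label) pairs to their co-occurrence counts.
    """
    total = sum(counts.values())
    return math.exp(counts.get((m, l), 0) / total)
```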
  • 17. Predicate Argument Clustering
       - Clustering verbs in the test data: k-best hierarchical agglomerative
         clustering.
         • Merges the k best pairs at each iteration.
         • Uses a threshold to dynamically determine the top k clusters.
         • We set another threshold for early break-out.
       - Clustering verbs in the training data: k-means clustering.
         • Starts with centroids estimated from the clusters found in the test
           data.
         • Uses a threshold to filter out verbs not close enough to any
           cluster.
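The training-data step above can be sketched as a seeded k-means with a similarity threshold. Everything here is a toy illustration under assumptions: sparse vectors as dicts, cosine similarity as the closeness measure, and a fixed iteration count.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse vectors represented as dicts."""
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def seeded_kmeans(vectors, seeds, threshold=0.5, iters=10):
    """K-means over verb vectors, seeded with centroids from test-data
    clusters; verbs below the similarity threshold stay unassigned."""
    centroids = [dict(s) for s in seeds]
    assign = {}
    for _ in range(iters):
        assign = {}
        for verb, vec in vectors.items():
            sims = [cosine(vec, c) for c in centroids]
            best = max(range(len(centroids)), key=lambda i: sims[i])
            if sims[best] >= threshold:       # filter: must be close enough
                assign[verb] = best
        for i in range(len(centroids)):       # recompute centroids
            members = [vectors[v] for v, c in assign.items() if c == i]
            if members:
                keys = set().union(*members)
                centroids[i] = {k: sum(m.get(k, 0.0) for m in members) / len(members)
                                for k in keys}
    return assign
```

For example, seeding with one "buy-like" and one "sleep-like" centroid groups verbs whose argument vectors lean toward A1:car with the first cluster and A0:john with the second.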
  • 18. Experiments: results

                            In-domain              Out-of-domain
       Model      Task      P      R      F1       P      R      F1
       Baseline   AI        92.57  88.44  90.46    90.96  81.57  86.01
                  AI+AC     87.20  83.31  85.21    77.11  69.14  72.91
       +Dynamic   AI        92.38  88.76  90.54    90.90  82.25  86.36
                  AI+AC     87.33  83.91  85.59    77.41  70.05  73.55
       +Cluster   AI        92.62  88.90  90.72    90.87  82.43  86.44
                  AI+AC     87.43  83.92  85.64    77.47  70.28  73.70
       JN'08      AI+AC     88.46  83.55  85.93    77.67  69.63  73.43
  • 19. Conclusion
       - Introduced a self-learning clustering technique with the potential to
         improve labeling accuracy in a new domain.
         • Needs to be tried on large-scale data to see a clear impact of the
           clustering.
         • Can also be improved by using different features or clustering
           algorithms.
       - ClearParser open-source project:
         http://code.google.com/p/clearparser/
  • 20. Acknowledgements
       We gratefully acknowledge the support of the National Science Foundation
       Grants CISE-IIS-RI-0910992, Richer Representations for Machine
       Translation; a subcontract from the Mayo Clinic and Harvard Children's
       Hospital based on a grant from the ONC, 90TR0002/01, Strategic Health
       Advanced Research Project Area 4: Natural Language Processing; and a
       grant from the Defense Advanced Research Projects Agency (DARPA/IPTO)
       under the GALE program, DARPA/CMO Contract No. HR0011-06-C-0022,
       subcontract from BBN, Inc. Any opinions, findings, and conclusions or
       recommendations expressed in this material are those of the authors and
       do not necessarily reflect the views of the National Science Foundation.