UNIBA: Super Sense Tagging at EVALITA 2011

Pierpaolo Basile
basilepp@di.uniba.it
Department of Computer Science
University of Bari “A. Moro” (ITALY)

EVALITA 2011, Rome, 24-25 January 2012





Motivation




  Super Sense Tagging as a sequence labelling problem [1]
  Supervised approach
      lexical/linguistic features
      distributional features
  Main motivation: tackle the data sparseness problem using word similarity in a WordSpace






WordSpace




  You shall know a word by the company it keeps! [5]
  Words are represented as points in a geometric space
  Words are related if they are close in that space (a toy sketch follows)
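
A toy sketch (not from the slides) of this idea: words as points, with relatedness read off closeness in the space, here via cosine similarity between hypothetical word-vectors.

```python
import numpy as np

# Hypothetical 3-dimensional word-vectors, purely for illustration.
word_vectors = {
    "cane":     np.array([0.9, 0.8, 0.1]),
    "gatto":    np.array([0.8, 0.9, 0.2]),
    "computer": np.array([0.1, 0.2, 0.9]),
}

def cosine(u, v):
    """Cosine similarity: close to 1.0 for nearby points, lower for distant ones."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(word_vectors["cane"], word_vectors["gatto"]))     # close in the space -> high similarity
print(cosine(word_vectors["cane"], word_vectors["computer"]))  # far apart -> low similarity
```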






WordSpace: Random Indexing


Random Indexing [4]
  builds the WordSpace using documents as contexts
  no matrix factorization required
  word-vectors are inferred using an incremental strategy (a minimal sketch follows):
      1. a random vector is assigned to each context
            sparse, high-dimensional and ternary ({-1, 0, 1})
            a small number of randomly distributed non-zero elements
      2. random vectors are accumulated incrementally by analyzing the contexts in which terms occur
            the word-vector assigned to each word is the sum of the random vectors of the contexts in which the term occurs
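
A minimal sketch of this incremental strategy, assuming documents as contexts; the dimension and sparsity values are placeholders, not the ones used in the submitted system.

```python
import numpy as np
from collections import defaultdict

DIM = 1000      # vector dimension (placeholder)
NON_ZERO = 10   # non-zero elements per random context vector (placeholder)
rng = np.random.default_rng(42)

def random_context_vector():
    """Sparse, high-dimensional, ternary ({-1, 0, 1}) random vector."""
    vec = np.zeros(DIM)
    positions = rng.choice(DIM, size=NON_ZERO, replace=False)
    vec[positions] = rng.choice([-1, 1], size=NON_ZERO)
    return vec

def build_word_space(documents):
    """Each word-vector is the sum of the random vectors of the contexts
    (here: documents) in which the word occurs."""
    word_vectors = defaultdict(lambda: np.zeros(DIM))
    for doc in documents:
        context_vector = random_context_vector()   # one random vector per context
        for word in doc.lower().split():
            word_vectors[word] += context_vector   # incremental accumulation
    return word_vectors

word_space = build_word_space(["il cane corre nel parco", "il gatto dorme in casa"])
```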






WordSpace: Random Indexing

Formally, Random Indexing is based on Random Projection [2]:

    A_{n,m} · R_{m,k} = B_{n,k},    k < m    (1)

where A_{n,m} is, for example, a term-document matrix.

After projection the distance between points is preserved up to a constant factor: d = c · d_r, where d_r is the distance in the reduced space.
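
An illustrative NumPy check of this property, with made-up matrix sizes: after projecting with a random matrix R, pairwise distances change only by a roughly constant factor (c ≈ 1 here because of the 1/√k scaling).

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 40, 2000, 200                       # made-up sizes, with k much smaller than m

A = rng.random((n, m))                        # dense stand-in for a term-doc matrix
R = rng.standard_normal((m, k)) / np.sqrt(k)  # random projection matrix
B = A @ R                                     # reduced matrix (n x k)

def pairwise_distances(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

mask = ~np.eye(n, dtype=bool)                 # ignore zero self-distances
ratio = pairwise_distances(B)[mask] / pairwise_distances(A)[mask]
print(ratio.mean(), ratio.std())              # ratios concentrate around a constant c
```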



WordSpace: context

Two WordSpaces built using different definitions of context:
  Wikipedia_p: a random vector is assigned to each Wikipedia page
  Wikipedia_c: a random vector is assigned to each Wikipedia category
      categories can identify more general concepts, in the same way as super-senses

Table: WordSpaces info (C = number of contexts, D = vector dimension)

  WordSpace      C            D
  Wikipedia_p    1,617,449    4,000
  Wikipedia_c    98,881       1,000





Methodology



Learning method: LIBLINEAR (SVM) [3]
Features (a feature-extraction sketch follows):
  1. word, lemma, PoS-tag, the first letter of the PoS-tag
  2. the super-sense assigned to the most frequent sense of the word, computed according to sense frequency in MultiSemCor
  3. whether the word starts with an upper-case character
  4. grammatical conjugation (e.g. -are, -ere and -ire for Italian verbs)
  5. distributional features: the word-vector in the WordSpace
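
A hypothetical sketch of how a single token could be mapped to these features; the names, tags and encoding are illustrative, not the system's actual implementation.

```python
import numpy as np

DIM = 1000  # placeholder vector dimension

def extract_features(token, lemma, pos_tag, mfs_supersense, word_space):
    """Return (symbolic features, distributional features) for one token."""
    symbolic = {
        "word": token,
        "lemma": lemma,
        "pos": pos_tag,
        "pos_coarse": pos_tag[0],              # first letter of the PoS-tag
        "mfs_supersense": mfs_supersense,      # super-sense of the most frequent sense (from MultiSemCor)
        "is_capitalized": token[0].isupper(),
        "conjugation": lemma[-3:] if pos_tag.startswith("V") else "",  # -are / -ere / -ire
    }
    distributional = word_space.get(token.lower(), np.zeros(DIM))      # word-vector from the WordSpace
    return symbolic, distributional

# Toy usage with a one-word WordSpace
word_space = {"correre": np.ones(DIM)}
feats, vec = extract_features("correre", "correre", "V", "verb.motion", word_space)
```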






Evaluation




Table: Results of the evaluation (A = accuracy, P = precision, R = recall, F = F-measure)

  System          A        P        R        F
  close           0.8696   0.7485   0.7583   0.7534
  no distr feat   0.8822   0.7728   0.7818   0.7773
  Wikipedia_c     0.8877   0.7719   0.8020   0.7866
  Wikipedia_p     0.8864   0.7700   0.7998   0.7846
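
The F column is consistent with the harmonic mean of precision and recall; for example, for Wikipedia_p:

    F = 2PR / (P + R) = (2 · 0.7700 · 0.7998) / (0.7700 + 0.7998) ≈ 0.7846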






Final Remarks




Main motivation: distributional features tackle the data sparseness problem in the SST task
    the increase in recall supports this idea
Further work: try a different supervised approach, more suitable for the sequence labelling task
    in this first attempt we were not interested in the performance of the learning method itself








                               That’s all folks!






For Further Reading I


[1] Ciaramita, M., Altun, Y.: Broad-coverage sense disambiguation and information extraction with a supersense sequence tagger. In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pp. 594–602. Association for Computational Linguistics (2006)
[2] Dasgupta, S., Gupta, A.: An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures & Algorithms 22(1), 60–65 (2003)
[3] Fan, R., Chang, K., Hsieh, C., Wang, X., Lin, C.: LIBLINEAR: A library for large linear classification. The Journal of Machine Learning Research 9, 1871–1874 (2008)





For Further Reading II



[4] Sahlgren, M.: An introduction to random indexing. In: Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering, TKE, vol. 5 (2005)
[5] Sahlgren, M.: The Word-Space Model: Using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces. Ph.D. thesis, Stockholm University, Faculty of Humanities, Department of Linguistics (2006)



