Language Independent Methods of Clustering Similar Contexts (with applications)
Ted Pedersen, University of Minnesota, Duluth
http://www.d.umn.edu/~tpederse/SCTutorial.html
Language Independent Methods
A Note on Tokenization
Clustering Similar Contexts
Applications
Tutorial Outline
SenseClusters
Many thanks…
Background and Motivations
Headed and Headless Contexts
Headed Contexts (input)
Headed Contexts (output)
Headless Contexts (input)
Headless Contexts (output)
Web Search as Application
Name Discrimination
George Millers!
Email Foldering as Application
Clustering News as Application
What is it to be “similar”?
General Methodology
Identifying Lexical Features Measures of Association and  Tests of Significance
What are features?
Feature Selection Data
Feature Selection Data
Lexical Features
Lexical Features
Bigrams
Co-occurrences
Bigrams and Co-occurrences
“occur together more often than expected by chance…”
2x2 Contingency Table
                 Intelligence   not Intelligence     Total
Artificial                100                          400
not Artificial
Total                     300                      100,000

2x2 Contingency Table
                 Intelligence   not Intelligence     Total
Artificial                100                300       400
not Artificial            200             99,400    99,600
Total                     300             99,700   100,000

2x2 Contingency Table (observed counts, expected counts in parentheses)
                 Intelligence     not Intelligence         Total
Artificial       100.0 (1.2)      300.0 (398.8)              400
not Artificial   200.0 (298.8)    99,400.0 (99,301.2)     99,600
Total            300              99,700                 100,000
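The expected counts in the last table follow from the marginal totals alone: each is its row total times its column total, divided by the grand total (for example 400 × 300 / 100,000 = 1.2). The Python sketch below recomputes them and then scores the bigram with two widely used association measures, Pearson's chi-squared and the log-likelihood ratio. The function names are illustrative, and the choice of these two measures is an assumption rather than a record of the formulas shown on the slides.

```python
import math

def expected_counts(n11, n12, n21, n22):
    """Expected cell counts under independence, computed from the marginals."""
    row1, row2 = n11 + n12, n21 + n22
    col1, col2 = n11 + n21, n12 + n22
    total = row1 + row2
    return (row1 * col1 / total, row1 * col2 / total,
            row2 * col1 / total, row2 * col2 / total)

def pearson_chi_squared(observed, expected):
    """Sum over cells of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def log_likelihood_ratio(observed, expected):
    """G^2 = 2 * sum(observed * ln(observed / expected)); empty cells add 0."""
    return 2 * sum(o * math.log(o / e)
                   for o, e in zip(observed, expected) if o > 0)

# Cells in the order: (Artificial, Intelligence), (Artificial, not Intelligence),
# (not Artificial, Intelligence), (not Artificial, not Intelligence).
observed = (100, 300, 200, 99_400)
expected = expected_counts(*observed)

print("expected:", [round(e, 1) for e in expected])
print("chi-squared:", round(pearson_chi_squared(observed, expected), 1))
print("log-likelihood:", round(log_likelihood_ratio(observed, expected), 1))
```

Both scores grow as the observed count of "Artificial Intelligence" moves away from the 1.2 occurrences expected by chance.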
Measures of Association
Measures of Association
Interpreting the Scores…
Interpreting the Scores…
Measures of Association
Measures Supported in NSP
Summary
Context Representations First and Second Order Methods
Once features selected…
Possible Representations
First Order Representation Native SenseClusters
Contexts
Unigram Features
First Order Vectors of Unigrams
      island  black  curse  magic  child
x1       1      1      1      1      1
x2       0      1      0      1      1
x3       0      0      0      0      0
x4       1      0      1      0      1
Bigram Feature Set
First Order Vectors of Bigrams
      black magic  island curse  military might  serious error  voodoo child
x1         1             1              0               0              1
x2         1             0              0               0              1
x3         0             0              1               1              0
x4         0             1              1               0              1
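A first order vector simply records which features occur in a context. The Python sketch below rebuilds the x1 and x2 rows of the unigram and bigram tables from the two contexts quoted elsewhere in this deck. The tokenizer and the function names are illustrative, and treating a bigram as two strictly adjacent tokens is an assumption made to keep the sketch short.

```python
import re

UNIGRAMS = ["island", "black", "curse", "magic", "child"]
BIGRAMS = ["black magic", "island curse", "military might",
           "serious error", "voodoo child"]

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def first_order_vector(context, features):
    """Binary vector: 1 if the feature occurs in the context, else 0."""
    tokens = tokenize(context)
    unigrams = set(tokens)
    bigrams = {" ".join(pair) for pair in zip(tokens, tokens[1:])}
    present = unigrams | bigrams
    return [1 if feature in present else 0 for feature in features]

X1 = "There was an island curse of black magic cast by that voodoo child."
X2 = "harold a known voodoo child was gifted in the arts of black magic"

for name, context in (("x1", X1), ("x2", X2)):
    print(name, first_order_vector(context, UNIGRAMS),
          first_order_vector(context, BIGRAMS))
```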
First Order Vectors
Second Order Features
Second Order Representation Native SenseClusters
Word by Word Matrix
            magic   curse   might   error   child
black       123.5     0       0       0      43.2
island        0     189.2     0       0      73.2
military      0       0     100.3    54.9     0
serious       0      21.2     0      89.2     0
voodoo        0       0      69.4     0     120.0
Word by Word Matrix
There was an island curse of black magic cast by that voodoo child.
            magic   curse   might   error   child
black       123.5     0       0       0      43.2
island        0     189.2     0       0      73.2
voodoo        0       0      69.4     0     120.0
Second Order Co-Occurrences
Second Order Representation
There was an island curse of black magic cast by that voodoo child.
            magic   curse   might   error   child
x1           41.2    63.1    23.1     0      78.8
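The x1 vector is the average of the word vectors for the context words that have a row in the word by word matrix (here island, black and voodoo). A minimal Python sketch of that averaging step follows; the function name and the tokenizer are illustrative assumptions, and the scores are copied from the matrix shown in this section.

```python
import re

COLUMNS = ["magic", "curse", "might", "error", "child"]
WORD_VECTORS = {
    "black":    [123.5, 0.0, 0.0, 0.0, 43.2],
    "island":   [0.0, 189.2, 0.0, 0.0, 73.2],
    "military": [0.0, 0.0, 100.3, 54.9, 0.0],
    "serious":  [0.0, 21.2, 0.0, 89.2, 0.0],
    "voodoo":   [0.0, 0.0, 69.4, 0.0, 120.0],
}

def second_order_vector(context, word_vectors):
    """Average the vectors of every context token that has a matrix row."""
    tokens = re.findall(r"[a-z]+", context.lower())
    rows = [word_vectors[t] for t in tokens if t in word_vectors]
    if not rows:
        return [0.0] * len(COLUMNS)
    return [sum(column) / len(rows) for column in zip(*rows)]

context = "There was an island curse of black magic cast by that voodoo child."
x1 = second_order_vector(context, WORD_VECTORS)
print(dict(zip(COLUMNS, (round(v, 1) for v in x1))))
```

Words such as "curse" or "magic" contribute nothing directly because they label columns, not rows, of this matrix. They influence x1 only through the rows of the words they co-occur with.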
Second Order Representation Native SenseClusters
Second Order Representation Latent Semantic Analysis
First Order Vectors of Unigrams
      island  black  curse  magic  child
x1       1      1      1      1      1
x2       0      1      0      1      1
x3       0      0      0      0      0
x4       1      0      1      0      1

Transposed
          x1  x2  x3  x4
island     1   0   0   1
black      1   1   0   0
curse      1   0   0   1
magic      1   1   0   0
child      1   1   0   1

harold a known voodoo child was gifted in the arts of black magic
          x1  x2  x3  x4
black      1   1   0   0
magic      1   1   0   0
child      1   1   0   1
Second Order Representation
x2: harold a known voodoo child was gifted in the arts of black magic
          x1  x2  x3  x4
x2         1   1   0  .3
Second Order Representation Latent Semantic Analysis
Summary
Dimensionality Reduction Singular Value Decomposition
Motivation
Many Methods
Effect of SVD
Effect of SVD
How can SVD be used?
Word by Word Matrix (native SenseClusters, 2nd order)
        apple  blood  cells  ibm  data  tissue  graphics  plasma  box  memory  organ
pc        2      0      0     1    3      0        0        0      1     0       0
body      0      3      0     0    0      2        0        1      0     0       2
disk      1      0      0     2    0      0        1        0      3     2       0
petri     0      2      1     0    0      2        0        1      0     1       0
lab       0      0      3     0    2      2        0        3      0     2       1
sales     0      0      0     2    3      0        1        0      0     2       0
linux     2      0      0     1    3      0        1        0      2     1       0
debt      0      0      0     2    3      0        2        0      4     0       0
Singular Value Decomposition A=UDV’
U -.52 .39 -.48 .02 .09 .41 -.09 .40 -.30 .08 .31 .43 -.26 -.39 -.6 .20 .00 -.00 -.00 -.02 -.01 .00 -.02 -.00 -.07 -.3 .14 -.49 -.07 .30 .25 .56 -.01 .08 .05 -.01 .24 -.08 .11 .46 .08 .03 -.04 .72 .09 -.31 -.01 .37 -.07 .01 -.21 -.31 -.34 -.45 -.68 .29 .00 .05 .83 .17 -.02 .25 -.45 .08 .03 .20 -.22 .31 -.60 .39 .13 .35 -.01 -.04 -.44 .08 .44 .59 -.49 .05 -.02 .63 .02 -.09 .52 -.2 .09 .35
D (diagonal entries): 9.19 6.36 3.99 3.25 2.52 2.30 1.26 0.66 0.00 0.00 0.00
V -.20 .22 -.07 -.10 -.87 -.07 -.06 .17 .19 -.26 .04 .03 .17 -.32 .02 .13 -.26 -.17 .06 -.04 .86 .50 -.58 .12 .09 -.18 -.27 -.18 -.12 -.47 .11 -.03 .12 .31 -.32 -.04 .64 -.45 -.14 -.23 .28 .07 -.23 -.62 -.59 .05 .02 -.12 .15 .11 .25 -.71 -.31 -.04 .08 .29 -.05 .05 .20 -.51 .09 -.03 .12 .31 -.01 .02 -.45 -.32 .50 .27 .49 -.02 .08 .21 -.06 .08 -.09 .52 -.45 -.01 .63 .03 -.12 -.31 .71 -.13 .39 -.12 .12 .15 .37 .07 .58 -.41 .15 .17 -.30 -.32 -.27 -.39 .11 .44 .25 .03 -.02 .26 .23 .39 .57 -.37 .04 .03 -.12 -.31 -.05 -.05 .04 .28 -.04 .08 .21
Word by Word Matrix After SVD
        apple  blood  cells  ibm   data  tissue  graphics  plasma  memory  organ
pc       .73    .00    .11   1.3   2.0    .01      .86      .09     .77     .00
body     .00    1.2    1.3   .00   .33    1.6      .00      1.5     .85     .84
disk     .76    .00    .01   1.3   2.1    .00      .91      .00     .72     .00
germ     .00    1.1    1.2   .00   .49    1.5      .00      1.4     .86     .77
lab      .21    1.7    2.0   .35   1.7    2.5      .18      2.3     1.7     1.2
sales    .73    .15    .39   1.3   2.2    .35      .85      .41     .98     .17
linux    .96    .00    .16   1.7   2.7    .03      1.1      .13     1.0     .00
debt     1.2    .00    .00   2.1   3.2    .00      1.5      .00     1.1     .00
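The decomposition A = UDV' and the smoothing effect of keeping only the largest singular values can be tried out with NumPy. The sketch below uses the counts of the word by word matrix from this section, with the eight words pc through debt as rows and the eleven co-occurring words as columns. Keeping k = 2 dimensions is an assumption made for illustration, so the printed values are not expected to match the "After SVD" slide digit for digit.

```python
import numpy as np

ROWS = ["pc", "body", "disk", "petri", "lab", "sales", "linux", "debt"]
COLUMNS = ["apple", "blood", "cells", "ibm", "data", "tissue",
           "graphics", "plasma", "box", "memory", "organ"]
A = np.array([
    [2, 0, 0, 1, 3, 0, 0, 0, 1, 0, 0],   # pc
    [0, 3, 0, 0, 0, 2, 0, 1, 0, 0, 2],   # body
    [1, 0, 0, 2, 0, 0, 1, 0, 3, 2, 0],   # disk
    [0, 2, 1, 0, 0, 2, 0, 1, 0, 1, 0],   # petri
    [0, 0, 3, 0, 2, 2, 0, 3, 0, 2, 1],   # lab
    [0, 0, 0, 2, 3, 0, 1, 0, 0, 2, 0],   # sales
    [2, 0, 0, 1, 3, 0, 1, 0, 2, 1, 0],   # linux
    [0, 0, 0, 2, 3, 0, 2, 0, 4, 0, 0],   # debt
], dtype=float)

def truncated_svd(matrix, k):
    """Return the rank-k reconstruction of matrix and all singular values."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    # Scale the first k columns of U by the singular values, then project back.
    reconstruction = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return reconstruction, s

A_k, s = truncated_svd(A, k=2)
print("singular values:", np.round(s, 2))
print(np.round(A_k, 2))
```

After truncation, rows that shared few non-zero columns in the original matrix (disk and linux, for instance) can end up with similar profiles, which is the effect the reduced matrix is meant to illustrate.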
Second Order Co-Occurrences
        apple  blood  cells  ibm   data  tissue  graphics  plasma  memory  organ
disk     .76    .00    .01   1.3   2.1    .00      .91      .00     .72     .00
linux    .96    .00    .16   1.7   2.7    .03      1.1      .13     1.0     .00
Clustering Partitional Methods Cluster Stopping Cluster Labeling
Many many methods…
General Methodology
Agglomerative Clustering
Measuring Similarity
Agglomerative Clustering
Average Link Clustering
Initial similarity matrix:
      S1  S2  S3  S4
S1     -   3   4   2
S2     3   -   2   0
S3     4   2   -   1
S4     2   0   1   -
After merging S1 and S3:    S1S3, S2, S4 (similarity of S2 and S4: 0)
After merging S1S3 and S2:  S1S3S2, S4
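The merge sequence on this slide can be reproduced with a few lines of Python. The sketch assumes the pairwise similarities read off the slide (S1-S2 = 3, S1-S3 = 4, S1-S4 = 2, S2-S3 = 2, S2-S4 = 0, S3-S4 = 1) and defines the similarity of two clusters as the mean of all cross-cluster pair similarities. The function names are illustrative.

```python
from itertools import combinations

SIMILARITY = {
    frozenset(("S1", "S2")): 3, frozenset(("S1", "S3")): 4,
    frozenset(("S1", "S4")): 2, frozenset(("S2", "S3")): 2,
    frozenset(("S2", "S4")): 0, frozenset(("S3", "S4")): 1,
}

def average_link(cluster_a, cluster_b, sim):
    """Mean similarity over every pair with one member in each cluster."""
    total = sum(sim[frozenset((a, b))] for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))

def agglomerate(items, sim):
    """Repeatedly merge the two most similar clusters; return the merges."""
    clusters = [(item,) for item in items]
    merges = []
    while len(clusters) > 1:
        best = max(combinations(clusters, 2),
                   key=lambda pair: average_link(pair[0], pair[1], sim))
        score = average_link(best[0], best[1], sim)
        clusters = [c for c in clusters if c not in best] + [best[0] + best[1]]
        merges.append((best[0], best[1], score))
    return merges

for left, right, score in agglomerate(["S1", "S2", "S3", "S4"], SIMILARITY):
    print("merge", "".join(left), "+", "".join(right), "at similarity", score)
```

Stopping the loop early, once the desired number of clusters remains, turns the full merge history into a flat clustering.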
Partitional Methods
Partitional Methods
Vectors to be clustered
Random Initial Centroids (k=2)
Assignment of Clusters
Recalculation of Centroids
Reassignment of Clusters
Recalculation of Centroids
Reassignment of Clusters
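The sequence of slides above (random initial centroids with k=2, assignment, recalculation of centroids, reassignment) is the k-means loop. A minimal Python sketch follows. The two-dimensional sample points are hypothetical, and squared Euclidean distance is used as an assumption for simplicity; a similarity measure such as cosine could be substituted.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(component) / len(points) for component in zip(*points))

def k_means(vectors, k, seed=0, max_iterations=100):
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)          # random initial centroids
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_assignment == assignment:        # no change: converged
            break
        assignment = new_assignment
        # Recalculation step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = mean(members)
    return assignment, centroids

# Hypothetical data: two well separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = k_means(points, k=2)
print(labels)
print(centers)
```

Which cluster receives label 0 depends on the random initial centroids, so only the grouping, not the label numbering, is meaningful.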
Partitional Criterion Functions
Intra Cluster Similarity
Contexts to be Clustered
Ball of String  (I1 Internal Criterion Function)
Flower (I2 Internal Criterion Function)
Inter Cluster Similarity
The Fan (E1 External Criterion Function)
Hybrid Criterion Functions
Cluster Stopping
Cluster Stopping
Criterion Functions Can Help
H2 versus k T. Blair – V. Putin – S. Hussein
PK2
PK2 predicts 3 senses T. Blair – V. Putin – S. Hussein
PK3
PK3 predicts 3 senses T. Blair – V. Putin – S. Hussein
Adapted Gap Statistic
Adapted Gap Statistic
Gap predicts 3 senses T. Blair – V. Putin – S. Hussein
Cluster Labeling
Cluster Labeling
Results of Clustering
Label Types
George Miller Labels
Evaluation Techniques Comparison to gold standard data
Evaluation
Evaluation
Evaluation
Baseline Algorithm
Baseline Performance
         S1   S2   S3   Totals
C1        0    0    0      0
C2        0    0    0      0
C3       80   35   55    170
Totals   80   35   55    170

         S3   S2   S1   Totals
C1        0    0    0      0
C2        0    0    0      0
C3       55   35   80    170
Totals   55   35   80    170
Evaluation
         S1   S2   S3   Totals
C1       10   30    5     45
C2       20    0   40     60
C3       50    5   10     65
Totals   80   35   55    170
Evaluation
         S2   S3   S1   Totals
C1       30    5   10     45
C2        0   40   20     60
C3        5   10   50     65
Totals   35   55   80    170
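The second table is the first with its sense columns reordered so that the diagonal holds as many contexts as possible. For three clusters, that reordering can be found by trying every assignment of senses to clusters, as in the Python sketch below. The function name is illustrative, and scoring a clustering by the fraction of contexts on the diagonal is an assumption about the measure being reported.

```python
from itertools import permutations

SENSES = ["S1", "S2", "S3"]
# Rows are clusters, columns follow the order of SENSES.
CONFUSION = {
    "C1": [10, 30, 5],
    "C2": [20, 0, 40],
    "C3": [50, 5, 10],
}

def best_mapping(confusion, senses):
    """Try every one-to-one cluster-to-sense assignment; keep the best."""
    clusters = list(confusion)
    best_hits, best_assignment = -1, None
    for order in permutations(range(len(senses)), len(clusters)):
        hits = sum(confusion[c][s] for c, s in zip(clusters, order))
        if hits > best_hits:
            best_hits = hits
            best_assignment = {c: senses[s] for c, s in zip(clusters, order)}
    return best_assignment, best_hits

mapping, hits = best_mapping(CONFUSION, SENSES)
total = sum(sum(row) for row in CONFUSION.values())
print(mapping, hits, total, round(hits / total, 3))

# Baseline from the earlier slide: every context placed in one cluster.
BASELINE = {"C1": [0, 0, 0], "C2": [0, 0, 0], "C3": [80, 35, 55]}
_, baseline_hits = best_mapping(BASELINE, SENSES)
print(baseline_hits, round(baseline_hits / total, 3))
```

Exhaustive search is fine for a handful of clusters. For larger tables the same problem is usually solved as an assignment problem, for example with the Hungarian algorithm.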
Analysis
Hands on Experience Experiments with SenseClusters
Things to Try
Experimental Data
Creating Experimental Data
Thank you! ,[object Object],[object Object],[object Object],[object Object],[object Object]
Target Word Clustering SenseClusters Native Mode ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Target Word Clustering Latent Semantic Analysis ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Feature Clustering Latent Semantic Analysis ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
 
 
 
 
 
 
 
 
 

Weitere ähnliche Inhalte

Was ist angesagt?

Exploring Session Context using Distributed Representations of Queries and Re...
Exploring Session Context using Distributed Representations of Queries and Re...Exploring Session Context using Distributed Representations of Queries and Re...
Exploring Session Context using Distributed Representations of Queries and Re...Bhaskar Mitra
 
Language Models for Information Retrieval
Language Models for Information RetrievalLanguage Models for Information Retrieval
Language Models for Information RetrievalNik Spirin
 
Text summarization
Text summarizationText summarization
Text summarizationkareemhashem
 
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...Dissertation defense slides on "Semantic Analysis for Improved Multi-document...
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...Quinsulon Israel
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information RetrievalBhaskar Mitra
 
Adversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrievalAdversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrievalBhaskar Mitra
 
NLP - Sentiment Analysis
NLP - Sentiment AnalysisNLP - Sentiment Analysis
NLP - Sentiment AnalysisRupak Roy
 
Conceptual foundations of text mining and preprocessing steps nfaoui el_habib
Conceptual foundations of text mining and preprocessing steps nfaoui el_habibConceptual foundations of text mining and preprocessing steps nfaoui el_habib
Conceptual foundations of text mining and preprocessing steps nfaoui el_habibEl Habib NFAOUI
 
The vector space model
The vector space modelThe vector space model
The vector space modelpkgosh
 
Email Data Cleaning
Email Data CleaningEmail Data Cleaning
Email Data Cleaningfeiwin
 
Extraction Based automatic summarization
Extraction Based automatic summarizationExtraction Based automatic summarization
Extraction Based automatic summarizationAbdelaziz Al-Rihawi
 
Chat bot using text similarity approach
Chat bot using text similarity approachChat bot using text similarity approach
Chat bot using text similarity approachdinesh_joshy
 
The role of linguistic information for shallow language processing
The role of linguistic information for shallow language processingThe role of linguistic information for shallow language processing
The role of linguistic information for shallow language processingConstantin Orasan
 
Scalable Discovery Of Hidden Emails From Large Folders
Scalable Discovery Of Hidden Emails From Large FoldersScalable Discovery Of Hidden Emails From Large Folders
Scalable Discovery Of Hidden Emails From Large Foldersfeiwin
 
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasksTopic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasksLeonardo Di Donato
 

Was ist angesagt? (20)

Text Mining Analytics 101
Text Mining Analytics 101Text Mining Analytics 101
Text Mining Analytics 101
 
Exploring Session Context using Distributed Representations of Queries and Re...
Exploring Session Context using Distributed Representations of Queries and Re...Exploring Session Context using Distributed Representations of Queries and Re...
Exploring Session Context using Distributed Representations of Queries and Re...
 
Language Models for Information Retrieval
Language Models for Information RetrievalLanguage Models for Information Retrieval
Language Models for Information Retrieval
 
Text summarization
Text summarizationText summarization
Text summarization
 
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...Dissertation defense slides on "Semantic Analysis for Improved Multi-document...
Dissertation defense slides on "Semantic Analysis for Improved Multi-document...
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information Retrieval
 
Adversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrievalAdversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrieval
 
NLP - Sentiment Analysis
NLP - Sentiment AnalysisNLP - Sentiment Analysis
NLP - Sentiment Analysis
 
Conceptual foundations of text mining and preprocessing steps nfaoui el_habib
Conceptual foundations of text mining and preprocessing steps nfaoui el_habibConceptual foundations of text mining and preprocessing steps nfaoui el_habib
Conceptual foundations of text mining and preprocessing steps nfaoui el_habib
 
The vector space model
The vector space modelThe vector space model
The vector space model
 
Email Data Cleaning
Email Data CleaningEmail Data Cleaning
Email Data Cleaning
 
Extraction Based automatic summarization
Extraction Based automatic summarizationExtraction Based automatic summarization
Extraction Based automatic summarization
 
Tries
TriesTries
Tries
 
Chat bot using text similarity approach
Chat bot using text similarity approachChat bot using text similarity approach
Chat bot using text similarity approach
 
Cc35451454
Cc35451454Cc35451454
Cc35451454
 
The role of linguistic information for shallow language processing
The role of linguistic information for shallow language processingThe role of linguistic information for shallow language processing
The role of linguistic information for shallow language processing
 
Scalable Discovery Of Hidden Emails From Large Folders
Scalable Discovery Of Hidden Emails From Large FoldersScalable Discovery Of Hidden Emails From Large Folders
Scalable Discovery Of Hidden Emails From Large Folders
 
Topic modelling
Topic modellingTopic modelling
Topic modelling
 
Term weighting
Term weightingTerm weighting
Term weighting
 
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasksTopic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
 

Andere mochten auch (7)

Presentation.Pit.2011 02 04.Lat.Dianak
Presentation.Pit.2011 02 04.Lat.DianakPresentation.Pit.2011 02 04.Lat.Dianak
Presentation.Pit.2011 02 04.Lat.Dianak
 
Conll
ConllConll
Conll
 
Amia06
Amia06Amia06
Amia06
 
Catalog Price 2009 Usd
Catalog Price 2009 UsdCatalog Price 2009 Usd
Catalog Price 2009 Usd
 
Catalog Price 2009 Eur
Catalog Price 2009 EurCatalog Price 2009 Eur
Catalog Price 2009 Eur
 
Measuring Similarity Between Contexts and Concepts
Measuring Similarity Between Contexts and ConceptsMeasuring Similarity Between Contexts and Concepts
Measuring Similarity Between Contexts and Concepts
 
Catalog Price 2009 Usd
Catalog Price 2009 UsdCatalog Price 2009 Usd
Catalog Price 2009 Usd
 

Ähnlich wie Ijcai 2007 Pedersen

Using topic modelling frameworks for NLP and semantic search
Using topic modelling frameworks for NLP and semantic searchUsing topic modelling frameworks for NLP and semantic search
Using topic modelling frameworks for NLP and semantic searchDawn Anderson MSc DigM
 
Text Analytics for Semantic Computing
Text Analytics for Semantic ComputingText Analytics for Semantic Computing
Text Analytics for Semantic ComputingMeena Nagarajan
 
Information retrieval chapter 2-Text Operations.ppt
Information retrieval chapter 2-Text Operations.pptInformation retrieval chapter 2-Text Operations.ppt
Information retrieval chapter 2-Text Operations.pptSamuelKetema1
 
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...Chunyang Chen
 
CMSC 723: Computational Linguistics I
CMSC 723: Computational Linguistics ICMSC 723: Computational Linguistics I
CMSC 723: Computational Linguistics Ibutest
 
A Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text miningA Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text miningIJSRD
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language ProcessingMichel Bruley
 
Tovek Presentation by Livio Costantini
Tovek Presentation by Livio CostantiniTovek Presentation by Livio Costantini
Tovek Presentation by Livio Costantinimaxfalc
 
Semantic Search Component
Semantic Search ComponentSemantic Search Component
Semantic Search ComponentMario Flecha
 
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEA FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEkevig
 
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEA FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEijnlc
 
6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.pptBereketAraya
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional SemanticsAndre Freitas
 
Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative...
Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative...Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative...
Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative...CITE
 

Ähnlich wie Ijcai 2007 Pedersen (20)

Icon 2007 Pedersen
Icon 2007 PedersenIcon 2007 Pedersen
Icon 2007 Pedersen
 
Using topic modelling frameworks for NLP and semantic search
Using topic modelling frameworks for NLP and semantic searchUsing topic modelling frameworks for NLP and semantic search
Using topic modelling frameworks for NLP and semantic search
 
Text Analytics for Semantic Computing
Text Analytics for Semantic ComputingText Analytics for Semantic Computing
Text Analytics for Semantic Computing
 
Information retrieval chapter 2-Text Operations.ppt
Information retrieval chapter 2-Text Operations.pptInformation retrieval chapter 2-Text Operations.ppt
Information retrieval chapter 2-Text Operations.ppt
 
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
 
CMSC 723: Computational Linguistics I
CMSC 723: Computational Linguistics ICMSC 723: Computational Linguistics I
CMSC 723: Computational Linguistics I
 
A Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text miningA Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text mining
 
The Semantic Quilt
The Semantic QuiltThe Semantic Quilt
The Semantic Quilt
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language Processing
 
Textmining
TextminingTextmining
Textmining
 
Tovek Presentation by Livio Costantini
Tovek Presentation by Livio CostantiniTovek Presentation by Livio Costantini
Tovek Presentation by Livio Costantini
 
Semantic Search Component
Semantic Search ComponentSemantic Search Component
Semantic Search Component
 
Ontology
OntologyOntology
Ontology
 
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEA FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
 
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEA FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
 
6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt
 
Class14
Class14Class14
Class14
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional Semantics
 
Measuring Similarity Between Contexts and Concepts
Measuring Similarity Between Contexts and ConceptsMeasuring Similarity Between Contexts and Concepts
Measuring Similarity Between Contexts and Concepts
 
Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative...
Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative...Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative...
Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative...
 

Mehr von University of Minnesota, Duluth

Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...
Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...
Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...University of Minnesota, Duluth
 
Algorithmic Bias - What is it? Why should we care? What can we do about it?
Ijcai 2007 Pedersen

  • 1. Language Independent Methods of Clustering Similar Contexts (with applications). Ted Pedersen, University of Minnesota, Duluth. [email_address] http://www.d.umn.edu/~tpederse/SCTutorial.html
  • 26. Identifying Lexical Features: Measures of Association and Tests of Significance
  • 36. 2x2 Contingency Table (cells not yet filled in are shown as "-")
                        Intelligence   not Intelligence     Total
      Artificial                 100                  -       400
      not Artificial               -                  -         -
      Total                      300                  -   100,000
  • 37. 2x2 Contingency Table
                        Intelligence   not Intelligence     Total
      Artificial                 100                300       400
      not Artificial             200             99,400    99,600
      Total                      300             99,700   100,000
  • 38. 2x2 Contingency Table (observed counts, with expected counts under independence in parentheses)
                        Intelligence      not Intelligence         Total
      Artificial        100.0 (1.2)       300.0 (398.8)              400
      not Artificial    200.0 (298.8)     99,400.0 (99,301.2)     99,600
      Total             300               99,700                 100,000
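The expected counts on slide 38 can be recomputed from the observed counts alone. The sketch below is a minimal illustration, assuming the usual independence model (expected count = row total × column total / N, so 400 × 300 / 100,000 = 1.2 for the Artificial/Intelligence cell). The function names are hypothetical and are not part of SenseClusters. It also computes Pearson's chi-squared and the log-likelihood ratio G², two common measures of this kind; which measures the slides themselves present is not recoverable from this transcript.

```python
import math

def expected_counts(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi_squared(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

def log_likelihood_ratio(table):
    """G^2 = 2 * sum of observed * ln(observed / expected); empty cells add 0."""
    expected = expected_counts(table)
    return 2.0 * sum(o * math.log(o / e)
                     for obs_row, exp_row in zip(table, expected)
                     for o, e in zip(obs_row, exp_row) if o > 0)

# Rows: Artificial, not Artificial. Columns: Intelligence, not Intelligence.
observed = [[100, 300], [200, 99400]]
expected = expected_counts(observed)
for row in expected:
    print([round(e, 1) for e in row])
print("chi-squared:", round(pearson_chi_squared(observed), 1))
print("G^2:", round(log_likelihood_ratio(observed), 1))
```

The observed count of 100 for "Artificial Intelligence" is far above the 1.2 expected if the two words were independent, so both scores come out large.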
  • 48. Context Representations: First and Second Order Methods
  • 54. First Order Vectors of Unigrams
            island  black  curse  magic  child
      x1         1      1      1      1      1
      x2         0      1      0      1      1
      x3         0      0      0      0      0
      x4         1      0      1      0      1
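A first-order vector records, for one context, which feature words occur in it. The sketch below is a minimal illustration, assuming lowercased whitespace tokenization and binary (present/absent) values. The function name is hypothetical, and the feature order island, black, curse, magic, child is chosen for readability. The two example sentences are the contexts shown on slides 62 and 70 of this deck; with these features they appear to correspond to the x1 and x2 rows.

```python
def first_order_vector(context, features):
    """Binary vector: 1 if the feature word occurs in the context, else 0."""
    tokens = set(context.lower().replace(".", " ").split())
    return [1 if feature in tokens else 0 for feature in features]

features = ["island", "black", "curse", "magic", "child"]

x1 = first_order_vector(
    "There was an island curse of black magic cast by that voodoo child.",
    features)
x2 = first_order_vector(
    "harold a known voodoo child was gifted in the arts of black magic",
    features)

print(x1)  # [1, 1, 1, 1, 1]
print(x2)  # [0, 1, 0, 1, 1]
```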
  • 56. First Order Vectors of Bigrams
            black magic  island curse  military might  serious error  voodoo child
      x1              1             1               0              0             1
      x2              1             0               0              0             1
      x3              0             0               1              1             0
      x4              0             1               1              0             1
  • 60. Word by Word Matrix
                  magic  curse  might  error  child
      black       123.5      0      0      0   43.2
      island          0  189.2      0      0   73.2
      military        0      0  100.3   54.9      0
      serious         0   21.2      0   89.2      0
      voodoo          0      0   69.4      0  120.0
  • 62. There was an island curse of black magic cast by that voodoo child.
                  magic  curse  might  error  child
      black       123.5      0      0      0   43.2
      island          0  189.2      0      0   73.2
      voodoo          0      0   69.4      0  120.0
  • 65. There was an island curse of black magic cast by that voodoo child.
            magic  curse  might  error  child
      x1     41.2   63.1   24.4      0   78.8
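The x1 vector on slide 65 looks like the element-wise mean of the word-by-word matrix rows (slide 60) for the words that occur in the context. The sketch below is a minimal illustration under that assumption, with simple whitespace tokenization and a hypothetical function name. The plain mean reproduces the slide's values for magic, curse, error and child to one decimal place; for might it gives about 23.1 where the slide shows 24.4.

```python
# Rows of the word-by-word matrix from slide 60.
# Columns: magic, curse, might, error, child.
word_vectors = {
    "black":    [123.5, 0.0, 0.0, 0.0, 43.2],
    "island":   [0.0, 189.2, 0.0, 0.0, 73.2],
    "military": [0.0, 0.0, 100.3, 54.9, 0.0],
    "serious":  [0.0, 21.2, 0.0, 89.2, 0.0],
    "voodoo":   [0.0, 0.0, 69.4, 0.0, 120.0],
}

def second_order_vector(context, word_vectors):
    """Average the vectors of the context words that have a matrix row."""
    tokens = context.lower().replace(".", " ").split()
    rows = [word_vectors[t] for t in tokens if t in word_vectors]
    if not rows:
        return None  # no known words, so no representation
    return [sum(column) / len(rows) for column in zip(*rows)]

x1 = second_order_vector(
    "There was an island curse of black magic cast by that voodoo child.",
    word_vectors)
print([round(value, 1) for value in x1])
```

Only black, island and voodoo have rows in the matrix, so the context is represented by the mean of those three vectors.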
  • 68. First Order Vectors of Unigrams
            island  black  curse  magic  child
      x1         1      1      1      1      1
      x2         0      1      0      1      1
      x3         0      0      0      0      0
      x4         1      0      1      0      1
  • 69. Transposed
               x1  x2  x3  x4
      island    1   0   0   1
      black     1   1   0   0
      curse     1   0   0   1
      magic     1   1   0   0
      child     1   1   0   1
  • 70. harold a known voodoo child was gifted in the arts of black magic
               x1  x2  x3  x4
      black     1   1   0   0
      magic     1   1   0   0
      child     1   1   0   1
  • 72. x2: harold a known voodoo child was gifted in the arts of black magic
            x1  x2  x3  x4
      x2     1   1   0  .3
  • 76. Dimensionality Reduction: Singular Value Decomposition
  • 82. Word by Word Matrix (native SenseClusters 2nd order)
               apple  blood  cells  ibm  data  tissue  graphics  plasma
      pc           2      0      0    1     3       0         0       0
      body         0      3      0    0     0       2         0       1
      disk         1      0      0    2     0       0         1       0
      petri        0      2      1    0     0       2         0       1
      lab          0      0      3    0     2       2         0       3
      sales        0      0      0    2     3       0         1       0
      linux        2      0      0    1     3       0         1       0
      debt         0      0      0    2     3       0         2       0
      organ        0      2      0    0     1       0         0       0
      memory       0      0      2    1     2       2         1       0
      box          1      0      3    0     0       0         2       4
  • 84. U -.52 .39 -.48 .02 .09 .41 -.09 .40 -.30 .08 .31 .43 -.26 -.39 -.6 .20 .00 -.00 -.00 -.02 -.01 .00 -.02 -.00 -.07 -.3 .14 -.49 -.07 .30 .25 .56 -.01 .08 .05 -.01 .24 -.08 .11 .46 .08 .03 -.04 .72 .09 -.31 -.01 .37 -.07 .01 -.21 -.31 -.34 -.45 -.68 .29 .00 .05 .83 .17 -.02 .25 -.45 .08 .03 .20 -.22 .31 -.60 .39 .13 .35 -.01 -.04 -.44 .08 .44 .59 -.49 .05 -.02 .63 .02 -.09 .52 -.2 .09 .35
  • 85. D (singular values, largest first): 9.19 6.36 3.99 3.25 2.52 2.30 1.26 0.66 0.00 0.00 0.00
  • 86. V -.20 .22 -.07 -.10 -.87 -.07 -.06 .17 .19 -.26 .04 .03 .17 -.32 .02 .13 -.26 -.17 .06 -.04 .86 .50 -.58 .12 .09 -.18 -.27 -.18 -.12 -.47 .11 -.03 .12 .31 -.32 -.04 .64 -.45 -.14 -.23 .28 .07 -.23 -.62 -.59 .05 .02 -.12 .15 .11 .25 -.71 -.31 -.04 .08 .29 -.05 .05 .20 -.51 .09 -.03 .12 .31 -.01 .02 -.45 -.32 .50 .27 .49 -.02 .08 .21 -.06 .08 -.09 .52 -.45 -.01 .63 .03 -.12 -.31 .71 -.13 .39 -.12 .12 .15 .37 .07 .58 -.41 .15 .17 -.30 -.32 -.27 -.39 .11 .44 .25 .03 -.02 .26 .23 .39 .57 -.37 .04 .03 -.12 -.31 -.05 -.05 .04 .28 -.04 .08 .21
  • 87. Word by Word Matrix After SVD
               apple  blood  cells   ibm  data  tissue  graphics  plasma
      pc         .73    .00    .11   1.3   2.0     .01       .86     .09
      body       .00    1.2    1.3   .00   .33     1.6       .00     1.5
      disk       .76    .00    .01   1.3   2.1     .00       .91     .00
      germ       .00    1.1    1.2   .00   .49     1.5       .00     1.4
      lab        .21    1.7    2.0   .35   1.7     2.5       .18     2.3
      sales      .73    .15    .39   1.3   2.2     .35       .85     .41
      linux      .96    .00    .16   1.7   2.7     .03       1.1     .13
      debt       1.2    .00    .00   2.1   3.2     .00       1.5     .00
      organ      .00    .84    .00   .77   1.2     .17       .00     .00
      memory     .77    .85    .72   .86   1.7     .98       1.0     1.1
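Slides 82 to 87 smooth the word-by-word matrix by keeping only its largest singular values and multiplying the factors back together. The sketch below is a rough pure-Python illustration of that idea, not the SenseClusters implementation: it approximates the leading singular triplets by power iteration with deflation (an assumption made to avoid external libraries), and all function names are hypothetical. The matrix is the one on slide 82 as recovered from the flattened transcript. The transcript does not say how many dimensions the slides retain, so k = 2 is also an assumption and the output is not claimed to match slide 87.

```python
import math

def matvec(a, v):
    """Multiply matrix a (a list of rows) by vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def top_singular_triplets(a, k, iterations=500):
    """Approximate the k largest singular triplets (sigma, u, v) of a
    by power iteration on a^T a, deflating after each triplet."""
    a = [row[:] for row in a]
    n_cols = len(a[0])
    triplets = []
    for _ in range(k):
        a_t = [list(col) for col in zip(*a)]
        v = [1.0 / math.sqrt(n_cols)] * n_cols
        for _ in range(iterations):
            w = matvec(a_t, matvec(a, v))
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        av = matvec(a, v)
        sigma = math.sqrt(sum(x * x for x in av))
        if sigma < 1e-12:
            break  # nothing left to extract
        u = [x / sigma for x in av]
        triplets.append((sigma, u, v))
        # Deflate: subtract the rank-1 component just found.
        a = [[a[i][j] - sigma * u[i] * v[j] for j in range(n_cols)]
             for i in range(len(a))]
    return triplets

def low_rank_approximation(a, k):
    """Sum of the k leading rank-1 components sigma * u * v^T."""
    approx = [[0.0] * len(a[0]) for _ in a]
    for sigma, u, v in top_singular_triplets(a, k):
        for i, ui in enumerate(u):
            for j, vj in enumerate(v):
                approx[i][j] += sigma * ui * vj
    return approx

def frobenius_error(a, b):
    """Frobenius norm of the difference between two matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for ra, rb in zip(a, b) for x, y in zip(ra, rb)))

# Rows: pc, body, disk, petri, lab, sales, linux, debt, organ, memory, box.
# Columns: apple, blood, cells, ibm, data, tissue, graphics, plasma.
matrix = [
    [2, 0, 0, 1, 3, 0, 0, 0],
    [0, 3, 0, 0, 0, 2, 0, 1],
    [1, 0, 0, 2, 0, 0, 1, 0],
    [0, 2, 1, 0, 0, 2, 0, 1],
    [0, 0, 3, 0, 2, 2, 0, 3],
    [0, 0, 0, 2, 3, 0, 1, 0],
    [2, 0, 0, 1, 3, 0, 1, 0],
    [0, 0, 0, 2, 3, 0, 2, 0],
    [0, 2, 0, 0, 1, 0, 0, 0],
    [0, 0, 2, 1, 2, 2, 1, 0],
    [1, 0, 3, 0, 0, 0, 2, 4],
]

rank2 = low_rank_approximation(matrix, 2)  # k = 2 is an assumption
for row in rank2:
    print([round(value, 2) for value in row])
```

After the reduction, cells that were zero in the original matrix can become non-zero, which is how words that never co-occurred directly can still end up with similar rows.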
  • 90. Clustering: Partitional Methods, Cluster Stopping, Cluster Labeling
  • 96. Average Link Clustering 1 2 4 S3 1 2 4 S3 0 2 S4 0 3 S2 2 3 S1 S4 S2 S1 0 S4 0 S2 S1S3 S4 S2 S1S3 S4 S1S3S2 S4 S1S3S2
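Average link clustering repeatedly merges the two clusters whose members have the highest mean pairwise similarity. The sketch below is a minimal illustration with hypothetical function names. The similarity values for S1 to S4 are an assumption, inferred as far as possible from the flattened text of slide 96, and should not be read as the slide's exact table. With these values the merge order is S1+S3, then S2, then S4, which agrees with the cluster labels S1S3 and S1S3S2 that survive in the transcript.

```python
from itertools import combinations

def average_link(items, similarity):
    """Agglomerative clustering with average link.

    similarity maps frozenset({a, b}) to the similarity of items a and b.
    Returns the merge history as (cluster_a, cluster_b, similarity) tuples.
    """
    def cluster_similarity(c1, c2):
        # Mean similarity over all cross-cluster pairs.
        total = sum(similarity[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(item,) for item in items]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: cluster_similarity(*pair))
        merges.append((c1, c2, cluster_similarity(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Assumed pairwise similarities (higher means more similar).
similarity = {
    frozenset(("S1", "S2")): 3,
    frozenset(("S1", "S3")): 4,
    frozenset(("S1", "S4")): 2,
    frozenset(("S2", "S3")): 2,
    frozenset(("S2", "S4")): 0,
    frozenset(("S3", "S4")): 1,
}

merges = average_link(["S1", "S2", "S3", "S4"], similarity)
for c1, c2, score in merges:
    print("+".join(c1), "and", "+".join(c2), "at similarity", score)
```

After S1 and S3 merge, the similarity of the new cluster to S2 is the mean of 3 and 2 (2.5), and to S4 the mean of 2 and 1 (1.5), so S2 joins next.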
  • 99. Vectors to be clustered
  • 108. Contexts to be Clustered
  • 109. Ball of String (I1 Internal Criterion Function)
  • 110. Flower (I2 Internal Criterion Function)
  • 112. The Fan (E1 External Criterion Function)
  • 117. H2 versus k T. Blair – V. Putin – S. Hussein
  • 119. PK2 predicts 3 senses T. Blair – V. Putin – S. Hussein
  • 121. PK3 predicts 3 senses T. Blair – V. Putin – S. Hussein
  • 124. Gap predicts 3 senses T. Blair – V. Putin – S. Hussein
  • 131. Evaluation Techniques: Comparison to gold standard data
  • 140. Hands on Experience: Experiments with SenseClusters

Editor's Notes

  1. The main idea is to assume a null hypothesis of a single cluster and to test whether the data allow it to be rejected in favor of the alternative hypothesis of k > 1 clusters.