1. Introduction
Associative Experiments
Associative Thesauri
Russian Associative Thesaurus'98
Associative Network(Graph)
Modelling of Associative Network
Future work
Associative thesauri: structure and analysis
Brown bag seminar
Ekaterina Vylomova
Fulbright scholar at Montclair State University
February 21, 2014
E. Vylomova
Associative thesauri
Brief Bio
2011: MSc, Bauman Moscow State Technical University
2009: BSc, Bauman Moscow State Technical University
2009: Yandex School of Data Analysis (Moscow Institute of
Physics & Technology)
Associative Experiments
What's AE?
The associative experiment is one of the methods of psycholinguistics. It is
based on the method of free association.
Sir Francis Galton conducted the first such experiment in 1879.
Types of AE
Single Free Association
Multiple Free Associations
Single Controlled Association (synonym, noun, verb, hyponym,
etc.)
Multiple Controlled Associations
Associative Thesauri
What's associative thesaurus?
Example of data
EAT (Edinburgh Associative Thesaurus) word associations
CAT stimulated the following associations:
Response  Count  Proportion
DOG       49     0.52
MOUSE     8      0.08
BLACK     4      0.04
MAT       3      0.03
ANIMAL    2      0.02
EYES      2      0.02
GUT       2      0.02
KITTEN    2      0.02
AT for different languages
English
The Structure of Associations in Language and Thought
(Deese, 1965)
Word association (Cramer, 1968)
An associative thesaurus of English and its computer analysis
(Kiss et al., 1973)
Word Association, rhyme and fragment norms (Nelson,
McEvoy & Schreiber, 1999)
Dutch
Word association norms with response times (De Groot, 1988)
Word associations: Norms for 1,424 Dutch words in a
continuous task (De Deyne & Storms, 2008)
Swedish
A Swedish Associative Thesaurus (Lonngren, 1998)
Japanese
Construction of associative concept dictionary with distance
information, and comparison with electronic concept dictionary
(Okamoto & Ishizaki, 2001)
Building a word association database for basic Japanese
vocabulary (Joyce, 2005)
Korean
Network analysis of Korean Word Associations (Jung et al.,
2010)
Czech
Volne slovni parove asociace v cestine (Novak, 1988)
Hebrew
Free association norms in the Hebrew language (Rubinsten,
2005)
Slavic Associative Thesauri
Dictionary of associative norms in Russian (Leontiev, 1973)
Russian Associative Thesaurus (Karaulov et al., 2002)
Slavic Associative Thesaurus (Russian, Belorussian, Bulgarian,
Ukrainian) (Ufimtseva et al., 2004)
Normas asociativas del espanol y del ruso (Sanchez
Puig, Karaulov, Cherkasova, 2000)
Russian Associative Thesaurus'98
Russian associative experiment description
Time frame: 1988-1998
Participants: 11,000 1st-3rd year students; 34 specialities
Stimuli: 6,624 (initial list: 1,277)
Associative pairs: 1,032,522 (different: 462,500)
Reactions: 102,926
Subset used for analysis
Stimuli: 6,577
Reactions: 21,312
Associative pairs: 102,516
Dataset
Set of triplets $(c_i, r_j, w_{ij})$, where
$$w_{ij} = \frac{\mathrm{freq}_{ij}}{\sum_{j=1}^{n} \mathrm{freq}_{ij}}.$$
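The weight formula above can be sketched in a few lines. This is only an illustration: it reads $c_i$ as the stimulus and $r_j$ as the reaction (the slide does not spell this out), and the counts in `toy_pairs` are made up, not taken from RAT.

```python
from collections import defaultdict


def association_strengths(pairs):
    """Turn (stimulus, reaction, count) records into (stimulus, reaction, weight) triplets.

    The weight is the count divided by the total count for the same stimulus.
    """
    totals = defaultdict(int)
    for stimulus, _reaction, count in pairs:
        totals[stimulus] += count
    return [(s, r, c / totals[s]) for s, r, c in pairs]


# Hypothetical toy counts, for illustration only.
toy_pairs = [
    ("cat", "dog", 6),
    ("cat", "mouse", 3),
    ("cat", "black", 1),
    ("dog", "cat", 4),
]
triplets = association_strengths(toy_pairs)
```

With this normalisation the weights of all reactions to one stimulus sum to 1, so they can be read as relative response frequencies.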
Comparison with frequency dictionary of Russian language
Frequency dictionary
The Frequency Dictionary of Modern Russian (Lyashevskaya,
Sharov, 2009) is based on texts from the Russian National Corpus
(www.ruscorpora.ru) and covers the 20,000 most common words in
Russian.
RAT Lemmatisation
RAT -> MyStem (Segalovich, 2003) -> lemmas
TOP-11 Nouns
Rank  RAT          FreqDict
1     Human        Year
2     Home, House  Human
3     Money        Time
4     Day          Business
5     Friend       Life
6     Home         Day
7     Male         Hand
8     Fool         Work
9     Business     Word
10    Life         Place
11    Illness      Friend
Semantic primes?
Concept Human: human, child, friend, male
Concept Time: day, time
Adjectives: good, bad, big
These concepts do not change over time.
Positive correlation with semantic primes (Wierzbicka)
Associative Network(Graph)
Description
Nodes correspond to words (lemmas)
Edges correspond to associations
Edge weights correspond to association strength
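As a minimal sketch of this representation, the network can be held in a nested dictionary. The function name `build_network` and the toy triplets are illustrative assumptions, not the data structure actually used for RAT.

```python
def build_network(triplets):
    """Build a directed, weighted adjacency map.

    graph[stimulus][reaction] holds the association strength of that edge.
    """
    graph = {}
    for stimulus, reaction, weight in triplets:
        graph.setdefault(stimulus, {})[reaction] = weight
        # A reaction is a node too, even if it never occurs as a stimulus.
        graph.setdefault(reaction, {})
    return graph


# Hypothetical toy triplets (stimulus lemma, reaction lemma, strength).
toy_triplets = [("cat", "dog", 0.6), ("cat", "mouse", 0.4), ("dog", "cat", 1.0)]
network = build_network(toy_triplets)

node_count = len(network)
edge_count = sum(len(reactions) for reactions in network.values())
```

Nodes with an empty inner dictionary are pure reactions: they have incoming edges only.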
Main characteristics of the network
Nodes: |V| = 23,195, among them:
nodes with outgoing edges only (stimuli): |S| = 1,883
nodes with incoming edges only (reactions): |R| = 16,618
nodes with both types of edges: |SR| = 4,694
Edges: |E| = 102,516
Table of network characteristics
Sign  Description                          Directed  Undirected
N     Number of nodes                      23,195    23,195
L     Average shortest path length         3.98      3.83
D     Diameter                             9         8
k     Average node degree                  4.42      8.83
ψ     Degree distribution P(k) parameter   2.2       1.85
Directed to undirected
Each pair of opposite directed edges is merged into one undirected edge whose weight is the sum of the two directed weights: $w^{u}_{ij} = w^{u}_{ji} = w_{ij} + w_{ji}$.
Degree distribution function
$P(k) \approx k^{-\psi}$
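The directed-to-undirected conversion and the average degree in the table can be sketched as follows. The adjacency-map layout and the toy graph are assumptions made for illustration. A missing reverse edge is treated as weight 0.

```python
def to_undirected(directed):
    """Merge opposite directed edges: the undirected weight is w_ij + w_ji."""
    undirected = {node: {} for node in directed}
    for i, neighbours in directed.items():
        for j, weight in neighbours.items():
            undirected.setdefault(j, {})
            total = weight + directed.get(j, {}).get(i, 0.0)
            undirected[i][j] = total
            undirected[j][i] = total
    return undirected


def average_degree(graph):
    """Average number of adjacency entries per node (out-degree if directed)."""
    return sum(len(neighbours) for neighbours in graph.values()) / len(graph)


# Hypothetical toy graph: graph[i][j] is the weight of edge i -> j.
toy_directed = {"cat": {"dog": 0.6, "mouse": 0.4}, "dog": {"cat": 1.0}, "mouse": {}}
toy_undirected = to_undirected(toy_directed)
```

For the directed graph this average is |E| / |V|, which matches the table: 102,516 / 23,195 ≈ 4.42.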
Small-world networks
Definition
Introduced by Milgram, 1967 ("The small world problem")
$L \propto \log N$, i.e. the distance L between two randomly chosen nodes
grows proportionally to the logarithm of the number of nodes N in
the network
Also known as "six degrees of separation"
Examples
World Wide Web (WWW; Adamic, 1999; Albert, Jeong,
Barabasi, 1999), networks of scientific collaboration (Newman,
2001), metabolic networks in biology (Jeong, Tombor, Albert,
Oltvai, Barabasi, 2000)
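The quantities L (average shortest path length) and D (diameter) used in the small-world definition can be estimated with breadth-first search. This is a sketch that assumes unweighted hops and averages over reachable ordered pairs only. The slides do not say how unreachable pairs were handled, and the toy graph is hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def path_statistics(graph):
    """Return (average shortest path length, diameter) over reachable ordered pairs."""
    total, pairs, diameter = 0, 0, 0
    for source in graph:
        for target, d in bfs_distances(graph, source).items():
            if target != source:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return total / pairs, diameter


# Hypothetical toy graph: an undirected chain a - b - c - d.
toy_chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
avg_length, diameter = path_statistics(toy_chain)
```

One search per node costs O(|V| (|V| + |E|)) overall, which is feasible for a network of this size.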
Scale-free networks
Description
Amaral, Scala et al. (2000) studied small-world networks and
compared their degree distribution functions P(k).
Two types of distribution:
exponential (power grid system in the USA, neural system of
C. elegans)
power law (WWW, metabolic networks): $P(k) \approx k^{-\psi}$,
$\psi \in (2, 4)$
Scale-free networks provide better signal propagation.
Other examples
Similar results were obtained for Roget's thesaurus (Roget,
1911), WordNet and associative networks (Steyvers and Tenenbaum,
2005).
Modelling of Associative Network
Three Models of Associative Network
Concept-based model
Vector-based models
Multidimensional scaling (Torgerson, 1958)
Latent Semantic Analysis (Landauer, Dumais, 1997)
Concept-based model
Data
Core of the network: 4,692 lemmas with 59,392 connections
The structure mirrors the associative network (nodes are lemmas, edges
are associations)
Activity accumulation
1. Initial state: random activity
2. Spreading of activation: $S_i^t = S_i^{t-1} + \sum_j w_{ij} S_j^{t-1}$, where $S_i^t$ is the
activity of neuron $i$ at moment $t$.
3. When the activation exceeds the threshold, the neuron produces the reaction
and its activity is reset: $S_i^t = 0$.
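The three steps above can be sketched with NumPy. This is an illustration under stated assumptions, not the author's implementation. The weight matrix, threshold and step count are made up. The update uses `weights[i, j]` exactly as the indices appear in the formula. A fixed initial state replaces the random one so the run is reproducible.

```python
import numpy as np


def spread_activation(weights, state, threshold, steps):
    """Iterate S_i^t = S_i^(t-1) + sum_j w_ij * S_j^(t-1).

    After each step, every node whose activity exceeds the threshold
    "fires" (produces its reaction) and is reset to 0.
    Returns the final state and the list of fired node indices per step.
    """
    state = np.asarray(state, dtype=float).copy()
    fired_log = []
    for _ in range(steps):
        state = state + weights @ state
        fired = np.flatnonzero(state > threshold)
        state[fired] = 0.0
        fired_log.append(fired.tolist())
    return state, fired_log


# Hypothetical 3-node chain with symmetric weights.
toy_weights = np.array([
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.0],
])
final_state, fired_log = spread_activation(
    toy_weights, state=[1.0, 0.0, 0.0], threshold=1.2, steps=2
)
```

Because activity only accumulates between resets, the threshold controls how often nodes fire, which is why its choice matters so much for this model.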
Pros and cons
Pros
very simple model
easy to understand
easy to modify (no need to re-estimate the model)
Cons
unclear how to choose the threshold value (we ran a series of
experiments to find the optimal value)
once activation is released, should we also modify the
neighbouring neurons?
Multidimensional scaling
From concept to vector
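Distance matrix:
$$\Delta = \begin{pmatrix}
\delta_{1,1} & \delta_{1,2} & \cdots & \delta_{1,I} \\
\delta_{2,1} & \delta_{2,2} & \cdots & \delta_{2,I} \\
\vdots & \vdots & \ddots & \vdots \\
\delta_{I,1} & \delta_{I,2} & \cdots & \delta_{I,I}
\end{pmatrix}$$
where $I$ is the number of objects (words).
Our goal is to find vectors $x_1, \dots, x_I \in \mathbb{R}^N$ such that
$\|x_i - x_j\| \approx \delta_{ij}$ for all $i, j \in \{1, \dots, I\}$.
In other words:
$$\min_{x_1, \dots, x_I} \sum_{i<j} \left( \|x_i - x_j\| - \delta_{ij} \right)^2.$$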
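One common way to obtain such vectors is classical (Torgerson) scaling, sketched below with NumPy. Note the swapped-in technique: classical MDS solves an eigenvalue problem on the double-centred squared distances rather than minimising the stress objective above directly. The two agree when the distances are exactly Euclidean. The toy matrix is hypothetical.

```python
import numpy as np


def classical_mds(delta, dims):
    """Embed I objects in `dims` dimensions from an I x I distance matrix."""
    delta = np.asarray(delta, dtype=float)
    n = delta.shape[0]
    # Double-centre the squared distances to get a Gram (inner product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (delta ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the `dims` largest eigenvalues; clip tiny negatives from rounding.
    top = np.argsort(eigvals)[::-1][:dims]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


# Hypothetical distances between three objects (a 3-4-5 right triangle).
toy_delta = np.array([
    [0.0, 3.0, 4.0],
    [3.0, 0.0, 5.0],
    [4.0, 5.0, 0.0],
])
coords = classical_mds(toy_delta, dims=2)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

The embedding is unique only up to rotation and reflection, so it is the recovered distances that should be compared with the input, not the coordinates.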
Latent Semantic Analysis
From concept to vector-2
A technique for analysing relationships between a set of documents
and the terms they contain by producing a set of concepts related
to the documents and terms.
In my case
Terms are lemmas; a document is the set of associations for a given
stimulus.
Inputs: term-document matrix with TF*IDF values.
Term frequency: $TF = w_{ij} = \frac{\mathrm{freq}_{ij}}{\sum_{j=1}^{N} \mathrm{freq}_{ij}}$.
Inverse document frequency: $IDF = \log \frac{|S|}{|\{s \in S : r \in s\}|}$, where $|S|$ is the total number of
stimuli.
Singular Value Decomposition then yields the vector representations.
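The pipeline above (TF*IDF weighting followed by a truncated SVD) can be sketched with NumPy. The layout of the count matrix, the function name `lsa_vectors` and the toy counts are assumptions made for illustration. The sketch also assumes that every stimulus has at least one reaction and every reaction occurs at least once.

```python
import numpy as np


def lsa_vectors(freq, dims):
    """Return one `dims`-dimensional vector per reaction lemma.

    freq[i, j] is the number of times reaction j was given to stimulus i.
    """
    freq = np.asarray(freq, dtype=float)
    # TF: counts normalised per stimulus (document).
    tf = freq / freq.sum(axis=1, keepdims=True)
    # IDF: log(total stimuli / stimuli whose responses contain the reaction).
    doc_freq = (freq > 0).sum(axis=0)
    idf = np.log(freq.shape[0] / doc_freq)
    # Term-document matrix: rows are reactions, columns are stimuli.
    tfidf = (tf * idf).T
    u, s, _vt = np.linalg.svd(tfidf, full_matrices=False)
    return u[:, :dims] * s[:dims]


# Hypothetical counts: 4 stimuli (rows) x 3 reactions (columns).
toy_freq = np.array([
    [2, 1, 0],
    [0, 1, 3],
    [1, 0, 1],
    [1, 1, 0],
])
vectors = lsa_vectors(toy_freq, dims=2)
```

Keeping only the leading singular directions is what compresses the sparse term-document matrix into dense, comparable word vectors.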
Clustering
k-means
So, we've got vectors. What's next?
Let's evaluate similarity.
First, set a distance metric, e.g. $d_{ij} = \sqrt[r]{\sum_{k=1}^{N} |x_{ik} - x_{jk}|^{r}}$.
Then use it with k-means clustering:
$$\min \sum_{i=1}^{k} \sum_{x_j \in S_i} (x_j - \mu_i)^2,$$
where $k$ is the number of clusters, $S_i$ are the evaluated clusters, and $\mu_i$ are
the centres of the clusters.
So, the technique is based on finding the nearest cluster.
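A minimal sketch of the clustering step, using the Minkowski distance with r = 2 and plain Lloyd iterations. The initial centres are passed in explicitly to keep the example deterministic. How they were chosen in the actual experiments is not stated in the slides, and the toy points are hypothetical.

```python
import numpy as np


def minkowski(a, b, r=2):
    """Minkowski distance of order r along the last axis."""
    return np.sum(np.abs(a - b) ** r, axis=-1) ** (1.0 / r)


def k_means(points, initial_centers, iterations=20):
    """Alternate nearest-centre assignment and centre recomputation."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(initial_centers, dtype=float).copy()
    for _ in range(iterations):
        # Distance from every point to every centre, shape (points, centres).
        dists = minkowski(points[:, None, :], centers[None, :, :])
        labels = dists.argmin(axis=1)
        for c in range(len(centers)):
            members = points[labels == c]
            if len(members):  # keep the old centre if a cluster is empty
                centers[c] = members.mean(axis=0)
    return labels, centers


# Hypothetical 2-D word vectors forming two well-separated groups.
toy_points = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
])
labels, centers = k_means(toy_points, initial_centers=toy_points[[0, 3]])
```

The mean minimises the within-cluster sum of squares only for r = 2, so other values of r would call for a different centre update.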
Pros and cons
Pros
easy to operate with vectors: add, multiply, subtract, etc.
possible to set the preferred dimensionality and visualize
Cons
problem with storage: matrices are huge
complexity: MDS and LSA are based on SVD, which takes $O(n^3)$ time
choosing the optimal number of clusters and dimensionality
"Tip of the tongue" application
Data and method
Data: RAT + Abramov's synonym dictionary
Method: LSA + k-means
Usage of associative thesauri for solving tasks related
to the “tip of the tongue” phenomenon
Ekaterina Vylomova
Bauman Moscow State Technical University, Moscow State University of Printing Arts
INTRODUCTION
The tip-of-the-tongue (TOT) phenomenon is the failure to retrieve a word
from memory, combined with partial recall and the feeling that retrieval is
imminent. People in a tip-of-the-tongue state can often recall one or more
features of the target word, such as the first letter, its syllabic stress, and
words similar in sound and/or meaning.
• TOT appears to be universal (Brennen et al., 2007)
• An occasional tip-of-the-tongue state is normal for people of all ages
• TOT becomes more frequent as people age.
R. Brown, D. McNeill and A. Luria treat recalling and naming words as a
probabilistic choice of a word from a chain of involuntary associations,
and relate these processes to the structure of human semantic memory.
DATA
Abramov. Dictionary of Russian synonyms and similar expressions,
1890-1999
19,297 words and phrases
18,136 synonym articles
Karaulov Y., Tarasov E., Sorokin Y., Ufimtseva N., Cherkasova G.
1999. Associative thesaurus of modern Russian language. RAS,
Moscow.
56,540 associative pairs
50,923 associative pairs (after lemmatization)
26,803 lemmas
Overall (synonym and associative pairs combined): 316,018
METHODOLOGY
1. Lemmatization: RAT and the Abramov dictionary are lemmatized using the Yandex mystem stemmer.
2. LSA + k-NN: apply Latent Semantic Analysis to get vector representations of words, then k-nearest neighbours for clustering.
3. Result: clusters containing words that are similar in meaning and association.
EXAMPLE
"Hmm... What's the name of that Ukrainian food?"
Associative thesauri + Abramov dictionary:
комильфо (comme il faut) - приличие (propriety)
After clustering:
не выходить из пределов благопристойности (to stay within the bounds of decency)  0.001
степенный (staid)            0.591
чинный (decorous)            0.591
благочинный (orderly)        0.646
бонтонный (well-mannered)    0.646
комильфотный (comme il faut) 0.646
пристойный (seemly)          0.646
благонравный (well-behaved)  0.684
благоприличный (proper)      0.684
благопристойный (decent)     0.684
корректный (correct)         0.684
FUTURE PLANS
1. Expand synonym and associative thesauri with new ones
2. Add first letter filtering (see above)
3. Add hyponyms and hyperonyms
RHF #12-04-12039B
REFERENCES
1. Brown, R., and McNeill, D. (1966). The tip of the tongue phenomenon.
Journal of Verbal Learning and Verbal Behavior 5, 325-337.
2. Karaulov Yu.N., Tarasov E.F., Sorokin Yu.A., Ufimtseva N.V., Cherkasova
G.A. (1999). Associative thesaurus of the modern Russian language. RAS.
(in Russian)
3. Luria A.R. (1979). Language and Consciousness. Ed. by Khomskaya E.D., MSU,
Moscow, 320 pp. (in Russian)
E-mail: evylomova@gmail.com
Future work
RAT'10
Time frame: 2009-2010, about 20 years after the first experiment.
Location: different regions of Russia.
Stimuli: the 1,000 most frequent words in Russian.
Participants: young people aged 17-25.