Semantic Analysis in Language Technology
http://stp.lingfil.uu.se/~santinim/sais/2016/sais_2016.htm

Word Sense Disambiguation

Marina Santini
santinim@stp.lingfil.uu.se

Department of Linguistics and Philology
Uppsala University, Uppsala, Sweden

Spring 2016

1
Previous Lecture: Word Senses

•  Homonymy, polysemy, synonymy, metonymy, etc.

Practical activities:
1) SELECTIONAL RESTRICTIONS
2) MANUAL DISAMBIGUATION OF EXAMPLES USING SENSEVAL SENSES

AIMS OF PRACTICAL ACTIVITIES:
•  STUDENTS SHOULD GET ACQUAINTED WITH REAL DATA
•  EXPLORATION OF APPLICATIONS, RESOURCES AND METHODS.

2
No preset solutions (this slide is to tell you that you are doing well :) )

•  Whatever your experience with the data, it is a valuable experience:
•  Disappointment
•  Frustration
•  Feeling lost
•  Happiness
•  Power
•  Excitement
•  …

•  All the students so far (also in previous courses) have presented their own solutions… many different solutions, and that is OK…

3
J&M's own solutions: Selectional Restrictions (just for your records; it does not mean they are necessarily better than yours…)

4
Other possible solutions…

•  Kiss → concrete sense: touching with lips/mouth (pursed lips?)
•  animate kiss [using lips/mouth] animate/inanimate
•  Ex: he kissed her;
•  The dolphin kissed the kid
•  Why does the pope kiss the ground after he disembarks...

•  Kiss → figurative sense: touching
•  animate kiss inanimate
•  Ex: "Walk as if you are kissing the Earth with your feet."

5
NO solution or comments provided for Senseval

•  All your impressions and feelings are plausible and acceptable :)

6
Remember that in both activities…

•  You have experienced cases of POLYSEMY!
•  YOU HAVE TRIED TO DISAMBIGUATE THE SENSES MANUALLY, I.E. WITH YOUR HUMAN SKILLS…

7
Previous lecture: end

8
Today: Word Sense Disambiguation (WSD)

•  Given:
•  A word in context;
•  A fixed inventory of potential word senses;
•  Create a system that automatically decides which sense of the word is correct in that context.
Word Sense Disambiguation: Definition

•  Word Sense Disambiguation (WSD) is the TASK of determining the correct sense of a word in context.
•  It is an automatic task: we create a system that automatically disambiguates the senses for us.
•  Useful for many NLP tasks: information retrieval (apple [the company] or apple [the fruit]?), question answering (does United serve Philadelphia?), machine translation (eng. "bat" → It. pipistrello [the animal] or mazza [the club]?)

10
Anecdote: the poison apple

•  In 1954, Alan Turing died after biting into an apple laced with cyanide
•  It was said that this half-bitten apple inspired the Apple logo… but apparently that is a legend :)
•  http://mentalfloss.com/article/64049/did-alan-turing-inspire-apple-logo

11
Be alert…

•  Word sense ambiguity is pervasive!!!

12
Acknowledgements

Most slides borrowed or adapted from:
Dan Jurafsky and James H. Martin
Dan Jurafsky and Christopher Manning, Coursera

J&M (2015, draft): https://web.stanford.edu/~jurafsky/slp3/
Outline: WSD Methods

•  Thesaurus/Dictionary Methods
•  Supervised Machine Learning
•  Semi-Supervised Learning (self-reading)

14
Word Sense Disambiguation
Dictionary and Thesaurus Methods
The Simplified Lesk algorithm

•  Let's disambiguate "bank" in this sentence:
The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable-rate mortgage securities.
•  given the following two WordNet senses:
if overlap > max-overlap then
  max-overlap ← overlap
  best-sense ← sense
end
return(best-sense)

Figure 16.6 The Simplified Lesk algorithm. The COMPUTEOVERLAP function returns the number of words in common between two sets, ignoring function words or other words on a stop list. The original Lesk algorithm defines the context in a more complex way. The Corpus Lesk algorithm weights each overlapping word w by its log P(w) and includes labeled training corpus data in the signature.

bank1  Gloss: a financial institution that accepts deposits and channels the money into lending activities
       Examples: "he cashed a check at the bank", "that bank holds the mortgage on my home"
bank2  Gloss: sloping land (especially the slope beside a body of water)
       Examples: "they pulled the canoe up on the bank", "he sat on the bank of the river and watched the currents"
The Simplified Lesk algorithm

The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable-rate mortgage securities.
Choose the sense with most word overlap between gloss and context
(not counting function words)
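The gloss-overlap idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not NLTK's implementation: the stop list, tokenizer, and function names are my own, and the two sense "signatures" are the WordNet glosses plus examples from the slide.

```python
# A minimal sketch of the Simplified Lesk algorithm (helper names are mine).
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "at", "to", "and", "that",
              "he", "she", "it", "will", "my", "up", "for", "can"}

def tokenize(text):
    """Lowercase, split, and drop stop words and simple punctuation."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    return {w for w in words if w not in STOP_WORDS}

def simplified_lesk(senses, sentence):
    """Pick the sense whose gloss+examples overlap most with the context."""
    context = tokenize(sentence)
    best_sense, max_overlap = None, -1
    for name, signature_text in senses.items():
        signature = tokenize(signature_text)
        overlap = len(signature & context)  # COMPUTEOVERLAP: shared content words
        if overlap > max_overlap:
            max_overlap, best_sense = overlap, name
    return best_sense

# Gloss + examples for the two WordNet senses of "bank" from the slide:
senses = {
    "bank1": ("a financial institution that accepts deposits and channels the "
              "money into lending activities he cashed a check at the bank "
              "that bank holds the mortgage on my home"),
    "bank2": ("sloping land especially the slope beside a body of water "
              "they pulled the canoe up on the bank he sat on the bank of "
              "the river and watched the currents"),
}
sentence = ("The bank can guarantee deposits will eventually cover future "
            "tuition costs because it invests in adjustable-rate mortgage securities.")
print(simplified_lesk(senses, sentence))  # bank1, via "deposits" and "mortgage"
```

On this sentence, bank1 overlaps on "bank", "deposits", and "mortgage" (3 words), while bank2 overlaps only on "bank" (1), so the financial sense wins.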
  
Drawback

•  Glosses and examples might be too short and may not provide enough chance to overlap with the context of the word to be disambiguated.

18
The Corpus(-based) Lesk algorithm

•  Assumes we have some sense-labeled data (like SemCor)
•  Take all the sentences with the relevant word sense:
These short, "streamlined" meetings usually are sponsored by local banks1, Chambers of Commerce, trade associations, or other civic organizations.
•  Now add these to the gloss + examples for each sense, and call it the "signature" of a sense. Basically, it is an expansion of the dictionary entry.
•  Choose the sense with most word overlap between context and signature (i.e. the context words provided by the resources).
Corpus Lesk: IDF weighting

•  Instead of just removing function words
•  Weigh each word by its 'promiscuity' across documents
•  Down-weights words that occur in every 'document' (gloss, example, etc.)
•  These are generally function words, but it is a more fine-grained measure
•  Weigh each overlapping word by inverse document frequency (IDF).
20	
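The weighting can be sketched as follows, treating each gloss/example as a "document". The helper name and toy documents are my own illustration; with IDF(w) = log(N / df(w)), a word that appears in every document gets weight 0.

```python
import math

# Sketch of IDF weighting over sense "documents" (glosses/examples).
def idf_weights(documents):
    """IDF(w) = log(N / df(w)); words occurring in every document weigh ~0."""
    n = len(documents)
    df = {}
    for doc in documents:
        for w in set(doc):            # count each word once per document
            df[w] = df.get(w, 0) + 1
    return {w: math.log(n / c) for w, c in df.items()}

docs = [
    "the bank holds the mortgage".split(),
    "the canoe on the bank".split(),
    "the river bank".split(),
]
w = idf_weights(docs)
print(w["the"])       # 0.0 -- occurs in every "document"
print(w["mortgage"])  # log(3) ~ 1.10 -- occurs in only one
```

So an overlap on "mortgage" counts for much more than an overlap on "the", which is exactly the fine-grained replacement for a hard stop list.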
  
Graph-based methods

•  First, WordNet can be viewed as a graph
•  senses are nodes
•  relations (hypernymy, meronymy) are edges
•  Also add edges between a word and unambiguous gloss words

21
[Figure: a fragment of the WordNet graph around "drink" — sense nodes such as toast_n^4, drink_v^1, drinker_n^1, drinking_n^1, potation_n^1, sip_n^1, sip_v^1, beverage_n^1, milk_n^1, liquid_n^1, food_n^1, drink_n^1, helping_n^1, sup_v^1, consumption_n^1, consumer_n^1, consume_v^1, connected by edges]
An undirected graph is a set of nodes that are connected together by bidirectional edges (lines).
How to use the graph for WSD

"She drank some milk"

•  choose the most central sense
(several algorithms have been proposed recently)

22
[Figure: the subgraph for "drink" and "milk" — candidate sense nodes drink_v^1…drink_v^5 and milk_n^1…milk_n^4, with drink_v^1 and milk_n^1 linked through beverage_n^1, food_n^1, nutriment_n^1, drinker_n^1, boozing_n^1, drink_n^1]
Word Meaning and Similarity
Word Similarity: Thesaurus Methods
Word Similarity

•  Synonymy: a binary relation
•  Two words are either synonymous or not
•  Similarity (or distance): a looser metric
•  Two words are more similar if they share more features of meaning
•  Similarity is properly a relation between senses
•  We do not say "the word 'bank' is similar to the word 'slope'"; instead we say:
•  Bank1 is similar to fund3
•  Bank2 is similar to slope5
•  But we'll compute similarity over both words and senses
Why word similarity

•  Information retrieval
•  Question answering
•  Machine translation
•  Natural language generation
•  Language modeling
•  Automatic essay grading
•  Plagiarism detection
•  Document clustering
Word similarity and word relatedness

•  We often distinguish word similarity from word relatedness
•  Similar words: near-synonyms
•  car, bicycle: similar
•  Related words: can be related any way
•  car, gasoline: related, not similar
Cf. synonyms: car & automobile
Two classes of similarity algorithms

•  Thesaurus-based algorithms
•  Are words "nearby" in the hypernym hierarchy?
•  Do words have similar glosses (definitions)?
•  Distributional algorithms: next time!
•  Do words have similar distributional contexts?
Path-based similarity

•  Two concepts (senses/synsets) are similar if they are near each other in the thesaurus hierarchy
•  = have a short path between them
•  concepts have path 1 to themselves
Refinements to path-based similarity

•  pathlen(c1,c2) = (distance metric) = 1 + number of edges in the shortest path in the hypernym graph between sense nodes c1 and c2

•  simpath(c1,c2) = 1 / pathlen(c1,c2)

•  wordsim(w1,w2) = max over c1 ∈ senses(w1), c2 ∈ senses(w2) of sim(c1,c2)

Sense similarity metric: 1 over the distance!
Word similarity metric: max similarity among pairs of senses.
For all senses of w1 and all senses of w2, take the similarity between each of the senses of w1 and each of the senses of w2, and then take the maximum similarity between those pairs.
Example: path-based similarity

simpath(c1,c2) = 1/pathlen(c1,c2)

simpath(nickel, coin) = 1/2 = .5
simpath(fund, budget) = 1/2 = .5
simpath(nickel, currency) = 1/4 = .25
simpath(nickel, money) = 1/6 = .17
simpath(coinage, Richter scale) = 1/6 = .17
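A toy version of pathlen and simpath can be written with a breadth-first search over a made-up hypernym fragment. The links below are illustrative, not WordNet's actual hierarchy, but they reproduce the first and third example scores above.

```python
from collections import deque

# Toy hypernym edges (child -> parent), loosely modeled on the slide's
# coin/money fragment; the exact links are my own illustration.
hypernym = {
    "nickel": "coin", "dime": "coin", "coin": "coinage",
    "coinage": "currency", "currency": "money",
    "budget": "fund", "fund": "money", "money": "medium_of_exchange",
}

def pathlen(c1, c2):
    """1 + number of edges on the shortest path, treating edges as undirected."""
    neighbors = {}
    for child, parent in hypernym.items():
        neighbors.setdefault(child, set()).add(parent)
        neighbors.setdefault(parent, set()).add(child)
    dist = {c1: 0}
    queue = deque([c1])
    while queue:
        node = queue.popleft()
        if node == c2:
            return 1 + dist[node]       # a concept has path 1 to itself
        for nxt in neighbors.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return float("inf")                 # unconnected concepts

def sim_path(c1, c2):
    return 1 / pathlen(c1, c2)

print(sim_path("nickel", "coin"))      # 1 edge  -> 1/2 = 0.5
print(sim_path("nickel", "currency"))  # 3 edges -> 1/4 = 0.25
```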
Problem with basic path-based similarity

•  Assumes each link represents a uniform distance
•  But nickel to money seems to us to be closer than nickel to standard
•  Nodes high in the hierarchy are very abstract
•  We instead want a metric that
•  Represents the cost of each edge independently
•  Words connected only through abstract nodes are less similar
Information content similarity metrics

•  In simple words:
•  We define the probability of a concept C as the probability that a randomly selected word in a corpus is an instance of that concept.
•  Basically, for each random word in a corpus we compute how probable it is that it belongs to a certain concept.

Resnik 1995. Using information content to evaluate semantic similarity in a taxonomy. IJCAI
Formally: Information content similarity metrics

•  Let's define P(c) as:
•  The probability that a randomly selected word in a corpus is an instance of concept c
•  Formally: there is a distinct random variable, ranging over words, associated with each concept in the hierarchy
•  for a given concept, each observed noun is either
•  a member of that concept with probability P(c)
•  not a member of that concept with probability 1-P(c)
•  All words are members of the root node (Entity)
•  P(root) = 1
•  The lower a node in the hierarchy, the lower its probability

Resnik 1995. Using information content to evaluate semantic similarity in a taxonomy. IJCAI
Information content similarity

•  For every concept (e.g. "natural elevation"), we count all the words in that concept, and then we normalize by the total number of words in the corpus.
•  we get a probability value that tells us how probable it is that a random word is an instance of that concept

P(c) = ( Σ_{w ∈ words(c)} count(w) ) / N

[Figure: hierarchy fragment — entity > … > geological-formation > {natural elevation > {hill, ridge}, shore > coast, cave > grotto}]

In order to compute the probability of the term "natural elevation", we take ridge, hill + natural elevation itself.
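The counting scheme can be sketched like this: a concept's count sums the corpus counts of its own word plus everything it subsumes, then divides by N. The mini-hierarchy and the counts are invented for illustration, not taken from any real corpus.

```python
# Sketch of P(c) over a toy hierarchy (structure and counts are made up).
children = {
    "geological_formation": ["natural_elevation", "shore"],
    "natural_elevation": ["hill", "ridge"],
    "shore": ["coast"],
}
word_count = {"hill": 20, "ridge": 5, "natural_elevation": 5,
              "shore": 10, "coast": 10, "geological_formation": 0}
N = sum(word_count.values())  # total corpus tokens = 50

def concept_count(c):
    """count(c): the concept's own word count plus all subsumed counts."""
    return word_count.get(c, 0) + sum(concept_count(k) for k in children.get(c, []))

def p(c):
    return concept_count(c) / N

print(p("natural_elevation"))     # (20 + 5 + 5) / 50 = 0.6
print(p("geological_formation"))  # subsumes everything here: 50/50 = 1.0
```

Note how P(c) grows monotonically as you move up the hierarchy, which is exactly the "the lower the node, the lower its probability" property from the previous slide.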
  
Information content similarity

•  WordNet hierarchy augmented with probabilities P(c)

D. Lin. 1998. An Information-Theoretic Definition of Similarity. ICML 1998
Information content: definitions

1.  Information content:
    IC(c) = -log P(c)
2.  Most informative subsumer (lowest common subsumer):
    LCS(c1,c2) = the most informative (lowest) node in the hierarchy subsuming both c1 and c2
IC aka…

•  A lot of people prefer the term surprisal to information or to information content.

-log p(x)

It measures the amount of surprise generated by the event x.
The smaller the probability of x, the bigger the surprisal is.

It's helpful to think about it this way, particularly for linguistics examples.

37
Using information content for similarity: the Resnik method

•  The similarity between two words is related to their common information
•  The more two words have in common, the more similar they are
•  Resnik: measure common information as:
•  The information content of the most informative (lowest) subsumer (MIS/LCS) of the two nodes
•  simresnik(c1,c2) = -log P(LCS(c1,c2))

Philip Resnik. 1995. Using Information Content to Evaluate Semantic Similarity in a Taxonomy. IJCAI 1995.
Philip Resnik. 1999. Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language. JAIR 11, 95-130.
Dekang Lin method

•  Intuition: similarity between A and B is not just what they have in common
•  The more differences between A and B, the less similar they are:
•  Commonality: the more A and B have in common, the more similar they are
•  Difference: the more differences between A and B, the less similar
•  Commonality: IC(common(A,B))
•  Difference: IC(description(A,B)) - IC(common(A,B))

Dekang Lin. 1998. An Information-Theoretic Definition of Similarity. ICML
Dekang Lin similarity theorem

•  The similarity between A and B is measured by the ratio between the amount of information needed to state the commonality of A and B and the information needed to fully describe what A and B are

simLin(A,B) ∝ IC(common(A,B)) / IC(description(A,B))

•  Lin (altering Resnik) defines IC(common(A,B)) as 2 x the information of the LCS

simLin(c1,c2) = 2 log P(LCS(c1,c2)) / ( log P(c1) + log P(c2) )

Lin similarity function

simLin(hill, coast) = 2 log P(geological-formation) / ( log P(hill) + log P(coast) )
                    = 2 ln 0.00176 / ( ln 0.0000189 + ln 0.0000216 )
                    = .59
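Both IC-based measures can be sketched directly from the formulas, plugging in the P values of the hill/coast example above. The hard-coded LCS table stands in for a real hierarchy lookup; everything else follows the definitions.

```python
import math

# Sketch of Resnik and Lin similarity from concept probabilities P(c).
# P values are the slide's hill/coast example; the LCS table is hard-coded.
P = {"hill": 0.0000189, "coast": 0.0000216, "geological_formation": 0.00176}
LCS = {("hill", "coast"): "geological_formation"}

def sim_resnik(c1, c2):
    """IC of the lowest common subsumer: -log P(LCS(c1,c2))."""
    return -math.log(P[LCS[(c1, c2)]])

def sim_lin(c1, c2):
    """2 log P(LCS(c1,c2)) / (log P(c1) + log P(c2))."""
    return 2 * math.log(P[LCS[(c1, c2)]]) / (math.log(P[c1]) + math.log(P[c2]))

print(round(sim_resnik("hill", "coast"), 2))  # -ln(0.00176) ~ 6.34
print(round(sim_lin("hill", "coast"), 2))     # 0.59, matching the slide
```

Resnik returns an unbounded IC value (how specific the shared ancestor is), while Lin normalizes by the two concepts' own ICs, giving a score in [0, 1].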
The (extended) Lesk Algorithm

•  A thesaurus-based measure that looks at glosses
•  Two concepts are similar if their glosses contain similar words
•  Drawing paper: paper that is specially prepared for use in drafting
•  Decal: the art of transferring designs from specially prepared paper to a wood or glass or metal surface
•  For each n-word phrase that's in both glosses
•  Add a score of n²
•  "paper" and "specially prepared" → 1 + 2² = 5
•  Compute overlap also for other relations
•  glosses of hypernyms and hyponyms
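The n² phrase-overlap score can be sketched with a greedy match that tries longer phrases first, so "specially prepared" counts as one 2-gram (score 4) rather than two unigrams (score 2). The matching strategy and helper name are mine; real implementations differ in how they handle overlapping phrases.

```python
# Greedy sketch of the extended-Lesk overlap score: each longest common
# n-word phrase shared by two glosses contributes n**2.
def phrase_overlap_score(gloss1, gloss2):
    a, b = gloss1.lower().split(), gloss2.lower().split()
    score = 0
    used = [False] * len(a)           # tokens of gloss1 already matched
    for n in range(len(a), 0, -1):    # longest phrases first
        for i in range(len(a) - n + 1):
            if any(used[i:i + n]):
                continue
            phrase = a[i:i + n]
            for j in range(len(b) - n + 1):
                if b[j:j + n] == phrase:
                    score += n * n
                    for k in range(i, i + n):
                        used[k] = True
                    break
    return score

g1 = "paper that is specially prepared for use in drafting"
g2 = "the art of transferring designs from specially prepared paper to a surface"
print(phrase_overlap_score(g1, g2))  # "specially prepared" (2**2) + "paper" (1) = 5
```

This reproduces the slide's drawing paper / decal example: one shared 2-gram and one shared unigram give 4 + 1 = 5.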
  
Summary: thesaurus-based similarity
Libraries for computing thesaurus-based similarity

•  NLTK
•  http://nltk.github.com/api/nltk.corpus.reader.html?highlight=similarity - nltk.corpus.reader.WordNetCorpusReader.res_similarity
•  WordNet::Similarity
•  http://wn-similarity.sourceforge.net/
•  Web-based interface:
•  http://marimba.d.umn.edu/cgi-bin/similarity/similarity.cgi

44
  
Machine Learning based approach
Basic idea

•  If we have data that has been hand-labelled with correct word senses, we can use a supervised learning approach and learn from it!
•  We need to extract features and train a classifier
•  The output of training is an automatic system capable of assigning sense labels to unlabelled words in context.

46
  
Two variants of WSD task

•  Lexical Sample task
•  (we need labelled corpora for individual senses)
•  Small pre-selected set of target words (e.g. difficulty)
•  An inventory of senses for each word
•  Supervised machine learning: train a classifier for each word

•  All-words task
•  (each word in each sentence is labelled with a sense)
•  Every word in an entire text
•  A lexicon with senses for each word

SENSEVAL 1-2-3
Supervised Machine Learning Approaches

•  Summary of what we need:
•  the tag set ("sense inventory")
•  the training corpus
•  A set of features extracted from the training corpus
•  A classifier
Supervised WSD 1: WSD Tags

•  What's a tag?
A dictionary sense?
•  For example, for WordNet an instance of "bass" in a text has 8 possible tags or labels (bass1 through bass8).
8 senses of "bass" in WordNet

1.  bass - (the lowest part of the musical range)
2.  bass, bass part - (the lowest part in polyphonic music)
3.  bass, basso - (an adult male singer with the lowest voice)
4.  sea bass, bass - (flesh of lean-fleshed saltwater fish of the family Serranidae)
5.  freshwater bass, bass - (any of various North American lean-fleshed freshwater fishes especially of the genus Micropterus)
6.  bass, bass voice, basso - (the lowest adult male singing voice)
7.  bass - (the member with the lowest range of a family of musical instruments)
8.  bass - (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)
SemCor

<wf pos=PRP>He</wf>
<wf pos=VB lemma=recognize wnsn=4 lexsn=2:31:00::>recognized</wf>
<wf pos=DT>the</wf>
<wf pos=NN lemma=gesture wnsn=1 lexsn=1:04:00::>gesture</wf>
<punc>.</punc>

SemCor: 234,000 words from the Brown Corpus, manually tagged with WordNet senses

51
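Pulling (word, lemma, sense-number) triples out of SemCor-style `<wf>` lines can be sketched with the stdlib `re` module. The regexes are my own quick-and-dirty illustration, not an official SemCor parser.

```python
import re

# Sketch: extract (word, lemma, wnsn) from SemCor-style <wf> markup.
WF = re.compile(r'<wf ([^>]*)>([^<]*)</wf>')   # attributes, surface word
ATTR = re.compile(r'(\w+)=(\S+)')              # key=value pairs inside the tag

def parse_semcor(text):
    tokens = []
    for attrs, word in WF.findall(text):
        d = dict(ATTR.findall(attrs))
        tokens.append((word, d.get("lemma"), d.get("wnsn")))
    return tokens

sample = """<wf pos=PRP>He</wf>
<wf pos=VB lemma=recognize wnsn=4 lexsn=2:31:00::>recognized</wf>
<wf pos=DT>the</wf>
<wf pos=NN lemma=gesture wnsn=1 lexsn=1:04:00::>gesture</wf>"""
print(parse_semcor(sample))
# [('He', None, None), ('recognized', 'recognize', '4'),
#  ('the', None, None), ('gesture', 'gesture', '1')]
```

Words without a `lemma`/`wnsn` attribute (function words like "He" and "the") come back with None, which is exactly the untagged-word case.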
Supervised WSD: Extract feature vectors

Intuition from Warren Weaver (1955):
"If one examines the words in a book, one at a time as through an opaque mask with a hole in it one word wide, then it is obviously impossible to determine, one at a time, the meaning of the words…
But if one lengthens the slit in the opaque mask, until one can see not only the central word in question but also say N words on either side, then if N is large enough one can unambiguously decide the meaning of the central word…
The practical question is: ``What minimum value of N will, at least in a tolerable fraction of cases, lead to the correct choice of meaning for the central word?" (the window)
Feature vectors

•  Vectors of sets of feature/value pairs

Two kinds of features in the vectors

•  Collocational features and bag-of-words features
•  Collocational/Paradigmatic
•  Features about words at specific positions near the target word
•  Often limited to just word identity and POS
•  Bag-of-words
•  Features about words that occur anywhere in the window (regardless of position)
•  Typically limited to frequency counts

Generally speaking, a collocation is a sequence of words or terms that co-occur more often than would be expected by chance. But here the meaning is not exactly this…
  
Examples

•  Example text (WSJ):
An electric guitar and bass player stand off to one side not really part of the scene
•  Assume a window of +/- 2 from the target
Collocational features

•  Position-specific information about the words and collocations in the window
•  guitar and bass player stand
•  word 1,2,3-grams in a window of ±3 is common
Local lexical and grammatical information can often accurately isolate a given sense. For example, consider the ambiguous word bass in the following WSJ sentence:

(16.17) An electric guitar and bass player stand off to one side, not really part of the scene, just as a sort of nod to gringo expectations perhaps.

A collocational feature vector, extracted from a window of two words to the right and left of the target word, made up of the words themselves, their respective parts-of-speech, and pairs of words, that is,

[w_{i-2}, POS_{i-2}, w_{i-1}, POS_{i-1}, w_{i+1}, POS_{i+1}, w_{i+2}, POS_{i+2}, w^{i-1}_{i-2}, w^{i+1}_{i}]

would yield the following vector:

[guitar, NN, and, CC, player, NN, stand, VB, and guitar, player stand]

High-performing systems generally use POS tags and word collocations of length 1, 2, and 3 from a window of words 3 to the left and 3 to the right (Zhong and Ng).
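The ±2 feature vector above can be sketched as a small function over a POS-tagged sentence. The hand-tagged sentence and helper name are mine, and the two pair features are emitted in left-to-right reading order ("guitar and", "player stand").

```python
# Sketch of the collocational feature vector: words and POS tags at
# positions -2, -1, +1, +2 around the target, plus the two word pairs.
def collocational_features(tagged, i):
    """tagged: list of (word, pos) tuples; i: index of the target word."""
    (w_m2, p_m2), (w_m1, p_m1) = tagged[i - 2], tagged[i - 1]
    (w_p1, p_p1), (w_p2, p_p2) = tagged[i + 1], tagged[i + 2]
    return [w_m2, p_m2, w_m1, p_m1, w_p1, p_p1, w_p2, p_p2,
            f"{w_m2} {w_m1}",   # the two words preceding the target
            f"{w_p1} {w_p2}"]   # the two words following the target

# Hand-built POS tags for the WSJ example; "bass" is at index 4.
tagged = [("an", "DT"), ("electric", "JJ"), ("guitar", "NN"), ("and", "CC"),
          ("bass", "NN"), ("player", "NN"), ("stand", "VB"), ("off", "RP")]
print(collocational_features(tagged, 4))
# ['guitar', 'NN', 'and', 'CC', 'player', 'NN', 'stand', 'VB',
#  'guitar and', 'player stand']
```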
Bag-of-words features

•  "an unordered set of words" – position ignored
•  Choose a vocabulary: a useful subset of words in a training corpus
•  Either: the count of how often each of those terms occurs in a given window OR just a binary "indicator" 1 or 0
Co-Occurrence Example

•  Assume we've settled on a possible vocabulary of 12 words in "bass" sentences:

[fishing, big, sound, player, fly, rod, pound, double, runs, playing, guitar, band]

•  The vector for:
guitar and bass player stand
[0,0,0,1,0,0,0,0,0,0,1,0]
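The binary indicator vector above can be sketched in one comprehension: 1 if the vocabulary word appears anywhere in the context window, else 0 (the function name is mine).

```python
# Sketch of the binary bag-of-words vector from the slide.
vocab = ["fishing", "big", "sound", "player", "fly", "rod", "pound",
         "double", "runs", "playing", "guitar", "band"]

def bow_vector(window, vocabulary):
    """1/0 indicator per vocabulary word, ignoring position and frequency."""
    words = set(window.lower().split())
    return [1 if v in words else 0 for v in vocabulary]

print(bow_vector("guitar and bass player stand", vocab))
# [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
```

Swapping the indicator for a count (`window.split().count(v)`) gives the frequency variant mentioned on the previous slide.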
  	
  
	
  
Word Sense Disambiguation
Classification

Classification

•  Input:
•  a word w and some features f
•  a fixed set of classes C = {c1, c2, …, cJ}
•  Output: a predicted class c ∈ C
Any kind of classifier

•  Naive Bayes
•  Logistic regression
•  Neural Networks
•  Support-vector machines
•  k-Nearest Neighbors
•  etc.
	
  
The end

62

 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
NLP_KASHK:N-Grams
NLP_KASHK:N-GramsNLP_KASHK:N-Grams
NLP_KASHK:N-Grams
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
Semantic analysis
Semantic analysisSemantic analysis
Semantic analysis
 
Word Embeddings - Introduction
Word Embeddings - IntroductionWord Embeddings - Introduction
Word Embeddings - Introduction
 
Syntax directed translation
Syntax directed translationSyntax directed translation
Syntax directed translation
 
AI_ 3 & 4 Knowledge Representation issues
AI_ 3 & 4 Knowledge Representation issuesAI_ 3 & 4 Knowledge Representation issues
AI_ 3 & 4 Knowledge Representation issues
 

Andere mochten auch

Word sense disambiguation a survey
Word sense disambiguation a surveyWord sense disambiguation a survey
Word sense disambiguation a surveyunyil96
 
Word Sense Disambiguation and Intelligent Information Access
Word Sense Disambiguation and Intelligent Information AccessWord Sense Disambiguation and Intelligent Information Access
Word Sense Disambiguation and Intelligent Information AccessPierpaolo Basile
 
Biomedical Word Sense Disambiguation presentation [Autosaved]
Biomedical Word Sense Disambiguation presentation [Autosaved]Biomedical Word Sense Disambiguation presentation [Autosaved]
Biomedical Word Sense Disambiguation presentation [Autosaved]akm sabbir
 
Error analysis of Word Sense Disambiguation
Error analysis of Word Sense DisambiguationError analysis of Word Sense Disambiguation
Error analysis of Word Sense DisambiguationRubén Izquierdo Beviá
 
Sifting Social Data: Word Sense Disambiguation Using Machine Learning
Sifting Social Data: Word Sense Disambiguation Using Machine LearningSifting Social Data: Word Sense Disambiguation Using Machine Learning
Sifting Social Data: Word Sense Disambiguation Using Machine LearningStuart Shulman
 
Broad Twitter Corpus: A Diverse Named Entity Recognition Resource
Broad Twitter Corpus: A Diverse Named Entity Recognition ResourceBroad Twitter Corpus: A Diverse Named Entity Recognition Resource
Broad Twitter Corpus: A Diverse Named Entity Recognition ResourceLeon Derczynski
 
Graph-based Word Sense Disambiguation
Graph-based Word Sense DisambiguationGraph-based Word Sense Disambiguation
Graph-based Word Sense DisambiguationElena-Oana Tabaranu
 
COLING 2014 - An Enhanced Lesk Word Sense Disambiguation Algorithm through a ...
COLING 2014 - An Enhanced Lesk Word Sense Disambiguation Algorithm through a ...COLING 2014 - An Enhanced Lesk Word Sense Disambiguation Algorithm through a ...
COLING 2014 - An Enhanced Lesk Word Sense Disambiguation Algorithm through a ...Pierpaolo Basile
 
Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...Innovation Quotient Pvt Ltd
 
Similarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguationSimilarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguationvini89
 
Lecture: Semantic Word Clouds
Lecture: Semantic Word CloudsLecture: Semantic Word Clouds
Lecture: Semantic Word CloudsMarina Santini
 
Zoological nomenclature
Zoological nomenclatureZoological nomenclature
Zoological nomenclatureManideep Raj
 
Similarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguationSimilarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguationvini89
 
Paying Guest v/s NestAway Rental Homes
Paying Guest v/s NestAway Rental HomesPaying Guest v/s NestAway Rental Homes
Paying Guest v/s NestAway Rental HomesNestaway.com
 
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasksTopic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasksLeonardo Di Donato
 

Andere mochten auch (20)

Word sense disambiguation a survey
Word sense disambiguation a surveyWord sense disambiguation a survey
Word sense disambiguation a survey
 
Word Sense Disambiguation and Intelligent Information Access
Word Sense Disambiguation and Intelligent Information AccessWord Sense Disambiguation and Intelligent Information Access
Word Sense Disambiguation and Intelligent Information Access
 
Biomedical Word Sense Disambiguation presentation [Autosaved]
Biomedical Word Sense Disambiguation presentation [Autosaved]Biomedical Word Sense Disambiguation presentation [Autosaved]
Biomedical Word Sense Disambiguation presentation [Autosaved]
 
Error analysis of Word Sense Disambiguation
Error analysis of Word Sense DisambiguationError analysis of Word Sense Disambiguation
Error analysis of Word Sense Disambiguation
 
Sifting Social Data: Word Sense Disambiguation Using Machine Learning
Sifting Social Data: Word Sense Disambiguation Using Machine LearningSifting Social Data: Word Sense Disambiguation Using Machine Learning
Sifting Social Data: Word Sense Disambiguation Using Machine Learning
 
Broad Twitter Corpus: A Diverse Named Entity Recognition Resource
Broad Twitter Corpus: A Diverse Named Entity Recognition ResourceBroad Twitter Corpus: A Diverse Named Entity Recognition Resource
Broad Twitter Corpus: A Diverse Named Entity Recognition Resource
 
Graph-based Word Sense Disambiguation
Graph-based Word Sense DisambiguationGraph-based Word Sense Disambiguation
Graph-based Word Sense Disambiguation
 
COLING 2014 - An Enhanced Lesk Word Sense Disambiguation Algorithm through a ...
COLING 2014 - An Enhanced Lesk Word Sense Disambiguation Algorithm through a ...COLING 2014 - An Enhanced Lesk Word Sense Disambiguation Algorithm through a ...
COLING 2014 - An Enhanced Lesk Word Sense Disambiguation Algorithm through a ...
 
Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...
 
Similarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguationSimilarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguation
 
Lecture: Semantic Word Clouds
Lecture: Semantic Word CloudsLecture: Semantic Word Clouds
Lecture: Semantic Word Clouds
 
Zoological nomenclature
Zoological nomenclatureZoological nomenclature
Zoological nomenclature
 
Word-sense disambiguation
Word-sense disambiguationWord-sense disambiguation
Word-sense disambiguation
 
Demand for Rentals
Demand for RentalsDemand for Rentals
Demand for Rentals
 
Distressed Property to Rental
Distressed Property to RentalDistressed Property to Rental
Distressed Property to Rental
 
Similarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguationSimilarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguation
 
Lecture 6 homonyms
Lecture 6 homonymsLecture 6 homonyms
Lecture 6 homonyms
 
Paying Guest v/s NestAway Rental Homes
Paying Guest v/s NestAway Rental HomesPaying Guest v/s NestAway Rental Homes
Paying Guest v/s NestAway Rental Homes
 
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasksTopic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
 
Project ara
Project araProject ara
Project ara
 

Ähnlich wie Lecture: Word Sense Disambiguation

DATA641 Lecture 3 - Word meaning.pptx
DATA641 Lecture 3 - Word meaning.pptxDATA641 Lecture 3 - Word meaning.pptx
DATA641 Lecture 3 - Word meaning.pptxDrPraveenPawar
 
NLP_guest_lecture.pdf
NLP_guest_lecture.pdfNLP_guest_lecture.pdf
NLP_guest_lecture.pdfSoha82
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Mustafa Jarrar
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsRoelof Pieters
 
Visual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageVisual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageRoelof Pieters
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingTed Xiao
 
NLP introduced and in 47 slides Lecture 1.ppt
NLP introduced and in 47 slides Lecture 1.pptNLP introduced and in 47 slides Lecture 1.ppt
NLP introduced and in 47 slides Lecture 1.pptOlusolaTop
 
A Simple Walkthrough of Word Sense Disambiguation
A Simple Walkthrough of Word Sense DisambiguationA Simple Walkthrough of Word Sense Disambiguation
A Simple Walkthrough of Word Sense DisambiguationMaryOsborne11
 
Pycon ke word vectors
Pycon ke   word vectorsPycon ke   word vectors
Pycon ke word vectorsOsebe Sammi
 
Natural Language Inference: Logic from Humans
Natural Language Inference: Logic from HumansNatural Language Inference: Logic from Humans
Natural Language Inference: Logic from HumansValeria de Paiva
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Saurabh Kaushik
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processingpunedevscom
 
Natural Language Inference in SICK
Natural Language Inference in SICKNatural Language Inference in SICK
Natural Language Inference in SICKValeria de Paiva
 
4A HYBRID APPROACH TO WORD SENSE DISAMBIGUATION COMBINING SUPERVISED AND UNSU...
4A HYBRID APPROACH TO WORD SENSE DISAMBIGUATION COMBINING SUPERVISED AND UNSU...4A HYBRID APPROACH TO WORD SENSE DISAMBIGUATION COMBINING SUPERVISED AND UNSU...
4A HYBRID APPROACH TO WORD SENSE DISAMBIGUATION COMBINING SUPERVISED AND UNSU...ijaia
 
Use Your Words: Content Strategy to Influence Behavior
Use Your Words: Content Strategy to Influence BehaviorUse Your Words: Content Strategy to Influence Behavior
Use Your Words: Content Strategy to Influence BehaviorLiz Danzico
 

Ähnlich wie Lecture: Word Sense Disambiguation (20)

DATA641 Lecture 3 - Word meaning.pptx
DATA641 Lecture 3 - Word meaning.pptxDATA641 Lecture 3 - Word meaning.pptx
DATA641 Lecture 3 - Word meaning.pptx
 
Lecture: Word Senses
Lecture: Word SensesLecture: Word Senses
Lecture: Word Senses
 
NLP_guest_lecture.pdf
NLP_guest_lecture.pdfNLP_guest_lecture.pdf
NLP_guest_lecture.pdf
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing
 
NLP
NLPNLP
NLP
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
 
Visual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageVisual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on Language
 
Word vectors
Word vectorsWord vectors
Word vectors
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language Processing
 
NLP introduced and in 47 slides Lecture 1.ppt
NLP introduced and in 47 slides Lecture 1.pptNLP introduced and in 47 slides Lecture 1.ppt
NLP introduced and in 47 slides Lecture 1.ppt
 
A Simple Walkthrough of Word Sense Disambiguation
A Simple Walkthrough of Word Sense DisambiguationA Simple Walkthrough of Word Sense Disambiguation
A Simple Walkthrough of Word Sense Disambiguation
 
Icon 2007 Pedersen
Icon 2007 PedersenIcon 2007 Pedersen
Icon 2007 Pedersen
 
Pycon ke word vectors
Pycon ke   word vectorsPycon ke   word vectors
Pycon ke word vectors
 
Natural Language Inference: Logic from Humans
Natural Language Inference: Logic from HumansNatural Language Inference: Logic from Humans
Natural Language Inference: Logic from Humans
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural Language Inference in SICK
Natural Language Inference in SICKNatural Language Inference in SICK
Natural Language Inference in SICK
 
4A HYBRID APPROACH TO WORD SENSE DISAMBIGUATION COMBINING SUPERVISED AND UNSU...
4A HYBRID APPROACH TO WORD SENSE DISAMBIGUATION COMBINING SUPERVISED AND UNSU...4A HYBRID APPROACH TO WORD SENSE DISAMBIGUATION COMBINING SUPERVISED AND UNSU...
4A HYBRID APPROACH TO WORD SENSE DISAMBIGUATION COMBINING SUPERVISED AND UNSU...
 
intro.ppt
intro.pptintro.ppt
intro.ppt
 
Use Your Words: Content Strategy to Influence Behavior
Use Your Words: Content Strategy to Influence BehaviorUse Your Words: Content Strategy to Influence Behavior
Use Your Words: Content Strategy to Influence Behavior
 

Mehr von Marina Santini

Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...
Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...
Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...Marina Santini
 
Towards a Quality Assessment of Web Corpora for Language Technology Applications
Towards a Quality Assessment of Web Corpora for Language Technology ApplicationsTowards a Quality Assessment of Web Corpora for Language Technology Applications
Towards a Quality Assessment of Web Corpora for Language Technology ApplicationsMarina Santini
 
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-Marina Santini
 
An Exploratory Study on Genre Classification using Readability Features
An Exploratory Study on Genre Classification using Readability FeaturesAn Exploratory Study on Genre Classification using Readability Features
An Exploratory Study on Genre Classification using Readability FeaturesMarina Santini
 
Lecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebLecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebMarina Santini
 
Lecture: Summarization
Lecture: SummarizationLecture: Summarization
Lecture: SummarizationMarina Santini
 
Lecture: Question Answering
Lecture: Question AnsweringLecture: Question Answering
Lecture: Question AnsweringMarina Santini
 
IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)Marina Santini
 
Lecture: Vector Semantics (aka Distributional Semantics)
Lecture: Vector Semantics (aka Distributional Semantics)Lecture: Vector Semantics (aka Distributional Semantics)
Lecture: Vector Semantics (aka Distributional Semantics)Marina Santini
 
Semantic Role Labeling
Semantic Role LabelingSemantic Role Labeling
Semantic Role LabelingMarina Santini
 
Semantics and Computational Semantics
Semantics and Computational SemanticsSemantics and Computational Semantics
Semantics and Computational SemanticsMarina Santini
 
Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Marina Santini
 
Lecture 8: Machine Learning in Practice (1)
Lecture 8: Machine Learning in Practice (1) Lecture 8: Machine Learning in Practice (1)
Lecture 8: Machine Learning in Practice (1) Marina Santini
 
Lecture 5: Interval Estimation
Lecture 5: Interval Estimation Lecture 5: Interval Estimation
Lecture 5: Interval Estimation Marina Santini
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioMarina Santini
 
Lecture 3b: Decision Trees (1 part)
Lecture 3b: Decision Trees (1 part)Lecture 3b: Decision Trees (1 part)
Lecture 3b: Decision Trees (1 part) Marina Santini
 
Lecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & EvaluationLecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & EvaluationMarina Santini
 
Lecture 2: Preliminaries (Understanding and Preprocessing data)
Lecture 2: Preliminaries (Understanding and Preprocessing data)Lecture 2: Preliminaries (Understanding and Preprocessing data)
Lecture 2: Preliminaries (Understanding and Preprocessing data)Marina Santini
 

Mehr von Marina Santini (20)

Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...
Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...
Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...
 
Towards a Quality Assessment of Web Corpora for Language Technology Applications
Towards a Quality Assessment of Web Corpora for Language Technology ApplicationsTowards a Quality Assessment of Web Corpora for Language Technology Applications
Towards a Quality Assessment of Web Corpora for Language Technology Applications
 
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-
 
An Exploratory Study on Genre Classification using Readability Features
An Exploratory Study on Genre Classification using Readability FeaturesAn Exploratory Study on Genre Classification using Readability Features
An Exploratory Study on Genre Classification using Readability Features
 
Lecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebLecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic Web
 
Lecture: Summarization
Lecture: SummarizationLecture: Summarization
Lecture: Summarization
 
Relation Extraction
Relation ExtractionRelation Extraction
Relation Extraction
 
Lecture: Question Answering
Lecture: Question AnsweringLecture: Question Answering
Lecture: Question Answering
 
IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)
 
Lecture: Vector Semantics (aka Distributional Semantics)
Lecture: Vector Semantics (aka Distributional Semantics)Lecture: Vector Semantics (aka Distributional Semantics)
Lecture: Vector Semantics (aka Distributional Semantics)
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Semantic Role Labeling
Semantic Role LabelingSemantic Role Labeling
Semantic Role Labeling
 
Semantics and Computational Semantics
Semantics and Computational SemanticsSemantics and Computational Semantics
Semantics and Computational Semantics
 
Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)
 
Lecture 8: Machine Learning in Practice (1)
Lecture 8: Machine Learning in Practice (1) Lecture 8: Machine Learning in Practice (1)
Lecture 8: Machine Learning in Practice (1)
 
Lecture 5: Interval Estimation
Lecture 5: Interval Estimation Lecture 5: Interval Estimation
Lecture 5: Interval Estimation
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
 
Lecture 3b: Decision Trees (1 part)
Lecture 3b: Decision Trees (1 part)Lecture 3b: Decision Trees (1 part)
Lecture 3b: Decision Trees (1 part)
 
Lecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & EvaluationLecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
 
Lecture 2: Preliminaries (Understanding and Preprocessing data)
Lecture 2: Preliminaries (Understanding and Preprocessing data)Lecture 2: Preliminaries (Understanding and Preprocessing data)
Lecture 2: Preliminaries (Understanding and Preprocessing data)
 

Kürzlich hochgeladen

Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024Janet Corral
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 

Kürzlich hochgeladen (20)

Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 

Lecture: Word Sense Disambiguation

  • 1. Semantic Analysis in Language Technology
http://stp.lingfil.uu.se/~santinim/sais/2016/sais_2016.htm

 Word Sense Disambiguation
 Marina Santini, santinim@stp.lingfil.uu.se. Department of Linguistics and Philology, Uppsala University, Uppsala, Sweden. Spring 2016
  • 2. Previous Lecture: Word Senses • Homonymy, polysemy, synonymy, metonymy, etc. Practical activities: 1) SELECTIONAL RESTRICTIONS 2) MANUAL DISAMBIGUATION OF EXAMPLES USING SENSEVAL SENSES. AIMS OF PRACTICAL ACTIVITIES: • STUDENTS SHOULD GET ACQUAINTED WITH REAL DATA • EXPLORATION OF APPLICATIONS, RESOURCES AND METHODS.
  • 3. No preset solutions (this slide is to tell you that you are doing well :-) ) • Whatever your experience with data, it is a valuable experience: • Disappointment • Frustration • Feeling lost • Happiness • Power • Excitement • … • All the students so far (also in previous courses) have presented their own solutions… many different solutions, and that is OK…
  • 4. J&M's own solutions: Selectional Restrictions (just for your records; it does not mean they are necessarily better than yours…)
  • 5. Other possible solutions… • Kiss → concrete sense: touching with lips/mouth • animate kiss [using lips/mouth] animate/inanimate • Ex: he kissed her; • The dolphin kissed the kid • Why does the pope kiss the ground after he disembarks… • Kiss → figurative sense: touching • animate kiss inanimate • Ex: "Walk as if you are kissing the Earth with your feet." (pursed lips?)
  • 6. No solutions or comments provided for Senseval • All your impressions and feelings are plausible and acceptable :-)
  • 7. Remember that in both activities… • You have experienced cases of POLYSEMY! • You have tried to disambiguate the senses manually, i.e. with your human skills…
  • 9. Today: Word Sense Disambiguation (WSD) • Given: • a word in context; • a fixed inventory of potential word senses; • create a system that automatically decides which sense of the word is correct in that context.
  • 10. Word Sense Disambiguation: Definition • Word Sense Disambiguation (WSD) is the TASK of determining the correct sense of a word in context. • It is an automatic task: we create a system that automatically disambiguates the senses for us. • Useful for many NLP tasks: information retrieval (apple the fruit or Apple the company?), question answering (does United serve Philadelphia?), machine translation (En. "bat" → It. "pipistrello" or "mazza"?)
  • 11. Anecdote: the poison apple • In 1954, Alan Turing died after biting into an apple laced with cyanide • It was said that this half-bitten apple inspired the Apple logo… but apparently it is a legend :-) • http://mentalfloss.com/article/64049/did-alan-turing-inspire-apple-logo
  • 12. Be alert… • Word sense ambiguity is pervasive!!!
  • 13. Acknowledgements • Most slides borrowed or adapted from: Dan Jurafsky and James H. Martin; Dan Jurafsky and Christopher Manning, Coursera • J&M (2015, draft): https://web.stanford.edu/~jurafsky/slp3/
  • 14. Outline: WSD Methods • Thesaurus/Dictionary Methods • Supervised Machine Learning • Semi-Supervised Learning (self-reading)
  • 15. Word Sense Disambiguation: Dictionary and Thesaurus Methods
  • 16. The Simplified Lesk algorithm • Let's disambiguate "bank" in this sentence: "The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable-rate mortgage securities." • given the following two WordNet senses:
 bank1. Gloss: a financial institution that accepts deposits and channels the money into lending activities. Examples: "he cashed a check at the bank", "that bank holds the mortgage on my home"
 bank2. Gloss: sloping land (especially the slope beside a body of water). Examples: "they pulled the canoe up on the bank", "he sat on the bank of the river and watched the currents"
 (Figure 16.6, the Simplified Lesk algorithm: for each sense of the target word, build a "signature" from the gloss and examples, compute its overlap with the context, and keep the sense with the maximum overlap. The COMPUTEOVERLAP function returns the number of words in common between two sets, ignoring function words or other words on a stop list. The original Lesk algorithm defines the context in a more complex way. The Corpus Lesk algorithm weights each overlapping word w by its log P(w) and includes labeled training corpus data in the signature.)
  • 17. The Simplified Lesk algorithm • "The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable-rate mortgage securities." • Choose the sense with the most word overlap between gloss (plus examples) and context, not counting function words: here bank1 shares "deposits" and "mortgage" with the context, while bank2 shares nothing, so the financial sense wins.
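The procedure on these two slides can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's code: the stop-word list is an ad-hoc assumption, and the sense signatures are the (abridged) glosses and examples from the slide.

```python
STOPWORDS = {"the", "a", "an", "of", "in", "on", "at", "that", "it", "will",
             "can", "and", "into", "up", "my", "he", "his", "her", "they"}

def tokenize(text):
    """Lowercase and keep only content words."""
    return {w for w in text.lower().replace(",", " ").split() if w not in STOPWORDS}

def simplified_lesk(senses, context):
    """Pick the sense whose signature (gloss + examples) shares the most
    content words with the context sentence."""
    context_words = tokenize(context)
    best_sense, max_overlap = None, -1
    for sense, signature in senses.items():
        overlap = len(tokenize(signature) & context_words)
        if overlap > max_overlap:
            best_sense, max_overlap = sense, overlap
    return best_sense

# Abridged WordNet glosses/examples for "bank", from the slide
senses = {
    "bank1": ("a financial institution that accepts deposits and channels "
              "the money into lending activities ; he cashed a check at the "
              "bank ; that bank holds the mortgage on my home"),
    "bank2": ("sloping land especially the slope beside a body of water ; "
              "they pulled the canoe up on the bank ; he sat on the bank of "
              "the river and watched the currents"),
}
context = ("The bank can guarantee deposits will eventually cover future "
           "tuition costs because it invests in adjustable-rate mortgage "
           "securities")
print(simplified_lesk(senses, context))  # bank1 (overlaps: bank, deposits, mortgage)
```

With this toy stop list, bank1 overlaps on three content words and bank2 on one, so the financial sense is chosen, matching the slide.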
  • 18. Drawback • Glosses and examples might be too short, and may not provide enough chance of overlap with the context of the word to be disambiguated.
  • 19. The Corpus(-based) Lesk algorithm • Assumes we have some sense-labeled data (like SemCor) • Take all the sentences with the relevant word sense: "These short, 'streamlined' meetings usually are sponsored by local banks1, Chambers of Commerce, trade associations, or other civic organizations." • Now add these to the gloss + examples for each sense, and call the result the "signature" of the sense. Basically, it is an expansion of the dictionary entry. • Choose the sense with the most word overlap between context and signature (i.e. the context words provided by the resources).
  • 20. Corpus Lesk: IDF weighting • Instead of just removing function words • Weigh each word by its 'promiscuity' across documents • Down-weights words that occur in every 'document' (gloss, example, etc.) • These are generally function words, but it is a more fine-grained measure • Weigh each overlapping word by its inverse document frequency (IDF).
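The IDF idea can be illustrated on toy data. Everything below is an assumption for the sketch: the three sense "signatures" are invented stand-ins for gloss + examples + SemCor sentences, and the scoring simply sums the IDF of each overlapping word instead of counting overlaps.

```python
import math

# Toy "documents": one signature (gloss + examples + corpus sentences) per sense.
signatures = {
    "bank1": "financial institution accepts deposits money lending mortgage",
    "bank2": "sloping land slope beside body water river canoe",
    "bank3": "arrangement of objects in a row a bank of switches",
}

def idf(word, docs):
    """Inverse document frequency over the signatures: a word appearing in
    every signature gets weight 0, behaving like a stop word."""
    df = sum(1 for d in docs if word in d.split())
    return math.log(len(docs) / df) if df else 0.0

def weighted_overlap(context, signature, docs):
    """Sum of IDF weights of the words shared by context and signature."""
    common = set(context.split()) & set(signature.split())
    return sum(idf(w, docs) for w in common)

docs = list(signatures.values())
context = "the bank can guarantee deposits will cover the mortgage"
scores = {s: weighted_overlap(context, sig, docs) for s, sig in signatures.items()}
print(max(scores, key=scores.get))  # bank1
```

Here "deposits" and "mortgage" each occur in only one signature, so they carry full weight log(3), and the financial sense wins.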
  • 21. Graph-based methods • First, WordNet can be viewed as a graph • senses are nodes • relations (hypernymy, meronymy) are edges • Also add an edge between a word and the unambiguous words in its gloss • An undirected graph is a set of nodes that are connected together by bidirectional edges (lines). [figure: sense graph around drink1/v, with nodes such as beverage1/n, food1/n, liquid1/n, milk1/n, consume1/v, sip1/v, toast4/n]
  • 22. How to use the graph for WSD • "She drank some milk" • choose the most central sense (several algorithms have been proposed recently) [figure: subgraph linking the senses of "drink" (drink1/v … drink5/v) and "milk" (milk1/n … milk4/n) through nodes such as beverage1/n, food1/n, nutriment1/n]
  • 23. Word Meaning and Similarity: Word Similarity, Thesaurus Methods
  • 24. Word Similarity • Synonymy: a binary relation • Two words are either synonymous or not • Similarity (or distance): a looser metric • Two words are more similar if they share more features of meaning • Similarity is properly a relation between senses • We do not say "the word 'bank' is similar to the word 'slope'"; rather we say: • bank1 is similar to fund3 • bank2 is similar to slope5 • But we'll compute similarity over both words and senses
  • 25. Why word similarity • Information retrieval • Question answering • Machine translation • Natural language generation • Language modeling • Automatic essay grading • Plagiarism detection • Document clustering
  • 26. Word similarity and word relatedness • We often distinguish word similarity from word relatedness • Similar words: near-synonyms • car, bicycle: similar • Related words: can be related in any way • car, gasoline: related, not similar • Cf. synonyms: car & automobile
  • 27. Two classes of similarity algorithms • Thesaurus-based algorithms • Are words "nearby" in the hypernym hierarchy? • Do words have similar glosses (definitions)? • Distributional algorithms: next time! • Do words have similar distributional contexts?
  • 28. Path-based similarity • Two concepts (senses/synsets) are similar if they are near each other in the thesaurus hierarchy • = have a short path between them • concepts have a path of length 1 to themselves
  • 29. Refinements to path-based similarity • pathlen(c1,c2) (a distance metric) = 1 + number of edges in the shortest path in the hypernym graph between sense nodes c1 and c2 • simpath(c1,c2) = 1 / pathlen(c1,c2) • wordsim(w1,w2) = max over c1 ∈ senses(w1), c2 ∈ senses(w2) of sim(c1,c2) • Sense similarity metric: 1 over the distance! • Word similarity metric: maximum similarity among pairs of senses. For all senses of w1 and all senses of w2, take the similarity between each pair of senses and then take the maximum over those pairs.
  • 30. Example:  path-­‐based  similarity   simpath(c1,c2) = 1/pathlen(c1,c2) simpath(nickel,coin)  =  1/2 = .5 simpath(fund,budget)  =  1/2 = .5 simpath(nickel,currency)  =  1/4 = .25 simpath(nickel,money)  =  1/6 = .17 simpath(coinage,Richter  scale)  =  1/6 = .17
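The numbers on this slide can be reproduced over a toy fragment of the hypernym graph. The edge list below is an assumption: it is a hand-built approximation of the WordNet fragment behind the slide's figure, chosen so that the path lengths match the slide's values.

```python
from collections import deque

# Toy hypernym fragment (treated as undirected for path search).
EDGES = [("nickel", "coin"), ("dime", "coin"), ("coin", "coinage"),
         ("coinage", "currency"), ("currency", "medium_of_exchange"),
         ("money", "medium_of_exchange"), ("fund", "money"),
         ("budget", "fund"), ("medium_of_exchange", "standard"),
         ("scale", "standard"), ("richter_scale", "scale")]

GRAPH = {}
for a, b in EDGES:
    GRAPH.setdefault(a, set()).add(b)
    GRAPH.setdefault(b, set()).add(a)

def pathlen(c1, c2):
    """1 + number of edges on the shortest path (a node has pathlen 1 to itself)."""
    seen, frontier = {c1}, deque([(c1, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == c2:
            return 1 + d
        for nxt in GRAPH[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return float("inf")

def sim_path(c1, c2):
    return 1.0 / pathlen(c1, c2)

print(sim_path("nickel", "coin"))                      # 0.5
print(sim_path("nickel", "currency"))                  # 0.25
print(round(sim_path("coinage", "richter_scale"), 2))  # 0.17
```

On this fragment, sim_path(nickel, coin) = 1/2, sim_path(fund, budget) = 1/2, sim_path(nickel, currency) = 1/4, and sim_path(coinage, richter_scale) = 1/6, as on the slide.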
  • 31. Problem with basic path-based similarity • Assumes each link represents a uniform distance • But nickel-to-money seems to us closer than nickel-to-standard • Nodes high in the hierarchy are very abstract • We instead want a metric that • represents the cost of each edge independently • makes words connected only through abstract nodes less similar
  • 32. Information content similarity metrics • In simple words: • We define the probability of a concept C as the probability that a randomly selected word in a corpus is an instance of that concept. • Basically, for each random word in a corpus we compute how probable it is that it belongs to a certain concept. • Resnik 1995. Using information content to evaluate semantic similarity in a taxonomy. IJCAI
  • 33. Formally: Information content similarity metrics • Let's define P(c) as: • the probability that a randomly selected word in a corpus is an instance of concept c • Formally: there is a distinct random variable, ranging over words, associated with each concept in the hierarchy • for a given concept, each observed noun is either • a member of that concept with probability P(c) • not a member of that concept with probability 1 − P(c) • All words are members of the root node (Entity) • P(root) = 1 • The lower a node is in the hierarchy, the lower its probability • Resnik 1995. Using information content to evaluate semantic similarity in a taxonomy. IJCAI
  • 34. Information content similarity • For every concept (e.g. "natural elevation"), we count all the words in that concept, and then normalize by the total number of words in the corpus. • We get a probability value that tells us how probable it is that a random word is an instance of that concept: P(c) = (Σ_{w ∈ words(c)} count(w)) / N • [figure: fragment of the hierarchy: entity → … → geological-formation → shore (→ coast), natural elevation (→ hill, ridge), cave (→ grotto)] • In order to compute the probability of the term "natural elevation", we take ridge, hill and natural elevation itself.
  • 35. Information content similarity • The WordNet hierarchy augmented with probabilities P(c) • D. Lin. 1998. An Information-Theoretic Definition of Similarity. ICML 1998
  • 36. Information content: definitions • 1. Information content: IC(c) = −log P(c) • 2. Most informative subsumer (lowest common subsumer): LCS(c1,c2) = the most informative (lowest) node in the hierarchy subsuming both c1 and c2
  • 37. IC aka… • A lot of people prefer the term "surprisal" to "information" or "information content": −log p(x) measures the amount of surprise generated by the event x. The smaller the probability of x, the bigger the surprisal. It's helpful to think about it this way, particularly for linguistics examples.
  • 38. Using information content for similarity: the Resnik method • The similarity between two words is related to their common information • The more two words have in common, the more similar they are • Resnik: measure common information as the information content of the most informative (lowest) subsumer (MIS/LCS) of the two nodes: • simresnik(c1,c2) = −log P(LCS(c1,c2)) • Philip Resnik. 1995. Using Information Content to Evaluate Semantic Similarity in a Taxonomy. IJCAI 1995. • Philip Resnik. 1999. Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language. JAIR 11, 95-130.
  • 39. Dekang Lin's method • Intuition: similarity between A and B is not just what they have in common • The more differences between A and B, the less similar they are: • Commonality: the more A and B have in common, the more similar they are • Difference: the more differences between A and B, the less similar they are • Commonality: IC(common(A,B)) • Difference: IC(description(A,B)) − IC(common(A,B)) • Dekang Lin. 1998. An Information-Theoretic Definition of Similarity. ICML
  • 40. Dekang Lin's similarity theorem • The similarity between A and B is measured by the ratio between the amount of information needed to state the commonality of A and B and the information needed to fully describe what A and B are: simLin(A,B) ∝ IC(common(A,B)) / IC(description(A,B)) • Lin (altering Resnik) defines IC(common(A,B)) as 2 × the information of the LCS: simLin(c1,c2) = 2 log P(LCS(c1,c2)) / (log P(c1) + log P(c2))
  • 41. Lin similarity function • simLin(c1,c2) = 2 log P(LCS(c1,c2)) / (log P(c1) + log P(c2)) • simLin(hill, coast) = 2 log P(geological-formation) / (log P(hill) + log P(coast)) = 2 ln 0.00176 / (ln 0.0000189 + ln 0.0000216) = .59
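The Resnik and Lin measures on the last few slides reduce to a couple of lines once P(c) is known. The sketch below takes the probabilities straight from the slide's hill/coast example and hard-codes their LCS (geological-formation); in a real system the LCS and P(c) would come from WordNet and corpus counts.

```python
import math

# P(c) values from the slide (Lin 1998 example); the LCS of hill and
# coast in this fragment is geological-formation.
P = {"geological-formation": 0.00176, "hill": 0.0000189, "coast": 0.0000216}

def sim_resnik(lcs):
    """Resnik: information content of the lowest common subsumer, -log P(LCS)."""
    return -math.log(P[lcs])

def sim_lin(c1, c2, lcs):
    """Lin: 2 log P(LCS) / (log P(c1) + log P(c2))."""
    return 2 * math.log(P[lcs]) / (math.log(P[c1]) + math.log(P[c2]))

print(round(sim_resnik("geological-formation"), 2))                # 6.34
print(round(sim_lin("hill", "coast", "geological-formation"), 2))  # 0.59
```

The Lin value of 0.59 reproduces the slide's computation.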
  • 42. The (extended) Lesk Algorithm • A thesaurus-based measure that looks at glosses • Two concepts are similar if their glosses contain similar words • Drawing paper: paper that is specially prepared for use in drafting • Decal: the art of transferring designs from specially prepared paper to a wood or glass or metal surface • For each n-word phrase that's in both glosses, add a score of n² • "paper" and "specially prepared": 1² + 2² = 5 • Compute the overlap also for other relations • e.g. the glosses of hypernyms and hyponyms
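One simple way to compute that n² overlap score is to greedily match the longest common word sequence between the two glosses, score it, remove it, and repeat. This is an illustrative sketch, not the canonical Extended Lesk implementation, and the stop-word set is an ad-hoc assumption.

```python
STOP = {"the", "of", "a", "to", "in", "for", "that", "is", "or", "from"}

def overlap_score(g1, g2):
    """Sum n**2 over maximal common word phrases of the two glosses,
    ignoring phrases made entirely of stop words."""
    w1, w2 = g1.lower().split(), g2.lower().split()
    score = 0
    while True:
        best = None  # (i, j, length) of the longest common run found
        for i in range(len(w1)):
            for j in range(len(w2)):
                k = 0
                while i + k < len(w1) and j + k < len(w2) and w1[i + k] == w2[j + k]:
                    k += 1
                if k and not all(w in STOP for w in w1[i:i + k]):
                    if best is None or k > best[2]:
                        best = (i, j, k)
        if best is None:
            return score
        i, j, k = best
        score += k * k                 # an n-word overlap scores n**2
        del w1[i:i + k]; del w2[j:j + k]

drawing_paper = "paper that is specially prepared for use in drafting"
decal = ("the art of transferring designs from specially prepared paper "
         "to a wood or glass or metal surface")
print(overlap_score(drawing_paper, decal))  # 5 = 2**2 ("specially prepared") + 1 ("paper")
```

On the slide's two glosses this yields 4 for "specially prepared" plus 1 for "paper", i.e. the score of 5 from the slide.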
  • 44. Libraries for computing thesaurus-based similarity • NLTK • http://nltk.github.com/api/nltk.corpus.reader.html?highlight=similarity - nltk.corpus.reader.WordNetCorpusReader.res_similarity • WordNet::Similarity • http://wn-similarity.sourceforge.net/ • Web-based interface: • http://marimba.d.umn.edu/cgi-bin/similarity/similarity.cgi
  • 46. Basic idea • If we have data that has been hand-labelled with correct word senses, we can use a supervised learning approach and learn from it! • We need to extract features and train a classifier • The output of training is an automatic system capable of assigning sense labels to unlabelled words in context.
  • 47. Two variants of the WSD task • Lexical Sample task • (we need labelled corpora for individual senses) • Small pre-selected set of target words (e.g. difficulty) • And an inventory of senses for each word • Supervised machine learning: train a classifier for each word • All-words task • (each word in each sentence is labelled with a sense) • Every word in an entire text • A lexicon with senses for each word • SENSEVAL 1-2-3
  • 48. Supervised  Machine  Learning  Approaches   •  Summary  of  what  we  need:   •  the  tag  set  (“sense  inventory”)   •  the  training  corpus   •  A  set  of  features  extracted  from  the  training  corpus   •  A  classifier  
  • 49. Supervised WSD 1: WSD Tags • What's a tag? A dictionary sense? • For example, for WordNet an instance of "bass" in a text has 8 possible tags or labels (bass1 through bass8).
  • 50. 8  senses  of  “bass”  in  WordNet   1.  bass  -­‐  (the  lowest  part  of  the  musical  range)   2.  bass,  bass  part  -­‐  (the  lowest  part  in  polyphonic    music)   3.  bass,  basso  -­‐  (an  adult  male  singer  with  the  lowest  voice)   4.  sea  bass,  bass  -­‐  (flesh  of  lean-­‐fleshed  saltwater  fish  of  the  family   Serranidae)   5.  freshwater  bass,  bass  -­‐  (any  of  various  North  American  lean-­‐fleshed   freshwater  fishes  especially  of  the  genus  Micropterus)   6.  bass,  bass  voice,  basso  -­‐  (the  lowest  adult  male  singing  voice)   7.  bass  -­‐  (the  member  with  the  lowest  range  of  a  family  of  musical   instruments)   8.  bass  -­‐  (nontechnical  name  for  any  of  numerous  edible    marine  and   freshwater  spiny-­‐finned  fishes)  
  • 51. SemCor   <wf  pos=PRP>He</wf>   <wf  pos=VB  lemma=recognize  wnsn=4  lexsn=2:31:00::>recognized</wf>   <wf  pos=DT>the</wf>   <wf  pos=NN  lemma=gesture  wnsn=1  lexsn=1:04:00::>gesture</wf>   <punc>.</punc>   51   SemCor: 234,000 words from Brown Corpus, manually tagged with WordNet senses
  • 52. Supervised WSD: Extract feature vectors • Intuition from Warren Weaver (1955): "If one examines the words in a book, one at a time as through an opaque mask with a hole in it one word wide, then it is obviously impossible to determine, one at a time, the meaning of the words… But if one lengthens the slit in the opaque mask, until one can see not only the central word in question but also say N words on either side, then if N is large enough one can unambiguously decide the meaning of the central word… The practical question is: 'What minimum value of N will, at least in a tolerable fraction of cases, lead to the correct choice of meaning for the central word?'" • N words on either side = the window
  • 53. Feature  vectors   •  Vectors  of  sets  of  feature/value  pairs  
  • 54. Two kinds of features in the vectors • Collocational features and bag-of-words features • Collocational/paradigmatic • Features about words at specific positions near the target word • Often limited to just word identity and POS • Bag-of-words • Features about words that occur anywhere in the window (regardless of position) • Typically limited to frequency counts • Generally speaking, a collocation is a sequence of words or terms that co-occur more often than would be expected by chance. But here the meaning is not exactly this…
  • 55. Examples   •  Example  text  (WSJ):   An  electric  guitar  and  bass  player  stand  off  to   one  side  not  really  part  of  the  scene   •  Assume  a  window  of  +/-­‐  2  from  the  target  
  • 57. Collocational features • Position-specific information about the words and collocations in the window: guitar and bass player stand • word 1-, 2-, 3-grams in a window of ±3 are common • From J&M: collocational features encode local lexical and grammatical information that can often accurately isolate a given sense. Consider the ambiguous word "bass" in the WSJ sentence: (16.17) "An electric guitar and bass player stand off to one side, not really part of the scene, just as a sort of nod to gringo expectations perhaps." A collocational feature vector, extracted from a window of two words to the right and left of the target word, made up of the words themselves, their respective parts of speech, and pairs of words, i.e. [w_{i−2}, POS_{i−2}, w_{i−1}, POS_{i−1}, w_{i+1}, POS_{i+1}, w_{i+2}, POS_{i+2}, w_{i−2}^{i−1}, w_{i}^{i+1}], would yield the following vector: [guitar, NN, and, CC, player, NN, stand, VB, and guitar, player stand]. High-performing systems generally use POS tags and word collocations of length 1, 2, and 3 from a window of 3 words to the left and 3 to the right (Zhong and Ng).
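A small sketch of extracting such a vector. Note the textbook draft's printed vector and its index notation do not quite line up ("and guitar" vs. "guitar and"); the sketch takes the plain reading, with a left bigram w_{i−2} w_{i−1} and a right bigram w_{i+1} w_{i+2}, and the POS tags are assumed given.

```python
def collocational_features(tokens, pos_tags, i):
    """Features from a +/-2 window around target position i: the four
    neighbouring words, their POS tags, and the two adjacent bigrams."""
    return [tokens[i - 2], pos_tags[i - 2],
            tokens[i - 1], pos_tags[i - 1],
            tokens[i + 1], pos_tags[i + 1],
            tokens[i + 2], pos_tags[i + 2],
            f"{tokens[i - 2]} {tokens[i - 1]}",   # left bigram  w_{i-2} w_{i-1}
            f"{tokens[i + 1]} {tokens[i + 2]}"]   # right bigram w_{i+1} w_{i+2}

tokens = ["an", "electric", "guitar", "and", "bass", "player", "stand", "off"]
pos    = ["DT", "JJ",       "NN",     "CC",  "NN",   "NN",     "VB",    "RP"]
feats = collocational_features(tokens, pos, tokens.index("bass"))
print(feats)
```

For the target "bass" this yields ['guitar', 'NN', 'and', 'CC', 'player', 'NN', 'stand', 'VB', 'guitar and', 'player stand'].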
  • 58. Bag-of-words features • "an unordered set of words": position ignored • Choose a vocabulary: a useful subset of words in a training corpus • Either: the count of how often each of those terms occurs in a given window, OR just a binary "indicator", 1 or 0
  • 59. Co-occurrence example • Assume we've settled on a possible vocabulary of 12 words in "bass" sentences: [fishing, big, sound, player, fly, rod, pound, double, runs, playing, guitar, band] • The vector for "guitar and bass player stand" is [0,0,0,1,0,0,0,0,0,0,1,0]
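The slide's indicator vector can be reproduced directly; the vocabulary and window come from the slide, while the binary/count switch is an illustrative extra covering the count variant mentioned on the previous slide.

```python
VOCAB = ["fishing", "big", "sound", "player", "fly", "rod", "pound",
         "double", "runs", "playing", "guitar", "band"]

def bow_vector(window, vocab=VOCAB, binary=True):
    """Bag-of-words vector over the chosen vocabulary: binary presence
    indicators, or raw counts if binary=False."""
    words = window.lower().split()
    return [int(v in words) if binary else words.count(v) for v in vocab]

vec = bow_vector("guitar and bass player stand")
print(vec)  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
```

Only "player" and "guitar" are in the vocabulary, so exactly those two positions are 1.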
  • 61. Classification • Input: • a word w and some features f • a fixed set of classes C = {c1, c2, …, cJ} • Output: a predicted class c ∈ C • Any kind of classifier can be used: • Naive Bayes • Logistic regression • Neural networks • Support vector machines • k-nearest neighbors • etc.
  • 62. The  end     62