www.insight-centre.org

Adaptive Entity Linking
PhD Day – October/2013
Bianca Pereira
  
Agenda

•  Motivation
•  Problem
•  Proposed Solution
•  Experiments
•  Next Steps
  
Motivation

•  Entity Linking creates links from mentions in text to entities from a structured knowledge base. It:
– enables reusing knowledge already published on the web.
– can be used as the first step for ontology learning and population algorithms.
  
Problem

•  Entity Linking has been performed using generic approaches.
•  These approaches do not work for all domains and types of text.
•  There is no clear definition of “entity”.
  
Problem

•  Research Question: “How to adapt a general Entity Linking Approach to a Domain?”
•  Philosophical Question: “What is an Entity?”
  
Proposed Solution

•  Usage of Linked Data datasets.
•  AELA, a Framework for Adaptive Entity Linking.
  
Experiments

•  What is an Entity?
•  What has been identified as entities?
•  How to manually detect entities from text?
•  How does the definition of Entity change from one domain to another?
  
Experiments

•  What has been identified as entities?
  
Experiments

•  AIDA-CoNLL annotated dataset
– 1,387 Reuters documents (some of them are tables)
– Annotation of entities with links to Wikipedia.
  
Experiments – AIDA CoNLL

•  Proper Nouns: 5576
– Names starting with a capital letter
•  Acronyms: 712
– Names with all letters in upper case
•  Others: 20
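
The three groups above can be approximated with a purely orthographic rule. The sketch below is an assumption about how such a split might be computed, not the procedure actually used to obtain these counts. The function name `mention_category` and the exact acronym pattern are hypothetical; the pattern was chosen so that the examples listed on the following slides ("U.S." as a proper noun, "1997 FED CUP" as other) fall into the groups where the slides place them.

```python
import re

# Assumption: "all letters in upper case" is read as "only upper-case
# letters, spaces and hyphens", so "U.S." and "1997 FED CUP" do not match.
ACRONYM_RE = re.compile(r"[A-Z][A-Z \-]*")


def mention_category(mention: str) -> str:
    """Assign a mention string to one of the three groups by surface form alone."""
    if ACRONYM_RE.fullmatch(mention):
        return "acronym"        # e.g. "BSE", but also "BRUSSELS"
    if mention[:1].isupper():
        return "proper noun"    # first character is a capital letter
    return "other"              # e.g. "van der Sar", "1860 Munich"


for m in ["Franz Fischler", "U.S.", "BSE", "BRUSSELS", "van der Sar", "1997 FED CUP"]:
    print(m, "->", mention_category(m))
```

A rule like this is deliberately naive: it labels "BRUSSELS" an acronym and "van der Sar" as other, which is the kind of mismatch the findings later in the deck point out.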
  
AIDA CoNLL – Proper Nouns

•  German
•  British
•  European Commission
•  Germany
•  European Union
•  Britain
•  Commission
•  Franz Fischler
•  France
•  Spanish
•  Loyola de Palacio
•  Europe
•  Bonn
•  Hendrix
•  U.S.
•  Jimi Hendrix
•  English
•  Nottingham
•  Australian
•  China
•  Taiwan
•  Taipei
•  Taiwan Strait
•  Ukraine
•  Taiwanese
•  Lien Chan
•  Chinese
•  Foreign Ministry
  
AIDA CoNLL – “Acronyms”

•  BRUSSELS
•  BSE
•  LONDON
•  BEIJING
•  FRANKFURT
•  GREEK
•  ATHENS
•  BAYERISCHE VEREINSBANK
•  SWEDISH
•  SWEDEN
•  JERUSALEM
•  TUNIS
•  KDPI
•  PUK
•  KDP
•  MANAMA
•  UAE
•  DUBAI
•  BEIRUT
•  AN-NAHAR
•  AS-SAFIR
•  AD-DIYAR
•  CME
•  CHICAGO
•  MONTGOMERY
•  SNET
•  PHOENIX
•  PARIS
  
AIDA CoNLL – Others

•  interior ministry
•  neo-Nazi
•  neo-Nazism
•  post-Soviet
•  van der Sar
•  1860 Munich
•  serie A
•  1990 World Cup
•  1992 European championship
•  2,000 Guineas
•  2000 Games
•  pan-Turkism
•  al-Akhbar
•  al-Ram
•  1997 FED CUP
•  1998 World Cup
•  1995 World Cup
•  1. FC Cologne
•  post-Communist
•  cocker spaniels
  
AIDA CoNLL

•  SOCCER - GERMAN FIRST DIVISION RESULTS / STANDINGS. BONN 1996-12-06 Results of German first division soccer matches played on Friday : Bochum 2 Bayer Leverkusen 2 Werder Bremen 1 1860 Munich 1 Karlsruhe 3 Freiburg 0 Schalke 2 Hansa Rostock 0 Standings ( tabulated under played, won, drawn, lost, goals for goals against points ) : Bayer Leverkusen 17 10 4 3 38 22 34 Bayern Munich 16 9 6 1 26 14 33 VfB Stuttgart 16 9 4 3 39 17 31 Borussia Dortmund 16 9 4 3 33 17 31 Karlsruhe 17 8 4 5 30 20 28 VfL Bochum 16 7 6 3 23 21 27 1. FC Cologne 16 8 2 6 31 27 26 Schalke 04 17 7 4 6 25 26 25 Werder Bremen 17 6 4 7 29 28 22 MSV Duisburg 16 5 4 7 16 22 19 SV 1860 Munich 17 4 6 7 25 31 18 FC St. Pauli 15 5 3 7 21 28 18 Fortuna Dusseldorf 16 5 3 8 13 24 18 Hamburger SV 16 4 5 7 20 25 17 Arminia Bielefeld 16 4 4 8 18 28 16 FC Hansa Rostock 17 4 3 10 19 26 15 Borussia Monchengladbach 16 4 3 9 12 22 15 SC Freiburg 17 4 1 12 20 40 13
  
AIDA CoNLL – Some findings

•  Syntactic structure does not help in all cases.
– Proper nouns may not start with a capital letter.
– Not all words with all letters in upper case are acronyms.
•  There may be some “mention boundary” problems even in manual annotation.
  
AIDA CoNLL

•  5596 entities
•  6308 different mention strings
  
AIDA CoNLL

•  1110 entities with name variations.

Entity (Wikipedia URL)                               Mention strings
http://en.wikipedia.org/wiki/New_York_Jets           New York Jets | NY JETS
http://en.wikipedia.org/wiki/Butch_Harmon            Butch Harmon | Butch
http://en.wikipedia.org/wiki/Norway                  Norway | Norwegian
http://en.wikipedia.org/wiki/Cincinnati_Reds         Cincinnati Reds | CINCINNATI Reds
http://en.wikipedia.org/wiki/Republika_Srpska        Bosnian Serb | Republika Srpska
http://en.wikipedia.org/wiki/John_Smoltz             John Smoltz | Smoltz
http://en.wikipedia.org/wiki/Rede_Globo              TV Globo | Globo
http://en.wikipedia.org/wiki/London_Wasps            London | Wasps
http://en.wikipedia.org/wiki/Chicago_Cubs            CHICAGO | CUBS | Chicago Cubs
http://en.wikipedia.org/wiki/England_cricket_team    ENGLAND | Englishmen
http://en.wikipedia.org/wiki/Alexander_Downer        Alexander Downer | Downer
http://en.wikipedia.org/wiki/Wales                   Wales | Welsh
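
One way to find entities with name variations is to group the distinct mention strings by the entity they are linked to. The following is a minimal sketch under the assumption that the annotations are available as (mention, entity URL) pairs; the function name `name_variations` and the input layout are hypothetical and not taken from the AIDA CoNLL distribution.

```python
from collections import defaultdict


def name_variations(annotations):
    """Map each entity URL to its set of mention strings, keeping only
    entities that are referred to by more than one distinct string."""
    mentions_by_entity = defaultdict(set)
    for mention, entity in annotations:
        mentions_by_entity[entity].add(mention)
    return {e: m for e, m in mentions_by_entity.items() if len(m) > 1}


WIKI = "http://en.wikipedia.org/wiki/"
# Toy pairs: the first four come from the rows above; the Bonn pair is an
# illustrative single-name case, not a row from the slide.
pairs = [
    ("Norway", WIKI + "Norway"),
    ("Norwegian", WIKI + "Norway"),
    ("John Smoltz", WIKI + "John_Smoltz"),
    ("Smoltz", WIKI + "John_Smoltz"),
    ("Bonn", WIKI + "Bonn"),
]
variations = name_variations(pairs)
for entity, mentions in variations.items():
    print(entity, sorted(mentions))
```

Counting the keys of the returned dictionary over a full corpus would give a figure comparable to the one reported on the slide.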
  
AIDA CoNLL – Some findings

•  Use of metonymy.
•  Disambiguation (Norway vs. Norwegians).
•  Mentions of an entity using only part of its name.
  
AIDA CoNLL

•  434 ambiguous mention strings (corpus level)

French
    http://en.wikipedia.org/wiki/France
    http://en.wikipedia.org/wiki/France_national_football_team
NORTHAMPTON
    http://en.wikipedia.org/wiki/Northampton
    http://en.wikipedia.org/wiki/Northampton_Town_F.C.
    http://en.wikipedia.org/wiki/Northamptonshire_County_Cricket_Club
    http://en.wikipedia.org/wiki/Northampton_Saints
West
    http://en.wikipedia.org/wiki/Western_World
    http://en.wikipedia.org/wiki/American_League_West
Volkswagen AG
    http://en.wikipedia.org/wiki/Volkswagen
    http://en.wikipedia.org/wiki/Volkswagen_Group
EDMONTON
    http://en.wikipedia.org/wiki/Edmonton
    http://en.wikipedia.org/wiki/Edmonton_Oilers
Rangers
    http://en.wikipedia.org/wiki/Texas_Rangers_(baseball)
    http://en.wikipedia.org/wiki/Rangers_F.C.
Vatican
    http://en.wikipedia.org/wiki/Holy_See
    http://en.wikipedia.org/wiki/Vatican_Library
    http://en.wikipedia.org/wiki/Vatican_City
Shell
    http://en.wikipedia.org/wiki/Shell_Turbo_Chargers
    http://en.wikipedia.org/wiki/Shell_Oil_Company
Irish
    http://en.wikipedia.org/wiki/Republic_of_Ireland
    http://en.wikipedia.org/wiki/Republic_of_Ireland_national_football_team
    http://en.wikipedia.org/wiki/Northern_Ireland
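
Corpus-level ambiguity can be read as: the same mention string is linked to more than one entity somewhere in the corpus. A minimal sketch of that check follows, assuming the annotations are (mention, entity URL) pairs; the function name `ambiguous_mentions` and the input layout are hypothetical.

```python
from collections import defaultdict


def ambiguous_mentions(annotations):
    """Return the mention strings that are linked to more than one entity
    anywhere in the corpus, together with their candidate entities."""
    entities_by_mention = defaultdict(set)
    for mention, entity in annotations:
        entities_by_mention[mention].add(entity)
    return {m: e for m, e in entities_by_mention.items() if len(e) > 1}


WIKI = "http://en.wikipedia.org/wiki/"
# Toy pairs: French and Rangers come from the list above; the Taipei pair is
# an illustrative unambiguous case, not a row from the slide.
pairs = [
    ("French", WIKI + "France"),
    ("French", WIKI + "France_national_football_team"),
    ("Rangers", WIKI + "Texas_Rangers_(baseball)"),
    ("Rangers", WIKI + "Rangers_F.C."),
    ("Taipei", WIKI + "Taipei"),
]
ambiguous = ambiguous_mentions(pairs)
print(sorted(ambiguous))
```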
  
AIDA CoNLL

•  190 ambiguous mention strings (document level)

Document             Mention        Entities
17 Iraq              BAGHDAD        http://en.wikipedia.org/wiki/Baghdad
                                    http://en.wikipedia.org/wiki/Iraq
965testa SOCCER      SILVA          http://en.wikipedia.org/wiki/Mario_Silva
                                    http://en.wikipedia.org/wiki/Mauro_Silva
1102testa SOCCER     WORLD CUP      http://en.wikipedia.org/wiki/1998_FIFA_World_Cup
                                    http://en.wikipedia.org/wiki/FIFA_World_Cup
791 PRESS            Chinese        http://en.wikipedia.org/wiki/People’s_Republic_of_China
                                    http://en.wikipedia.org/wiki/Chinese_language
179 Soccer           Liechenstein   http://en.wikipedia.org/wiki/Liechtenstein_national_football_team
                                    http://en.wikipedia.org/wiki/Liechtenstein
703 Cricket          Pakistan       http://en.wikipedia.org/wiki/Pakistan_national_cricket_team
                                    http://en.wikipedia.org/wiki/Pakistan
1323testb Frankfurt  Frankfurt      http://en.wikipedia.org/wiki/Frankfurt_Stock_Exchange
                                    http://en.wikipedia.org/wiki/Frankfurt_am_Main
1054testa CRICKET    ENGLAND        http://en.wikipedia.org/wiki/England_cricket_team
                                    http://en.wikipedia.org/wiki/England
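
Document-level ambiguity is stricter than corpus-level ambiguity: the same string must point to different entities inside a single document. The sketch below assumes the annotations are (document id, mention, entity URL) triples; the function name `ambiguous_within_documents`, the input layout, and the document id "other-doc" are hypothetical.

```python
from collections import defaultdict


def ambiguous_within_documents(annotations):
    """Return the (doc_id, mention) keys whose mention string is linked to
    more than one entity inside the same document."""
    entities = defaultdict(set)
    for doc_id, mention, entity in annotations:
        entities[(doc_id, mention)].add(entity)
    return {key: ents for key, ents in entities.items() if len(ents) > 1}


WIKI = "http://en.wikipedia.org/wiki/"
# Toy triples: "Pakistan" is split across two documents here (the second
# document id is made up), so it is ambiguous for the corpus as a whole but
# not within either document.
triples = [
    ("1054testa CRICKET", "ENGLAND", WIKI + "England_cricket_team"),
    ("1054testa CRICKET", "ENGLAND", WIKI + "England"),
    ("703 Cricket", "Pakistan", WIKI + "Pakistan_national_cricket_team"),
    ("other-doc", "Pakistan", WIKI + "Pakistan"),
]
within = ambiguous_within_documents(triples)
print(sorted(within))
```

Keying on the pair (document, mention) rather than on the mention alone is the only difference from a corpus-level count, which is consistent with the document-level figure being the smaller of the two.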
  
AIDA CoNLL – Some findings

•  Even misspelled text is marked.
•  “Classes” and “instances” are annotated.
  
AIDA CoNLL

•  39 Classes

Class                                                 Count
http://dbpedia.org/ontology/Agent                     2579
http://xmlns.com/foaf/0.1/Person                      426
http://dbpedia.org/ontology/Place                     333
http://dbpedia.org/ontology/City                      234
http://dbpedia.org/ontology/Country                   194
http://dbpedia.org/ontology/AdministrativeRegion      76
http://dbpedia.org/ontology/Newspaper                 55
http://dbpedia.org/ontology/ArchitecturalStructure    39
http://dbpedia.org/ontology/EthnicGroup               30
http://dbpedia.org/ontology/Airport                   21
http://dbpedia.org/ontology/Event                     18
http://dbpedia.org/ontology/Island                    12
http://dbpedia.org/ontology/Film                      10
http://dbpedia.org/ontology/BodyOfWater               10
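
A class frequency table like the one above could be produced by counting, for each class, how many linked entities carry it. The sketch below assumes the counts are per entity and that the classes of each entity have already been looked up (for example from DBpedia); the function name `class_frequencies`, the input layout, and the class assignments in the toy data are illustrative assumptions, not values from the corpus.

```python
from collections import Counter


def class_frequencies(entity_types):
    """Count how many entities carry each class.
    An entity may carry several classes, e.g. both Place and City."""
    counts = Counter()
    for classes in entity_types.values():
        counts.update(set(classes))  # each class counted once per entity
    return counts


DBO = "http://dbpedia.org/ontology/"
# Hypothetical class assignments for three entities mentioned in the deck.
entity_types = {
    "Franz_Fischler": [DBO + "Agent", "http://xmlns.com/foaf/0.1/Person"],
    "Bonn": [DBO + "Place", DBO + "City"],
    "France": [DBO + "Place", DBO + "Country"],
}
frequencies = class_frequencies(entity_types)
for cls, n in frequencies.most_common(3):
    print(cls, n)
```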
  
AIDA CoNLL – Some findings

•  Not only Person, Location and Organization.
  
Experiments

•  How were those entities annotated?
•  Which Wikipedia pages were chosen as representing entities?
•  What is the Annotation Guideline?
  
Experiments

•  How to manually detect entities from text?
  
Experiments

•  Survey on Annotation Guidelines
– Question: “Is there any guideline for entity annotation?”
– Search Strategy:
   •  Papers found by searching for “entity annotation guidelines”.
   •  Guidelines from annotated corpora provided by Entity Recognition, Disambiguation and Linking challenges.
  
Experiments

•  Survey on Annotation Guidelines
– Common Problems (differ from one domain to another)
   •  Mention Boundaries
   •  Name variations
   •  Metonymy
– Annotation Process
– Evaluation
  
Next Steps

•  Corpus Sampling for Annotation
•  Development of Annotation Guidelines
– Domain/Task dependent
– Iterative Process
•  Domains:
– Tourism Domain (TripAdvisor corpus)
– Electronics Domain
– Other
  
Next Steps

•  How does the definition of Entity change from one domain to another?
  
Next Steps

•  What is an Entity?
•  What has been identified as entities?
•  How to manually detect entities from text?
•  How does the definition of Entity change from one domain to another?
•  How to identify the most frequent classes in a domain?
  
	
  
	
  

More Related Content

Similar to Adaptive Entity Linking Framework

Gold rushwriterspresentation 2013
Gold rushwriterspresentation 2013Gold rushwriterspresentation 2013
Gold rushwriterspresentation 2013J T "Tom" Johnson
 
Profiling Web Archives
Profiling Web ArchivesProfiling Web Archives
Profiling Web ArchivesMichael Nelson
 
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)ux singapore
 
Emtacl12, mlibraries12 conferences, 2012
Emtacl12, mlibraries12 conferences, 2012Emtacl12, mlibraries12 conferences, 2012
Emtacl12, mlibraries12 conferences, 2012Kerryn Amery
 
Saving the World with Open Source and Science
Saving the World with Open Source and ScienceSaving the World with Open Source and Science
Saving the World with Open Source and ScienceAll Things Open
 
A Fractured Fairy Tale of the Internet (SI110)
A Fractured Fairy Tale of the Internet (SI110)A Fractured Fairy Tale of the Internet (SI110)
A Fractured Fairy Tale of the Internet (SI110)Charles Severance
 
Wikidata Introductory Workshop
Wikidata Introductory WorkshopWikidata Introductory Workshop
Wikidata Introductory WorkshopBeat Estermann
 
Urban Archaeology - Session 12: Writing for Archaeology
Urban Archaeology - Session 12: Writing for ArchaeologyUrban Archaeology - Session 12: Writing for Archaeology
Urban Archaeology - Session 12: Writing for ArchaeologyNicole Beale
 
Isaac - W3C Data on the Web Best Practices - Data Vocabularies
Isaac - W3C Data on the Web Best Practices - Data VocabulariesIsaac - W3C Data on the Web Best Practices - Data Vocabularies
Isaac - W3C Data on the Web Best Practices - Data VocabulariesAntoine Isaac
 
Büyük Veriyle Büyük Resmi Görmek
Büyük Veriyle Büyük Resmi GörmekBüyük Veriyle Büyük Resmi Görmek
Büyük Veriyle Büyük Resmi Görmekideaport
 
IT Trends for 2011: Things Might Be Very Different Today
IT Trends for 2011: Things Might Be Very Different TodayIT Trends for 2011: Things Might Be Very Different Today
IT Trends for 2011: Things Might Be Very Different TodayCharles Severance
 
The European Innovation Partnership on Water Online Marketplace
The European Innovation Partnership on Water Online MarketplaceThe European Innovation Partnership on Water Online Marketplace
The European Innovation Partnership on Water Online MarketplaceMartin Kaltenböck
 
How to read a million books?
How to read a million books?How to read a million books?
How to read a million books?cneudecker
 
0011 ICT In Physical Education
0011 ICT In Physical Education0011 ICT In Physical Education
0011 ICT In Physical EducationKeith Lyons
 
Estermann wikidata introduction-sapa-20180630
Estermann wikidata introduction-sapa-20180630Estermann wikidata introduction-sapa-20180630
Estermann wikidata introduction-sapa-20180630Beat Estermann
 
The Web of Data is Our Opportunity
The Web of Data is Our OpportunityThe Web of Data is Our Opportunity
The Web of Data is Our OpportunityRichard Wallis
 
Keeping Up to Date on Data Management - UC3 Data Curation Workshop
Keeping Up to Date on Data Management - UC3 Data Curation WorkshopKeeping Up to Date on Data Management - UC3 Data Curation Workshop
Keeping Up to Date on Data Management - UC3 Data Curation WorkshopCarly Strasser
 

Similar to Adaptive Entity Linking Framework (20)

Ar search skills
Ar search skillsAr search skills
Ar search skills
 
Gold rushwriterspresentation 2013
Gold rushwriterspresentation 2013Gold rushwriterspresentation 2013
Gold rushwriterspresentation 2013
 
Profiling Web Archives
Profiling Web ArchivesProfiling Web Archives
Profiling Web Archives
 
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)
 
Emtacl12, mlibraries12 conferences, 2012
Emtacl12, mlibraries12 conferences, 2012Emtacl12, mlibraries12 conferences, 2012
Emtacl12, mlibraries12 conferences, 2012
 
Saving the World with Open Source and Science
Saving the World with Open Source and ScienceSaving the World with Open Source and Science
Saving the World with Open Source and Science
 
A Fractured Fairy Tale of the Internet (SI110)
A Fractured Fairy Tale of the Internet (SI110)A Fractured Fairy Tale of the Internet (SI110)
A Fractured Fairy Tale of the Internet (SI110)
 
Wikidata Introductory Workshop
Wikidata Introductory WorkshopWikidata Introductory Workshop
Wikidata Introductory Workshop
 
Information Update Feb 2015
Information Update Feb 2015Information Update Feb 2015
Information Update Feb 2015
 
Urban Archaeology - Session 12: Writing for Archaeology
Urban Archaeology - Session 12: Writing for ArchaeologyUrban Archaeology - Session 12: Writing for Archaeology
Urban Archaeology - Session 12: Writing for Archaeology
 
Isaac - W3C Data on the Web Best Practices - Data Vocabularies
Isaac - W3C Data on the Web Best Practices - Data VocabulariesIsaac - W3C Data on the Web Best Practices - Data Vocabularies
Isaac - W3C Data on the Web Best Practices - Data Vocabularies
 
Büyük Veriyle Büyük Resmi Görmek
Büyük Veriyle Büyük Resmi GörmekBüyük Veriyle Büyük Resmi Görmek
Büyük Veriyle Büyük Resmi Görmek
 
IT Trends for 2011: Things Might Be Very Different Today
IT Trends for 2011: Things Might Be Very Different TodayIT Trends for 2011: Things Might Be Very Different Today
IT Trends for 2011: Things Might Be Very Different Today
 
The European Innovation Partnership on Water Online Marketplace
The European Innovation Partnership on Water Online MarketplaceThe European Innovation Partnership on Water Online Marketplace
The European Innovation Partnership on Water Online Marketplace
 
How to read a million books?
How to read a million books?How to read a million books?
How to read a million books?
 
0011 ICT In Physical Education
0011 ICT In Physical Education0011 ICT In Physical Education
0011 ICT In Physical Education
 
Estermann wikidata introduction-sapa-20180630
Estermann wikidata introduction-sapa-20180630Estermann wikidata introduction-sapa-20180630
Estermann wikidata introduction-sapa-20180630
 
The Web of Data is Our Opportunity
The Web of Data is Our OpportunityThe Web of Data is Our Opportunity
The Web of Data is Our Opportunity
 
Hosting public domain chemicals data online for the community – the challenge...
Hosting public domain chemicals data online for the community – the challenge...Hosting public domain chemicals data online for the community – the challenge...
Hosting public domain chemicals data online for the community – the challenge...
 
Keeping Up to Date on Data Management - UC3 Data Curation Workshop
Keeping Up to Date on Data Management - UC3 Data Curation WorkshopKeeping Up to Date on Data Management - UC3 Data Curation Workshop
Keeping Up to Date on Data Management - UC3 Data Curation Workshop
 

More from Bianca Pereira

Dealing with writer's block
Dealing with writer's blockDealing with writer's block
Dealing with writer's blockBianca Pereira
 
HCI Challenges in Crowd4Access Citizen Science project
HCI Challenges in Crowd4Access Citizen Science projectHCI Challenges in Crowd4Access Citizen Science project
HCI Challenges in Crowd4Access Citizen Science projectBianca Pereira
 
Taxonomy Extraction for Customer Service Knowledge Base Construction
Taxonomy Extraction for Customer Service Knowledge Base ConstructionTaxonomy Extraction for Customer Service Knowledge Base Construction
Taxonomy Extraction for Customer Service Knowledge Base ConstructionBianca Pereira
 
How to build your topic?
How to build your topic?How to build your topic?
How to build your topic?Bianca Pereira
 
Dealing with writer's block
Dealing with writer's blockDealing with writer's block
Dealing with writer's blockBianca Pereira
 
Smart Futures presentation at St. Raphael's College
Smart Futures presentation at St. Raphael's CollegeSmart Futures presentation at St. Raphael's College
Smart Futures presentation at St. Raphael's CollegeBianca Pereira
 
Compreensão de Linguagem Natural no Insight: Construindo a Ponte entre Texto ...
Compreensão de Linguagem Natural no Insight: Construindo a Ponte entre Texto ...Compreensão de Linguagem Natural no Insight: Construindo a Ponte entre Texto ...
Compreensão de Linguagem Natural no Insight: Construindo a Ponte entre Texto ...Bianca Pereira
 
Tutorial de Web Semântica - CompSem 2015
Tutorial de Web Semântica - CompSem 2015Tutorial de Web Semântica - CompSem 2015
Tutorial de Web Semântica - CompSem 2015Bianca Pereira
 
DBpedia as Gaeilge Chapter
DBpedia as Gaeilge ChapterDBpedia as Gaeilge Chapter
DBpedia as Gaeilge ChapterBianca Pereira
 
Entity Linking with Multiple Knowledge Bases: an Ontology Modularization Appr...
Entity Linking with Multiple Knowledge Bases: an Ontology Modularization Appr...Entity Linking with Multiple Knowledge Bases: an Ontology Modularization Appr...
Entity Linking with Multiple Knowledge Bases: an Ontology Modularization Appr...Bianca Pereira
 
PhD Day: Entity Linking using Generic Linked Data Datasets
PhD Day: Entity Linking using Generic Linked Data DatasetsPhD Day: Entity Linking using Generic Linked Data Datasets
PhD Day: Entity Linking using Generic Linked Data DatasetsBianca Pereira
 
PhD Day: Entity Linking using Ontology Modularization
PhD Day: Entity Linking using Ontology ModularizationPhD Day: Entity Linking using Ontology Modularization
PhD Day: Entity Linking using Ontology ModularizationBianca Pereira
 
NUIG Research Showcase 2014
NUIG Research Showcase 2014NUIG Research Showcase 2014
NUIG Research Showcase 2014Bianca Pereira
 
AELA: An Adaptive Entity Linking Approach
AELA: An Adaptive Entity Linking ApproachAELA: An Adaptive Entity Linking Approach
AELA: An Adaptive Entity Linking ApproachBianca Pereira
 
How to Make Your Content Smarter
How to Make Your Content SmarterHow to Make Your Content Smarter
How to Make Your Content SmarterBianca Pereira
 
Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Bianca Pereira
 
Reading Group 2014 (Insight NUIG)
Reading Group 2014 (Insight NUIG)Reading Group 2014 (Insight NUIG)
Reading Group 2014 (Insight NUIG)Bianca Pereira
 

More from Bianca Pereira (17)

Dealing with writer's block
Dealing with writer's blockDealing with writer's block
Dealing with writer's block
 
HCI Challenges in Crowd4Access Citizen Science project
HCI Challenges in Crowd4Access Citizen Science projectHCI Challenges in Crowd4Access Citizen Science project
HCI Challenges in Crowd4Access Citizen Science project
 
Taxonomy Extraction for Customer Service Knowledge Base Construction
Taxonomy Extraction for Customer Service Knowledge Base ConstructionTaxonomy Extraction for Customer Service Knowledge Base Construction
Taxonomy Extraction for Customer Service Knowledge Base Construction
 
How to build your topic?
How to build your topic?How to build your topic?
How to build your topic?
 
Dealing with writer's block
Dealing with writer's blockDealing with writer's block
Dealing with writer's block
 
Smart Futures presentation at St. Raphael's College
Smart Futures presentation at St. Raphael's CollegeSmart Futures presentation at St. Raphael's College
Smart Futures presentation at St. Raphael's College
 
Compreensão de Linguagem Natural no Insight: Construindo a Ponte entre Texto ...
Compreensão de Linguagem Natural no Insight: Construindo a Ponte entre Texto ...Compreensão de Linguagem Natural no Insight: Construindo a Ponte entre Texto ...
Compreensão de Linguagem Natural no Insight: Construindo a Ponte entre Texto ...
 
Tutorial de Web Semântica - CompSem 2015
Tutorial de Web Semântica - CompSem 2015Tutorial de Web Semântica - CompSem 2015
Tutorial de Web Semântica - CompSem 2015
 
DBpedia as Gaeilge Chapter
DBpedia as Gaeilge ChapterDBpedia as Gaeilge Chapter
DBpedia as Gaeilge Chapter
 
Entity Linking with Multiple Knowledge Bases: an Ontology Modularization Appr...
Entity Linking with Multiple Knowledge Bases: an Ontology Modularization Appr...Entity Linking with Multiple Knowledge Bases: an Ontology Modularization Appr...
Entity Linking with Multiple Knowledge Bases: an Ontology Modularization Appr...
 
PhD Day: Entity Linking using Generic Linked Data Datasets
PhD Day: Entity Linking using Generic Linked Data DatasetsPhD Day: Entity Linking using Generic Linked Data Datasets
PhD Day: Entity Linking using Generic Linked Data Datasets
 
PhD Day: Entity Linking using Ontology Modularization
PhD Day: Entity Linking using Ontology ModularizationPhD Day: Entity Linking using Ontology Modularization
PhD Day: Entity Linking using Ontology Modularization
 
NUIG Research Showcase 2014
NUIG Research Showcase 2014NUIG Research Showcase 2014
NUIG Research Showcase 2014
 
AELA: An Adaptive Entity Linking Approach
AELA: An Adaptive Entity Linking ApproachAELA: An Adaptive Entity Linking Approach
AELA: An Adaptive Entity Linking Approach
 
How to Make Your Content Smarter
How to Make Your Content SmarterHow to Make Your Content Smarter
How to Make Your Content Smarter
 
Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)
 
Reading Group 2014 (Insight NUIG)
Reading Group 2014 (Insight NUIG)Reading Group 2014 (Insight NUIG)
Reading Group 2014 (Insight NUIG)
 

Adaptive Entity Linking Framework

  • 1. Adaptive Entity Linking
       PhD Day – October/2013
       Bianca Pereira
       www.insight-centre.org
  • 2. Agenda
       • Motivation
       • Problem
       • Proposed Solution
       • Experiments
       • Next Steps
  • 3. Motivation
       • Entity Linking creates links from mentions in text to entities from a structured knowledge base. It:
           – enables reusing knowledge already published on the web.
           – can be used as the first step for ontology learning and population algorithms.
  • 4. Problem
       • Entity Linking has been performed using generic approaches.
       • These approaches do not work for all domains and types of text.
       • There is no clear definition of “entity”.
  • 5. Problem
       • Research Question: “How to adapt a general Entity Linking Approach to a Domain?”
       • Philosophical Question: “What is an Entity?”
  • 6. Proposed Solution
       • Usage of Linked Data datasets.
       • AELA, a Framework for Adaptive Entity Linking.
  • 7. Experiments
       • What is an Entity?
       • What has been identified as entities?
       • How to manually detect entities from text?
       • How does the definition of Entity change from one domain to another?
  • 9. Experiments
       • AIDA-CoNLL annotated dataset
           – 1,387 Reuters documents (some of them are tables)
           – Annotation of entities with links to Wikipedia.
  • 10. Experiments
       • AIDA-CoNLL annotated dataset
           – 1,387 Reuters documents (some of them are tables)
           – Annotation of entities with links to Wikipedia.
       ?
  • 11. Experiments – AIDA CoNLL
       • Proper Nouns: 5576
           – Names beginning with a capital letter
       • Acronyms: 712
           – Names with all letters in upper case
       • Others: 20
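The surface-form split above can be sketched in a few lines of Python. This is only an illustration, not the rule actually used for the counts: the function name `classify_mention` and the exact regular expression are assumptions, chosen so that the example mentions listed on the following slides fall into the same groups (for instance, "U.S." is treated as a proper noun and "1997 FED CUP" as "other").

```python
import re

# Assumed rule: an "acronym" is made only of upper-case letter runs,
# optionally joined by a single space or hyphen (e.g. "AN-NAHAR").
ACRONYM_PATTERN = re.compile(r"[A-Z]+(?:[ -][A-Z]+)*")


def classify_mention(mention: str) -> str:
    """Classify a mention string by its surface form only (hypothetical helper)."""
    if ACRONYM_PATTERN.fullmatch(mention):
        return "acronym"
    # Anything else that starts with a capital letter counts as a proper noun.
    if mention[:1].isupper():
        return "proper_noun"
    # Lower-case or digit-initial mentions such as "van der Sar" or "1860 Munich".
    return "other"


if __name__ == "__main__":
    for m in ["BRUSSELS", "Jimi Hendrix", "U.S.", "van der Sar", "1997 FED CUP"]:
        print(m, "->", classify_mention(m))
```

A rule like this is purely orthographic, which is why mentions such as "LONDON" end up among the acronyms.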
  • 12. AIDA CoNLL – Proper Nouns
       • German
       • British
       • European Commission
       • Germany
       • European Union
       • Britain
       • Commission
       • Franz Fischler
       • France
       • Spanish
       • Loyola de Palacio
       • Europe
       • Bonn
       • Hendrix
       • U.S.
       • Jimi Hendrix
       • English
       • Nottingham
       • Australian
       • China
       • Taiwan
       • Taipei
       • Taiwan Strait
       • Ukraine
       • Taiwanese
       • Lien Chan
       • Chinese
       • Foreign Ministry
  • 14. AIDA CoNLL – “Acronyms”
       • BRUSSELS
       • BSE
       • LONDON
       • BEIJING
       • FRANKFURT
       • GREEK
       • ATHENS
       • BAYERISCHE VEREINSBANK
       • SWEDISH
       • SWEDEN
       • JERUSALEM
       • TUNIS
       • KDPI
       • PUK
       • KDP
       • MANAMA
       • UAE
       • DUBAI
       • BEIRUT
       • AN-NAHAR
       • AS-SAFIR
       • AD-DIYAR
       • CME
       • CHICAGO
       • MONTGOMERY
       • SNET
       • PHOENIX
       • PARIS
  • 15. AIDA CoNLL - Others
       • interior ministry
       • neo-Nazi
       • neo-Nazism
       • post-Soviet
       • van der Sar
       • 1860 Munich
       • serie A
       • 1990 World Cup
       • 1992 European championship
       • 2,000 Guineas
       • 2000 Games
       • pan-Turkism
       • al-Akhbar
       • al-Ram
       • 1997 FED CUP
       • 1998 World Cup
       • 1995 World Cup
       • 1. FC Cologne
       • post-Communist
       • cocker spaniels
  • 18. AIDA CoNLL
       • SOCCER - GERMAN FIRST DIVISION RESULTS / STANDINGS.
         BONN 1996-12-06
         Results of German first division soccer matches played on Friday:
         Bochum 2 Bayer Leverkusen 2
         Werder Bremen 1 1860 Munich 1
         Karlsruhe 3 Freiburg 0
         Schalke 2 Hansa Rostock 0
         Standings (tabulated under played, won, drawn, lost, goals for, goals against, points):
         Bayer Leverkusen 17 10 4 3 38 22 34
         Bayern Munich 16 9 6 1 26 14 33
         VfB Stuttgart 16 9 4 3 39 17 31
         Borussia Dortmund 16 9 4 3 33 17 31
         Karlsruhe 17 8 4 5 30 20 28
         VfL Bochum 16 7 6 3 23 21 27
         1. FC Cologne 16 8 2 6 31 27 26
         Schalke 04 17 7 4 6 25 26 25
         Werder Bremen 17 6 4 7 29 28 22
         MSV Duisburg 16 5 4 7 16 22 19
         SV 1860 Munich 17 4 6 7 25 31 18
         FC St. Pauli 15 5 3 7 21 28 18
         Fortuna Dusseldorf 16 5 3 8 13 24 18
         Hamburger SV 16 4 5 7 20 25 17
         Arminia Bielefeld 16 4 4 8 18 28 16
         FC Hansa Rostock 17 4 3 10 19 26 15
         Borussia Monchengladbach 16 4 3 9 12 22 15
         SC Freiburg 17 4 1 12 20 40 13
  • 19. AIDA CoNLL – Some findings
       • Syntactic structure does not help in all cases.
           – Proper nouns do not always begin with a capital letter.
           – Not all words written entirely in upper case are acronyms.
       • There may be some “mention boundary” problems even in manual annotation.
  • 20. AIDA CoNLL
       • 5596 entities
       • 6308 different mention strings
  • 21. AIDA CoNLL
       • 1110 entities with name variations.
         http://en.wikipedia.org/wiki/New_York_Jets
             New York Jets, NY JETS
         http://en.wikipedia.org/wiki/Butch_Harmon
             Butch Harmon, Butch
         http://en.wikipedia.org/wiki/Norway
             Norway, Norwegian
         http://en.wikipedia.org/wiki/Cincinnati_Reds
             Cincinnati Reds, CINCINNATI Reds
         http://en.wikipedia.org/wiki/Republika_Srpska
             Bosnian Serb, Republika Srpska
         http://en.wikipedia.org/wiki/John_Smoltz
             John Smoltz, Smoltz
         http://en.wikipedia.org/wiki/Rede_Globo
             TV Globo, Globo
         http://en.wikipedia.org/wiki/London_Wasps
             London, Wasps
         http://en.wikipedia.org/wiki/Chicago_Cubs
             CHICAGO, CUBS, Chicago Cubs
         http://en.wikipedia.org/wiki/England_cricket_team
             ENGLAND, Englishmen
         http://en.wikipedia.org/wiki/Alexander_Downer
             Alexander Downer, Downer
         http://en.wikipedia.org/wiki/Wales
             Wales, Welsh
  • 23. AIDA CoNLL – Some findings
       • Use of metonymy.
       • Disambiguation (Norway vs. Norwegians).
       • Mentioning an entity using only part of its name.
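Finding the entities that have name variations amounts to grouping mention strings by the entity they link to. The sketch below assumes the annotations are already available as (mention string, entity URL) pairs. The function name `name_variations` and that input shape are illustrative assumptions, not the format of the AIDA-CoNLL files.

```python
from collections import defaultdict


def name_variations(annotations):
    """Map each entity URL to its mention strings, keeping only entities
    that were mentioned with more than one distinct string.

    `annotations` is an iterable of (mention, entity_url) pairs
    (an assumed, simplified input format).
    """
    mentions_by_entity = defaultdict(set)
    for mention, entity_url in annotations:
        mentions_by_entity[entity_url].add(mention)
    return {
        entity_url: sorted(mentions)
        for entity_url, mentions in mentions_by_entity.items()
        if len(mentions) > 1
    }


if __name__ == "__main__":
    sample = [
        ("Norway", "http://en.wikipedia.org/wiki/Norway"),
        ("Norwegian", "http://en.wikipedia.org/wiki/Norway"),
        ("Bonn", "http://en.wikipedia.org/wiki/Bonn"),
    ]
    print(name_variations(sample))
```

Note that the comparison is case-sensitive, so "Chicago Cubs" and "CHICAGO" count as different mention strings, as they do on the slides.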
  • 24. AIDA CoNLL
       • 434 ambiguous mention strings (corpus level)
         French
             http://en.wikipedia.org/wiki/France
             http://en.wikipedia.org/wiki/France_national_football_team
         NORTHAMPTON
             http://en.wikipedia.org/wiki/Northampton
             http://en.wikipedia.org/wiki/Northampton_Town_F.C.
             http://en.wikipedia.org/wiki/Northamptonshire_County_Cricket_Club
             http://en.wikipedia.org/wiki/Northampton_Saints
         West
             http://en.wikipedia.org/wiki/Western_World
             http://en.wikipedia.org/wiki/American_League_West
         Volkswagen AG
             http://en.wikipedia.org/wiki/Volkswagen
             http://en.wikipedia.org/wiki/Volkswagen_Group
         EDMONTON
             http://en.wikipedia.org/wiki/Edmonton
             http://en.wikipedia.org/wiki/Edmonton_Oilers
         Rangers
             http://en.wikipedia.org/wiki/Texas_Rangers_(baseball)
             http://en.wikipedia.org/wiki/Rangers_F.C.
         Vatican
             http://en.wikipedia.org/wiki/Holy_See
             http://en.wikipedia.org/wiki/Vatican_Library
             http://en.wikipedia.org/wiki/Vatican_City
         Shell
             http://en.wikipedia.org/wiki/Shell_Turbo_Chargers
             http://en.wikipedia.org/wiki/Shell_Oil_Company
         Irish
             http://en.wikipedia.org/wiki/Republic_of_Ireland
             http://en.wikipedia.org/wiki/Republic_of_Ireland_national_football_team
             http://en.wikipedia.org/wiki/Northern_Ireland
  • 25. AIDA CoNLL
       • 190 ambiguous mention strings (document level)
         17 Iraq: BAGHDAD
             http://en.wikipedia.org/wiki/Baghdad
             http://en.wikipedia.org/wiki/Iraq
         965testa SOCCER: SILVA
             http://en.wikipedia.org/wiki/Mario_Silva
             http://en.wikipedia.org/wiki/Mauro_Silva
         1102testa SOCCER: WORLD CUP
             http://en.wikipedia.org/wiki/1998_FIFA_World_Cup
             http://en.wikipedia.org/wiki/FIFA_World_Cup
         791 PRESS: Chinese
             http://en.wikipedia.org/wiki/People’s_Republic_of_China
             http://en.wikipedia.org/wiki/Chinese_language
         179 Soccer: Liechenstein
             http://en.wikipedia.org/wiki/Liechtenstein_national_football_team
             http://en.wikipedia.org/wiki/Liechtenstein
         703 Cricket: Pakistan
             http://en.wikipedia.org/wiki/Pakistan_national_cricket_team
             http://en.wikipedia.org/wiki/Pakistan
         1323testb Frankfurt: Frankfurt
             http://en.wikipedia.org/wiki/Frankfurt_Stock_Exchange
             http://en.wikipedia.org/wiki/Frankfurt_am_Main
         1054testa CRICKET: ENGLAND
             http://en.wikipedia.org/wiki/England_cricket_team
             http://en.wikipedia.org/wiki/England
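The two ambiguity counts differ only in the scope over which a mention string is compared: the whole corpus, or a single document. A minimal sketch, assuming annotations are given as (document id, mention string, entity URL) triples. The function name `ambiguous_mentions` and the input format are illustrative assumptions.

```python
from collections import defaultdict


def ambiguous_mentions(annotations):
    """Return (corpus_level, document_level) sets of ambiguous mention strings.

    A mention string is ambiguous at corpus level if it links to more than
    one entity anywhere in the corpus, and at document level if it links to
    more than one entity within the same document.
    """
    entities_in_corpus = defaultdict(set)
    entities_in_document = defaultdict(set)
    for doc_id, mention, entity_url in annotations:
        entities_in_corpus[mention].add(entity_url)
        entities_in_document[(doc_id, mention)].add(entity_url)

    corpus_level = {m for m, ents in entities_in_corpus.items() if len(ents) > 1}
    document_level = {
        m for (_, m), ents in entities_in_document.items() if len(ents) > 1
    }
    return corpus_level, document_level


if __name__ == "__main__":
    wiki = "http://en.wikipedia.org/wiki/"
    sample = [
        ("1054testa", "ENGLAND", wiki + "England_cricket_team"),
        ("1054testa", "ENGLAND", wiki + "England"),
        ("doc-a", "French", wiki + "France"),
        ("doc-b", "French", wiki + "France_national_football_team"),
    ]
    corpus_level, document_level = ambiguous_mentions(sample)
    print(sorted(corpus_level), sorted(document_level))
```

Every mention string that is ambiguous within a document is also ambiguous at corpus level, which is consistent with the document-level count being the smaller of the two.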
  • 26. AIDA CoNLL – Some findings
       • Even misspelled text is marked.
       • “Classes” and “instances” are annotated.
  • 27. AIDA CoNLL
       • 39 Classes
         http://dbpedia.org/ontology/Agent                   2579
         http://xmlns.com/foaf/0.1/Person                    426
         http://dbpedia.org/ontology/Place                   333
         http://dbpedia.org/ontology/City                    234
         http://dbpedia.org/ontology/Country                 194
         http://dbpedia.org/ontology/AdministrativeRegion    76
         http://dbpedia.org/ontology/Newspaper               55
         http://dbpedia.org/ontology/ArchitecturalStructure  39
         http://dbpedia.org/ontology/EthnicGroup             30
         http://dbpedia.org/ontology/Airport                 21
         http://dbpedia.org/ontology/Event                   18
         http://dbpedia.org/ontology/Island                  12
         http://dbpedia.org/ontology/Film                    10
         http://dbpedia.org/ontology/BodyOfWater             10
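A class-frequency table like the one above can be produced by counting, for each class, how many linked entities carry it. The slides do not say how the classes were retrieved, so the sketch below simply assumes a mapping from entity URL to its class URIs is already in hand. The function name `class_frequencies` and the sample data are hypothetical.

```python
from collections import Counter


def class_frequencies(entity_types):
    """Count how many entities belong to each class.

    `entity_types` maps an entity URL to an iterable of class URIs
    (assumed to have been looked up beforehand, e.g. from DBpedia).
    Returns (class_uri, count) pairs, most frequent first.
    """
    counts = Counter()
    for types in entity_types.values():
        # Use a set so a class listed twice for one entity is counted once.
        counts.update(set(types))
    return counts.most_common()


if __name__ == "__main__":
    dbo = "http://dbpedia.org/ontology/"
    sample = {
        "http://en.wikipedia.org/wiki/Norway": [dbo + "Country", dbo + "Place"],
        "http://en.wikipedia.org/wiki/Bonn": [dbo + "City", dbo + "Place"],
    }
    for class_uri, count in class_frequencies(sample):
        print(class_uri, count)
```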
  • 28. AIDA CoNLL – Some findings
       • Not only Person, Location and Organization.
  • 29. Experiments
       • How were those entities annotated?
       • Which Wikipedia pages were chosen as representing entities?
  • 30. Experiments
       • How were those entities annotated?
       • Which Wikipedia pages were chosen as representing entities?
       • What is the Annotation Guideline?
  • 31. Experiments
       • What is an Entity?
       • What has been identified as entities?
       • How to manually detect entities from text?
       • How does the definition of Entity change from one domain to another?
  • 32. Experiments
       • Survey on Annotation Guidelines
           – Question: “Is there any guideline for entity annotation?”
           – Search Strategy:
               • Papers found by searching for “entity annotation guidelines”.
               • Guidelines from annotated corpora provided by Entity Recognition, Disambiguation and Linking challenges.
  • 33. Experiments
       • Survey on Annotation Guidelines
           – Common Problems (differ from one domain to another)
               • Mention Boundaries
               • Name variations
               • Metonymy
           – Annotation Process
           – Evaluation
  • 34. Next Steps
       • Corpus Sampling for Annotation
       • Development of Annotation Guidelines
           – Domain/Task dependent
           – Iterative Process
       • Domains:
           – Touristic Domain (TripAdvisor corpus)
           – Electronics Domain
           – Other
  • 35. Next Steps
       • What is an Entity?
       • What has been identified as entities?
       • How to manually detect entities from text?
       • How does the definition of Entity change from one domain to another?
  • 36. Next Steps
       • What is an Entity?
       • What has been identified as entities?
       • How to manually detect entities from text?
       • How does the definition of Entity change from one domain to another?
       • How to identify the most frequent classes in a domain?