EVALITA 2018
EVALUATION OF NLP AND SPEECH TOOLS FOR ITALIAN
Overview of the EVALITA 2018 Solving
language games (NLP4FUN) Task
Pierpaolo Basile, Marco de Gemmis
Lucia Siciliani, Giovanni Semeraro
Dipartimento di Informatica
Università degli Studi di Bari Aldo Moro, Italy
EVALITA 2018 Workshop
December 12-13 2018, Turin
“La Ghigliottina”
The solution is pacco:
✓ “Pacco, doppio pacco e contropaccotto” (movie)
✓ Carta da pacco
✓ Pacco di soldi
✓ Pacco di pasta
✓ Pacco regalo
Motivation
● Language Games have attracted the attention of
researchers in the fields of AI and NLP
○ Jeopardy!, crossword puzzles
● “La Ghigliottina” is a challenging language
game which demands knowledge covering a
broad range of topics
○ take advantage of the availability of open
repositories and the web
○ a cultural and linguistic background is
necessary to understand the clues
Task and dataset
● The task: given a set of five words (the
clues), each linked in some way to a
specific word, find that word, which is the
unique solution of the game
○ clues are unrelated to each other
○ the player has one minute to find the
solution
● Dataset: set of games taken from
○ the TV show “L’Eredità”
○ the board game “L’Eredità”
Data format
<games>
  <game>
    <id>3fc953bd...</id>
    <clue>uomo</clue>
    <clue>cane</clue>
    <clue>musica</clue>
    <clue>casa</clue>
    <clue>pietra</clue>
    <solution>chiesa</solution>
    <type>TV</type>
  </game>
  ...
</games>
● XML format
● a root element games which contains several game elements
● each game has five clue elements and one solution
● the element type specifies the type of the game: TV or board game
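A minimal sketch of reading this format with Python's standard library. The `Game` dataclass, the `parse_games` helper and the sample id `example-game-id` are illustrative names, not part of the task distribution; the sketch also assumes the `solution` and `type` elements may be absent in some files.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

# Inline sample in the format shown above; the id is a made-up placeholder.
SAMPLE_XML = """<games>
  <game>
    <id>example-game-id</id>
    <clue>uomo</clue>
    <clue>cane</clue>
    <clue>musica</clue>
    <clue>casa</clue>
    <clue>pietra</clue>
    <solution>chiesa</solution>
    <type>TV</type>
  </game>
</games>"""


@dataclass
class Game:
    game_id: str
    clues: List[str]
    solution: Optional[str]   # None if the file carries no solution
    game_type: Optional[str]  # e.g. "TV"; None if missing


def parse_games(xml_text: str) -> List[Game]:
    """Parse a <games> document into a list of Game records."""
    root = ET.fromstring(xml_text)
    games = []
    for node in root.iter("game"):
        games.append(
            Game(
                game_id=(node.findtext("id") or "").strip(),
                clues=[(c.text or "").strip() for c in node.findall("clue")],
                solution=node.findtext("solution"),
                game_type=node.findtext("type"),
            )
        )
    return games


if __name__ == "__main__":
    for game in parse_games(SAMPLE_XML):
        print(game.game_id, game.clues, game.solution, game.game_type)
```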
Output
The participants must return a ranked list of
solutions in a plain text file:
id solution score rank time
For example:
3fc953bd-... porta 0.978 1 3459
3fc953bd-... chiesa 0.932 2 3251
3fc953bd-... santo 0.897 3 4321
...
3fc953bd-... carta 0.321 100 2343
MAX 100 candidate solutions for each game
The time taken by the system to compute the solution is reported in milliseconds
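The output rows can be produced with a few lines of Python. This is a sketch under assumptions: the field separator is taken to be a single space (the slides do not state it), scores are printed with three decimals as in the example, and `format_ranked_output` is a hypothetical helper name.

```python
from typing import List, Tuple

MAX_CANDIDATES = 100  # limit stated for each game

# One candidate: (solution, score, time in milliseconds)
Candidate = Tuple[str, float, int]


def format_ranked_output(game_id: str, candidates: List[Candidate]) -> List[str]:
    """Return lines of the form 'id solution score rank time'.

    Candidates are sorted by decreasing score and truncated to the
    first MAX_CANDIDATES entries; rank starts at 1.
    ASSUMPTION: fields are separated by a single space.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)[:MAX_CANDIDATES]
    return [
        f"{game_id} {solution} {score:.3f} {rank} {time_ms}"
        for rank, (solution, score, time_ms) in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    rows = format_ranked_output(
        "example-game-id",
        [("chiesa", 0.932, 3251), ("porta", 0.978, 3459), ("santo", 0.897, 4321)],
    )
    print("\n".join(rows))
```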
Dataset: statistics
● Games have different levels of difficulty
○ instances taken both from the TV game and
from the official board game
● Training set: 315 instances of the game
○ 64.8% (TV game), 35.2% (board game)
● Test set: 105 instances of the game
○ 62.9% (TV game)
○ 37.1% (board game)
● 300 fake games (automatically created)
added to the evaluation data
Evaluation
● a (time) weighted version of Mean
Reciprocal Rank (MRR)
● G is the set of games
● r_g is the rank of the solution
● t_g denotes the minutes taken by the system
to give the solution
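A sketch of how a time-weighted MRR of this kind could be computed. The exact weighting formula is not given in the text above, so the weight used here, `max(0.05, (t_max - t_g) / t_max)` with `t_max` set to 10 minutes, is an assumption for illustration only; the function names are also hypothetical. Passing a constant weight of 1 gives the standard MRR.

```python
from typing import Callable, Dict, Optional, Tuple


def assumed_time_weight(minutes: float, t_max: float = 10.0, floor: float = 0.05) -> float:
    """ASSUMPTION: linear decay with time, never below `floor`.

    The functional form, t_max and floor are illustrative choices,
    not values confirmed by the slides.
    """
    return max(floor, (t_max - minutes) / t_max)


def weighted_mrr(
    results: Dict[str, Tuple[Optional[int], float]],
    weight: Callable[[float], float] = assumed_time_weight,
) -> float:
    """Average over all games g in G of weight(t_g) / r_g.

    `results` maps a game id to (r_g, t_g): the rank of the correct
    solution (None if it was not returned) and the minutes taken.
    Games without the correct solution contribute 0.
    """
    if not results:
        return 0.0
    total = 0.0
    for rank, minutes in results.values():
        if rank is None:
            continue
        total += weight(minutes) / rank
    return total / len(results)


def standard_mrr(results: Dict[str, Tuple[Optional[int], float]]) -> float:
    """Unweighted MRR: every game has weight 1 regardless of time."""
    return weighted_mrr(results, weight=lambda minutes: 1.0)


if __name__ == "__main__":
    demo = {"a": (1, 0.0), "b": (2, 5.0), "c": (None, 1.0)}
    print(standard_mrr(demo), weighted_mrr(demo))
```

Since the output file reports time in milliseconds, a caller would divide by 60000 to obtain minutes before passing values to these functions.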
Participants
● 12 registered teams
● only 2 teams submitted results
○ UNIOR4NLP: the idea is that clue words and
the corresponding solution are often part of a
multiword expression (multiword expressions
are filtered by linguistic patterns)
○ LucaSquadrone: co-occurrences of clues and
candidate solutions
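To make the co-occurrence idea concrete, here is a toy sketch. It is not either participant's system: the corpus, the clue list (inferred from the "pacco" expressions shown earlier) and the scoring rule (number of distinct clues a candidate co-occurs with, then raw co-occurrence count) are all assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def rank_by_cooccurrence(
    clues: Iterable[str], phrases: Iterable[str], top_k: int = 100
) -> List[Tuple[str, int, int]]:
    """Rank candidate words by co-occurrence with the clues.

    Returns (candidate, distinct_clues_matched, total_cooccurrences),
    sorted by both counts in decreasing order, ties broken alphabetically.
    """
    clue_set: Set[str] = {c.lower() for c in clues}
    matched_clues: Dict[str, Set[str]] = defaultdict(set)
    counts: Dict[str, int] = defaultdict(int)
    for phrase in phrases:
        tokens = set(phrase.lower().split())
        present = tokens & clue_set
        if not present:
            continue
        # Every non-clue token in the phrase is a candidate solution.
        for token in tokens - clue_set:
            matched_clues[token] |= present
            counts[token] += len(present)
    ranked = sorted(
        ((w, len(matched_clues[w]), counts[w]) for w in counts),
        key=lambda item: (-item[1], -item[2], item[0]),
    )
    return ranked[:top_k]


if __name__ == "__main__":
    # Toy corpus; a real system would use large text collections.
    corpus = [
        "carta da pacco",
        "pacco di soldi",
        "pacco di pasta",
        "pacco regalo",
        "pacco doppio pacco e contropaccotto",
        "carta da lettere",
        "piatto di pasta",
    ]
    clues = ["doppio", "carta", "soldi", "pasta", "regalo"]
    for row in rank_by_cooccurrence(clues, corpus, top_k=3):
        print(row)
```

Function words such as "di" and "da" also collect co-occurrences, which is one reason a practical system would filter candidates, for instance with stop-word lists or linguistic patterns.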
Results
● UNIOR4NLP reports a very high MRR: the
system is able to place the solution in the
first positions
● The Squadrone system takes more time to
solve games, so MRR ≠ MRR (std)

System      MRR      MRR (std)   Solved
UNIOR4NLP   0.6428   0.6428      81.90%
Squadrone   0.0134   0.0350      25.71%
Comments
Reported results are remarkable but some
difficult games requiring inference are
unsolved:
● uno, notte, la trippa, auto, palazzo → portiere
○ uno is the number generally assigned to the
role of the goalkeeper (portiere)
○ “La Trippa” is the surname of “Antonio La
Trippa”, a character in the Italian movie “Gli
onorevoli”, who works as the porter (portiere) of
a building
Conclusions
● Challenging task
● Good results when the solution is a
multiword expression
○ inference is hard to tackle
● Few participants
○ Is the task too difficult?
○ Do non-classification tasks attract few
participants?
● Mobile app “Ghigliottiniamo”
○ integrate your artificial player through a REST API,
contact support@quiztime.io
Thank you!
Download our dataset from the GitHub
EVALITA 2018 repository
https://github.com/evalita2018/data
