Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
A Fully-automatic Approach to Answer Geographic Queries: GIRSA-WP at GikiP
1. A fully-automatic approach
to answer geographic queries:
GIRSA-WP at GikiP
Johannes Leveling Sven Hartrumpf
Intelligent Information and Communication Systems (IICS)
University of Hagen (FernUniversität in Hagen)
58084 Hagen, Germany
firstname.lastname@fernuni-hagen.de
2. GIRSA-WP
J. Leveling,
S. Hartrumpf
Main idea
Main idea
GIRSA-WP
InSicht (Hartrumpf, 2005)
Semantic filter • open-domain QA system
Experiments • based on matching semantic network representations
and Results of question and documents
Conclusions • supports question decomposition
References e.g. temporal or geographical constraints
+ GIRSA (Leveling and Hartrumpf, 2008)
• textual GIR system
• supports methods to boost recall
e.g. normalizing location indicators
• supports methods to boost precision
e.g. metonymy recognition
= GIRSA-WP (GIRSA for Wikipedia)
• automatic combination of InSicht and GIRSA
J. Leveling, S. Hartrumpf GIRSA-WP 2/9
3. GIRSA-WP
J. Leveling,
S. Hartrumpf
GIRSA-WP
Main idea
GIRSA-WP
Semantic filter
Experiments
and Results
• applies semantic filter on answer candidates
Conclusions • merges results from InSicht and GIRSA by using the
References maximum score of documents
• returns list of Wikipedia article names
• simple multilingual approach:
follow German Wikipedia links to articles in English and
Portuguese
J. Leveling, S. Hartrumpf GIRSA-WP 3/9
4. GIRSA-WP
J. Leveling,
S. Hartrumpf
Semantic filter (1/2)
Main idea
GIRSA-WP
• in QA: check expected answer type of answer
Semantic filter
candidates
Experiments
and Results • for GIRSA-WP: check semantic answer types
Conclusions (semantic sort and features, see Helbig (2006))
References • extract word representing the answer type from topic
title and description (the first noun not a proper noun)
• parse these words with WOCADI, a syntactico-semantic
parser (includes a disambiguation of words) and find
semantic features corresponding to the extracted words
• parse the answer candidates (titles of Wikipedia
articles) and determine their semantic features
• test if unification of semantic features succeeds;
discard answer candidate, otherwise
J. Leveling, S. Hartrumpf GIRSA-WP 4/9
5. GIRSA-WP
J. Leveling,
S. Hartrumpf
Semantic filter (2/2)
Main idea
GIRSA-WP
• Which Swiss cantons border Germany?
Semantic filter
Experiments
→ extracted word: cantons
and Results • parse result: corresponding concept is canton
Conclusions
• artificial geographical entity or regional institution
References
• legal-person:+, movable:–, etc.
• answer candidate Cross-Border-Leasing:
• prototypical-theoretical-concept
• legal-person:–, movable:–
→ semantic features not unifiable
• answer candidate Aargau:
→ unifiable semantic features
J. Leveling, S. Hartrumpf GIRSA-WP 5/9
6. GIRSA-WP
J. Leveling,
S. Hartrumpf
Experiments and results
Main idea
GIRSA-WP
Semantic filter
Experiments • six runs submitted:
and Results
three with threshold score of 0.01 and
Conclusions
References
varied settings for stemming, location name
normalization, and noun decompounding;
additional three experiments with threshold
score of 0.03
• 798 (372) answers found
• 79 correct answers in best run
J. Leveling, S. Hartrumpf GIRSA-WP 6/9
7. GIRSA-WP
J. Leveling,
S. Hartrumpf
Conclusions (1/2)
Main idea
GIRSA-WP
Semantic filter
GikiP topics
Experiments • are at least as difficult as QA or GeoCLEF topics
and Results
Conclusions
• aim at a wider range of expected answer types
References • include complex geographic relations
(GP2: outside, GP4: on the border ),
restrictions on measurable properties
(GP3: more than, GP13: longer than), and
temporal constraints
(GP9: Renaissance, GP15: between 1980 and 1990)
⇒ new challenge for QA and GIR community
J. Leveling, S. Hartrumpf GIRSA-WP 7/9
8. GIRSA-WP
J. Leveling,
S. Hartrumpf
Conclusions (2/2)
Main idea
GIRSA-WP
• GIRSA:
Semantic filter
• indexing single sentences was meant to ensure a high
Experiments
and Results precision (but did not work);
Conclusions • geographic entities have not been annotated at all in the
References Wikipedia documents
• InSicht:
• important information is given in tables (like inhabitant
numbers), but WOCADI ignores these
• the semantic matching approach is still too strict for the
IR oriented parts of GikiP queries (similarly for
GeoCLEF)
⇒ tasks for future work
J. Leveling, S. Hartrumpf GIRSA-WP 8/9
9. GIRSA-WP
J. Leveling,
S. Hartrumpf
Selected References
Main idea
GIRSA-WP Hartrumpf, S. (2005). Question answering using sentence parsing and
Semantic filter semantic network matching. In Multilingual Information Access for
Experiments Text, Speech and Images: 5th Workshop of the Cross-Language
and Results Evaluation Forum, CLEF 2004 (edited by Peters, C.; Clough, P.;
Conclusions Gonzalo, J.; Jones, G. J. F.; Kluck, M.; and Magnini, B.), volume 3491
References of LNCS, pp. 512–521. Berlin: Springer.
Helbig, H. (2006). Knowledge Representation and the Semantics of
Natural Language. Berlin: Springer.
Leveling, J. and Hartrumpf, S. (2008). Inferring location names for
geographic information retrieval. In Advances in Multilingual and
Multimodal Information Retrieval: 8th Workshop of the
Cross-Language Evaluation Forum, CLEF 2007 (edited by Peters, C.;
Jijkoun, V.; Mandl, T.; Müller, H.; Oard, D. W.; Peñas, A.; Petras, V.;
and Santos, D.), volume 5152 of LNCS, pp. 773–780. Berlin:
Springer.
J. Leveling, S. Hartrumpf GIRSA-WP 9/9