3. Information Retrieval
Usual IR process
[Diagram: the usual IR process. Documents and the query each go through indexing, yielding document representations and a query representation; matching the two produces a list of estimated relevant documents.]
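A minimal sketch of this pipeline in Python (the toy collection and query are illustrative; scikit-learn's TF-IDF stands in for a real indexing component):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy collection; a real system indexes millions of documents.
documents = [
    "craft brewed ales and tasty wood grilled food",
    "restaurant and piano lounge downtown",
    "city park with trails and playground",
]

vectorizer = TfidfVectorizer()                             # indexing
doc_repr = vectorizer.fit_transform(documents)             # document representations
query_repr = vectorizer.transform(["brew pub with food"])  # query representation

scores = cosine_similarity(query_repr, doc_repr)[0]        # matching
for i in scores.argsort()[::-1]:                           # estimated relevant docs, best first
    print(round(float(scores[i]), 3), documents[i])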
4. Contextual Information Retrieval
Notion of context in IR
Context covers several dimensions: information, user, software, device.
How to consider context in the IR process?
Q1: Retrieve items corresponding to the context
Q2: Retrieve the context corresponding to items
5. Contextual IR
Context integration in Q1
[Diagram: the same pipeline with context added. Documents, the query, and the context are indexed into representations; matching query and document representations produces a first list of estimated relevant documents, which the context representation is then used to rerank into the final list of estimated relevant documents.]
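A sketch of one way to integrate context at the reranking stage (the linear combination and the weight alpha are illustrative assumptions, not the only option):

def rerank(results, context_score, alpha=0.7):
    """results: list of (doc_id, ir_score) pairs from the matching step.
    context_score: function doc_id -> [0, 1] measuring fit to the context
    (e.g. geographic proximity). alpha balances topical vs. contextual relevance."""
    rescored = [(doc_id, alpha * ir_score + (1 - alpha) * context_score(doc_id))
                for doc_id, ir_score in results]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)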
6. TREC
Text Retrieval Conference
Organized by NIST (USA) since 1992
Based on the Cranfield paradigm of retrieval system evaluation
A set of documents (Collection)
A set of information needs (Topics/Queries)
A set of relevance judgments (Qrels)
Various tracks: AdHoc, Robust, Web…
Evaluation measures
System output: retrieved documents. Information need: relevant documents. Each document in the collection falls into one of four cells:
A = relevant, retrieved (true positive)
B = irrelevant, retrieved (false positive)
C = relevant, not retrieved (false negative)
D = irrelevant, not retrieved (true negative)
precision = A / (A + B)
recall = A / (A + C)
AP (Average Precision),
MAP (Mean Average Precision),
P@5 (Precision at 5 retrieved documents)
…
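These measures are simple to compute from a ranked result list and a set of judged-relevant documents; a sketch (function names are ours):

def precision_at_k(ranked, relevant, k=5):
    """P@k: fraction of the top k retrieved documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k

def average_precision(ranked, relevant):
    """AP: mean of the precision values at the rank of each relevant document,
    normalized by the total number of relevant documents.
    MAP is this value averaged over all queries."""
    hits, ap = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            ap += hits / rank
    return ap / len(relevant) if relevant else 0.0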
8. TREC Contextual Suggestion Track 2012
Where to go around here on this Sunday afternoon?
Great summer!!!
9. TREC Contextual Suggestion Track 2012
Retrieve items corresponding to the context (Q1)
Items = Suggestions
Places to visit (shops, restaurants, parks…) around the user (5 hours by car max.)
Collection = Open Web (Websites)
Context =
Spatiotemporal data
<context number="1">
  <city>Portland</city>
  <state>Oregon</state>
  <lat>45.5</lat>
  <long>-122.7</long>
  <day>weekday</day>
  <time>evening</time>
  <season>fall</season>
</context>
User preferences
<profile number="1">
  <example number="1" initial="1" final="1"/>
  <example number="2" initial="0" final="-1"/>
</profile>

<example number="1">
  <title>Dogfish Head Alehouse</title>
  <description>Craft Brewed Ales and tasty wood grilled food</description>
  <url>http://www.dogfishalehouse.com/</url>
</example>
<example number="2">
  <title>The Flaming Pit</title>
  <description>The Flaming Pit Restaurant and Piano Lounge, home of Tyrone DeMonke.</description>
  <url>http://www.flamingpitrestaurant.com/</url>
</example>
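A sketch of reading these profiles and examples with Python's standard library (it assumes the snippets above are wrapped in a single root element; the file name is hypothetical and the actual track files may be packaged differently):

import xml.etree.ElementTree as ET

root = ET.parse("profiles.xml").getroot()  # hypothetical file name

# Ratings per user: example number -> (initial, final).
for profile in root.iter("profile"):
    ratings = {ex.get("number"): (int(ex.get("initial")), int(ex.get("final")))
               for ex in profile.iter("example")}
    print("profile", profile.get("number"), ratings)

# Descriptions of the rated example suggestions (the <example> elements
# with children, as opposed to the empty rating elements inside <profile>).
examples = {ex.get("number"): ex.findtext("description", default="").strip()
            for ex in root.iter("example") if ex.find("title") is not None}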
10. TREC Contextual Suggestion Track 2012
Two subtasks
S1: Suggestions corresponding to spatiotemporal data
List of suggestions for each context
S2: S1 + user preferences
List of suggestions for each profile (user) and each context
Suggestion = Title + Description + URL
<context2012 groupid="waterloo" runid="watcs12a">
  <suggestion profile="1" context="1" rank="1">
    <title>Deschutes Brewery Portland Public House</title>
    <description>Deschutes Brewery's distinct Northwest brew pub in Portland's Pearl District has become a convivial gathering spot of beer and food lovers since its 2008 opening.</description>
    <url>http://www.deschutesbrewery.com</url>
  </suggestion>
  etc.
</context2012>
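A sketch of producing this run format with the standard library (the group/run identifiers are copied from the example above; the description is truncated and the output file name is hypothetical):

import xml.etree.ElementTree as ET

run = ET.Element("context2012", groupid="waterloo", runid="watcs12a")
sug = ET.SubElement(run, "suggestion", profile="1", context="1", rank="1")
ET.SubElement(sug, "title").text = "Deschutes Brewery Portland Public House"
ET.SubElement(sug, "description").text = "Deschutes Brewery's distinct Northwest brew pub..."
ET.SubElement(sug, "url").text = "http://www.deschutesbrewery.com"
ET.ElementTree(run).write("watcs12a.xml", encoding="utf-8")  # hypothetical output file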
2 “runs” maximum
Our participation
Team: G. Cabanac & G. Hubert (IRIT – Univ. of Toulouse)
2 runs submitted to the S2 subtask
11. TREC Contextual Suggestion Track 2012: Our approach
Contextual IRS framework 2012
[Diagram: contextual IRS framework 2012 (inputs, internal processes, intermediate data, external resources, outputs). Context processing turns each context into a place query; place selection submits it to the Google Places API, returning a contextual list of places; place description enrichment (Bing, Google) yields a contextual list of detailed places. In parallel, preference processing/definition derives positive and negative preferences from the examples rated by each user (profile). Personalization then ranks the detailed places against these preferences, outputting the personalized suggestions.]
12. TREC Contextual Suggestion Track 2012: Our approach
Input: spatiotemporal data + user preferences
Scoring a candidate suggestion r against a user profile P:
Coarse-grained approach: iritSplit3CPv1
Merge the descriptions of all examples with initial = final = 1 into Pref+(P), and those with initial = final = -1 into Pref−(P).
score(P, r) = cosine(Pref+(P), r) − cosine(Pref−(P), r)
Fine-grained approach: iritSplit3CPv2
Keep each example description separate: Pref+_l(P) for examples with initial = final = 1, Pref−_m(P) for those with initial = final = -1.
score(P, r) = max_l cosine(Pref+_l(P), r) − max_m cosine(Pref−_m(P), r)
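A sketch of both scoring schemes over raw description strings, assuming TF-IDF vectors and cosine similarity from scikit-learn (the runs' actual preprocessing and weighting are not detailed on this slide):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def coarse_score(pos_descs, neg_descs, suggestion):
    """iritSplit3CPv1: merge all liked (resp. disliked) example descriptions
    into one pseudo-document, then take the difference of the two cosines."""
    pos, neg = " ".join(pos_descs), " ".join(neg_descs)
    m = TfidfVectorizer().fit([pos, neg, suggestion]).transform([pos, neg, suggestion])
    return (cosine_similarity(m[0], m[2]) - cosine_similarity(m[1], m[2]))[0, 0]

def fine_score(pos_descs, neg_descs, suggestion):
    """iritSplit3CPv2: keep example descriptions separate and use the best
    cosine on each side."""
    vec = TfidfVectorizer().fit(pos_descs + neg_descs + [suggestion])
    r = vec.transform([suggestion])
    best = lambda descs: (max(cosine_similarity(vec.transform([d]), r)[0, 0]
                              for d in descs) if descs else 0.0)
    return best(pos_descs) - best(neg_descs)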
13. TREC Contextual Suggestion Track 2012: Results
Evaluations
For each profile and each context
Different dimensions: W (Website), G (Geographical), T (Temporal), and D (Description), and combinations (WGT and GT)
Two measures: P@5 and MRR (Mean Reciprocal Rank)
[Result charts for runs iritSplit3CPv1 and iritSplit3CPv2 not reproduced.]
17. TREC Contextual Suggestion Track 2013
Context =
Spatial only
{
  "1": {
    "lat": "40.71427", "city": "New York City", "state": "NY", "long": "-74.00597"
  },
  …
}
Example suggestions
{
  "1": {
    "url": "http://www.freshrestaurants.ca",
    "description": "Our vegan menu boasts an array of exotic starters, multi-layered salads, filling wraps, high protein burgers and our signature Fresh bowls.",
    "title": "Fresh on Bloor"
  },
  "2": {
    "url": "http://www.flamingpitrestaurant.com/",
    "description": "The Flaming Pit Restaurant and Piano Lounge, home of Tyrone DeMonke.",
    "title": "The Flaming Pit"
  },
  …
}
User preferences
{
  "1": [
    {"attraction_id": 1, "website": 1, "description": 0},
    ...
  ],
  "2": [
    {"attraction_id": 1, "website": 4, "description": 3},
    …
  ],
  "3": [
    {"attraction_id": 1, "website": -1, "description": 2},
    …
  ],
  …
}
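A sketch of turning these per-aspect ratings into the positive/negative example sets a 2012-style scoring expects (the file name, thresholds, and the both-aspects rule are our illustrative assumptions, not the track's definition):

import json

with open("profiles2013.json") as f:  # hypothetical file name
    profiles = json.load(f)

def split_preferences(ratings, liked_min=3):
    """ratings: list of {"attraction_id", "website", "description"} dicts.
    Liked = both aspects rated at least liked_min; disliked = any negative
    rating. Both cutoffs are illustrative."""
    liked = [r["attraction_id"] for r in ratings
             if min(r["website"], r["description"]) >= liked_min]
    disliked = [r["attraction_id"] for r in ratings
                if min(r["website"], r["description"]) < 0]
    return liked, disliked

for user, ratings in profiles.items():
    liked, disliked = split_preferences(ratings)
    print(user, "liked:", liked, "disliked:", disliked)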
18. TREC Contextual Suggestion Track 2013
Two subtasks
Open Web
Same question: suggest items (places) corresponding to the context (Q1)
Places to visit (restaurants, museums…) around the user (5 hours by car)
Collection = Open Web (Websites)
ClueWeb
ClueWeb12 (same question as Open Web)
ClueWeb12 contextual suggestion subcollection: sets of ClueWeb12 documents per context
Question: personalization per user profile
2 “runs” maximum
Our participation
Team: G. Cabanac, G. Hubert & K. Pinel-Sauvagnat (IRIT – Univ. of Toulouse)
C. Sallaberry (LIUPPA – Univ. of Pau)
D. Palacio (GeoComp – Univ. of Zurich)
1 “run” Open Web
1 “run” ClueWeb (Contextual suggestion subcollection)
19. TREC Contextual Suggestion Track 2013: Our approach
Contextual IRS framework 2013
Legend: L: Lucene, T: Terrier, W: WordNet, GP: Google Places, GN: Geonames, Y: Yahoo! BOSS Geo, P: PostGIS, GG: Gisgraphy, B: Bing.
[Diagram: contextual IRS framework 2013, in two variants.
a) Open Web: (1) preference processing (L, T, W) derives positive preferences, negative preferences, and categories of interest from the examples rated by each user (profile); (2) context processing (GP), guided by predefined categories, turns each context into a contextual list of places; (3) place filtering & description enrichment (GN, Y, P, GG, B); (4) ranking & refinement produces the personalized suggestions.
b) ClueWeb: same steps (1) and (2), then (3) retrieval (T) plus place filtering & description enrichment (B), and (4) ranking, producing the personalized suggestions.]
20. Example of suggestion in 2013
Title: Celtic Mist Pub
Description:
Place types: bar, establishment.
This place is about 0.3 km West from here (2 min by car with no traffic).
Address: 117 South 7th Street, Springfield.
There are 11 POIs around: 2 Hotels, 3 Libraries, 3 Parks, 1 PostOffice, 2 Religious.
Snippet: Located in Springfield, IL the Celtic Mist is your home away from home with over 16 imported beers on tap and a friendly staff ready to serve you…
URL: http://www.celticmistpub.com/
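The distance and compass direction in such generated descriptions can be derived from the context's and the place's coordinates; a self-contained sketch (our actual framework relied on the geo services in the slide 19 legend, so this is an illustration, not our implementation):

import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (mean Earth radius)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def compass(lat1, lon1, lat2, lon2):
    """8-way compass direction from point 1 to point 2."""
    dl = math.radians(lon2 - lon1)
    y = math.sin(dl) * math.cos(math.radians(lat2))
    x = (math.cos(math.radians(lat1)) * math.sin(math.radians(lat2))
         - math.sin(math.radians(lat1)) * math.cos(math.radians(lat2)) * math.cos(dl))
    bearing = math.degrees(math.atan2(y, x)) % 360
    return ["North", "NorthEast", "East", "SouthEast",
            "South", "SouthWest", "West", "NorthWest"][int((bearing + 22.5) // 45) % 8]

# e.g. a point 0.0035 degrees of longitude to the west at this latitude:
# haversine_km(39.8, -89.65, 39.8, -89.6535) ~ 0.3; compass(...) -> "West"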
21. Example of suggestion in 2012
Title: Oakley Pub and Grill
Description:
Oakley Pub and Grill - Located in Oakley Square, Cincinnati, Ohio. Local pub with pleasant atmosphere and great food. Voted #1 Best Burger in Cincinnati. Outdoor ...
PUB and GRILL OAKLEYOAKLEY Oakley Pub and Grill ~ 3924 Isabella Avenue ~ Cincinnati, Ohio 45209 On Oakley Square ~ (513) 531-2500 www.oakleypub.com Used with permission…
URL: http://oakleypubandgrill.com/
24. Result Analysis
First edition (2012)
All participants were still discovering the track's principles
Weakest results: the descriptions of suggestions
Second edition (2013)
Open Web
Focus on suggestion descriptions
Changes in relevance judgments
ClueWeb
Misunderstanding of guidelines or insufficient details
Next edition: TREC Contextual Suggestion Track 2014
Close to TREC Contextual Suggestion Track 2013
Future work
Experiment with framework variants on the 2013 data
Replace limited online tools/services
Process a larger collection: ClueWeb12 (870 million pages, ~27 TB)