6. Relevance definition
• Dictionary.com:
- the condition of being relevant, or connected with the matter at hand
• Cambridge:
- the degree to which something is related or useful to what is happening or being talked about
• Merriam-Webster (sense 2):
- the ability (as of an information retrieval system) to retrieve material that satisfies the needs of the user
http://www.dictionary.com/browse/relevance
http://dictionary.cambridge.org/us/dictionary/english/relevance
https://www.merriam-webster.com/dictionary/relevance
7. Measuring relevance of a result set
• Precision: % of the results that are relevant
• Recall: % of the relevant set that is in the results (both are sketched below)
• These only evaluate the quality of the result set
• What is the size of the relevant set?
• We need a measure for individual documents
https://en.wikipedia.org/wiki/Precision_and_recall
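A minimal sketch of both measures, assuming the result list and the expert-judged relevant set are available as Python sets of document ids (the ids are hypothetical):

# Precision/recall over document id sets (hypothetical example data)
results = {"d1", "d2", "d3", "d4"}   # what the engine returned
relevant = {"d2", "d4", "d7"}        # what a domain expert marked as relevant

hits = results & relevant                # relevant documents that were returned
precision = len(hits) / len(results)     # 2/4 = 0.50: half the results are relevant
recall = len(hits) / len(relevant)       # 2/3 = 0.67: two thirds of the relevant set was found
print(precision, recall)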
8. What about sorting?
A list of results is only as good as its order
• Sort by any type of field:
- Sorting by text
- Sorting by number
- Sorting by rank
• Sort by relevance:
- Sorting by similarity score
10. Similarity score
Score for individual documents
• Similarity defines the components of Lucene scoring
• Calculated per query, per field: score(q, f)
- BM25Similarity
- TFIDFSimilarity
- BooleanSimilarity
- MultiSimilarity
- PerFieldSimilarityWrapper
https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/similarities/Similarity.html
11. Similarity parameters
• tf: term frequency
- How many times a term appears in a document
- The more times the term appears, the more relevant it is
• idf: inverse document frequency
- One divided by the number of documents in the whole corpus that contain the term
- The more documents contain the term, the less semantically meaningful, and therefore the less relevant, it is
12. Similarity parameters
• norm: field norm
- The inverse of the length of the field
- The longer the field, the less relevant a term appearance is
• C: coordination factor
- A factor multiplied with the sum of the scores for all terms in the query
- Keeps the score within a reasonable interval of values
13. TF-IDF
Term frequency x inverse document frequency
• Classical information retrieval score
• Default in Solr/Lucene from the first versions until 6
score(q, d) = C(q, d) · Σ_{t∈q} tf(t, d) · idf(t) · norm(d)
https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
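A rough sketch of this simplified formula (not Lucene's exact implementation, which applies further damping internally; the toy corpus is hypothetical):

import math

# Simplified TF-IDF per the formula above: sum over query terms of tf * idf * norm
def tf_idf_score(query_terms, doc_terms, corpus):
    norm = 1.0 / math.sqrt(len(doc_terms))      # field norm: shorter fields score higher
    score = 0.0
    for term in query_terms:
        tf = doc_terms.count(term)              # occurrences of the term in this document
        df = sum(term in d for d in corpus)     # documents containing the term
        if tf and df:
            score += tf * math.log(len(corpus) / df) * norm
    return score

corpus = [["solr", "relevance"], ["solr", "search", "search"], ["lucene"]]
print(tf_idf_score(["solr", "search"], corpus[1], corpus))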
16. Okapi BM25
Best Matching 25
• Calculates relevance based on probability
• The effect of term frequency has a ceiling: k1
• The length of the field affects the tf curve: b
• Default from version 6
score(q, d) = C(q, d) · Σ_{t∈q} idf(t) · (tf(t, d) · (k1 + 1)) / (tf(t, d) + k1 · (1 − b + b · |d| / avg(|d|)))
k1 = 1.2
b = 0.75
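A small sketch of the tf part of this formula, showing the k1 ceiling: with the field length held at the corpus average the length factor is 1, and the contribution saturates toward k1 + 1 = 2.2 no matter how often the term repeats:

k1, b = 1.2, 0.75

def tf_norm(tf, field_len=100.0, avg_len=100.0):
    # the tf part of BM25: (tf * (k1+1)) / (tf + k1 * (1 - b + b * |d|/avg(|d|)))
    return (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * field_len / avg_len))

for tf in (1, 2, 6, 50, 1000):
    print(tf, round(tf_norm(tf), 3))   # 1.0, 1.375, 1.833, 2.148, 2.197 -> ceiling ~2.2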
21. Edismax
Extended Disjunction Maximum (Extended DisMax)
• Relevance on several fields
- qf=field1^boost field2&q=query
• Gets the score of the best (maximum) match
- Unless the tie parameter is used to factor in the other fields
• AND, OR, or minimum match
- Can control how many of the query terms must match (see the sketch below)
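For example, an edismax request with these knobs might look like this (the field names, boosts, and collection name are hypothetical; the parameter names are standard edismax):

from urllib.parse import urlencode

params = {
    "defType": "edismax",
    "q": "solr relevance",
    "qf": "title^3 description",  # title matches count three times as much
    "tie": "0.1",                 # 0 = pure maximum, 1 = sum of all field scores
    "mm": "75%",                  # minimum match: at least 75% of the query terms
}
print("/solr/mycollection/select?" + urlencode(params))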
22. Understanding the calculations
To understand the ordering of scores, use the debug parameter
• debug=results or debugQuery=true
1.8534669 = weight(_text_:book in 3) [], result of:
  1.8534669 = score(doc=3,freq=6.0 = termFreq=6.0), product of:
    1.1108824 = idf(docFreq=13, docCount=40)
    1.6684636 = tfNorm, computed from:
      6.0 = termFreq=6.0
      1.2 = parameter k1
      0.75 = parameter b
      142.975 = avgFieldLength
      256.0 = fieldLength
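The numbers in this explain output can be reproduced by hand from the BM25 formula (plain arithmetic, no Solr needed; the idf variant is the one Lucene's BM25 uses):

import math

k1, b = 1.2, 0.75
tf, field_len, avg_len = 6.0, 256.0, 142.975
doc_freq, doc_count = 13, 40

idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
tf_norm = (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * field_len / avg_len))

print(idf)            # 1.1108824...
print(tf_norm)        # 1.6684636...
print(idf * tf_norm)  # 1.8534669..., the score reported above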
25. Relevance testing
How do we work with relevance?
1. Change the relevance model
2. Evaluate
3. Improve
How to quantitatively evaluate the relevance model?
• Relevance testing
26. Relevance testing
• Get a domain expert
• Get the most popular search queries
• Select the relevant document[s] for those queries
• Test [automated/scripted] in what position the relevant document lands in the result set (see the sketch after this list)
• Aggregate all the test cases
- In the first 1-3 results -> OK
- Between position 4 and the page size -> Can be improved
- After that -> Something's wrong
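A minimal scripted position check along these lines, assuming a local Solr collection named mycollection, a unique id field, and a hand-built list of (query, expected document id) cases (all names hypothetical):

import json
from urllib.parse import urlencode
from urllib.request import urlopen

# (query, id of the document the expert marked as relevant)
cases = [("google", "doc-google"), ("solr revolution", "doc-lucenerev")]

for query, expected_id in cases:
    qs = urlencode({"q": query, "fl": "id", "rows": 10, "wt": "json"})
    with urlopen("http://localhost:8983/solr/mycollection/select?" + qs) as resp:
        docs = json.load(resp)["response"]["docs"]
    ids = [d["id"] for d in docs]
    position = ids.index(expected_id) + 1 if expected_id in ids else None
    verdict = "OK" if position is not None and position <= 3 else "can be improved"
    print(query, position, verdict)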
27. Relevance testing
Query | Context | Relevant result | Result position
google | | google.com | 1
Solr revolution | Year:2017 | lucenerevolution.org | 1
Relevance in the wild | | https://lucenesolrrevolution2017.sched.com/event/BAwt/relevance-in-the-wild | 4
edismax | | https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser | 6
28. Behavior analysis and overoptimization
Track your system's queries and click-throughs
• It is a great start for relevance testing
• Improvements to the most common queries will have the biggest impact
• Users know what is relevant for them when they see it
• Do not focus only on the top X cases
30. Real cases
• An oversimplified list of cases and recommendations
• There are many ways of solving the same problem
- Use your domain knowledge
- Choose the best-fitting alternative in terms of performance/flexibility
• All parameters can be
- configured as defaults in a request handler
- sent independently in each request
• Remember to normalize your boosts
31. Classical information model
You have an article database: library, intranet, website, catalogue …
• Multiple metadata in different fields
• Information has variable importance
• Use the edismax qf parameter to weight the different fields
- qf=id^5 title^3 description comments^0.1
32. Synonym expansion
Sometimes queries contain similar terms that do not appear in the index
• Important domain concepts have several names for the user
• They are always referred to in one way in the index
• Use synonym expansion to find all the similar matches (see the sketch below)
- <filter class="solr.ManagedSynonymFilterFactory"/>
- {"Vacation":["Holiday"]}
33. Document type tiering
You have a set of information belonging to different categories
• Certain document types are just more important
• Add a field with the document-type boost factor and use the boost parameter (see the sketch below)
- boost=documenttypeboost
- Multiplicative; reindex to change the boosts
• Or use a boost query for each type
- bq=documenttype:important^boost
- Additive; a more complex query
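The two alternatives side by side as request parameters (the field names and weights are hypothetical; boost multiplies the whole score, bq adds to it):

# Alternative 1: multiply by a numeric factor stored in each document at index time
multiplicative = {"defType": "edismax", "q": "report", "boost": "documenttypeboost"}

# Alternative 2: add a boost query; the weight can be tuned without reindexing
additive = {"defType": "edismax", "q": "report", "bq": "documenttype:important^2"}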
34. Date promotion slope
Information is less relevant as it gets older
• News search, event-related information
• Add a date decay from today to X months back, based on the last-updated field (see the sketch below)
- boost=recip(ms(NOW,mydatefield),3.16e-11,1,1)
• There are approximately 3.16e10 milliseconds in a year, so dates can be scaled to fractions of a year with the inverse, 3.16e-11
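recip(x,m,a,b) computes a/(m*x + b), so with these constants the boost is roughly 1/(age in years + 1). A quick check of the curve (plain arithmetic):

MS_PER_YEAR = 3.16e10

def date_boost(age_years, m=3.16e-11, a=1, b=1):
    # recip(x, m, a, b) = a / (m*x + b), with x the document age in ms
    return a / (m * age_years * MS_PER_YEAR + b)

for years in (0, 1, 2, 5):
    print(years, round(date_boost(years), 2))  # 1.0, 0.5, 0.33, 0.17: older fades away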
35. User customization
Applications with registered users know their context
• An intranet system knows the user's department (employee registry)
• A website with registration knows the user's preferences (history)
• Add boost queries matching document metadata against the user's context
- bq=department:IT
- bq=category:electronics
36. Physical proximity relevance
Most applications can request the user's location
• Promote the shops/restaurants that are close by
• Add a boost function calculating the distance to the user
- bf=div(1,geodist(docloc,36.092938,-115.173179))
37. Rating/Popularity system
Catalogues often offer the possibility of reviewing articles
• Promote articles that get the most views
• Promote articles that get a good average rating
• Add boost functions applying the article's score
- bf=sqrt(articlepageviews)
- bf=if(gte(reviews,req),sqrt(articleavgreview),0)
38. Commerce/Sales parameters
Search on e-shops has many parameters to take into account
• Prefer a higher profit margin
- bf=sub(price,cost)
• Encourage last items or many in stock
- bf=stock or bf=div(1,stock)
• Campaign recommendations
- bq=category:campaigncategory
39. Combine them
All of these techniques can be applied together (see the sketch below)
• Just add more parameters
• Each one must be balanced against the others
• Do not forget to evaluate the changes
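A sketch of several of the previous techniques in one request (every field name and weight is hypothetical; in practice each would be tuned through relevance testing):

from urllib.parse import urlencode

params = {
    "defType": "edismax",
    "q": "laptop",
    "qf": "title^3 description",                            # field weighting
    "bq": "documenttype:important^2 category:electronics",  # tiering + user context
    "bf": "sqrt(articlepageviews)",                         # popularity
    "boost": "recip(ms(NOW,mydatefield),3.16e-11,1,1)",     # date decay
}
print("/solr/mycollection/select?" + urlencode(params))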
40. Summary
• Work on relevance -> improve user perception
• There are plenty of tools/parameters available in Solr
• Improve your data and usage tracking
• Evaluate changes -> guaranteed improvement