Learning to Rank (LTR) presentation at RELX Search Summit 2018. Covers the history of LTR, a taxonomy of LTR algorithms, popular algorithms, and case studies of applying LTR to the TMDB dataset using Solr, Elasticsearch, and without index support.
2. RELXSearchSummIt2018
Outline
• History
• Problem setup
• Learning to Rank Algorithms
• Practical Considerations
• LTR Case Studies (Solr, Elasticsearch, DIY)
• Wrap Up
Learning to Rank - what it is, how it's done, and what it can do for you 2
History
• 1992: Idea of LTR (or Machine Learned Ranking) first proposed
• 2003: AltaVista (later acquired by Yahoo!) uses LTR in its engine
• 2005: Microsoft invents RankNet, deploys in Bing
• 2008: In contrast, Google’s engine is hand-tuned, relying on ~200 signals
• 2009: Yandex invents and deploys MatrixNet in its engine
• 2016: Google says RankBrain is the #3 signal in its search engine
• 2016: Bloomberg contributes LTR plugin to Solr
• 2017: Open Source Connections contributes LTR plugin in Elasticsearch
LTR Pipeline
Image Credit: https://towardsdatascience.com/when-to-use-a-machine-learned-vs-score-based-search-ranker-aa8762cd9aa9
• Training: Build the LTR model using training data – (query, document, label) triples
• Label is the rank
• Inference: Use the model to predict the label ŷ = h(x) from unseen (query, document) pairs
Difference between search and LTR
• Search engines
• Use text-based relevance – TF-IDF, BM25, etc.
• Unsupervised, backed by statistical models.
• LTR
• Can support different (application-specific) notions of relevance. For example:
• Recommendations – depend on price, geolocation or user ratings.
• Question Answering – the best text match might not return the best answer; the right set of features may be hard to articulate explicitly.
• Supervised technique, needs labeled data to train.
• Just a re-ranker, search layer must return results to re-rank.
Difference between ML and LTR
• ML solves a prediction problem (classification or regression) for a single instance at a time.
• LTR solves a ranking problem for a list of items – the objective is to find an optimal ordering of the items.
Reasons to consider LTR
• Too many parameters to tune manually without overfitting to a particular query set.
• Ranking requirements not being met with traditional text-based search tools (including use of metadata fields).
• Availability of enough (implicit or explicit) good training data to train an LTR model.
Traditional Ranking Models
• Vector Space Models
• Boolean – predicts whether a document is relevant to the query or not
• TF-IDF – rank documents by cosine similarity between document and query
• Probabilistic Models
• BM25 – rank documents by log odds of relevance to query
• LMIR – probability of document’s LM generating terms in query
• Importance based Models
• HITS – rank documents by hubness/authority (inlinks/outlinks).
• PageRank – rank document by probability of random surfer arriving on page
• Impact Factor – rank documents by number of citations
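As a concrete sketch, BM25 can be implemented in a few lines of Python. This is a simplified variant (the +1 inside the log is a Lucene-style smoothing that keeps IDF non-negative); the toy corpus and the conventional parameter defaults k1=1.2, b=0.75 are illustrative assumptions, not from the slides:

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score one tokenized document against a query with BM25.

    corpus: list of tokenized documents, used for IDF and average length.
    """
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)          # document frequency
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))   # log odds of relevance
        denom = tf[term] + k1 * (1 - b + b * len(doc_terms) / avgdl)
        score += idf * tf[term] * (k1 + 1) / denom        # saturating TF term
    return score

corpus = [["the", "martian", "survival", "mars"],
          ["martian", "chronicles"],
          ["space", "opera"]]
scores = [bm25_score(["martian"], doc, corpus) for doc in corpus]
# the shorter matching document outscores the longer one; non-matches score 0
```

Note how the k1 saturation and the b length normalization are exactly the knobs that LTR later learns to weigh against other signals.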
Evaluation Metrics
• Mean Average Precision (MAP@k)
• Mean Reciprocal Rank (MRR)
• Normalized Discounted Cumulative Gain (NDCG@k)
• Rank Correlation
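Two of these metrics can be sketched directly, assuming results arrive as relevance labels already in ranked order (the 2^rel − 1 gain and log2 discount follow the common NDCG convention):

```python
import math

def reciprocal_rank(relevances):
    """relevances: binary labels in ranked order; RR = 1 / position of first hit."""
    for pos, rel in enumerate(relevances, start=1):
        if rel:
            return 1.0 / pos
    return 0.0

def ndcg_at_k(gains, k):
    """gains: graded relevance labels in ranked order.
    NDCG normalizes DCG by the DCG of the ideal (sorted) ordering."""
    def dcg(g):
        return sum((2 ** rel - 1) / math.log2(pos + 2)
                   for pos, rel in enumerate(g[:k]))
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

rr = reciprocal_rank([0, 0, 1, 0])   # first relevant result at rank 3
ndcg = ndcg_at_k([0, 2, 3], k=3)     # imperfect ordering -> NDCG below 1
```

MRR needs only a binary notion of "first good result"; NDCG needs graded labels plus the ideal ordering, which is why it demands richer judgment lists.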
High Level Taxonomy of LTR Algorithms
• Pointwise – documents ranked by the relevance of each (query, document) pair
• Pairwise – documents ranked by considering the relative preference between pairs of (query, document) pairs
• Listwise – documents ranked by considering the entire relevance ordering of all (query, documents) tuples per query
Pointwise Approach
• Input: (query, document) pair (q, d)
• Output: score indicating rank on result list
• Model: 𝒇(q, d) → score
• Regression problem (in case of numeric scores) or Classification problem (in case of relevant/irrelevant, or multi-level classes like Perfect/Excellent/Good/Fair/Bad)
• Ordinal regression: includes the ordinal relationship between labels.
• Examples: SLR (Staged Logistic Regression), Pranking
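A minimal pointwise sketch in Python, treating ranking as plain regression on per-instance feature vectors (the toy features and labels are made up for illustration):

```python
import numpy as np

# Toy training set: each row holds the features of one (query, document)
# pair (e.g. a text-similarity score and a recency score); y holds graded
# relevance labels.
X = np.array([[0.9, 0.2], [0.4, 0.8], [0.1, 0.1], [0.7, 0.6]])
y = np.array([3.0, 2.0, 0.0, 4.0])

# Pointwise LTR reduces to plain regression: fit f(q, d) -> score one
# instance at a time, then sort documents by predicted score.
Xb = np.hstack([X, np.ones((len(X), 1))])    # append a bias column
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)   # least-squares fit

scores = Xb @ w
ranking = np.argsort(-scores)                # best-first document order
```

The key property (and limitation) is visible here: the loss never looks at two documents together, so relative order is only optimized indirectly.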
Pairwise Approach
• Input: triple of (query, document pair) (q, dA, dB)
• Output: preference label in {-1, 1}
• Model: 𝒇(q, dA, dB) → {-1, 1}
• Classification problem: learn a binary classifier to predict {-1, 1} for a given (query, document pair) triple
• Goal is to minimize average number of inversions in ranking
• Examples: RankNet, RankSVM, LambdaMART
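A RankNet-flavored pairwise sketch: model the probability that document a outranks document b as sigmoid(s_a − s_b) with a linear scorer, and minimize inversions by gradient ascent on the log-likelihood of correctly ordered pairs. The toy features and "true" weight vector are illustrative assumptions that make the list linearly rankable:

```python
import numpy as np

# Toy setup: 5 documents for one query, 2 features each. A hypothetical
# "true" weight vector defines the ground-truth ordering.
X = np.array([[3.0, 1.0], [2.0, 2.0], [2.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
true_w = np.array([1.0, 0.5])
rel = X @ true_w                                    # ground-truth relevance
pairs = [(a, b) for a in range(5) for b in range(5) if rel[a] > rel[b]]

# RankNet-style training: P(a ranked above b) = sigmoid(s_a - s_b),
# gradient ascent on the log-likelihood of the correct orderings.
w = np.zeros(2)
lr = 0.1
for _ in range(500):
    for a, b in pairs:
        p = 1.0 / (1.0 + np.exp(-(X[a] - X[b]) @ w))   # predicted P(a beats b)
        w += lr * (1.0 - p) * (X[a] - X[b])

scores = X @ w
inversions = sum(1 for a, b in pairs if scores[a] <= scores[b])
```

Counting inversions at the end makes the objective explicit: training pushes every correctly labeled pair's score difference positive.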
Listwise Approach
• Input: (query, Documents {d1, d2, …, dN})
• Output: desired ranked list of documents 𝕯
• Model: 𝒇(q, {d1, d2, …, dN}) → 𝕯
• Optimizes a listwise loss – indirect loss functions such as RankCosine or KL Divergence, or smoothed IR measures (since they are not directly differentiable) – by applying Gradient Descent
• Examples: AdaRank, ListNet, RankCosine, SVMMap
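A sketch of ListNet's top-one approximation: both the labels and the model scores are turned into distributions over the whole list with softmax, and the loss is the cross-entropy between them (the toy numbers are illustrative):

```python
import numpy as np

def softmax(s):
    e = np.exp(s - np.max(s))      # shift for numerical stability
    return e / e.sum()

def listnet_top1_loss(scores, labels):
    """Cross-entropy between the label distribution and the model's score
    distribution over the whole result list (ListNet's top-one view)."""
    return float(-np.sum(softmax(labels) * np.log(softmax(scores))))

labels = np.array([3.0, 1.0, 0.0])   # graded relevance for one query's list
good = listnet_top1_loss(np.array([5.0, 2.0, 0.5]), labels)
bad = listnet_top1_loss(np.array([0.5, 2.0, 5.0]), labels)
# scores that agree with the label ordering incur the lower loss
```

Unlike the pointwise and pairwise losses, this loss is a function of the entire list at once, which is what makes the approach listwise.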
Commonly used Algorithms
• Linear Model
• Predicted score is a linear combination of input features
• RankNet
• Neural network based
• Good for binary (relevant/irrelevant) labels
• Weight matrix transforms input features into rank probabilities
• LambdaMART
• Tree (forest) based
• Good for multi-class labels
• Feature splits with thresholds
Acquiring labels
• Implicit
• Intrinsic features (words, phrases)
• Document metadata
• User Clicks
• Time spent on document
• Purchases (if applicable)
• Cheap to build but noisy
• Explicit
• Human expert rates the relevance of each document against a query
• Cleaner but expensive to build
Feature Selection
• Document Features
• Document Length
• URL Length
• Publication Date
• Number of outlinks
• PageRank
• Query Features
• Number of words
• PER or ORG in query
• Query-Document Features
• TF-IDF, BM25 similarity
• Frequency of query in anchor text
• Document contains query words in title
• User Dependent Features
• Star ratings
• Age, gender
• Device
Unbalanced Datasets
• If the dataset is unbalanced, i.e., classes are not represented approximately equally, then use under- or oversampling to balance it.
• Consider using something like SMOTE for oversampling instead of naïve oversampling by duplication.
• Make sure there is no data leakage in case of oversampling.
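A minimal SMOTE-style sketch (not the full algorithm): pick a minority-class sample and one of its k nearest neighbors within the class, then interpolate a synthetic point between them. The toy minority set is illustrative:

```python
import numpy as np

def smote_like(X_min, n_new, k=2, seed=0):
    """For each synthetic point: pick a minority-class sample, pick one of
    its k nearest neighbors in the same class, and interpolate a random
    point on the segment between them."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        dist = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(dist)[1:k + 1]   # nearest, excluding itself
        j = rng.choice(neighbors)
        t = rng.random()                        # interpolation factor in [0, 1)
        out.append(X_min[i] + t * (X_min[j] - X_min[i]))
    return np.array(out)

X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like(X_minority, n_new=6)
```

Because points are interpolated rather than duplicated, the synthetic samples are new but stay inside the minority class's region of feature space.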
LTR used as re-ranker
• LTR models are usually more computationally expensive than search engines.
• The search engine is used to pull out matched documents.
• The top N matched documents are fed into the LTR model, and the top n of the model's ordering replace the head of the result list, for N >> n (typically 50-100x).
[Diagram: query → index → matched (10k) → scored (10k) → top 1000 retrieved → ranking model → re-ranked top 10]
Image Credit: https://lucidworks.com/2016/08/17/learning-to-rank-solr/
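The retrieve-then-rerank flow can be sketched as a small function; the even/odd "model" is a hypothetical stand-in for a real LTR scorer:

```python
def rerank(matched, model_score, N=1000, n=10):
    """matched: (doc_id, engine_score) pairs already sorted by the engine.
    Only the top N go through the (more expensive) LTR model; the top n
    of the model's ordering are returned."""
    head = matched[:N]
    reranked = sorted(head, key=lambda doc: model_score(doc[0]), reverse=True)
    return reranked[:n]

# hypothetical stand-in for an LTR model: prefers even doc ids
matched = [(i, 100 - i) for i in range(50)]
top = rerank(matched, model_score=lambda doc_id: doc_id % 2 == 0, N=20, n=5)
```

Keeping N small bounds the cost of the expensive model while the cheap engine score handles recall over the full index.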
LTR Algorithm Implementations
• RankLib (Java) – from the Lemur Project (UMass, CMU); provides Coordinate Ascent, Random Forest (pointwise), MART, RankNet, RankBoost (pairwise), LambdaMART (pair/listwise), AdaRank and ListNet (listwise)
• SVMRank (C++) – from Cornell, provides SVMRank (pairwise)
• XGBoost (Python/C++) – LambdaRank (pairwise)
• PyLTR (Python) – LambdaMART (pairwise)
• Michael Alcorn (Python) – RankNet and LambdaMART (pairwise)
LETOR Data Format
2 qid:1 1:3 2:3 3:0 4:0 5:3 6:1 7:1 8:0 9:0 10:1 11:156... # 11
2 qid:1 1:3 2:0 3:3 4:0 5:3 6:1 7:0 8:1 9:0 10:1 11:406... # 23
0 qid:1 1:3 2:0 3:2 4:0 5:3 6:1 7:0 8:0.666667 9:0 10:1 ... # 44
2 qid:1 1:3 2:0 3:3 4:0 5:3 6:1 7:0 8:1 9:0 10:1 11:287 ... # 57
1 qid:1 1:3 2:0 3:3 4:0 5:3 6:1 7:0 8:1 9:0 10:1 11:2009 ... # 89
Format: <label> qid:<query ID> <feature>:<value> … # <comment, e.g. docID>
Features cover query, document, query/document and other signals, in sparse or dense format.
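A small parser for this format (the sample line mirrors the slide above; the dict layout is just one convenient representation):

```python
def parse_letor_line(line):
    """Parse one LETOR line: '<label> qid:<id> <feat>:<val> ... # <comment>'."""
    body, _, comment = line.partition("#")
    tokens = body.split()
    label = int(tokens[0])
    qid = tokens[1].split(":", 1)[1]
    features = {int(k): float(v)
                for k, v in (tok.split(":", 1) for tok in tokens[2:])}
    return {"label": label, "qid": qid,
            "features": features, "comment": comment.strip()}

row = parse_letor_line("2 qid:1 1:3 2:3 3:0 10:1 11:156 # 11")
```

Grouping rows by `qid` afterwards recovers the per-query lists that pairwise and listwise trainers need.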
Preprocessing Data
• We use The Movie Database (TMDB) dataset from Kaggle.
• 45k movies, 20 genres, 31k unique keywords
• We extract the following fields: (docID, title, description, popularity, release date, running time, rating (0-10), keywords, genres)
• Categorical labels 1-5 are created from the rating
• Objective: build an LTR model that learns the ordering implied by rating, and re-rank the top 10 results using this model
• Features chosen: (query-title and query-description similarity using TF-IDF and BM25, document recency, original score, and a boolean 0/1 for each genre)
LTR with Solr
• Prepare Solr for LTR (add snippet to solrconfig.xml) and start with solr.ltr.enabled=true
• Load data
• Define the LTR features to be used in Solr
• Define a dummy linear model and use Solr to extract features (rq) for some queries to LETOR format
• Train a RankLib LambdaMART model using the extracted features
• Upload the trained model definition to Solr
• Run a Solr re-rank query (rq) using the trained LTR model
• See notebooks – 02-solr/01 .. 04
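The final re-rank step can be sketched as building the Solr query parameters. The model name `myLambdaMARTModel` and the external feature name `efi.text` are placeholders: they must match whatever was uploaded to Solr's model and feature stores for this to work against a real instance:

```python
def solr_ltr_params(user_query, model="myLambdaMARTModel", rerank_docs=100):
    """Build query parameters for a Solr LTR re-rank request. The rq local
    params invoke the LTR query parser; efi.* passes external feature info
    (here, the raw query text) to the feature definitions."""
    return {
        "q": user_query,
        "rq": f"{{!ltr model={model} reRankDocs={rerank_docs} efi.text='{user_query}'}}",
        "fl": "id,title,score",
    }

params = solr_ltr_params("star wars")
```

These params would then go to /select with any HTTP client; the base q still does the first-pass retrieval, and rq re-ranks only the top reRankDocs hits.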
LTR with Elasticsearch
• Install LTR plugin and load data
• Initialize feature store
• Define features – load feature templates into Elasticsearch
• Extract features (sltr) to LETOR format
• Train a RankLib model (XGBoost and SVMRank models are also supported natively).
• Upload trained LTR model to Elasticsearch
• Run re-rank query (rescore) using trained LTR model
• See notebooks – 03-elasticsearch/01 .. 04
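The rescore step can be sketched as building the Elasticsearch request body. The model name `my_ltr_model` and the `keywords` parameter are placeholders that must match the feature set and model uploaded via the LTR plugin:

```python
def es_ltr_body(user_query, model="my_ltr_model", window=100):
    """Build an Elasticsearch request body: a cheap BM25 match first, then
    an sltr rescore that applies the LTR model to the top `window` hits."""
    return {
        "query": {"match": {"title": user_query}},
        "rescore": {
            "window_size": window,
            "query": {
                "rescore_query": {
                    "sltr": {"model": model, "params": {"keywords": user_query}}
                }
            },
        },
    }

body = es_ltr_body("star wars")
```

As with Solr, the base match query handles recall and the sltr rescore only touches the window of top hits, keeping the expensive model off the full index.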
DIY LTR – Index Agnostic
• Run queries, generate features from results to LETOR format
• Train RankLib (or other third party LTR) model
• Run re-rank query on trained model
• Merge output of re-rank with actual results from index
• See notebooks – 04-ranklib/02..04
• Pros: index agnostic; more freedom to add novel features
• Cons: less support from index
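The merge step above can be sketched as replacing the head of the engine's result list with the model's ordering (the doc ids are illustrative):

```python
def merge_reranked(original_ids, reranked_ids, n=10):
    """Replace the head of the engine's result list with the LTR model's
    top-n ordering, keeping the remaining results in engine order."""
    head = reranked_ids[:n]
    tail = [doc for doc in original_ids if doc not in head]
    return head + tail

merged = merge_reranked([1, 2, 3, 4, 5, 6], reranked_ids=[3, 1, 2], n=3)
# → [3, 1, 2, 4, 5, 6]
```

Because the index gives no re-rank support here, this merge has to happen in application code, which is the price of the index-agnostic approach.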
Resources
• Book – Learning to Rank for Information Retrieval, by Tie-Yan Liu.
• Paper – From RankNet to LambdaRank to LambdaMART: An
Overview, by Christopher J. C. Burges
• Tutorials
• Solr - https://github.com/airalcorn2/Solr-LTR
• Elasticsearch – Learning to Rank 101 by Pere Urbon-Bayes, ES-LTR Demo by
Doug Turnbull.
• Product Centric LTR Documentation
• Solr Learning To Rank Docs
• Elasticsearch Learning to Rank Docs
Most of the key work was done between 2008 and 2011, with competitions sponsored by Microsoft, Yahoo! and Yandex.
Bloomberg LTR meetup – Michael Nillson, Erick Erickson.
OSC LTR – at Haystack earlier this year.
In all cases you need a judgment list (i.e., relevant vs. irrelevant). For MRR you need the first good result, so a notion of position; for DCG you need graded results; and for NDCG and Rank Correlation you also need the ideal ordering.
SMOTE – take a minority-class sample and pick one of its k nearest neighbors, then create synthetic data as a mix between the original and the neighbor.