4. 185+ Consultants Worldwide
[Map: San Diego · Cincinnati · Washington (HQ) · San Jose, CR · London, UK · Prague, CZ · Frankfurt, DE]
• Founded 2005
• Deep search expertise
• 700+ customers worldwide
• Consistent profitability
• Search engines & Big Data
• Vendor independent
5. Typical Conversation with Customer
Customer: “Our search accuracy is bad.”
Us: “How bad?”
Customer: “Really, really bad.”
Us: “Uh… on a scale of 1 to 10, how bad?”
Customer: “An eight. No wait… a nine. Maybe even a 9.5. Let’s call it a 9.23.”
6. Current methods are woefully inadequate
• Golden Query Set
o Key Documents
• Top 100 / Top 1000 Queries Analysis
• Zero result queries
• Abandonment rate
• Queries with click
• Conversion
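The “current method” metrics above can all be computed from query logs. A minimal sketch, assuming a hypothetical log record with `query`, `num_results`, `clicked`, and `converted` fields (real log schemas vary):

```python
# Sketch: computing the traditional search metrics from a query log.
# The record fields (query, num_results, clicked, converted) are
# illustrative assumptions, not a real schema.
from collections import Counter

query_log = [
    {"query": "laptop",  "num_results": 120, "clicked": True,  "converted": True},
    {"query": "laptop",  "num_results": 120, "clicked": True,  "converted": False},
    {"query": "lapptop", "num_results": 0,   "clicked": False, "converted": False},
    {"query": "tablet",  "num_results": 45,  "clicked": False, "converted": False},
]

total = len(query_log)
zero_result_rate = sum(1 for q in query_log if q["num_results"] == 0) / total
abandonment_rate = sum(1 for q in query_log if not q["clicked"]) / total
click_rate       = sum(1 for q in query_log if q["clicked"]) / total
conversion_rate  = sum(1 for q in query_log if q["converted"]) / total
top_queries      = Counter(q["query"] for q in query_log).most_common(100)

print(zero_result_rate, abandonment_rate, click_rate, conversion_rate)
```

These numbers are easy to produce, which is exactly why they are popular, but none of them says whether the *right documents* came back.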
7. What are we trying to achieve?
• Reliable metrics for search accuracy
• Can run analysis off-line
o Does not require production deployment (!)
• Can accurately compare two engines
• Runs quickly = agility = high quality
• Can handle different user types / personalization
o Broad coverage
• Provides lots of data to analyze what’s going on
o Data to decide how best to improve the engine
8. Leverage logs for accuracy testing
[Diagram: Query Logs + Click Logs → Big Data Framework ↔ Search Engine Under Evaluation → Engine Score(s), other metrics & histograms, scoring database]
9. From Queries → Users
• User by User Metrics
o Change in focus
• Group activity by session and/or user
o Call this an “Activity Set”
o Merge sessions and users
• Use Big Data to analyze all users
o There are no stupid queries and no stupid users
o Overall performance based on the experience of the users
[Diagram: Queries, Clicks, and other activity clustered into a User]
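A minimal sketch of building Activity Sets: group log events by user, then split each user's stream on inactivity gaps. The 30-minute gap and event tuples are illustrative assumptions:

```python
# Sketch: grouping queries and clicks into per-user "Activity Sets".
# The 30-minute session gap is an assumed heuristic, not from the talk.
from itertools import groupby

SESSION_GAP = 30 * 60  # seconds of inactivity that starts a new session

events = [  # (user, timestamp_sec, kind, payload) -- illustrative
    ("u1", 0,    "query", "red shoes"),
    ("u1", 40,   "click", "doc17"),
    ("u1", 5000, "query", "red sneakers"),  # > 30 min later: new session
    ("u2", 10,   "query", "garden hose"),
]

def activity_sets(events):
    """Yield (user, [events...]), one list per session of activity."""
    events = sorted(events, key=lambda e: (e[0], e[1]))
    for user, user_events in groupby(events, key=lambda e: e[0]):
        session, last_ts = [], None
        for ev in user_events:
            if last_ts is not None and ev[1] - last_ts > SESSION_GAP:
                yield (user, session)
                session = []
            session.append(ev)
            last_ts = ev[1]
        if session:
            yield (user, session)

sets = list(activity_sets(events))
```

In a real pipeline this grouping is a shuffle-by-user step in the Big Data framework rather than an in-memory sort.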
10. Engine Score
• Group activity by session and/or user (Queries & Clicks)
• Determine “relevant” documents
o What did the user view? Add to cart? Purchase?
o Did the search engine return what the user ultimately wanted?
• Determine engine score per query based on user’s POV
o Σ power(FACTOR, position)*isRelevant[user, searchResult[position].DocID]
o (Note: many other formulae possible, MRR, MAP, DCG, etc.)
• Average score for all user queries = user score
• Average scores across all users = final engine score
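The scoring steps above can be sketched directly from the formula on this slide. `FACTOR = 0.5` and the toy data are assumptions for illustration; the talk notes that MRR, MAP, DCG, etc. could be substituted:

```python
# Sketch of slide 10's engine score: sum a positional discount
# FACTOR**position over the results the user found relevant, then
# average per user, then across users. FACTOR=0.5 is an assumed value.
FACTOR = 0.5

def query_score(result_doc_ids, relevant_doc_ids):
    """Score one query from the user's point of view."""
    return sum(FACTOR ** pos
               for pos, doc_id in enumerate(result_doc_ids)
               if doc_id in relevant_doc_ids)

def engine_score(users):
    """users: {user: [(results, relevant), ...]} -> final engine score."""
    user_scores = []
    for queries in users.values():
        per_query = [query_score(r, rel) for r, rel in queries]
        user_scores.append(sum(per_query) / len(per_query))  # user score
    return sum(user_scores) / len(user_scores)               # engine score

users = {
    "u1": [(["d1", "d2", "d3"], {"d1"})],        # relevant at rank 0 -> 1.0
    "u2": [(["d4", "d5", "d6"], {"d5", "d6"})],  # ranks 1,2 -> 0.5 + 0.25
}
print(engine_score(users))  # 0.875
```

The relevant-document sets come from the click logs (views, add-to-cart, purchases), so no human judges are needed.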
12. Off-Line Engine Analysis
o Can we re-compute this array for all queries?
o ANSWER: Yes!
Σ power(FACTOR, position)*isRelevant[User, searchResult[position].DocID]
[Diagram: Query Logs → Offline Re-Query → Search Engine (possibly embedded) → New Results → Big Data Array]
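The offline loop is simple to sketch: replay every logged query against the (possibly embedded) engine under evaluation and re-score the new result lists against the click-derived relevance judgments. `toy_engine` and the relevance table here are stand-ins for a real engine and real logs:

```python
# Sketch of the offline re-query loop on slide 12. `toy_engine` is a
# stand-in for the search engine under evaluation; `relevant` stands in
# for relevance judgments mined from the click logs.
FACTOR = 0.5

def toy_engine(query):
    index = {"red shoes": ["d9", "d2", "d7"], "garden hose": ["d3", "d1"]}
    return index.get(query, [])

query_log = [("u1", "red shoes"), ("u2", "garden hose")]
relevant = {"u1": {"d2"}, "u2": {"d3"}}   # derived from click logs

def replay(query_log, engine):
    scores = []
    for user, query in query_log:
        results = engine(query)           # fresh results from this engine
        scores.append(sum(FACTOR ** pos
                          for pos, doc in enumerate(results)
                          if doc in relevant[user]))
    return sum(scores) / len(scores)

print(replay(query_log, toy_engine))  # (0.5 + 1.0) / 2 = 0.75
```

Because the loop only needs logs and an engine instance, two candidate engines (or two configurations of one engine) can be scored on identical traffic without a production deployment.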
15. What else can we do with Engine Scoring?
Predictive Analytics
16. The Brutal Truth about Search Engine Scores
• Random ad-hoc formulae put together
o No statistical or mathematical foundation
• TF / IDF → All kinds of inappropriate biases
o Bias towards document size (smaller / larger)
o Bias towards rare (misspelled? archaic?) words
o Not scalable (different scores on different shards)
• Same formula since the 1970’s
They are not based on science. We can do better!
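The rare-word bias is easy to demonstrate. A minimal sketch with a made-up four-document corpus, using the standard IDF formula log(N/df):

```python
# Sketch illustrating the IDF bias called out above: a rare
# (misspelled) term receives a far higher weight than the common,
# correctly spelled one. The corpus is invented for illustration.
import math

docs = [
    "cheap laptop deals",
    "laptop reviews",
    "best laptop 2014",
    "cheap lapptop deals",   # one misspelling in the whole corpus
]
N = len(docs)

def idf(term):
    """Classic inverse document frequency: log(N / document frequency)."""
    df = sum(1 for d in docs if term in d.split())
    return math.log(N / df)

print(idf("laptop"))   # common word -> low weight
print(idf("lapptop"))  # rare misspelling -> high weight
```

Note also that `df` is a per-index statistic, so the same term gets different weights on different shards unless the engine shares global statistics.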
17. We use Big Data to Predict Relevancy
[Diagram: Content Sources (Project Docs, Web Site Pages, Support Pages, Landing Pages) → Connectors → Content Processing → Index → Search; a Content Copy plus Search/Click Logs, Query Logs, and Financial/Business Data feed the Big Data Cluster, which produces the Relevancy Model fed back into the Search Engine]
18. Probability Scoring / Predictive Relevancy
[Training table of behavioural outcomes per result:]
clicked? | purchased?
0 | 0
1 | 1
1 | 0
0 | 0
1 | 0
1 | 1
[Diagram: Product Signals + Query Signals + User Signals + Comparison Signals → Predictive Analytics → Statistical Model to Predict Probability]
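One common instance of such a statistical model is logistic regression over the behavioural signals. A minimal hand-rolled sketch, with invented features and training data (a production model would train over the full product/query/user/comparison signals on the cluster):

```python
# Sketch: a tiny logistic regression predicting probability of
# relevance from behavioural signals. Features, data, learning rate,
# and epoch count are all illustrative assumptions.
import math

# features: (click_rate, purchase_rate) per (query, doc); label: relevant?
data = [((0.9, 0.4), 1), ((0.8, 0.3), 1), ((0.1, 0.0), 0),
        ((0.2, 0.0), 0), ((0.7, 0.2), 1), ((0.05, 0.0), 0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b, lr = [0.0, 0.0], 0.0, 0.5
for _ in range(2000):                     # stochastic gradient descent
    for (x1, x2), y in data:
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        err = p - y                       # gradient of log-loss
        w[0] -= lr * err * x1
        w[1] -= lr * err * x2
        b    -= lr * err

def relevance_probability(click_rate, purchase_rate):
    return sigmoid(w[0] * click_rate + w[1] * purchase_rate + b)

print(relevance_probability(0.85, 0.35))  # strong signals -> near 1
print(relevance_probability(0.05, 0.0))   # weak signals   -> near 0
```

The key property is that the output is a calibrated probability of relevance, not an arbitrary TF/IDF-style score.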
19. 19
The Power of the Probability Score
• The score predicts probability of relevancy
• Value is 0 → 1
o Can be used for threshold processing
o All documents too weak? Try something else!
o Can combine results from different sources / constructions together
• Identifies what’s important
o Machine learning optimizes for parameters
- Identifies the impact and contribution of every parameter
o If a parameter does not improve relevancy → REMOVE IT
o Scoring becomes objective, not subjective (now based on SCIENCE)
o Allows for experimentation on parameters
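Threshold processing and cross-source merging fall out directly, since every score is a probability in [0, 1] and therefore comparable across sources. A minimal sketch with made-up sources, scores, and threshold:

```python
# Sketch of threshold processing and merging with probability scores
# (slide 19). Sources, documents, scores, and THRESHOLD are invented
# for illustration; comparable [0,1] probabilities make the merge valid.
THRESHOLD = 0.35

web_results     = [("w1", 0.92), ("w2", 0.40), ("w3", 0.10)]
support_results = [("s1", 0.75), ("s2", 0.20)]

# Probabilities from different sources can be sorted together directly.
merged = sorted(web_results + support_results,
                key=lambda r: r[1], reverse=True)
kept = [doc for doc, p in merged if p >= THRESHOLD]

if not kept:
    # All documents too weak? Try something else -- e.g. spell-correct
    # the query and re-run the search.
    pass

print(kept)  # ['w1', 's1', 'w2']
```

With raw TF/IDF scores this merge would be meaningless, because scores from differently sized indexes are not on a common scale.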
23. The Age of Enlightenment for search engine accuracy is upon us!
24. Search Accuracy Metrics & Predictive Analytics
A Big Data Use Case
Paul Nelson
Chief Architect, Search Technologies
pnelson@searchtechnologies.com
Thank you!