Encores? - Going beyond matching and ranking of search results
Berlin Buzzwords 2021
Eric Pugh, René Kriegler
Who we are
@renekrie @dep4b
Combined 30 years of experience in search
Open Source enthusiasts
ASF member, committers on: Solr, Querqy, SMUI, Quepid
https://mices.co
Search - beyond matching and ranking
We tend to focus on matching and ranking. Other search features are almost
treated like an afterthought, like ‘encores’ that follow the main performance:
● Facets
● Query auto-completion
● Spelling correction
● Query relaxation
=> BUT: These are essential features that help the user formulate the query and
understand and narrow down the results
Our main act today (not encores!)
● Facets
● Query auto-completion
● Spelling correction
● Query relaxation
Learn about solutions that come out of the box (in Solr)
Typical challenges and how to overcome them
Advanced solutions: understand the concepts, create your own
The Art of Facets
Facets help the user ...
● understand the search results (see ‘what is there’, learn about the domain)
● narrow down search results
Chorus Electronics Project: https://github.com/querqy/chorus
Try the Demo Ecommerce Shop: http://chorus.dev.o19s.com:4000/
Challenges
Getting the counts right in e-commerce search
Showing the best facets in the best order
Selecting the facet values to show
“Qui numerare incipit errare incipit” (“Whoever begins to count begins to err”)
Facet Counts
Facets and filters
A trivial example:
query=t-shirts
filter=color:black
Challenge: We still need to count all colours in the facets, even if the search result
contains only black t-shirts
Solution: Tagging and exclusion of filters
Facets and filters: tagging and exclusion
Tagging:
fq={!tag=f_color}color:black
Exclusion:
Facet param
facet.field={!ex=f_color}color
JSON facets
"facet": {
"color": {
"type": "terms",
"field": "color",
"domain": {
"excludeTags":"f_color"
}
}
}
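The tagging and exclusion pattern above can also be generated programmatically. The following is a minimal Python sketch, not a definitive client: the helper name `build_params` is hypothetical, the tag naming `f_<field>` simply follows the `f_color` example, and filter values are assumed to need no escaping. It only builds request parameters (`q`, `fq`, `json.facet`) and does not talk to Solr.

```python
import json


def build_params(query, filters, facet_fields):
    """Build Solr request parameters with tagged filters and excluding facets.

    filters: dict of field -> value; each filter is tagged f_<field>.
    facet_fields: fields to facet on; each facet excludes the filter on its own field.
    """
    params = [("q", query)]
    for field, value in filters.items():
        # e.g. {!tag=f_color}color:black
        params.append(("fq", f"{{!tag=f_{field}}}{field}:{value}"))
    facets = {
        field: {
            "type": "terms",
            "field": field,
            # Ignore the filter on this field when counting its values
            "domain": {"excludeTags": f"f_{field}"},
        }
        for field in facet_fields
    }
    params.append(("json.facet", json.dumps(facets)))
    return params


params = build_params("t-shirts", {"color": "black"}, ["color", "brand"])
for name, value in params:
    print(name, "=", value)
```

With this request, the color facet still counts all colours, while the brand facet is counted on black t-shirts only.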
Challenge: product variants
Product ID: 9739, brand: “inteemate”
Size  Price
XS    11.99
XL    11.99
S     12.99
S     12.99
M     13.99
L     13.99
color: [green, yellow, blue]
size: [XS, S, M, L, XL]
price: [11.99, 12.99, 13.99]
Merge into a single document?
Facets would work great, but we would get
false matches for the filter color:green AND size:M
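To see why the merged document produces false matches, here is a small Python sketch. The assignment of colours to individual variants is an assumption made for illustration (the slides only list the merged colour values), and the function names are hypothetical.

```python
# Variants of product 9739. The colour of each variant is ASSUMED for this example.
variants = [
    {"size": "XS", "color": "green", "price": 11.99},
    {"size": "XL", "color": "green", "price": 11.99},
    {"size": "S", "color": "yellow", "price": 12.99},
    {"size": "S", "color": "blue", "price": 12.99},
    {"size": "M", "color": "blue", "price": 13.99},
    {"size": "L", "color": "blue", "price": 13.99},
]


def merge(variants):
    """Merge all variants into one document with multivalued fields."""
    return {f: sorted({v[f] for v in variants}) for f in ("size", "color", "price")}


def matches_merged(doc, **filters):
    """Each filter only needs to match SOME value of the multivalued field."""
    return all(value in doc[field] for field, value in filters.items())


def matches_any_variant(variants, **filters):
    """All filters must match the SAME variant."""
    return any(all(v[f] == val for f, val in filters.items()) for v in variants)


merged = merge(variants)
print(matches_merged(merged, color="green", size="M"))         # True: false match
print(matches_any_variant(variants, color="green", size="M"))  # False: no such variant
```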
Challenge: product variants
Best solution (in our opinion):
● Index one document for each variant
● Group variants at query time using the collapse query parser:
fq={!collapse field=productId}
=> Boolean filters work as expected
=> Great flexibility for counting facets
=> Fast enough
Challenge: product variants
Size  Price  Product  Brand
XS    11.99  9739     inteemate
XL    11.99  9739     inteemate
S     12.99  9739     inteemate
S     12.99  9739     inteemate
M     13.99  9739     inteemate
L     13.99  9739     inteemate
filter query: fq=brand:inteemate
filter query: fq=color:blue
query: q=t-shirt
post filter query: fq={!collapse field=productId}
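The order of operations above (filter individual variants first, collapse to one document per product last) can be simulated in a few lines of Python. This is only an illustration of the idea, not Solr's implementation: the per-variant colours and the `score` values are assumptions, the scores stand in for the relevance of `q=t-shirt`, and the sketch keeps the highest-scoring variant per product as the group head.

```python
def collapse(docs, field):
    """Keep the highest-scoring document per value of `field`."""
    heads = {}
    for doc in docs:
        key = doc[field]
        if key not in heads or doc["score"] > heads[key]["score"]:
            heads[key] = doc
    return list(heads.values())


def search(docs, filters, collapse_field):
    """Apply Boolean filters per variant, then collapse as the last step."""
    matching = [d for d in docs if all(d[f] == v for f, v in filters.items())]
    return collapse(matching, collapse_field)


def variant(size, color, score):
    # Colours and scores are assumed values for illustration.
    return {"productId": 9739, "brand": "inteemate",
            "size": size, "color": color, "score": score}


variants = [
    variant("XS", "green", 1.0), variant("XL", "green", 1.0),
    variant("S", "yellow", 1.1), variant("S", "blue", 1.2),
    variant("M", "blue", 1.3), variant("L", "blue", 1.1),
]

results = search(variants, {"brand": "inteemate", "color": "blue"}, "productId")
print([(d["productId"], d["size"]) for d in results])  # [(9739, 'M')]

# A filter combination that no single variant satisfies yields no product at all
print(search(variants, {"color": "green", "size": "M"}, "productId"))  # []
```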
Challenge: product variants in facets
Size  Price  Product  Brand
S     12.99  9739     inteemate
M     13.99  9739     inteemate
L     13.99  9739     inteemate
...
post filter query: fq={!collapse field=productId}
Facet counts will be correct for product attributes (“brand”)
Challenge: product variants in facets
post filter query: fq={!collapse field=productId tag=coll}
For facet counts of variant attributes, we have to tag and exclude the collapse filter:
"facet": {
"size": {
"type": "terms",
"field": "size",
"domain": {
"excludeTags":"coll"
}
}
}
Sizes S, M, L all shown as ‘1 result’ in facets ✅
Challenge: product variants in facets
The same tagging and exclusion of the collapse filter, applied to the color facet:
"facet": {
"color": {
"type": "terms",
"field": "color",
"domain": {
"excludeTags":"coll"
}
}
}
Sizes S, M, L all shown as ‘1 result’ in facets ✅
Color ‘Blue’ shown as ‘3 results’ in facets ❌
Challenge: product variants in facets
Solution: keep excluding the collapse filter, but count unique product IDs per facet value:
"facet": {
"color": {
"type": "terms",
"field": "color",
"domain": {
"excludeTags":"coll"
},
"facet": {
"numProducts":"unique(productId)"
}
}
}
Sizes S, M, L all shown as ‘1 result’ in facets ✅
Color ‘Blue’ shown as ‘1 result’ in facets ✅
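The difference between counting documents and counting unique products per facet value can be reproduced with a short Python sketch. It mimics what a terms facet with a nested `unique(productId)` aggregation reports; the function name `facet_counts` is hypothetical and the three blue variants are the ones from the example above.

```python
from collections import defaultdict


def facet_counts(docs, field, unique_field="productId"):
    """Per facet value: number of documents and number of distinct products."""
    doc_counts = defaultdict(int)
    product_ids = defaultdict(set)
    for doc in docs:
        value = doc[field]
        doc_counts[value] += 1
        product_ids[value].add(doc[unique_field])
    return {
        value: {"count": doc_counts[value], "numProducts": len(product_ids[value])}
        for value in doc_counts
    }


# Variants that remain once the collapse filter is excluded
blue_variants = [
    {"productId": 9739, "size": "S", "color": "blue"},
    {"productId": 9739, "size": "M", "color": "blue"},
    {"productId": 9739, "size": "L", "color": "blue"},
]

print(facet_counts(blue_variants, "color"))
# {'blue': {'count': 3, 'numProducts': 1}}
print(facet_counts(blue_variants, "size"))
```

Showing `numProducts` instead of `count` in the UI gives ‘1 result’ for both the size and the color facet.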
Collapse query parser - notes on implementation
● Beware of high cardinality of product IDs.
○ If you have 10M different product IDs in your index, the collapse query parser will allocate
heap space for 2 arrays (float/int) x 10M elements (ca. 80 MB) per request!
○ Solution:
■ Many products have just 1 variant. It’s better to leave the productId empty in this case.
■ Combine with nullPolicy=expand, which avoids reserving array space for products
without a productId:
fq={!collapse field=productId nullPolicy=expand}
● All variants of a product must be indexed to the same shard
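The figure of about 80 MB can be reproduced with simple arithmetic. This is a back-of-the-envelope sketch that assumes 4 bytes per float and 4 bytes per int element; actual allocation depends on the Solr version and field type. The second call uses a hypothetical number of multi-variant products to show the effect of leaving productId empty for single-variant products.

```python
def collapse_heap_bytes(num_product_ids, bytes_per_float=4, bytes_per_int=4):
    """Rough heap needed per request: one float and one int slot per product ID."""
    return num_product_ids * (bytes_per_float + bytes_per_int)


# 10M distinct product IDs, as in the example above
print(collapse_heap_bytes(10_000_000) / 1_000_000, "MB")  # 80.0 MB

# ASSUMPTION for illustration: only 2M products have more than one variant
# and therefore still carry a productId when combined with nullPolicy=expand
print(collapse_heap_bytes(2_000_000) / 1_000_000, "MB")   # 16.0 MB
```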
Facet Selection
Which facets should we show?
Some domains are rich in attributes. For example, electronics could use 10k
different attributes.
Even if we reduced the number of attributes used in facets at index time, we
could be left with several hundred candidates for faceting.
Building a request for hundreds of facets is not feasible. We’ll show a simple
solution that just uses the search engine to select facets.
At the other end of the spectrum, you could train a model that predicts which
facets to show for a given query.
Which facets should we show? - Solution
Index a multivalued field that holds the names of the facettable fields...
Doc1:
screenSize: 17, ....
facettableFields: [
“screenSize”, “ramGB”, “height”, “width”, ...
]
Doc2:
...
facettableFields: [
“screenSize”, “numHDMIPorts”, “height”, “width”, ...
]
Which facets should we show? - Solution
... and execute an additional, prior facet request on this field. Add the facet values
returned by this request as facet parameters to the main request:
"facet": {
"facettable_fields": {
"type": "terms",
"field": "facettableFields"
}
}
(query/filter queries are the same as in the ‘main request’)
Response:
"facets":{
"facettable_fields":{
"buckets":[
{"val":"screenSize","count":12},
{"val":"ramGB","count":4},
{"val":"height","count":3}
]
}
}
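Turning the response of the prior request into the facet definitions of the main request is a small transformation. The following Python sketch assumes the response shape shown above; the function name `build_main_facets` and the `max_facets` cut-off are hypothetical, and the buckets are assumed to arrive sorted by count.

```python
import json


def build_main_facets(prior_response, facet_name="facettable_fields", max_facets=10):
    """Create one terms facet per field name returned by the prior facet request."""
    buckets = prior_response["facets"][facet_name]["buckets"]
    field_names = [bucket["val"] for bucket in buckets[:max_facets]]
    return {name: {"type": "terms", "field": name} for name in field_names}


prior_response = {
    "facets": {
        "facettable_fields": {
            "buckets": [
                {"val": "screenSize", "count": 12},
                {"val": "ramGB", "count": 4},
                {"val": "height", "count": 3},
            ]
        }
    }
}

main_facets = build_main_facets(prior_response, max_facets=2)
print(json.dumps({"facet": main_facets}, indent=2))
```

Limiting the number of facets keeps the main request small even when many fields are facettable for the current result set.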
Which facets should we show? - Solution
Index additional information together with the names of facettable fields
facettableFields: [
"00010;screenSize;Screen size ",
"00100;ramGB;Memory (GB) ",
"00005;height;Height ",
"00005;width;Width ",
...]
Format of each value: importance;field name;label
(zero-padding the importance makes the values sortable!)
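A client can split such values back into their parts and order the facets by importance. In this Python sketch the function name `parse_facettable` is hypothetical, and it is an assumption that a larger number means a more important facet; flip the sort direction if your convention is the opposite. Thanks to the zero-padding, plain string sorting gives the same order as numeric sorting.

```python
def parse_facettable(value):
    """Split 'importance;fieldName;label' into its three parts."""
    importance, field, label = value.split(";", 2)
    return {"importance": int(importance), "field": field, "label": label.strip()}


values = [
    "00010;screenSize;Screen size ",
    "00100;ramGB;Memory (GB) ",
    "00005;height;Height ",
    "00005;width;Width ",
]

# Plain string sort; works because the importance prefix is zero-padded
ordered = [parse_facettable(v) for v in sorted(values, reverse=True)]
for facet in ordered:
    print(facet["importance"], facet["field"], "->", facet["label"])
```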
Facet Value Selection
Which facet values?
Category pills are included dynamically if the entropy model says they are
meaningful for filtering the data
Shannon’s Entropy Worksheet
https://bit.ly/measure-diversity
(https://en.wikipedia.org/wiki/Entropy_(information_theory))
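Shannon entropy over the counts of a facet's values is straightforward to compute. The Python sketch below shows only the measure itself; the example counts and the threshold in `is_meaningful` are hypothetical, and how the entropy value is turned into a decision is an assumption rather than the model used in the talk.

```python
import math


def shannon_entropy(counts):
    """Entropy in bits of the distribution given by the facet value counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    probabilities = [c / total for c in counts if c > 0]
    return sum(p * math.log2(1 / p) for p in probabilities)


def is_meaningful(counts, min_entropy=0.5):
    """HYPOTHETICAL rule: show the values only if they split the results enough."""
    return shannon_entropy(counts) >= min_entropy


print(shannon_entropy([5, 5]))        # 1.0 (two equally likely categories)
print(shannon_entropy([1, 1, 1, 1]))  # 2.0 (four equally likely categories)
print(shannon_entropy([10]))          # 0.0 (one category, nothing to filter)
print(is_meaningful([98, 1, 1]))      # False: almost everything is in one category
```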
Auto-completion & spelling correction
Autocompletion - Using a Suggester
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">mySuggester</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="suggestAnalyzerFieldType">text_general</str>
<str name="buildOnCommit">true</str>
<str name="field">dictionary</str>
</lst>
</searchComponent>
Experiment with combinations of Lookups & Dictionary implementations.
Spellchecking - Using Solr component
Two flavours: single-term corrections (“cofffee” --> “coffee”) and
collations (“expresso machine” --> “espresso machine”)
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">title</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.maxCollations">100</str>
<str name="spellcheck.maxCollationTries">5</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.collateParam.mm">100%</str>
Good collations are
what we want!
Autocompletion - using a query index
User enters: ja
Show the best query completions
in the best order!
Using ideas from Joshua Bacher/Christine Bellstedt, Search
Suggestions - The Underestimated Killer Feature of your Online Shop.
Berlin Buzzwords 2018
Autocompletion - using a query index
User enters: ja
Match prefix in field “match”
q=*:*&fq=match:ja
indexed with EdgeNGramFilter,
lowercase, remove accents/ASCII
folding, ...
Optionally index and match
spelling variants (jacket/jakcet)
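What ends up in the “match” field can be illustrated outside of Solr. The Python sketch below approximates the analysis steps named above (lowercasing, accent removal, edge n-grams); it is not Solr's analysis chain. Treating the whole query as a single token and the `min_gram`/`max_gram` values are assumptions.

```python
import unicodedata


def normalise(text):
    """Lowercase and strip accents (a simple form of ASCII folding)."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edge_ngrams(text, min_gram=1, max_gram=20):
    """All prefixes of the normalised text between min_gram and max_gram characters."""
    norm = normalise(text)
    return [norm[:n] for n in range(min_gram, min(len(norm), max_gram) + 1)]


print(edge_ngrams("Jäcket"))
# ['j', 'ja', 'jac', 'jack', 'jacke', 'jacket']

# The user input is normalised the same way and matched as a whole term
print(normalise("Ja") in edge_ngrams("Jäcket"))  # True
```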
Autocompletion - using a query index
Sort by “weight desc”
q=*:*&fq=match:ja
&sort=weight desc
Sorting might get slow for short
prefixes if the query index is large -
tag the top N queries for lengths 1 and
2 and add another filter (fewer
matches to sort, nicely cacheable):
q=*:*&fq=match:ja
&sort=weight desc
&fq=top_len_2:true
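The tagging of top queries for short prefixes would happen when the query index is built. A minimal Python sketch of that step follows; the field names `top_len_1` and `top_len_2` follow the example above, while the function name, the value of `n` and the sample queries with their weights are assumptions.

```python
from collections import defaultdict


def tag_top_queries(queries, n=2, lengths=(1, 2)):
    """Flag the n heaviest queries per prefix of each given length."""
    docs = [dict(q) for q in queries]  # copy, do not modify the input
    for length in lengths:
        by_prefix = defaultdict(list)
        for doc in docs:
            if len(doc["match"]) >= length:
                by_prefix[doc["match"][:length]].append(doc)
        for group in by_prefix.values():
            group.sort(key=lambda d: d["weight"], reverse=True)
            for doc in group[:n]:
                doc[f"top_len_{length}"] = True
    return docs


queries = [
    {"match": "jacket", "weight": 100},
    {"match": "jeans", "weight": 90},
    {"match": "jam", "weight": 10},
    {"match": "java", "weight": 5},
]

for doc in tag_top_queries(queries):
    print(doc)
```

With these flags, a request for the prefix “ja” only has to sort the documents that carry `top_len_2:true`.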
Autocompletion - using a query index
If two queries have the same
fingerprint, drop the one with the
lower weight
Fingerprint: concatenated sorted,
normalised query tokens
This increases the diversity of the
suggestions.
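The fingerprint rule can be sketched in a few lines of Python. Normalisation here is just lowercasing and whitespace tokenisation, which is an assumption; a real setup would probably reuse the analysis applied to the query index. The sample suggestions and weights are made up.

```python
def fingerprint(query):
    """Concatenate the sorted, normalised tokens of a query."""
    return " ".join(sorted(query.lower().split()))


def dedupe(suggestions):
    """Keep only the heaviest suggestion per fingerprint, ordered by weight."""
    best = {}
    for query, weight in suggestions:
        fp = fingerprint(query)
        if fp not in best or weight > best[fp][1]:
            best[fp] = (query, weight)
    return sorted(best.values(), key=lambda s: s[1], reverse=True)


suggestions = [("black jacket", 50), ("jacket black", 20), ("jacket", 100)]
print(dedupe(suggestions))
# [('jacket', 100), ('black jacket', 50)]
```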
Autocompletion - using a query index
Suggest the Labels as query
completions!
Autocompletion - using a query index
You can show the best matching
category for disambiguation and
affirmation:
* jacket
* jacket in Fashion
Spelling correction - using a query index
Structure similar to query index for
autocompletion
Copy of the ‘match’ field indexed
as n-grams
Spelling correction - using a query index
Filters on edit distance and rank
based on n-grams (via TF*IDF)
q=jakc jones
&defType=edismax
&qf=match_ngram
&sow=false
&fq=match:jakc jones~2
Spelling correction - using a query index
Add boost by weight (or a function
of it)
q=jakc jones
&defType=edismax
&qf=match_ngram
&sow=false
&fq=match:jakc jones~2
&boost=weight
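The interplay of the three ingredients (edit-distance filter, n-gram based ranking, boost by weight) can be simulated without a search engine. The Python sketch below is a simplification under stated assumptions: it uses plain Levenshtein distance on the whole query string, counts shared character bigrams as a stand-in for TF*IDF scoring on the n-gram field, and multiplies by the weight. The candidate queries and weights are made up.

```python
def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def ngrams(text, n=2):
    """Set of character n-grams, padded with a space on both sides."""
    padded = f" {text} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def correct(user_query, query_index, max_edits=2):
    """Filter candidates by edit distance, rank by n-gram overlap * weight."""
    user_grams = ngrams(user_query)
    scored = []
    for candidate, weight in query_index.items():
        if edit_distance(user_query, candidate) > max_edits:
            continue  # corresponds to the fuzzy filter query
        overlap = len(user_grams & ngrams(candidate))
        scored.append((overlap * weight, candidate))
    scored.sort(reverse=True)
    return [candidate for _, candidate in scored]


query_index = {"jack jones": 100, "jack johnson": 80, "jake jones": 5}
print(correct("jakc jones", query_index))
# ['jack jones', 'jake jones']
```

“jake jones” shares more bigrams with the input, but the much higher weight of “jack jones” decides the ranking.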
General model for spelling correction &
autocompletion
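Noisy Channel Model / Bayesian Inference
(Kernighan et al., 1990; Jurafsky & Martin, 2009)
Prior (how likely a candidate query is): our ‘Weight’ field
Channel model (how likely the user input is, given the candidate): edit distance, n-gram
model, keyboard layout (Symspell!), ... prefix match for autocompletion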
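For reference, the noisy channel model is usually written as follows. The notation is the common textbook form and an assumption about what the slide showed: w is the text the user typed and c is a candidate query from the query index.

```latex
\hat{c} \;=\; \operatorname*{arg\,max}_{c} \; P(c \mid w)
        \;=\; \operatorname*{arg\,max}_{c} \; P(w \mid c)\, P(c)
```

P(c) is the prior, approximated by the weight of the candidate query. P(w | c) is the channel model, approximated by signals such as edit distance or n-gram similarity for spelling correction, or a prefix match for autocompletion. The denominator P(w) is the same for all candidates and can be dropped.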
Query relaxation
Query relaxation
Which query term should we drop if we can’t match all of them together?
jacket xs green
iphone 12
Query relaxation - ‘mm’ anti-pattern
Loosening the ‘minimum should match’ (mm) constraint to < 100%
iphone 12
You’ll get matches for “12”
She will probably just see imprecise results that don’t match her
query exactly.
You cannot tell the user what happened and which term you
dropped, so she wouldn’t know what to do to improve the
query.
Don’t do this!
At least not in e-commerce search
Query relaxation - Solutions
René Kriegler, Query Relaxation - a rewriting technique
between search and recommendations. Haystack
Conference 2019
Query relaxation - Solutions
Try searching with each term individually and drop the one from
the query that yields the fewest results (might require additional
rules to avoid just keeping number terms)
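This heuristic can be sketched in Python as follows. The `count_hits` callback stands for a search request per term and is hypothetical, as are the hit counts in the example. The rule that the remaining query must not consist of numbers only is just one possible example of the ‘additional rules’ mentioned above.

```python
def relax_query(terms, count_hits):
    """Drop the term with the fewest individual hits, unless only numbers would remain."""
    hits = {term: count_hits(term) for term in terms}
    # Try candidates for dropping, starting with the term that has the fewest hits
    for candidate in sorted(terms, key=lambda t: hits[t]):
        remaining = [t for t in terms if t != candidate]
        if remaining and not all(t.isdigit() for t in remaining):
            return remaining
    return list(terms)  # nothing can be dropped


# HYPOTHETICAL hit counts per single-term search
hit_counts = {"jacket": 1200, "xs": 300, "green": 900, "iphone": 500, "12": 4000}

print(relax_query(["jacket", "xs", "green"], hit_counts.get))  # ['jacket', 'green']
print(relax_query(["iphone", "12"], hit_counts.get))           # ['iphone']
```

Because the application decides which term to drop, it can tell the user which term was removed, which is not possible when loosening mm.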
Query relaxation - Solutions
Multi-layer Neural Network,
Word embeddings as input to represent terms
Encores?
Facets, autocompletion, spelling correction, query relaxation are important
features of a search application.
We’ve shown simple out-of-the-box solutions and a path to implement more
advanced approaches.
Thank you!

Haystack 2019 - Natural Language Search with Knowledge Graphs - Trey GraingerHaystack 2019 - Natural Language Search with Knowledge Graphs - Trey Grainger
Haystack 2019 - Natural Language Search with Knowledge Graphs - Trey Grainger
 
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...
 
Haystack 2019 - Improving Search Relevance with Numeric Features in Elasticse...
Haystack 2019 - Improving Search Relevance with Numeric Features in Elasticse...Haystack 2019 - Improving Search Relevance with Numeric Features in Elasticse...
Haystack 2019 - Improving Search Relevance with Numeric Features in Elasticse...
 
Haystack 2019 - Architectural considerations on search relevancy in the conte...
Haystack 2019 - Architectural considerations on search relevancy in the conte...Haystack 2019 - Architectural considerations on search relevancy in the conte...
Haystack 2019 - Architectural considerations on search relevancy in the conte...
 
Haystack 2019 - Custom Solr Query Parser Design Option, and Pros & Cons - Ber...
Haystack 2019 - Custom Solr Query Parser Design Option, and Pros & Cons - Ber...Haystack 2019 - Custom Solr Query Parser Design Option, and Pros & Cons - Ber...
Haystack 2019 - Custom Solr Query Parser Design Option, and Pros & Cons - Ber...
 
Haystack 2019 - Establishing a relevance focused culture in a large organizat...
Haystack 2019 - Establishing a relevance focused culture in a large organizat...Haystack 2019 - Establishing a relevance focused culture in a large organizat...
Haystack 2019 - Establishing a relevance focused culture in a large organizat...
 
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...
 
2019 Haystack - How The New York Times Tackles Relevance - Jeremiah Via
2019 Haystack - How The New York Times Tackles Relevance - Jeremiah Via2019 Haystack - How The New York Times Tackles Relevance - Jeremiah Via
2019 Haystack - How The New York Times Tackles Relevance - Jeremiah Via
 
Haystack 2019 - Addressing variance in AB tests: Interleaved evaluation of ra...
Haystack 2019 - Addressing variance in AB tests: Interleaved evaluation of ra...Haystack 2019 - Addressing variance in AB tests: Interleaved evaluation of ra...
Haystack 2019 - Addressing variance in AB tests: Interleaved evaluation of ra...
 

KĂŒrzlich hochgeladen

leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
alexjohnson7307
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
HarisZaheer8
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
Hiike
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
saastr
 
Operating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptxOperating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptx
Pravash Chandra Das
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStrDeep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
saastr
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
fredae14
 
Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!
GDSC PJATK
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-UniversitÀt
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 

KĂŒrzlich hochgeladen (20)

leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
 
Operating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptxOperating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptx
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStrDeep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
 
Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 

Encores

  • 1. Encores? - Going beyond matching and ranking of search results Berlin Buzzwords 2021 Eric Pugh, René Kriegler
  • 2. Who we are @renekrie @dep4b Combined 30 years of experience in search Open Source enthusiasts ASF member, Committers on: Solr, Querqy, SMUI, Quepid,
  • 4. Search - beyond matching and ranking We tend to focus on matching and ranking. Other search features are almost treated like an afterthought, like ‘encores’ that follow the main performance: ● Facets ● Query auto-completion ● Spelling correction ● Query relaxation => BUT: These are essential features that help the user formulate the query, understand and narrow down the results
  • 5. Our main act today (not encores!) ● Facets ● Query auto-completion ● Spelling correction ● Query relaxation Learn about solutions that come out of the box (in Solr) Typical challenges and how to overcome them Advanced solutions: understand the concepts, create your own
  • 6. The Art of Facets
  • 7. Facets help the user ... ● understand the search results (see ‘what is there’, learn about the domain) ● narrow down search results Chorus Electronics Project: https://github.com/querqy/chorus Try the Demo Ecommerce Shop: http://chorus.dev.o19s.com:4000/
  • 8. Facets help the user ... ● understand the search results (see ‘what is there’, learn about the domain) ● narrow down search results
  • 9. Challenges Getting the counts right in e-commerce search Showing the best facets in the best order Selecting the facet values to show
  • 10. “Qui numerare incipit errare incipit” Facet Counts
  • 11. Facets and filters A trivial example: query=t-shirts filter=color:black Challenge: We still need to count all colours in the facets, even if the search result contains only black t-shirts Solution: Tagging and exclusion of filters
  • 12. Facets and filters: tagging and exclusion Tagging: fq={!tag=f_color}color:black Exclusion: Facet param facet.field={!ex=f_color}color JSON facets "facet": { "color": { "type": "terms", "field": "color", "domain": { "excludeTags":"f_color" } } }
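The tagging and exclusion pattern on the slide above can be sketched as a small request builder. This is only an illustration: the helper name `build_facet_request` and the `f_<field>` tag naming scheme are assumptions, and filter values are not escaped here as they would need to be in a real application.

```python
import json
from urllib.parse import urlencode


def build_facet_request(query, filters, facet_fields):
    """Build Solr request parameters with one tagged filter per field.

    filters: dict mapping field name -> selected value
    facet_fields: fields to facet on; each facet excludes the filter
    on its own field so that all of its values are still counted.
    """
    params = [("q", query)]
    for field, value in filters.items():
        # Tag each filter so that a facet can refer to it by name.
        params.append(("fq", f"{{!tag=f_{field}}}{field}:{value}"))

    json_facet = {
        field: {
            "type": "terms",
            "field": field,
            # Ignore the filter on this field when counting its values.
            "domain": {"excludeTags": f"f_{field}"},
        }
        for field in facet_fields
    }
    params.append(("json.facet", json.dumps(json_facet)))
    return params


params = build_facet_request("t-shirts", {"color": "black"}, ["color", "size"])
print(urlencode(params))
```

With this request the color facet still counts all colours, while the result list itself contains only black t-shirts.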
  • 13. Challenge: product variants Product ID: 9739, brand: “inteemate” Size: XS Price: 11.99 Size: XL Price: 11.99 Size: S Price: 12.99 Size: S Price: 12.99 Size: M Price: 13.99 Size: L Price: 13.99
  • 14. Challenge: product variants Product ID: 9739, brand: “inteemate” Size: XS Price: 11.99 Size: XL Price: 11.99 Size: S Price: 12.99 Size: S Price: 12.99 Size: M Price: 13.99 Size: L Price: 13.99 color: [green, yellow, blue] size: [XS, S, M, L, XL] price: [11.99, 12.99, 13.99] Merge into single document?? Facets would work great but false matches for filter color:green AND size:M
  • 15. Challenge: product variants Best solution (in our opinion): ● Index one document for each variant ● Group variants at query time using the collapse query parser: fq={!collapse field=productId} => Boolean filters work as expected => Great flexibility for counting facets => Fast enough
  • 16. Challenge: product variants Size: XS Price: 11.99 Product: 9739 Brand: inteemate Size: XL Price: 11.99 Product: 9739 Brand: inteemate Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: M Price: 13.99 Product: 9739 Brand: inteemate Size: L Price: 13.99 Product: 9739 Brand: inteemate filter query fq=brand:inteemate
  • 17. Challenge: product variants Size: XS Price: 11.99 Product: 9739 Brand: inteemate Size: XL Price: 11.99 Product: 9739 Brand: inteemate Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: M Price: 13.99 Product: 9739 Brand: inteemate Size: L Price: 13.99 Product: 9739 Brand: inteemate filter query fq=brand:inteemate filter query fq=color:blue
  • 18. Challenge: product variants Size: XS Price: 11.99 Product: 9739 Brand: inteemate Size: XL Price: 11.99 Product: 9739 Brand: inteemate Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: M Price: 13.99 Product: 9739 Brand: inteemate Size: L Price: 13.99 Product: 9739 Brand: inteemate filter query fq=brand:inteemate filter query fq=color:blue query q=t-shirt
  • 19. Challenge: product variants Size: XS Price: 11.99 Product: 9739 Brand: inteemate Size: XL Price: 11.99 Product: 9739 Brand: inteemate Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: M Price: 13.99 Product: 9739 Brand: inteemate Size: L Price: 13.99 Product: 9739 Brand: inteemate filter query fq=brand:inteemate filter query fq=color:blue query q=t-shirt post filter query fq={!collapse field=productId}
  • 20. Challenge: product variants in facets Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: M Price: 13.99 Product: 9739 Brand: inteemate Size: L Price: 13.99 Product: 9739 Brand: inteemate ... post filter query fq={!collapse field=productId} Facet counts will be correct for product attributes (“brand”)
  • 21. Challenge: product variants in facets Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: M Price: 13.99 Product: 9739 Brand: inteemate Size: L Price: 13.99 Product: 9739 Brand: inteemate ... post filter query fq={!collapse field=productId tag=coll} For facet counts of variant attributes we’ll have to tag and exclude collapse filter: "facet": { "size": { "type": "terms", "field": "size", "domain": { "excludeTags":"coll" } } } Sizes S, M, L all shown as ‘1 result’ in facets ✅
  • 22. Challenge: product variants in facets Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: M Price: 13.99 Product: 9739 Brand: inteemate Size: L Price: 13.99 Product: 9739 Brand: inteemate ... post filter query fq={!collapse field=productId tag=coll} For facet counts of variant attributes we’ll have to tag and exclude collapse filter: "facet": { "color": { "type": "terms", "field": "color", "domain": { "excludeTags":"coll" } } } Sizes S, M, L all shown as ‘1 result’ in facets ✅ Color ‘Blue’ shown as ‘3 results’ in facets ❌
  • 23. Challenge: product variants in facets Size: S Price: 12.99 Product: 9739 Brand: inteemate Size: M Price: 13.99 Product: 9739 Brand: inteemate Size: L Price: 13.99 Product: 9739 Brand: inteemate ... post filter query fq={!collapse field=productId tag=coll} For facet counts of variant attributes we’ll have to tag and exclude collapse filter: "facet": { "color": { "type": "terms", "field": "color", "domain": { "excludeTags":"coll" }, "facet": { "numProducts":"unique(productId)" } } } Sizes S, M, L all shown as ‘1 result’ in facets ✅ Color ‘Blue’ shown as ‘1 result’ in facets ✅
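To see why counting documents overstates the color facet while counting unique product IDs does not, the counting can be simulated in plain Python. This is a sketch, not Solr code: the three variant documents and the helper `facet_counts` are made up for illustration, and the colour value "blue" for all variants is an assumption.

```python
from collections import defaultdict

# Hypothetical variant documents for product 9739 (values are illustrative).
variants = [
    {"productId": 9739, "size": "S", "color": "blue"},
    {"productId": 9739, "size": "M", "color": "blue"},
    {"productId": 9739, "size": "L", "color": "blue"},
]


def facet_counts(docs, field):
    """Return (document count, unique product count) per facet value."""
    doc_counts = defaultdict(int)
    products = defaultdict(set)
    for doc in docs:
        value = doc[field]
        doc_counts[value] += 1
        products[value].add(doc["productId"])
    return {v: (doc_counts[v], len(products[v])) for v in doc_counts}


print(facet_counts(variants, "size"))   # each size: 1 document, 1 product
print(facet_counts(variants, "color"))  # blue: 3 documents, but only 1 product
```

The second number in each pair plays the role of the `numProducts` sub-facet in the request above.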
  • 24. Collapse query parser - notes on implementation ● Beware of high cardinality of product IDs. ○ If you have 10M different product IDs in your index, the collapse query parser will allocate heap space for 2 arrays (float/int) x 10M elements (ca. 80 MB) per request! ○ Solution: ■ Many products have just 1 variant. It’s better to leave the productId empty in this case. ■ Combine with nullPolicy=expand, which avoids reserving array space for products without a productId: fq={!collapse field=productId nullPolicy=expand} ● All variants of a product must be indexed to the same shard
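The 80 MB figure follows from simple arithmetic, sketched below. The function name is hypothetical, and the 4 bytes per element reflect the size of a Java float or int; actual heap usage in Solr may differ in detail.

```python
def collapse_heap_bytes(num_groups, bytes_per_element=4, num_arrays=2):
    """Rough per-request heap estimate: num_arrays arrays of num_groups elements."""
    return num_arrays * num_groups * bytes_per_element


# 10M product IDs, two arrays (float and int) of 4-byte elements each.
print(collapse_heap_bytes(10_000_000) / 1_000_000)  # 80.0 (MB)
```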
  • 26. Which facets should we show? Some domains are rich in attributes. For example, electronics could use 10k different attributes. Even if we reduced the number of attributes to be used in facets at index time, we could be left with several hundred candidates for faceting. Building a request for hundreds of facets is not feasible. We’ll show a simple solution that just uses the search engine to select facets. At the other end of the spectrum, you could train a model that predicts which facets to show for a given query.
  • 27. Which facets should we show? - Solution Index a multivalued field that holds the names of the facettable fields... Doc1: screenSize: 17, .... facettableFields: [ “screenSize”, “ramGB”, “height”, “width”, ... ] Doc2: ... facettableFields: [ “screenSize”, “numHDMIPorts”, “height”, “width”, ... ]
  • 28. Which facets should we show? - Solution ... and execute an additional, prior facet request on this field. Add the facet values returned by this request as facet parameters to the main request: "facet": { "facettableFields": { "type": "terms", "field": "facettableFields" } } (query/filter queries are the same as in the ‘main request’) "facets":{ "facettableFields":{ "buckets":[{ "val":"screenSize", "count":12}, { "val":"ramGB", "count":4}, { "val":"height", "count":3} ] } }
  • 29. Which facets should we show? - Solution Index additional information together with the names of facettable fields facettableFields: [ "00010;screenSize;Screen size ", "00100;ramGB;Memory (GB) ", "00005;height;Height ", "00005;width;Width ", ...] Each value has the form importance;field name;label (padding the importance makes values sortable!)
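The two-step approach (prior facet request, then main request) might look like the following on the client side. This is a minimal sketch under assumptions: the helper names are invented, a higher importance value is taken to mean a more important facet, and the bucket values mimic the encoded format shown on the slide rather than a real Solr response.

```python
def parse_facettable(value):
    """Split 'importance;fieldName;label' into its three parts."""
    importance, field, label = value.split(";", 2)
    return {"importance": importance, "field": field, "label": label.strip()}


def select_facets(buckets, max_facets=3):
    """Pick the most important facettable fields from the prior facet response."""
    parsed = [parse_facettable(b["val"]) for b in buckets]
    # Zero-padded importance strings sort correctly as plain strings.
    parsed.sort(key=lambda f: f["importance"], reverse=True)
    return parsed[:max_facets]


def build_main_facets(selected):
    """Build the JSON facet section of the main request."""
    return {f["field"]: {"type": "terms", "field": f["field"]} for f in selected}


# Buckets as they might come back from the prior facet request (illustrative).
buckets = [
    {"val": "00010;screenSize;Screen size ", "count": 12},
    {"val": "00100;ramGB;Memory (GB) ", "count": 4},
    {"val": "00005;height;Height ", "count": 3},
]
selected = select_facets(buckets, max_facets=2)
print([f["label"] for f in selected])  # ['Memory (GB)', 'Screen size']
print(build_main_facets(selected))
```

The bucket counts from the prior request could also feed into the selection, for example to skip fields that cover only a handful of matching documents.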
  • 31. Which facet values? Category Pills being dynamically included IF the entropy model says they are meaningful for filtering the data
  • 34. Autocompletion - Using a Suggester <searchComponent name="suggest" class="solr.SuggestComponent"> <lst name="suggester"> <str name="name">mySuggester</str> <str name="lookupImpl">FuzzyLookupFactory</str> <str name="suggestAnalyzerFieldType">text_general</str> <str name="buildOnCommit">true</str> <str name="field">dictionary</str> </lst> </searchComponent> Experiment with combinations of Lookups & Dictionary implementations.
  • 35. Spellchecking - Using Solr component Two flavours: “cofffee --> coffee”, collations: “expresso machine” -->“espresso machine” <str name="spellcheck">true</str> <str name="spellcheck.dictionary">title</str> <str name="spellcheck.onlyMorePopular">true</str> <str name="spellcheck.extendedResults">true</str> <str name="spellcheck.collate">true</str> <str name="spellcheck.maxCollations">100</str> <str name="spellcheck.maxCollationTries">5</str> <str name="spellcheck.count">5</str> <str name="spellcheck.collateParam.mm">100%</str> <searchComponent name="suggest" class="solr.SuggestComponent"> <lst name="suggester"> <str name="name">mySuggester</str> <str name="lookupImpl">FuzzyLookupFactory</str> <str name="suggestAnalyzerFieldType">text_general</str> <str name="buildOnCommit">true</str> <str name="field">dictionary</str> </lst> </searchComponent> Good collations are what we want!
  • 36. Autocompletion - using a query index User enters: ja Show the best query completions in the best order! Using ideas from Joshua Bacher/Christine Bellstedt, Search Suggestions - The Underestimated Killer Feature of your Online Shop. Berlin Buzzwords 2018
  • 37. Autocompletion - using a query index User enters: ja Match prefix in field “match” q=*:*&fq=match:ja indexed with EdgeNGramFilter, lowercase, remove accents/ASCII folding, ... Optionally index and match spelling variants (jacket/jakcet)
  • 38. Autocompletion - using a query index Sort by “weight desc” q=*:*&fq=match:ja &sort=weight desc Sorting might get slow for short prefixes if the query index is large - tag the top N queries for lengths 1 and 2 and add another filter (fewer matches to sort, nicely cacheable): q=*:*&fq=match:ja &sort=weight desc &fq=top_len_2:true
  • 39. Autocompletion - using a query index If two queries have the same fingerprint, drop the one with the lower weight Fingerprint: concatenated sorted, normalised query tokens This increases the diversity of the suggestions.
  • 40. Autocompletion - using a query index Suggest the Labels as query completions!
  • 41. Autocompletion - using a query index You can show the best matching category for disambiguation and affirmation: * jacket * jacket in Fashion
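The matching, sorting and fingerprint steps from the autocompletion slides can be put together in a few lines of plain Python. This is a sketch that stands in for the Solr query index: the function names and the sample entries are invented, and prefix matching per token is an assumption about how the ‘match’ field is analysed.

```python
def matches_prefix(query, prefix):
    """True if any token of the query starts with the prefix (edge n-gram stand-in)."""
    prefix = prefix.lower()
    return any(token.startswith(prefix) for token in query.lower().split())


def fingerprint(query):
    """Concatenate the sorted, lowercased query tokens."""
    return " ".join(sorted(query.lower().split()))


def suggest(prefix, query_index, limit=5):
    """Return completions for prefix, best weight first, one per fingerprint."""
    matches = [e for e in query_index if matches_prefix(e["query"], prefix)]
    matches.sort(key=lambda e: e["weight"], reverse=True)
    seen, suggestions = set(), []
    for entry in matches:
        fp = fingerprint(entry["query"])
        if fp in seen:
            continue  # a higher-weighted query with this fingerprint was kept
        seen.add(fp)
        suggestions.append(entry["query"])
    return suggestions[:limit]


# Hypothetical query index entries with weights (e.g. query frequency).
query_index = [
    {"query": "jacket men", "weight": 120},
    {"query": "jack jones", "weight": 90},
    {"query": "jones jack", "weight": 15},
    {"query": "jeans", "weight": 200},
]
print(suggest("ja", query_index))  # ['jacket men', 'jack jones']
```

Here "jones jack" is dropped because it shares its fingerprint with the higher-weighted "jack jones".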
  • 42. Spelling correction - using a query index Structure similar to query index for autocompletion Copy of the ‘match’ field indexed as n-grams
  • 43. Spelling correction - using a query index Filters on edit distance and rank based on n-grams (via TF*IDF) q=jakc jones &defType=edismax &qf=match_ngram &sow=false &fq=match:jakc jones~2
  • 44. Spelling correction - using a query index Add boost by weight (or a function of it) q=jakc jones &defType=edismax &qf=match_ngram &sow=false &fq=match:jakc jones~2 &boost=weight
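The combination of an edit-distance filter, n-gram based ranking and a weight boost can be illustrated outside of Solr as follows. This is a sketch under assumptions: plain Levenshtein distance and Jaccard overlap of character bigrams stand in for Solr's fuzzy matching and TF*IDF scoring, and the index entries are made up.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def ngrams(text, n=2):
    """Set of character n-grams of the text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def correct(query, query_index, max_edits=2):
    """Filter candidates by edit distance, rank by n-gram overlap times weight."""
    candidates = []
    for entry in query_index:
        if edit_distance(query, entry["query"]) > max_edits:
            continue
        a, b = ngrams(query), ngrams(entry["query"])
        overlap = len(a & b) / len(a | b) if (a | b) else 0.0
        candidates.append((overlap * entry["weight"], entry["query"]))
    return [q for _, q in sorted(candidates, reverse=True)]


# Hypothetical query index entries.
query_index = [
    {"query": "jack jones", "weight": 90},
    {"query": "jack johnson", "weight": 40},
    {"query": "jake jones", "weight": 5},
]
print(correct("jakc jones", query_index))  # ['jack jones', 'jake jones']
```

Without the weight, "jake jones" would rank first on bigram overlap alone; the boost lets the far more popular "jack jones" win.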
  • 45. General model for spelling correction & autocompletion Noisy Channel Model / Bayesian Inference (Kernighan et al., 1990; Jurafsky & Martin, 2009) Our ‘Weight’ field Edit distance, n-gram model, keyboard layout (SymSpell!), ... prefix match for autocompletion
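The noisy channel model is commonly written as below, where q is the query as typed and c is a candidate query. Reading the slide's annotations, the ‘Weight’ field would play the role of the prior P(c), and edit distance, n-gram similarity or a prefix match would approximate the channel model P(q | c); this mapping is an interpretation, not a formula taken from the slide.

```latex
\hat{c} = \arg\max_{c} P(c \mid q) = \arg\max_{c} P(q \mid c)\, P(c)
```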
  • 47. Query relaxation Which query term should we drop if we can’t match all of them together? jacket xs green jacket xs green jacket xs green jacket xs green iphone 12 iphone 12 iphone 12
  • 48. Query relaxation - ‘mm’ anti-pattern Loosening the ‘minimum should match’ (mm) constraint to < 100% iphone 12 iphone 12 You’ll get matches for “12” The user will probably just see imprecise results that don’t match her query exactly. You cannot tell the user what happened and which term you dropped. She wouldn’t know what to do in order to improve the query. Don’t do this! At least not in e-commerce search
  • 49. Query relaxation - Solutions René Kriegler, Query Relaxation - a rewriting technique between search and recommendations. Haystack Conference 2019
  • 50. Query relaxation - Solutions Try searching with each term individually and drop the one from the query that yields the fewest results (might require additional rules to avoid just keeping number terms)
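The "drop the term with the fewest results" heuristic can be sketched like this. The function `relax` and the hit counts are hypothetical; in practice `count_hits` would issue one search per term against the index.

```python
def relax(query, count_hits):
    """Drop the term whose individual search yields the fewest results.

    count_hits: callable returning the number of results for a single term.
    Returns the relaxed query and the dropped term (None if nothing was dropped).
    """
    terms = query.split()
    if len(terms) < 2:
        return query, None
    idx = min(range(len(terms)), key=lambda i: count_hits(terms[i]))
    remaining = terms[:idx] + terms[idx + 1:]
    return " ".join(remaining), terms[idx]


# Hypothetical per-term hit counts.
hypothetical_counts = {"jacket": 1200, "xs": 300, "green": 850}
relaxed, dropped = relax("jacket xs green", lambda t: hypothetical_counts.get(t, 0))
print(relaxed, "| dropped:", dropped)  # jacket green | dropped: xs
```

Because the dropped term is returned, the application can tell the user which term was removed. For a query like "iphone 12", a number term such as "12" may match more documents than "iphone", which is why the slide mentions additional rules.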
  • 51. Query relaxation - Solutions Multi-layer Neural Network, Word embeddings as input to represent terms
  • 52. Encores? Facets, autocompletion, spelling correction, query relaxation are important features of a search application. We’ve shown simple out-of-the-box solutions and a path to implement more advanced approaches.