Relevance in the Wild
Daniel Gómez Villanueva
Findwise
Agenda
•  Introduction
•  Similarity functions
•  Edismax
•  Evaluating
•  Real cases
Introduction
Daniel Gómez Villanueva
Search specialist consultant
•  @ Findwise
•  @ Stockholm, Sweden
•  Studies -> UPM, KTH
•  Technologies -> Search
•  Roles -> Code Java + .NET
•  + 10 projects
•  + 10 trainings public / private
Findwise
Search driven solutions
•  Founded in 2005
•  100 employees
•  Vendor independent consultants
•  Sweden, Denmark, Norway, Finland & Poland
Relevance definition
•  Dictionary.com:
-  the condition of being relevant, or connected with the
matter at hand
•  Cambridge:
-  the degree to which something is related or useful to
what is happening or being talked about
•  Merriam-Webster (2):
-  the ability (as of an information retrieval system) to
retrieve material that satisfies the needs of the user
http://www.dictionary.com/browse/relevance
http://dictionary.cambridge.org/us/dictionary/english/relevance
https://www.merriam-webster.com/dictionary/relevance
Measuring relevance of a result set
•  Precision: % of the results that are relevant
•  Recall: % of the relevant set that is in the results
•  Only evaluates the quality of the result set
•  What is the size of the relevant set?
•  We need a measure for individual documents
https://en.wikipedia.org/wiki/Precision_and_recall
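As a rough illustration of the two measures above, the following sketch computes precision and recall for one result list. The document IDs and the `precision_recall` helper are made up for the example, and it assumes the full relevant set is known, which is exactly the part that is hard to obtain in practice.

```python
def precision_recall(results, relevant):
    """Return (precision, recall) for returned IDs and the set of relevant IDs."""
    returned = set(results)
    hits = returned & set(relevant)
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: 4 results returned, 5 documents are actually relevant.
results = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3", "d7", "d8", "d9"}
p, r = precision_recall(results, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Both numbers describe the result set as a whole; neither says anything about where in the list the two hits appear.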
What about sorting?
A list of results is only as good as its order
•  Sort by any type of field:
-  Sorting by text
-  Sorting by number
-  Sorting by rank
•  Sort by relevance:
-  Sorting by similarity score
Similarity
How does Solr calculate the score
Similarity score
Score for individual documents
•  Similarity defines the components of Lucene scoring
•  Calculated per query per field ($\mathrm{score}_{q,f}$)
-  BM25Similarity
-  TFIDFSimilarity
-  BooleanSimilarity
-  MultiSimilarity
-  PerFieldSimilarityWrapper
https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/similarities/Similarity.html
Similarity parameters
•  tf: term frequency
-  How many times a term appears in a document
-  The more times the term appears, the more relevant it is
•  idf: inverse document frequency
-  One divided by the number of documents in the whole
corpus that contain the term
-  The more documents contain the term, the less
semantically meaningful and therefore the less relevant it is
Similarity parameters
•  norm: field norm
-  Inverse of the field length
-  The longer the field, the less relevant a single term
occurrence is
•  C: coordination factor
-  Factor multiplied by the sum of the scores for all
terms in the query
-  Keeps the score within a reasonable interval of
values
TF-IDF
Term frequency x inverse document frequency
•  Classical information retrieval score
•  Default in Solr/Lucene from the first versions until version 6
$\mathrm{score}_{q,d} = C_{q,d} \sum_{t \in q} \mathrm{tf}_{t,d} \cdot \mathrm{idf}_t \cdot \mathrm{norm}_d$
https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
TF-IDF – Math normalizations
•  TF: square root
•  IDF: logarithm
•  length: inverse square root
$\mathrm{idf}_t = \log\left(\frac{\mathrm{numDocs}}{\mathrm{docFreq}_t + 1}\right) + 1$
$\mathrm{tf}_{t,d} = \sqrt{\mathrm{termfreq}_{t,d}}$
$\mathrm{norm}_d = \frac{1}{\sqrt{|d|}}$
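To make the normalizations above concrete, here is a small sketch that evaluates the three formulas for a single term and multiplies them as in the TF-IDF score. The corpus numbers are invented, the coordination factor is taken as 1, the natural logarithm is assumed, and the function names are illustrative rather than Lucene API.

```python
import math


def tf(term_freq):
    """Square-root dampening of the raw term frequency."""
    return math.sqrt(term_freq)


def idf(num_docs, doc_freq):
    """Logarithmic inverse document frequency."""
    return math.log(num_docs / (doc_freq + 1)) + 1


def norm(field_length):
    """Inverse square root of the field length."""
    return 1 / math.sqrt(field_length)


def tfidf_term_score(term_freq, num_docs, doc_freq, field_length):
    return tf(term_freq) * idf(num_docs, doc_freq) * norm(field_length)


# Hypothetical corpus of 1000 documents; same tf and field length, different rarity.
rare = tfidf_term_score(term_freq=4, num_docs=1000, doc_freq=9, field_length=100)
common = tfidf_term_score(term_freq=4, num_docs=1000, doc_freq=499, field_length=100)
print(round(rare, 4), round(common, 4))
```

With everything else equal, the term that appears in few documents contributes the larger score.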
BM25
(Slide images: other things that share the name BM25: an area gas monitor, a body piercing jewelry store, and a Soviet multiple rocket launcher)
Okapi BM25
Best Matching 25
•  Calculates relevance based on probability
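•  The effect of term frequency has a ceiling, controlled by k1
•  The length of the field affects the tf curve, controlled by b
•  Default from version 6 onward
$\mathrm{score}_{q,d} = C_{q,d} \sum_{t \in q} \mathrm{idf}_t \cdot \frac{\mathrm{termfreq}_{t,d} \cdot (k_1 + 1)}{\mathrm{termfreq}_{t,d} + k_1 \cdot \left(1 - b + b \cdot \frac{|d|}{\mathrm{avg}(|d|)}\right)}$
$k_1 = 1.2$
$b = 0.75$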
Okapi BM25 – Math normalizations
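•  TF: saturating, bounded by k1 + 1
•  IDF: logarithm
•  length: included in tf
•  k1 & b: parameters
$\mathrm{idf}_t = \log\left(\frac{\mathrm{numDocs} - \mathrm{docFreq}_t + 0.5}{\mathrm{docFreq}_t + 0.5} + 1\right)$
Without length normalization:
$\mathrm{tf}_{t,d} = \frac{(k_1 + 1) \cdot \mathrm{termfreq}_{t,d}}{k_1 + \mathrm{termfreq}_{t,d}}$
$b \in [0, 1]$, $k_1 \in [0, \infty)$
With length normalization:
$\mathrm{tf}_{t,d} = \frac{(k_1 + 1) \cdot \mathrm{termfreq}_{t,d}}{\mathrm{termfreq}_{t,d} + k_1 \cdot \left(1 - b + b \cdot \frac{|d|}{\mathrm{avg}(|d|)}\right)}$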
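The ceiling on term frequency is easier to see with numbers. The sketch below evaluates the length-normalized tf factor for growing term frequencies. The field lengths are invented and `bm25_tf` is an illustrative helper, not a Lucene method.

```python
def bm25_tf(term_freq, field_length, avg_field_length, k1=1.2, b=0.75):
    """Length-normalized BM25 term-frequency factor."""
    length_ratio = field_length / avg_field_length
    return (k1 + 1) * term_freq / (term_freq + k1 * (1 - b + b * length_ratio))


# Hypothetical field of exactly average length: the factor grows quickly,
# then flattens out below k1 + 1.
for freq in (1, 2, 5, 20, 100):
    print(freq, round(bm25_tf(freq, field_length=100, avg_field_length=100), 3))
```

A longer than average field lowers the factor for the same term frequency, which is the role of b.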
Configuring similarity
Similarity element in Schema
•  Global similarity
-  Using configuration template (API pending:
SOLR-7242)
•  Similarity per field type
-  Field type action in Schema API from 5.3
(SOLR-7679)
{
  "add-field-type": {
    "name": "fieldTypeWithSimilarity",
    "class": "org.apache.solr.schema.TextField",
    "analyzer": { ... },
    "similarity": {
      "class": "org.apache.lucene.misc.SweetSpotSimilarity"
    }
  }
}

<similarity class="solr.BM25SimilarityFactory">
  <float name="k1">1.2</float>
  <float name="b">0.76</float>
</similarity>
https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements
Custom similarity
You can write your own
•  For very specific cases
•  Just extend org.apache.lucene.search.similarities.SimilarityBase
-  float score(BasicStats stats, float freq, float docLen)
-  String toString()
-  void explain(List<Explanation> subExpls, BasicStats stats, int doc, float freq, float docLen)
https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/similarities/SimilarityBase.html
Edismax
The de facto Solr query parser
Edismax
Extended disjunction maximum
•  Relevance on several fields
-  qf=field1^boost field2&q=query
•  Get the score of the best (maximum) match
-  Unless the tie parameter is used to factor in the other fields
•  AND, OR, or minimum match
-  Can control how many term matches are required
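The "maximum plus tie" behaviour can be sketched in a few lines. This assumes the per-field scores have already been computed and multiplied by their qf boosts. The numbers and the `dismax_score` helper are illustrative only.

```python
def dismax_score(field_scores, tie=0.0):
    """Best field score plus tie times the sum of the remaining field scores."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


# Hypothetical per-field scores for one document.
scores = [2.4, 0.9, 0.3]
print(dismax_score(scores))           # only the best field counts
print(dismax_score(scores, tie=0.1))  # other fields contribute a little
print(dismax_score(scores, tie=1.0))  # plain sum of all fields
```

A tie of 0 is a pure maximum and a tie of 1 is a pure sum; values in between trade off the two.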
Understanding the calculations
To understand how each score is computed, use the debug parameter
•  debug=results or debugQuery
1.8534669 = weight(_text_:book in 3) [], result of:
  1.8534669 = score(doc=3,freq=6.0 = termFreq=6.0
  ), product of:
    1.1108824 = idf(docFreq=13, docCount=40)
    1.6684636 = tfNorm, computed from:
      6.0 = termFreq=6.0
      1.2 = parameter k1
      0.75 = parameter b
      142.975 = avgFieldLength
      256.0 = fieldLength
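The numbers in this explain output can be recomputed by hand from the BM25 formulas. The sketch below plugs in the values shown above. The function names are illustrative, and small differences in the last digits are expected because Lucene works with single-precision floats.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """BM25 inverse document frequency."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(term_freq, field_length, avg_field_length, k1=1.2, b=0.75):
    """BM25 term-frequency factor with field-length normalization."""
    return (term_freq * (k1 + 1)) / (
        term_freq + k1 * (1 - b + b * field_length / avg_field_length)
    )


# Values copied from the explain output.
idf = bm25_idf(doc_freq=13, doc_count=40)
tf_norm = bm25_tf_norm(term_freq=6.0, field_length=256.0, avg_field_length=142.975)
score = idf * tf_norm
print(f"idf={idf:.7f} tfNorm={tf_norm:.7f} score={score:.7f}")
```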
  
Edismax parameters
Parameter | Relevance | Filter | Operator | Syntax
q | YES | YES | + | Query
fq | NO | YES | NO | Query
bq | YES | NO | + | Query
bf | YES | NO | + | Function
boost | YES | NO | * | Function
Evaluating
Relevance testing
How do we work with relevance?
1.  Change the relevance model
2.  Evaluate
3.  Improve
How to quantitatively evaluate the relevance model?
•  Relevance testing
Relevance testing
•  Get a domain expert
•  Get the most popular search queries
•  Select the relevant document[s] for those queries
•  Test [automated/scripted] in what position the relevant
document is in the result set
•  Aggregate all the test cases
-  In 1-3 first results -> OK
-  In 4-pagesize results -> Can be improved
-  After -> Something is wrong
Relevance testing
Query | Context | Relevant result | Result position
google | | google.com | 1
Solr revolution | Year:2017 | lucenerevolution.org | 1
Relevance in the wild | | https://lucenesolrrevolution2017.sched.com/event/BAwt/relevance-in-the-wild | 4
edismax | | https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser | 6
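Aggregating such test cases is easy to script. The sketch below buckets the positions from the example table using the 1-3 / rest of page / after thresholds. The page size of 10 and the `classify` helper are assumptions, and in a real test the positions would come from running each query against Solr.

```python
from collections import Counter


def classify(position, page_size=10):
    """Bucket the position of the expected document in the result list."""
    if position is None or position < 1 or position > page_size:
        return "Something is wrong"
    return "OK" if position <= 3 else "Can be improved"


# Positions taken from the example table; None would mean "not found".
test_cases = {
    "google": 1,
    "Solr revolution": 1,
    "Relevance in the wild": 4,
    "edismax": 6,
}
summary = Counter(classify(pos) for pos in test_cases.values())
print(dict(summary))
```

Running the same script before and after a configuration change gives a simple quantitative comparison.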
Behavior analysis and overoptimization
Track your system's queries and click-throughs
•  It is a great start for relevance testing
•  Improvements to the most common queries will have the biggest
impact
•  Users know what is relevant for them when they see it
•  Do not focus only on the top X cases
Real cases
Real cases
•  Oversimplified list of cases and recommendations
•  There are many ways of solving the same problem
-  Use your domain knowledge
-  Choose the alternative that best balances performance and
flexibility
•  All parameters can be
-  configured as defaults in a request handler
-  sent independently in each request
•  Remember to normalize your boosts
Classical Information model
Have an article database: library, intranet, website,
catalogue …
•  Multiple metadata on different fields
•  Information has variable importance
•  Use the edismax qf parameter to weight different fields
-  qf=id^5 title^3 description comments^0.1
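As a sketch of how such weights end up in a request, the snippet below builds the qf string and a URL-encoded edismax query. The field names and weights are the ones from the slide; the query text, host and collection name are placeholders.

```python
from urllib.parse import urlencode

# None means "no explicit boost" for that field.
field_weights = {"id": 5, "title": 3, "description": None, "comments": 0.1}
qf = " ".join(
    name if weight is None else f"{name}^{weight}"
    for name, weight in field_weights.items()
)

params = {"defType": "edismax", "q": "solr relevance", "qf": qf}
# Placeholder host and collection.
url = "http://localhost:8983/solr/mycollection/select?" + urlencode(params)
print(qf)
print(url)
```

The same qf value could instead be set as a default in the request handler so that clients only send q.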
Synonym expansion
Sometimes queries contain similar terms that do not appear in
the index
•  Important domain concepts have several names for the
user
•  They are always referred to in one way in the index
•  Use synonym expansion to find all the similar matches
-  <filter class="solr.ManagedSynonymFilterFactory" >
-  {"Vacation":["Holiday"]}
Document type tiering
Have a set of information belonging to different categories
•  Certain document types are just more important
•  Add a field with the document type factor boost and
use the boost parameter
-  boost=documenttypeboost
-  Multiplicative, reindex for different boosts
•  Use boost query for each type
-  bq=documenttype:important^boost
-  Additive, more complex query
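The difference between the two options shows up when the text match is weak. The sketch below is a deliberately simplified model that ignores everything else Solr does to the score. The base scores and the boost value of 2.0 are invented.

```python
def additive_boost(base_score, boost_value):
    """bq/bf style: the boost is added to the relevance score."""
    return base_score + boost_value


def multiplicative_boost(base_score, boost_factor):
    """boost style: the relevance score is multiplied by the factor."""
    return base_score * boost_factor


# Hypothetical weak and strong text matches for an "important" document type.
for base in (0.5, 5.0):
    print(base, additive_boost(base, 2.0), multiplicative_boost(base, 2.0))
```

In this model an additive boost can dominate a weak match, while a multiplicative boost keeps the relative strength of the text match.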
Date promotion slope
Information is less relevant when old
•  News search, Event related information
•  Add a date decay from today to X months based on the
last updated field
-  boost=recip(ms(NOW,mydatefield),3.16e-11,1,1)
•  There are approximately 3.16e10 milliseconds in a year,
so one can scale dates to fractions of a year with the
inverse, or 3.16e-11
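Solr's recip(x,m,a,b) evaluates a / (m*x + b), so the shape of this decay is easy to reproduce. The sketch below uses the same constants as the boost above; `date_boost` and the chosen ages are illustrative.

```python
MS_PER_YEAR = 3.16e10  # approximation used above


def recip(x, m, a, b):
    """Same definition as Solr's recip function: a / (m * x + b)."""
    return a / (m * x + b)


def date_boost(age_ms):
    """Boost factor for a document that is age_ms milliseconds old."""
    return recip(age_ms, 3.16e-11, 1, 1)


for years in (0, 1, 2, 10):
    print(years, round(date_boost(years * MS_PER_YEAR), 3))
```

A brand-new document keeps its full score, a one-year-old document keeps about half, and the factor keeps shrinking without ever reaching zero.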
User customization
Applications with registered users know their context
•  Intranet systems know the user's department (employee
registry)
•  Websites with registration know the user's preferences
(history)
•  Add boost queries with matching information between
metadata and user context
-  bq=department:IT
-  bq=category:electronics
Physical proximity relevance
Most applications can request user location
•  Promote the shops/restaurants that are close by
•  Add boost function calculating distance to the user
-  bf=div(1,geodist(docloc,36.092938,-115.173179))
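To get a feel for the values such a boost produces, the sketch below computes a great-circle distance with the haversine formula and takes its reciprocal. The shop coordinates are invented, the Earth radius is an approximation, and the guard against a zero distance is an addition that the slide's function does not have.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, approximate


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def proximity_boost(distance_km):
    """1 / distance, with a small floor to avoid division by zero."""
    return 1.0 / max(distance_km, 0.001)


user = (36.092938, -115.173179)  # user location from the slide
shop = (36.1147, -115.1728)      # hypothetical shop location
distance = haversine_km(*user, *shop)
print(round(distance, 2), round(proximity_boost(distance), 3))
```

Because 1/distance grows very fast near zero, the boost usually needs a floor or a cap before it is combined with the text score.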
Rating/Popularity system
Catalogues often provide the possibility of article review
•  Promote articles that get most views
•  Promote articles that get a good average rating
•  Add boost function applying the score of the article
-  bf=sqrt(articlepageviews)
-  bf=if(gte(reviews,req),sqrt(articleavgreview),0)
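The two functions above can be mirrored in a few lines to see why the square root is used. The view counts, ratings and the threshold of 5 reviews are invented for illustration.

```python
import math


def popularity_boost(page_views):
    """sqrt dampens very large view counts."""
    return math.sqrt(page_views)


def rating_boost(review_count, avg_review, required_reviews):
    """Only trust the average rating once enough reviews exist."""
    return math.sqrt(avg_review) if review_count >= required_reviews else 0.0


# Hypothetical articles: 100x more views gives only 10x more boost.
print(popularity_boost(100), popularity_boost(10000))
print(rating_boost(review_count=2, avg_review=5.0, required_reviews=5))
print(rating_boost(review_count=40, avg_review=4.0, required_reviews=5))
```

The review threshold keeps an article with a single five-star review from outranking one with many solid ratings.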
Commerce/Sales parameters
Search in e-shops has many parameters to take into
account
•  Prefer higher profit margin
-  bf=sub(price,cost)
•  Encourage last items or many in stock
-  bf=stock OR bf=div(1,stock)
•  Campaign recommendations
-  bq=category:campaigncategory
Combine them
All of these techniques can be applied together
•  Just add more parameters
•  Must balance each one
•  Do not forget to evaluate changes
Summary
•  Work on relevance -> Improve user perception
•  There are plenty of tools/parameters available in Solr
•  Improve your data and usage tracking
•  Evaluate changes -> Guaranteed improvement
Thank You

Preparing for Peak in Ecommerce | eTail Asia 2020Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020
 
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
 
AI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteAI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and Rosette
 
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentThe Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
 
Webinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeWebinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - Europe
 
Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19
 
Applying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchApplying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 Research
 
Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1
 
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyWebinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
 
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
 
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceApply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
 
Webinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchWebinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise Search
 
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and Beyond
 

Kürzlich hochgeladen

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 

Kürzlich hochgeladen (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

Relevance in the Wild - Daniel Gómez Villanueva, Findwise

  • 1. Relevance in the Wild Daniel Gómez Villanueva Findwise
  • 2. Agenda
      •  Introduction
      •  Similarity functions
      •  Edismax
      •  Evaluating
      •  Real cases
  • 4. Daniel Gómez Villanueva
      Search specialist consultant
      •  @ Findwise
      •  @ Stockholm, Sweden
      •  Studies -> UPM, KTH
      •  Technologies -> Search
      •  Roles -> Code Java + .NET
      •  + 10 projects
      •  + 10 trainings public / private
  • 5. Findwise
      Search driven solutions
      •  Founded in 2005
      •  100 employees
      •  Vendor independent consultants
      •  Sweden, Denmark, Norway, Finland & Poland
  • 6. Relevance definition
      •  Dictionary.com:
         -  the condition of being relevant, or connected with the matter at hand
      •  Cambridge:
         -  the degree to which something is related or useful to what is happening or being talked about
      •  Merriam-Webster (2):
         -  the ability (as of an information retrieval system) to retrieve material that satisfies the needs of the user
      http://www.dictionary.com/browse/relevance
      http://dictionary.cambridge.org/us/dictionary/english/relevance
      https://www.merriam-webster.com/dictionary/relevance
  • 7. Measuring relevance of a result set
      •  Precision: % of the results that are relevant
      •  Recall: % of the relevant set that is in the results
      •  Only evaluates the quality of the result set
      •  What is the size of the relevant set?
      •  We need a measure for individual documents
      https://en.wikipedia.org/wiki/Precision_and_recall
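The two measures on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `precision_recall` and the document ids are hypothetical, and it assumes the relevant set for the query is already known.

```python
def precision_recall(results, relevant):
    """Return (precision, recall) for returned ids and the known relevant ids."""
    returned = set(results)
    relevant = set(relevant)
    hits = returned & relevant
    # Precision: share of the returned results that are relevant.
    precision = len(hits) / len(returned) if returned else 0.0
    # Recall: share of the relevant set that was actually returned.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 results returned, 3 documents are relevant overall.
results = ["doc1", "doc2", "doc3", "doc4"]
relevant = {"doc2", "doc4", "doc9"}
precision, recall = precision_recall(results, relevant)
print(precision, recall)
```

Neither number says anything about where the relevant documents sit in the list, which is why a per-document score is needed for ordering.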
  • 8. What about sorting?
      A list of results is only as good as its order
      •  Sort by any type of field:
         -  Sorting by text
         -  Sorting by number
         -  Sorting by rank
      •  Sort by relevance:
         -  Sorting by similarity score
  • 9. Similarity: How does Solr calculate the score
  • 10. Similarity score
      Score for individual documents
      •  Similarity defines the components of Lucene scoring
      •  Calculated per query per field ($\mathit{score}_{q,f}$)
         -  BM25Similarity
         -  TFIDFSimilarity
         -  BooleanSimilarity
         -  MultiSimilarity
         -  PerFieldSimilarityWrapper
      https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/similarities/Similarity.html
  • 11. Similarity parameters
      •  tf: term frequency
         -  How many times a term appears in a document
         -  The more times the term appears, the more relevant the document is
      •  idf: inverse document frequency
         -  One divided by the number of documents in the whole corpus that contain the term
         -  The more documents contain the term, the less semantically meaningful and therefore relevant it is
  • 12. Similarity parameters
      •  norm: field norm
         -  Inverse of the field length
         -  The longer the field, the less relevant a term appearance is
      •  C: coordination factor
         -  Factor multiplied by the sum of the scores for all terms in the query
         -  Keeps the score within a reasonable interval of values
  • 13. TF-IDF
      Term frequency x inverse document frequency
      •  Classical information retrieval score
      •  Default in Solr/Lucene from the first versions until version 6
      $$\mathit{score}_{q,d} = C_{q,d} \sum_{t \in q} \mathit{tf}_{t,d} \cdot \mathit{idf}_{t} \cdot \mathit{norm}_{d}$$
      https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
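  • 14. TF-IDF – Math normalizations
      •  TF: square root
      •  IDF: logarithm
      •  Length: inverse square root
      $$\mathit{idf}_{t} = \log\left(\frac{\mathit{numDocs}}{\mathit{docFreq}_{t} + 1}\right) + 1$$
      $$\mathit{tf}_{t,d} = \sqrt{\mathit{termfreq}_{t,d}}$$
      $$\mathit{norm}_{d} = \frac{1}{\sqrt{|d|}}$$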
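As a rough illustration of how the three normalizations combine, here is a minimal Python sketch of the simplified TF-IDF formula from these slides. It is not Lucene's exact implementation, and the corpus statistics and function names are made up for the example.

```python
import math


def tf(term_freq):
    # Square root dampens the effect of repeated terms.
    return math.sqrt(term_freq)


def idf(num_docs, doc_freq):
    # Terms that appear in many documents get a lower weight.
    return math.log(num_docs / (doc_freq + 1)) + 1


def norm(field_length):
    # Longer fields make each term occurrence count less.
    return 1 / math.sqrt(field_length)


def tfidf_score(query_terms, doc_term_freqs, field_length, doc_freqs, num_docs, coord=1.0):
    """Sum tf * idf * norm over the query terms found in the document."""
    total = 0.0
    for term in query_terms:
        freq = doc_term_freqs.get(term, 0)
        if freq == 0:
            continue  # term not in this document, contributes nothing
        total += tf(freq) * idf(num_docs, doc_freqs[term]) * norm(field_length)
    return coord * total


# Hypothetical corpus of 100 documents: "solr" is rare, "the" is everywhere.
doc_freqs = {"solr": 9, "the": 99}
doc_term_freqs = {"solr": 4, "the": 16}
score = tfidf_score(["solr", "the"], doc_term_freqs, 100, doc_freqs, 100)
print(round(score, 4))
```

Even though "the" occurs four times as often as "solr" in this hypothetical document, the rare term contributes more to the score because of its higher idf.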
  • 15. BM25 (image slide; captions: Area gas monitor, Body piercing jewelry store, Soviet multiple rocket launcher)
  • 16. Okapi BM25
      Best Matching 25
      •  Calculates relevance based on probability
      •  The effect of term frequency has a ceiling: k1
      •  The length of the field affects the tf curve: b
      •  Default from version 6
      $$\mathit{score}_{q,d} = C_{q,d} \sum_{t \in q} \mathit{idf}_{t} \cdot \frac{\mathit{termfreq}_{t,d} \cdot (k_1 + 1)}{\mathit{termfreq}_{t,d} + k_1 \cdot \left(1 - b + b \cdot \frac{|d|}{\mathit{avg}(|d|)}\right)}$$
      $$k_1 = 1.2 \qquad b = 0.75$$
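  • 17. Okapi BM25 – Math normalizations
      •  TF: saturating ratio (ceiling set by k1)
      •  IDF: logarithm
      •  Length: included in tf
      •  k1 & b: parameters
      $$\mathit{idf}_{t} = \log\left(\frac{\mathit{numDocs} - \mathit{docFreq}_{t} + 0.5}{\mathit{docFreq}_{t} + 0.5} + 1\right)$$
      Without length normalization (b = 0):
      $$\mathit{tf}_{t,d} = \frac{(k_1 + 1) \cdot \mathit{termfreq}_{t,d}}{k_1 + \mathit{termfreq}_{t,d}}$$
      $$b \in [0, 1] \qquad k_1 \in [0, \infty)$$
      With length normalization:
      $$\mathit{tf}_{t,d} = \frac{(k_1 + 1) \cdot \mathit{termfreq}_{t,d}}{\mathit{termfreq}_{t,d} + k_1 \cdot \left(1 - b + b \cdot \frac{|d|}{\mathit{avg}(|d|)}\right)}$$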
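The ceiling on term frequency is easier to see with numbers. The sketch below compares the square-root tf of the classic formula with the BM25 tf component, using the default k1 and b from the slides. The function names and the field lengths are assumptions chosen for illustration.

```python
import math


def classic_tf(term_freq):
    # Classic TF-IDF: grows without bound as the term repeats.
    return math.sqrt(term_freq)


def bm25_tf(term_freq, field_length, avg_field_length, k1=1.2, b=0.75):
    # b controls how strongly the field length stretches the curve.
    length_norm = 1 - b + b * field_length / avg_field_length
    # The ratio approaches k1 + 1 as term_freq grows, so it never exceeds it.
    return term_freq * (k1 + 1) / (term_freq + k1 * length_norm)


for freq in (1, 2, 5, 20, 1000):
    # Average-length field, so the length normalization factor is 1.
    print(freq, round(classic_tf(freq), 3), round(bm25_tf(freq, 100, 100), 3))
```

With k1 = 1.2 the BM25 column never reaches 2.2, while the square-root column keeps growing. Doubling the field length to 200 lowers the BM25 value for the same term frequency.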
  • 18. Configuring similarity
      Similarity element in Schema
      •  Global similarity
         -  Using configuration template (API pending: SOLR-7242)
      •  Similarity per field type
         -  Field type action in Schema API from 5.3 (SOLR-7679)

          {
            "add-field-type": {
              "name": "fieldTypeWithSimilarity",
              "class": "org.apache.solr.schema.TextField",
              "analyzer": { ... },
              "similarity": {
                "class": "org.apache.lucene.misc.SweetSpotSimilarity"
              }
            }
          }

          <similarity class="solr.BM25SimilarityFactory">
            <float name="k1">1.2</float>
            <float name="b">0.76</float>
          </similarity>

      https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements
  • 19. Custom similarity
      You can write your own
      •  For very specific cases
      •  Just extend org.apache.lucene.search.similarities.SimilarityBase
         -  float score(BasicStats stats, float freq, float docLen)
         -  String toString()
         -  void explain(List<Explanation> subExpls, BasicStats stats, int doc, float freq, float docLen)
      https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/similarities/SimilarityBase.html
  • 21. Edismax
      Extended disjunction maximum
      •  Relevance on several fields
         -  qf=field1^boost field2&q=query
      •  Get the score of the best (maximum) match
         -  Unless the tie parameter is used to factor in the other fields
      •  AND, OR, or Minimum match (mm)
         -  Controls how many term matches are required
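The "maximum" part and the tie parameter can be illustrated with a small Python sketch. It follows the disjunction-max rule of taking the best field score and adding the tie value times the scores of the other matching fields. The function name and the per-field scores are hypothetical.

```python
def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores: best match plus tie times the remaining matches."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie * others


# Hypothetical scores for one query term matched in title, description, comments.
field_scores = [2.0, 0.5, 0.5]
print(dismax_score(field_scores, tie=0.0))  # only the best field counts
print(dismax_score(field_scores, tie=0.1))  # other fields contribute a little
print(dismax_score(field_scores, tie=1.0))  # plain sum of all fields
```

A tie of 0 ranks purely on the best matching field, while a tie of 1 turns the combination into a sum, so values in between trade off the two behaviours.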
  • 22. Understanding the calculations
      To understand the order of scores, use the debug parameter
      •  debug=results or debugQuery=true

          1.8534669 = weight(_text_:book in 3) [], result of:
            1.8534669 = score(doc=3,freq=6.0 = termFreq=6.0
            ), product of:
              1.1108824 = idf(docFreq=13, docCount=40)
              1.6684636 = tfNorm, computed from:
                6.0 = termFreq=6.0
                1.2 = parameter k1
                0.75 = parameter b
                142.975 = avgFieldLength
                256.0 = fieldLength
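The numbers in this debug output can be checked by hand against the BM25 formulas. The following Python sketch plugs the reported statistics into the idf and tfNorm expressions. The function names are my own, and small differences in the last digits are expected because Lucene computes with single-precision floats.

```python
import math


def bm25_idf(doc_freq, doc_count):
    # log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq, k1, b, field_length, avg_field_length):
    # freq * (k1 + 1) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength))
    return freq * (k1 + 1) / (freq + k1 * (1 - b + b * field_length / avg_field_length))


# Values copied from the debug output on the slide.
idf = bm25_idf(doc_freq=13, doc_count=40)
tf_norm = bm25_tf_norm(freq=6.0, k1=1.2, b=0.75, field_length=256.0, avg_field_length=142.975)
score = idf * tf_norm

print(round(idf, 5), round(tf_norm, 5), round(score, 5))
```

The product of the two components matches the top-level score in the explanation, which is a useful sanity check when tuning k1 and b.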
  • 23. Edismax parameters

      Parameter  Relevance  Filter  Operator  Syntax
      q          YES        YES     +         Query
      fq         NO         YES     NO        Query
      bq         YES        NO      +         Query
      bf         YES        NO      +         Function
      boost      YES        NO      *         Function
  • 25. Relevance testing
      How do we work with relevance?
      1.  Change the relevance model
      2.  Evaluate
      3.  Improve
      How to quantitatively evaluate the relevance model?
      •  Relevance testing
  • 26. Relevance testing
      •  Get a domain expert
      •  Get the most popular search queries
      •  Select the relevant document[s] for those queries
      •  Test [automated/scripted] in what position the relevant document is in the result set
      •  Aggregate all the test cases
         -  In the first 1-3 results -> OK
         -  In results 4 to page size -> Can be improved
         -  After -> Something is wrong
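The scripted test described on this slide could look roughly like the Python sketch below. The `search` callable is a stand-in for a real call to the search engine, and the queries, document ids, and page size of 10 are assumptions for the example.

```python
from collections import Counter


def classify_position(position, page_size=10):
    """Map the 1-based position of the relevant document to a verdict."""
    if position is None or position > page_size:
        return "something is wrong"
    if position <= 3:
        return "ok"
    return "can be improved"


def evaluate(test_cases, search, page_size=10):
    """Run every (query, relevant_id) pair and aggregate the verdicts."""
    summary = Counter()
    for query, relevant_id in test_cases:
        results = search(query)
        position = results.index(relevant_id) + 1 if relevant_id in results else None
        summary[classify_position(position, page_size)] += 1
    return summary


# Hypothetical result lists standing in for real search responses.
fake_index = {
    "google": ["google.com", "other"],
    "edismax": ["a", "b", "c", "d", "e", "edismax-docs"],
    "missing": ["a", "b"],
}
test_cases = [("google", "google.com"), ("edismax", "edismax-docs"), ("missing", "not-there")]
summary = evaluate(test_cases, lambda q: fake_index.get(q, []))
print(dict(summary))
```

Running the same test cases before and after each change to the relevance model gives a simple number to compare.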
  • 27. Relevance testing

      Query                  Context    Relevant result                                                                    Result position
      google                            google.com                                                                         1
      Solr revolution        Year:2017  lucenerevolution.org                                                               1
      Relevance in the wild             https://lucenesolrrevolution2017.sched.com/event/BAwt/relevance-in-the-wild        4
      edismax                           https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser  6
  • 28. Behavior analysis and overoptimization
      Track your system's queries and click-throughs
      •  It is a great start for relevance testing
      •  Improvements to the most common queries will have the biggest impact
      •  Users know what is relevant for them when they see it
      •  Do not focus only on the top X cases
  • 30. Real cases
      •  Oversimplified list of cases and recommendations
      •  There are many ways of solving the same problem
         -  Use your domain knowledge
         -  Choose the best fitting alternative (performance / flexibility)
      •  All parameters can be
         -  configured as defaults in a request handler
         -  sent independently in each request
      •  Remember to normalize your boosts
  • 31. Classical information model
      An article database: library, intranet, website, catalogue …
      •  Multiple metadata on different fields
      •  Information has variable importance
      •  Use the edismax qf parameter to weight different fields
         -  qf=id^5 title^3 description comments^0.1
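As a sketch of how such a request might be assembled, the Python snippet below URL-encodes a set of edismax parameters with per-field weights. The host, port, and collection name `articles` are placeholders, and the query text is made up. Only the standard library is used.

```python
from urllib.parse import urlencode

# Field weights as on the slide: matches in id and title count the most.
params = {
    "defType": "edismax",
    "q": "relevance testing",
    "qf": "id^5 title^3 description comments^0.1",
    "tie": "0.1",
    "debug": "results",
}

# urlencode escapes spaces as "+" and "^" as "%5E".
query_string = urlencode(params)

# Hypothetical Solr endpoint, adjust host and collection to your setup.
url = "http://localhost:8983/solr/articles/select?" + query_string
print(url)
```

The same parameters could instead be set as defaults in the request handler, so that clients only send q.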
  • 32. Synonym expansion
      Sometimes queries contain similar terms that do not appear in the index
      •  Important domain concepts have several names for the user
      •  They are always referred to in one way in the index
      •  Use synonym expansion to find all the similar matches
         -  <filter class="solr.ManagedSynonymFilterFactory" >
         -  {"Vacation":["Holiday"]}
  • 33. Document type tiering
      A set of information belonging to different categories
      •  Certain document types are just more important
      •  Add a field with the document type boost factor and use the boost parameter
         -  boost=documenttypeboost
         -  Multiplicative, reindex for different boosts
      •  Use a boost query for each type
         -  bq=documenttype:important^boost
         -  Additive, more complex query
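The practical difference between additive and multiplicative boosting shows up when base scores vary in scale. The sketch below is a simplified model of the arithmetic only, not of Solr's internals, and all scores and boost values are invented for the example.

```python
def additive(base_score, boost_query_score):
    # bq-style: the boost clause adds its own score to the base score.
    return base_score + boost_query_score


def multiplicative(base_score, boost_value):
    # boost-style: the base score is scaled by the boost value.
    return base_score * boost_value


# Two hypothetical queries whose base scores differ in scale.
for base in (0.5, 10.0):
    add_ratio = additive(base, 2.0) / base
    mul_ratio = multiplicative(base, 1.5) / base
    print(base, round(add_ratio, 2), round(mul_ratio, 2))
```

An additive boost of 2.0 multiplies a low base score by five but barely moves a high one, while the multiplicative boost has the same relative effect on both. This is one reason additive boosts need to be balanced against the typical score range.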
  • 34. Date promotion slope
      Information is less relevant when old
      •  News search, event related information
      •  Add a date decay from today to X months based on the last updated field
         -  boost=recip(ms(NOW,mydatefield),3.16e-11,1,1)
      •  There are approximately 3.16e10 milliseconds in a year, so one can scale dates to fractions of a year with the inverse, or 3.16e-11
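To see the shape of this decay, the Python sketch below evaluates the reciprocal function recip(x, m, a, b) = a / (m * x + b) for a few document ages, with the constants from the slide. The helper names are my own and the ages are arbitrary examples.

```python
# Milliseconds in a year, approximately 3.16e10.
MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000


def recip(x, m, a, b):
    # Same shape as the recip function query: a / (m * x + b).
    return a / (m * x + b)


def date_boost(age_ms):
    # m = 3.16e-11 scales milliseconds to roughly "years old".
    return recip(age_ms, 3.16e-11, 1, 1)


for years in (0, 0.5, 1, 2, 5):
    print(years, round(date_boost(years * MS_PER_YEAR), 3))
```

A document updated today gets a boost of 1, a one-year-old document about 0.5, and a two-year-old document about 0.33. Increasing the a and b constants together flattens the curve.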
  • 35. User customization
      Applications with registered users know their context
      •  An intranet system knows the user's department (employee registry)
      •  A website with registration knows the user's preferences (history)
      •  Add boost queries with matching information between metadata and user context
         -  bq=department:IT
         -  bq=category:electronics
  • 36. Physical proximity relevance
      Most applications can request the user's location
      •  Promote the shops/restaurants that are close by
      •  Add a boost function calculating the distance to the user
         -  bf=div(1,geodist(docloc,36.092938,-115.173179))
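An inverse-distance boost of this kind can be sketched in Python as follows. The haversine helper approximates great-circle distance in kilometres. The boost uses 1 / (distance + 1) rather than a plain 1 / distance, which is an assumption made here to keep the value finite when the user is at the document's location. The second coordinate pair is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def proximity_boost(distance_km):
    # Assumed variant: adding 1 avoids division by zero at distance 0.
    return 1 / (distance_km + 1)


user = (36.092938, -115.173179)      # location from the slide
shop = (36.114647, -115.172813)      # hypothetical shop a few km away
distance = haversine_km(*user, *shop)
print(round(distance, 2), round(proximity_boost(distance), 3))
```

Closer documents get a boost near 1 and the boost falls off quickly with distance, so the scale may need adjusting for applications where tens of kilometres still count as nearby.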
  • 37. Rating/Popularity system
      Catalogues often provide the possibility of article review
      •  Promote articles that get most views
      •  Promote articles that get a good average rating
      •  Add a boost function applying the score of the article
         -  bf=sqrt(articlepageviews)
         -  bf=if(gte(reviews,req),sqrt(articleavgreview),0)
  • 38. Commerce/Sales parameters
      Search on e-shops has many parameters to take into account
      •  Prefer higher profit margin
         -  bf=sub(price,cost)
      •  Encourage last items or many in stock
         -  bf=stock or bf=div(1,stock)
      •  Campaign recommendations
         -  bq=category:campaigncategory
  • 39. Combine them
      All of these techniques can be applied together
      •  Just add more parameters
      •  Must balance each one
      •  Do not forget to evaluate changes
  • 40. Summary
      •  Work on relevance -> Improve user perception
      •  There are plenty of tools/parameters available in Solr
      •  Improve your data and usage tracking
      •  Evaluate changes -> Guaranteed improvement