See conference video - http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011
A talk on how to use, understand and visualize Solr 'explain' information: essential output from Solr that lets you better tune and debug your search application. In the talk, I'll show free software, currently in development, that visualizes Solr 'explain' information, such as how a document's score was calculated, where its components come from, which tokens mattered the most, and so on.
Understanding and visualizing solr explain information - Rafal Kuc
1. Understanding and visualising
Solr explain information
Rafał Kuć, Marek Rogoziński, Solr.pl
r.kuc@solr.pl, m.rogozinski@solr.pl, 18.10.2011
2. My Background
Rafał Kuć
• Working with Lucene since 2002
• Working with Solr since 2007
Solr.pl
• Co-founder (with Marek Rogoziński)
Area of expertise
• Lucene and Solr consultant and architect in
many major e-commerce sites in Poland
• Author of "Solr 3.1 Cookbook" by Packt
Publishing
• Father, husband, Starcraft II player and a
gardener after hours ☺
3. What I Will Cover
Understanding and visualising Solr explain
information
How to make the information returned by
Apache Solr explain easy to read for a
Solr user (even a not very technical one)
Context
• Complicated explain made simple
• Explain other made even simpler
What’s next to come
5. The Challenge
Common questions like:
• Why was this document found?
• Why wasn't this document found?
• Why is this document ranked higher than the other one?
• Why does the results list look like this?
Considerations
• Do we always have to answer those questions?
So how do we let users get the answers they want?
• That's how http://explain.solr.pl was born
6. Let’s look at a typical example
You run a query
• q=ddr&defType=dismax&qf=name^1000+description^100&bf=pow(price,1.5)&debugQuery=true&indent=true
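A minimal sketch of building this request in Python. The endpoint `http://localhost:8983/solr/select` and the helper name `build_debug_query_url` are assumptions for illustration only; adjust the URL to your own Solr instance. The `+` in the raw `qf` value is a URL-encoded space, so the parameter is written with a space here and encoded by `urlencode`.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and path depend on your Solr installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_debug_query_url(base_url=SOLR_SELECT_URL):
    """Build the example dismax request with debugQuery switched on."""
    params = [
        ("q", "ddr"),
        ("defType", "dismax"),
        ("qf", "name^1000 description^100"),  # query fields with boosts
        ("bf", "pow(price,1.5)"),             # boost function
        ("debugQuery", "true"),               # ask Solr for explain information
        ("indent", "true"),
    ]
    return base_url + "?" + urlencode(params)


print(build_debug_query_url())
```

Fetching this URL (for example with `urllib.request.urlopen`) should return the normal results plus a debug section containing the explain entry for each document.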
And you see the explain information
1.6771803 = (MATCH) sum of:
  0.64883727 = (MATCH) max of:
    0.64883727 = (MATCH) weight(name:ddr^1000.0 in 6), product of:
      0.99999994 = queryWeight(name:ddr^1000.0), product of:
        1000.0 = boost
        2.446919 = idf(docFreq=3, maxDocs=17)
        4.0867718E-4 = queryNorm
      0.6488373 = (MATCH) fieldWeight(name:ddr in 6), product of:
        1.4142135 = tf(termFreq(name:ddr)=2)
        2.446919 = idf(docFreq=3, maxDocs=17)
        0.1875 = fieldNorm(field=name, doc=6)
  1.028343 = (MATCH) FunctionQuery(pow(float(price),const(1.5))), product of:
    2516.272 = pow(float(price)=185.0,const(1.5))
    1.0 = boost
    4.0867718E-4 = queryNorm
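The explain output is a tree of "sum of", "max of" and "product of" nodes, so the final score can be recomputed by hand from its leaves. The sketch below does this with the numbers copied from the output above; the variable names are mine and only serve the illustration.

```python
import math

# Leaf values copied from the explain output.
boost = 1000.0
idf = 2.446919
query_norm = 4.0867718e-4
tf = 1.4142135
field_norm = 0.1875
price_pow = 2516.272      # pow(price=185.0, 1.5)
function_boost = 1.0

# "product of" nodes
query_weight = boost * idf * query_norm      # queryWeight(name:ddr^1000.0)
field_weight = tf * idf * field_norm         # fieldWeight(name:ddr in 6)
term_score = query_weight * field_weight     # weight(name:ddr^1000.0 in 6)
function_score = price_pow * function_boost * query_norm

# "max of": dismax takes the best-scoring field; only name matched here.
dismax_score = max([term_score])

# "sum of": the dismax part plus the boost function
total = dismax_score + function_score

print(f"term score     = {term_score:.7f}")
print(f"function score = {function_score:.7f}")
print(f"total          = {total:.7f}")
assert math.isclose(total, 1.6771803, abs_tol=1e-5)
```

The recomputed total agrees with the reported score up to float rounding, and it shows that the boost function on price contributes more than the text match on the name field.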
7. Some theory
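tf – term frequency (how often the term occurs in the field)
df – document frequency (how many documents contain the term)
idf – inverse document frequency
norm – normalization factor
• queryNorm – query normalization factor
• fieldNorm – field normalization factor
coord – coordination factor (how many of the query terms a document matches)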
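As a worked example, the tf and idf values in the explain output can be reproduced by hand. This sketch assumes the default Lucene 3.x similarity (DefaultSimilarity), where tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)); a custom similarity would give different numbers. The function names are mine.

```python
import math


def tf(term_freq):
    """Term frequency factor, assuming Lucene 3.x DefaultSimilarity."""
    return math.sqrt(term_freq)


def idf(doc_freq, max_docs):
    """Inverse document frequency, assuming Lucene 3.x DefaultSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# Values taken from the example: termFreq=2, docFreq=3, maxDocs=17
print(f"tf  = {tf(2):.7f}")
print(f"idf = {idf(3, 17):.6f}")
```

fieldNorm is not reproduced here: it depends on the field length and index-time boosts, and it is stored in the index with reduced precision.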
8. Let’s take a look at it again
1.6771803 = (MATCH) sum of:
  0.64883727 = (MATCH) max of:
    0.64883727 = (MATCH) weight(name:ddr^1000.0 in 6), product of:
      0.99999994 = queryWeight(name:ddr^1000.0), product of:
        1000.0 = boost
        2.446919 = idf(docFreq=3, maxDocs=17)
        4.0867718E-4 = queryNorm
      0.6488373 = (MATCH) fieldWeight(name:ddr in 6), product of:
        1.4142135 = tf(termFreq(name:ddr)=2)
        2.446919 = idf(docFreq=3, maxDocs=17)
        0.1875 = fieldNorm(field=name, doc=6)
  1.028343 = (MATCH) FunctionQuery(pow(float(price),const(1.5))), product of:
    2516.272 = pow(float(price)=185.0,const(1.5))
    1.0 = boost
    4.0867718E-4 = queryNorm
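To show how such output could be turned into something easier to read, here is a small sketch that parses the explain text into a tree and reports how much each top-level clause contributes to the score. It is not the explain.solr.pl implementation. It assumes the text form in which each nesting level is indented by two spaces; `parse_explain` and the dictionary layout are made up for this example.

```python
import re

EXPLAIN = """\
1.6771803 = (MATCH) sum of:
  0.64883727 = (MATCH) max of:
    0.64883727 = (MATCH) weight(name:ddr^1000.0 in 6), product of:
      0.99999994 = queryWeight(name:ddr^1000.0), product of:
        1000.0 = boost
        2.446919 = idf(docFreq=3, maxDocs=17)
        4.0867718E-4 = queryNorm
      0.6488373 = (MATCH) fieldWeight(name:ddr in 6), product of:
        1.4142135 = tf(termFreq(name:ddr)=2)
        2.446919 = idf(docFreq=3, maxDocs=17)
        0.1875 = fieldNorm(field=name, doc=6)
  1.028343 = (MATCH) FunctionQuery(pow(float(price),const(1.5))), product of:
    2516.272 = pow(float(price)=185.0,const(1.5))
    1.0 = boost
    4.0867718E-4 = queryNorm
"""

# "<indent><value> = <description>"
LINE = re.compile(r"^(\s*)(\S+) = (.*)$")


def parse_explain(text):
    """Turn indented explain text into nested dicts (value, label, children)."""
    root = None
    stack = []  # (indent, node) pairs for the currently open ancestors
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue
        indent = len(match.group(1))
        node = {"value": float(match.group(2)),
                "label": match.group(3),
                "children": []}
        # Close nodes that are at the same depth or deeper than this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if stack:
            stack[-1][1]["children"].append(node)
        else:
            root = node
        stack.append((indent, node))
    return root


tree = parse_explain(EXPLAIN)
for child in tree["children"]:
    share = child["value"] / tree["value"]
    print(f"{share:6.1%}  {child['label']}")
```

For this document the function query on price accounts for roughly 61% of the score and the match on the name field for the remaining 39%, which is the kind of answer a non-technical user is usually after.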
18. What you gain from explain.solr.pl
View Solr explain information in a human
readable form
Easily recognize the most influential elements
of the scoring process
Answer the questions faster
More things to come in the future
19. Plans for the future
Support for more formats of Apache Solr
explain (right now, only Solr 3.x is supported)
Visualisation of additional data
More functionality, such as:
• query problems analysis
• query syntax analysis and explanation
• query time analysis and visualization
• result comparison between cores or instances
Very distant future - an additional web application
deployed alongside Solr to enable real-time
analysis of how boosts influence scoring
20. Wrap Up
http://explain.solr.pl should be available
very soon (probably end of October or
mid-November)
The code of explain.solr.pl will be available on
GitHub soon after the initial release
There will be a Java version of
http://explain.solr.pl, which will cover much more
information
21. Sources
Links
• http://www.solr.pl
• http://explain.solr.pl
• http://lucene.apache.org ☺
We would like to thank:
• Łukasz Lewandowski (http://llewandowski.pl/) for
his work on the GUI
• Hubert ‘depesz’ Lubaczewski (http://depesz.com)
for the idea ☺