3. 3
ABOUT ME
• working in search for 6 years
• Apache Lucene/Solr committer for 2 years
• speaker at Lucene Revolution and Berlin Buzzwords
• chief search engineer at EPAM
5. 5
ESTABLISHED & EXPANDING GLOBAL VERTICALS
FINANCIAL SERVICES
• Award-winning Wealth Management Platform
• Deep Expertise in Current and Emerging FinTech
• Working with 5 of the 10 Largest Investment Banks
TRAVEL & CONSUMER
• Leading Digital Transformation for Global Retailers
• Working with largest online travel association (OTA) & largest global hospitality company
MEDIA & ENTERTAINMENT
• Recognized M&E Leader by Independent Research Analysts
• Working with 4 out of the 4 Top Broadcast Networks and 14 out of the top 30 TV Networks to transform consumer-driven media
LIFE SCIENCES AND HEALTHCARE
• R&D Domain Experts with 700+ Complex Solutions & Services Supporting the Entire Drug Discovery Workflow
• Working with 9 of the 10 Top Pharma Companies
SOFTWARE & HI-TECH
• 24-Year History of Leading Product Development
• Working with 30+ of the top 100 ISVs
EMERGING
• Deep Expertise Offers Innovative Solutions
• Working with industries ranging from Energy and Utilities to Telecom and Automotive
49. 49
AnalyzingInfixSuggester FOR INFIX SEARCH
• feed AnalyzingInfixSuggester with the main index’s terms
• enable EdgeNGramFilter for AnalyzingInfixSuggester
discipline
iscipline
scipline
cipline
ipline
pline
line
ine
ne
e
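The token list on this slide can be reproduced in a few lines of plain Java. This is only a sketch that runs without Lucene on the classpath: `SuffixTokens`, `suffixes` and `minLength` are hypothetical names introduced for illustration, and in a real setup an analysis chain would emit these tokens instead.

```java
import java.util.ArrayList;
import java.util.List;

final class SuffixTokens {
    /** Returns every suffix of term that has at least minLength chars, longest first. */
    static List<String> suffixes(String term, int minLength) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start + minLength <= term.length(); start++) {
            out.add(term.substring(start));
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints the ten tokens shown on the slide, one per line.
        for (String token : suffixes("discipline", 1)) {
            System.out.println(token);
        }
    }
}
```

Because every suffix becomes its own token, a prefix match on any token is an infix match on the original term: "cip" is a prefix of "cipline", which is a suffix of "discipline".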
50. 50
BUILDING SUGGESTER INDEX
• 14 M terms -> 79 M edge n-grams
• 10 min
• 3.3 GB (25%)
discipline
iscipline
scipline
cipline
ipline
pline
line
ine
ne
e
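The slide's numbers imply roughly 79 / 14 ≈ 5.6 derived tokens per source term. The sketch below estimates the same expansion factor for an arbitrary term list. It assumes one token per suffix and an optional minimum token length; whether the measured setup used such a cutoff is not stated, so `minLength` and the class name `ExpansionEstimate` are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class ExpansionEstimate {
    /** Number of suffix tokens of at least minLength chars produced for one term. */
    static int suffixCount(String term, int minLength) {
        return Math.max(0, term.length() - minLength + 1);
    }

    /** Average number of suffix tokens per term; 0.0 for an empty list. */
    static double expansionFactor(List<String> terms, int minLength) {
        long total = 0;
        for (String term : terms) {
            total += suffixCount(term, minLength);
        }
        return terms.isEmpty() ? 0.0 : (double) total / terms.size();
    }

    public static void main(String[] args) {
        // Ratio taken from the slide: 79 M derived tokens over 14 M terms.
        System.out.println(79.0 / 14.0);
        // The same estimate for a tiny, made-up vocabulary.
        System.out.println(expansionFactor(Arrays.asList("discipline", "line"), 1));
    }
}
```

Running the estimate over a sample of the real term dictionary before building gives an early hint of the suggester index size.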
54. 54
AnalyzingInfixSuggester FOR INFIX SEARCH
• feed AnalyzingInfixSuggester with the main index’s terms
• enable EdgeNGramFilter for AnalyzingInfixSuggester
• override wildcard expansion by calling AnalyzingInfixSuggester
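The last bullet can be pictured as replacing a term-dictionary scan for `*sci*` with a lookup that returns the matching terms directly. The sketch below is a toy stand-in, not the Lucene suggester API: a sorted in-memory map of suffixes plays the role of the suggester, and `InfixLookup`, `add` and `expand` are hypothetical names.

```java
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

final class InfixLookup {
    // suffix -> original terms that end with this suffix
    private final TreeMap<String, Set<String>> bySuffix = new TreeMap<>();

    /** Registers a term from the main index under each of its suffixes. */
    void add(String term) {
        for (int start = 0; start < term.length(); start++) {
            bySuffix.computeIfAbsent(term.substring(start), k -> new TreeSet<>()).add(term);
        }
    }

    /** Expands the wildcard *infix* into the set of terms containing infix. */
    Set<String> expand(String infix) {
        Set<String> terms = new TreeSet<>();
        // Keys that start with infix lie in the range [infix, infix + '\uffff').
        // Toy simplification: a key continuing with '\uffff' itself would be missed.
        SortedMap<String, Set<String>> range =
                bySuffix.subMap(infix, infix + Character.MAX_VALUE);
        for (Set<String> hit : range.values()) {
            terms.addAll(hit);
        }
        return terms;
    }

    public static void main(String[] args) {
        InfixLookup lookup = new InfixLookup();
        lookup.add("discipline");
        lookup.add("science");
        lookup.add("outline");
        System.out.println(lookup.expand("sci"));
    }
}
```

In a search engine, the expanded terms would then be combined into a disjunction over the original field, so the cost of the wildcard moves from scanning all terms to one range lookup.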
63. 63
DERIVATIVE TERMS
• a slight index format change
• many terms refer to the same postings list
• proposed API:
• indexWriter.deriveTerms("name", "name_edge", new EdgeNGramTokenFilter());
• search: name_edge:sci*
• hijacking and injecting codecs: LUCENE-7863
• promising for deep taxonomies
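The idea of many terms referring to the same postings list can be modelled in memory as follows. This is a loose illustration only: it does not reflect Lucene's index format or the design discussed in LUCENE-7863, and `DerivedTermsModel` with all of its methods is hypothetical. Each derived term stores references to the source term's postings array rather than a copy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class DerivedTermsModel {
    // Source field ("name"): term -> doc ids, stored exactly once.
    private final Map<String, int[]> sourcePostings = new HashMap<>();
    // Derived field ("name_edge"): derived term -> references to source postings.
    private final TreeMap<String, List<int[]>> derivedPostings = new TreeMap<>();

    /** Indexes a source term and derives one term per suffix, sharing the postings. */
    void addSourceTerm(String term, int... docIds) {
        sourcePostings.put(term, docIds);
        for (int start = 0; start < term.length(); start++) {
            derivedPostings
                    .computeIfAbsent(term.substring(start), k -> new ArrayList<>())
                    .add(docIds); // same array instance, no copy
        }
    }

    int[] postingsOf(String sourceTerm) {
        return sourcePostings.get(sourceTerm);
    }

    List<int[]> postingsOfDerived(String derivedTerm) {
        return derivedPostings.getOrDefault(derivedTerm, Collections.emptyList());
    }

    /** Toy equivalent of the query name_edge:prefix*; returns matching doc ids. */
    Set<Integer> prefixSearch(String prefix) {
        Set<Integer> docs = new TreeSet<>();
        for (List<int[]> lists
                : derivedPostings.subMap(prefix, prefix + Character.MAX_VALUE).values()) {
            for (int[] list : lists) {
                for (int doc : list) {
                    docs.add(doc);
                }
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        DerivedTermsModel model = new DerivedTermsModel();
        model.addSourceTerm("discipline", 1, 4);
        model.addSourceTerm("science", 2);
        System.out.println(model.prefixSearch("sci"));
    }
}
```

In this model the derived field costs one reference per derived term, while the doc ids themselves are held once per source term.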
68. 68
REFERENCES
• What is in a Lucene index? Adrien Grand
https://www.youtube.com/watch?v=T5RmMNDR5XI
• Automata Invasion. Robert Muir, Michael McCandless
https://www.youtube.com/watch?v=pd2jvy2IbJE
• Lucene Search Essentials: Scorers, Collectors and Custom Queries. Mikhail Khludnev
https://www.youtube.com/watch?v=X9YovpYj6uo
• A new Lucene suggester based on infix matches. Michael McCandless
http://blog.mikemccandless.com/2013/06/a-new-lucene-suggester-based-on-infix.html
69. 69
REFERENCES
• In Search of Tommy Hilfiger (talk in Russian). Mikhail Khludnev
https://www.youtube.com/watch?v=Azf4oUL-Dqc