Taxonomical Semantical Magical Search - Doug Turnbull, OpenSource Connections
Presented at Lucene/Solr Revolution 2017
Taxonomical Semantical Magical Search - Doug Turnbull, OpenSource Connections
1.
Taxonomical Semantical Magical Search
OpenSource Connections
Doug Turnbull, Relevance Lead
dturnbull@o19s.com, @softwaredoug
© OpenSource Connections, 2017
2.
Who are we?
Solr/ES consulting: a team 100% focused on relevance
Learning to rank – semantic search – relevance – personalization – findability
3.
Reflect: What problem are you trying to solve when you jump to 'semantic search'?
4.
"We studied spontaneous word choice for objects in five application-related domains, and found the variability to be surprisingly large. In every case two people favored the same term with probability <0.20."
"Simulations show how this fundamental property of language limits the success of various design methodologies for vocabulary-driven interaction."
5.
Solve with keyword stuffing?
- Content creators guarantee every "shoe" has a "shoe" keyword somewhere!
- And every wing-tip mentions dress shoes…
- ...ad infinitum…
6.
Solve with tagging? (example taken from Stack Overflow)
- Java is a type of JVM language. Should this be tagged JVM too? What is a "query string"? Which of these tags is useful for search?
- Who tags everything? Is it consistent? What are the rules?
7.
Solve with synonyms?
Yes! Synonyms are a tool that can help us. But they are easy to mess up:
shoes => dress shoes
wing tips,shoes
tennis shoes,shoes
When I search for tennis shoes, why do I get wing tips? Why do I get dresses?!?
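Why the equivalence rules backfire can be shown with a toy simulation. This is only a sketch in plain Python, not Solr's synonym filter: the two rules are the ones on the slide, while the documents and the helper names (`expand`, `index_terms`, `search`) are invented for illustration.

```python
# Hypothetical sketch: the two equivalence rules come from the slide,
# the documents and helper names are invented for illustration.
EQUIVALENT = [
    {"wing tips", "shoes"},
    {"tennis shoes", "shoes"},
]

DOCS = {
    "doc1": "brown wing tips",
    "doc2": "white tennis shoes",
}


def expand(phrase):
    """Return the phrase plus everything listed as equivalent to it."""
    terms = {phrase}
    for group in EQUIVALENT:
        if phrase in group:
            terms |= group
    return terms


def index_terms(text):
    """Index time: expand every known phrase found in the text."""
    terms = set()
    for group in EQUIVALENT:
        for phrase in group:
            if phrase in text:
                terms |= expand(phrase)
    return terms


def search(query):
    """Query time: expand the query and match on any shared term."""
    wanted = expand(query)
    return sorted(d for d, text in DOCS.items() if wanted & index_terms(text))


print(search("tennis shoes"))
```

Because both rules treat "shoes" as fully equivalent to the narrower phrase, the query for tennis shoes also returns doc1, the wing tips. The rules encode "is the same as" where "is a type of" was meant.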
8.
Talking teaches/reminds vocab (Searching)
Query progression: shoes -> dress shoes -> brown wing tips
- Searcher uncertain: uses broad queries to experiment
- Searcher learning: results give clues that help the shopper refine further
- Searcher trusting: more confident about which terms to use
9.
Searchers get more specific...
Hierarchy of ideas:
shoes: NP (item): "shoe"
wing tips (more specific): NP (item): "wing tips" type_of:"dress shoes" type_of:"shoe"
10.
… and try types of modifiers
wing tips: NP (item): "wing tips" type_of:"dress shoes" type_of:"shoe"
sapphire wing tips: NP (item): "wing tips" type_of:"dress shoes" type_of:"shoe"; ADJ (color): "sapphire" type_of:"blue"
11.
Semantic search: enable semantic exploration
- Low term specificity: the search term specifies a wide category to explore (searching for "shoes")
- High term specificity: the search term is too specific, so try semantically broader/similar items ("Show 'dress shoes' for 'oxfords'")
12.
Make Solr grok type-of relationships
"wing tip" is a type of "dress shoe" is a type of "shoe"
- Search at "wing tip": only show wing tips
- Search at "shoe": show all things that are a type of shoe
Beyond the actual terms used in docs
13.
Per-entity terms: a taxonomy
Shoes
  Athletic Shoes
    Running Shoes
    Tennis Shoes
  Dress Shoes
    High Heels
    Oxfords
    Wing Tips
Blue
  Sapphire
  Sky blue
A search taxonomy (not the taxonomy for your site nav)
14.
Index-time taxonomy expansion
Item, Color, Size
Substrings -> Entities -> Expand to broad/narrow
tennis shoes => footwear shoes athletic tennis_shoes
sapphire => blue sapphire
15.
In Solr...
Item, Color, Size
- Substrings -> Entities: possible to build from simple keepwords
- Expand to broad/narrow: query or index time synonyms; uses TF*IDF of concept
tennis shoes => tennis_shoes,athletic_shoes,shoes,...
sapphire => sapphire,blue
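Rules like these could be generated from the taxonomy rather than written by hand. A minimal sketch, assuming the taxonomy is held as a child-to-parent map; the `PARENT` dict and both function names are hypothetical, and the output merely mimics the `a => b,c` explicit-mapping layout of a Solr synonyms file.

```python
# Hypothetical child -> parent map built from the example taxonomy.
PARENT = {
    "wing_tips": "dress_shoes",
    "oxfords": "dress_shoes",
    "dress_shoes": "shoes",
    "tennis_shoes": "athletic_shoes",
    "running_shoes": "athletic_shoes",
    "athletic_shoes": "shoes",
    "sapphire": "blue",
    "sky_blue": "blue",
}


def expand(entity):
    """Return the entity followed by all of its ancestors, narrow to broad."""
    chain = [entity]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def synonym_rules():
    """Emit one 'surface form => entity,ancestors' rule per non-root entity."""
    return [
        f"{entity.replace('_', ' ')} => {','.join(expand(entity))}"
        for entity in sorted(PARENT)
    ]


for rule in synonym_rules():
    print(rule)
```

Root concepts ("shoes", "blue") get no rule here because they have nothing broader to expand to. Whether they still need a keepword entry depends on the analysis chain.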
16.
In Solr, index time...

"Item" copy field:
(Input text) You will love these maroon dress shoes
(Tokenization & maybe stemming) [you] [will] [love] [these] [maroon] [dress] [shoes]
(Compound/decompound, syn filter) [you] [will] [love] [these] [maroon] [dress_shoes]
(Keepwords for entity) [dress_shoes]
(Semantic expansion, syn filter) [dress_shoes] [shoes]

"Color" copy field:
(Input text) You will love these maroon dress shoes
(Tokenization & maybe stemming) [you] [will] [love] [these] [maroon] [dress] [shoes]
(Compound/decompound, syn filter) [you] [will] [love] [these] [maroon] [dress_shoes]
(Keepwords for entity) [maroon]
(Semantic expansion, syn filter) [maroon] [brown]
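The same chain can be mimicked outside Solr to check what each field ends up with. This is a plain-Python imitation of the four stages, not Solr's analyzers; the tables (`COMPOUNDS`, `ITEM`, `COLOR`) and function names are assumptions made for the sketch.

```python
import re

# Hypothetical rule tables standing in for the synonym and keepword files.
COMPOUNDS = {("dress", "shoes"): "dress_shoes", ("wing", "tips"): "wing_tips"}
ITEM = {
    "keep": {"dress_shoes", "wing_tips", "shoes"},
    "expand": {"dress_shoes": ["shoes"], "wing_tips": ["dress_shoes", "shoes"]},
}
COLOR = {
    "keep": {"maroon", "brown"},
    "expand": {"maroon": ["brown"]},
}


def tokenize(text):
    """Lowercase and split on non-letters (no stemming in this sketch)."""
    return re.findall(r"[a-z]+", text.lower())


def compound(tokens):
    """Join known two-word phrases into a single token."""
    out, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in COMPOUNDS:
            out.append(COMPOUNDS[pair])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def analyze(text, field):
    """Tokenize, compound, keep only the field's entities, then expand them."""
    kept = [t for t in compound(tokenize(text)) if t in field["keep"]]
    out = []
    for token in kept:
        out.append(token)
        out.extend(field["expand"].get(token, []))
    return out


text = "You will love these maroon dress shoes"
print(analyze(text, ITEM))
print(analyze(text, COLOR))
```

Each field keeps only its own entity type, which is why the same input text yields item tokens in one copy field and color tokens in the other.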
17.
Index time solution
q=brown wing tips
&defType=edismax
&sow=false
&qf=item^100 color^10
(you'll want to search more than these semantic fields)

(Input text) brown wing tips
(Item analyzer output) [wing_tips] [dress_shoes] [shoes]
IDF: highest for wing_tips, lowest for shoes (eliminate TF? norms?)

(Input text) brown wing tips
(Color analyzer output) [brown]
Matches maroon, because at index time: maroon => brown, maroon
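The IDF ordering falls out of the expansion itself: every shoe document carries the broad token, few carry the narrow one. A small sketch with an invented four-document index and the textbook `log(N / df)` form of IDF, which is an assumption here and not the exact formula Solr's similarity uses.

```python
import math
from collections import Counter

# Hypothetical "item" field contents after index-time expansion.
docs = [
    ["wing_tips", "dress_shoes", "shoes"],
    ["oxfords", "dress_shoes", "shoes"],
    ["tennis_shoes", "athletic_shoes", "shoes"],
    ["running_shoes", "athletic_shoes", "shoes"],
]

# Document frequency: in how many documents does each token appear?
df = Counter(token for doc in docs for token in set(doc))

# Textbook IDF; rarer (narrower) concepts get a higher weight.
idf = {token: math.log(len(docs) / count) for token, count in df.items()}

for token in ("wing_tips", "dress_shoes", "shoes"):
    print(token, round(idf[token], 3))
```

With these documents the narrow concept scores highest and the broad one lowest, so a match on wing_tips outweighs a match that only shares shoes.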
18.
Query-time taxonomy expansion
How do users think of your items?
Item, Color, Size: trained/built from query logs
Example query: sapphire tennis shoes
Substrings -> Entities -> Expand to broad/narrow
tennis shoes => item:"tennis shoes" OR item:"athletic shoes" OR item:"shoes" ...
sapphire => color:blue OR color:sapphire
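A query rewriter along these lines could run in the application before the request reaches Solr. A rough sketch: the `TAXONOMY` table and `expand_query` are hypothetical, phrase spotting is naive substring matching, and joining the per-entity clauses with AND is an assumption that the slide does not specify.

```python
# Hypothetical lookup: surface phrase -> (field, narrow-to-broad terms).
TAXONOMY = {
    "tennis shoes": ("item", ["tennis shoes", "athletic shoes", "shoes"]),
    "sapphire": ("color", ["sapphire", "blue"]),
}


def expand_query(query):
    """Rewrite recognized phrases into fielded OR clauses."""
    rest = query.lower()
    clauses = []
    # Try longer phrases first so "tennis shoes" wins over shorter overlaps.
    for phrase in sorted(TAXONOMY, key=len, reverse=True):
        if phrase in rest:
            field, terms = TAXONOMY[phrase]
            ors = " OR ".join(f'{field}:"{term}"' for term in terms)
            clauses.append(f"({ors})")
            rest = rest.replace(phrase, " ")
    # Assumption: every recognized entity must match.
    return " AND ".join(clauses)


print(expand_query("sapphire tennis shoes"))
```

Clauses come out in phrase-length order, not query order, which does not matter for a boolean query. Unrecognized words are simply dropped here; a real rewriter would pass them through to the normal text fields.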
19.
Query Phrase In Solr...

Item Semantic Analyzer:
(Input text) Brown wing tips
(Semantic expansion, syn filter) [wing tips] [dress shoes] [shoes]

Color Semantic Analyzer:
(Input text) Brown wing tips
(Semantic expansion, syn filter) [brown] [maroon]

Transform to description("dress shoes" OR "wing tips" OR shoes OR maroon OR brown)

Problems:
- Two query analyzers for the same field are not possible in Solr
- Can't re-tokenize [dress shoes] -> "dress shoes" phrase query
20.
Match Query Parser
https://github.com/o19s/match-query-parser
q=brown wing tips
&defType=edismax
&qf=description title
&bq={!match analyze_as=item_tax search_with=phrase qf=description v=$q}^100
&bq={!match analyze_as=color_tax search_with=phrase qf=description v=$q}
- analyze_as: how to analyze the query string
- search_with=phrase: retokenize multi-word tokens and do a phrase search
21.
Other building blocks
Auto Phrase Token Filter / Query Auto Filtering:
- https://github.com/lucidworks/auto-phrase-tokenfilter
- https://lucidworks.com/2015/02/17/introducing-query-autofiltering/
Health-on-net Lucene Synonyms:
- https://github.com/healthonnet/hon-lucene-synonyms
Sematext Query Segmenter:
- https://github.com/sematext/query-segmenter
Shopping 24 Bmax Query Parser:
- https://github.com/shopping24/solr-bmax-queryparser
22.
Deriving Querqy rules from taxonomies
https://github.com/renekrie/querqy
23.
Query Time vs Index Time
Query time:
PROS
- No need to reindex when updating managed vocab
CONS
- Relevance scoring of terms (boosts help)
- Complex / slow queries
Index time:
PROS
- TF*IDF gives more accurate scoring (broad concepts score low, narrow score high)
- Faster queries
CONS
- Reindexing for synonym changes
24.
Structure your docs for query understanding
Relevance engineer's challenge:
- Where can we begin with a taxonomy?
  - Reuse filters & facets
  - Reuse your page's navigational taxonomy?
  - Track which searches land on pages (old school click tracking)?
  - Zero results tracking?
- How do we incentivize content creators to move away from keyword stuffing and toward organizing content around a search keyword taxonomy?
- Finally: we don't care about the source data model, only what helps users find things
25.
SHReC Algorithm
26.
SHReC Algorithm
Simple doc frequency in-content to look for super-concepts / sub-concepts
Term/phrase x subsumes y (x is the parent concept?) when:
- df(x) > df(y)
- df(x ∧ y) / df(y) >= α (α = 1: complete subsumption)
27.
SHReC Algorithm Example
Shoes, Wing Tips
- df("shoes") > df("wing tips")
- df("shoes" ∧ "wing tips") / df("wing tips") >= 0.8
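The two conditions translate directly into a check over document sets. A minimal in-memory sketch, assuming each document is reduced to the set of phrases it mentions; the `subsumes` function and the toy corpus are made up for illustration, and α defaults to the 0.8 used in the example.

```python
def subsumes(x, y, docs, alpha=0.8):
    """True when x looks like a parent concept of y under the two df tests."""
    has_x = {i for i, doc in enumerate(docs) if x in doc}
    has_y = {i for i, doc in enumerate(docs) if y in doc}
    if not has_y:
        return False  # y never occurs, so there is nothing to subsume
    broader = len(has_x) > len(has_y)              # df(x) > df(y)
    overlap = len(has_x & has_y) / len(has_y)      # df(x AND y) / df(y)
    return broader and overlap >= alpha


# Hypothetical corpus: 9 of the 10 wing-tip docs also mention shoes.
docs = (
    [{"shoes", "wing tips"}] * 9
    + [{"wing tips"}]
    + [{"shoes", "tennis shoes"}] * 5
)

print(subsumes("shoes", "wing tips", docs))
print(subsumes("wing tips", "shoes", docs))
```

The test is asymmetric by design: "shoes" can subsume "wing tips", but the reverse fails the df(x) > df(y) condition.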
28.
SHReC Algorithm with Solr
Shoes, Wing Tips
- df("shoes") > df("wing tips"): cache doc freq (q=*:*&facet.field=item&facet=true)
- df("shoes" ∧ "wing tips") / df("wing tips") >= 0.8: q=item:"wing tips" AND item:shoes, num results
29.
Unfortunately reality is messy
Your data probably looks like: [diagram: Shoes, Wing Tips]
30.
Idea: mine other corpus?
[diagram: Shoes, Wing Tips]
- but still, what phrases do you test?
31.
Statistically significant collocations
Wing, Tips -> Wing Tips
Student's t-test against the null hypothesis that "wing" and "tips" are unrelated
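One common way to run this test on bigrams compares the observed bigram rate with the rate expected if the two words were independent. The sketch below uses that textbook formulation, with the variance approximated as p(1 - p); the function name and the tiny corpus are invented, and the slide does not say which variant was used.

```python
import math
from collections import Counter


def collocation_t(tokens, w1, w2):
    """t-score for the bigram (w1, w2) against the independence hypothesis."""
    n_bigrams = len(tokens) - 1
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    observed = bigrams[(w1, w2)] / n_bigrams
    # Expected rate if w1 and w2 occurred independently of each other.
    expected = (unigrams[w1] / len(tokens)) * (unigrams[w2] / len(tokens))
    variance = observed * (1 - observed)
    if variance == 0:
        return 0.0  # bigram never (or always) occurs; score is undefined
    return (observed - expected) / math.sqrt(variance / n_bigrams)


text = "brown wing tips and black wing tips for sale and brown shoes for sale"
tokens = text.split()

print(collocation_t(tokens, "wing", "tips"))
print(collocation_t(tokens, "and", "brown"))
```

A higher score means the pair co-occurs more often than chance would predict. On a corpus this small the numbers only illustrate the ranking, not real significance.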
32.
Refinements
shoe ->
- dress shoe (12%)
- wing tip (23%)
- tennis shoe (11%)
Sub-concepts, likely child phrases

dress shoe ->
- blue dress shoe (1%)
- sapphire brooks brothers dress shoe (0.001%)
- brown dress shoe (20%)
Colors scattered throughout

tennis shoe (11%) -> running shoe (34%)
Siblings refine each other. Should these be in supercategory "athletic shoes"?
33.
Refinement mining in Solr
docs = [{ "query": "shoe", "refinement": "dress shoe" },
        { "query": "shoe", "refinement": "brown shoe" },
        { "query": "tie", "refinement": "brown tie" }]
q=query:shoe&
facet=true&
facet.field=refinement
Refinements:
- dress shoe (4)
- tennis shoe (2)
- ...
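The facet request is a group-and-count over the refinement log, so the idea can be tried on a log extract before indexing anything. A small sketch with an invented log; `refinements_for` is a hypothetical helper, not a Solr call.

```python
from collections import Counter

# Hypothetical log extract: one record per observed query -> refinement step.
log = [
    {"query": "shoe", "refinement": "dress shoe"},
    {"query": "shoe", "refinement": "dress shoe"},
    {"query": "shoe", "refinement": "dress shoe"},
    {"query": "shoe", "refinement": "tennis shoe"},
    {"query": "shoe", "refinement": "tennis shoe"},
    {"query": "shoe", "refinement": "brown shoe"},
    {"query": "tie", "refinement": "brown tie"},
]


def refinements_for(query, records):
    """Count refinements of one query, most frequent first (like a facet)."""
    counts = Counter(r["refinement"] for r in records if r["query"] == query)
    return counts.most_common()


for phrase, count in refinements_for("shoe", log):
    print(f"- {phrase} ({count})")
```

The counts per refinement are what the later slides feed into the subsumption test as candidate sub-concepts.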
34.
SHReC w/ Refinements
docs = [{ "query": "shoe", "refinement": "dress shoe" },
        { "query": "shoe", "refinement": "brown shoe" },
        { "query": "tie", "refinement": "brown tie" }]
q=query:shoe&
facet=true&
facet.field=refinement
35.
SHReC w/ Refinements
q=query:shoe&
facet=true&
facet.field=refinement
Num results for q=shoe
(Slow, but you do this rarely)
Seed the corpus exploration: SHReC
36.
SHReC w/ sig terms
scoreNodes(
  select(
    facet(collectionName,
          q="query:shoes",
          buckets="refinements",
          bucketSorts="count(*) desc",
          bucketSizeLimit="100",
          count(*)),
    refine_graph as node,
    "count(*)",
    replace(collection, null, withValue=collectionName),
    replace(field, null, withValue=refine_graph))
)
What's actually happening in SHReC is significance scoring, which is baked into Solr: relationship of local vs global
37.
Other ways of measuring term stat. significance
- Trey G. Solr knowledge graph (hope you saw his talk)!
  https://lucidworks.com/video/leveraging-lucenesolr-as-a-knowledge-graph-and-intent-engine/
- Mark Harwood Elastic Graph / Sig Terms
  https://www.elastic.co/elasticon/conf/2016/sf/graph-capabilities-in-the-elastic-stack
38.
But word2vec, LDA, etc
- Focused on content, not users: they discover topics/synonyms in content, but we often need mappings from search queries to the content's vernacular
- Traditional topic modeling is flat
- Hierarchies extracted from content don't reflect users' hierarchies & how they map to content
- Don't confuse co-occurrences with synonyms without extensive data modeling/munging to get your content here
39.
Questions?
Further Reading:
- Relevant Search! Discount code: relsearch, http://manning.com
- Blog articles:
  - Building Entity-focused search w/ Keyphrases:
    http://opensourceconnections.com/blog/2016/12/02/solr-elasticsearch-synonyms-better-patterns-keyphrases/
  - Synonym best practices:
    http://opensourceconnections.com/blog/2016/12/23/elasticsearch-synonyms-patterns-taxonomies/
  - Match Query Parser:
    http://opensourceconnections.com/blog/2017/01/23/our-solution-to-solr-multiterm-synonyms/
40.
- <shoutout BLOOOMBERG!!> -
We built a learning to rank plugin for that other search engine... Shameless plug