SlideShare a Scribd company logo
1 of 15
Download to read offline
To be or not to be.
Neo4j Full Text Search
Tips and Tricks
graphaware.com
@graph_aware
#NODES19 - Oct 10, 2019
Christophe Willemsen
CTO - GraphAware
@ikwattro
Agenda
● Full Text Search Syntax refresher
● Relevant Search using Graphs
● Advanced Full Text Search Queries
How is text data indexed ?
Default Stopwords for English
"a", "an", "and", "are", "as", "at", "be", "but",
"by", "for", "if", "in", "into", "is", "it",
"no", "not", "of", "on", "or", "such",
"that", "the", "their", "then",
"there","these",
"they", "this", "to", "was", "will", "with"
Default Stopwords for English
"a", "an", "and", "are", "as", "at", "be",
"but", "by", "for", "if", "in", "into", "is", "it",
"no", "not", "of", "on", "or", "such",
"that", "the", "their", "then",
"there","these",
"they", "this", "to", "was", "will", "with"
Custom Lucene Analyzer
Custom Lucene Analyzer
ikwattro@mbp666 ~/d/_/nodes19> ll plugins/
total 28264
drwxr-xr-x 4 ikwattro staff 128B Oct 10 09:46 .
drwxr-xr-x 17 ikwattro staff 544B Sep 22 20:25 ..
-rw-r--r--@ 1 ikwattro staff 13M Sep 22 20:30 apoc-3.5.0.4-all.jar
-rw-r--r-- 1 ikwattro staff 4.9K Sep 23 12:08 fts-extra-1.0.0.jar
Relevant Search
Relevant Search
● You first query the graph to build the user context
● You then use the context to make search results relevant not only
based on their text similarity, but also based on who the user is in
the network.
Relevant Search
community.neo4j.com
Relevant Search
Relevant Search
Hunger Games Questions for
"To be or not to be. Full Text Search tips and
tricks"
1. Easy: What type of properties can we index for full text search.
a. All properties
b. Only strings and integers
c. Only strings
2. Medium: How can I index stopwords such that they can be searched?
a. Using an analyzer available in Neo4j
b. Using a custom analyzer written in Cypher
c. Using a custom analyzer written in Java
3. Hard: How can you boost a specific field during a query ?
a. Using the ~ operator
b. Using the ^ operator
c. Using the + operator
Answer here: r.neo4j.com/hunger-games
Thank you
Christophe Willemsen
CTO - GraphAware
@ikwattro
graphaware.com

More Related Content

Similar to To be or not to be.

Digifoot 2012 ppt
Digifoot 2012 pptDigifoot 2012 ppt
Digifoot 2012 ppt
tpoelzer
 
Digifoot 2012 ppt
Digifoot 2012 pptDigifoot 2012 ppt
Digifoot 2012 ppt
tpoelzer
 

Similar to To be or not to be. (20)

Short URLs, Big Fun
Short URLs, Big FunShort URLs, Big Fun
Short URLs, Big Fun
 
Digifoot 2012 ppt
Digifoot 2012 pptDigifoot 2012 ppt
Digifoot 2012 ppt
 
Text-mining and Automation
Text-mining and AutomationText-mining and Automation
Text-mining and Automation
 
Digifoot 2012 ppt
Digifoot 2012 pptDigifoot 2012 ppt
Digifoot 2012 ppt
 
NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...
NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...
NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...
 
Selfish Accessibility: Government Digital Service
Selfish Accessibility: Government Digital ServiceSelfish Accessibility: Government Digital Service
Selfish Accessibility: Government Digital Service
 
Natural Language Search with Neo4j - Kenny Bastani @ GraphConnect NY 2013
Natural Language Search with Neo4j - Kenny Bastani @ GraphConnect NY 2013Natural Language Search with Neo4j - Kenny Bastani @ GraphConnect NY 2013
Natural Language Search with Neo4j - Kenny Bastani @ GraphConnect NY 2013
 
Elasticsearch at EyeEm
Elasticsearch at EyeEmElasticsearch at EyeEm
Elasticsearch at EyeEm
 
Использование Elasticsearch для организации поиска по сайту
Использование Elasticsearch для организации поиска по сайтуИспользование Elasticsearch для организации поиска по сайту
Использование Elasticsearch для организации поиска по сайту
 
Selfish Accessibility — WordCamp Europe 2017
Selfish Accessibility — WordCamp Europe 2017Selfish Accessibility — WordCamp Europe 2017
Selfish Accessibility — WordCamp Europe 2017
 
Selfish Accessibility: WordCamp London 2017
Selfish Accessibility: WordCamp London 2017Selfish Accessibility: WordCamp London 2017
Selfish Accessibility: WordCamp London 2017
 
Scaling Saved Searches at eBay Kleinanzeigen
Scaling Saved Searches at eBay KleinanzeigenScaling Saved Searches at eBay Kleinanzeigen
Scaling Saved Searches at eBay Kleinanzeigen
 
Online research and research skills
Online research and research skillsOnline research and research skills
Online research and research skills
 
Solr Graph Query: Presented by Kevin Watters, KMW Technology
Solr Graph Query: Presented by Kevin Watters, KMW TechnologySolr Graph Query: Presented by Kevin Watters, KMW Technology
Solr Graph Query: Presented by Kevin Watters, KMW Technology
 
Sourcing languages
Sourcing languagesSourcing languages
Sourcing languages
 
AstriCon 2017 - Machine Learning, AI & Asterisk
AstriCon 2017  - Machine Learning, AI & AsteriskAstriCon 2017  - Machine Learning, AI & Asterisk
AstriCon 2017 - Machine Learning, AI & Asterisk
 
Team BuzzFeed: Project Presentation
Team BuzzFeed: Project PresentationTeam BuzzFeed: Project Presentation
Team BuzzFeed: Project Presentation
 
06. ElasticSearch : Mapping and Analysis
06. ElasticSearch : Mapping and Analysis06. ElasticSearch : Mapping and Analysis
06. ElasticSearch : Mapping and Analysis
 
You've Got (Big) Data! Now What?
You've Got (Big) Data! Now What?You've Got (Big) Data! Now What?
You've Got (Big) Data! Now What?
 
Taking the Reins: Website Redesign by the Librarians, for the Users
Taking the Reins: Website Redesign by the Librarians, for the UsersTaking the Reins: Website Redesign by the Librarians, for the Users
Taking the Reins: Website Redesign by the Librarians, for the Users
 

More from GraphAware

More from GraphAware (20)

Unparalleled Graph Database Scalability Delivered by Neo4j 4.0
Unparalleled Graph Database Scalability Delivered by Neo4j 4.0Unparalleled Graph Database Scalability Delivered by Neo4j 4.0
Unparalleled Graph Database Scalability Delivered by Neo4j 4.0
 
Challenges in knowledge graph visualization
Challenges in knowledge graph visualizationChallenges in knowledge graph visualization
Challenges in knowledge graph visualization
 
Social media monitoring with ML-powered Knowledge Graph
Social media monitoring with ML-powered Knowledge GraphSocial media monitoring with ML-powered Knowledge Graph
Social media monitoring with ML-powered Knowledge Graph
 
It Depends (and why it's the most frequent answer to modelling questions)
It Depends (and why it's the most frequent answer to modelling questions)It Depends (and why it's the most frequent answer to modelling questions)
It Depends (and why it's the most frequent answer to modelling questions)
 
How Boston Scientific Improves Manufacturing Quality Using Graph Analytics
How Boston Scientific Improves Manufacturing Quality Using Graph AnalyticsHow Boston Scientific Improves Manufacturing Quality Using Graph Analytics
How Boston Scientific Improves Manufacturing Quality Using Graph Analytics
 
When privacy matters! Chatbots in data-sensitive businesses
When privacy matters! Chatbots in data-sensitive businessesWhen privacy matters! Chatbots in data-sensitive businesses
When privacy matters! Chatbots in data-sensitive businesses
 
Graph-Powered Machine Learning
Graph-Powered Machine LearningGraph-Powered Machine Learning
Graph-Powered Machine Learning
 
Signals from outer space
Signals from outer spaceSignals from outer space
Signals from outer space
 
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4jNeo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
 
Graph-Powered Machine Learning
Graph-Powered Machine Learning Graph-Powered Machine Learning
Graph-Powered Machine Learning
 
(Big) Data Science
 (Big) Data Science (Big) Data Science
(Big) Data Science
 
Modelling Data in Neo4j (plus a few tips)
Modelling Data in Neo4j (plus a few tips)Modelling Data in Neo4j (plus a few tips)
Modelling Data in Neo4j (plus a few tips)
 
Intro to Neo4j (CZ)
Intro to Neo4j (CZ)Intro to Neo4j (CZ)
Intro to Neo4j (CZ)
 
Modelling Data as Graphs (Neo4j)
Modelling Data as Graphs (Neo4j)Modelling Data as Graphs (Neo4j)
Modelling Data as Graphs (Neo4j)
 
GraphAware Framework Intro
GraphAware Framework IntroGraphAware Framework Intro
GraphAware Framework Intro
 
Advanced Neo4j Use Cases with the GraphAware Framework
Advanced Neo4j Use Cases with the GraphAware FrameworkAdvanced Neo4j Use Cases with the GraphAware Framework
Advanced Neo4j Use Cases with the GraphAware Framework
 
Recommendations with Neo4j (FOSDEM 2015)
Recommendations with Neo4j (FOSDEM 2015)Recommendations with Neo4j (FOSDEM 2015)
Recommendations with Neo4j (FOSDEM 2015)
 
Machine Learning Powered by Graphs - Alessandro Negro
Machine Learning Powered by Graphs - Alessandro NegroMachine Learning Powered by Graphs - Alessandro Negro
Machine Learning Powered by Graphs - Alessandro Negro
 
Knowledge Graphs and Chatbots with Neo4j and IBM Watson - Christophe Willemsen
Knowledge Graphs and Chatbots with Neo4j and IBM Watson - Christophe WillemsenKnowledge Graphs and Chatbots with Neo4j and IBM Watson - Christophe Willemsen
Knowledge Graphs and Chatbots with Neo4j and IBM Watson - Christophe Willemsen
 
The power of polyglot searching
The power of polyglot searchingThe power of polyglot searching
The power of polyglot searching
 

Recently uploaded

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 

To be or not to be.

  • 1. To be or not to be. Neo4j Full Text Search Tips and Tricks graphaware.com @graph_aware #NODES19 - Oct 10, 2019 Christophe Willemsen CTO - GraphAware @ikwattro
  • 2. Agenda ● Full Text Search Syntax refresher ● Relevant Search using Graphs ● Advanced Full Text Search Queries
  • 3. How is text data indexed ?
  • 4.
  • 5. Default Stopwords for English "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there","these", "they", "this", "to", "was", "will", "with"
  • 6. Default Stopwords for English "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there","these", "they", "this", "to", "was", "will", "with"
  • 8. Custom Lucene Analyzer ikwattro@mbp666 ~/d/_/nodes19> ll plugins/ total 28264 drwxr-xr-x 4 ikwattro staff 128B Oct 10 09:46 . drwxr-xr-x 17 ikwattro staff 544B Sep 22 20:25 .. -rw-r--r--@ 1 ikwattro staff 13M Sep 22 20:30 apoc-3.5.0.4-all.jar -rw-r--r-- 1 ikwattro staff 4.9K Sep 23 12:08 fts-extra-1.0.0.jar
  • 10. Relevant Search ● You first query the graph to build the user context ● You then use the context to make search results relevant not only based on their text similarity, but also based on who the user is in the network.
  • 14. Hunger Games Questions for "To be or not to be. Full Text Search tips and tricks" 1. Easy: What type of properties can we index for full text search. a. All properties b. Only strings and integers c. Only strings 2. Medium: How can I index stopwords such that they can be searched? a. Using an analyzer available in Neo4j b. Using a custom analyzer written in Cypher c. Using a custom analyzer written in Java 3. Hard: How can you boost a specific field during a query ? a. Using the ~ operator b. Using the ^ operator c. Using the + operator Answer here: r.neo4j.com/hunger-games
  • 15. Thank you Christophe Willemsen CTO - GraphAware @ikwattro graphaware.com