Graphs, graphs everywhere
Zbyszko Papierski, Senior Dev@JIRA Cloud,
T:@ZPapierski
E: zbyszko.papierski@gmail.com
Lucene powered relation exploration
Agenda
1. Introduction to Lucene and friends
2. Evolution of data analysis by Solr and Elasticsearch
3. Graph capabilities of Elasticsearch (briefly)
4. Solr - QueryParserPlugin
5. Solr - Streaming Expressions
6. Examples
http://bit.do/graphs-src
1. Create a collection
2. Put schema
3. Run feeder
Lucene and friends
Lucene
Provides a mechanism for fast searching of text data - both full-text
search (analyzed data) and exact match (non-analyzed data, or docValues)
Step one - indexing
{kitty|kitten|cat|cats|kittens|pussy} —> cat
{is} —>
{GORGEOUS!!!} —> gorgeous, pretty, nice, etc.
Step two - searching
{very} —> very
{nice} —> nice
{kitty} —> cat
{nice, cat, …} {very, ugly, cat, …}
{very,nice, dog, …}
{very, nice, bear, …}
Step three - scoring
{very} —> very
{nice} —> nice
{kitty} —> cat
{nice, cat, …} {very, ugly, cat, …}
{very,nice, dog, …}
{very, nice, bear, …}
Winner!
nice and cat score
higher than very and nice
or very and cat
because cat is rarer than very
this is only an example, all cats are nice…
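The three steps above (indexing, searching, scoring) can be sketched in a few lines of Python. This is a toy model, not Lucene: the synonym table, the stop-word list, and the scoring formula (IDF summed over matching terms, divided by the square root of document length) are assumptions made for illustration. The length normalization is what lets the short "nice cat" document win over "very ugly cat", since both match one common term plus the rarer "cat".

```python
import math
from collections import defaultdict

# Hypothetical analysis rules standing in for a real analyzer chain.
SYNONYMS = {"kitty": "cat", "kitten": "cat", "cats": "cat",
            "kittens": "cat", "gorgeous": "nice"}
STOPWORDS = {"is"}

def analyze(text):
    """Lowercase, strip punctuation, drop stop words, map synonyms."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip("!?.,")
        if not word or word in STOPWORDS:
            continue
        tokens.append(SYNONYMS.get(word, word))
    return tokens

def build_index(docs):
    """Return (term -> set of doc ids, doc id -> token count)."""
    index = defaultdict(set)
    lengths = {}
    for doc_id, text in docs.items():
        terms = analyze(text)
        lengths[doc_id] = len(terms)
        for term in terms:
            index[term].add(doc_id)
    return index, lengths

def search(index, lengths, query):
    """Score documents: rarer terms weigh more, longer docs weigh less."""
    num_docs = len(lengths)
    scores = defaultdict(float)
    for term in analyze(query):
        postings = index.get(term, set())
        if not postings:
            continue
        idf = math.log(1 + num_docs / len(postings))
        for doc_id in postings:
            scores[doc_id] += idf / math.sqrt(lengths[doc_id])
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

DOCS = {
    "d1": "nice cat",
    "d2": "very ugly cat",
    "d3": "very nice dog",
    "d4": "very nice bear",
}
INDEX, LENGTHS = build_index(DOCS)
RESULTS = search(INDEX, LENGTHS, "very nice kitty")
```

Here `RESULTS[0]` is the "nice cat" document, matching the winner on the slide.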
Solr
Older, works more closely with Lucene
Elasticsearch
Newer, but with more toys
Waiter, there is a graph in my full-text search engine!
are relations
• full text searching
• faceting/aggregation
• statistical
• relationship exploration
How did we get here?
1. Your standard, full-text search
2. TF-IDF-ish relationship sorting
3. It’s already there
It’s still your standard Lucene index
• From Elasticsearch 2.3
• REST API - /_graph/explore
• visualization for Kibana
• Part of Elastic's commercial offering (named
X-Pack from 5.0)
Elasticsearch+Kibana
Plugin for Elasticsearch and Kibana - Graph
picture from: https://www.elastic.co/guide/en/graph/current/graph-introduction.html
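As a rough sketch of what a call to the explore endpoint might look like, the snippet below only builds the request path and JSON body; it does not send anything. The index name `emails` and the fields `from` and `to` are hypothetical, and the body shape (a seed `query`, starting `vertices`, and `connections` to follow) should be checked against the Graph documentation for the Elasticsearch version in use.

```python
import json

def explore_request(index, seed_query, start_field, connected_field):
    """Build the path and body for a graph explore call (nothing is sent)."""
    path = f"/{index}/_graph/explore"
    body = {
        "query": seed_query,                      # documents to start from
        "vertices": [{"field": start_field}],     # terms that become nodes
        "connections": {                          # terms to hop to next
            "vertices": [{"field": connected_field}]
        },
    }
    return path, body

# Hypothetical index and field names, for illustration only.
path, body = explore_request(
    "emails",
    {"match": {"from": "susan.gardner@example.com"}},
    "from",
    "to",
)
print(path)
print(json.dumps(body, indent=2))
```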
• Available from Solr 6.0
• experimental feature
• currently works only for single-node, single-core
setups (expected to change)
• no 1st party visualization
• does not track edges of the traversal
Solr
built-in GraphQueryParser
picture from: http://solr.pl/2016/04/25/wizualizacja-grafow-przy-pomocy-solr-6/
• Available from Solr 5.5
• experimental feature
• no 1st party visualization
• does track edges of the traversal and level
Solr
built-in Streaming Expressions
picture from: http://solr.pl/2016/04/25/wizualizacja-grafow-przy-pomocy-solr-6/
fq={!graph from=email to=friends maxDepth=2}email:"susan.gardner@example.com"
Params
traversalFilter
Filter query that limits which incoming nodes are followed on each iteration
Params
returnRoot
Whether the root set of documents (found by the initial query) should be returned. Default: true
Params
returnOnlyLeaf
Whether only leaf documents should be returned. Default: false
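The parameters above all go into the local-params prefix of the graph query. A small helper, sketched here under the assumption that none of the values contain double quotes, shows how the pieces combine into one filter query string. The helper name `graph_filter` is made up for this example; only the `{!graph ...}` parameter names come from the slides.

```python
from urllib.parse import urlencode

def graph_filter(from_field, to_field, root_query, max_depth=None,
                 traversal_filter=None, return_root=True,
                 return_only_leaf=False):
    """Assemble a {!graph ...} filter query; defaults are left implicit."""
    parts = [f"from={from_field}", f"to={to_field}"]
    if max_depth is not None:
        parts.append(f"maxDepth={max_depth}")
    if traversal_filter is not None:
        # Quoted so that filters containing spaces stay one value.
        parts.append(f'traversalFilter="{traversal_filter}"')
    if not return_root:
        parts.append("returnRoot=false")
    if return_only_leaf:
        parts.append("returnOnlyLeaf=true")
    return "{!graph " + " ".join(parts) + "}" + root_query

fq = graph_filter("email", "friends",
                  'email:"susan.gardner@example.com"', max_depth=2)
# URL-encoded request parameters, ready to append to a /select URL.
params = urlencode({"q": "*:*", "fq": fq})
print(fq)
print(params)
```

With only `max_depth` set, the helper reproduces the filter query shown on the earlier slide.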
Streaming Expressions
• New alternative way of creating and processing queries
• allows chaining functions
• also experimental
• graph functions - shortestPath, gatherNodes, scoreNodes
Streaming Expressions
example
shortestPath
• one of the source functions - a function that produces a tuple stream
• returns the shortest path between two given nodes using an iterative breadth-first search of the graph
shortestPath - params
• collection - collection in which to perform the search
• from - starting node
• to - ending node
• edge - definition of an edge, in the format <from_field>=<to_field>
• fq - filter query that restricts which nodes are taken into account
• maxDepth - maximum depth of the traversal
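To make the breadth-first idea concrete, here is a minimal in-memory sketch in Python. It is not Solr's implementation: the graph is a plain dictionary rather than an index, and `max_depth` is interpreted as the maximum number of edges on the path, which is an assumption of this example.

```python
from collections import deque

def shortest_path(edges, start, goal, max_depth):
    """Breadth-first search; returns the node list or None if unreachable."""
    if start == goal:
        return [start]
    parents = {start: None}           # also serves as the visited set
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth >= max_depth:
            continue                  # do not expand beyond the depth limit
        for nxt in edges.get(node, ()):
            if nxt in parents:
                continue
            parents[nxt] = node
            if nxt == goal:
                # Walk parent links back to the start, then reverse.
                path = [nxt]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            frontier.append((nxt, depth + 1))
    return None

# Hypothetical "who mailed whom" graph.
EDGES = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
print(shortest_path(EDGES, "a", "e", max_depth=5))
print(shortest_path(EDGES, "a", "e", max_depth=2))
```

Because the search expands level by level, the first time the goal is reached is guaranteed to be along a path with the fewest edges.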
gatherNodes
• transforms an input document stream into a stream of the documents
reachable through graph traversal
• can return edges
• allows nesting functions
• works for multi-collection streams, regardless of the number of cluster nodes
• is also a source function
• currently does not support multivalued fields
gatherNodes - params
• collection - collection on which the function will be performed
• walk - defines the starting nodes and the field to walk, e.g. "zpapierski@atlassian.com->from"
• gather - defines which fields are gathered
• scatter - parameter that accepts one or both of these values:
• leaves - emits only leaf nodes (the outermost ones)
• branches - emits nodes leading up to the leaves (the root node is a branch)
• fq - filter query that filters out nodes
• maxDocFreq - nodes that occur in the index more often than this number are filtered out of the result
Aggregations, cross-collection gathering and combining with other streaming expressions
are possible
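The parameters above can be combined into an expression string, including the nested form where one gatherNodes call feeds another. The helper below is a hypothetical convenience for this example, and the collection name `emails` and the field `to` are assumed; the `node->from` walk in the outer call follows the convention of walking from the `node` field emitted by the inner call.

```python
def gather_nodes(collection, walk, gather, inner=None, scatter=None,
                 fq=None, max_doc_freq=None):
    """Format a gatherNodes streaming expression as a string."""
    args = [collection]
    if inner is not None:
        args.append(inner)            # nested expression is passed unquoted
    args.append(f'walk="{walk}"')
    args.append(f'gather="{gather}"')
    if scatter:
        joined = ", ".join(scatter)
        args.append(f'scatter="{joined}"')
    if fq is not None:
        args.append(f'fq="{fq}"')
    if max_doc_freq is not None:
        args.append(f'maxDocFreq="{max_doc_freq}"')
    return "gatherNodes(" + ", ".join(args) + ")"

# One hop: everyone the starting address has written to.
first_hop = gather_nodes("emails", "zpapierski@atlassian.com->from", "to")
# Two hops: feed the first result in and emit branches as well as leaves.
second_hop = gather_nodes("emails", "node->from", "to",
                          inner=first_hop, scatter=["branches", "leaves"])
print(first_hop)
print(second_hop)
```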
scoreNodes
• Function used only with the output of gatherNodes
• Scores document relevancy using a TF-IDF formula
• TF is how often the document appeared during the graph traversal
• IDF is fetched from the document's original collection
• Adds an additional field, nodeScore, to the output stream
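The scoring idea can be sketched as follows. The formula used here, count times (log(N / (df + 1)) + 1), is a common textbook TF-IDF variant chosen for illustration and is not claimed to be Solr's exact computation. The input field name `count` and the sample numbers are likewise assumptions.

```python
import math

def score_nodes(nodes, doc_freq, num_docs):
    """Attach a nodeScore to each gathered node and sort by it."""
    scored = []
    for node in nodes:
        tf = node["count"]                         # traversal hit count
        df = doc_freq[node["node"]]                # frequency in collection
        idf = math.log(num_docs / (df + 1)) + 1.0  # assumed IDF variant
        scored.append({**node, "nodeScore": tf * idf})
    return sorted(scored, key=lambda n: n["nodeScore"], reverse=True)

# Hypothetical traversal output and collection statistics.
NODES = [{"node": "popular", "count": 3}, {"node": "rare", "count": 2}]
DOC_FREQ = {"popular": 999, "rare": 9}
SCORED = score_nodes(NODES, DOC_FREQ, num_docs=1000)
for row in SCORED:
    print(row["node"], round(row["nodeScore"], 3))
```

The node seen less often in the traversal still ranks first, because it is much rarer in the collection as a whole.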
Thank you!
