Solr 6.0 Graph Query Overview
Kevin Watters KMW Technology
kwatters@kmwllc.com
http://www.kmwllc.com/
03/29/2016
KMW Technology Overview
 Boston-based software consulting and
professional services organization.
 Founded in 2010.
 Seven consultants with deep industry
experience.
 Boutique firm specializing in Search
and Big Data technologies.
 Custom Connectors, Pipelines,
Search, Analytics, and UI
development.
Search, Join, vs Graph
 Which query should I use?
 Search is for flat data, no relationships
◦ Data is often de-normalized, so updates can
require re-indexing large amounts of data.
 Join is for one level of relationships
◦ Data is normalized, but when more than two tables
are involved, join queries must be nested.
 Graph is for arbitrary depth/levels of
relationships.
◦ Data can be completely normalized, arbitrary
numbers of tables can be joined together.
 A one level hop on a graph is roughly
equivalent to a join query.
What is a Graph?
A generic representation of all data
models.
“One data model to rule them all”!
G = <V,E> ?!?!
 Vertices/Nodes
◦ Can have properties as key value pairs.
 Edges
◦ Can have properties as key value pairs
Graph Traversal
There are many graph traversal /
exploration algorithms. DFS, BFS, A*,
Alpha–beta, etc…
Solr graph query implements breadth-first
search (BFS). Each hop expands the
“frontier” of the graph: all edges leaving
the current frontier are explored in a
single step, also known as a “hop”.
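The hop-by-hop expansion described above can be sketched in a few lines of Python. This is a toy in-memory illustration, not Solr's implementation: the `edges` dictionary, the `bfs_hops` name, and the sample graph are all assumptions made for the example. The `visited` set plays the role of cycle detection.

```python
from typing import Dict, List, Set


def bfs_hops(edges: Dict[str, List[str]], roots: Set[str], max_depth: int = -1) -> List[Set[str]]:
    """Return the set of nodes discovered at each hop, starting with the roots."""
    visited = set(roots)      # nodes already collected; also prevents cycling
    frontier = set(roots)     # nodes whose edges are followed in the next hop
    levels = [set(roots)]
    depth = 0
    while frontier and (max_depth < 0 or depth < max_depth):
        # One hop: follow every edge leaving the current frontier.
        next_frontier = set()
        for node in frontier:
            for target in edges.get(node, []):
                if target not in visited:
                    next_frontier.add(target)
        if not next_frontier:
            break
        visited |= next_frontier
        levels.append(next_frontier)
        frontier = next_frontier
        depth += 1
    return levels


# Hypothetical graph; the edge d -> a closes a cycle.
toy_edges = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}
print(bfs_hops(toy_edges, {"a"}))
print(bfs_hops(toy_edges, {"a"}, max_depth=1))
```

With `max_depth=1` only the direct neighbours of the root are collected, which mirrors the "one hop is roughly a join" remark earlier in the deck.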
Key Features and Design Goals
“Graph is a Filter on top of your data”
-someone
 Designed for large scale: large numbers of
edges and very deep traversals.
 Limited memory usage for traversal
 Cycle detection for “free”
 Highly cacheable
 Support multiValued fields for nodes and/or
edges
 Support filters during the traversal
 Follow Every Edge! No edge left behind!
 Works with Facets & Facet Queries!
A Word about Memory Usage
 One bit set to rule them all!
 BitSet provides cycle detection implicitly.
(Have I been here before?)
 The BitSet has one bit per document in the index.
 100 Million doc index only uses about 12
MB per query! (Same size as 1 filter
cache entry!)
 Additional bitsets may be used during
query execution depending on query
params. (leaf nodes and root nodes
bitsets)
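The "about 12 MB" figure follows from one bit per document. A quick check in Python, assuming a plain bit set with no per-object overhead (the function name is made up for this example):

```python
def bitset_bytes(num_docs: int) -> int:
    """Bytes needed for a bit set holding one bit per document, rounded up."""
    return (num_docs + 7) // 8


size = bitset_bytes(100_000_000)
print(size)               # 12500000 bytes
print(size / 1_000_000)   # 12.5 (decimal megabytes)
```

Each additional bit set used during execution (for example for leaf or root nodes) would add roughly the same amount again.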
Graph Query Parser Syntax
Parameter        Default  Description
from                      Field containing the node id.
to                        Field containing the edge id(s).
maxDepth         -1       The number of hops to traverse from the root of the graph.
                          -1 means traverse until all edges and documents have been
                          collected. maxDepth=1 behaves similarly to a JOIN.
traversalFilter  null     Arbitrary query string to apply at each hop of the traversal.
returnRoot       true     true|false: whether the documents matching the root query
                          should be returned.
returnOnlyLeaf   false    true|false: return only documents in the result set that do
                          not have a value in the “to” field.
useAutn          true     Performance trade-off based on use case. Mileage may vary.
Uses Solr’s query parser plugin and “local params” syntax:
{!graph param="value" … }
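As a rough illustration of the local params syntax, the following Python helper assembles a `{!graph ...}` query string from the parameters in the table. The helper name `graph_query`, the root query `id:doc_1`, and the field names are assumptions for the example; the sketch does no escaping, so values containing double quotes would need extra care.

```python
def graph_query(root_query: str, from_field: str, to_field: str, **options) -> str:
    """Build a {!graph ...} local-params query string around a root query."""
    parts = [f'from="{from_field}"', f'to="{to_field}"']
    for name, value in options.items():
        if isinstance(value, bool):
            value = "true" if value else "false"   # lowercase boolean literals
        parts.append(f'{name}="{value}"')
    return "{!graph " + " ".join(parts) + "}" + root_query


# Hypothetical field names and root query.
q = graph_query("id:doc_1", "node_id", "edge_ids", maxDepth=2, returnRoot=False)
print(q)
# {!graph from="node_id" to="edge_ids" maxDepth="2" returnRoot="false"}id:doc_1
```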
Princeton Wordnet
Princeton Wordnet has an ontology for many of the
words in the English language. Its relationships
form hierarchies of words that run from more
general to more specific classes.
 https://wordnet.princeton.edu/
 Words have a “sense”, or meaning.
 Hypernym is a more general related word.
 Hyponym is a more specific related word.
◦ Jaguar is a type of Cat
◦ Large Cat is a type of Animal
 Intersections of this hierarchy can answer
questions: “Is a jaguar an animal?”
Wordnet Hypernym Traversal
Start traversing from the word sense “jaguar” up the hypernym graph 9 levels.
+{!graph from="synset_id" to="hypernym_id" maxDepth=9}sense_lemma:jaguar
Wordnet Graph Intersections
Is a jaguar an animal? Query for an
intersection between the two graphs.
If a graph intersection exists, the answer is yes!
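One possible way to express such an intersection is to run the hypernym traversal as the main query and add the target word as a filter query, then check whether anything matched. The sketch below only builds the request URL; the host, the collection name `wordnet`, and the `sense_lemma:animal` filter are assumptions, and the field names are taken from the traversal query shown earlier.

```python
from urllib.parse import urlencode

params = {
    # Traverse up the hypernym graph from "jaguar" (no depth limit).
    "q": '{!graph from="synset_id" to="hypernym_id"}sense_lemma:jaguar',
    # Keep only traversed documents that are also a sense of "animal".
    "fq": "sense_lemma:animal",
    "rows": 0,   # only the hit count is needed
}
# Hypothetical Solr host and collection.
url = "http://localhost:8983/solr/wordnet/select?" + urlencode(params)
print(url)


def is_a(response: dict) -> bool:
    """True if the parsed JSON response reports at least one hit."""
    return response["response"]["numFound"] > 0
```

If `numFound` is greater than zero, the two sets intersect and the answer is yes.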
OpenCV, Video Recognition
 Imagine indexing each frame of video
from security cameras. Pass each
frame of video through OpenCV for
object recognition & face recognition.
 Each frame record stores its own frame
number and that of the previous frame.
 Search for object/face “A” detected,
followed by object/face “B” detected,
across all of your video streams.
Users, Items and Actions
 Model your browsing/purchase history as
◦ Users (have an ID)
◦ Items (have an ID, metadata, category, etc)
◦ Actions (link between user and Items, such
as rating, purchase, like/dislike)
User -> Action -> Item -> Action -> User …
Use graph + maxDepth to control how far to
traverse. maxDepth=2 gets from a user to an
item; maxDepth=4 gets from one user to a new
set of users, and so on.
Actions occur over time
 These events can’t easily be
aggregated or flattened onto a record.
 Model this as a “person” record, with a
set of “action” records.
 Each action record has the id of the
“previous” action.
 Search for an action, graph traverse
based on person id to another action,
then finally to the person record.
Find similar users
 Graph traversal from a user (or set of
users) through their actions to items
they like, to find similar users, and out
to items they like.
 Now, exclude the original starting set
 “returnRoot=false”
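The user → action → item → action → user chain can be tried out on a toy data set. This is an in-memory simulation under stated assumptions, not Solr code: the document ids, the `type` field, and the choice to store both the item and the user on each action record are invented for the example, and `return_root` only imitates the effect of returnRoot.

```python
# Hypothetical documents: the dict key is the node id, edge_ids holds outgoing edges.
docs = {
    "u1": {"type": "user", "edge_ids": ["a1"]},
    "u2": {"type": "user", "edge_ids": ["a2"]},
    "a1": {"type": "action", "edge_ids": ["i1", "u1"]},
    "a2": {"type": "action", "edge_ids": ["i1", "u2"]},
    "i1": {"type": "item", "edge_ids": ["a1", "a2"]},
}


def traverse(root_ids, max_depth, return_root=True):
    """Collect every document reachable within max_depth hops of the roots."""
    visited = set(root_ids)
    frontier = set(root_ids)
    for _ in range(max_depth):
        # Gather the edge values of the frontier, then keep unseen documents.
        targets = {e for doc_id in frontier for e in docs[doc_id]["edge_ids"]}
        frontier = {t for t in targets if t in docs and t not in visited}
        if not frontier:
            break
        visited |= frontier
    return visited if return_root else visited - set(root_ids)


def of_type(ids, doc_type):
    """Sorted ids of the given document type."""
    return sorted(i for i in ids if docs[i]["type"] == doc_type)


print(of_type(traverse({"u1"}, max_depth=2), "item"))                      # ['i1']
print(of_type(traverse({"u1"}, max_depth=4, return_root=False), "user"))  # ['u2']
```

Two hops reach the item the starting user acted on; four hops reach the other user who acted on the same item, with the starting user excluded.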
Graph Query For Security
 Graph queries are elegant and simple
to use for traversing security
hierarchies such as LDAP and AD
 Custom security models that are
hierarchical or folder based in nature.
Example Company with Security Model
Document/Security Model within the Solr
Index
Graph Traversal for User 1
Graph Traversal for User 2
Security Query
 Single security query term to traverse the entire graph
{!graph from="node_id" to="edge_ids" returnOnlyLeaf="true"}id:user_1
 The query is applied as a filter query on the request; the
normal query is used for filtering against documents.
FoaF
 Friend of a Friend of a Friend of a Friend…
 2 ways to model in the index.
 Multi-valued “friendid” field that points to other
person records.
◦ More efficient and faster search.
◦ Filter traversal based on metadata on the person
record.
 Single-valued field on a document that
represents the link/edge between two person
records.
◦ More flexible, but slower search.
◦ Can filter edges with metadata about the edge
record.
Graph Analytics via Faceting
What do my friends’ friends who live in
Boston like?
 Use a graph query to identify the set of
person records.
 Use facets to generate analytics on the result
set based on the values in the person record
“like” field.
 Use drill down to understand characteristics
of different demographics/cohorts.
 Get counts at various levels using maxDepth
graph queries as facet queries.
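Counts per hop level could be requested by sending one facet query per maxDepth value. The sketch below only builds the request parameters; the helper name, the root query `id:person_1`, and the use of `id` as the node field are assumptions, while `friendid` is the multi-valued field from the FoaF slide.

```python
from urllib.parse import urlencode


def hop_facet_params(root_query, from_field, to_field, depths):
    """Request parameters with one facet.query per traversal depth."""
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    for depth in depths:
        graph = f'{{!graph from="{from_field}" to="{to_field}" maxDepth={depth}}}{root_query}'
        params.append(("facet.query", graph))
    return params


# Hypothetical root person and node id field.
params = hop_facet_params("id:person_1", "id", "friendid", [1, 2, 3])
for name, value in params:
    print(name, "=", value)
print(urlencode(params))
```

Each facet query count then covers the people reachable within that many hops, so differences between consecutive counts give the size of each new level.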
What next?
 Edge weights & Relevancy
◦ Based on tf/idf or bm25?
◦ Based on numerical field values (min/max/sum/avg
weight application)?
 Min distance computation
 Better support for D3.js and other Visualization
tools
 Driving directions?
 Distributed Traversal via Kafka frontier query
broker
 SparkRDD Support? GraphX?
 minDepth parameter? Only return records that
are at least N hops away?
Additional Detail
 Graph Query Solr tickets:
https://issues.apache.org/jira/browse/SOLR-7543
https://issues.apache.org/jira/browse/SOLR-8632
https://issues.apache.org/jira/browse/SOLR-8176
Questions?
info@kmwllc.com
http://www.kmwllc.com/

Weitere ähnliche Inhalte

Was ist angesagt?

A Multifaceted Look At Faceting - Ted Sullivan, Lucidworks
A Multifaceted Look At Faceting - Ted Sullivan, LucidworksA Multifaceted Look At Faceting - Ted Sullivan, Lucidworks
A Multifaceted Look At Faceting - Ted Sullivan, LucidworksLucidworks
 
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent EngineLeveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent EngineTrey Grainger
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrTrey Grainger
 
Self-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrSelf-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrTrey Grainger
 
Reflected intelligence evolving self-learning data systems
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systemsTrey Grainger
 
Building Search & Recommendation Engines
Building Search & Recommendation EnginesBuilding Search & Recommendation Engines
Building Search & Recommendation EnginesTrey Grainger
 
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T...
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T...Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T...
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T...Lucidworks
 
Intent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval SystemsIntent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval SystemsTrey Grainger
 
The Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemTrey Grainger
 
Semantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrSemantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrTrey Grainger
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Trey Grainger
 
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Lucidworks
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchlucenerevolution
 
Reflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemReflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemTrey Grainger
 
Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...
 Click-through relevance ranking in solr &  lucid works enterprise - By Andrz... Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...
Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...lucenerevolution
 
Thought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered SearchThought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered SearchTrey Grainger
 
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...Lucidworks
 
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...Trey Grainger
 
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Lucidworks
 

Was ist angesagt? (20)

A Multifaceted Look At Faceting - Ted Sullivan, Lucidworks
A Multifaceted Look At Faceting - Ted Sullivan, LucidworksA Multifaceted Look At Faceting - Ted Sullivan, Lucidworks
A Multifaceted Look At Faceting - Ted Sullivan, Lucidworks
 
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent EngineLeveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solr
 
Self-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrSelf-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache Solr
 
Reflected intelligence evolving self-learning data systems
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systems
 
Building Search & Recommendation Engines
Building Search & Recommendation EnginesBuilding Search & Recommendation Engines
Building Search & Recommendation Engines
 
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T...
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T...Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T...
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine: Presented by T...
 
Intent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval SystemsIntent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval Systems
 
The Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data Ecosystem
 
Semantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrSemantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/Solr
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...
 
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic search
 
Vespa, A Tour
Vespa, A TourVespa, A Tour
Vespa, A Tour
 
Reflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemReflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data system
 
Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...
 Click-through relevance ranking in solr &  lucid works enterprise - By Andrz... Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...
Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...
 
Thought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered SearchThought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered Search
 
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
 
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
 
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
 

Andere mochten auch

The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge GraphTrey Grainger
 
South Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis PanelSouth Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis PanelTrey Grainger
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development TutorialErik Hatcher
 
motores de combustión interna
motores de combustión internamotores de combustión interna
motores de combustión internaJosue Ariza
 
Engines - Campus Party CMX3
Engines - Campus Party CMX3Engines - Campus Party CMX3
Engines - Campus Party CMX3mikealebrije
 
20150627 bigdatala
20150627 bigdatala20150627 bigdatala
20150627 bigdatalagethue
 
SF Solr Meetup - Interactively Search and Visualize Your Big Data
SF Solr Meetup - Interactively Search and Visualize Your Big DataSF Solr Meetup - Interactively Search and Visualize Your Big Data
SF Solr Meetup - Interactively Search and Visualize Your Big Datagethue
 
Exploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data MiningExploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data MiningHeiko Paulheim
 
Working with deeply nested documents in Apache Solr
Working with deeply nested documents in Apache SolrWorking with deeply nested documents in Apache Solr
Working with deeply nested documents in Apache SolrAnshum Gupta
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Lucidworks
 
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, LucidworksVisualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, LucidworksLucidworks
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Alexandre Rafalovitch
 
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon
 

Andere mochten auch (19)

The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge Graph
 
South Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis PanelSouth Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis Panel
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development Tutorial
 
Serarch engines y Solr, JUG Argentina
Serarch engines y Solr, JUG ArgentinaSerarch engines y Solr, JUG Argentina
Serarch engines y Solr, JUG Argentina
 
motores de combustión interna
motores de combustión internamotores de combustión interna
motores de combustión interna
 
Solr Architecture
Solr ArchitectureSolr Architecture
Solr Architecture
 
Engines - Campus Party CMX3
Engines - Campus Party CMX3Engines - Campus Party CMX3
Engines - Campus Party CMX3
 
20150627 bigdatala
20150627 bigdatala20150627 bigdatala
20150627 bigdatala
 
SF Solr Meetup - Interactively Search and Visualize Your Big Data
SF Solr Meetup - Interactively Search and Visualize Your Big DataSF Solr Meetup - Interactively Search and Visualize Your Big Data
SF Solr Meetup - Interactively Search and Visualize Your Big Data
 
Introduction to solr
Introduction to solrIntroduction to solr
Introduction to solr
 
Exploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data MiningExploiting Linked Open Data as Background Knowledge in Data Mining
Exploiting Linked Open Data as Background Knowledge in Data Mining
 
Conferencia 2: El esquema
Conferencia 2: El esquemaConferencia 2: El esquema
Conferencia 2: El esquema
 
Working with deeply nested documents in Apache Solr
Working with deeply nested documents in Apache SolrWorking with deeply nested documents in Apache Solr
Working with deeply nested documents in Apache Solr
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
 
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, LucidworksVisualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
 
Formación apache Solr
Formación apache SolrFormación apache Solr
Formación apache Solr
 
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
 

Ähnlich wie Solr 6.0 Graph Query Overview

How Graph Databases used in Police Department?
How Graph Databases used in Police Department?How Graph Databases used in Police Department?
How Graph Databases used in Police Department?Samet KILICTAS
 
The Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphThe Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphTrey Grainger
 
The Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge GraphThe Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge GraphTrey Grainger
 
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital.AI
 
Graph Databases and Graph Data Science in Neo4j
Graph Databases and Graph Data Science in Neo4jGraph Databases and Graph Data Science in Neo4j
Graph Databases and Graph Data Science in Neo4jijtsrd
 
Fishing Graphs in a Hadoop Data Lake
Fishing Graphs in a Hadoop Data LakeFishing Graphs in a Hadoop Data Lake
Fishing Graphs in a Hadoop Data LakeArangoDB Database
 
Discovering User's Topics of Interest in Recommender Systems
Discovering User's Topics of Interest in Recommender SystemsDiscovering User's Topics of Interest in Recommender Systems
Discovering User's Topics of Interest in Recommender SystemsGabriel Moreira
 
Data Structure Graph DMZ #DMZone
Data Structure Graph DMZ #DMZoneData Structure Graph DMZ #DMZone
Data Structure Graph DMZ #DMZoneDoug Needham
 
136 latest dot net interview questions
136  latest dot net interview questions136  latest dot net interview questions
136 latest dot net interview questionssandi4204
 
Azure Databricks for Data Scientists
Azure Databricks for Data ScientistsAzure Databricks for Data Scientists
Azure Databricks for Data ScientistsRichard Garris
 
Coping with Data Variety in the Big Data Era: The Semantic Computing Approach
Coping with Data Variety in the Big Data Era: The Semantic Computing ApproachCoping with Data Variety in the Big Data Era: The Semantic Computing Approach
Coping with Data Variety in the Big Data Era: The Semantic Computing ApproachAndre Freitas
 
Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big...
Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big...Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big...
Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big...Big Data Spain
 
Family tree of data – provenance and neo4j
Family tree of data – provenance and neo4jFamily tree of data – provenance and neo4j
Family tree of data – provenance and neo4jM. David Allen
 
About elasticsearch
About elasticsearchAbout elasticsearch
About elasticsearchMinsoo Jun
 
GraphGen: Conducting Graph Analytics over Relational Databases
GraphGen: Conducting Graph Analytics over Relational DatabasesGraphGen: Conducting Graph Analytics over Relational Databases
GraphGen: Conducting Graph Analytics over Relational DatabasesKonstantinos Xirogiannopoulos
 
GraphGen: Conducting Graph Analytics over Relational Databases
GraphGen: Conducting Graph Analytics over Relational DatabasesGraphGen: Conducting Graph Analytics over Relational Databases
GraphGen: Conducting Graph Analytics over Relational DatabasesPyData
 
Natural Language Processing & Semantic Models in an Imperfect World
Natural Language Processing & Semantic Modelsin an Imperfect WorldNatural Language Processing & Semantic Modelsin an Imperfect World
Natural Language Processing & Semantic Models in an Imperfect WorldVital.AI
 
Graph Gurus 23: Best Practices To Model Your Data Using A Graph Database
Graph Gurus 23: Best Practices To Model Your Data Using A Graph DatabaseGraph Gurus 23: Best Practices To Model Your Data Using A Graph Database
Graph Gurus 23: Best Practices To Model Your Data Using A Graph DatabaseTigerGraph
 
Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...
Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...
Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...Gabriel Moreira
 

Ähnlich wie Solr 6.0 Graph Query Overview (20)

How Graph Databases used in Police Department?
How Graph Databases used in Police Department?How Graph Databases used in Police Department?
How Graph Databases used in Police Department?
 
The Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphThe Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge Graph
 
The Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge GraphThe Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge Graph
 
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
 
Graph Databases and Graph Data Science in Neo4j
Graph Databases and Graph Data Science in Neo4jGraph Databases and Graph Data Science in Neo4j
Graph Databases and Graph Data Science in Neo4j
 
Fishing Graphs in a Hadoop Data Lake
Fishing Graphs in a Hadoop Data LakeFishing Graphs in a Hadoop Data Lake
Fishing Graphs in a Hadoop Data Lake
 
Discovering User's Topics of Interest in Recommender Systems
Discovering User's Topics of Interest in Recommender SystemsDiscovering User's Topics of Interest in Recommender Systems
Solr 6.0 Graph Query Overview

  • 1. Solr 6.0 Graph Query Overview Kevin Watters KMW Technology kwatters@kmwllc.com http://www.kmwllc.com/ 03/29/2016
  • 2. KMW Technology Overview  Boston based software consulting and professional services organization.  Founded in 2010.  Seven consultants with deep industry experience.  Boutique firm specializing in Search and Big Data technologies.  Custom Connectors, Pipelines, Search, Analytics, and UI development.
  • 3. Search, Join, vs Graph  Which query should I use?  Search is for flat data, no relationships ◦ Data often de-normalized, updates require large amounts of re-indexing potentially.  Join is for one level of relationships ◦ Data is normalized, but for more than 2 tables involved, join queries must be nested.  Graph is for arbitrary depth/levels of relationships. ◦ Data can be completely normalized, arbitrary numbers of tables can be joined together.  A one level hop on a graph is roughly equivalent to a join query.
  • 4. What is a Graph? A generic representation of all data models. “One data model to rule them all”! G = <V,E> ?!?!  Vertices/Nodes ◦ Can have properties as key value pairs.  Edges ◦ Can have properties as key value pairs
  • 5. Graph Traversal There are many graph traversal / exploration algorithms. DFS, BFS, A*, Alpha–beta, etc… Solr graph query implements “BFS” Breadth-first search, each hop expands the “Frontier” of the graph. It explores all current edges in a single step, also known as a “hop”
  • 6. Key Features and Design Goals “Graph is a Filter on top of your data” -someone  Designed for large scale and large number of edges and very deep traversals.  Limited memory usage for traversal  Cycle detection for “free”  Highly cacheable  Support multiValued fields for nodes and/or edges  Support filters during the traversal  Follow Every Edge! No edge left behind!  Works with Facets & Facet Queries!
  • 7. A Word about Memory Usage  One bit set to rule them all!  BitSet provides cycle detection implicitly. (Have I been here before?)  BitSet is equal to the size of the index.  100 Million doc index only uses about 12 MB per query! (Same size as 1 filter cache entry!)  Additional bitsets may be used during query execution depending on query params. (leaf nodes and root nodes bitsets)
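A minimal sketch of the breadth-first traversal and the single visited bitset described in slides 5 and 7. This is an in-memory Python illustration, not Solr's implementation: the `docs` structure, the `graph_bfs` and `bitset_bytes` names, and the `node_id`/`edge_ids` field names are assumptions made for the example.

```python
from collections import defaultdict


def bitset_bytes(num_docs):
    """Bytes needed for one bit per document in the index."""
    return (num_docs + 7) // 8


def graph_bfs(docs, root_ids, max_depth=-1):
    """Return the doc numbers reachable from the roots, roots included.

    docs      -- list of dicts with a "node_id" and a list of "edge_ids"
                 (hypothetical field names for this sketch)
    root_ids  -- node_id values matched by the root query
    max_depth -- number of hops; -1 means traverse until nothing new is found
    """
    # Lookup from a node_id value to the doc numbers that carry it.
    by_node = defaultdict(list)
    for doc_num, doc in enumerate(docs):
        by_node[doc["node_id"]].append(doc_num)

    visited = 0  # a Python int used as a bitset: bit n set means doc n was seen
    frontier = []
    for doc_num, doc in enumerate(docs):
        if doc["node_id"] in root_ids:
            visited |= 1 << doc_num
            frontier.append(doc_num)

    depth = 0
    while frontier and (max_depth < 0 or depth < max_depth):
        # One hop: gather every edge value leaving the current frontier.
        edge_values = {e for n in frontier for e in docs[n]["edge_ids"]}
        next_frontier = []
        for value in edge_values:
            for doc_num in by_node.get(value, []):
                if not (visited >> doc_num) & 1:  # "Have I been here before?"
                    visited |= 1 << doc_num
                    next_frontier.append(doc_num)
        frontier = next_frontier
        depth += 1

    return [n for n in range(len(docs)) if (visited >> n) & 1]


docs = [
    {"node_id": "a", "edge_ids": ["b"]},
    {"node_id": "b", "edge_ids": ["c"]},
    {"node_id": "c", "edge_ids": ["a"]},  # closes a cycle back to "a"
]
print(graph_bfs(docs, {"a"}))               # the cycle does not cause a loop
print(graph_bfs(docs, {"a"}, max_depth=1))  # roots plus one hop
print(bitset_bytes(100_000_000))            # bytes for a 100 million doc index
```

Because a document is added to the next frontier only when its bit is not yet set, the cycle from "c" back to "a" ends the traversal instead of looping. One bit per document for 100 million documents is 12,500,000 bytes, which is the "about 12 MB" quoted on the slide.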
  • 8. Graph Query Parser Syntax  Uses Solr’s query parser plugin and “local params” syntax: {!graph param="value" … }  Parameters (default in parentheses where the slide gives one): ◦ from: field containing the node id. ◦ to: field containing the edge id(s). ◦ maxDepth (-1): the number of hops to traverse from the root of the graph. -1 means traverse until all edges and documents have been collected. maxDepth=1 behaves similarly to a JOIN. ◦ traversalFilter (null): arbitrary query string to apply at each hop of the traversal. ◦ returnRoot (true): true|false, whether the documents matching the root query should be returned. ◦ returnOnlyLeaf (false): true|false, return only documents in the result set that do not have a value in the “to” field. ◦ useAutn (true): performance trade-off based on use case. Mileage may vary.
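As an illustration of the local-params syntax above, the following sketch assembles a `{!graph ...}` query string in Python. The `graph_query` helper is hypothetical (it is not part of Solr or of any client library), and it assumes that string values contain no double quotes.

```python
def graph_query(root_query, from_field, to_field, **params):
    """Build a {!graph ...} local-params query string (hypothetical helper).

    root_query -- query that selects the starting documents
    from_field -- field containing the node id
    to_field   -- field containing the edge id(s)
    params     -- optional parameters such as maxDepth or returnRoot
    """
    parts = ['from="%s"' % from_field, 'to="%s"' % to_field]
    for name, value in params.items():
        if isinstance(value, bool):
            # Booleans are written as true/false.
            parts.append("%s=%s" % (name, "true" if value else "false"))
        elif isinstance(value, int):
            parts.append("%s=%d" % (name, value))
        else:
            text = str(value)
            if '"' in text:
                # Assumption of this sketch: no escaping of embedded quotes.
                raise ValueError("double quotes are not supported: %r" % text)
            parts.append('%s="%s"' % (name, text))
    return "{!graph " + " ".join(parts) + "}" + root_query


print(graph_query("sense_lemma:jaguar", "synset_id", "hypernym_id", maxDepth=9))
print(graph_query("id:user_1", "node_id", "edge_ids", returnRoot=False))
```

The first call reproduces the shape of the Wordnet query in this deck; the second shows a boolean parameter rendered as `false`.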
  • 9. Princeton Wordnet Princeton Wordnet has an ontology for many of the words in the English language. Its relationships form hierarchies of words that link each word to more general and more specific terms.  https://wordnet.princeton.edu/  Words have a “sense”, or meaning.  A hypernym is a more general related word.  A hyponym is a more specific related word. ◦ Jaguar is a type of Cat ◦ Large Cat is a type of Animal  Intersections of this hierarchy can answer questions: “Is a jaguar an animal?”
  • 10. Wordnet Hypernym Traversal Start traversing from the word sense “jaguar” up the hypernym graph 9 levels. +{!graph from="synset_id" to="hypernym_id" maxDepth=9}sense_lemma:jaguar
  • 11. Wordnet Graph Intersections Is a jaguar an animal? Query for an intersection between the two graphs. If a graph intersection exists, the answer is yes!
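The transcript does not include the intersection query itself, so the following is only one possible way to express it, sketched in Python. It assumes the `synset_id`, `hypernym_id` and `sense_lemma` fields from the previous slide, and it relies on Solr intersecting a main query (`q`) with a filter query (`fq`): one graph walks up from the specific word, the other walks down from the general word. The `intersection_params` helper is hypothetical.

```python
from urllib.parse import urlencode


def intersection_params(specific_word, general_word):
    """Request parameters whose result set is the overlap of two traversals."""
    # Walk up the hypernym graph from the specific word.
    up = '{!graph from="synset_id" to="hypernym_id"}sense_lemma:%s' % specific_word
    # Walk down from the general word by swapping the two fields.
    down = '{!graph from="hypernym_id" to="synset_id"}sense_lemma:%s' % general_word
    # rows=0: only the hit count matters, not the documents themselves.
    return {"q": up, "fq": down, "rows": 0}


params = intersection_params("jaguar", "animal")
print(urlencode(params))  # query string to append to a select request
```

If the response reports at least one matching document, the two graphs intersect and the answer is yes.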
  • 12. OpenCV, Video Recognition  Imagine indexing each frame of video from security cameras. Pass each frame of video through OpenCV for object recognition & face recognition.  Each frame record stores its own frame number and the frame number of the previous frame.  Search for object/face “A” detected, followed by object/face “B” detected, across all of your video streams.
  • 13. Users, Items and Actions  Model your browsing/purchase history as ◦ Users (have an ID) ◦ Items (have an ID, metadata, category, etc.) ◦ Actions (link between a User and an Item, such as rating, purchase, like/dislike) User -> Action -> Item -> Action -> User … Use Graph + maxDepth to get from a user to an item. maxDepth = 2 gets from a user to an Item. maxDepth = 4 gets from one user to a new set of users, and so on.
  • 14. Actions occur over time  These events can’t easily be aggregated or flattened onto a record.  Model this as a “person” record, with a set of “action” records.  Each action record has the id of the “previous” action.  Search for an action, graph traverse based on person id to another action, then finally to the person record.
  • 15. Find similar users  Graph traversal from a user (or set of users) through their actions to items they like, to find similar users, and out to items they like.  Now, exclude the original starting set  “returnRoot=false”
  • 16. Graph Query For Security  Graph queries are elegant and simple to use for traversing security hierarchies such as LDAP and AD  Custom security models that are hierarchical or folder based in nature.
  • 17. Example Company with Security Model
  • 21. Security Query  Single security query term to traverse the entire graph: {!graph from="node_id" to="edge_ids" returnOnlyLeaf="true"}id:user_1  The query is applied as a FilterQuery on the request; the normal query is used for filtering against documents.
  • 22. FoaF  Friend of a Friend of a Friend of a Friend…  2 ways to model in the index.  Multi-valued “friendid” field that points to other person records. ◦ More efficient and faster search. ◦ Filter traversal based on metadata on the person record.  Single-valued field on a document that represents the link/edge between two person records. ◦ More flexible, but slower search. ◦ Can filter edges with metadata about the edge record.
  • 23. Graph Analytics via Faceting What do my friends’ friends who live in Boston like?  Use a graph query to identify the dataset of people records.  Use facets to generate analytics on the result set based on the values in the person record “like” field.  Use drill down to understand characteristics of different demographics/cohorts.  Get counts at various levels using maxDepth graph queries as facet queries.
  • 24. What next?  Edge weights & Relevancy ◦ Based on tf/idf or bm25? ◦ Based on numerical field values (min/max/sum/avg weight application)?  Min distance computation  Better support for D3.js and other Visualization tools  Driving directions?  Distributed Traversal via Kafka frontier query broker  SparkRDD Support? GraphX?  minDepth parameter? Only return records that are at least N hops away?
  • 25. Additional Detail  Graph Query Solr tickets: https://issues.apache.org/jira/browse/SOLR-7543 https://issues.apache.org/jira/browse/SOLR-8632 https://issues.apache.org/jira/browse/SOLR-8176 Questions? info@kmwllc.com http://www.kmwllc.com/