Recorded webinar: neotechnology.com/webinar-five-graphs-love
The iDating industry cares about interactions and connections. Those two concepts are closely linked. If someone has a connection to another person, through a shared friend or a shared interest, they are much more likely to interact. Graph databases are optimized for querying connections between people, things, interests, or really anything that can be connected.
Dating sites and apps worldwide have begun to use graph databases to gain a competitive edge. For highly connected data, Neo4j can deliver up to thousand-fold query performance improvements and greater agility than relational databases, opening up new kinds of insight. Amanda Laucher discusses the five graphs of love and how companies like eHarmony, Hinge, and AreYouInterested.com are now using graph algorithms to create more interactions and connections.
8. The 5 Graphs of Love
• The Friends-of-Friends Graph
• The Passion Graph
• The Location Graph
• The Safety Graph
• The Poser Graph
9. Meet Jeremy...
๏ from: California
๏ appearance: very handsome
๏ personality: super friendly nerd
๏ interests: piano, coding
[Diagram: Jeremy]
10. Jeremy has some friends
๏ Kerstin: his sister
๏ Peter: his buddy
๏ Andreas: his coworker
[Diagram: Peter, Andreas, Jeremy, Kerstin]
11. His friends introduced more friends
๏ Michael: master hacker, divorced, 2 kids
๏ Johan: technology sage, likes fast cars
๏ Madelene: polyglot journalist, loves dogs
๏ Allison: marketing maven, likes long walks on the beach
[Diagram: Michael, Peter, Andreas, Johan, Jeremy, Madelene, Allison, Kerstin]
12. So, we have a bunch of people
๏ how do we know they are friends?
๏ either ask each pair: are you friends?
๏ or, we can add explicit connections
๏ Twitter, Facebook, LinkedIn, etc.
[Diagram: Michael, Peter, Andreas, Johan, Jeremy, Madelene, Allison, Kerstin]
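The two options on this slide can be sketched in a few lines of Python. This is only an illustration, not something shown in the webinar: the friendship pairs beyond Jeremy's three friends are assumptions, and `pairs_to_ask` and `are_friends` are hypothetical helper names.

```python
from itertools import combinations

people = ["Jeremy", "Kerstin", "Peter", "Andreas",
          "Michael", "Johan", "Madelene", "Allison"]


def pairs_to_ask(names):
    """Option 1: ask each pair "are you friends?" (n * (n - 1) / 2 questions)."""
    return list(combinations(names, 2))


# Option 2: store explicit connections. Each friendship is an unordered pair.
# Only the three pairs involving Jeremy come from the slides; the rest are
# assumed for illustration.
friendships = {
    frozenset(("Jeremy", "Kerstin")),
    frozenset(("Jeremy", "Peter")),
    frozenset(("Jeremy", "Andreas")),
    frozenset(("Peter", "Michael")),     # assumption
    frozenset(("Andreas", "Johan")),     # assumption
    frozenset(("Kerstin", "Madelene")),  # assumption
    frozenset(("Kerstin", "Allison")),   # assumption
}


def are_friends(a, b):
    """Look the pair up directly instead of asking everyone."""
    return frozenset((a, b)) in friendships


print(len(pairs_to_ask(people)))          # number of questions for option 1
print(are_friends("Kerstin", "Jeremy"))   # order of the pair does not matter
```

With explicit connections, answering "are these two friends?" is a single lookup, and the stored pairs are exactly the edges of the graph introduced on the next slide.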
13. This is really just data
๏ it's just a graph
[Diagram: Michael, Peter, Johan, Jeremy, Allison, Anna, Andreas, Madelene, Kerstin, Adam]
15. Yes, a graph...
๏ you know the common data structures
• linked lists, trees, object "graphs"
๏ a graph is the general purpose data structure
• suitable for any connected data
๏ well-understood patterns and algorithms
• studied since Leonhard Euler's Seven Bridges of Königsberg (1736)
• Codd's Relational Model (1970)
• not a new idea, just an idea whose time is now
16. How can you use this?
With a Graph Database
17. A graph database...
๏ optimized for the connections between records
๏ really, really fast at querying across records
๏ a database: transactional with the usual operations
๏ “A relational database may tell you the average age of everyone here, but a graph database will tell you who is most likely to buy you a beer later.”
20. According to SNAP Interactive, if you are a female user, you have a:
๏ 4% likelihood of interacting with a stranger
๏ 10% likelihood of interacting with a friend of a friend
๏ 7% likelihood of interacting with a 3rd-degree connection (friend of a friend of a friend)
๏ Connections mean a much larger number of interactions!
[Diagram: Michael, Peter, Johan, Jeremy, Allison, Anna, Andreas, Madelene, Jennifer, Adam]
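To apply these figures, an app first needs the degree of separation between two users. A minimal sketch using breadth-first search follows. The names come from the diagram, but every friendship edge is an assumption, as is treating anyone beyond the 3rd degree (or unreachable) as a stranger. `degrees_from` and `likelihood` are hypothetical helpers.

```python
from collections import deque

# Hypothetical undirected friendship graph (edges are assumed, not from the slides).
friends = {
    "Jeremy": {"Peter", "Andreas", "Jennifer"},
    "Peter": {"Jeremy", "Michael"},
    "Andreas": {"Jeremy", "Johan"},
    "Jennifer": {"Jeremy", "Madelene"},
    "Michael": {"Peter", "Anna"},
    "Johan": {"Andreas"},
    "Madelene": {"Jennifer", "Allison"},
    "Allison": {"Madelene"},
    "Anna": {"Michael"},
    "Adam": set(),
}

# Figures quoted on the slide for female users (SNAP Interactive).
INTERACTION_LIKELIHOOD = {2: 0.10, 3: 0.07}
STRANGER_LIKELIHOOD = 0.04


def degrees_from(start):
    """Breadth-first search: degree of separation from start to each reachable person."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for friend in friends[person]:
            if friend not in dist:
                dist[friend] = dist[person] + 1
                queue.append(friend)
    return dist


def likelihood(start, other):
    """Slide figure for the pair, or None where the slide quotes no figure."""
    degree = degrees_from(start).get(other)  # None if unreachable
    if degree in (0, 1):
        return None  # no figure is given for oneself or for direct friends
    # Assumption: 4th degree and beyond, or unreachable, counts as a stranger.
    return INTERACTION_LIKELIHOOD.get(degree, STRANGER_LIKELIHOOD)


print(likelihood("Allison", "Jennifer"))  # friend of a friend
print(likelihood("Allison", "Jeremy"))    # 3rd-degree connection
print(likelihood("Allison", "Adam"))      # no path at all
```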
32. Friends of Friends of Friends
[Diagram: nodes Andreas, Peter, Jennifer, Jake; relationships :WANTS_TO_DATE (x4), :FRIENDS (x2), :NO_DATE (x2), :WORKS_FOR]
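The relationship types in this diagram can be modelled as directed, typed edges. The sketch below is an assumption-heavy illustration: the transcript does not say which relationship connects which people, so every triple is invented, and `targets` and `mutual_matches` are hypothetical helpers.

```python
# Directed, typed relationships as (source, type, target) triples.
# The types come from the slide; the endpoints are assumed for illustration.
relationships = [
    ("Andreas", "FRIENDS", "Peter"),
    ("Peter", "FRIENDS", "Jake"),
    ("Peter", "WORKS_FOR", "Andreas"),
    ("Andreas", "WANTS_TO_DATE", "Jennifer"),
    ("Jennifer", "WANTS_TO_DATE", "Andreas"),
    ("Jake", "WANTS_TO_DATE", "Jennifer"),
    ("Jennifer", "NO_DATE", "Jake"),
]


def targets(person, rel_type):
    """Everyone that person points to with the given relationship type."""
    return {t for s, r, t in relationships if s == person and r == rel_type}


def mutual_matches():
    """Pairs who both want to date each other and have not ruled each other out."""
    matches = set()
    for source, rel_type, target in relationships:
        if rel_type != "WANTS_TO_DATE":
            continue
        wants_back = source in targets(target, "WANTS_TO_DATE")
        ruled_out = (target in targets(source, "NO_DATE")
                     or source in targets(target, "NO_DATE"))
        if wants_back and not ruled_out:
            matches.add(tuple(sorted((source, target))))
    return sorted(matches)


print(mutual_matches())
```

Keeping the relationship type on each edge lets one graph answer several questions at once: who is connected socially, who is interested in whom, and who should never be suggested again.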
72. Performance Challenges with Connected Data
RDBMS/Other vs. Native Graph Database
[Chart: Response Time (y-axis) vs. Connectedness of Data Set (x-axis), comparing RDBMS / Other NOSQL with Neo4j; annotation: 1000x faster]
Low connectedness: # Hops: 0-2; Degree: < 3; Size: Thousands
High connectedness: # Hops: Tens to Hundreds; Degree: Thousands+; Size: Billions+
73. Neo4j Adoption Snapshot
Select Commercial Customers* (some NDA)
Core Industries: Software; Financial Services; Telecommunications; Health Care & Life Sciences; Web / ISV; Web Social, HR & Recruiting; Media & Publishing; Aviation; Energy, Services, Automotive, Gov’t, Logistics, Education, Gaming, Other
Use Cases: Network & Data Center Management; MDM / System of Record; Master Data Management; Social; Geo; Recommendations; Identity & Access Mgmt; Content Management; BI, CRM, Impact Analysis, Fraud Detection, Resource Optimization, etc.
Customer named on the slide: Accenture
*Community Users Not Included
Neo Technology, Inc Confidential
74. Graph Database Deployment
[Diagram labels: End User; Ad-hoc visual navigation & discovery; Graph-Dashboards & Ad-hoc Analysis; Graph Visualization; Application; Other Databases; Reporting; ETL; Graph Database Cluster; Data Storage & Business Rules Execution; Bulk Analytic Infrastructure (e.g. Graph Compute Engine); Graph Mining & Aggregation; Ad-Hoc Analysis; Data Scientist]
75. Experiencing Query Pain
Actual HR Query* (in SQL)
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
FROM (
SELECT manager.pid AS directReportees, 0 AS count
FROM person_reportee manager
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
UNION
SELECT manager.pid AS directReportees, count(manager.directly_manages) AS count
FROM person_reportee manager
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT manager.pid AS directReportees, count(reportee.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee reportee
ON manager.directly_manages = reportee.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT manager.pid AS directReportees, count(L2Reportees.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
) AS T
GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
FROM (
SELECT manager.directly_manages AS directReportees, 0 AS count
FROM person_reportee manager
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
UNION
SELECT reportee.pid AS directReportees, count(reportee.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee reportee
ON manager.directly_manages = reportee.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT L1Reportees.pid AS directReportees,
count(L2Reportees.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
) AS T
GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
FROM(
SELECT reportee.directly_manages AS directReportees, 0 AS count
FROM person_reportee manager
JOIN person_reportee reportee
ON manager.directly_manages = reportee.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT L2Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
) AS T
GROUP BY directReportees)
UNION
(SELECT L2Reportees.directly_manages AS directReportees, 0 AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
)
*“Find all direct reports and how many they manage, up to 3 levels down”
76. Experiencing Query Pain
Same Query*, using Cypher
MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate, count(report) AS Total
*“Find all direct reports and how many they manage, up to 3 levels down”
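To see what the two variable-length patterns in the Cypher query do, here is a rough Python sketch of the same traversal over an in-memory org chart. It is an illustration under assumptions, not how Neo4j executes the query: the org chart is hypothetical and tree-shaped (so each report is reached by exactly one path), and `within` and `report_counts` are made-up helper names.

```python
# Hypothetical org chart: manager -> direct reports (a tree).
manages = {
    "John Doe": ["Ann", "Bob"],
    "Ann": ["Cid", "Dee"],
    "Bob": [],
    "Cid": ["Eve"],
    "Dee": [],
    "Eve": ["Fay"],
    "Fay": [],
}


def within(start, min_depth, max_depth):
    """People reachable from start in min_depth..max_depth MANAGES hops."""
    found = [start] if min_depth == 0 else []
    frontier = [start]
    for depth in range(1, max_depth + 1):
        # Step one level down the hierarchy.
        frontier = [r for m in frontier for r in manages.get(m, [])]
        if depth >= min_depth:
            found.extend(frontier)
    return found


def report_counts(boss):
    """Mirror of (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report)."""
    counts = {}
    for sub in within(boss, 0, 3):
        total = len(within(sub, 1, 3))
        if total:  # like MATCH, drop anyone with no reports at all
            counts[sub] = total
    return counts


print(report_counts("John Doe"))
```

The point of the comparison is that the depth limit is a parameter of the traversal, whereas the SQL version needs one more self-join and UNION branch for every extra level.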