The Graph Database Universe: Neo4j Overview

  1. 1. The Neo4j Universe in 44 minutes or less. Kevin Van Gundy, April 2018
  2. 2. We're going to connect a lot of dots…very quickly
  3. 3. Buckle Up!
  4. 4. What We'll Discuss Today: • Data Ecosystems: Where it All Fits • Neo4j, the Database: How and Why it Works • Beyond the Database: Building Graph Native Organizations • Our Long-Term Vision for a Connected Enterprise
  5. 5. The State of Data in 2018
  6. 6. What Happened?
  7. 7. Amazon, Google, and Facebook.
  8. 8. DATA
  9. 9. Diagram labels: Structure • Data Storage • On Stage • Behind the Scenes • "Customer Journey"
  10. 10. Diagram labels: CEO • Structure • Data Storage • On Stage • Behind the Scenes • "Customer Journey"
  11. 11. Diagram labels: Structure • Data Storage • On Stage • Behind the Scenes • "Customer Journey"
  12. 12. Diagram labels: Structure • Data Storage • On Stage • Behind the Scenes • "Customer Journey"
  13. 13. Conway's Law: Any organization that designs a system or application is constrained to produce a design whose structure is a copy of the organization's (data) communication structure. Diagram labels: applications • services • products • raw data • data structure. Data stores shown: graph database • key value store • column store • document store • relational database (?)
  14. 14. Conway's Law and diagram labels as on slide 13. Data stores shown: graph database • key value store • column store • document store • relational database
  15. 15. Conway's Law and diagram labels as on slide 13. Data stores shown: graph database • key value store • column store • document store
  16. 16. Conway's Law and diagram labels as on slide 13. Data stores shown: graph database • key value store • column store
  17. 17. Conway's Law and diagram labels as on slide 13. Data stores shown: graph database • key value store
  18. 18. Conway's Law and diagram labels as on slide 13. Data stores shown: graph database
  19. 19. We've Entered the Era of the Data Structure
  20. 20. On Stage Business Processes Behind the Scene Data Structure
  21. 21. Hierarchies On Stage Business Processes Behind the Scene Data Structure Traditional Supply Chain Information
  22. 22. Hierarchies On Stage Business Processes Behind the Scene Data Structure Traditional Supply Chain Information
  23. 23. Hierarchies On Stage Business Processes Behind the Scene Data Structure Traditional Supply Chain Information
  24. 24. On Stage Behind the Scene Business Processes Data Structure Linear Supply Chain Information Organizations
  25. 25. On Stage Behind the Scene Business Processes Data Structure Information Organizations Dynamic Supply Chain
  26. 26. On Stage Behind the Scene Business Processes Data Structure Organizations Dynamic Supply Chain Knowledge
  27. 27. What is Good at What?
  28. 28. What is Good at What? • RDBMS & NoSQL Databases: Store & Retrieve (Real-Time Storage & Retrieval)
  29. 29. What is Good at What? • RDBMS & NoSQL Databases: Store & Retrieve (Real-Time Storage & Retrieval) • Hadoop/Spark: Aggregation & Filtering (Long-Running Queries, Aggregates & Filters)
  30. 30. What is Good at What? • RDBMS & NoSQL Databases: Store & Retrieve (Real-Time Storage & Retrieval) • Hadoop/Spark: Aggregation & Filtering (Long-Running Queries, Aggregates & Filters) • Neo4j: Reveal Connections (Real-Time Connected Insights)
  31. 31. Neo4j: The Database Purpose Built for Connected Data
  32. 32. A Brief History…
  33. 33. On Stage / Behind the Scenes: Personal Computer: Mainstream Movement Toward Client-Server • SQL and RDBMS Consume Market • Teeny Tiny RAM • Paper Rules • Spinning Platters
  34. 34. On Stage: Consumer Internet. Behind the Scenes: Distributed Systems Become Commonplace • GBs RAM • SSDs prohibitively expensive • Batch Compute • Commodity Hardware
  35. 35. On Stage: Mobile. Behind the Scenes: NoSQL and Cloud Native Deployment • Cloud / On Demand Hardware • Map Reduce Your Insights • OSS Rules the Roost
  36. 36. On Stage: The Internet of Me: Mainstream AI. Behind the Scenes: Graphs • Abundant Cheap RAM (TB+) • FPGAs for Algos • Dynamic real world systems
  37. 37. The Property Graph Model
  38. 38. A way of representing data DATA DATA
  39. 39. A way of representing data. Relational Database, good for: • Well-understood data structures that don’t change too frequently • Known problems involving discrete parts of the data, or minimal connectivity
  40. 40. A way of representing data. Graph Database, good for: • Dynamic systems: where the data topology is difficult to predict • Dynamic requirements: that evolve with the business • Problems where the relationships in data contribute meaning & value. Relational Database, good for: • Well-understood data structures that don’t change too frequently • Known problems involving discrete parts of the data, or minimal connectivity
  41. 41. Anatomy of The Property Graph Model. Nodes: • Can have name-value properties • Can have Labels to classify nodes. Relationships: • Relate nodes by type and direction • Can have name-value properties. Diagram: PERSON • PERSON • CAR
  42. 42. Anatomy of The Property Graph Model (text as on slide 41). Diagram: PERSON (name: "Dan", born: May 29, 1970, twitter: "@dan") • PERSON (name: "Ann", born: Dec 5, 1975) • CAR (brand: "Volvo", model: "V70")
  43. 43. Anatomy of The Property Graph Model (text as on slide 41). Diagram: nodes as on slide 42. Relationships shown: MARRIED TO
  44. 44. Anatomy of The Property Graph Model (text as on slide 41). Diagram: nodes as on slide 42. Relationships shown: MARRIED TO • LIVES WITH
  45. 45. Anatomy of The Property Graph Model (text as on slide 41). Diagram: nodes as on slide 42. Relationships shown: MARRIED TO • LIVES WITH • OWNS • DRIVES (since: Jan 10, 2011)
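
A minimal Python sketch of the property graph model described on the slides above: nodes carry labels and name-value properties, and relationships carry a type, a direction, and their own properties. The class names `Node` and `Relationship` and the helper `outgoing` are illustrative, not part of Neo4j, and the relationship directions are assumptions, since the slide text does not record them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                      # e.g. {"PERSON"}
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    type: str                        # e.g. "DRIVES"
    start: Node                      # direction runs from start ...
    end: Node                        # ... to end
    properties: dict = field(default_factory=dict)


dan = Node({"PERSON"}, {"name": "Dan", "born": "May 29, 1970", "twitter": "@dan"})
ann = Node({"PERSON"}, {"name": "Ann", "born": "Dec 5, 1975"})
car = Node({"CAR"}, {"brand": "Volvo", "model": "V70"})

# Directions below are assumed for illustration only.
relationships = [
    Relationship("MARRIED_TO", dan, ann),
    Relationship("LIVES_WITH", dan, ann),
    Relationship("OWNS", ann, car),
    Relationship("DRIVES", dan, car, {"since": "Jan 10, 2011"}),
]


def outgoing(node, rel_type):
    """Return the end nodes of relationships of rel_type that leave node."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]
```

Here `outgoing(dan, "DRIVES")` returns the list containing the `car` node; the `since` value lives on the relationship, not on either node.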
  46. 46. The Whiteboard Model Is the Physical Model

  47. 47. The Whiteboard Model Is the Physical Model

  48. 48. The Whiteboard Model Is the Physical Model

  49. 49. The Whiteboard Model Is the Physical Model. Project Agility Benefits: • Easily understood • Easily evolved • Easy collaboration between business and IT

  50. 50. On Building a Platform
  51. 51. The Neo4j Graph Platform Vision: Graph Transactions
  52. 52. Neo4j Core Database: 1. Index-Free Adjacency (in memory and on flash/disk) 2. ACID Foundation (required for safe writes) 3. Full-Stack Clustering (causal consistency) 4. Graph Engine (Cost-Based Optimizer, Graph Statistics, Cypher Runtime, …) 5. Language, Drivers, Tooling (Developer Experience, Graph Efficiency, Type Safety)
  53. 53. Index-Free Adjacency: At Write Time: Data is connected as it is stored
  54. 54. Index-Free Adjacency: At Write Time: Data is connected as it is stored. At Read Time: Lightning-fast retrieval of data and relationships via pointer chasing
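
The idea of connecting data at write time and chasing pointers at read time can be sketched conceptually in Python. This is an illustration only: the `Node` class and `friends_of_friends` function are hypothetical names, and the sketch says nothing about Neo4j's actual storage format.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.out = []            # direct references to neighbouring Node objects

    def connect(self, other):
        # Write time: the connection is stored on the node itself.
        self.out.append(other)


def friends_of_friends(start):
    # Read time: follow stored references; no separate index is consulted.
    found = []
    for friend in start.out:
        for fof in friend.out:
            if fof is not start and fof not in found:
                found.append(fof)
    return found


alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
alice.connect(bob)
bob.connect(carol)
print([n.name for n in friends_of_friends(alice)])
```

Each hop costs one reference lookup regardless of how many nodes exist overall, which is the property the slide attributes to index-free adjacency.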
  55. 55. How Fast is Fast? • Sample Social Graph with roughly 1,000 persons • On average each person has 50 friends • pathExists(a,b) limited to depth 4 • Caches warmed up to eliminate disk I/O
  56. 56. How Fast is Fast? • Sample Social Graph with roughly 1,000 persons • On average each person has 50 friends • pathExists(a,b) limited to depth 4 • Caches warmed up to eliminate disk I/O. Table (DATABASE | # OF PERSONS | QUERY TIME): no rows yet
  57. 57. How Fast is Fast? • Sample Social Graph with roughly 1,000 persons • On average each person has 50 friends • pathExists(a,b) limited to depth 4 • Caches warmed up to eliminate disk I/O. Table (DATABASE | # OF PERSONS | QUERY TIME): MySQL | 1,000 | 2,000 ms
  58. 58. How Fast is Fast? • Sample Social Graph with roughly 1,000 persons • On average each person has 50 friends • pathExists(a,b) limited to depth 4 • Caches warmed up to eliminate disk I/O. Table (DATABASE | # OF PERSONS | QUERY TIME): MySQL | 1,000 | 2,000 ms; Neo4j | 1,000 | 2 ms
  59. 59. How Fast is Fast? • Sample Social Graph with roughly 1,000 persons • On average each person has 50 friends • pathExists(a,b) limited to depth 4 • Caches warmed up to eliminate disk I/O. Table (DATABASE | # OF PERSONS | QUERY TIME): MySQL | 1,000 | 2,000 ms; Neo4j | 1,000 | 2 ms; Neo4j | 10,000,000 | 2 ms
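
The benchmark setup above (about 1,000 persons, 50 friends each on average, `pathExists(a,b)` limited to depth 4) can be imitated in plain Python. This is a rough sketch under stated assumptions: friendships are undirected and random, and `build_social_graph` and `path_exists` are hypothetical helper names. It does not reproduce the timings on the slide.

```python
import random
from collections import deque


def build_social_graph(n_persons, avg_friends, seed=0):
    """Random undirected friendship graph as an adjacency dict."""
    rng = random.Random(seed)
    graph = {p: set() for p in range(n_persons)}
    n_edges = n_persons * avg_friends // 2
    while n_edges > 0:
        a, b = rng.sample(range(n_persons), 2)
        if b not in graph[a]:
            graph[a].add(b)
            graph[b].add(a)
            n_edges -= 1
    return graph


def path_exists(graph, a, b, max_depth=4):
    """Breadth-first search from a, giving up beyond max_depth hops."""
    if a == b:
        return True
    seen = {a}
    frontier = deque([(a, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue                     # do not expand past the depth limit
        for friend in graph[node]:
            if friend == b:
                return True
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, depth + 1))
    return False


graph = build_social_graph(1000, 50)
print(path_exists(graph, 0, 999))
```

The search touches only the neighbourhood of the start node, so its cost depends on the local fan-out and the depth limit rather than on the total number of persons.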
  60. 60. Real-Time Query Performance: Neo4j Versus Relational and Other NoSQL Databases (Connected-Data Query Performance). [Chart: Response Time against Connectedness and Size of Data Set, from "0 to 2 hops, 0 to 3 degrees, thousands of connections" to "tens to hundreds of hops, thousands of degrees, billions of connections".] Relational and Other NoSQL Databases: JOIN: index scans, t = O(log(n)). Neo4j: index-free adjacent traversals, t = O(1). Neo4j is 1000x faster • Reduces minutes to milliseconds
  61. 61. Querying the graph. Pattern: Persons who live in San Francisco. Diagram: Person, LIVES_IN, Location (city: "San Francisco")
  62. 62. Querying the graph. Pattern: Persons who live in San Francisco. Cypher so far: ( ) ( )
  63. 63. Querying the graph. Pattern: Persons who live in San Francisco. Cypher so far: (p) (loc)
  64. 64. Querying the graph. Pattern: Persons who live in San Francisco. Cypher so far: (p:Person) (loc)
  65. 65. Querying the graph. Pattern: Persons who live in San Francisco. Cypher so far: (p:Person) (loc)
  66. 66. Querying the graph. Pattern: Persons who live in San Francisco. Cypher so far: (p:Person) (loc:Location)
  67. 67. Querying the graph. Pattern: Persons who live in San Francisco. Cypher so far: (p:Person) (loc:Location {city: "San Francisco"})
  68. 68. Querying the graph. Pattern: Persons who live in San Francisco. Cypher so far: (p:Person)-->(loc:Location {city: "San Francisco"})
  69. 69. Querying the graph. Pattern: Persons who live in San Francisco. Cypher so far: (p:Person)-[:LIVES_IN]->(loc:Location {city: "San Francisco"})
  70. 70. Querying the graph. Pattern: Persons who live in San Francisco. Cypher: MATCH (p:Person)-[:LIVES_IN]->(loc:Location {city: "San Francisco"}) RETURN p
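
To make the pattern concrete, here is a small Python sketch of what matching "a Person node with a LIVES_IN relationship to a Location node whose city is San Francisco" amounts to. The in-memory layout, the `match` function, and the sample names "Ann", "Dan", and "Oslo" are assumptions made for illustration; this is not how the Cypher runtime is implemented.

```python
nodes = {
    1: {"labels": {"Person"}, "props": {"name": "Ann"}},
    2: {"labels": {"Person"}, "props": {"name": "Dan"}},
    3: {"labels": {"Location"}, "props": {"city": "San Francisco"}},
    4: {"labels": {"Location"}, "props": {"city": "Oslo"}},
}
# Each relationship is (start node id, relationship type, end node id).
relationships = [(1, "LIVES_IN", 3), (2, "LIVES_IN", 4)]


def match(start_label, rel_type, end_label, end_props):
    """Yield start nodes matching (start)-[:rel_type]->(end {end_props})."""
    for start_id, rtype, end_id in relationships:
        start, end = nodes[start_id], nodes[end_id]
        if (rtype == rel_type
                and start_label in start["labels"]
                and end_label in end["labels"]
                and all(end["props"].get(k) == v for k, v in end_props.items())):
            yield start


people = list(match("Person", "LIVES_IN", "Location", {"city": "San Francisco"}))
print([p["props"]["name"] for p in people])  # prints ['Ann']
```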
  71. 71. Cypher vs. SQL
  72. 72. (SELECT T.directReportees AS directReportees, sum(T.count) AS count FROM ( SELECT manager.pid AS directReportees, 0 AS count FROM person_reportee manager WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName") UNION SELECT manager.pid AS directReportees, count(manager.directly_manages) AS count FROM person_reportee manager WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName") GROUP BY directReportees UNION SELECT manager.pid AS directReportees, count(reportee.directly_manages) AS count FROM person_reportee manager JOIN person_reportee reportee ON manager.directly_manages = reportee.pid WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName") GROUP BY directReportees UNION SELECT manager.pid AS directReportees, count(L2Reportees.directly_manages) AS count FROM person_reportee manager JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName") GROUP BY directReportees ) AS T GROUP BY directReportees)
 UNION (SELECT T.directReportees AS directReportees, sum(T.count) AS count FROM ( SELECT manager.directly_manages AS directReportees, 0 AS count FROM person_reportee manager WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName") UNION SELECT reportee.pid AS directReportees, count(reportee.directly_manages) AS count FROM person_reportee manager JOIN person_reportee reportee ON manager.directly_manages = reportee.pid WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName") GROUP BY directReportees UNION SELECT L1Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count FROM person_reportee manager JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName") GROUP BY directReportees ) AS T GROUP BY directReportees)
 UNION (SELECT T.directReportees AS directReportees, sum(T.count) AS count FROM ( SELECT reportee.directly_manages AS directReportees, 0 AS count FROM person_reportee manager JOIN person_reportee reportee ON manager.directly_manages = reportee.pid WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName") GROUP BY directReportees UNION SELECT L2Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count FROM person_reportee manager JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName") GROUP BY directReportees ) AS T GROUP BY directReportees)
 UNION (SELECT L2Reportees.directly_manages AS directReportees, 0 AS count FROM person_reportee manager JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName") )
  73. 73. MATCH (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report) WHERE boss.name = "John Doe" RETURN sub.name AS Subordinate, count(report) AS Total;
  74. 74. MATCH (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report) WHERE boss.name = "John Doe" RETURN sub.name AS Subordinate, count(report) AS Total;
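
The variable-length patterns `*0..3` and `*1..3` in the query above can be read as "follow MANAGES between 0 and 3 hops" and "between 1 and 3 hops". The following Python sketch mimics that reading on a tiny reporting tree. The sample names other than "John Doe" are made up, the helpers `reachable` and `report_totals` are hypothetical, and the sketch assumes the hierarchy is a tree, so each person is reached by exactly one path.

```python
# Hypothetical MANAGES hierarchy: manager -> direct reports.
manages = {
    "John Doe": ["Ada", "Bo"],
    "Ada": ["Cy", "Di"],
    "Cy": ["Ed"],
}


def reachable(start, min_hops, max_hops):
    """People reachable from start via min_hops..max_hops MANAGES steps."""
    found = []
    frontier = [start]
    for hops in range(max_hops + 1):
        if hops >= min_hops:
            found.extend(frontier)
        # Advance one MANAGES hop from everyone on the current frontier.
        frontier = [r for person in frontier for r in manages.get(person, [])]
    return found


def report_totals(boss):
    totals = {}
    for sub in reachable(boss, 0, 3):        # (boss)-[:MANAGES*0..3]->(sub)
        reports = reachable(sub, 1, 3)       # (sub)-[:MANAGES*1..3]->(report)
        if reports:                          # MATCH drops subs with no reports
            totals[sub] = len(reports)
    return totals


print(report_totals("John Doe"))  # prints {'John Doe': 5, 'Ada': 3, 'Cy': 1}
```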
  75. 75. Cypher: Key Benefits. Example HR Query in SQL; The Same Query using Cypher: MATCH (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report) WHERE boss.name = "John Doe" RETURN sub.name AS Subordinate, count(report) AS Total. Project Impact: Less time writing queries: • More time understanding the answers • Leaving time to ask the next question. Less time debugging queries: • More time writing the next piece of code • Improved quality of overall code base. Code that’s easier to read: • Faster ramp-up for new project members • Improved maintainability & troubleshooting
  76. 76. The Neo4j Graph Platform Vision: Graph Transactions
  77. 77. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • AI
  78. 78. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • Data Integration • AI
  79. 79. Common Integration Patterns
  80. 80. Common Integration Patterns: • From Tabular Data To Connected Data
  81. 81. Common Integration Patterns: • From Tabular Data To Connected Data • From Disparate Silos To Cross-Silo Connections
  82. 82. Common Integration Patterns: • From Tabular Data To Connected Data • From Disparate Silos To Cross-Silo Connections • From Data Lake Analytics to Real-Time Operations
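
As a rough illustration of the first pattern, from tabular data to connected data, the sketch below turns two small row sets into nodes and relationships. The table layout, the `city_id` foreign key, and the `to_graph` helper are assumptions made for the example; a real import would typically go through Neo4j's own loading tools instead.

```python
# Hypothetical tabular input: persons reference cities through city_id.
persons = [
    {"id": 1, "name": "Ann", "city_id": 10},
    {"id": 2, "name": "Dan", "city_id": 10},
]
cities = [{"id": 10, "city": "San Francisco"}]


def to_graph(persons, cities):
    """Turn rows into nodes and foreign keys into LIVES_IN relationships."""
    nodes = {}
    relationships = []
    for row in cities:
        nodes[("Location", row["id"])] = {"city": row["city"]}
    for row in persons:
        nodes[("Person", row["id"])] = {"name": row["name"]}
        # The foreign-key column becomes an explicit relationship.
        relationships.append(
            (("Person", row["id"]), "LIVES_IN", ("Location", row["city_id"]))
        )
    return nodes, relationships


nodes, relationships = to_graph(persons, cities)
print(len(nodes), len(relationships))  # prints 3 2
```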
  83. 83. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • Data Integration • AI
  84. 84. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • Data Integration • Drivers & APIs • AI. Users: Developers • Applications
  85. 85. Drivers and APIs. Native Language Drivers: • Java • .NET • Python • JavaScript • more to come… • Massive Community Support (Go, Ruby, R, Perl, Clojure, C/C++…) • Partners like GraphAware (PHP Client)
  86. 86. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • Data Integration • Drivers & APIs • AI. Users: Developers • Applications
  87. 87. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • Data Integration • Drivers & APIs • Discovery & Visualization • AI. Users: Business Users • Developers • Data Analysts • Applications
  88. 88. Graph Discovery & Visualization: Software that allows users to realize insights by interacting directly with their data • Neo4j Browser • Custom / JS Libraries • Partner Applications
  89. 89. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • Data Integration • Drivers & APIs • Discovery & Visualization • AI. Users: Business Users • Developers • Data Analysts • Applications
  90. 90. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • Data Integration • Drivers & APIs • Discovery & Visualization • Development & Administration • AI. Users: Business Users • Developers • Admins • Data Analysts • Applications
  91. 91. Neo4j Desktop and Browser Admin Tooling
  92. 92. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • Data Integration • Drivers & APIs • Discovery & Visualization • Development & Administration • AI. Users: Business Users • Developers • Admins • Data Analysts • Applications
  93. 93. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • Data Integration • Drivers & APIs • Discovery & Visualization • Development & Administration • Analytics Tooling • AI. Users: Business Users • Developers • Admins • Data Analysts • Data Scientists • Applications
  94. 94. Cypher for Apache® Spark™
  95. 95. Multiple Sources, Multiple Graphs Relational Graph Graph Subgraph Output Graph
  96. 96. Multiple Sources, Multiple Graphs Relational Graph Graph Subgraph Output Graph Cypher for Apache® Spark™
  97. 97. Neo4j Graph Algorithm Library: • Finds the optimal path or evaluates route availability and quality • Evaluates how a group is clustered or partitioned • Determines the importance of distinct nodes in the network
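
The three kinds of question listed above (paths, grouping, importance) can each be illustrated with a textbook algorithm in plain Python. These are generic stand-ins chosen for brevity (breadth-first shortest path, connected components, degree counts), not the algorithms or API of the Neo4j library, and the sample graph is made up.

```python
from collections import deque

# Hypothetical undirected graph as an adjacency dict.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
    "x": {"y"},
    "y": {"x"},
}


def shortest_path(graph, start, goal):
    """Paths: fewest-hops route via breadth-first search, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def connected_components(graph):
    """Grouping: sets of nodes that can reach each other."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


def degree_centrality(graph):
    """Importance: number of direct connections per node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


print(shortest_path(graph, "a", "d"))  # prints ['a', 'c', 'd']
```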
  98. 98. One More Thing!
  99. 99. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • Data Integration • Drivers & APIs • Discovery & Visualization • Development & Administration • Analytics Tooling • AI. Users: Business Users • Developers • Admins • Data Analysts • Data Scientists • Applications
  100. 100. The Neo4j Graph Platform Vision: Graph Transactions • Graph Analytics • Data Integration • Drivers & APIs • Discovery & Visualization • Development & Administration • Analytics Tooling • AI. Users: Business Users • Developers • Admins • Data Analysts • Data Scientists • Applications
  101. 101. Neo4j-Cloud: Preview. The Challenge: • Neo4j without the headache of operations • Lower cost of entry to enterprise-ready features • Scale cost with value. The Solution: • Neo4j as a Service on the public cloud • Cloud-hosted database in < 5 minutes • Automated backup, upgrades, and restores. The Preview Program for Graph Tour: • Looking for a wide variety of participants • Join the list at http://neo4j.com/cloud
  102. 102. Why We Believe in Building a Graph Platform: • Graphs are fundamentally the most ergonomic and humane way of working with data at scale • Ecosystems are most powerful with seamless access to adjacent technologies • The best technology is the one that is easiest to use
  103. 103. Thanks!
