DataStax: Titan 1.0: Scalable real time and analytic graph queries


Titan is a scalable graph database built on top of Apache Cassandra that supports both real-time and analytic graph queries across distributed clusters. This talk focuses on the recently released Titan 1.0 and the new features it introduces. Titan implements the most recent version of the popular Gremlin graph query language from Apache TinkerPop, using a custom rewrite engine and query optimizer to execute deep traversal queries efficiently. Graph analytics and global breadth-first execution of Gremlin queries run on Apache Spark through the Cassandra-Spark connector. These features are demonstrated on a social use case to highlight how Titan can deliver relationship value with little development effort.


Titan 1.0
A journey and lessons learned
Matthias Broecheler
@mbroecheler

Lesson 1
Don't reinvent basic database memory management

g.V().has("user","name","Marko")
.out("reviewed").values("title")

Lesson 3
To use graph effectively you need to “think” graph

Lesson 4
Use cases: Highly connected and heterogeneous

g.V().hasLabel("product")
.has("title",textContains("Titan"))
.inE("reviewed")
.has("score",5)
.values("summary")

g.V().hasLabel("product")
.has("title",textContains("Titan"))
.inE("reviewed")
.has("score",5)
.values("summary")
Graph-global retrieval
Graph-local walk

g.V().hasLabel("product")
.has("title",textContains("Titan"))
.inE("reviewed")
.has("score",5)
.values("summary")
TitanGraphStep
TitanVertexStep
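
The two phases named on these slides can be illustrated outside Titan. The following is a minimal Python sketch, not Titan's implementation: it assumes a tiny in-memory graph, uses a token index as a stand-in for a full-text index (the graph-global retrieval), and per-vertex adjacency lists of incoming "reviewed" edges (the graph-local walk). All data and the names `title_index`, `in_reviewed`, and `five_star_summaries` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: product vertices keyed by id.
products = {
    1: {"label": "product", "title": "Titan in Action"},
    2: {"label": "product", "title": "Cooking Basics"},
}
# Hypothetical "reviewed" edges: (reviewer_id, product_id, edge properties).
reviews = [
    (10, 1, {"score": 5, "summary": "Great graph book"}),
    (11, 1, {"score": 3, "summary": "Too dense"}),
    (12, 2, {"score": 5, "summary": "Tasty"}),
]

# Graph-global retrieval: a token index over titles finds start vertices
# without scanning every vertex.
title_index = defaultdict(set)
for vid, props in products.items():
    for token in props["title"].lower().split():
        title_index[token].add(vid)

# Graph-local walk: each vertex keeps its own list of incoming "reviewed" edges.
in_reviewed = defaultdict(list)
for reviewer, product, props in reviews:
    in_reviewed[product].append(props)


def five_star_summaries(term):
    """Summaries of 5-star reviews on products whose title contains `term`."""
    start = title_index.get(term.lower(), set())  # global: index lookup
    return sorted(
        edge["summary"]
        for vid in start                  # local: walk from each index hit
        for edge in in_reviewed[vid]
        if edge["score"] == 5             # filter applied on the edge itself
    )


result = five_star_summaries("Titan")
print(result)
```

The point of the split is that only the first phase needs a graph-wide index; everything after it touches just the edges adjacent to the vertices already found.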

Lesson 5
Flexible and adaptive query optimization

g.V().as("user")
.filter(
outE("reviewed").has("helpfulness")
.count().is(gt(10)))
.outE("reviewed")
.has("helpfulness").as("r")
.group().by(select("user"))
.by(select("r").by("helpfulness"))
.iterate()
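
As a reading aid, here is a small Python sketch of what this traversal appears to compute: for every user with more than 10 reviews that carry a "helpfulness" property, collect those helpfulness values per user. It is an assumption-laden restatement of the result, not of how Titan plans or executes the query; the input shape and the function name `helpfulness_by_active_user` are hypothetical.

```python
from collections import defaultdict


def helpfulness_by_active_user(reviews, min_rated=10):
    """Group helpfulness values by user, keeping users with > min_rated rated reviews.

    `reviews` is an iterable of (user, helpfulness) pairs, where helpfulness
    is None when the review edge has no such property.
    """
    rated = defaultdict(list)
    for user, helpfulness in reviews:
        if helpfulness is not None:       # mirrors has("helpfulness")
            rated[user].append(helpfulness)
    # mirrors filter(... count().is(gt(10))) followed by group().by(user)
    return {user: values for user, values in rated.items() if len(values) > min_rated}


# Hypothetical data: user "a" has 11 rated reviews and one unrated, "b" has one.
sample = [("a", i) for i in range(11)] + [("a", None), ("b", 1)]
grouped = helpfulness_by_active_user(sample)
print(grouped)
```

Written this way the edges are read once, whereas a literal step-by-step reading of the traversal visits each user's "reviewed" edges twice (once in the filter, once in the grouping), which is the kind of redundancy a query optimizer can look for.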

Lesson 6
Graph OLTP and OLAP are converging

g.V().hasLabel("product")
.has("title",textContains("Titan"))
.inE("reviewed")
.has("score",5)
.values("summary")
.map{it.substring(10)}
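
The convergence claim rests on one traversal being executable in two ways. The sketch below is a toy Python model, assuming a traversal is a list of step functions: `depth_first` pushes one traverser through all steps before starting the next (the real-time style), while `breadth_first` applies each step to the whole frontier before moving on (the style the abstract attributes to Spark). The graph and both function names are hypothetical, and nothing here reflects Titan's or Spark's actual machinery.

```python
def depth_first(starts, steps):
    """OLTP-style: follow one traverser through every step, then the next."""
    def walk(item, i):
        if i == len(steps):
            yield item
            return
        for nxt in steps[i](item):
            yield from walk(nxt, i + 1)

    for start in starts:
        yield from walk(start, 0)


def breadth_first(starts, steps):
    """OLAP-style: apply each step to the entire frontier before the next step."""
    frontier = list(starts)
    for step in steps:
        frontier = [nxt for item in frontier for nxt in step(item)]
    return frontier


# Hypothetical toy graph as adjacency lists.
out_edges = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": [], "e": []}
two_hops = [lambda v: out_edges[v], lambda v: out_edges[v]]  # like out().out()

dfs = list(depth_first(["a"], two_hops))
bfs = breadth_first(["a"], two_hops)
print(sorted(dfs), sorted(bfs))
```

Both strategies return the same multiset of results; they differ in how much state is held at once and in how easily the work is partitioned across machines.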

g.V().has("ASIN","B000654OV0").match(
__.as("p1").in("hasProduct").as("c"),
__.as("p1").in("reviewed").as("u"),
__.as("u").out("reviewed").as("p2"),
__.as("p2").in("hasProduct").as("c")
).where("p1", neq("p2")).dedup("u")
.select("u").by("userId")
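
The match() pattern reads as: find users who reviewed the given product and also reviewed a different product that shares a category with it. A minimal Python sketch of that result follows, assuming categories point to products via "hasProduct" as in the query. The input shapes, the extra product ids `P2` and `P3`, and the function name `co_category_reviewers` are hypothetical; only the ASIN comes from the slide.

```python
def co_category_reviewers(asin, product_categories, reviewed):
    """User ids who reviewed `asin` and another product in a shared category.

    product_categories: dict mapping product id -> set of category names
    reviewed: iterable of (user_id, product_id) pairs
    """
    target_categories = product_categories.get(asin, set())

    # Collect the set of products each user reviewed.
    by_user = {}
    for user, product in reviewed:
        by_user.setdefault(user, set()).add(product)

    return sorted(
        user
        for user, prods in by_user.items()
        if asin in prods
        and any(
            p != asin                                              # where("p1", neq("p2"))
            and target_categories & product_categories.get(p, set())  # both bind the same "c"
            for p in prods
        )
    )  # iterating a dict of users yields each user once, like dedup("u")


categories = {"B000654OV0": {"books"}, "P2": {"books"}, "P3": {"toys"}}
edges = [
    ("u1", "B000654OV0"), ("u1", "P2"),   # same category: matches
    ("u2", "B000654OV0"), ("u2", "P3"),   # different category: no match
    ("u3", "P2"),                         # never reviewed the target product
]
users = co_category_reviewers("B000654OV0", categories, edges)
print(users)
```

The Gremlin version only states the four constraints and leaves the join order to the engine, whereas this sketch hard-codes one evaluation order; that contrast is the declarative versus imperative distinction.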

Lesson 7
Graph needs declarative and imperative query language constructs

Let’s discuss!
Questions, comments, feedback?
We are hiring


DataStax: Titan 1.0: Scalable real time and analytic graph queries

1. Titan 1.0 A journey and lessons learned Matthias Broecheler @mbroecheler

2. Titan 1.0.0 titandb.io

3. June 14th 2012

5. Final Lesson It’s early days for Graph

7. Lesson 1 Don't reinvent basic database memory management

9. Lesson 2 Need for Scale-Out


15. g.V().has("user","name","Marko") .out("reviewed").values("title")

16. Lesson 3 To use graph effectively you need to “think” graph

17. June 14th 2012


20. Lesson 4 Use cases: Highly connected and heterogeneous

21. g.V().hasLabel("product") .has("title",textContains("Titan")) .inE("reviewed") .has("score",5) .values("summary")

22. g.V().hasLabel("product") .has("title",textContains("Titan")) .inE("reviewed") .has("score",5) .values("summary") Graph-global retrieval Graph-local walk

23. g.V().hasLabel("product") .has("title",textContains("Titan")) .inE("reviewed") .has("score",5) .values("summary") TitanGraphStep TitanVertexStep

24. g.V().hasLabel("product") .has("title",textContains("Titan")) .inE("reviewed") .has("score",5) .values("summary")

25. Lesson 5 Flexible and adaptive query optimization

26. g.V().map(outE().count()).mean()

27. g.V().map(outE().count()).mean()


29. g.V().as("user") .filter( outE("reviewed").has("helpfulness") .count().is(gt(10))) .outE("reviewed") .has("helpfulness").as("r") .group().by(select("user")) .by(select("r").by("helpfulness")) .iterate()

30. Lesson 6 Graph OLTP and OLAP are converging

31. g.V().hasLabel("product") .has("title",textContains("Titan")) .inE("reviewed") .has("score",5) .values("summary")

32. g.V().hasLabel("product") .has("title",textContains("Titan")) .inE("reviewed") .has("score",5) .values("summary") .map{it.substring(10)}

33. g.V().has("ASIN","B000654OV0").match( __.as("p1").in("hasProduct").as("c"), __.as("p1").in("reviewed").as("u"), __.as("u").out("reviewed").as("p2"), __.as("p2").in("hasProduct").as("c") ).where("p1", neq("p2")).dedup("u") .select("u").by("userId")

34. Lesson 7 Graph needs declarative and imperative query language constructs

35. Final Lesson It's early days for Graph

36. Let’s discuss! Questions, comments, feedback? We are hiring

