www.Objectivity.com

Making Sense of the Graph Revolution
Nick Quinn, Principal Engineer, InfiniteGraph
11/21/13

Why Call it a Revolution?
• “a forcible overthrow of the current order in favor of a new system.”
• NoSQL (Not Only SQL)
– Driven by Choice + Big Data Needs
• Scalable
• Performing
• Distributed
• Highly Available
Big Data + Graph = Big Graph Data
• Social Scale
– 1 billion vertices, 100 billion edges

• Web Scale
– 50 billion vertices, 1 trillion edges

• Brain Scale
– 100 billion vertices, 100 trillion edges
AND GROWING!
Why Call it a Graph Revolution?
• After 2011, NoSQL and graph databases began to follow the same trend line and forecast.
The Growing Graph Database Landscape
What is a Graph Database?
• A graph database is a native storage engine that enables efficient storage and retrieval of graph-structured data.
• Graph databases are typically used:
– When the data source is highly connected,
– Where the connections are important (add value to the
data), and
– When the user access pattern requires traversals of
those connections.
What is a Graph Database?
• Graph databases have a unique data model (vertices and edges).
[Figure: VERTEX and EDGE diagram; vertices labeled 2 … N]

• They are optimized around concurrent access of
persisted data, so users can navigate the data as it
is being added or updated.
Why Use a Graph Database?
Relational Database
Think about the SQL query for finding all links between the two “blue” rows... Good luck!
[Figure: relational tables Table_A through Table_G]
Relational databases aren’t good at handling complex relationships!
Why use a Graph Database?

Relational Database
[Figure: relational tables Table_A through Table_G, with rows A3 and G4 marked]
Objectivity/DB or InfiniteGraph - The solution can be found with a few lines of code
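To make the "few lines of code" claim concrete, here is a minimal sketch of finding every link between two rows once the rows are treated as vertices. It uses a plain in-memory adjacency map, not the InfiniteGraph or Objectivity/DB API. The class `LinkFinder`, its methods, and the intermediate row names (B1, C2, D5) are assumptions made for illustration; only A3 and G4 come from the slide.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory graph; not the InfiniteGraph or Objectivity/DB API. */
final class LinkFinder {
    private final Map<String, List<String>> adjacency = new HashMap<>();

    /** Adds an undirected edge between two row identifiers. */
    void link(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    /** Returns every simple path from start to target that uses at most maxEdges edges. */
    List<List<String>> allPaths(String start, String target, int maxEdges) {
        List<List<String>> found = new ArrayList<>();
        List<String> path = new ArrayList<>();
        path.add(start);
        walk(start, target, maxEdges, path, found);
        return found;
    }

    private void walk(String current, String target, int edgesLeft,
                      List<String> path, List<List<String>> found) {
        if (current.equals(target)) {
            found.add(new ArrayList<>(path)); // copy, because path is mutated while backtracking
            return;
        }
        if (edgesLeft == 0) {
            return; // depth limit reached
        }
        for (String next : adjacency.getOrDefault(current, Collections.<String>emptyList())) {
            if (path.contains(next)) {
                continue; // never revisit a vertex on the current path
            }
            path.add(next);
            walk(next, target, edgesLeft - 1, path, found);
            path.remove(path.size() - 1);
        }
    }

    public static void main(String[] args) {
        LinkFinder g = new LinkFinder();
        g.link("A3", "B1");
        g.link("B1", "G4");
        g.link("A3", "C2");
        g.link("C2", "D5");
        g.link("D5", "G4");
        System.out.println(g.allPaths("A3", "G4", 4));
    }
}
```

The equivalent relational query would need one self-join or join per hop and a separate query shape for each path length, which is the difficulty the slide points at.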
Why use a Graph Database?
Specialized Graph Use Cases
• Cyber Security – Identifying potential cyber threats and their targets
• Network Management – Answering very complex navigational queries on a social network in near real time
• Targeted Advertising – Customizing marketing to the consumer by compiling a large knowledge graph with an integrated recommendation engine
Example 1 - Ad Placement Networks
Smartphone ad placement, based on the user’s profile and location data captured by opt-in applications.
• The location data can be stored and distilled in a key-value and column store hybrid database, such as Cassandra.
• The locations are matched with geospatial data to deduce user interests.
• As ad placement orders arrive, an application built on a graph database, such as InfiniteGraph, matches groups of users with ads:
– Maximizes relevance for the user.
– Yields maximum value for the advertiser and the placer.
Example 2 - Market Analysis
The 10 companies that control a majority of U.S. consumer goods brands
Example 3 - Seed To Consumer Tracking

Supply Chain Management Use Case
• Identifying the optimal route for a fleet of trucks at a particular time of the year is quite complex.
– number of drivers to pay and their salaries
– gas, weather patterns, timing requirements, container sizes, distances, roads, hazards, repairs

• Find the optimal route during the winter, when certain highways around the Great Lakes tend to become hazardous.
Supply Chain Management Use Case
• Find the most cost-effective route in December with
weather conditions X and highway conditions Y, and
stay below Z latitude while optimizing costs to
achieve a rush delivery
GraphView myView = new GraphView();
// Exclude highways with bad weather, slow traffic, or too many accidents
myView.excludeClass(myGraphDb.getTypeId(Highway.class.getName()),
    "(weather.precipitation > precipitationX && weather.temperature < temperatureX)"
    + " || traffic.speed < speedY || traffic.accidents > accidentsY");
// Exclude cities at or above latitude Z
myView.excludeClass(myGraphDb.getTypeId(City.class.getName()), "latitude >= Z");
Supply Chain Management Use Case
City origin, target = …; // Use query or index to look up "origin" & "target" city
VertexIdentifier resultQualifier = new VertexIdentifier(target);

// Set policies
PolicyChain myPolicies = new PolicyChain();
myPolicies.addPolicy(new MaximumPathDepthPolicy(MAXIMUM_STEPS));
myPolicies.addPolicy(new NoRevisitPolicy()); // Don't revisit the cities more than once

// Define logic on how to process results
NavigationResultHandler myNavHandler = new NavigationResultHandler()
{
    @Override
    public void handleResultPath(Path result)
    {
        // The first path returned is the shortest path, but may not be the cheapest
        float cost = calculateCost(result);
        float time = calculateTime(result);
        // Minimize cost
        …
    }

    @Override
    public void handleNavigatorFinished(Navigator navigator) {}
};

Navigator navigator = origin.navigate(myView, Guide.DEPTH_FIRST_SEARCH,
    Qualifier.ANY /* path qualifier */, resultQualifier, myPolicies, myNavHandler);
navigator.start();
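The "Minimize cost" step inside the result handler is elided on the slide. One way it could work is to keep a running best path as results arrive, which is sketched below in plain Java. The types `Leg` and `CheapestRouteTracker`, the per-leg cost and hours fields, the time limit, and the sample cities are all hypothetical stand-ins; they are not InfiniteGraph classes and do not show how `calculateCost` or `calculateTime` are actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for one highway segment on a result path. */
final class Leg {
    final String from;
    final String to;
    final double cost;
    final double hours;

    Leg(String from, String to, double cost, double hours) {
        this.from = from;
        this.to = to;
        this.cost = cost;
        this.hours = hours;
    }
}

/** Keeps the cheapest route seen so far that still meets the delivery deadline. */
final class CheapestRouteTracker {
    private final double maxHours;
    private List<Leg> best = null;
    private double bestCost = Double.POSITIVE_INFINITY;

    CheapestRouteTracker(double maxHours) {
        this.maxHours = maxHours;
    }

    static double totalCost(List<Leg> route) {
        double sum = 0.0;
        for (Leg leg : route) sum += leg.cost;
        return sum;
    }

    static double totalHours(List<Leg> route) {
        double sum = 0.0;
        for (Leg leg : route) sum += leg.hours;
        return sum;
    }

    /** Call once per result path, in whatever order paths are delivered. */
    void handleResultPath(List<Leg> route) {
        if (totalHours(route) > maxHours) {
            return; // too slow for a rush delivery
        }
        double cost = totalCost(route);
        if (cost < bestCost) {
            bestCost = cost;
            best = new ArrayList<>(route);
        }
    }

    List<Leg> best() { return best; }

    double bestCost() { return bestCost; }

    public static void main(String[] args) {
        CheapestRouteTracker tracker = new CheapestRouteTracker(10.0);
        tracker.handleResultPath(Arrays.asList(new Leg("Chicago", "Cleveland", 900.0, 6.0)));
        tracker.handleResultPath(Arrays.asList(
            new Leg("Chicago", "Indianapolis", 300.0, 3.0),
            new Leg("Indianapolis", "Cleveland", 350.0, 5.0)));
        System.out.println("Cheapest cost within deadline: " + tracker.bestCost());
    }
}
```

Because the tracker only compares each incoming path against the current best, it does not depend on the order in which the navigator returns paths.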
Graph Database Challenge #1:
Reading Distributed Data
• If your graph data is distributed, traversing a
desired path across partitions can be extremely
difficult and slow.
Graph Database Challenge #1:
Reading Distributed Data
• Mitigate bottlenecks and optimize performance by
using the following strategies:
– Custom Placement: data isolation/localization of
logically related information (to achieve close to
subgraph partitioning) in order to minimize the number of
network calls
– Distributed Navigation Engine: Distributes the load on
the partitions where the data is located.
Reading Distributed Data:
Custom Placement

• Consider the case where you are placing medical data for
hospitals and patients. Using a custom placement model
you can achieve fairly high isolation of the subgraphs.
– Doctor ↔ Hospitals, Patients ↔ Visits.
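The effect of custom placement can be sketched without any database: choose the partition from a placement key shared by logically related objects, then count how many edges cross partitions. Everything below is an assumption for illustration, including the class `PlacementSketch`, the key scheme (doctors keyed by their hospital, visits keyed by their patient), and the sample identifiers. It is not InfiniteGraph's placement API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical placement model; not the InfiniteGraph placement API. */
final class PlacementSketch {
    /** Vertices that share a placement key always land in the same partition. */
    static int partitionFor(String placementKey, int partitions) {
        return Math.floorMod(placementKey.hashCode(), partitions);
    }

    /** Counts edges whose two endpoints live in different partitions. */
    static int remoteEdges(List<String[]> edges, Map<String, String> keyOf, int partitions) {
        int remote = 0;
        for (String[] edge : edges) {
            int p0 = partitionFor(keyOf.get(edge[0]), partitions);
            int p1 = partitionFor(keyOf.get(edge[1]), partitions);
            if (p0 != p1) remote++;
        }
        return remote;
    }

    /** Sample edges: doctors work at hospitals, visits belong to patients. */
    static List<String[]> edges() {
        List<String[]> e = new ArrayList<>();
        e.add(new String[] {"doctor:D1", "hospital:H1"});
        e.add(new String[] {"doctor:D2", "hospital:H1"});
        e.add(new String[] {"doctor:D3", "hospital:H2"});
        e.add(new String[] {"visit:V1", "patient:P1"});
        e.add(new String[] {"visit:V2", "patient:P1"});
        e.add(new String[] {"visit:V3", "patient:P2"});
        return e;
    }

    /** Custom placement: each doctor is keyed by its hospital, each visit by its patient. */
    static Map<String, String> customKeys() {
        Map<String, String> k = new HashMap<>();
        k.put("hospital:H1", "hospital:H1");
        k.put("hospital:H2", "hospital:H2");
        k.put("doctor:D1", "hospital:H1");
        k.put("doctor:D2", "hospital:H1");
        k.put("doctor:D3", "hospital:H2");
        k.put("patient:P1", "patient:P1");
        k.put("patient:P2", "patient:P2");
        k.put("visit:V1", "patient:P1");
        k.put("visit:V2", "patient:P1");
        k.put("visit:V3", "patient:P2");
        return k;
    }

    /** Naive placement: every vertex is keyed by its own identifier. */
    static Map<String, String> naiveKeys() {
        Map<String, String> k = new HashMap<>();
        for (String[] edge : edges()) {
            k.put(edge[0], edge[0]);
            k.put(edge[1], edge[1]);
        }
        return k;
    }

    public static void main(String[] args) {
        System.out.println("custom remote edges: " + remoteEdges(edges(), customKeys(), 4));
        System.out.println("naive remote edges:  " + remoteEdges(edges(), naiveKeys(), 4));
    }
}
```

With the custom keys, every sample edge stays inside one partition, so a traversal over these edges needs no network call. The naive count depends on how the individual identifiers happen to hash.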
Reading Distributed Data:
Distributed Navigation Engine
• Google Pregel (2010)
– Batch algorithms on large graphs
– Avoids passing graph state; sends messages instead
– Apache Giraph, JPregel, Hama
while any vertex is active and max iterations not reached:
    for each vertex:  # this loop is run in parallel
        process messages from neighbors (update internal state)
        send messages to neighbors
        possibly synchronize results
        set active flag (unless no messages or state doesn’t change)
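The superstep loop can be tried out on a single machine. The sketch below propagates the maximum vertex value through a graph using only messages, with vertices going inactive when their state stops changing. It is a sequential simulation written for illustration: the class `PregelSketch` and its method names are assumptions, the per-vertex loop is not run in parallel here, and it is not the Pregel, Giraph, or InfiniteGraph API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical single-machine simulation of Pregel-style supersteps. */
final class PregelSketch {
    /**
     * Spreads the largest vertex value to every connected vertex.
     * Every vertex must have an entry (possibly an empty list) in neighbors.
     * Returns the number of supersteps that were run.
     */
    static int propagateMax(Map<Integer, List<Integer>> neighbors,
                            Map<Integer, Integer> value, int maxSupersteps) {
        // Superstep 0: every vertex is active and announces its own value.
        Map<Integer, List<Integer>> inbox = new HashMap<>();
        for (Integer v : neighbors.keySet()) {
            send(neighbors, inbox, v, value.get(v));
        }
        int supersteps = 1;
        // Continue while messages are in flight (some vertex is active) and the cap is not reached.
        while (!inbox.isEmpty() && supersteps < maxSupersteps) {
            Map<Integer, List<Integer>> outbox = new HashMap<>();
            for (Map.Entry<Integer, List<Integer>> entry : inbox.entrySet()) {
                int v = entry.getKey();
                int best = value.get(v);
                for (int message : entry.getValue()) {
                    best = Math.max(best, message);
                }
                if (best > value.get(v)) {
                    value.put(v, best);               // state changed: stay active
                    send(neighbors, outbox, v, best); // and tell the neighbors
                }
                // Otherwise the vertex goes inactive until a new message arrives.
            }
            inbox = outbox;
            supersteps++;
        }
        return supersteps;
    }

    private static void send(Map<Integer, List<Integer>> neighbors,
                             Map<Integer, List<Integer>> box, int from, int message) {
        for (Integer n : neighbors.get(from)) {
            box.computeIfAbsent(n, k -> new ArrayList<>()).add(message);
        }
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> neighbors = new HashMap<>();
        neighbors.put(1, Arrays.asList(2));
        neighbors.put(2, Arrays.asList(1, 3));
        neighbors.put(3, Arrays.asList(2, 4));
        neighbors.put(4, Arrays.asList(3));
        Map<Integer, Integer> value = new HashMap<>();
        value.put(1, 3);
        value.put(2, 6);
        value.put(3, 2);
        value.put(4, 1);
        int steps = propagateMax(neighbors, value, 30);
        System.out.println(value + " after " + steps + " supersteps");
    }
}
```

No vertex ever reads another vertex's state directly. All information moves through the inbox and outbox maps, which is what lets a real engine place vertices on different machines.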
Reading Distributed Data:
Distributed Navigation Engine
• Pregel is optimized for large distributed graph analytics
• Limitation of Pregel logic: even when the traversal is occurring locally, it still executes by sending messages from vertex to vertex
• Ideally, when local, the traversal should be executed in memory, and when remote, Pregel logic should be used.
– InfiniteGraph’s Distributed Navigation Engine uses the QueryServer (oqs) to achieve this optimized behavior.
Graph Database Challenge #2:
Supernodes
• A supernode is a vertex with a disproportionately high number of outgoing edges.
– Inefficient to traverse through these vertices
Supernodes (Avoid the Tonight Show!)
In the IMDB data set, some examples of supernodes may be talk
shows, awards shows, compilations or variety shows.
Supernodes:
GraphViews and Policies
• With InfiniteGraph, we offer two strategies for addressing the supernode problem within the navigation context.
– Use GraphViews to filter out vertex or edge types
– Globally limit the number of edges traversed using the FanoutLimitPolicy
Supernodes:
GraphViews and Policies
• Consider calculating the number of links to interesting companies on LinkedIn.
– If you are connected to recruiters, navigation can slow down, and the result set can be polluted, if the traversal passes through these recruiters.
GraphView myView = new GraphView();
// Filter out Person vertices whose profession contains 'recruiter'
myView.excludeClass(myGraphDb.getTypeId(Person.class.getName()),
    "CONTAINS(profession, 'recruiter')");
PolicyChain chain = new PolicyChain();
// Limits # of edges traversed to 10
chain.addPolicy(new FanoutLimitPolicy(10));
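The idea behind a fanout limit can be shown with a plain breadth-first walk that refuses to traverse through any vertex with too many outgoing edges. This is only a sketch of the general technique. The class `FanoutLimitedWalk`, the rule "skip a vertex whose out-degree exceeds the limit", and the sample names are assumptions, and the exact semantics of InfiniteGraph's FanoutLimitPolicy may differ.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical fanout-limited traversal; not the InfiniteGraph navigation engine. */
final class FanoutLimitedWalk {
    /**
     * Breadth-first reachability up to maxDepth hops. A vertex with more than
     * maxFanout outgoing edges is reached but never traversed through.
     */
    static Set<String> reachable(Map<String, List<String>> out, String start,
                                 int maxFanout, int maxDepth) {
        Set<String> seen = new LinkedHashSet<>();
        seen.add(start);
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String v : frontier) {
                List<String> edges = out.getOrDefault(v, Collections.<String>emptyList());
                if (edges.size() > maxFanout) {
                    continue; // supernode: do not expand its edges
                }
                for (String n : edges) {
                    if (seen.add(n)) next.add(n);
                }
            }
            frontier = next;
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> out = new HashMap<>();
        out.put("me", Arrays.asList("alice", "recruiter"));
        out.put("alice", Arrays.asList("acme"));
        List<String> contacts = new ArrayList<>();
        for (int i = 1; i <= 12; i++) contacts.add("contact" + i);
        out.put("recruiter", contacts); // 12 outgoing edges: a small supernode
        System.out.println(reachable(out, "me", 10, 3));
    }
}
```

In this sample the recruiter is still counted as a direct connection, but its twelve contacts never enter the result set, so they neither slow the walk nor pollute the answer.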
Supernodes:
Edge Discovery Methods
• When walking the graph, edge discovery methods available on the vertex API allow for easy lookup.
Vertex start = …; // lookup by query or index
// Get all 'Facebook' connections
EdgePredicate edgeQualifier = new EdgePredicate(Knows.class, "how == 'Facebook'");
Iterable edgeHandles = start.getEdges(edgeQualifier);

• More edge discovery methods and optimizations
are coming!
Graph Database Challenge #3:
Writing Distributed Data
[Figure: App-1 (ingesting V1 and edge E12 {V1→V2}), App-2 (ingesting V2 and edge E23 {V2→V3}), and App-3 (ingesting V3) write concurrently through InfiniteGraph into the Objectivity/DB persistence layer, which holds V1, E12, V2, E23, and V3]
Graph Database Challenge #3:
Writing Distributed Data
• Concurrent writes (multithreaded, multiprocess and/or multiuser access) to a database that holds highly connected data lead to:
– highly contentious locking behavior
– poor write performance from retrying transactions

• NoSQL databases with relaxed consistency modes typically offer higher write performance
– System maintains data integrity (ACID), handles lock conflicts, optimizes batch processing
Writing Distributed Data:
Accelerated Ingest (Pipelining)
• InfiniteGraph offers a relaxed-consistency ingest mode, Accelerated Ingest.
– Vertex and Edge objects are placed immediately
– Edge updates are “pipelined” (no lock contention) and updates are batch processed (optimized)
– Graph is built up in the background
– Achieves the highest rate of ingest in distributed environments
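The pipelining idea can be sketched in a few lines: writers append edge updates to a queue without locking any vertex container, and a background agent later drains the queue, groups updates by container, and applies each group as one batch. The sketch below is an illustration under assumptions. The class `IngestPipelineSketch`, grouping by source vertex, and the sample edges are hypothetical, and the slides do not describe how InfiniteGraph implements its pipelines internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical pipelined edge ingest; not InfiniteGraph's Accelerated Ingest implementation. */
final class IngestPipelineSketch {
    /** One queued edge update, from a source vertex to a target vertex. */
    static final class EdgeUpdate {
        final int from;
        final int to;

        EdgeUpdate(int from, int to) {
            this.from = from;
            this.to = to;
        }
    }

    private final ConcurrentLinkedQueue<EdgeUpdate> pipeline = new ConcurrentLinkedQueue<>();
    // Source vertex id -> targets of its outgoing edges; stands in for the target containers.
    private final Map<Integer, List<Integer>> containers = new TreeMap<>();

    /** Writer side: enqueue the update without touching (or locking) any container. */
    void submit(int from, int to) {
        pipeline.add(new EdgeUpdate(from, to));
    }

    /** Agent side: drain the queue, group by container, apply one batch per container. */
    int drain() {
        Map<Integer, List<Integer>> grouped = new TreeMap<>();
        EdgeUpdate update;
        while ((update = pipeline.poll()) != null) {
            grouped.computeIfAbsent(update.from, k -> new ArrayList<>()).add(update.to);
        }
        for (Map.Entry<Integer, List<Integer>> batch : grouped.entrySet()) {
            // One write per container instead of one write per edge.
            containers.computeIfAbsent(batch.getKey(), k -> new ArrayList<>())
                      .addAll(batch.getValue());
        }
        return grouped.size(); // number of batches applied
    }

    List<Integer> edgesOf(int vertex) {
        return containers.get(vertex);
    }

    public static void main(String[] args) {
        IngestPipelineSketch ingest = new IngestPipelineSketch();
        int[][] edges = {{1, 2}, {2, 3}, {3, 1}, {2, 1}, {1, 2}, {2, 3}};
        for (int[] e : edges) ingest.submit(e[0], e[1]);
        System.out.println("batches applied: " + ingest.drain());
        System.out.println("edges of vertex 2: " + ingest.edgesOf(2));
    }
}
```

Six queued updates collapse into three container writes here. The trade-off is the relaxed consistency the slide mentions: an edge is not visible to readers until the agent has drained it.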
Writing Distributed Data:
Accelerated Ingest (Pipelining)
[Figure: Accelerated Ingest pipeline. Labels: IG Core/API, Pipeline, Pipeline Containers, Agent, Target Containers; containers C1, C2, C3; edges E12, E23; queued edge updates E(1->2), E(2->1), E(2->3), E(3->1), E(3->2)]
Accelerated Ingest Performance Results
Graph Database Challenge #4:
Tools
• Typically, when databases don’t offer tools for
analysis or visualization, the tools that are used are
general purpose.
• Tools offered by databases are generally integrated
well with native features.
– Sometimes exposing “hidden” features
– These tools can generally be useful for debugging and
development of applications built on top of the database.
Tools: The IG Visualizer
• Excellent for development and debugging of applications built on top of an IG database.
Why InfiniteGraph™?

• Objectivity/DB is a proven foundation
– Building distributed databases since 1993
– A complete database management system
• Concurrency, transactions, cache, schema, query, indexing

• It’s a Graph Specialist!
– Simple but powerful API tailored for data navigation.
– Easy to configure distribution model
QUESTIONS?

NOSQL Now! Presentation, August 24, 2011: Graph Databases: Connecting the Dot...InfiniteGraph
 

Making sense of the Graph Revolution

  • 1. www.Objectivity.com Making Sense of the Graph Revolution Nick Quinn, Principal Engineer, InfiniteGraph 11/21/13
  • 2. Why Call it a Revolution? • “a forcible overthrow of the current order in favor of a new system.” • NoSQL (Not Only SQL) – Driven by Choice + Big Data Needs: Scalable, Performing, Distributed, Highly Available
  • 3. Big Data + Graph = Big Graph Data • Social Scale – 1 billion vertices, 100 billion edges • Web Scale – 50 billion vertices, 1 trillion edges • Brain Scale – 100 billion vertices, 100 trillion edges AND GROWING!
  • 4. Why Call it a Graph Revolution? • After 2011, NoSQL and graph databases begin to follow the same trend line and forecast.
  • 5. The Growing Graph Database Landscape
  • 6. What is a Graph Database? • A graph database is a native storage engine that enables efficient storage and retrieval of graph structured data. • Graph databases are typically used: – When the data source is highly connected, – Where the connections are important (add value to the data), and – When the user access pattern requires traversals of those connections.
  • 7. What is a Graph Database? • Graph databases have a unique data model (vertices and edges). [Diagram: VERTEX and EDGE elements] • They are optimized around concurrent access of persisted data, so users can navigate the data as it is being added or updated.
  • 8. Why Use a Graph Database? Relational Database: Think about the SQL query for finding all links between the two “blue” rows... Good luck! [Diagram: Table_A through Table_G] Relational databases aren’t good at handling complex relationships!
  • 9. Why use a Graph Database? Relational Database [Diagram: Table_A through Table_G] Objectivity/DB or InfiniteGraph - The solution can be found with a few lines of code [Diagram: nodes A3 and G4]
  • 10. Why use a Graph Database?
  • 11. Specialized Graph Use Cases • Cyber Security – Identifying potential cyber threats and their targets • Network Management – Offer answers to very complex navigational queries on a social network that needs near real-time answers • Targeted Advertising – Customize marketing to the consumer by compiling a large knowledge graph with an integrated recommendation engine
  • 12. Example 1 - Ad Placement Networks Smartphone ad placement, based on the user’s profile and location data captured by opt-in applications. • The location data can be stored and distilled in a key-value and column store hybrid database, such as Cassandra. • The locations are matched with geospatial data to deduce user interests. • As ad placement orders arrive, an application built on a graph database such as InfiniteGraph matches groups of users with ads: • Maximizes relevance for the user. • Yields maximum value for the advertiser and the placer.
  • 13. Example 2 - Market Analysis The 10 companies that control a majority of U.S. consumer goods brands
  • 14. Example 3 - Seed To Consumer Tracking ?
  • 15. Supply Chain Management Use Case • Identifying the optimal route for a fleet of trucks at a particular time of the year is quite complex. – number of drivers to pay and their salaries – gas, weather patterns, timing requirements, container sizes, distances, roads, hazards, repairs • Find the optimal route during the winter, when certain highways around the Great Lakes tend to become hazardous.
  • 16. Supply Chain Management Use Case • Find the most cost-effective route in December with weather conditions X and highway conditions Y, and stay below Z latitude while optimizing costs to achieve a rush delivery
      GraphView myView = new GraphView();
      // Exclude highways with bad weather, slow traffic or too many accidents
      myView.excludeClass(myGraphDb.getTypeId(Highway.class.getName()),
          "(weather.precipitation > precipitationX && weather.temperature < temperatureX) || traffic.speed < speedY || traffic.accidents > accidentsY");
      // Exclude cities at or above latitude Z
      myView.excludeClass(myGraphDb.getTypeId(City.class.getName()), "latitude >= Z");
  • 17. Supply Chain Management Use Case
      City origin,target = …; // Use query or index to look up the "origin" and "target" cities
      VertexIdentifier resultQualifier = new VertexIdentifier(target);
      // Set policies
      PolicyChain myPolicies = new PolicyChain();
      myPolicies.addPolicy(new MaximumPathDepthPolicy(MAXIMUM_STEPS));
      myPolicies.addPolicy(new NoRevisitPolicy()); // Don't visit a city more than once
      // Define logic on how to process results
      NavigationResultHandler myNavHandler = new NavigationResultHandler() {
          @Override
          public void handleResultPath(Path result) {
              // The first path returned is the shortest path, but may not be the cheapest
              float cost = calculateCost(result);
              float time = calculateTime(result);
              // Minimize cost …
          }
          @Override
          public void handleNavigatorFinished(Navigator navigator) {}
      };
      Navigator navigator = origin.navigate(myView, Guide.DEPTH_FIRST_SEARCH,
          Qualifier.ANY /* path qualifier */, resultQualifier, myPolicies, myNavHandler);
      navigator.start();
  • 18. Graph Database Challenge #1: Reading Distributed Data • If your graph data is distributed, traversing a desired path across partitions can be extremely difficult and slow.
  • 19. Graph Database Challenge #1: Reading Distributed Data • Mitigate bottlenecks and optimize performance by using the following strategies: – Custom Placement: data isolation/localization of logically related information (to achieve close to subgraph partitioning) in order to minimize the number of network calls – Distributed Navigation Engine: Distributes the load on the partitions where the data is located.
  • 20. Reading Distributed Data: Custom Placement • Consider the case where you are placing medical data for hospitals and patients. Using a custom placement model you can achieve fairly high isolation of the subgraphs. – Doctor ↔ Hospitals, Patients ↔ Visits.
  • 21. Reading Distributed Data: Distributed Navigation Engine • Google Pregel (2010) – Batch algorithms on large graphs – Avoids passing graph state; instead sends messages – Apache Giraph, JPregel, Hama
      while any vertex is active and max iterations not reached:
          for each vertex:    (this loop is run in parallel)
              process messages from neighbors (update internal state)
              send messages to neighbors
              possibly synchronize results
              set active flag (unless no messages or state doesn’t change)
  • 22. Reading Distributed Data: Distributed Navigation Engine • Pregel is optimized for large distributed graph analytics • Limitation of Pregel logic: even when the traversal occurs locally, it still executes by sending messages from vertex to vertex • Ideally, when local, the traversal should be executed in memory and, when remote, Pregel logic should be used. – InfiniteGraph’s Distributed Navigation Engine uses the QueryServer (oqs) to achieve this optimized behavior.
  • 23. Graph Database Challenge #2: Supernodes • A supernode is a vertex with a disproportionately high number of outgoing edges. – Inefficient to traverse through these vertices
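
The superstep loop on the Pregel slide can be illustrated with a small single-process sketch. It is not Pregel, Giraph or InfiniteGraph code: the class `PregelMaxSketch`, its `run` method and the choice of "propagate the maximum vertex value" as the vertex program are all assumptions made for illustration. Every vertex that receives messages updates its state, forwards its new value only if the state changed, and otherwise votes to halt. The sketch assumes every vertex id that appears in `neighbors` also appears in `initial`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Single-process sketch of Pregel-style supersteps (hypothetical, for illustration only). */
class PregelMaxSketch {

    /**
     * Propagates the maximum vertex value along outgoing edges.
     * neighbors: vertex id -> ids it sends messages to; initial: starting value per vertex.
     */
    static Map<Integer, Integer> run(Map<Integer, List<Integer>> neighbors,
                                     Map<Integer, Integer> initial,
                                     int maxSupersteps) {
        Map<Integer, Integer> value = new HashMap<>(initial);

        // Superstep 0: every vertex is active and announces its own value.
        Map<Integer, List<Integer>> inbox = new HashMap<>();
        for (Map.Entry<Integer, Integer> e : value.entrySet()) {
            send(neighbors, inbox, e.getKey(), e.getValue());
        }

        // Continue while messages are in flight and the iteration cap is not reached.
        for (int step = 1; step <= maxSupersteps && !inbox.isEmpty(); step++) {
            Map<Integer, List<Integer>> nextInbox = new HashMap<>();
            // A real engine would run this loop in parallel across partitions.
            for (Map.Entry<Integer, List<Integer>> e : inbox.entrySet()) {
                int vertex = e.getKey();
                int best = value.get(vertex);
                for (int message : e.getValue()) {
                    best = Math.max(best, message);
                }
                if (best > value.get(vertex)) {
                    value.put(vertex, best);                  // state changed: stay active
                    send(neighbors, nextInbox, vertex, best);
                }
                // Otherwise the vertex votes to halt by sending nothing.
            }
            inbox = nextInbox;
        }
        return value;
    }

    private static void send(Map<Integer, List<Integer>> neighbors,
                             Map<Integer, List<Integer>> inbox, int from, int message) {
        for (int to : neighbors.getOrDefault(from, Collections.<Integer>emptyList())) {
            inbox.computeIfAbsent(to, k -> new ArrayList<>()).add(message);
        }
    }
}
```

Only messages cross between vertices here; no vertex reads another vertex's state directly, which is the property that lets a real engine place vertices on different machines.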
  • 24. Supernodes (Avoid the Tonight Show!) In the IMDB data set, some examples of supernodes may be talk shows, awards shows, compilations or variety shows.
  • 25. Supernodes: GraphViews and Policies • With InfiniteGraph, we offer two strategies for addressing the supernode problem within the navigation context. – Use GraphViews to filter out vertex or edge types – Globally limit the number of edges traversed using the FanoutLimitPolicy
  • 26. Supernodes: GraphViews and Policies • Consider calculating the number of links to interesting companies on LinkedIn. – If you are connected to recruiters, the navigation can be slowed down, and its result set possibly polluted, by traversing through these recruiters.
      GraphView myView = new GraphView();
      // Filter out people whose profession mentions "recruiter"
      myView.excludeClass(myGraphDb.getTypeId(Person.class.getName()), "CONTAINS(profession, 'recruiter')");
      PolicyChain chain = new PolicyChain();
      // Limits # of edges traversed to 10
      chain.addPolicy(new FanoutLimitPolicy(10));
  • 27. Supernodes: Edge Discovery Methods • If walking the graph, edge discovery methods available on the vertex API allow for easy lookup.
      Vertex start = …; // lookup by query or index
      // Get all 'Facebook' connections
      EdgePredicate edgeQualifier = new EdgePredicate(Knows.class, "how == 'Facebook'");
      Iterable edgeHandles = start.getEdges(edgeQualifier);
    • More edge discovery methods and optimizations are coming!
  • 28. Graph Database Challenge #3: Writing Distributed Data [Diagram: App-1 ingests V1 and E12 {V1→V2}, App-2 ingests V2 and E23 {V2→V3}, App-3 ingests V3; all write through InfiniteGraph to the Objectivity/DB persistence layer, which holds V1, E12, V2, E23, V3]
  • 29. Graph Database Challenge #3: Writing Distributed Data • Concurrent writes (multithreaded, multiprocess and/or multiuser access) to a database that holds highly connected data lead to highly contentious locking behavior, poor write performance and retried transactions • NoSQL databases with relaxed consistency modes typically offer higher write performance – System maintains data integrity (ACID), handles lock conflicts, optimizes batch processing
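
To make the effect of a fanout limit concrete, here is a plain-Java sketch of a breadth-first traversal that follows at most a fixed number of outgoing edges per vertex. It does not use the InfiniteGraph API, and it is only an assumption about what a fanout limit means; the exact semantics of `FanoutLimitPolicy` are not described in the slides. The class `FanoutLimitSketch` and its `reachable` method are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Plain-Java illustration of capping fanout during traversal (hypothetical, not the IG API). */
class FanoutLimitSketch {

    /** Breadth-first reachability that follows at most fanoutLimit outgoing edges per vertex. */
    static Set<String> reachable(Map<String, List<String>> edges, String start, int fanoutLimit) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String vertex = queue.remove();
            int followed = 0;
            for (String next : edges.getOrDefault(vertex, Collections.<String>emptyList())) {
                if (followed >= fanoutLimit) {
                    break; // cap the work spent on a supernode
                }
                followed++;
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return seen;
    }
}
```

The trade-off is visible in the sketch: a low limit bounds the cost of passing through a supernode, but vertices behind the skipped edges are never reported.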
  • 30. Writing Distributed Data: Accelerated Ingest (Pipelining) • InfiniteGraph offers relaxed consistency ingest mode, Accelerated Ingest. – Vertex, Edge objects are placed immediately – Edge updates are “pipelined” (no lock contention) and updates are batch processed (optimized) – Graph is built up in background – Achieves highest rate of ingest in distributed environments
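
The pipelining idea can be sketched in a few lines of plain Java. This is not InfiniteGraph's implementation: `IngestPipelineSketch`, `submitEdge`, `drain` and `neighbors` are hypothetical names, and an in-memory map stands in for the persisted graph. Writers only append edge updates to a queue; a background step later groups the queued updates by source vertex and applies each group as one batch, so readers see the edges only after the drain (the relaxed-consistency part).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Minimal sketch of pipelined edge ingest (hypothetical, for illustration only). */
class IngestPipelineSketch {
    private final List<int[]> pending = new ArrayList<>();                  // queued edge updates
    private final Map<Integer, List<Integer>> adjacency = new HashMap<>();  // stand-in for the stored graph
    int batchesApplied = 0;

    /** Writers append to the queue and never touch a vertex's edge list directly. */
    synchronized void submitEdge(int from, int to) {
        pending.add(new int[] {from, to});
    }

    /** Background agent: group queued edges by source vertex and apply each group as one batch. */
    synchronized void drain() {
        Map<Integer, List<Integer>> bySource = new TreeMap<>();
        for (int[] edge : pending) {
            bySource.computeIfAbsent(edge[0], k -> new ArrayList<>()).add(edge[1]);
        }
        pending.clear();
        for (Map.Entry<Integer, List<Integer>> batch : bySource.entrySet()) {
            adjacency.computeIfAbsent(batch.getKey(), k -> new ArrayList<>())
                     .addAll(batch.getValue());
            batchesApplied++;
        }
    }

    /** Returns a copy of the edges that have already been applied for the given vertex. */
    synchronized List<Integer> neighbors(int vertex) {
        return new ArrayList<>(adjacency.getOrDefault(vertex, new ArrayList<>()));
    }
}
```

Three submitted edges from two source vertices result in two batch applications rather than three individual locked updates, which is where the ingest speed-up in this model comes from.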
  • 31. Writing Distributed Data: Accelerated Ingest (Pipelining) [Diagram: the IG Core/API writes edge updates such as E(1->2), E(2->3), E(3->1), E(2->1) and E(3->2) into pipeline containers; an agent moves them from the pipeline into the target containers C1, C2 and C3]
  • 33. Graph Database Challenge #4: Tools • Typically, when databases don’t offer tools for analysis or visualization, the tools that are used are general purpose. • Tools offered by databases are generally integrated well with native features. – Sometimes exposing “hidden” features – These tools can generally be useful for debugging and development of applications built on top of the database.
  • 34. Tools: The IG Visualizer • Excellent for development and debugging of applications built on top of an IG database.
  • 35. Why InfiniteGraph™? • Objectivity/DB is a proven foundation – Building distributed databases since 1993 – A complete database management system • Concurrency, transactions, cache, schema, query, indexing • It’s a Graph Specialist! – Simple but powerful API tailored for data navigation. – Easy to configure distribution model