SlideShare ist ein Scribd-Unternehmen logo
1 von 30
http://bigdata.com 
http://mapgraph.io 
SYSTAP, LLC 
Graphs 
Graph Databases 
Graph Analytics on GPUs 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
1 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Graphs 
• This talk is about recent advances in large scale graph processing on 
GPUs. 
– The motivation is extreme performance. 
– Everything we do (as a company) is focused on graphs. 
• Graph Database 
• Graph processing 
• Common characteristics: 
– irregular data shape, irregular access patterns, and irregular parallelism. 
• A lot of data can be mapped onto graphs 
– Sparse matrices and graphs are very close data structures 
– Graphs, as we deal with them, have attributes on vertices and edges. 
• A lot of algorithms can be mapped onto graphs 
– Including many machine learning algorithms. 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
2 
http://www.bigdata.com/blog
Small Business, Founded 2006 100% Employee Owned 
http://bigdata.com 
http://mapgraph.io 
SYSTAP, LLC 
Graph Database 
• High performance, Scalable 
– 50B edges/node 
– High level query language 
– Efficient Graph Traversal 
– High 9s solution 
• Open Source 
– Subscriptions 
GPU Analytics 
• Extreme Performance 
– 5-100x faster than graphlab 
– 10,000x faster than graphdbs 
• DARPA funding 
• Disruptive technology 
– Early adopters 
– Huge ROIs 
• Open Source 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
3 
9/19/2014
Related “Graph” Technologies 
Redpoint repositions existing technology, 
adding interoperability for blueprints and 
gremlin. 
http://bigdata.com 
http://mapgraph.io 
STTR 
Pair up bigdata 
and MapGraph 
MapGraph compares favorably with high end 
hardware solutions from YARC, Oracle, and 
SAP, but is open source and uses commodity 
hardware. 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
4 
9/19/2014
Embedded, Single Server, HA, Scale-out 
• RDF/SPARQL 
• Property graphs 
– Blueprints, gremlin, rexter 
• REST API (NSS) 
• Extension points 
– Stored queries for custom 
application logic on the 
server. 
– Custom services & indices 
– Custom functions 
– Vertex-centric programs 
http://bigdata.com 
http://mapgraph.io 
• Embedded Server 
Journal 
JVM 
• Standalone Server 
Journal 
WAR 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
5 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
High Availability 
• Shared nothing architecture 
– Same data on each node 
– Coordinate only at commit 
– Transparent load balancing 
• Scaling 
– 50 billion triples or quads 
– Query throughput scales linearly 
• Self healing 
– Automatic failover 
– Automatic resync after disconnect 
– Online single node disaster recovery 
• Online Backup 
– Online snapshots (full backups) 
– HA Logs (incremental backups) 
• Point in time recovery (offline) 
HAService 
Quorum 
k=3 
size=3 
leader 
follower 
HAService 
HAService 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
6 
9/19/2014
Embedded, Single Server, HA, Scale-out 
http://bigdata.com 
http://mapgraph.io 
RDF Data and SPARQL Query 
Client Service 
Distributed Index Management and Query 
Management Functions 
Client Service 
Registrar 
Data Service 
Client Service 
Data Service Data Service Data Service 
Data Service Data Service Data Service 
Zookeeper 
Shard Locator 
Transaction Mgr 
Load Balancer 
Unified API 
Application 
Client 
Application 
Client 
Application 
Client 
Application 
Client 
Application 
Client 
Client Service 
SPARQL XML 
SPARQL JSON 
RDF/XML 
N-Triples 
N-Quads 
Turtle 
TriG 
RDF/JSON 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
7 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
And now on GPUs 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
8 
9/19/2014
Similar models, different problems 
• Graph query and graph analytics (traversal/mining) 
– Related data models 
– Very different computational requirements 
• Many technologies are a bad match or limited solution 
– Key-value stores (bigtable, Accumulo, Cassandra, HBase) 
– Map-reduce 
• Anti-pattern 
– Dump all data into “big bucket” 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
9 
9/19/2014
Similar models, different problems 
• Graph query and graph analytics (traversal/mining) 
– Related data models 
– Very different computational requirements 
• Many technologies are a bad match or limited solution 
– Key-value stores (bigtable, Accumulo, Cassandra, HBase) 
– Map-reduce 
• Anti-pattern 
– Dump all data into “big bucket” 
Storage and computation patterns must be 
correctly matched for high performance. 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
10 
9/19/2014
Optimize for the right problem 
• Graph analytics 
– Parallelism – work must be distributed and balanced. 
– Memory bandwidth – memory, not disk, is the bottleneck 
– 2D partitioning – O(log(N)) communications pattern (versus O(N*N)) 
• 1D design looses locality when updating link weights for reverse indices. 
BFS PR 
• Storage and computation patterns must be correctly matched 
for high performance. 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
11 
9/19/2014
GPUs – A Game Changer for Graph Analytics 
• Graphs are a hard problem 
• Non-locality 
• Data dependent parallelism 
• Memory, PCIe bus and network are 
bottlenecks 
• Recent performance gains driven by 
innovations in bottom-up search, 
data layout, and partitioning. 
• GPUs deliver effective parallelism 
• 10x CPU FLOPS 
• 10x CPU/RAM bandwidth 
• Significant speeds up over CPU 
• 3 GTEPS on one GPU 
• 32 GTEPS on 64 GPU cluster 
http://bigdata.com 
http://mapgraph.io 
3500 
3000 
2500 
2000 
1500 
1000 
500 
0 
Breadth-First Search on Graphs 
NVIDIA Tesla C2050 
Multicore per socket 
Sequential 
0 
2 
1 10 100 1000 10000 100000 
Million Traversed Edges per 
Second 
Average Traversal Depth 
1 
1 
2 
1 
1 
2 
2 
2 
1 
3 
2 
3 
2 
1 2 
2 
10x Speedup on GPUs
http://bigdata.com 
http://mapgraph.io 
GPU Hardware Trends 
• K40 GPU (today) 
• 12G RAM/GPU 
• 288 GB/s bandwidth 
• PCIe Gen 3 
• Pascal GPU (Q1 2016) 
• 24G RAM/GPU 
• 1 TB/s bandwidth 
• Unified memory 
across CPU, GPUs 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
13 
9/19/2014
Full Bandwidth Access to CPU RAM 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
14 
9/19/2014
Architecture shapes performance 
• The data was a scale-free graph with 2.7M vertices and 5.6M 
– MapGraph used a larger version of the graph (24M vertices, 25M edges) 
• The query was a 5-degree subgraph (depth-limited BFS) 
• Two main takeaways 
– Horizontal scaling for titan is very expensive – wrong abstraction. 
– GPUs are ridiculously fast. 
platform load (s) 
http://bigdata.com 
http://mapgraph.io 
query 
(ms) comments 
titan 497.00 935 4 node cluster using Cassandra 
neo4j 608.00 668 single node community edition 
bigdata 396.00 281 single node (open source) 
MapGraph 0.08 27 NVIDIA K20 GPU 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
15 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
MapGraph 
Graph Processing on GPUs 
http://MapGraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
16 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Think Like a Vertex 
• Simple APIs 
pageRank(Message m) { 
total = m.value(); 
vertex.val = .15 * .85 + total; 
for(nbr : out_neighbors) { 
SendMsg(nbr, vertex.val/num_out_nbrs); 
} 
} 
• Lots of algorithms 
– BFS, SSSP, Page Rank, Connected Components, Louvain Modularity, 
Jaccard Distance, k-means clustering, Betweenness-Centrality, 
Personalized Page Rank, Loopy Belief Propagation, Graph search (crisp 
and approximate), etc. 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
17 
9/19/2014
GAS – a Graph-Parallel Abstraction 
• Graph-Parallel Vertex-Centric API ala GraphLab 
• “Think like a vertex” 
• Gather: collect information 
about my neighborhood 
• Apply: update my value 
• Scatter: signal adjacent vertices 
• Can write all sorts of graph algorithms this way 
– BFS, PageRank, Connected Component, Triangle Counting, Max Flow, 
etc. 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
18 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
MapGraph 
• High-level graph processing framework 
• High programmability 
GPU architecture 
Optimization techniques 
CUDA 
• High performance 
Comparable to low-level approach 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
19 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
MapGraph 
• High-level graph processing framework 
• High programmability 
GPU architecture 
Optimization techniques 
CUDA 
• High performance 
Comparable to low-level approach 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
20 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Single GPU MapGraph (BFS) 
Dataset #vertices #edges Max Degree Milliseconds 
Webbase 1,000,005 3,105,536 23 1.2 
Delaunay 2,097,152 6,291,408 4,700 24.5 
Bitcoin 6,297,539 28,143,065 4,075,472 345.3 
Wiki 3,566,907 45,030,389 7,061 51.0 
Kron 1,048,576 89,239,674 131,505 47.7 
154.0 
513.6 
74.8 
821.3 
1870.9 
2,000 
1,800 
1,600 
1,400 
1,200 
1,000 
800 
600 
400 
200 
0 
Webbase Delaunay Bitcoin Wiki Kron 
MTEPS 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
21 
9/19/2014
BFS Results : MapGraph vs GraphLab 
1,000.00 
100.00 
10.00 
http://bigdata.com 
http://mapgraph.io 
1.00 
0.10 
Webbase Delaunay Bitcoin Wiki Kron 
Speedup 
MapGraph Speedup vs GraphLab (BFS) 
GL-2 
GL-4 
GL-8 
GL-12 
MPG 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
22 
9/19/2014
PageRank : MapGraph vs GraphLab 
100.00 
10.00 
http://bigdata.com 
http://mapgraph.io 
1.00 
0.10 
Webbase Delaunay Bitcoin Wiki Kron 
Speedup 
MapGraph Speedup vs GraphLab (Page Rank) 
GL-2 
GL-4 
GL-8 
GL-12 
MPG 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
23 
9/19/2014
Graph Mining on GPU Clusters 
• 2D partitioning (aka vertex cuts) 
• Minimizes the communication volume. 
• Batch parallel Gather in row, Scatter in 
column. 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
24 
9/19/2014
Accelerated Graph Analytics 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
25 
9/19/2014
• Work spans multiple orders of magnitude. 
10,000,000 
1,000,000 
100,000 
10,000 
1,000 
100 
http://bigdata.com 
http://mapgraph.io 
10 
1 
Scale 25 Traversal 
0 1 2 3 4 5 6 
Frontier Size 
Iteration 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
26 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Strong Scaling 
• Speedup on a constant problem size with more GPUs 
• Problem scale 25 
– 2^25 vertices (33,554,432) 
– 2^26 directed edges (1,073,741,824) 
23 
22 
21 
20 
19 
18 
17 
16 
15 
14 
13 
10 20 30 40 50 60 70 
GTEPS 
#GPUs 
Strong scaling 
GPUs GTEPS Time (s) 
16 14.3 0.075 
25 16.4 0.066 
36 18.1 0.059 
64 22.7 0.047 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
27 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Weak Scaling 
• Scaling the problem size with more GPUs 
35 
30 
25 
20 
15 
10 
5 
0 
1 4 16 64 
GTEPS 
#GPUs 
Weak scaling 
GPUs Scale Vertices Edges Time (s) GTEPS 
1 21 2,097,152 67,108,864 0.0254 3 
4 23 8,388,608 268,435,456 0.0429 6 
16 25 33,554,432 1,073,741,824 0.0715 15 
64 27 134,217,728 4,294,967,296 0.1478 29 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
28 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Highlights 
• For algorithms on large graphs 
– Memory is the bottleneck 
• CPUs quickly saturate the memory bus. 
• CPU cache thrashing limits scaling for graph traversal. 
• Continued performance gains for CPUs focus on reducing the #of visited edges to reduce 
bandwidth. 
• Hybrid CPU/GPU architectures offload either small degree vertices (reduce cache 
thrashing) or high degree vertices (if the algorithm is FLOPS bound on the CPU, e.g., BC) 
– Many core is the future. 
– GPUs are primarily known for their FLOPS, but they have high memory bandwidth 
and can deliver effective parallelism on parallel graph problems (with sophisticated 
kernels). 
• Scaling to very large graphs on large compute clusters 
– Communications bound. 
• Communication must be constant for perfect scaling 
– Hybrid partitioning seeks to reduce #of messages, size of messages, and optimize 
for asynchronous communications and degree-aware layouts for bottom-up search 
to reduce memory bandwidth. 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
29 
http://www.bigdata.com/blog
http://bigdata.com 
http://mapgraph.io 
Bryan Thompson 
SYSTAP, LLC 
bryan@systap.com 
http://bigdata.com http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
30 
9/19/2014

Weitere ähnliche Inhalte

Was ist angesagt?

MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR Technologies
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya RaghavendraSpark Summit
 
A machine learning and data science pipeline for real companies
A machine learning and data science pipeline for real companiesA machine learning and data science pipeline for real companies
A machine learning and data science pipeline for real companiesDataWorks Summit
 
Spark & Hadoop at Production at Scale
Spark & Hadoop at Production at ScaleSpark & Hadoop at Production at Scale
Spark & Hadoop at Production at ScaleMapR Technologies
 
Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1Tugdual Grall
 
Near Data Computing Architectures: Opportunities and Challenges for Apache Spark
Near Data Computing Architectures: Opportunities and Challenges for Apache SparkNear Data Computing Architectures: Opportunities and Challenges for Apache Spark
Near Data Computing Architectures: Opportunities and Challenges for Apache SparkAhsan Javed Awan
 
Scaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedInScaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedInDataWorks Summit
 
MySql to HBase in 5 Steps
MySql to HBase in 5 StepsMySql to HBase in 5 Steps
MySql to HBase in 5 StepsScott Cinnamond
 
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UNMariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN✔ Eric David Benari, PMP
 
Geospatial Analytics at Scale with Deep Learning and Apache Spark
Geospatial Analytics at Scale with Deep Learning and Apache SparkGeospatial Analytics at Scale with Deep Learning and Apache Spark
Geospatial Analytics at Scale with Deep Learning and Apache SparkDatabricks
 
Data Warehouse Offload
Data Warehouse OffloadData Warehouse Offload
Data Warehouse OffloadJohn Berns
 
Cost Efficiency Strategies for Managed Apache Spark Service
Cost Efficiency Strategies for Managed Apache Spark ServiceCost Efficiency Strategies for Managed Apache Spark Service
Cost Efficiency Strategies for Managed Apache Spark ServiceDatabricks
 
Performing Simulation-Based, Real-time Decision Making with Cloud HPC
Performing Simulation-Based, Real-time Decision Making with Cloud HPCPerforming Simulation-Based, Real-time Decision Making with Cloud HPC
Performing Simulation-Based, Real-time Decision Making with Cloud HPCinside-BigData.com
 
Distributed Heterogeneous Mixture Learning On Spark
Distributed Heterogeneous Mixture Learning On SparkDistributed Heterogeneous Mixture Learning On Spark
Distributed Heterogeneous Mixture Learning On SparkSpark Summit
 
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 Apache AGE and the synergy effect in the combination of Postgres and NoSQL Apache AGE and the synergy effect in the combination of Postgres and NoSQL
Apache AGE and the synergy effect in the combination of Postgres and NoSQLEDB
 
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...DataWorks Summit
 
Preventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive IndustryPreventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive IndustryDataWorks Summit/Hadoop Summit
 
Improving MySQL performance with Hadoop
Improving MySQL performance with HadoopImproving MySQL performance with Hadoop
Improving MySQL performance with HadoopSagar Jauhari
 
Harnessing Spark Catalyst for Custom Data Payloads
Harnessing Spark Catalyst for Custom Data PayloadsHarnessing Spark Catalyst for Custom Data Payloads
Harnessing Spark Catalyst for Custom Data PayloadsSimeon Fitch
 
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA DeploymentsCisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA DeploymentsMapR Technologies
 

Was ist angesagt? (20)

MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT Better
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
 
A machine learning and data science pipeline for real companies
A machine learning and data science pipeline for real companiesA machine learning and data science pipeline for real companies
A machine learning and data science pipeline for real companies
 
Spark & Hadoop at Production at Scale
Spark & Hadoop at Production at ScaleSpark & Hadoop at Production at Scale
Spark & Hadoop at Production at Scale
 
Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1
 
Near Data Computing Architectures: Opportunities and Challenges for Apache Spark
Near Data Computing Architectures: Opportunities and Challenges for Apache SparkNear Data Computing Architectures: Opportunities and Challenges for Apache Spark
Near Data Computing Architectures: Opportunities and Challenges for Apache Spark
 
Scaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedInScaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedIn
 
MySql to HBase in 5 Steps
MySql to HBase in 5 StepsMySql to HBase in 5 Steps
MySql to HBase in 5 Steps
 
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UNMariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
 
Geospatial Analytics at Scale with Deep Learning and Apache Spark
Geospatial Analytics at Scale with Deep Learning and Apache SparkGeospatial Analytics at Scale with Deep Learning and Apache Spark
Geospatial Analytics at Scale with Deep Learning and Apache Spark
 
Data Warehouse Offload
Data Warehouse OffloadData Warehouse Offload
Data Warehouse Offload
 
Cost Efficiency Strategies for Managed Apache Spark Service
Cost Efficiency Strategies for Managed Apache Spark ServiceCost Efficiency Strategies for Managed Apache Spark Service
Cost Efficiency Strategies for Managed Apache Spark Service
 
Performing Simulation-Based, Real-time Decision Making with Cloud HPC
Performing Simulation-Based, Real-time Decision Making with Cloud HPCPerforming Simulation-Based, Real-time Decision Making with Cloud HPC
Performing Simulation-Based, Real-time Decision Making with Cloud HPC
 
Distributed Heterogeneous Mixture Learning On Spark
Distributed Heterogeneous Mixture Learning On SparkDistributed Heterogeneous Mixture Learning On Spark
Distributed Heterogeneous Mixture Learning On Spark
 
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 Apache AGE and the synergy effect in the combination of Postgres and NoSQL Apache AGE and the synergy effect in the combination of Postgres and NoSQL
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
 
Preventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive IndustryPreventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive Industry
 
Improving MySQL performance with Hadoop
Improving MySQL performance with HadoopImproving MySQL performance with Hadoop
Improving MySQL performance with Hadoop
 
Harnessing Spark Catalyst for Custom Data Payloads
Harnessing Spark Catalyst for Custom Data PayloadsHarnessing Spark Catalyst for Custom Data Payloads
Harnessing Spark Catalyst for Custom Data Payloads
 
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA DeploymentsCisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
 

Ähnlich wie Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATL

Deploying Cassandra Multi-cloud
Deploying Cassandra Multi-cloudDeploying Cassandra Multi-cloud
Deploying Cassandra Multi-cloudJeffrey Carpenter
 
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, BlazegraphDatabase Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph✔ Eric David Benari, PMP
 
MapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community EditionMapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community EditionMapR Technologies
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsConnected Data World
 
High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark DataWorks Summit/Hadoop Summit
 
Big Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big GraphsBig Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big GraphsPetr Novotný
 
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016Mathieu Dumoulin
 
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...DataStax
 
Infrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep LearningInfrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep LearningSergey Karayev
 
A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...
A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...
A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...Facultad de Informática UCM
 
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017MLconf
 
Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)Anthony Baker
 
TriHUG - Beyond Batch
TriHUG - Beyond BatchTriHUG - Beyond Batch
TriHUG - Beyond Batchboorad
 
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57AAMIR FAROOQUI
 
Network Engineering for High Speed Data Sharing
Network Engineering for High Speed Data SharingNetwork Engineering for High Speed Data Sharing
Network Engineering for High Speed Data SharingGlobus
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainMapR Technologies
 
True Reusable Code - DevSum2016
True Reusable Code - DevSum2016True Reusable Code - DevSum2016
True Reusable Code - DevSum2016Eduard Lazar
 
Apache Geode Meetup, Cork, Ireland at CIT
Apache Geode Meetup, Cork, Ireland at CITApache Geode Meetup, Cork, Ireland at CIT
Apache Geode Meetup, Cork, Ireland at CITApache Geode
 
Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...
Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...
Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...Caserta
 

Ähnlich wie Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATL (20)

Deploying Cassandra Multi-cloud
Deploying Cassandra Multi-cloudDeploying Cassandra Multi-cloud
Deploying Cassandra Multi-cloud
 
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, BlazegraphDatabase Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
 
MapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community EditionMapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community Edition
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
 
High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark
 
Big Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big GraphsBig Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big Graphs
 
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016
 
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
 
Infrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep LearningInfrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep Learning
 
A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...
A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...
A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...
 
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
 
Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)
 
TriHUG - Beyond Batch
TriHUG - Beyond BatchTriHUG - Beyond Batch
TriHUG - Beyond Batch
 
El Punto Neutro de Internet en Cataluña
El Punto Neutro de Internet en CataluñaEl Punto Neutro de Internet en Cataluña
El Punto Neutro de Internet en Cataluña
 
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57
 
Network Engineering for High Speed Data Sharing
Network Engineering for High Speed Data SharingNetwork Engineering for High Speed Data Sharing
Network Engineering for High Speed Data Sharing
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
 
True Reusable Code - DevSum2016
True Reusable Code - DevSum2016True Reusable Code - DevSum2016
True Reusable Code - DevSum2016
 
Apache Geode Meetup, Cork, Ireland at CIT
Apache Geode Meetup, Cork, Ireland at CITApache Geode Meetup, Cork, Ireland at CIT
Apache Geode Meetup, Cork, Ireland at CIT
 
Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...
Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...
Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...
 

Mehr von MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingMLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushMLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceMLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionMLconf
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLMLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
 

Mehr von MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Kürzlich hochgeladen

Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 

Kürzlich hochgeladen (20)

Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATL

  • 1. http://bigdata.com http://mapgraph.io SYSTAP, LLC Graphs Graph Databases Graph Analytics on GPUs SYSTAP™, LLC © 2006-2014 All Rights Reserved 1 9/19/2014
  • 2. http://bigdata.com http://mapgraph.io Graphs • This talk is about recent advances in large scale graph processing on GPUs. – The motivation is extreme performance. – Everything we do (as a company) is focused on graphs. • Graph Database • Graph processing • Common characteristics: – irregular data shape, irregular access patterns, and irregular parallelism. • A lot of data can be mapped onto graphs – Sparse matrices and graphs are very close data structures – Graphs, as we deal with them, have attributes on vertices and edges. • A lot of algorithms can be mapped onto graphs – Including many machine learning algorithms. SYSTAP™, LLC © 2006-2014 All Rights Reserved 2 http://www.bigdata.com/blog
  • 3. Small Business, Founded 2006 100% Employee Owned http://bigdata.com http://mapgraph.io SYSTAP, LLC Graph Database • High performance, Scalable – 50B edges/node – High level query language – Efficient Graph Traversal – High 9s solution • Open Source – Subscriptions GPU Analytics • Extreme Performance – 5-100x faster than graphlab – 10,000x faster than graphdbs • DARPA funding • Disruptive technology – Early adopters – Huge ROIs • Open Source SYSTAP™, LLC © 2006-2014 All Rights Reserved 3 9/19/2014
  • 4. Related “Graph” Technologies Redpoint repositions existing technology, adding interoperability for blueprints and gremlin. http://bigdata.com http://mapgraph.io STTR Pair up bigdata and MapGraph MapGraph compares favorably with high end hardware solutions from YARC, Oracle, and SAP, but is open source and uses commodity hardware. SYSTAP™, LLC © 2006-2014 All Rights Reserved 4 9/19/2014
  • 5. Embedded, Single Server, HA, Scale-out • RDF/SPARQL • Property graphs – Blueprints, gremlin, rexter • REST API (NSS) • Extension points – Stored queries for custom application logic on the server. – Custom services & indices – Custom functions – Vertex-centric programs http://bigdata.com http://mapgraph.io • Embedded Server Journal JVM • Standalone Server Journal WAR SYSTAP™, LLC © 2006-2014 All Rights Reserved 5 9/19/2014
  • 6. http://bigdata.com http://mapgraph.io High Availability • Shared nothing architecture – Same data on each node – Coordinate only at commit – Transparent load balancing • Scaling – 50 billion triples or quads – Query throughput scales linearly • Self healing – Automatic failover – Automatic resync after disconnect – Online single node disaster recovery • Online Backup – Online snapshots (full backups) – HA Logs (incremental backups) • Point in time recovery (offline) HAService Quorum k=3 size=3 leader follower HAService HAService SYSTAP™, LLC © 2006-2014 All Rights Reserved 6 9/19/2014
  • 7. Embedded, Single Server, HA, Scale-out http://bigdata.com http://mapgraph.io RDF Data and SPARQL Query Client Service Distributed Index Management and Query Management Functions Client Service Registrar Data Service Client Service Data Service Data Service Data Service Data Service Data Service Data Service Zookeeper Shard Locator Transaction Mgr Load Balancer Unified API Application Client Application Client Application Client Application Client Application Client Client Service SPARQL XML SPARQL JSON RDF/XML N-Triples N-Quads Turtle TriG RDF/JSON SYSTAP™, LLC © 2006-2014 All Rights Reserved 7 9/19/2014
  • 8. http://bigdata.com http://mapgraph.io And now on GPUs SYSTAP™, LLC © 2006-2014 All Rights Reserved 8 9/19/2014
  • 9. Similar models, different problems • Graph query and graph analytics (traversal/mining) – Related data models – Very different computational requirements • Many technologies are a bad match or limited solution – Key-value stores (bigtable, Accumulo, Cassandra, HBase) – Map-reduce • Anti-pattern – Dump all data into “big bucket” http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 9 9/19/2014
  • 10. Similar models, different problems • Graph query and graph analytics (traversal/mining) – Related data models – Very different computational requirements • Many technologies are a bad match or limited solution – Key-value stores (bigtable, Accumulo, Cassandra, HBase) – Map-reduce • Anti-pattern – Dump all data into “big bucket” Storage and computation patterns must be correctly matched for high performance. http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 10 9/19/2014
  • 11. Optimize for the right problem • Graph analytics – Parallelism – work must be distributed and balanced. – Memory bandwidth – memory, not disk, is the bottleneck – 2D partitioning – O(log(N)) communications pattern (versus O(N*N)) • 1D design looses locality when updating link weights for reverse indices. BFS PR • Storage and computation patterns must be correctly matched for high performance. http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 11 9/19/2014
  • 12. GPUs – A Game Changer for Graph Analytics • Graphs are a hard problem • Non-locality • Data dependent parallelism • Memory, PCIe bus and network are bottlenecks • Recent performance gains driven by innovations in bottom-up search, data layout, and partitioning. • GPUs deliver effective parallelism • 10x CPU FLOPS • 10x CPU/RAM bandwidth • Significant speeds up over CPU • 3 GTEPS on one GPU • 32 GTEPS on 64 GPU cluster http://bigdata.com http://mapgraph.io 3500 3000 2500 2000 1500 1000 500 0 Breadth-First Search on Graphs NVIDIA Tesla C2050 Multicore per socket Sequential 0 2 1 10 100 1000 10000 100000 Million Traversed Edges per Second Average Traversal Depth 1 1 2 1 1 2 2 2 1 3 2 3 2 1 2 2 10x Speedup on GPUs
  • 13. http://bigdata.com http://mapgraph.io GPU Hardware Trends • K40 GPU (today) • 12G RAM/GPU • 288 GB/s bandwidth • PCIe Gen 3 • Pascal GPU (Q1 2016) • 24G RAM/GPU • 1 TB/s bandwidth • Unified memory across CPU, GPUs SYSTAP™, LLC © 2006-2014 All Rights Reserved 13 9/19/2014
  • 14. Full Bandwidth Access to CPU RAM http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 14 9/19/2014
  • 15. Architecture shapes performance • The data was a scale-free graph with 2.7M vertices and 5.6M – MapGraph used a larger version of the graph (24M vertices, 25M edges) • The query was a 5-degree subgraph (depth-limited BFS) • Two main takeaways – Horizontal scaling for titan is very expensive – wrong abstraction. – GPUs are ridiculously fast. platform load (s) http://bigdata.com http://mapgraph.io query (ms) comments titan 497.00 935 4 node cluster using Cassandra neo4j 608.00 668 single node community edition bigdata 396.00 281 single node (open source) MapGraph 0.08 27 NVIDIA K20 GPU SYSTAP™, LLC © 2006-2014 All Rights Reserved 15 9/19/2014
  • 16. http://bigdata.com http://mapgraph.io MapGraph Graph Processing on GPUs http://MapGraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 16 9/19/2014
  • 17. http://bigdata.com http://mapgraph.io Think Like a Vertex • Simple APIs pageRank(Message m) { total = m.value(); vertex.val = .15 * .85 + total; for(nbr : out_neighbors) { SendMsg(nbr, vertex.val/num_out_nbrs); } } • Lots of algorithms – BFS, SSSP, Page Rank, Connected Components, Louvain Modularity, Jaccard Distance, k-means clustering, Betweenness-Centrality, Personalized Page Rank, Loopy Belief Propagation, Graph search (crisp and approximate), etc. SYSTAP™, LLC © 2006-2014 All Rights Reserved 17 9/19/2014
  • 18. GAS – a Graph-Parallel Abstraction • Graph-Parallel Vertex-Centric API ala GraphLab • “Think like a vertex” • Gather: collect information about my neighborhood • Apply: update my value • Scatter: signal adjacent vertices • Can write all sorts of graph algorithms this way – BFS, PageRank, Connected Component, Triangle Counting, Max Flow, etc. http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 18 9/19/2014
  • 19. http://bigdata.com http://mapgraph.io MapGraph • High-level graph processing framework • High programmability GPU architecture Optimization techniques CUDA • High performance Comparable to low-level approach SYSTAP™, LLC © 2006-2014 All Rights Reserved 19 9/19/2014
  • 20. http://bigdata.com http://mapgraph.io MapGraph • High-level graph processing framework • High programmability GPU architecture Optimization techniques CUDA • High performance Comparable to low-level approach SYSTAP™, LLC © 2006-2014 All Rights Reserved 20 9/19/2014
  • 21. http://bigdata.com http://mapgraph.io Single GPU MapGraph (BFS) Dataset #vertices #edges Max Degree Milliseconds Webbase 1,000,005 3,105,536 23 1.2 Delaunay 2,097,152 6,291,408 4,700 24.5 Bitcoin 6,297,539 28,143,065 4,075,472 345.3 Wiki 3,566,907 45,030,389 7,061 51.0 Kron 1,048,576 89,239,674 131,505 47.7 154.0 513.6 74.8 821.3 1870.9 2,000 1,800 1,600 1,400 1,200 1,000 800 600 400 200 0 Webbase Delaunay Bitcoin Wiki Kron MTEPS SYSTAP™, LLC © 2006-2014 All Rights Reserved 21 9/19/2014
  • 22. BFS Results : MapGraph vs GraphLab 1,000.00 100.00 10.00 http://bigdata.com http://mapgraph.io 1.00 0.10 Webbase Delaunay Bitcoin Wiki Kron Speedup MapGraph Speedup vs GraphLab (BFS) GL-2 GL-4 GL-8 GL-12 MPG SYSTAP™, LLC © 2006-2014 All Rights Reserved 22 9/19/2014
  • 23. PageRank : MapGraph vs GraphLab 100.00 10.00 http://bigdata.com http://mapgraph.io 1.00 0.10 Webbase Delaunay Bitcoin Wiki Kron Speedup MapGraph Speedup vs GraphLab (Page Rank) GL-2 GL-4 GL-8 GL-12 MPG SYSTAP™, LLC © 2006-2014 All Rights Reserved 23 9/19/2014
  • 24. Graph Mining on GPU Clusters • 2D partitioning (aka vertex cuts) • Minimizes the communication volume. • Batch parallel Gather in row, Scatter in column. http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 24 9/19/2014
  • 25. Accelerated Graph Analytics http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 25 9/19/2014
  • 26. • Work spans multiple orders of magnitude. 10,000,000 1,000,000 100,000 10,000 1,000 100 http://bigdata.com http://mapgraph.io 10 1 Scale 25 Traversal 0 1 2 3 4 5 6 Frontier Size Iteration SYSTAP™, LLC © 2006-2014 All Rights Reserved 26 9/19/2014
  • 27. http://bigdata.com http://mapgraph.io Strong Scaling • Speedup on a constant problem size with more GPUs • Problem scale 25 – 2^25 vertices (33,554,432) – 2^26 directed edges (1,073,741,824) 23 22 21 20 19 18 17 16 15 14 13 10 20 30 40 50 60 70 GTEPS #GPUs Strong scaling GPUs GTEPS Time (s) 16 14.3 0.075 25 16.4 0.066 36 18.1 0.059 64 22.7 0.047 SYSTAP™, LLC © 2006-2014 All Rights Reserved 27 9/19/2014
  • 28. http://bigdata.com http://mapgraph.io Weak Scaling • Scaling the problem size with more GPUs 35 30 25 20 15 10 5 0 1 4 16 64 GTEPS #GPUs Weak scaling GPUs Scale Vertices Edges Time (s) GTEPS 1 21 2,097,152 67,108,864 0.0254 3 4 23 8,388,608 268,435,456 0.0429 6 16 25 33,554,432 1,073,741,824 0.0715 15 64 27 134,217,728 4,294,967,296 0.1478 29 SYSTAP™, LLC © 2006-2014 All Rights Reserved 28 9/19/2014
  • 29. http://bigdata.com http://mapgraph.io Highlights • For algorithms on large graphs – Memory is the bottleneck • CPUs quickly saturate the memory bus. • CPU cache thrashing limits scaling for graph traversal. • Continued performance gains for CPUs focus on reducing the #of visited edges to reduce bandwidth. • Hybrid CPU/GPU architectures offload either small degree vertices (reduce cache thrashing) or high degree vertices (if the algorithm is FLOPS bound on the CPU, e.g., BC) – Many core is the future. – GPUs are primarily known for their FLOPS, but they have high memory bandwidth and can deliver effective parallelism on parallel graph problems (with sophisticated kernels). • Scaling to very large graphs on large compute clusters – Communications bound. • Communication must be constant for perfect scaling – Hybrid partitioning seeks to reduce #of messages, size of messages, and optimize for asynchronous communications and degree-aware layouts for bottom-up search to reduce memory bandwidth. SYSTAP™, LLC © 2006-2014 All Rights Reserved 29 http://www.bigdata.com/blog
  • 30. http://bigdata.com http://mapgraph.io Bryan Thompson SYSTAP, LLC bryan@systap.com http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 30 9/19/2014