SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Downloaden Sie, um offline zu lesen
Brad Rees, Connected Data London, Oct 4th, 2019
cuGraph
Accelerating all your Graph Analytic Needs
2
Brad
Rees
Name
NVIDIA
Sr
Manager
cuGraph
Lead
PhD
Community
Detection in
Social
Networks
> 30
years
education
experience
Cyber
SNA
works at
Graph
Computer
Science
>20years
HPC
Big
Data
3
WE ARE
CONNECTED
7 degrees of Kevin Bacon
Duncan Watts & Steven Strogatz
Collective dynamics of
‘small-world’ networks - 1998
And have
always been
connected The small-world problem - 1968
Stanley Milgram (social psychologist)
1929
4
CONNECTEDNESS
CAPTURED AS A
GRAPH
As well as
associated
information,
knowledge,
metadata, etc..
5
AND THERE ARE
A LOT OF GRAPH
FRAMEWORKS
In lots of variations
Neo4j
TigerGraph
AnzoGraph
RedisGraph
Oracle
Product names are the property of the owners
GraphX
Pegasus
Pregel
GraphLab
Giraph
Graphulo
PowerGraph
GaloisLigra
Gunrock
GraphBLAS
Stinger
HornetcuGraph
NetworkX
NetworkX
6
Why cuGraph?
More generally, why RAPIDS?
A) Graph is not an isolated function, and
needs to be part of the complete Data Science
Process.
And Graph are just cool
7
Speed, UX, and Iteration
The Way to Win at Data Science
Slide borrowed from Francois Chollet
8
cuDF cuIO
Analytics
GPU Memory
Data Preparation VisualizationModel Training
cuML
Machine Learning
cuGraph
Graph Analytics
PyTorch Chainer MxNet
Deep Learning
cuXfilter <> pyViz
Visualization
Enter
End-to-End Accelerated GPU Data Science
Dask
Reduce Data Movement and Keep All Processing on the GPU
9
ETL - the Backbone of Data Science
cuDF is…
Python Library
● A Python library for manipulating GPU DataFrames
following the Pandas API
● Python interface to CUDA C++ library with
additional functionality
● Creating GPU DataFrames from Numpy arrays,
Pandas DataFrames, and PyArrow Tables
● JIT compilation of User-Defined Functions (UDFs)
using Numba
● String Support
10
Extraction is the Cornerstone of ETL
cuIO is born
• Follows the APIs of Pandas and provide >10x
speedup
• CSV Reader - v0.2, CSV Writer v0.8
• Parquet Reader – v0.7
• ORC Reader – v0.7
• JSON Reader - v0.8
• Avro Reader - v0.9
• HDF5 Reader - v0.10
• Key is GPU-accelerating both parsing and
decompression wherever possible
Source: Apache Crail blog: SQL Performance: Part 1 - Input File Formats
11
cuML Machine Learning
GPU-accelerated Scikit-Learn
Classification / Regression
Statistical Inference
Clustering
Decomposition & Dimensionality Reduction
Time Series Forecasting
Recommendations
Decision Trees / Random Forests
Linear Regression
Logistic Regression
K-Nearest Neighbors
Kalman Filtering
Bayesian Inference
Gaussian Mixture Models
Hidden Markov Models
K-Means
DBSCAN
Spectral Clustering
Principal Components
Singular Value Decomposition
UMAP
Spectral Embedding
ARIMA
Holt-Winters
Implicit Matrix Factorization
Cross Validation
More to come!
Hyper-parameter Tuning
1x V100
vs
2x 20 core CPU
12
cuGraph
Accelerating your Graph needs
13
GOALS AND BENEFITS OF CUGRAPH
• Seamless integration with cuDF and cuML
•Python APIs accepts and returns cuDF DataFrames
• Allows for Property Graph
• Features
• Extensive collection of algorithm, primitive, and utility functions**
• With Accelerated Performance
• Python API:
• Multiple APIs: NetworkX, Pregel**, GraphBLAS**, Frontier**
• Graph Query Language**
• C/C++
• Full featured C++ API
Focus on Features an Easy-of-Use
** On Roadmap
14
Graph Technology Stack
Python
Cython
C++ cuGraph Algorithms
Prims
CUDA Libraries
CUDA
Dask cuGraph
Dask cuDF
cuDF
Numpy
Thrust
Cub
cuSolver
cuSparse
cuRand
Gunrock*
cuGraphBLAS cuHornet
nvGRAPH has been Opened Sourced and integrated into cuGraph. * Gunrock is from UC Davis
cuGraphBLAS projected release Is. 0.12
15
Bringing in leading researchers
Leveraging the great work of others
cuGraphGunrock Hornet
GraphBLAS
https://news.developer.nvidia.com/graph-technology-leaders-combine-forces-to-advance-graph-analytics/
cuHornet
cuGraphBLAS
16
Algorithms
(as of release 0.10)
GPU-accelerated NetworkX
Community
Components
Link Analysis
Link Prediction
Traversal
Structure
Spectral Clustering
Balanced-Cut
Modularity Maximization
Louvain
Subgraph Extraction
Triangle Counting
Jaccard
Weighted Jaccard
Overlap Coefficient
Single Source Shortest Path (SSSP)
Breadth First Search (BFS)
COO-to-CSR
Transpose
Renumbering
Multi-GPU
More to come!
Utilities
Weakly Connected Components
Strongly Connected Components
Page Rank
Personal Page Rank
Katz
Query Language
Page Rank
OpenCypher:
Find-Matches
Long list of additional algorithms to come
Symmetrize
17
PageRank Speedup
cuGraph PageRank vs NetworkX PageRank
G = cugraph.Graph()
G.add_edge_list(gdf[‘src’], gdf[‘dst’], None)
df = cugraph.pagerank(G, alpha, max_iter, tol)
https://github.com/rapidsai/notebooks-extended/tree/master/advanced/benchmarks/cugraph_benchmark
SciPy
18
PageRank Performance
HiBench Websearch benchmark
All times are in seconds
Vertices Edges
File Size
(GB)
Number of
GPUs
Read data
and create
DataFrame
Run
Pagerank
(20 iterations)
Write
Scores
TOTAL
runtime
50,000,000 1,980,000,000 34 3 28.6 6.8 6.2 41.6
100,000,000 4,000,000,000 69 6 33.4 11.3 12.7 57.4
200,000,000 8,000,000,000 146 12 36.8 24.4 26.7 87.9
400,000,000 16,000,000,000 300 16 58.3 42.8 53.0 154.1
Ø Process
Ø Read Data
Ø Parse CSV into DataFrame
Ø Run Page Rank
Ø Convert Data to CSR
Ø Setup
Ø Run PagePage Solver
Ø Collect Results and convert of a DataFrame
Ø Write Score
19
Faster Speeds, Real-World Benefits
cuIO/cuDF –
Load and Data Preparation cuML - XGBoost
Time in seconds (shorter is better)
cuIO/cuDF (Load and Data Prep) Data Conversion XGBoost
Benchmark
200GB CSV dataset; Data prep includes
joins, variable transformations
CPU Cluster Configuration
CPU nodes (61 GiB memory, 8 vCPUs, 64-
bit platform), Apache Spark
DGX Cluster Configuration
5x DGX-1 on InfiniBand
network
8762
6148
3925
3221
322
213
End-to-End
Non-Graph
20
21
Deploy RAPIDS Everywhere
Focused on robust functionality, deployment, and user experience
Integration with major cloud providers
Both containers and cloud specific machine instances
Support for Enterprise and HPC Orchestration Layers
Cloud Dataproc Azure Machine Learning
G R A P H I S T info@graphistry.com
Data Scientist
Notebooks
Dev API For
Embedding
Analyst
Tool Suite
Automate
Investigations
Virtual Graph over
graph and tabular APIs
GPU Visual Analytics:
• 100X via GPUs:
client<>cloud
• Correlate w/ graph
• Time, histograms, …
100X Investigations with Graphistry:
Visibility & workflows for handling modern enterprise data
G R A P H I S T R Y
23
Articles
THANK YOU
Please give us a star on GitHub
https://github.com/rapidsai/cugraph
Questions?
25
PageRank Performance
HiBench Websearch benchmark
All times are in seconds
Vertices Edges
File Size
(GB)
Number of
GPUs
Read data
and create
DataFrame
Run
Pagerank
(20 iterations)
Write
Scores
TOTAL
runtime
50,000,000 1,980,000,000 34 3 28.6 6.8 6.2 41.6
100,000,000 4,000,000,000 69 6 33.4 11.3 12.7 57.4
200,000,000 8,000,000,000 146 12 36.8 24.4 26.7 87.9
400,000,000 16,000,000,000 300 16 58.3 42.8 53.0 154.1
Vertices Edges
Convert
DataFrame
to CSR
Just
PageRank
Solver
50,000,000 1,980,000,000 2.4 3.66
100,000,000 4,000,000,000 4.5 5.16
200,000,000 8,000,000,000 9.6 8.65
400,000,000 16,000,000,000 19.5 13.89
Ø Process
Ø Read Data
Ø Parse CSV into DataFrame
Ø Run Page Rank
Ø Convert Data to CSR
Ø Setup
Ø Run PagePage Solver
Ø Collect Results and convert of a DataFrame
Ø Write Score

Weitere ähnliche Inhalte

Was ist angesagt?

Cuda optimization
Cuda optimizationCuda optimization
Cuda optimizationCHIHTE LU
 
Pythonでパケット解析
Pythonでパケット解析Pythonでパケット解析
Pythonでパケット解析euphoricwavism
 
TPC-DSから学ぶPostgreSQLの弱点と今後の展望
TPC-DSから学ぶPostgreSQLの弱点と今後の展望TPC-DSから学ぶPostgreSQLの弱点と今後の展望
TPC-DSから学ぶPostgreSQLの弱点と今後の展望Kohei KaiGai
 
単語のベクトル表現/Word2Vecの基礎
単語のベクトル表現/Word2Vecの基礎単語のベクトル表現/Word2Vecの基礎
単語のベクトル表現/Word2Vecの基礎So
 
いまさら聞けない!CUDA高速化入門
いまさら聞けない!CUDA高速化入門いまさら聞けない!CUDA高速化入門
いまさら聞けない!CUDA高速化入門Fixstars Corporation
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBrendan Gregg
 
CMSI計算科学技術特論B(14) OpenACC・CUDAによるGPUコンピューティング
CMSI計算科学技術特論B(14) OpenACC・CUDAによるGPUコンピューティングCMSI計算科学技術特論B(14) OpenACC・CUDAによるGPUコンピューティング
CMSI計算科学技術特論B(14) OpenACC・CUDAによるGPUコンピューティングComputational Materials Science Initiative
 
Introduction to Categorical Programming (Revised)
Introduction to Categorical Programming (Revised)Introduction to Categorical Programming (Revised)
Introduction to Categorical Programming (Revised)Masahiro Sakai
 
地理分散DBについて
地理分散DBについて地理分散DBについて
地理分散DBについてKumazaki Hiroki
 
CDNの仕組み(JANOG36)
CDNの仕組み(JANOG36)CDNの仕組み(JANOG36)
CDNの仕組み(JANOG36)J-Stream Inc.
 
より深く知るオプティマイザとそのチューニング
より深く知るオプティマイザとそのチューニングより深く知るオプティマイザとそのチューニング
より深く知るオプティマイザとそのチューニングYuto Hayamizu
 
【Zansa】第17回 ブートストラップ法入門
【Zansa】第17回 ブートストラップ法入門【Zansa】第17回 ブートストラップ法入門
【Zansa】第17回 ブートストラップ法入門Zansa
 
株式会社コロプラ『GKE と Cloud Spanner が躍動するドラゴンクエストウォーク』第 9 回 Google Cloud INSIDE Game...
株式会社コロプラ『GKE と Cloud Spanner が躍動するドラゴンクエストウォーク』第 9 回 Google Cloud INSIDE Game...株式会社コロプラ『GKE と Cloud Spanner が躍動するドラゴンクエストウォーク』第 9 回 Google Cloud INSIDE Game...
株式会社コロプラ『GKE と Cloud Spanner が躍動するドラゴンクエストウォーク』第 9 回 Google Cloud INSIDE Game...Google Cloud Platform - Japan
 
スケーラブルなシステムのためのHBaseスキーマ設計 #hcj13w
スケーラブルなシステムのためのHBaseスキーマ設計 #hcj13wスケーラブルなシステムのためのHBaseスキーマ設計 #hcj13w
スケーラブルなシステムのためのHBaseスキーマ設計 #hcj13wCloudera Japan
 
Hopper アーキテクチャで、変わること、変わらないこと
Hopper アーキテクチャで、変わること、変わらないことHopper アーキテクチャで、変わること、変わらないこと
Hopper アーキテクチャで、変わること、変わらないことNVIDIA Japan
 
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法深層学習フレームワークにおけるIntel CPU/富岳向け最適化法
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法MITSUNARI Shigeo
 
1070: CUDA プログラミング入門
1070: CUDA プログラミング入門1070: CUDA プログラミング入門
1070: CUDA プログラミング入門NVIDIA Japan
 

Was ist angesagt? (20)

grpc-haskell.pdf
grpc-haskell.pdfgrpc-haskell.pdf
grpc-haskell.pdf
 
Cuda optimization
Cuda optimizationCuda optimization
Cuda optimization
 
Pythonでパケット解析
Pythonでパケット解析Pythonでパケット解析
Pythonでパケット解析
 
TPC-DSから学ぶPostgreSQLの弱点と今後の展望
TPC-DSから学ぶPostgreSQLの弱点と今後の展望TPC-DSから学ぶPostgreSQLの弱点と今後の展望
TPC-DSから学ぶPostgreSQLの弱点と今後の展望
 
単語のベクトル表現/Word2Vecの基礎
単語のベクトル表現/Word2Vecの基礎単語のベクトル表現/Word2Vecの基礎
単語のベクトル表現/Word2Vecの基礎
 
いまさら聞けない!CUDA高速化入門
いまさら聞けない!CUDA高速化入門いまさら聞けない!CUDA高速化入門
いまさら聞けない!CUDA高速化入門
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame Graphs
 
CMSI計算科学技術特論B(14) OpenACC・CUDAによるGPUコンピューティング
CMSI計算科学技術特論B(14) OpenACC・CUDAによるGPUコンピューティングCMSI計算科学技術特論B(14) OpenACC・CUDAによるGPUコンピューティング
CMSI計算科学技術特論B(14) OpenACC・CUDAによるGPUコンピューティング
 
Introduction to Categorical Programming (Revised)
Introduction to Categorical Programming (Revised)Introduction to Categorical Programming (Revised)
Introduction to Categorical Programming (Revised)
 
地理分散DBについて
地理分散DBについて地理分散DBについて
地理分散DBについて
 
perfを使ったPostgreSQLの解析(前編)
perfを使ったPostgreSQLの解析(前編)perfを使ったPostgreSQLの解析(前編)
perfを使ったPostgreSQLの解析(前編)
 
CDNの仕組み(JANOG36)
CDNの仕組み(JANOG36)CDNの仕組み(JANOG36)
CDNの仕組み(JANOG36)
 
より深く知るオプティマイザとそのチューニング
より深く知るオプティマイザとそのチューニングより深く知るオプティマイザとそのチューニング
より深く知るオプティマイザとそのチューニング
 
【Zansa】第17回 ブートストラップ法入門
【Zansa】第17回 ブートストラップ法入門【Zansa】第17回 ブートストラップ法入門
【Zansa】第17回 ブートストラップ法入門
 
株式会社コロプラ『GKE と Cloud Spanner が躍動するドラゴンクエストウォーク』第 9 回 Google Cloud INSIDE Game...
株式会社コロプラ『GKE と Cloud Spanner が躍動するドラゴンクエストウォーク』第 9 回 Google Cloud INSIDE Game...株式会社コロプラ『GKE と Cloud Spanner が躍動するドラゴンクエストウォーク』第 9 回 Google Cloud INSIDE Game...
株式会社コロプラ『GKE と Cloud Spanner が躍動するドラゴンクエストウォーク』第 9 回 Google Cloud INSIDE Game...
 
スケーラブルなシステムのためのHBaseスキーマ設計 #hcj13w
スケーラブルなシステムのためのHBaseスキーマ設計 #hcj13wスケーラブルなシステムのためのHBaseスキーマ設計 #hcj13w
スケーラブルなシステムのためのHBaseスキーマ設計 #hcj13w
 
Hopper アーキテクチャで、変わること、変わらないこと
Hopper アーキテクチャで、変わること、変わらないことHopper アーキテクチャで、変わること、変わらないこと
Hopper アーキテクチャで、変わること、変わらないこと
 
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法深層学習フレームワークにおけるIntel CPU/富岳向け最適化法
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法
 
1070: CUDA プログラミング入門
1070: CUDA プログラミング入門1070: CUDA プログラミング入門
1070: CUDA プログラミング入門
 
直交領域探索
直交領域探索直交領域探索
直交領域探索
 

Ähnlich wie Accelerating Graph Analytics with cuGraph

NVIDIA Rapids presentation
NVIDIA Rapids presentationNVIDIA Rapids presentation
NVIDIA Rapids presentationtestSri1
 
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFGPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFKeith Kraus
 
RAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceRAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceData Works MD
 
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfS51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfDLow6
 
Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...TigerGraph
 
GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017Joshua Patterson
 
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Databricks
 
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAdvancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAlluxio, Inc.
 
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...TigerGraph
 
Comparing three data ingestion approaches where Apache Kafka integrates with ...
Comparing three data ingestion approaches where Apache Kafka integrates with ...Comparing three data ingestion approaches where Apache Kafka integrates with ...
Comparing three data ingestion approaches where Apache Kafka integrates with ...HostedbyConfluent
 
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...E-Commerce Brasil
 
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄Cheer Chain Enterprise Co., Ltd.
 
GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020John Zedlewski
 
Very large scale distributed deep learning on BigDL
Very large scale distributed deep learning on BigDLVery large scale distributed deep learning on BigDL
Very large scale distributed deep learning on BigDLDESMOND YUEN
 
RAPIDS: GPU-Accelerated ETL and Feature Engineering
RAPIDS: GPU-Accelerated ETL and Feature EngineeringRAPIDS: GPU-Accelerated ETL and Feature Engineering
RAPIDS: GPU-Accelerated ETL and Feature EngineeringKeith Kraus
 
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...Matej Misik
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Lablup Inc.
 
GPU-Accelerating A Deep Learning Anomaly Detection Platform
GPU-Accelerating A Deep Learning Anomaly Detection PlatformGPU-Accelerating A Deep Learning Anomaly Detection Platform
GPU-Accelerating A Deep Learning Anomaly Detection PlatformNVIDIA
 

Ähnlich wie Accelerating Graph Analytics with cuGraph (20)

Rapids: Data Science on GPUs
Rapids: Data Science on GPUsRapids: Data Science on GPUs
Rapids: Data Science on GPUs
 
NVIDIA Rapids presentation
NVIDIA Rapids presentationNVIDIA Rapids presentation
NVIDIA Rapids presentation
 
RAPIDS Overview
RAPIDS OverviewRAPIDS Overview
RAPIDS Overview
 
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFGPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
 
RAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceRAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data Science
 
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfS51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
 
Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...
 
GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017
 
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
 
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAdvancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
 
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
 
Comparing three data ingestion approaches where Apache Kafka integrates with ...
Comparing three data ingestion approaches where Apache Kafka integrates with ...Comparing three data ingestion approaches where Apache Kafka integrates with ...
Comparing three data ingestion approaches where Apache Kafka integrates with ...
 
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
 
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄
Nvidia gpu-application-catalog TESLA K80 GPU應用程式型錄
 
GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020
 
Very large scale distributed deep learning on BigDL
Very large scale distributed deep learning on BigDLVery large scale distributed deep learning on BigDL
Very large scale distributed deep learning on BigDL
 
RAPIDS: GPU-Accelerated ETL and Feature Engineering
RAPIDS: GPU-Accelerated ETL and Feature EngineeringRAPIDS: GPU-Accelerated ETL and Feature Engineering
RAPIDS: GPU-Accelerated ETL and Feature Engineering
 
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
GPU-Accelerating A Deep Learning Anomaly Detection Platform
GPU-Accelerating A Deep Learning Anomaly Detection PlatformGPU-Accelerating A Deep Learning Anomaly Detection Platform
GPU-Accelerating A Deep Learning Anomaly Detection Platform
 

Mehr von Connected Data World

Systems that learn and reason | Frank Van Harmelen
Systems that learn and reason | Frank Van HarmelenSystems that learn and reason | Frank Van Harmelen
Systems that learn and reason | Frank Van HarmelenConnected Data World
 
Graph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora LassilaGraph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora LassilaConnected Data World
 
Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...
Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...
Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...Connected Data World
 
How to get started with Graph Machine Learning
How to get started with Graph Machine LearningHow to get started with Graph Machine Learning
How to get started with Graph Machine LearningConnected Data World
 
The years of the graph: The future of the future is here
The years of the graph: The future of the future is hereThe years of the graph: The future of the future is here
The years of the graph: The future of the future is hereConnected Data World
 
From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2
From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2
From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2Connected Data World
 
From Taxonomies and Schemas to Knowledge Graphs: Part 3
From Taxonomies and Schemas to Knowledge Graphs: Part 3From Taxonomies and Schemas to Knowledge Graphs: Part 3
From Taxonomies and Schemas to Knowledge Graphs: Part 3Connected Data World
 
In Search of the Universal Data Model
In Search of the Universal Data ModelIn Search of the Universal Data Model
In Search of the Universal Data ModelConnected Data World
 
Graph in Apache Cassandra. The World’s Most Scalable Graph Database
Graph in Apache Cassandra. The World’s Most Scalable Graph DatabaseGraph in Apache Cassandra. The World’s Most Scalable Graph Database
Graph in Apache Cassandra. The World’s Most Scalable Graph DatabaseConnected Data World
 
Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...
Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...
Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...Connected Data World
 
Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...
Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...
Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...Connected Data World
 
Semantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scaleSemantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scaleConnected Data World
 
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...Connected Data World
 
Schema, Google & The Future of the Web
Schema, Google & The Future of the WebSchema, Google & The Future of the Web
Schema, Google & The Future of the WebConnected Data World
 
Elegant and Scalable Code Querying with Code Property Graphs
Elegant and Scalable Code Querying with Code Property GraphsElegant and Scalable Code Querying with Code Property Graphs
Elegant and Scalable Code Querying with Code Property GraphsConnected Data World
 
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...Connected Data World
 
Graph for Good: Empowering your NGO
Graph for Good: Empowering your NGOGraph for Good: Empowering your NGO
Graph for Good: Empowering your NGOConnected Data World
 
What are we Talking About, When we Talk About Ontology?
What are we Talking About, When we Talk About Ontology?What are we Talking About, When we Talk About Ontology?
What are we Talking About, When we Talk About Ontology?Connected Data World
 

Mehr von Connected Data World (20)

Systems that learn and reason | Frank Van Harmelen
Systems that learn and reason | Frank Van HarmelenSystems that learn and reason | Frank Van Harmelen
Systems that learn and reason | Frank Van Harmelen
 
Graph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora LassilaGraph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora Lassila
 
Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...
Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...
Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...
 
How to get started with Graph Machine Learning
How to get started with Graph Machine LearningHow to get started with Graph Machine Learning
How to get started with Graph Machine Learning
 
Graphs in sustainable finance
Graphs in sustainable financeGraphs in sustainable finance
Graphs in sustainable finance
 
The years of the graph: The future of the future is here
The years of the graph: The future of the future is hereThe years of the graph: The future of the future is here
The years of the graph: The future of the future is here
 
From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2
From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2
From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2
 
From Taxonomies and Schemas to Knowledge Graphs: Part 3
From Taxonomies and Schemas to Knowledge Graphs: Part 3From Taxonomies and Schemas to Knowledge Graphs: Part 3
From Taxonomies and Schemas to Knowledge Graphs: Part 3
 
In Search of the Universal Data Model
In Search of the Universal Data ModelIn Search of the Universal Data Model
In Search of the Universal Data Model
 
Graph in Apache Cassandra. The World’s Most Scalable Graph Database
Graph in Apache Cassandra. The World’s Most Scalable Graph DatabaseGraph in Apache Cassandra. The World’s Most Scalable Graph Database
Graph in Apache Cassandra. The World’s Most Scalable Graph Database
 
Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...
Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...
Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...
 
Graph Realities
Graph RealitiesGraph Realities
Graph Realities
 
Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...
Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...
Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...
 
Semantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scaleSemantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scale
 
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
 
Schema, Google & The Future of the Web
Schema, Google & The Future of the WebSchema, Google & The Future of the Web
Schema, Google & The Future of the Web
 
Elegant and Scalable Code Querying with Code Property Graphs
Elegant and Scalable Code Querying with Code Property GraphsElegant and Scalable Code Querying with Code Property Graphs
Elegant and Scalable Code Querying with Code Property Graphs
 
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
 
Graph for Good: Empowering your NGO
Graph for Good: Empowering your NGOGraph for Good: Empowering your NGO
Graph for Good: Empowering your NGO
 
What are we Talking About, When we Talk About Ontology?
What are we Talking About, When we Talk About Ontology?What are we Talking About, When we Talk About Ontology?
What are we Talking About, When we Talk About Ontology?
 

Kürzlich hochgeladen

Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Milind Agarwal
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfSubhamKumar3239
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataTecnoIncentive
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 

Kürzlich hochgeladen (20)

Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdf
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 

Accelerating Graph Analytics with cuGraph

  • 1. Brad Rees, Connected Data London, Oct 4th, 2019 cuGraph Accelerating all your Graph Analytic Needs
  • 3. 3 WE ARE CONNECTED 7 degrees of Kevin Bacon Duncan Watts & Steven Strogatz Collective dynamics of ‘small-world’ networks - 1998 And have always been connected The small-world problem - 1968 Stanley Milgram (social psychologist) 1929
  • 4. 4 CONNECTEDNESS CAPTURED AS A GRAPH As well as associated information, knowledge, metadata, etc..
  • 5. 5 AND THERE ARE A LOT OF GRAPH FRAMEWORKS In lots of variations Neo4j TigerGraph AnzoGraph RedisGraph Oracle Product names are the property of the owners GraphX Pegasus Pregel GraphLab Giraph Graphulo PowerGraph GaloisLigra Gunrock GraphBLAS Stinger HornetcuGraph NetworkX NetworkX
  • 6. 6 Why cuGraph? More generally, why RAPIDS? A) Graph is not an isolated function, and needs to be part of the complete Data Science Process. And Graph are just cool
  • 7. 7 Speed, UX, and Iteration The Way to Win at Data Science Slide borrowed from Francois Chollet
  • 8. 8 cuDF cuIO Analytics GPU Memory Data Preparation VisualizationModel Training cuML Machine Learning cuGraph Graph Analytics PyTorch Chainer MxNet Deep Learning cuXfilter <> pyViz Visualization Enter End-to-End Accelerated GPU Data Science Dask Reduce Data Movement and Keep All Processing on the GPU
  • 9. 9 ETL - the Backbone of Data Science cuDF is… Python Library ● A Python library for manipulating GPU DataFrames following the Pandas API ● Python interface to CUDA C++ library with additional functionality ● Creating GPU DataFrames from Numpy arrays, Pandas DataFrames, and PyArrow Tables ● JIT compilation of User-Defined Functions (UDFs) using Numba ● String Support
  • 10. 10 Extraction is the Cornerstone of ETL cuIO is born • Follows the APIs of Pandas and provide >10x speedup • CSV Reader - v0.2, CSV Writer v0.8 • Parquet Reader – v0.7 • ORC Reader – v0.7 • JSON Reader - v0.8 • Avro Reader - v0.9 • HDF5 Reader - v0.10 • Key is GPU-accelerating both parsing and decompression wherever possible Source: Apache Crail blog: SQL Performance: Part 1 - Input File Formats
  • 11. 11 cuML Machine Learning GPU-accelerated Scikit-Learn Classification / Regression Statistical Inference Clustering Decomposition & Dimensionality Reduction Time Series Forecasting Recommendations Decision Trees / Random Forests Linear Regression Logistic Regression K-Nearest Neighbors Kalman Filtering Bayesian Inference Gaussian Mixture Models Hidden Markov Models K-Means DBSCAN Spectral Clustering Principal Components Singular Value Decomposition UMAP Spectral Embedding ARIMA Holt-Winters Implicit Matrix Factorization Cross Validation More to come! Hyper-parameter Tuning 1x V100 vs 2x 20 core CPU
  • 13. 13 GOALS AND BENEFITS OF CUGRAPH • Seamless integration with cuDF and cuML •Python APIs accepts and returns cuDF DataFrames • Allows for Property Graph • Features • Extensive collection of algorithm, primitive, and utility functions** • With Accelerated Performance • Python API: • Multiple APIs: NetworkX, Pregel**, GraphBLAS**, Frontier** • Graph Query Language** • C/C++ • Full featured C++ API Focus on Features an Easy-of-Use ** On Roadmap
  • 14. 14 Graph Technology Stack Python Cython C++ cuGraph Algorithms Prims CUDA Libraries CUDA Dask cuGraph Dask cuDF cuDF Numpy Thrust Cub cuSolver cuSparse cuRand Gunrock* cuGraphBLAS cuHornet nvGRAPH has been Opened Sourced and integrated into cuGraph. * Gunrock is from UC Davis cuGraphBLAS projected release Is. 0.12
  • 15. 15 Bringing in leading researchers Leveraging the great work of others cuGraphGunrock Hornet GraphBLAS https://news.developer.nvidia.com/graph-technology-leaders-combine-forces-to-advance-graph-analytics/ cuHornet cuGraphBLAS
  • 16. 16 Algorithms (as of release 0.10) GPU-accelerated NetworkX Community Components Link Analysis Link Prediction Traversal Structure Spectral Clustering Balanced-Cut Modularity Maximization Louvain Subgraph Extraction Triangle Counting Jaccard Weighted Jaccard Overlap Coefficient Single Source Shortest Path (SSSP) Breadth First Search (BFS) COO-to-CSR Transpose Renumbering Multi-GPU More to come! Utilities Weakly Connected Components Strongly Connected Components Page Rank Personal Page Rank Katz Query Language Page Rank OpenCypher: Find-Matches Long list of additional algorithms to come Symmetrize
  • 17. 17 PageRank Speedup cuGraph PageRank vs NetworkX PageRank G = cugraph.Graph() G.add_edge_list(gdf[‘src’], gdf[‘dst’], None) df = cugraph.pagerank(G, alpha, max_iter, tol) https://github.com/rapidsai/notebooks-extended/tree/master/advanced/benchmarks/cugraph_benchmark SciPy
  • 18. 18 PageRank Performance HiBench Websearch benchmark All times are in seconds Vertices Edges File Size (GB) Number of GPUs Read data and create DataFrame Run Pagerank (20 iterations) Write Scores TOTAL runtime 50,000,000 1,980,000,000 34 3 28.6 6.8 6.2 41.6 100,000,000 4,000,000,000 69 6 33.4 11.3 12.7 57.4 200,000,000 8,000,000,000 146 12 36.8 24.4 26.7 87.9 400,000,000 16,000,000,000 300 16 58.3 42.8 53.0 154.1 Ø Process Ø Read Data Ø Parse CSV into DataFrame Ø Run Page Rank Ø Convert Data to CSR Ø Setup Ø Run PagePage Solver Ø Collect Results and convert of a DataFrame Ø Write Score
  • 19. 19 Faster Speeds, Real-World Benefits cuIO/cuDF – Load and Data Preparation cuML - XGBoost Time in seconds (shorter is better) cuIO/cuDF (Load and Data Prep) Data Conversion XGBoost Benchmark 200GB CSV dataset; Data prep includes joins, variable transformations CPU Cluster Configuration CPU nodes (61 GiB memory, 8 vCPUs, 64- bit platform), Apache Spark DGX Cluster Configuration 5x DGX-1 on InfiniBand network 8762 6148 3925 3221 322 213 End-to-End Non-Graph
  • 20. 20
  • 21. 21 Deploy RAPIDS Everywhere Focused on robust functionality, deployment, and user experience Integration with major cloud providers Both containers and cloud specific machine instances Support for Enterprise and HPC Orchestration Layers Cloud Dataproc Azure Machine Learning
  • 22. G R A P H I S T info@graphistry.com Data Scientist Notebooks Dev API For Embedding Analyst Tool Suite Automate Investigations Virtual Graph over graph and tabular APIs GPU Visual Analytics: • 100X via GPUs: client<>cloud • Correlate w/ graph • Time, histograms, … 100X Investigations with Graphistry: Visibility & workflows for handling modern enterprise data G R A P H I S T R Y
  • 24. THANK YOU Please give us a star on GitHub https://github.com/rapidsai/cugraph Questions?
  • 25. 25 PageRank Performance HiBench Websearch benchmark All times are in seconds Vertices Edges File Size (GB) Number of GPUs Read data and create DataFrame Run Pagerank (20 iterations) Write Scores TOTAL runtime 50,000,000 1,980,000,000 34 3 28.6 6.8 6.2 41.6 100,000,000 4,000,000,000 69 6 33.4 11.3 12.7 57.4 200,000,000 8,000,000,000 146 12 36.8 24.4 26.7 87.9 400,000,000 16,000,000,000 300 16 58.3 42.8 53.0 154.1 Vertices Edges Convert DataFrame to CSR Just PageRank Solver 50,000,000 1,980,000,000 2.4 3.66 100,000,000 4,000,000,000 4.5 5.16 200,000,000 8,000,000,000 9.6 8.65 400,000,000 16,000,000,000 19.5 13.89 Ø Process Ø Read Data Ø Parse CSV into DataFrame Ø Run Page Rank Ø Convert Data to CSR Ø Setup Ø Run PagePage Solver Ø Collect Results and convert of a DataFrame Ø Write Score