SlideShare ist ein Scribd-Unternehmen logo
1 von 33
Downloaden Sie, um offline zu lesen
Speed up UDFs with
GPUs using the RAPIDS
Accelerator
Jason Lowe
Software Developer at NVIDIA
Agenda
§ RAPIDS Accelerator
§ Why are UDFs a Problem?
§ Scala UDF Compiler
§ UDF with RAPIDS Code
§ Future Work
RAPIDS Accelerator for Apache Spark
No Code Changes
§ Scala
§ Java
§ PySpark
§ Spark SQL
§ SparkR
§ Koalas
§ Requires Spark 3.x
Accelerates SQL and DataFrame with GPUs
start = time.time()
spark.sql(“””
select o_orderpriority, count(*) as order_count
from orders
where
o_orderdate >= date ‘1993-07-01’
and o_orderdate < date ‘1993-07-01’ + interval
‘3’ month
and exists (
select * from lineitem
where
l_orderkey = o_orderkey
and l_commitdate < l_receiptdate
)
group by o_orderpriority
order by o_orderpriority”””).show()
time.time() - start
NDS Benchmark Dataset
• Approximately 3 TB of raw data
• 1 TB of compressed Parquet
• Partitioned
• Double values for decimals
• Stored in HDFS
Benchmark Hardware
EGX / NVIDIA Certified OEM Servers
Nodes 8
CPU
2 x AMD EPYC 7452
(64 cores/128 threads)
GPU
2 x NVIDIA Ampere A100, PCIe,
250W, 40GB
RAM 0.5 TB
Storage 4 x 7.68 TB Gen4 U.2 NVMe
Networking
1 x Mellanox CX-6 Single Port
HDR100 QSFP56
Software
HDFS (Hadoop 3.2.1)
Spark 3.0.2 (stand alone)
0
500
1000
1500
2000
2500
3000
3500
4000
4500
Apache Spark Apache Spark + RAPIDS Accelerator
Total
Seconds
Total Time Across 100+ Different Queries
NDS Benchmark Results
GPU Performance: 3.21X
GPU Cost Savings: 48%
How Does It Work?
How It Works
Dask,
cuDF, Pandas
Python
Cython
cuDF C++
CUDA Libraries
CUDA
Java
JNI bindings
Spark DataFrame,
Scala, PySpark
How It Works
RAPIDS Accelerator
for Apache Spark
UCX Libraries
RAPIDS C++ Libraries
JNI bindings
Mapping From Java/Scala to C++
DISTRIBUTED SCALE-OUT SPARK APPLICATIONS
APACHE SPARK CORE
Spark SQL Spark Shuffle
DataFrame
if gpu_enabled(op, data_type)
call-out to RAPIDS
else
execute standard Spark op
● Custom Spark Shuffle
● Optimized for RDMA and
GPU-to-GPU transfer
CUDA
JNI bindings
Mapping From Java/Scala to C++
How It Works
DataFrame
Logical Plan
Physical Plan
RDD[InternalRow]
bar.groupBy(
col(”product_id”),
col(“ds”))
.agg(
max(col(“price”)) -
min(col(“price”)).alias(“range”))
SELECT product_id, ds,
max(price) – min(price) AS
range FROM bar GROUP BY
product_id, ds
QUERY
GPU
PHYSICAL
PLAN
Physical Plan
RDD[ColumnarBatch]
Translating a Simple Aggregation Query
CPU
PHYSICAL
PLAN
Read Parquet File
First Stage
Aggregate
Shuffle Exchange
Second Stage
Aggregate
Write Parquet File
Combine Shuffle
Data
Read Parquet File
First Stage
Aggregate
Shuffle Exchange
Second Stage
Aggregate
Write Parquet File
Convert to Row
Format
Convert to Row
Format
GPU
PHYSICAL
PLAN
Why are UDFs a Problem?
Opaque User-Defined Functions
• Need to translate logic to GPU operations
• UDFs hide custom logic behind a generic interface
• Custom logic may be supported but difficult to discern
• UDFs can force computation to the CPU
Columnar and Row Conversions
• CPU executes row-by-row
• GPU executes in columnar batches
• Data format conversion overhead
• Optimizing but never zero cost
Scala UDF Compiler
Automatic Scala UDF Handling
• Optional plugin with the RAPIDS Accelerator
• Uses JVM reflection to analyze UDF bytecode
• Attempts to translate UDF logic to Catalyst operations
• Common math operations
• Type casts
• Conditional (if, case)
• Common string operations
• Date and time parsing via LocalDateTime
Scala UDF Example Translation
val myudf = (x: Long, y: String) =>
s"$y := ${2*x}”
spark.register.udf(“myudf”, myudf)
sql(“SELECT myudf(c, s) as udfcol
from data”)
Catalyst Expression Tree
Scala UDF
Concat
s ” := ” Cast
Multiply
2 c
Keeping Data on the GPU
Project [if (isnull(c#5L))
null else
myudf(knownnotnull(c#5L),
s#2) AS udfcol#228]
GpuProject [gpuconcat(,
c#2, := , cast((2 * s#5L)
as string)) AS udfcol#230]
Scala UDF Compiler Limitations
• No looping constructs
• No higher-order functions
• Corner-case semantic differences (e.g.: divide-by-zero)
UDF with RAPIDS Implementation
Alternate UDF Implementation for GPU
• UDF provides implementation for CPU and GPU
• CPU executes row-by-row
• GPU executes in RAPIDS cuDF columnar batches
• Enables GPU-specific algorithms and optimizations
Supported UDF Types
• Spark Scala UDF
• Spark Java UDF
• Hive Simple UDF
• Hive Generic UDF
RAPIDS UDF Interface
import ai.rapids.cudf.ColumnVector;
/**
* Evaluate a user-defined function with RAPIDS cuDF columnar inputs
* producing a cuDF column as output
*/
public interface RapidsUDF {
ColumnVector evaluateColumnar(ColumnVector... args);
}
Case Study: URLDecode
public class URLDecode implements UDF1<String, String> {
/** Row-by-row implementation that executes on the CPU */
@Override
public String call(String s) {
String result = null;
if (s != null) {
result = URLDecoder.decode(s, "utf-8");
}
return result;
}
Case Study: URLDecode
public class URLDecode implements UDF1<String, String>, RapidsUDF {
[…]
/** Columnar implementation that runs on the GPU */
@Override
public ColumnVector evaluateColumnar(ColumnVector... args) {
ColumnVector input = args[0];
try (Scalar plusScalar = Scalar.fromString("+");
Scalar spaceScalar = Scalar.fromString(" ");
ColumnVector replaced = input.stringReplace(plusScalar, spaceScalar)) {
return replaced.urlDecode();
}
}
0
50
100
150
200
250
Apache Spark Apache Spark + RAPIDS Accelerator
Total
Seconds
4.4 TiB URL decode (4.4 billion rows)
Case Study: URLDecode
GPU Performance: 6.0X
Custom Native GPU Code Supported
• Existing cudf Java bindings not required
• UDF can use other CUDA libraries
• Examples in the RAPIDS Accelerator repository
• Cosine similarity operating on float arrays
Future Work
Future Work
• Expand support to other user-defined function types
• UDAF
• Hive UDTF
• Improved Pandas UDF data transfer
Improved Pandas Data Transfer
JVM PYTHON
Row Arrow
Run Pandas UDF
Arrow
Row
CPU
Arrow
Arrow
Arrow
Arrow
GPU Run Pandas UDF
For More Information
• Check out other RAPIDS Accelerator talks
• SAIS 2020: Deep Dive into GPU Support in Apache Spark 3.x
• GTC 2021: S31846 Running Large-Scale ETL Benchmarks with GPU-
Accelerated Apache Spark
• GTC 2021: S31822 Accelerating Apache Spark Shuffle with UCX
• The RAPIDS Accelerator is open source
• https://github.com/NVIDIA/spark-rapids
Feedback
Your feedback is important to us.
Don’t forget to rate and review the sessions.

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction to Spark Streaming
Introduction to Spark StreamingIntroduction to Spark Streaming
Introduction to Spark Streamingdatamantra
 
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...InfluxData
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark Aakashdata
 
Spark overview
Spark overviewSpark overview
Spark overviewLisa Hua
 
Apache Spark Internals
Apache Spark InternalsApache Spark Internals
Apache Spark InternalsKnoldus Inc.
 
Enabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkEnabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkKazuaki Ishizaki
 
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...Databricks
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache PinotAltinity Ltd
 
Productizing Structured Streaming Jobs
Productizing Structured Streaming JobsProductizing Structured Streaming Jobs
Productizing Structured Streaming JobsDatabricks
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internalsKostas Tzoumas
 
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7Ono Shigeru
 
Deep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.xDeep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.xDatabricks
 
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...Databricks
 
Apache Spark Introduction
Apache Spark IntroductionApache Spark Introduction
Apache Spark Introductionsudhakara st
 
Introduction to Network Function Virtualization (NFV)
Introduction to Network Function Virtualization (NFV)Introduction to Network Function Virtualization (NFV)
Introduction to Network Function Virtualization (NFV)rjain51
 
서버 개발자가 바라 본 Functional Reactive Programming with RxJava - SpringCamp2015
서버 개발자가 바라 본 Functional Reactive Programming with RxJava - SpringCamp2015서버 개발자가 바라 본 Functional Reactive Programming with RxJava - SpringCamp2015
서버 개발자가 바라 본 Functional Reactive Programming with RxJava - SpringCamp2015NAVER / MusicPlatform
 
Thrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased ComparisonThrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased ComparisonIgor Anishchenko
 

Was ist angesagt? (20)

Introduction to Spark Streaming
Introduction to Spark StreamingIntroduction to Spark Streaming
Introduction to Spark Streaming
 
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark
 
Spark overview
Spark overviewSpark overview
Spark overview
 
Spark graphx
Spark graphxSpark graphx
Spark graphx
 
Apache Spark Internals
Apache Spark InternalsApache Spark Internals
Apache Spark Internals
 
Enabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkEnabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache Spark
 
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
 
Spark
SparkSpark
Spark
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
 
Productizing Structured Streaming Jobs
Productizing Structured Streaming JobsProductizing Structured Streaming Jobs
Productizing Structured Streaming Jobs
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
 
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
 
Deep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.xDeep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.x
 
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
 
Apache Spark Introduction
Apache Spark IntroductionApache Spark Introduction
Apache Spark Introduction
 
Introduction to Network Function Virtualization (NFV)
Introduction to Network Function Virtualization (NFV)Introduction to Network Function Virtualization (NFV)
Introduction to Network Function Virtualization (NFV)
 
서버 개발자가 바라 본 Functional Reactive Programming with RxJava - SpringCamp2015
서버 개발자가 바라 본 Functional Reactive Programming with RxJava - SpringCamp2015서버 개발자가 바라 본 Functional Reactive Programming with RxJava - SpringCamp2015
서버 개발자가 바라 본 Functional Reactive Programming with RxJava - SpringCamp2015
 
Thrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased ComparisonThrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased Comparison
 

Ähnlich wie Speed up UDFs with GPUs using the RAPIDS Accelerator

SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...Chester Chen
 
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...Databricks
 
Tuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkTuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkDatabricks
 
Spark SQL Catalyst Code Optimization using Function Outlining with Kavana Bha...
Spark SQL Catalyst Code Optimization using Function Outlining with Kavana Bha...Spark SQL Catalyst Code Optimization using Function Outlining with Kavana Bha...
Spark SQL Catalyst Code Optimization using Function Outlining with Kavana Bha...Databricks
 
Migrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in PinterestMigrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in PinterestDatabricks
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Djamel Zouaoui
 
How Spark Does It Internally?
How Spark Does It Internally?How Spark Does It Internally?
How Spark Does It Internally?Knoldus Inc.
 
Running Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on HadoopRunning Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on Hadoopclairvoyantllc
 
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...Databricks
 
Reactive dashboard’s using apache spark
Reactive dashboard’s using apache sparkReactive dashboard’s using apache spark
Reactive dashboard’s using apache sparkRahul Kumar
 
How to Performance-Tune Apache Spark Applications in Large Clusters
How to Performance-Tune Apache Spark Applications in Large ClustersHow to Performance-Tune Apache Spark Applications in Large Clusters
How to Performance-Tune Apache Spark Applications in Large ClustersDatabricks
 
How to performance tune spark applications in large clusters
How to performance tune spark applications in large clustersHow to performance tune spark applications in large clusters
How to performance tune spark applications in large clustersOmkar Joshi
 
Tuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkTuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkPatrick Wendell
 
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKIntroduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKSkills Matter
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLScyllaDB
 
Ten tools for ten big data areas 03_Apache Spark
Ten tools for ten big data areas 03_Apache SparkTen tools for ten big data areas 03_Apache Spark
Ten tools for ten big data areas 03_Apache SparkWill Du
 

Ähnlich wie Speed up UDFs with GPUs using the RAPIDS Accelerator (20)

SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
 
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
 
Tuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkTuning and Debugging in Apache Spark
Tuning and Debugging in Apache Spark
 
Spark SQL Catalyst Code Optimization using Function Outlining with Kavana Bha...
Spark SQL Catalyst Code Optimization using Function Outlining with Kavana Bha...Spark SQL Catalyst Code Optimization using Function Outlining with Kavana Bha...
Spark SQL Catalyst Code Optimization using Function Outlining with Kavana Bha...
 
Migrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in PinterestMigrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in Pinterest
 
Scala and spark
Scala and sparkScala and spark
Scala and spark
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
 
How Spark Does It Internally?
How Spark Does It Internally?How Spark Does It Internally?
How Spark Does It Internally?
 
Running Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on HadoopRunning Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on Hadoop
 
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
 
Exploiting GPUs in Spark
Exploiting GPUs in SparkExploiting GPUs in Spark
Exploiting GPUs in Spark
 
Reactive dashboard’s using apache spark
Reactive dashboard’s using apache sparkReactive dashboard’s using apache spark
Reactive dashboard’s using apache spark
 
GR740 User day
GR740 User dayGR740 User day
GR740 User day
 
How to Performance-Tune Apache Spark Applications in Large Clusters
How to Performance-Tune Apache Spark Applications in Large ClustersHow to Performance-Tune Apache Spark Applications in Large Clusters
How to Performance-Tune Apache Spark Applications in Large Clusters
 
How to performance tune spark applications in large clusters
How to performance tune spark applications in large clustersHow to performance tune spark applications in large clusters
How to performance tune spark applications in large clusters
 
Tuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkTuning and Debugging in Apache Spark
Tuning and Debugging in Apache Spark
 
Grails 101
Grails 101Grails 101
Grails 101
 
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKIntroduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
 
Ten tools for ten big data areas 03_Apache Spark
Ten tools for ten big data areas 03_Apache SparkTen tools for ten big data areas 03_Apache Spark
Ten tools for ten big data areas 03_Apache Spark
 

Mehr von Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

Mehr von Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Kürzlich hochgeladen

原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 

Kürzlich hochgeladen (20)

原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 

Speed up UDFs with GPUs using the RAPIDS Accelerator

  • 1. Speed up UDFs with GPUs using the RAPIDS Accelerator Jason Lowe Software Developer at NVIDIA
  • 2. Agenda § RAPIDS Accelerator § Why are UDFs a Problem? § Scala UDF Compiler § UDF with RAPIDS Code § Future Work
  • 3. RAPIDS Accelerator for Apache Spark
  • 4. No Code Changes § Scala § Java § PySpark § Spark SQL § SparkR § Koalas § Requires Spark 3.x Accelerates SQL and DataFrame with GPUs start = time.time() spark.sql(“”” select o_orderpriority, count(*) as order_count from orders where o_orderdate >= date ‘1993-07-01’ and o_orderdate < date ‘1993-07-01’ + interval ‘3’ month and exists ( select * from lineitem where l_orderkey = o_orderkey and l_commitdate < l_receiptdate ) group by o_orderpriority order by o_orderpriority”””).show() time.time() - start
  • 5. NDS Benchmark Dataset • Approximately 3 TB of raw data • 1 TB of compressed Parquet • Partitioned • Double values for decimals • Stored in HDFS
  • 6. Benchmark Hardware EGX / NVIDIA Certified OEM Servers Nodes 8 CPU 2 x AMD EPYC 7452 (64 cores/128 threads) GPU 2 x NVIDIA Ampere A100, PCIe, 250W, 40GB RAM 0.5 TB Storage 4 x 7.68 TB Gen4 U.2 NVMe Networking 1 x Mellanox CX-6 Single Port HDR100 QSFP56 Software HDFS (Hadoop 3.2.1) Spark 3.0.2 (stand alone)
  • 7. 0 500 1000 1500 2000 2500 3000 3500 4000 4500 Apache Spark Apache Spark + RAPIDS Accelerator Total Seconds Total Time Across 100+ Different Queries NDS Benchmark Results GPU Performance: 3.21X GPU Cost Savings: 48%
  • 8. How Does It Work?
  • 9. How It Works Dask, cuDF, Pandas Python Cython cuDF C++ CUDA Libraries CUDA Java JNI bindings Spark DataFrame, Scala, PySpark
  • 10. How It Works RAPIDS Accelerator for Apache Spark UCX Libraries RAPIDS C++ Libraries JNI bindings Mapping From Java/Scala to C++ DISTRIBUTED SCALE-OUT SPARK APPLICATIONS APACHE SPARK CORE Spark SQL Spark Shuffle DataFrame if gpu_enabled(op, data_type) call-out to RAPIDS else execute standard Spark op ● Custom Spark Shuffle ● Optimized for RDMA and GPU-to-GPU transfer CUDA JNI bindings Mapping From Java/Scala to C++
  • 11. How It Works DataFrame Logical Plan Physical Plan RDD[InternalRow] bar.groupBy( col(”product_id”), col(“ds”)) .agg( max(col(“price”)) - min(col(“price”)).alias(“range”)) SELECT product_id, ds, max(price) – min(price) AS range FROM bar GROUP BY product_id, ds QUERY GPU PHYSICAL PLAN Physical Plan RDD[ColumnarBatch]
  • 12. Translating a Simple Aggregation Query CPU PHYSICAL PLAN Read Parquet File First Stage Aggregate Shuffle Exchange Second Stage Aggregate Write Parquet File Combine Shuffle Data Read Parquet File First Stage Aggregate Shuffle Exchange Second Stage Aggregate Write Parquet File Convert to Row Format Convert to Row Format GPU PHYSICAL PLAN
  • 13. Why are UDFs a Problem?
  • 14. Opaque User-Defined Functions • Need to translate logic to GPU operations • UDFs hide custom logic behind a generic interface • Custom logic may be supported but difficult to discern • UDFs can force computation to the CPU
  • 15. Columnar and Row Conversions • CPU executes row-by-row • GPU executes in columnar batches • Data format conversion overhead • Optimizing but never zero cost
  • 17. Automatic Scala UDF Handling • Optional plugin with the RAPIDS Accelerator • Uses JVM reflection to analyze UDF bytecode • Attempts to translate UDF logic to Catalyst operations • Common math operations • Type casts • Conditional (if, case) • Common string operations • Date and time parsing via LocalDateTime
  • 18. Scala UDF Example Translation val myudf = (x: Long, y: String) => s"$y := ${2*x}” spark.register.udf(“myudf”, myudf) sql(“SELECT myudf(c, s) as udfcol from data”) Catalyst Expression Tree Scala UDF Concat s ” := ” Cast Multiply 2 c
  • 19. Keeping Data on the GPU Project [if (isnull(c#5L)) null else myudf(knownnotnull(c#5L), s#2) AS udfcol#228] GpuProject [gpuconcat(, c#2, := , cast((2 * s#5L) as string)) AS udfcol#230]
  • 20. Scala UDF Compiler Limitations • No looping constructs • No higher-order functions • Corner-case semantic differences (e.g.: divide-by-zero)
  • 21. UDF with RAPIDS Implementation
  • 22. Alternate UDF Implementation for GPU • UDF provides implementation for CPU and GPU • CPU executes row-by-row • GPU executes in RAPIDS cuDF columnar batches • Enables GPU-specific algorithms and optimizations
  • 23. Supported UDF Types • Spark Scala UDF • Spark Java UDF • Hive Simple UDF • Hive Generic UDF
  • 24. RAPIDS UDF Interface import ai.rapids.cudf.ColumnVector; /** * Evaluate a user-defined function with RAPIDS cuDF columnar inputs * producing a cuDF column as output */ public interface RapidsUDF { ColumnVector evaluateColumnar(ColumnVector... args); }
  • 25. Case Study: URLDecode public class URLDecode implements UDF1<String, String> { /** Row-by-row implementation that executes on the CPU */ @Override public String call(String s) { String result = null; if (s != null) { result = URLDecoder.decode(s, "utf-8"); } return result; }
  • 26. Case Study: URLDecode public class URLDecode implements UDF1<String, String>, RapidsUDF { […] /** Columnar implementation that runs on the GPU */ @Override public ColumnVector evaluateColumnar(ColumnVector... args) { ColumnVector input = args[0]; try (Scalar plusScalar = Scalar.fromString("+"); Scalar spaceScalar = Scalar.fromString(" "); ColumnVector replaced = input.stringReplace(plusScalar, spaceScalar)) { return replaced.urlDecode(); } }
  • 27. 0 50 100 150 200 250 Apache Spark Apache Spark + RAPIDS Accelerator Total Seconds 4.4 TiB URL decode (4.4 billion rows) Case Study: URLDecode GPU Performance: 6.0X
  • 28. Custom Native GPU Code Supported • Existing cudf Java bindings not required • UDF can use other CUDA libraries • Examples in the RAPIDS Accelerator repository • Cosine similarity operating on float arrays
  • 30. Future Work • Expand support to other user-defined function types • UDAF • Hive UDTF • Improved Pandas UDF data transfer
  • 31. Improved Pandas Data Transfer JVM PYTHON Row Arrow Run Pandas UDF Arrow Row CPU Arrow Arrow Arrow Arrow GPU Run Pandas UDF
  • 32. For More Information • Check out other RAPIDS Accelerator talks • SAIS 2020: Deep Dive into GPU Support in Apache Spark 3.x • GTC 2021: S31846 Running Large-Scale ETL Benchmarks with GPU- Accelerated Apache Spark • GTC 2021: S31822 Accelerating Apache Spark Shuffle with UCX • The RAPIDS Accelerator is open source • https://github.com/NVIDIA/spark-rapids
  • 33. Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.