SlideShare a Scribd company logo
1 of 27
Download to read offline
SQL+GPU+SSD=∞
Wasserschwein@Shinagawa
Self Introduction
▌Name: Wasserschwein@Shinagawa
▌PostgreSQL: 9Years (2006~)
▌Works: Security, FDW, etc...
▌Hobby: Mixture of heterogeneous technology
with PostgreSQL
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞2
Very powerful
computing
capability
Very functional
& well-used
database
PG-Strom:
What I’m making
GPGPU
What’s PG-Strom – Brief overview
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞3
▌Core ideas
① GPU native code generation on the fly
② Asynchronous massive parallel execution
▌Advantages
 Transparent acceleration with 100% query compatibility
 Commodity H/W and less system integration cost
Parser
Planner
Executor
Custom-
Scan/Join
Interface
Query: SELECT * FROM l_tbl JOIN r_tbl on l_tbl.lid = r_tbl.rid;
PG-Strom
CUDA
driver
nvrtc
DMA Data Transfer
CUDA
Source
code
Massive
Parallel
Execution
Supported Workload – Scan, Join, Aggregation
▌SELECT cat, AVG(x) FROM t0 NATURAL JOIN t1 [, ...] GROUP BY cat;
 t0: 100M rows, t1~t10: 100K rows for each, all the data was preloaded.
▌Environment:
 PostgreSQL v9.5beta1 + PG-Strom (22-Oct), CUDA 7.0 + RHEL6.6 (x86_64)
 CPU: Xeon E5-2670v3, RAM: 384GB, GPU: NVIDIA TESLA K20c (2496cores)
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞4
0
50
100
150
200
250
300
PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom
2 3 4 5 6 7 8
QueryResponseTime[sec]
# of tables involved
Time consumption per component (PostgreSQL v9.5β vs PG-Strom)
Scan Join Aggregate Others
Next target is I/O acceleration – from TPC/DS results
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞5
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
Time consumption per workloads (PostgreSQL v9.5beta+PG-Strom)
Scan Join Aggregate Others
So, How to accelerate I/O stuff by GPU?
NOTICE
The story I like to introduce next is...
Just my Ideaat this moment
......So, I’ll pay my efforts to implement
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞6
A rough x86_64 hardware architecture
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞7
GPU SSD
CPU + RAM CPU + RAM
PCI-E
SAS
Usual I/O bottleneck 
Simplified diagram for introduction
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞8
GPU SSD
CPU + RAM
PCI-E
OK, it’s storage
NVM EXPRESS SSD
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞9
PCI-E direct SSD device – low latency and higher bandwidth
Samsung
SSD 950 PRO
Intel SSD 750
HGST
Ultrastar SN100
Intel
SSD DC P3700
Data Flow in analytic queries
① Data load from storage to CPU/RAM
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞10
GPU SSD
CPU + RAM
PCI-E
Table
Data Flow in analytic queries
① Data load from storage to CPU/RAM
② Remove invisible rows (Select)
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞11
GPU SSD
CPU + RAM
PCI-E
Table
Data Flow in analytic queries
① Data load from storage to CPU/RAM
② Remove invisible rows (Select)
③ Remove unreferenced columns (Projection)
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞12
GPU SSD
CPU + RAM
PCI-E
Table
 The job of CPU
Data Flow in analytic queries
① Data load from storage to CPU/RAM
② Remove invisible rows (Select)
③ Remove unreferenced columns (Projection)
④ Join with other tables (Join)
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞13
GPU SSD
CPU + RAM
PCI-E
Table
+
SSD-to-GPU Direct
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞14
Data transfer between SSD
and GPU, bypass CPU/RAM
Also available on NVMe,
not only Fusion-IO
Data Flow in analytic queries (1/3) – Basic
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞15
GPU SSD
CPU + RAM
PCI-E
Table
SSD-to-GPU
Direct
Data Flow in analytic queries (1/3) – Basic
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞16
GPU SSD
CPU + RAM
PCI-E
Table
GPUcodegenerated
fromSQLonthefly
SSD-to-GPU
Direct
Data Flow in analytic queries (1/3) – Basic
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞17
GPU SSD
CPU + RAM
PCI-E
Table
GPUcodegenerated
fromSQLonthefly
SSD-to-GPU
Direct
Remove invisible rows
according to the scan
qualifiers
Data Flow in analytic queries (1/3) – Basic
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞18
GPU SSD
CPU + RAM
PCI-E
Table
GPUcodegenerated
fromSQLonthefly
SSD-to-GPU
Direct
Only visible rows are
moved to CPU+RAM
Remove invisible rows
according to the scan
qualifiers
Data Flow in analytic queries (2/3) – Advanced
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞19
GPU SSD
CPU + RAM
PCI-E
Table
GPUcodegenerated
fromSQLonthefly
SSD-to-GPU
Direct
Data Flow in analytic queries (2/3) – Advanced
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞20
GPU SSD
CPU + RAM
PCI-E
Table
GPUcodegenerated
fromSQLonthefly
SSD-to-GPU
Direct
Remove invisible rows
according to the scan
qualifiers
Data Flow in analytic queries (2/3) – Advanced
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞21
GPU SSD
CPU + RAM
PCI-E
Table
GPUcodegenerated
fromSQLonthefly
SSD-to-GPU
Direct
Only visible rows and
referenced columns are
moved to CPU+RAM
Remove invisible rows
according to the scan
qualifiers
Remove invisible rows
and unreferenced
columns according to
the scan qualifiers
and projection
Data Flow in analytic queries (3/3) – Ultimate
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞22
GPU SSD
CPU + RAM
PCI-E
GPUcodegenerated
fromSQLonthefly
Innerrelations
(JOINtarget)
Table
Data Flow in analytic queries (3/3) – Ultimate
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞23
GPU SSD
CPU + RAM
PCI-E
Table
GPUcodegenerated
fromSQLonthefly
SSD-to-GPU
Direct
Innerrelations
(JOINtarget)
Data Flow in analytic queries (3/3) – Ultimate
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞24
GPU SSD
CPU + RAM
PCI-E
Table
GPUcodegenerated
fromSQLonthefly
SSD-to-GPU
Direct
Innerrelations
(JOINtarget)
Data Flow in analytic queries (3/3) – Ultimate
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞25
GPU SSD
CPU + RAM
PCI-E
Table
GPUcodegenerated
fromSQLonthefly
SSD-to-GPU
Direct
Tuples are already
joined when it read
data from the storage
Innerrelations
(JOINtarget)
+
Generate
joined tuples
on GPU side
Primitive Technologies
▌NVIDIA GPUDirect enhancement on NVMe device driver
 Interaction between NVMe and NVIDIA drivers are needed
▌Usage statistics of shared_buffers per relations
 To avoid SSDGPU direct on relations that is already preloaded
▌Add new access mode to shared_buffers
 Nobody can make the buffer dirty under the SSDGPU Direct transfer
We are welcome all the developer
who join to PG-Strom project
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞26
Coming Soon?

More Related Content

What's hot

PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated Asyncr
Kohei KaiGai
 

What's hot (20)

PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated Asyncr
 
20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
 
GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
 
20201006_PGconf_Online_Large_Data_Processing
20201006_PGconf_Online_Large_Data_Processing20201006_PGconf_Online_Large_Data_Processing
20201006_PGconf_Online_Large_Data_Processing
 
20181212 - PGconfASIA - LT - English
20181212 - PGconfASIA - LT - English20181212 - PGconfASIA - LT - English
20181212 - PGconfASIA - LT - English
 
PG-Strom - A FDW module utilizing GPU device
PG-Strom - A FDW module utilizing GPU devicePG-Strom - A FDW module utilizing GPU device
PG-Strom - A FDW module utilizing GPU device
 
20210301_PGconf_Online_GPU_PostGIS_GiST_Index
20210301_PGconf_Online_GPU_PostGIS_GiST_Index20210301_PGconf_Online_GPU_PostGIS_GiST_Index
20210301_PGconf_Online_GPU_PostGIS_GiST_Index
 
20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi
 
20171206 PGconf.ASIA LT gstore_fdw
20171206 PGconf.ASIA LT gstore_fdw20171206 PGconf.ASIA LT gstore_fdw
20171206 PGconf.ASIA LT gstore_fdw
 
20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw
 
Applying of the NVIDIA CUDA to the video processing in the task of the roundw...
Applying of the NVIDIA CUDA to the video processing in the task of the roundw...Applying of the NVIDIA CUDA to the video processing in the task of the roundw...
Applying of the NVIDIA CUDA to the video processing in the task of the roundw...
 
GPGPU programming with CUDA
GPGPU programming with CUDAGPGPU programming with CUDA
GPGPU programming with CUDA
 
20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_EN20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_EN
 
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
 
Easy and High Performance GPU Programming for Java Programmers
Easy and High Performance GPU Programming for Java ProgrammersEasy and High Performance GPU Programming for Java Programmers
Easy and High Performance GPU Programming for Java Programmers
 
Debugging CUDA applications
Debugging CUDA applicationsDebugging CUDA applications
Debugging CUDA applications
 
PostgreSQL with OpenCL
PostgreSQL with OpenCLPostgreSQL with OpenCL
PostgreSQL with OpenCL
 
PGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien Rouhaud
PGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien RouhaudPGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien Rouhaud
PGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien Rouhaud
 
Parallel K means clustering using CUDA
Parallel K means clustering using CUDAParallel K means clustering using CUDA
Parallel K means clustering using CUDA
 

Viewers also liked

Viewers also liked (20)

20170127 JAWS HPC-UG#8
20170127 JAWS HPC-UG#820170127 JAWS HPC-UG#8
20170127 JAWS HPC-UG#8
 
An Intelligent Storage?
An Intelligent Storage?An Intelligent Storage?
An Intelligent Storage?
 
pgconfasia2016 lt ssd2gpu
pgconfasia2016 lt ssd2gpupgconfasia2016 lt ssd2gpu
pgconfasia2016 lt ssd2gpu
 
SQL+GPU+SSD=∞ (Japanese)
SQL+GPU+SSD=∞ (Japanese)SQL+GPU+SSD=∞ (Japanese)
SQL+GPU+SSD=∞ (Japanese)
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
 
20170310_InDatabaseAnalytics_#1
20170310_InDatabaseAnalytics_#120170310_InDatabaseAnalytics_#1
20170310_InDatabaseAnalytics_#1
 
Task Parallel Library (TPL)
Task Parallel Library (TPL)Task Parallel Library (TPL)
Task Parallel Library (TPL)
 
TPL Dataflow – зачем и для кого?
TPL Dataflow – зачем и для кого?TPL Dataflow – зачем и для кого?
TPL Dataflow – зачем и для кого?
 
Task Parallel Library 2014
Task Parallel Library 2014Task Parallel Library 2014
Task Parallel Library 2014
 
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
 
TPC-DSから学ぶPostgreSQLの弱点と今後の展望
TPC-DSから学ぶPostgreSQLの弱点と今後の展望TPC-DSから学ぶPostgreSQLの弱点と今後の展望
TPC-DSから学ぶPostgreSQLの弱点と今後の展望
 
Convolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in TheanoConvolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in Theano
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)
 
並列クエリを実行するPostgreSQLのアーキテクチャ
並列クエリを実行するPostgreSQLのアーキテクチャ並列クエリを実行するPostgreSQLのアーキテクチャ
並列クエリを実行するPostgreSQLのアーキテクチャ
 
In-Database Analyticsの必要性と可能性
In-Database Analyticsの必要性と可能性In-Database Analyticsの必要性と可能性
In-Database Analyticsの必要性と可能性
 
UX, ethnography and possibilities: for Libraries, Museums and Archives
UX, ethnography and possibilities: for Libraries, Museums and ArchivesUX, ethnography and possibilities: for Libraries, Museums and Archives
UX, ethnography and possibilities: for Libraries, Museums and Archives
 
Designing Teams for Emerging Challenges
Designing Teams for Emerging ChallengesDesigning Teams for Emerging Challenges
Designing Teams for Emerging Challenges
 
Visual Design with Data
Visual Design with DataVisual Design with Data
Visual Design with Data
 
3 Things Every Sales Team Needs to Be Thinking About in 2017
3 Things Every Sales Team Needs to Be Thinking About in 20173 Things Every Sales Team Needs to Be Thinking About in 2017
3 Things Every Sales Team Needs to Be Thinking About in 2017
 
How to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheHow to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your Niche
 

Similar to SQL+GPU+SSD=∞ (English)

Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5
Steen Larsen
 

Similar to SQL+GPU+SSD=∞ (English) (20)

20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai
 
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
 
PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)
 
20181210 - PGconf.ASIA Unconference
20181210 - PGconf.ASIA Unconference20181210 - PGconf.ASIA Unconference
20181210 - PGconf.ASIA Unconference
 
20181116 Massive Log Processing using I/O optimized PostgreSQL
20181116 Massive Log Processing using I/O optimized PostgreSQL20181116 Massive Log Processing using I/O optimized PostgreSQL
20181116 Massive Log Processing using I/O optimized PostgreSQL
 
GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020
 
SNAP MACHINE LEARNING
SNAP MACHINE LEARNINGSNAP MACHINE LEARNING
SNAP MACHINE LEARNING
 
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
 
Hardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLHardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and ML
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5
 
Deep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.xDeep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.x
 
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
 
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAdvancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
 
Application Optimisation using OpenPOWER and Power 9 systems
Application Optimisation using OpenPOWER and Power 9 systemsApplication Optimisation using OpenPOWER and Power 9 systems
Application Optimisation using OpenPOWER and Power 9 systems
 
Nagios Conference 2007 | Nagios in very large Environments by Werner Neunteufl
Nagios Conference 2007 | Nagios in very large Environments by Werner NeunteuflNagios Conference 2007 | Nagios in very large Environments by Werner Neunteufl
Nagios Conference 2007 | Nagios in very large Environments by Werner Neunteufl
 
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfS51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
 
RAPIDS: GPU-Accelerated ETL and Feature Engineering
RAPIDS: GPU-Accelerated ETL and Feature EngineeringRAPIDS: GPU-Accelerated ETL and Feature Engineering
RAPIDS: GPU-Accelerated ETL and Feature Engineering
 
Lec02
Lec02Lec02
Lec02
 
Nvidia tesla-k80-overview
Nvidia tesla-k80-overviewNvidia tesla-k80-overview
Nvidia tesla-k80-overview
 

More from Kohei KaiGai

More from Kohei KaiGai (20)

20221116_DBTS_PGStrom_History
20221116_DBTS_PGStrom_History20221116_DBTS_PGStrom_History
20221116_DBTS_PGStrom_History
 
20221111_JPUG_CustomScan_API
20221111_JPUG_CustomScan_API20221111_JPUG_CustomScan_API
20221111_JPUG_CustomScan_API
 
20211112_jpugcon_gpu_and_arrow
20211112_jpugcon_gpu_and_arrow20211112_jpugcon_gpu_and_arrow
20211112_jpugcon_gpu_and_arrow
 
20210928_pgunconf_hll_count
20210928_pgunconf_hll_count20210928_pgunconf_hll_count
20210928_pgunconf_hll_count
 
20210731_OSC_Kyoto_PGStrom3.0
20210731_OSC_Kyoto_PGStrom3.020210731_OSC_Kyoto_PGStrom3.0
20210731_OSC_Kyoto_PGStrom3.0
 
20210511_PGStrom_GpuCache
20210511_PGStrom_GpuCache20210511_PGStrom_GpuCache
20210511_PGStrom_GpuCache
 
20201128_OSC_Fukuoka_Online_GPUPostGIS
20201128_OSC_Fukuoka_Online_GPUPostGIS20201128_OSC_Fukuoka_Online_GPUPostGIS
20201128_OSC_Fukuoka_Online_GPUPostGIS
 
20201113_PGconf_Japan_GPU_PostGIS
20201113_PGconf_Japan_GPU_PostGIS20201113_PGconf_Japan_GPU_PostGIS
20201113_PGconf_Japan_GPU_PostGIS
 
20200828_OSCKyoto_Online
20200828_OSCKyoto_Online20200828_OSCKyoto_Online
20200828_OSCKyoto_Online
 
20200806_PGStrom_PostGIS_GstoreFdw
20200806_PGStrom_PostGIS_GstoreFdw20200806_PGStrom_PostGIS_GstoreFdw
20200806_PGStrom_PostGIS_GstoreFdw
 
20200424_Writable_Arrow_Fdw
20200424_Writable_Arrow_Fdw20200424_Writable_Arrow_Fdw
20200424_Writable_Arrow_Fdw
 
20191211_Apache_Arrow_Meetup_Tokyo
20191211_Apache_Arrow_Meetup_Tokyo20191211_Apache_Arrow_Meetup_Tokyo
20191211_Apache_Arrow_Meetup_Tokyo
 
20191115-PGconf.Japan
20191115-PGconf.Japan20191115-PGconf.Japan
20191115-PGconf.Japan
 
20190926_Try_RHEL8_NVMEoF_Beta
20190926_Try_RHEL8_NVMEoF_Beta20190926_Try_RHEL8_NVMEoF_Beta
20190926_Try_RHEL8_NVMEoF_Beta
 
20190925_DBTS_PGStrom
20190925_DBTS_PGStrom20190925_DBTS_PGStrom
20190925_DBTS_PGStrom
 
20190516_DLC10_PGStrom
20190516_DLC10_PGStrom20190516_DLC10_PGStrom
20190516_DLC10_PGStrom
 
20190418_PGStrom_on_ArrowFdw
20190418_PGStrom_on_ArrowFdw20190418_PGStrom_on_ArrowFdw
20190418_PGStrom_on_ArrowFdw
 
20190314 PGStrom Arrow_Fdw
20190314 PGStrom Arrow_Fdw20190314 PGStrom Arrow_Fdw
20190314 PGStrom Arrow_Fdw
 
20181212 - PGconf.ASIA - LT
20181212 - PGconf.ASIA - LT20181212 - PGconf.ASIA - LT
20181212 - PGconf.ASIA - LT
 
20181211 - PGconf.ASIA - NVMESSD&GPU for BigData
20181211 - PGconf.ASIA - NVMESSD&GPU for BigData20181211 - PGconf.ASIA - NVMESSD&GPU for BigData
20181211 - PGconf.ASIA - NVMESSD&GPU for BigData
 

Recently uploaded

AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
VishalKumarJha10
 

Recently uploaded (20)

Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 

SQL+GPU+SSD=∞ (English)

  • 2. Self Introduction ▌Name: Wasserschwein@Shinagawa ▌PostgreSQL: 9Years (2006~) ▌Works: Security, FDW, etc... ▌Hobby: Mixture of heterogeneous technology with PostgreSQL PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞2 Very powerful computing capability Very functional & well-used database PG-Strom: What I’m making GPGPU
  • 3. What’s PG-Strom – Brief overview PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞3 ▌Core ideas ① GPU native code generation on the fly ② Asynchronous massive parallel execution ▌Advantages  Transparent acceleration with 100% query compatibility  Commodity H/W and less system integration cost Parser Planner Executor Custom- Scan/Join Interface Query: SELECT * FROM l_tbl JOIN r_tbl on l_tbl.lid = r_tbl.rid; PG-Strom CUDA driver nvrtc DMA Data Transfer CUDA Source code Massive Parallel Execution
  • 4. Supported Workload – Scan, Join, Aggregation ▌SELECT cat, AVG(x) FROM t0 NATURAL JOIN t1 [, ...] GROUP BY cat;  t0: 100M rows, t1~t10: 100K rows for each, all the data was preloaded. ▌Environment:  PostgreSQL v9.5beta1 + PG-Strom (22-Oct), CUDA 7.0 + RHEL6.6 (x86_64)  CPU: Xeon E5-2670v3, RAM: 384GB, GPU: NVIDIA TESLA K20c (2496cores) PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞4 0 50 100 150 200 250 300 PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom 2 3 4 5 6 7 8 QueryResponseTime[sec] # of tables involved Time consumption per component (PostgreSQL v9.5β vs PG-Strom) Scan Join Aggregate Others
  • 5. Next target is I/O acceleration – from TPC/DS results PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞5 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Time consumption per workloads (PostgreSQL v9.5beta+PG-Strom) Scan Join Aggregate Others So, How to accelerate I/O stuff by GPU?
  • 6. NOTICE The story I like to introduce next is... Just my Ideaat this moment ......So, I’ll pay my efforts to implement PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞6
  • 7. A rough x86_64 hardware architecture PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞7 GPU SSD CPU + RAM CPU + RAM PCI-E SAS Usual I/O bottleneck 
  • 8. Simplified diagram for introduction PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞8 GPU SSD CPU + RAM PCI-E OK, it’s storage
  • 9. NVM EXPRESS SSD PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞9 PCI-E direct SSD device – low latency and higher bandwidth Samsung SSD 950 PRO Intel SSD 750 HGST Ultrastar SN100 Intel SSD DC P3700
  • 10. Data Flow in analytic queries ① Data load from storage to CPU/RAM PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞10 GPU SSD CPU + RAM PCI-E Table
  • 11. Data Flow in analytic queries ① Data load from storage to CPU/RAM ② Remove invisible rows (Select) PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞11 GPU SSD CPU + RAM PCI-E Table
  • 12. Data Flow in analytic queries ① Data load from storage to CPU/RAM ② Remove invisible rows (Select) ③ Remove unreferenced columns (Projection) PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞12 GPU SSD CPU + RAM PCI-E Table
  • 13.  The job of CPU Data Flow in analytic queries ① Data load from storage to CPU/RAM ② Remove invisible rows (Select) ③ Remove unreferenced columns (Projection) ④ Join with other tables (Join) PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞13 GPU SSD CPU + RAM PCI-E Table +
  • 14. SSD-to-GPU Direct PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞14 Data transfer between SSD and GPU, bypass CPU/RAM Also available on NVMe, not only Fusion-IO
  • 15. Data Flow in analytic queries (1/3) – Basic PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞15 GPU SSD CPU + RAM PCI-E Table SSD-to-GPU Direct
  • 16. Data Flow in analytic queries (1/3) – Basic PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞16 GPU SSD CPU + RAM PCI-E Table GPUcodegenerated fromSQLonthefly SSD-to-GPU Direct
  • 17. Data Flow in analytic queries (1/3) – Basic PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞17 GPU SSD CPU + RAM PCI-E Table GPUcodegenerated fromSQLonthefly SSD-to-GPU Direct Remove invisible rows according to the scan qualifiers
  • 18. Data Flow in analytic queries (1/3) – Basic PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞18 GPU SSD CPU + RAM PCI-E Table GPUcodegenerated fromSQLonthefly SSD-to-GPU Direct Only visible rows are moved to CPU+RAM Remove invisible rows according to the scan qualifiers
  • 19. Data Flow in analytic queries (2/3) – Advanced PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞19 GPU SSD CPU + RAM PCI-E Table GPUcodegenerated fromSQLonthefly SSD-to-GPU Direct
  • 20. Data Flow in analytic queries (2/3) – Advanced PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞20 GPU SSD CPU + RAM PCI-E Table GPUcodegenerated fromSQLonthefly SSD-to-GPU Direct Remove invisible rows according to the scan qualifiers
  • 21. Data Flow in analytic queries (2/3) – Advanced PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞21 GPU SSD CPU + RAM PCI-E Table GPUcodegenerated fromSQLonthefly SSD-to-GPU Direct Only visible rows and referenced columns are moved to CPU+RAM Remove invisible rows according to the scan qualifiers Remove invisible rows and unreferenced columns according to the scan qualifiers and projection
  • 22. Data Flow in analytic queries (3/3) – Ultimate PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞22 GPU SSD CPU + RAM PCI-E GPUcodegenerated fromSQLonthefly Innerrelations (JOINtarget) Table
  • 23. Data Flow in analytic queries (3/3) – Ultimate PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞23 GPU SSD CPU + RAM PCI-E Table GPUcodegenerated fromSQLonthefly SSD-to-GPU Direct Innerrelations (JOINtarget)
  • 24. Data Flow in analytic queries (3/3) – Ultimate PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞24 GPU SSD CPU + RAM PCI-E Table GPUcodegenerated fromSQLonthefly SSD-to-GPU Direct Innerrelations (JOINtarget)
  • 25. Data Flow in analytic queries (3/3) – Ultimate PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞25 GPU SSD CPU + RAM PCI-E Table GPUcodegenerated fromSQLonthefly SSD-to-GPU Direct Tuples are already joined when it read data from the storage Innerrelations (JOINtarget) + Generate joined tuples on GPU side
  • 26. Primitive Technologies ▌NVIDIA GPUDirect enhancement on NVMe device driver  Interaction between NVMe and NVIDIA drivers are needed ▌Usage statistics of shared_buffers per relations  To avoid SSDGPU direct on relations that is already preloaded ▌Add new access mode to shared_buffers  Nobody can make the buffer dirty under the SSDGPU Direct transfer We are welcome all the developer who join to PG-Strom project PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞26