Data processing more than billion rows per second
~unleash full capability of GPU and NVME~
HeteroDB,Inc
Chief Architect & CEO
KaiGai Kohei <kaigai@heterodb.com>
Self-Introduction
 KaiGai Kohei (海外浩平)
 Chief Architect & CEO of HeteroDB
 Contributor of PostgreSQL (2006-)
 Primary Developer of PG-Strom (2012-)
 Interested in: Big-data, GPU, NVME/PMEM, ...
about myself
about our company
 Established: 4th-Jul-2017
 Location: Shinagawa, Tokyo, Japan
 Businesses:
✓ Sales & development of high-performance data-processing
software on top of heterogeneous architecture.
✓ Technology consulting service in the GPU & DB area.
PG-Strom
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second2
What is PG-Strom?
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second3
 Transparent GPU acceleration for analytics and reporting workloads
 Binary code generation by JIT from SQL statements
 PCIe-bus level optimization by SSD-to-GPU Direct SQL technology
 Columnar-storage for efficient I/O and vector processors
PG-Strom is an extension of PostgreSQL for terabyte-scale data processing
and in-database analytics, utilizing GPU and NVME-SSD (a minimal usage sketch follows the feature overview below).
App
GPU
off-loading
for IoT/Big-Data
for ML/Analytics
➢ SSD-to-GPU Direct SQL
➢ Columnar Store (Arrow_Fdw)
➢ Asymmetric Partition-wise
JOIN/GROUP BY
➢ BRIN-Index Support
➢ NVME-over-Fabric Support
➢ PostGIS + GiST Support (WIP)
➢ GPU Memory Store(WIP)
➢ Procedural Language for
GPU native code.
➢ IPC on GPU device memory
over cuPy data frame.
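A minimal usage sketch of the transparent off-loading described above (the table, columns, and plan shape are illustrative, not taken from the deck; pg_strom must also be listed in shared_preload_libraries):

-- Install the extension into the database once.
CREATE EXTENSION pg_strom;

-- GPU off-loading is transparent; pg_strom.enabled toggles it per session.
SET pg_strom.enabled = on;

EXPLAIN
SELECT dev_id, avg(signal)
  FROM logdata
 WHERE ts >= '2020-01-01'
 GROUP BY dev_id;
-- Indicative plan shape when PG-Strom takes over the scan/aggregation:
--   Custom Scan (GpuPreAgg)
--     ->  Custom Scan (GpuScan) on logdata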
PG-Strom as an open source project
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second4
Official documentation in English / Japanese
http://heterodb.github.io/pg-strom/
Many stars ☺
Distributed under GPL-v2.0
Has been developed since 2012
▌Supported PostgreSQL versions
 PostgreSQL v12, v11, and v10. WIP for v13 support
▌Related contributions to PostgreSQL
 Writable FDW support (9.3)
 Custom-Scan interface (9.5)
 FDW and Custom-Scan JOIN pushdown (9.5)
▌Talks at PostgreSQL community
 PGconf.EU 2012 (Prague), 2015 (Vienna), 2018 (Lisbon)
 PGconf.ASIA 2018 (Tokyo), 2019 (Bali, Indonesia)
 PGconf.SV 2016 (San Francisco)
 PGconf.China 2015 (Beijing)
Target Workloads?
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second5
Target: IoT/M2M Log-data processing
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second6
Manufacturing Logistics Mobile Home electronics
 Any kind of device generates various forms of log-data.
 People want to find insights in that data.
 Data importing and processing must be rapid.
 System administration should be simple and easy.
JBoF: Just Bunch of Flash
NVME over
Fabric
(RDMA)
DB Admin
BI Tools
Machine-learning
applications
e.g, anomaly detection
shared
data-frame
PG-Strom
How much data size we expect? (1/3)
▌This is the factory of an international top-brand company.
▌Most users do not have more data than this.
Case: A semiconductor factory managed up to 100TB with PostgreSQL
Total 20TB
Per Year
Kept for
5 years
Defect
Investigation
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second7
How much data size we expect? (2/3)
Nowadays, a 2U server is capable of more than 100TB of capacity
model Supermicro 2029U-TN24R4T Qty
CPU Intel Xeon Gold 6226 (12C, 2.7GHz) 2
RAM 32GB RDIMM (DDR4-2933, ECC) 12
GPU NVIDIA Tesla P40 (3840C, 24GB) 2
HDD Seagate 1.0TB SATA (7.2krpm) 1
NVME Intel DC P4510 (8.0TB, U.2) 24
N/W built-in 10GBase-T 4
8.0TB x 24 = 192TB
How much is it?
60,932USD
by thinkmate.com
(31st-Aug-2019)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second8
How much data size we expect? (3/3)
On the other hand, it is not small enough.
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second9
Weapons to tackle terabytes scale data
• SSD-to-GPU Direct SQL
Efficient
Storage
• Arrow_Fdw
Efficient
Data Structure
• Table Partitioning
Efficient
Data
Deployment
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second10
PG-Strom: SSD-to-GPU Direct SQL
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second11
Efficient Storage
GPU’s characteristics - mostly as a computing accelerator
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second12
Over 10 years of history in HPC, followed by massive popularization in Machine-Learning
NVIDIA Tesla V100
Super Computer
(TITEC; TSUBAME3.0) Computer Graphics Machine-Learning
How can I/O workloads be accelerated by the GPU, which is a computing accelerator?
Simulation
A typical composition of an x86_64 server
GPU SSD
CPU
RAM
HDD
N/W
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second13
Data flow to process a massive amount of data
CPU RAM
SSD GPU
PCIe
PostgreSQL
Data Blocks
Normal Data Flow
All the records, including junk, must be loaded
onto RAM once, because software cannot check
the necessity of a row prior to loading it.
So the amount of I/O traffic over the PCIe bus
tends to be large.
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second14
Unless records are loaded onto CPU/RAM once over the PCIe bus,
software cannot determine whether they are needed, even when they are "junk" records.
SSD-to-GPU Direct SQL (1/4) – Overview
CPU RAM
SSD GPU
PCIe
PostgreSQL
Data Blocks
NVIDIA GPUDirect RDMA
It allows data blocks on NVME-SSD to be loaded
into the GPU by peer-to-peer DMA over the PCIe bus,
bypassing CPU/RAM (a usage sketch follows this slide). WHERE-clause
JOIN
GROUP BY
Run SQL by GPU
to reduce the data size
Data Size: Small
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second15
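From the SQL side this mechanism is transparent. A sketch, assuming the PG-Strom 2.x GUC name pg_strom.nvme_strom_enabled (the exact name may differ between versions) and an indicative plan note:

-- Allow peer-to-peer DMA from NVME-SSD to GPU (GUC name is an assumption, see above).
SET pg_strom.nvme_strom_enabled = on;

EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*)
  FROM lineorder
 WHERE lo_discount BETWEEN 1 AND 3;
-- Indicative output: the GpuScan/GpuPreAgg node reports whether the blocks were loaded
-- via SSD-to-GPU Direct SQL or via the normal read(2) path through CPU/RAM.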
Background - GPUDirect RDMA by NVIDIA
Physical
address space
PCIe BAR1 Area
GPU
device
memory
RAM
NVMe-SSD Infiniband
HBA
PCIe devices
GPUDirect RDMA
It enables mapping GPU device
memory into the physical address
space of the host system.
Once the "physical address of GPU device memory"
appears, we can use it as the source or destination
address of DMA with other PCIe devices.
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second16
0xf0000000
0xe0000000
DMA Request
SRC: 1200th sector
LEN: 40 sectors
DST: 0xe0200000
SSD-to-GPU Direct SQL (2/4) – System configuration and workloads
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second17
Supermicro SYS-1019GP-TT
CPU Xeon Gold 6126T (2.6GHz, 12C) x1
RAM 192GB (32GB DDR4-2666 x 6)
GPU NVIDIA Tesla V100 (5120C, 16GB) x1
SSD
Intel SSD DC P4600 (HHHL; 2.0TB) x3
(striping configuration by md-raid0)
HDD 2.0TB (SATA; 7.2krpm) x6
Network 10Gb Ethernet 2ports
OS
Red Hat Enterprise Linux 7.6
CUDA 10.1 + NVIDIA Driver 418.40.04
DB
PostgreSQL v11.2
PG-Strom v2.2devel
■ Query Example (Q2_3)
SELECT sum(lo_revenue), d_year, p_brand1
FROM lineorder, date1, part, supplier
WHERE lo_orderdate = d_datekey
AND lo_partkey = p_partkey
AND lo_suppkey = s_suppkey
AND p_category = 'MFGR#12'
AND s_region = 'AMERICA'
GROUP BY d_year, p_brand1
ORDER BY d_year, p_brand1;
customer
12M rows
(1.6GB)
date1
2.5K rows
(400KB)
part
1.8M rows
(206MB)
supplier
4.0M rows
(528MB)
lineorder
2.4B rows
(351GB)
Summarizing queries over a typical star-schema structure on a simple 1U server
SSD-to-GPU Direct SQL (3/4) – Benchmark results
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second18
 Query Execution Throughput = (353GB; DB size) / (Query response time [sec])
 SSD-to-GPU Direct SQL runs the workloads close to the hardware limit (8.5GB/s)
 About 3x faster than the filesystem- and CPU-based mechanism
Star Schema Benchmark on 1x NVIDIA Tesla V100 + 3x Intel DC P4600
Query Execution Throughput [MB/s]
                    Q1_1    Q1_2    Q1_3    Q2_1    Q2_2    Q2_3    Q3_1    Q3_2    Q3_3    Q3_4    Q4_1    Q4_2    Q4_3
PostgreSQL 11.2     2343.7  2320.6  2315.8  2233.1  2223.9  2216.8  2167.8  2281.9  2277.6  2275.9  2172.0  2198.8  2233.3
PG-Strom 2.2devel   7880.4  7924.7  7912.5  7725.8  7712.6  7827.2  7398.1  7258.0  7638.5  7750.4  7402.8  7419.6  7639.8
SSD-to-GPU Direct SQL (4/4) – Software Stack
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second19
Filesystem
(ext4)
nvme_strom
kernel module
NVMe SSD
PostgreSQL
pg_strom
extension
read(2) ioctl(2)
Operating
System
Software
Layer
Database
Software
Layer
blk-mq
nvme pcie nvme rdma
Network HBA
NVMe
Request
Network HBANVMe SSD
NVME-over-Fabric Target
RDMA over
Converged
Ethernet
GPUDirect RDMA
Translation from logical location
to physical location on the drive
■ Other software
■ Software by HeteroDB
■ Hardware
Special NVME READ command that
loads the source blocks on NVME-
SSD to GPU’s device memory.
External
Storage
Server
Local
Storage
Layer
PG-Strom: Arrow_Fdw (columnar store)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second20
Efficient Data Structure
Background: Apache Arrow (1/2)
▌Characteristics
 Column-oriented data format designed for analytics workloads
 A common data format for inter-application exchange
 Various primitive data types like integer, floating-point, date/time and so on
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second21
PostgreSQL / PG-Strom
NVIDIA GPU
Background: Apache Arrow (2/2)
Apache Arrow Data Types PostgreSQL Data Types extra description
Int int2, int4, int8
FloatingPoint float2, float4, float8 float2 is an enhancement by PG-Strom
Binary bytea
Utf8 text
Bool bool
Decimal numeric Decimal = fixed-point 128bit value
Date date adjusted to unitsz = Day
Time time adjusted to unitsz = MicroSecond
Timestamp timestamp adjusted to unitsz = MicroSecond
Interval interval
List array types Only 1-dimensional array is supportable
Struct composite types
Union ------
FixedSizeBinary char(n)
FixedSizeList ------
Map ------
Most of data types are convertible between Apache Arrow and PostgreSQL.
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second22
Log-data characteristics (1/2) – Where is it generated?
ETL
OLTP OLAP
Traditional OLTP&OLAP – Data is generated inside of database system
Data
Creation
IoT/M2M use case – Data is generated outside of database system
Log
processing
BI Tools
BI Tools
Gateway Server
Data
Creation
Data
Creation
Many Devices
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second23
Data importing becomes a heavy, time-consuming operation in big-data processing.
Data
Import
Import!
Arrow_Fdw - maps Apache Arrow files as a foreign table
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second24
Apache Arrow
Files
Arrow_Fdw
Foreign Table
PostgreSQL
Tables
Not Importing
Write out
 Arrow_Fdw enables mapping Apache Arrow files as PostgreSQL foreign tables.
 Not importing: Apache Arrow files are immediately visible.
Arrow_Fdw - maps Apache Arrow files as a foreign table
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second25
Apache Arrow
Files
Arrow_Fdw
Foreign Table
PostgreSQL
Tables
Write out
 Arrow_Fdw enables mapping Apache Arrow files as PostgreSQL foreign tables.
 Not importing: Apache Arrow files are immediately visible.
 pg2arrow generates an Apache Arrow file from the query results of PostgreSQL.
pg2arrow
Arrow_Fdw - maps Apache Arrow files as a foreign table
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second26
Apache Arrow
Files
Arrow_Fdw
Foreign Table
PostgreSQL
Tables
Write out
 Arrow_Fdw enables mapping Apache Arrow files as PostgreSQL foreign tables.
 Not importing: Apache Arrow files are immediately visible.
 pg2arrow generates an Apache Arrow file from the query results of PostgreSQL.
 Arrow_Fdw is also writable, but only batched INSERT is supported (see the mapping sketch below).
pg2arrow
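A minimal mapping sketch: the 'file' FDW option matches the example shown later in this deck, while the server name arrow_fdw, the path, and the column list are illustrative assumptions.

CREATE FOREIGN TABLE flogdata (
    ts      timestamp,
    dev_id  integer,
    signal  float8
) SERVER arrow_fdw
  OPTIONS (file '/opt/arrow/logdata_2020.arrow');

-- The Arrow file is queried in place; nothing is imported into PostgreSQL heap tables.
SELECT dev_id, avg(signal)
  FROM flogdata
 GROUP BY dev_id;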
SSD-to-GPU Direct SQL on Arrow_Fdw (1/3)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second27
▌Why is Apache Arrow beneficial?
 Less I/O to be loaded; only the referenced columns
 Higher utilization of GPU cores, by vector processing and a wide memory bus
 Read-only structure; no MVCC checks are required at run-time
It transfers ONLY the referenced columns over the SSD-to-GPU Direct SQL mechanism (a sketch follows this slide)
PCIe Bus
NVMe SSD
GPU
SSD-to-GPU P2P DMA
WHERE-clause
JOIN
GROUP BY
P2P data transfer
only referenced columns
GPU code supports the Apache
Arrow format as a data source.
Runs SQL workloads in parallel
on thousands of cores.
Writes back the small results built
as PostgreSQL heap-tuples.
Results
metadata
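A sketch of how this looks from SQL, using the flineorder foreign table shown later in this deck; the plan text is only indicative of PG-Strom's EXPLAIN output:

EXPLAIN
SELECT sum(lo_revenue)
  FROM flineorder
 WHERE lo_orderdate BETWEEN 19960101 AND 19961231;
-- Indicative plan: the scan node lists only the columns actually read from the Arrow file,
-- e.g.
--   Custom Scan (GpuPreAgg) on flineorder
--     referenced: lo_orderdate, lo_revenue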
SSD-to-GPU Direct SQL on Arrow_Fdw (2/3) - Benchmark Results
 Query Execution Throughput = (6.0B rows) / (Query Response Time [sec])
 The combined use of SSD-to-GPU Direct SQL and the columnar store delivered extreme
performance: 130-500M rows per second
 The server configuration is identical to the one shown on p.17 (1U, 1CPU, 1GPU, 3x SSD)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second28
Star Schema Benchmark (s=1000, DB size: 879GB) on Tesla V100 x1 + DC P4600 x3
Number of rows processed per second [million rows/sec]
                          Q1_1   Q1_2   Q1_3   Q2_1   Q2_2   Q2_3   Q3_1   Q3_2   Q3_3   Q3_4   Q4_1   Q4_2   Q4_3
PostgreSQL 11.5           16.3   16.0   16.0   15.7   15.5   15.6   15.6   15.9   15.7   15.7   15.5   15.6   15.4
PG-Strom 2.2 (Row data)   55.0   55.1   55.1   54.4   54.4   54.6   52.9   54.0   54.9   54.8   53.2   53.1   54.0
PG-Strom 2.2 + Arrow_Fdw  481.0  494.2  499.6  246.6  261.7  282.0  156.2  179.0  183.6  185.0  138.3  139.1  152.6
SSD-to-GPU Direct SQL on Arrow_Fdw (3/3) - Benchmark Validation
Foreign table "public.flineorder"
Column | Type | Size
--------------------+---------------+--------------------------------
lo_orderkey | numeric | 89.42GB
lo_linenumber | integer | 22.37GB
lo_custkey | numeric | 89.42GB
lo_partkey | integer | 22.37GB <-- ★Referenced by Q2_1
lo_suppkey | numeric | 89.42GB <-- ★Referenced by Q2_1
lo_orderdate | integer | 22.37GB <-- ★Referenced by Q2_1
lo_orderpriority | character(15) | 83.82GB
lo_shippriority | character(1) | 5.7GB
lo_quantity | integer | 22.37GB
lo_extendedprice | integer | 22.37GB
lo_ordertotalprice | integer | 22.37GB
lo_discount | integer | 22.37GB
lo_revenue | integer | 22.37GB <-- ★Referenced by Q2_1
lo_supplycost | integer | 22.37GB
lo_tax | integer | 22.37GB
lo_commit_date | character(8) | 44.71GB
lo_shipmode | character(10) | 55.88GB
FDW options: (file '/opt/nvme/lineorder_s401.arrow') ... file size = 310GB
▌Q2_1 actually reads only 157GB of 681GB (23.0%) from the NVME-SSD.
▌Execution time of Q2_1 is 24.3s, so 157GB / 24.3s = 6.46GB/s
➔ Reasonable performance for 3x Intel DC P4600 on single CPU configuration.
Almost equivalent physical data transfer, but no need to load unreferenced columns
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second29
Table Partitioning
Efficient Data Deployment
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second30
Log-data characteristics (2/2) – INSERT-only
✓ The MVCC visibility check is (relatively) not significant.
✓ Rows with old timestamps will never be inserted.
INSERT
UPDATE
DELETE
INSERT
UPDATE
DELETE
Transactional Data Log Data
with
timestamp
current
2019
2018
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second31
Configuration of PostgreSQL partition for log-data (1/2)
▌Mixture of PostgreSQL tables and Arrow foreign tables in the partition declaration
▌Log data should have a timestamp and is never updated
➔ Old data can be moved to Arrow foreign tables for more efficient I/O (a declaration sketch follows this slide)
logdata_201912
logdata_202001
logdata_202002
logdata_current
logdata table
(PARTITION BY timestamp)
2020-03-21
12:34:56
dev_id: 2345
signal: 12.4
Log-data with
timestamp
PostgreSQL tables
Row data store
Read-writable but slow
Arrow foreign table
Column data store
Read-only but fast
Unrelated child tables are skipped
if the query contains a WHERE-clause on the timestamp,
e.g.) WHERE timestamp > '2020-01-23'
AND device_id = 456
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second32
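A declaration sketch under the assumptions above; all object names, ranges, and the arrow_fdw server/file are illustrative.

CREATE TABLE logdata (
    ts      timestamp NOT NULL,
    dev_id  integer,
    signal  float8
) PARTITION BY RANGE (ts);

-- Current month: a normal PostgreSQL table, read-writable.
CREATE TABLE logdata_current PARTITION OF logdata
    FOR VALUES FROM ('2020-03-01') TO ('2020-04-01');

-- Older months: Arrow foreign tables, read-only but fast to scan.
CREATE FOREIGN TABLE logdata_202002 PARTITION OF logdata
    FOR VALUES FROM ('2020-02-01') TO ('2020-03-01')
    SERVER arrow_fdw OPTIONS (file '/opt/arrow/logdata_202002.arrow');

-- Partition pruning skips unrelated children:
SELECT * FROM logdata WHERE ts > '2020-01-23' AND dev_id = 456;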
Configuration of PostgreSQL partition for log-data (2/2)
logdata_201912
logdata_202001
logdata_202002
logdata_current
logdata table
(PARTITION BY timestamp)
2019-03-21
12:34:56
dev_id: 2345
signal: 12.4
logdata_201912
logdata_202001
logdata_202002
logdata_202003
logdata_current
logdata table
(PARTITION BY timestamp)
a month
later
Extract data of
2020-03
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second33
▌Mixture of PostgreSQL tables and Arrow foreign tables in the partition declaration
▌Log data should have a timestamp and is never updated
➔ Old data can be moved to Arrow foreign tables for more efficient I/O (a rollover sketch follows this slide)
Log-data with
timestamp
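A monthly rollover sketch under the same assumptions (all object names are illustrative). It relies on the batched-INSERT support of writable Arrow_Fdw mentioned earlier; logdata_202003_arrow is a standalone Arrow foreign table created beforehand with the same columns as logdata and SERVER arrow_fdw.

BEGIN;
-- 1) Write the rows of 2020-03 out to the Arrow file in a single batch.
INSERT INTO logdata_202003_arrow
    SELECT * FROM logdata_current
     WHERE ts >= '2020-03-01' AND ts < '2020-04-01';
-- 2) Swap the children: detach the row-store partition, attach the Arrow partition.
ALTER TABLE logdata DETACH PARTITION logdata_current;
ALTER TABLE logdata ATTACH PARTITION logdata_202003_arrow
    FOR VALUES FROM ('2020-03-01') TO ('2020-04-01');
COMMIT;
-- A fresh row-store partition for the next month is then created as usual.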
Asymmetric Partition-wise JOIN (1/4)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second34
lineorder
lineorder_p0
lineorder_p1
lineorder_p2
remainder=0
remainder=1
remainder=2
customer date
supplier parts
tablespace: nvme0
tablespace: nvme1
tablespace: nvme2
Records from the partition leaves must be brought back to the CPU and processed there first!
Scan
Scan
Scan
Append
Join
Agg
Query
Results
Scan
Massive records
must be processed
on CPU first
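The layout above can be declared with ordinary PostgreSQL hash partitioning. A sketch, assuming the tablespaces nvme0, nvme1 and nvme2 already exist and showing only a few of the lineorder columns:

CREATE TABLE lineorder (
    lo_orderkey   numeric,
    lo_orderdate  integer,
    lo_suppkey    numeric,
    lo_revenue    integer
    -- the remaining lineorder columns are omitted in this sketch
) PARTITION BY HASH (lo_orderkey);

CREATE TABLE lineorder_p0 PARTITION OF lineorder
    FOR VALUES WITH (MODULUS 3, REMAINDER 0) TABLESPACE nvme0;
CREATE TABLE lineorder_p1 PARTITION OF lineorder
    FOR VALUES WITH (MODULUS 3, REMAINDER 1) TABLESPACE nvme1;
CREATE TABLE lineorder_p2 PARTITION OF lineorder
    FOR VALUES WITH (MODULUS 3, REMAINDER 2) TABLESPACE nvme2;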
Asymmetric Partition-wise JOIN (2/4)
postgres=# explain select * from ptable p, t1 where p.a = t1.aid;
QUERY PLAN
----------------------------------------------------------------------------------
Hash Join (cost=2.12..24658.62 rows=49950 width=49)
Hash Cond: (p.a = t1.aid)
-> Append (cost=0.00..20407.00 rows=1000000 width=12)
-> Seq Scan on ptable_p0 p (cost=0.00..5134.63 rows=333263 width=12)
-> Seq Scan on ptable_p1 p_1 (cost=0.00..5137.97 rows=333497 width=12)
-> Seq Scan on ptable_p2 p_2 (cost=0.00..5134.40 rows=333240 width=12)
-> Hash (cost=1.50..1.50 rows=50 width=37)
-> Seq Scan on t1 (cost=0.00..1.50 rows=50 width=37)
(8 rows)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second35
Asymmetric Partition-wise JOIN (3/4)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second36
lineorder
lineorder_p0
lineorder_p1
lineorder_p2
remainder=0
remainder=1
remainder=2
customer date
supplier parts
Push down JOIN/GROUP BY, even if the smaller side is not a partitioned table.
Join
Append
Agg
Query
Results
Scan
Scan
PreAgg
Join
Scan
PreAgg
Join
Scan
PreAgg
tablespace: nvme0
tablespace: nvme1
tablespace: nvme2
Only a very small number of partial
JOIN/GROUP BY results
need to be gathered on the CPU.
Asymmetric Partition-wise JOIN (4/4)
postgres=# set enable_partitionwise_join = on;
SET
postgres=# explain select * from ptable p, t1 where p.a = t1.aid;
QUERY PLAN
----------------------------------------------------------------------------------
Append (cost=2.12..19912.62 rows=49950 width=49)
-> Hash Join (cost=2.12..6552.96 rows=16647 width=49)
Hash Cond: (p.a = t1.aid)
-> Seq Scan on ptable_p0 p (cost=0.00..5134.63 rows=333263 width=12)
-> Hash (cost=1.50..1.50 rows=50 width=37)
-> Seq Scan on t1 (cost=0.00..1.50 rows=50 width=37)
-> Hash Join (cost=2.12..6557.29 rows=16658 width=49)
Hash Cond: (p_1.a = t1.aid)
-> Seq Scan on ptable_p1 p_1 (cost=0.00..5137.97 rows=333497 width=12)
-> Hash (cost=1.50..1.50 rows=50 width=37)
-> Seq Scan on t1 (cost=0.00..1.50 rows=50 width=37)
-> Hash Join (cost=2.12..6552.62 rows=16645 width=49)
Hash Cond: (p_2.a = t1.aid)
-> Seq Scan on ptable_p2 p_2 (cost=0.00..5134.40 rows=333240 width=12)
-> Hash (cost=1.50..1.50 rows=50 width=37)
-> Seq Scan on t1 (cost=0.00..1.50 rows=50 width=37)
(16 rows)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second37
PCIe-bus level optimization (1/3)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second38
Supermicro
SYS-4029TRT2
x96 lane
PCIe
switch
x96 lane
PCIe
switch
CPU2 CPU1
QPI
Gen3
x16
Gen3 x16
for each
slot
Gen3 x16
for each slot
Gen3 x16
▌HPC Server – optimization for GPUDirect RDMA
▌I/O Expansion Box
NEC ExpEther 40G
(4slots edition)
Network
Switch
4 slots of
PCIe Gen3 x8
PCIe
Switch
40Gb
Ethernet
CPU
NIC
Extra I/O Boxes
PCIe-bus level optimization (2/3)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second39
By P2P DMA over the PCIe switches, the major data traffic bypasses the CPU
CPU CPU
PCIe
switch
SSD GPU
PCIe
switch
SSD GPU
PCIe
switch
SSD GPU
PCIe
switch
SSD GPU
SCAN SCAN SCAN SCAN
JOIN JOIN JOIN JOIN
GROUP BY GROUP BY GROUP BY GROUP BY
Pre-processed Data (very small)
GATHER GATHER
PCIe-bus level optimization (3/3)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second40
$ pg_ctl restart
:
LOG: - PCIe[0000:80]
LOG: - PCIe(0000:80:02.0)
LOG: - PCIe(0000:83:00.0)
LOG: - PCIe(0000:84:00.0)
LOG: - PCIe(0000:85:00.0) nvme0 (INTEL SSDPEDKE020T7)
LOG: - PCIe(0000:84:01.0)
LOG: - PCIe(0000:86:00.0) GPU0 (Tesla P40)
LOG: - PCIe(0000:84:02.0)
LOG: - PCIe(0000:87:00.0) nvme1 (INTEL SSDPEDKE020T7)
LOG: - PCIe(0000:80:03.0)
LOG: - PCIe(0000:c0:00.0)
LOG: - PCIe(0000:c1:00.0)
LOG: - PCIe(0000:c2:00.0) nvme2 (INTEL SSDPEDKE020T7)
LOG: - PCIe(0000:c1:01.0)
LOG: - PCIe(0000:c3:00.0) GPU1 (Tesla P40)
LOG: - PCIe(0000:c1:02.0)
LOG: - PCIe(0000:c4:00.0) nvme3 (INTEL SSDPEDKE020T7)
LOG: - PCIe(0000:80:03.2)
LOG: - PCIe(0000:e0:00.0)
LOG: - PCIe(0000:e1:00.0)
LOG: - PCIe(0000:e2:00.0) nvme4 (INTEL SSDPEDKE020T7)
LOG: - PCIe(0000:e1:01.0)
LOG: - PCIe(0000:e3:00.0) GPU2 (Tesla P40)
LOG: - PCIe(0000:e1:02.0)
LOG: - PCIe(0000:e4:00.0) nvme5 (INTEL SSDPEDKE020T7)
LOG: GPU<->SSD Distance Matrix
LOG: GPU0 GPU1 GPU2
LOG: nvme0 ( 3) 7 7
LOG: nvme5 7 7 ( 3)
LOG: nvme4 7 7 ( 3)
LOG: nvme2 7 ( 3) 7
LOG: nvme1 ( 3) 7 7
LOG: nvme3 7 ( 3) 7
PG-Strom chooses the closest GPU
for the tables to be scanned,
according to the PCI-E device distance.
Large-scale Benchmark
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second41
Capable of processing
100TB of log data
on a single node
in a reasonable time
100-120GB/s of effective
data transfer capability
with 4 units in parallel
25-30GB/s of effective
data transfer capability
per unit, thanks to the
columnar data structure
Large-scale Benchmark (1/4)
QPI
All of SSD-to-GPU Direct SQL, Apache Arrow and PCI-E bus optimization are used
CPU2CPU1RAM RAM
PCIe-SW PCIe-SW
NVME0
NVME1
NVME2
NVME3
NVME4
NVME5
NVME6
NVME7
NVME8
NVME9
NVME10
NVME11
NVME12
NVME13
NVME14
NVME15
GPU0 GPU1 GPU2 GPU3HBA0 HBA1 HBA2 HBA3
8.0-10GB/s of physical
data transfer capability
per GPU+4xSSD unit
42
PCIe-SW PCIe-SW
JBOF unit-0 JBOF unit-1
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second
Large-scale Benchmark (2/4)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second43
Building an HPC 4U server configuration with 4 GPUs and 16 NVME-SSDs
CPU2 CPU1
SerialCables
dual PCI-ENC8G-08A
(U.2 NVME JBOF; 8slots) x2
NVIDIA
Tesla V100 x4
PCIe
switch
PCIe
switch
PCIe
switch
PCIe
switch
Gen3 x16
for each slot
Gen3 x16
for each slot
via SFF-8644 based PCIe x4 cables
(3.2GB/s x 16 = max 51.2GB/s)
Intel SSD
DC P4510 (1.0TB) x16
SerialCables
PCI-AD-x16HE x4
Supermicro
SYS-4029GP-TRT
SPECIAL THANKS
Large-scale Benchmark (3/4)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second44
Building an HPC 4U server configuration with 4 GPUs and 16 NVME-SSDs
Large-scale Benchmark (4/4)
UPI
Distributed 3.5TB of test data into 4 partitions; 879GB each
CPU2CPU1RAM RAM
PCIe-SW PCIe-SW
NVME0
NVME1
NVME2
NVME3
NVME4
NVME5
NVME6
NVME7
NVME8
NVME9
NVME10
NVME11
NVME12
NVME13
NVME14
NVME15
GPU0 GPU1 GPU2 GPU3HBA0 HBA1 HBA2 HBA3
45
PCIe-SW PCIe-SW
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second
lineorder_p0
(879GB)
lineorder_p1
(879GB)
lineorder_p2
(879GB)
lineorder_p3
(879GB)
lineorder
(---GB)
Hash-Partition (N=4)
Benchmark Results (1/2)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second46
Billion rows per second at a single-node PostgreSQL!!
Results of Star Schema Benchmark
[SF=4,000 (3.5TB; 24B rows), 4x GPU, 16x NVME-SSD]
Number of rows processed per second [million rows/sec]
                          Q1_1   Q1_2   Q1_3   Q2_1   Q2_2   Q2_3   Q3_1   Q3_2   Q3_3   Q3_4   Q4_1   Q4_2   Q4_3
PostgreSQL 11.5           64.0   65.0   65.2   58.8   58.0   67.3   43.9   58.7   61.0   60.8   52.9   53.0   59.1
PG-Strom 2.2              255.2  255.6  256.9  257.6  258.0  257.5  248.5  254.9  253.3  253.3  249.3  249.4  249.6
PG-Strom 2.2 + Arrow_Fdw  1,681  1,714  1,719  1,058  1,090  1,127  753    801    816    812    658    664    675
Benchmark Results (2/2)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second47
40GB/s Read from NVME-SSDs during SQL execution (95% of H/W limit)
[Chart] NVME-SSD Read Throughput [MB/s] over Elapsed Time [sec] during the SQL execution,
plotted for /dev/md0 (GPU0), /dev/md1 (GPU1), /dev/md2 (GPU2), /dev/md3 (GPU3)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second48
Future Direction
NVIDIA GPUDirect Storage (cuFile API) support
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second49
▌NVIDIA GPUDirect Storage?
 API & Driver stack for direct read from NVME to GPU
✓ Features are almost equivalent to NVME-Strom
✓ Ubuntu 18.04 & RHEL8/CentOS8 shall be supported at the initial release
 NVIDIA will officially release the software in 2020.
▌Why beneficial?
 Linux kernel driver that is tested/evaluated by NVIDIA’s QA process.
 Involvement of broader ecosystem with third-party solutions;
like distributed-filesystem, block storage, software defined storage
 Broader OS support; Ubuntu 18.04.
PG-Strom is here
Ref: GPUDIRECT STORAGE: A DIRECT GPU-STORAGE DATA PATH
https://on-demand.gputechconf.com/supercomputing/2019/pdf/sc1922-gpudirect-storage-transfer-data-directly-to-gpu-memory-alleviating-io-bottlenecks.pdf
WIP
PostGIS Functions & Operator support (1/2)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second50
▌PostGIS
 An extension to handle geographic objects and relational operations on them.
✓ Geographic objects: Point, LineString, Polygon, and etc ....
✓ Geographic relational operations: contains, crosses, touch, distance, and etc...
 Wise design for fast search
✓ Bounding-box, GiST-index, CPU Parallel
 v1.0 was released in 2005, and its development has continued for over 15 years.
© GAIA RESOURCES
Geometric
master data
Sensor Data
Massive data including
location information
Aggregating and summarizing data by prefecture, city, town, village and so on
[Expected usage scenarios for IoT/M2M data]
WIP
PostGIS Functions & Operator support (2/2)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second51
▌Current status
 geometry st_makepoint(float8,float8,...)
 float8 st_distance(geometry,geometry)
 bool st_dwithin(geometry,geometry,float8)
 bool st_contains(geometry,geometry)
 bool st_crosses(geometry,geometry)
▌Next: GiST (R-tree) Index support
 Index-based nested-loop join of up to millions of polygons x billions of points.
 Runs the index search with thousands of GPU threads in parallel.
➔ Because of the internal structure of GiST, GPU parallelism is likely to be efficient.
WIP
GiST(R-tree) Index
Polygon-definition
Table with
Location data
Index search
by thousands
threads
in parallel
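A query sketch combining the functions listed above; the tables, columns, and the reference point are hypothetical, and only the PostGIS functions named on this slide are used.

-- Aggregate massive point data by the polygon (e.g. prefecture) that contains it.
SELECT pref.name, count(*)
  FROM prefecture_polygons pref,   -- geometry master data (polygons)
       sensor_points pt            -- massive data with location information
 WHERE st_contains(pref.geom, st_makepoint(pt.lon, pt.lat))
 GROUP BY pref.name;

-- Distance-based filtering with the same function set:
SELECT count(*)
  FROM sensor_points pt
 WHERE st_dwithin(st_makepoint(pt.lon, pt.lat),
                  st_makepoint(139.77, 35.68),   -- an arbitrary reference point
                  0.05);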
Resources
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second52
▌Repository
 https://github.com/heterodb/pg-strom
 https://heterodb.github.io/swdc/
▌Document
 http://heterodb.github.io/pg-strom/
▌Contact
 kaigai@heterodb.com
 Tw: @kkaigai / @heterodb
Backup Slides
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second54
Apache Arrow Internal (1/3)
▌Apache Arrow Internal
 Header
• Fixed byte string: "ARROW1\0\0"
 Schema Definition
• # of columns and columns definitions.
• Data type/precision, column name, column number, ...
 Record Batch
• Internal data block that contains a certain # of rows.
• E.g., if N=1M rows with (Int32, Float64), this record-batch
begins with an array of 1M Int32 values, followed by an array
of 1M Float64 values.
 Dictionary Batch
• Optional area for dictionary compression.
• E.g) 1 = “Tokyo”, 2 = “Osaka”, 3 = “Kyoto”
➔ Int32 representation for frequently appearing words
 Footer
• Metadata - offset/length of RecordBatches and
DictionaryBatches
Header "ARROW1\0\0"
Schema Definition
DictionaryBatch-0
RecordBatch-0
RecordBatch-k
Footer
• DictionaryBatch[0] (offset, size)
• RecordBatch[0] (offset, size)
:
• RecordBatch[k] (offset, size)
Apache Arrow file
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second55
Apache Arrow Internal (2/3) - writable Arrow_Fdw
Header "ARROW1\0\0"
Schema Definition
DictionaryBatch-0
RecordBatch-0
RecordBatch-k
Footer
• DictionaryBatch[0]
• RecordBatch[0]
:
• RecordBatch[k]
Arrow file (before writes)
Header "ARROW1\0\0"
Schema Definition
DictionaryBatch-0
RecordBatch-0
RecordBatch-k
Footer (new revision)
• DictionaryBatch[0]
• RecordBatch[0]
:
• RecordBatch[k]
• RecordBatch[k+1]
Arrow file (after writes)
RecordBatch-(k+1)
Overwrite of
the original
Footer area
Rows written by
a single INSERT
command
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second56
Apache Arrow Internal (3/3) - writable Arrow_Fdw
Header "ARROW1\0\0"
Schema Definition
DictionaryBatch-0
RecordBatch-0
RecordBatch-k
Footer
• DictionaryBatch[0]
• RecordBatch[0]
:
• RecordBatch[k]
Arrow file (before writes)
Header "ARROW1\0\0"
Schema Definition
DictionaryBatch-0
RecordBatch-0
RecordBatch-k
Footer (new revision)
• DictionaryBatch[0]
• RecordBatch[0]
:
• RecordBatch[k]
• RecordBatch[k+1]
Arrow file (after writes)
RecordBatch-(k+1)
Overwrite of
the original
Footer area
Footer
• DictionaryBatch[0]
• RecordBatch[0]
:
• RecordBatch[k]
Arrow_Fdw keeps the original
Footer image until commit,
to support rollback.
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second57
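A sketch of the append-and-rollback behaviour described above; the foreign table and source names are illustrative.

BEGIN;
-- A single INSERT appends one RecordBatch to the Arrow file and rewrites the Footer.
INSERT INTO flogdata
    SELECT ts, dev_id, signal
      FROM logdata_current
     WHERE ts < '2020-03-01';
ROLLBACK;
-- On rollback the preserved Footer image is restored, so the appended RecordBatch is
-- no longer referenced and the file content is effectively unchanged.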
Pg2Arrow / MySQL2Arrow
$ pg2arrow --help
Usage:
pg2arrow [OPTION] [database] [username]
General options:
-d, --dbname=DBNAME Database name to connect to
-c, --command=COMMAND SQL command to run
-t, --table=TABLENAME Table name to be dumped
(-c and -t are exclusive, either of them must be given)
-o, --output=FILENAME result file in Apache Arrow format
--append=FILENAME result Apache Arrow file to be appended
(--output and --append are exclusive. If neither of them
are given, it creates a temporary file.)
Arrow format options:
-s, --segment-size=SIZE size of record batch for each
Connection options:
-h, --host=HOSTNAME database server host
-p, --port=PORT database server port
-u, --user=USERNAME database user name
-w, --no-password never prompt for password
-W, --password force password prompt
Other options:
--dump=FILENAME dump information of arrow file
--progress shows progress of the job
--set=NAME:VALUE config option to set before SQL execution
--help shows this message
✓ mysql2arrow also has almost equivalent capability
Pg2Arrow / MySQL2Arrow enable dumping SQL query results as Apache Arrow files
Apache Arrow
Data Files
Arrow_Fdw
Pg2Arrow
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second58
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456
 
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 

20201006_PGconf_Online_Large_Data_Processing

  • 1. Data processing more than billion rows per second ~unleash full capability of GPU and NVME~ HeteroDB,Inc Chief Architect & CEO KaiGai Kohei <kaigai@heterodb.com>
  • 2. Self-Introduction  KaiGai Kohei (海外浩平)  Chief Architect & CEO of HeteroDB  Contributor of PostgreSQL (2006-)  Primary Developer of PG-Strom (2012-)  Interested in: Big-data, GPU, NVME/PMEM, ... about myself about our company  Established: 4th-Jul-2017  Location: Shinagawa, Tokyo, Japan  Businesses: ✓ Sales & development of high-performance data-processing software on top of heterogeneous architecture. ✓ Technology consulting service on GPU&DB area. PG-Strom PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second2
  • 3. What is PG-Strom? PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second3  Transparent GPU acceleration for analytics and reporting workloads  Binary code generation by JIT from SQL statements  PCIe-bus level optimization by SSD-to-GPU Direct SQL technology  Columnar-storage for efficient I/O and vector processors PG-Strom is an extension of PostgreSQL for terabytes scale data-processing and in-database analytics, by utilization of GPU and NVME-SSD. App GPU off-loading for IoT/Big-Data for ML/Analytics ➢ SSD-to-GPU Direct SQL ➢ Columnar Store (Arrow_Fdw) ➢ Asymmetric Partition-wise JOIN/GROUP BY ➢ BRIN-Index Support ➢ NVME-over-Fabric Support ➢ PostGIS + GiST Support (WIP) ➢ GPU Memory Store(WIP) ➢ Procedural Language for GPU native code. ➢ IPC on GPU device memory over cuPy data frame.
  • 4. PG-Strom as an open source project PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second4 Official documentation in English / Japanese http://heterodb.github.io/pg-strom/ Many stars ☺ Distributed under GPL-v2.0 Has been developed since 2012 ▌Supported PostgreSQL versions  PostgreSQL v12, v11, and v10. WIP for v13 support ▌Related contributions to PostgreSQL  Writable FDW support (9.3)  Custom-Scan interface (9.5)  FDW and Custom-Scan JOIN pushdown (9.5) ▌Talks at PostgreSQL community  PGconf.EU 2012 (Prague), 2015 (Vienna), 2018 (Lisbon)  PGconf.ASIA 2018 (Tokyo), 2019 (Bali, Indonesia)  PGconf.SV 2016 (San Francisco)  PGconf.China 2015 (Beijing)
  • 5. Target Workloads? PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second5
  • 6. Target: IoT/M2M Log-data processing PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second6 Manufacturing Logistics Mobile Home electronics  Any kind of devices generate various form of log-data.  People want/try to find out insight from the data.  Data importing and processing must be rapid.  System administration should be simple and easy. JBoF: Just Bunch of Flash NVME over Fabric (RDMA) DB Admin BI Tools Machine-learning applications e.g, anomaly detection shared data-frame PG-Strom
  • 7. How much data size we expect? (1/3) ▌This is a factory of an international top-brand company. ▌Most of users don’t have bigger data than them. Case: A semiconductor factory managed up to 100TB with PostgreSQL Total 20TB Per Year Kept for 5 years Defect Investigation PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second7
  • 8. How much data size we expect? (2/3) Nowadays, 2U server is capable for 100TB capacity model Supermicro 2029U-TN24R4T Qty CPU Intel Xeon Gold 6226 (12C, 2.7GHz) 2 RAM 32GB RDIMM (DDR4-2933, ECC) 12 GPU NVIDIA Tesla P40 (3840C, 24GB) 2 HDD Seagate 1.0TB SATA (7.2krpm) 1 NVME Intel DC P4510 (8.0TB, U.2) 24 N/W built-in 10GBase-T 4 8.0TB x 24 = 192TB How much is it? 60,932USD by thinkmate.com (31st-Aug-2019) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second8
  • 9. How much data size we expect? (3/3) On the other hand, it is not a small amount of data either. [Chart: data-size comparison on a TB scale] PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second9
  • 10. Weapons to tackle terabytes scale data • SSD-to-GPU Direct SQL Efficient Storage • Arrow_Fdw Efficient Data Structure • Table Partitioning Efficient Data Deployment PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second10
  • 11. PG-Strom: SSD-to-GPU Direct SQL PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second11 Efficient Storage
  • 12. GPU’s characteristics - mostly as a computing accelerator PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second12 Over 10 years of history in HPC, then massive popularization in Machine-Learning. NVIDIA Tesla V100 / Super Computer (TITech; TSUBAME3.0) / Computer Graphics / Machine-Learning / Simulation. How can I/O workloads be accelerated by a GPU, which is a computing accelerator?
  • 13. A usual composition of x86_64 server GPUSSD CPU RAM HDD N/W PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second13
  • 14. Data flow to process a massive amount of data CPU RAM SSD GPU PCIe PostgreSQL Data Blocks Normal Data Flow All the records, including junk rows, must be loaded onto RAM once, because software cannot check the necessity of the rows prior to data loading. So, the amount of I/O traffic over the PCIe bus tends to be large. PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second14 Unless records are loaded to CPU/RAM over the PCIe bus first, software cannot check whether they are needed, even if they are “junk” records.
  • 15. SSD-to-GPU Direct SQL (1/4) – Overview CPU RAM SSD GPU PCIe PostgreSQL Data Blocks NVIDIA GPUDirect RDMA It allows loading the data blocks on NVME-SSD directly to the GPU using peer-to-peer DMA over the PCIe bus, bypassing CPU/RAM. WHERE-clause JOIN GROUP BY Run SQL on the GPU to reduce the data size Data Size: Small PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second15
  • 16. Background - GPUDirect RDMA by NVIDIA Physical address space PCIe BAR1 Area GPU device memory RAM NVMe-SSD Infiniband HBA PCIe devices GPUDirect RDMA It enables mapping GPU device memory into the physical address space of the host system. Once the “physical address of GPU device memory” appears, we can use it as the source or destination address of DMA with PCIe devices. PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second16 0xf0000000 0xe0000000 DMA Request SRC: 1200th sector LEN: 40 sectors DST: 0xe0200000
  • 17. SSD-to-GPU Direct SQL (2/4) – System configuration and workloads PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second17
Supermicro SYS-1019GP-TT
  CPU: Xeon Gold 6126T (2.6GHz, 12C) x1
  RAM: 192GB (32GB DDR4-2666 x 6)
  GPU: NVIDIA Tesla V100 (5120C, 16GB) x1
  SSD: Intel SSD DC P4600 (HHHL; 2.0TB) x3 (striping configuration by md-raid0)
  HDD: 2.0TB (SATA; 7.2krpm) x6
  Network: 10Gb Ethernet 2ports
  OS: Red Hat Enterprise Linux 7.6, CUDA 10.1 + NVIDIA Driver 418.40.04
  DB: PostgreSQL v11.2, PG-Strom v2.2devel
■ Query Example (Q2_3)
SELECT sum(lo_revenue), d_year, p_brand1
  FROM lineorder, date1, part, supplier
 WHERE lo_orderdate = d_datekey
   AND lo_partkey = p_partkey
   AND lo_suppkey = s_suppkey
   AND p_category = 'MFGR#12'
   AND s_region = 'AMERICA'
 GROUP BY d_year, p_brand1
 ORDER BY d_year, p_brand1;
Table sizes: customer 12M rows (1.6GB), date1 2.5K rows (400KB), part 1.8M rows (206MB), supplier 4.0M rows (528MB), lineorder 2.4B rows (351GB)
Summarizing queries for a typical Star-Schema structure on a simple 1U server
  • 18. SSD-to-GPU Direct SQL (3/4) – Benchmark results PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second18  Query Execution Throughput = (353GB; DB-size) / (Query response time [sec])  SSD-to-GPU Direct SQL runs the workloads close to the hardware limitation (8.5GB/s)  about 3x faster than the filesystem- and CPU-based mechanism
[Chart: Star Schema Benchmark on 1x NVIDIA Tesla V100 + 3x Intel DC P4600; Query Execution Throughput [MB/s]]
Query:             Q1_1   Q1_2   Q1_3   Q2_1   Q2_2   Q2_3   Q3_1   Q3_2   Q3_3   Q3_4   Q4_1   Q4_2   Q4_3
PostgreSQL 11.2:   2343.7 2320.6 2315.8 2233.1 2223.9 2216.8 2167.8 2281.9 2277.6 2275.9 2172.0 2198.8 2233.3
PG-Strom 2.2devel: 7880.4 7924.7 7912.5 7725.8 7712.6 7827.2 7398.1 7258.0 7638.5 7750.4 7402.8 7419.6 7639.8
  • 19. SSD-to-GPU Direct SQL (4/4) – Software Stack PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second19 Filesystem (ext4) nvme_strom kernel module NVMe SSD PostgreSQL pg_strom extension read(2) ioctl(2) Operating System Software Layer Database Software Layer blk-mq nvme pcie nvme rdma Network HBA NVMe Request Network HBA NVMe SSD NVME-over-Fabric Target RDMA over Converged Ethernet GPUDirect RDMA Translation from logical location to physical location on the drive ■ Other software ■ Software by HeteroDB ■ Hardware Special NVME READ command that loads the source blocks on NVME-SSD to GPU’s device memory. External Storage Server Local Storage Layer
  • 20. PG-Strom: Arrow_Fdw (columnar store) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second20 Efficient Data Structure
  • 21. Background: Apache Arrow (1/2) ▌Characteristics  Column-oriented data format designed for analytics workloads  A common data format for inter-application exchange  Various primitive data types like integer, floating-point, date/time and so on PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second21 PostgreSQL / PG-Strom NVIDIA GPU
  • 22. Background: Apache Arrow (2/2) Apache Arrow vs. PostgreSQL data types (with extra description):
  Int             -> int2, int4, int8
  FloatingPoint   -> float2, float4, float8 (float2 is an enhancement by PG-Strom)
  Binary          -> bytea
  Utf8            -> text
  Bool            -> bool
  Decimal         -> numeric (Decimal = fixed-point 128bit value)
  Date            -> date (adjusted to unit = Day)
  Time            -> time (adjusted to unit = MicroSecond)
  Timestamp       -> timestamp (adjusted to unit = MicroSecond)
  Interval        -> interval
  List            -> array types (only 1-dimensional arrays are supported)
  Struct          -> composite types
  Union           -> ------
  FixedSizeBinary -> char(n)
  FixedSizeList   -> ------
  Map             -> ------
Most data types are convertible between Apache Arrow and PostgreSQL. PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second22
  • 23. Log-data characteristics (1/2) – Where is it generated? ETL OLTP OLAP Traditional OLTP&OLAP – Data is generated inside of the database system Data Creation IoT/M2M use case – Data is generated outside of the database system Log processing BI Tools BI Tools Gateway Server Data Creation Data Creation Many Devices PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second23 Data importing becomes a heavy, time-consuming operation for big-data processing. Data Import Import!
  • 24. Arrow_Fdw - maps Apache Arrow files as a foreign table PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second24 Apache Arrow Files Arrow_Fdw Foreign Table PostgreSQL Tables Not Importing Write out  Arrow_Fdw enables mapping Apache Arrow files as a PostgreSQL foreign table.  No importing, so Apache Arrow files are immediately visible.
  • 25. Arrow_Fdw - maps Apache Arrow files as a foreign table PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second25 Apache Arrow Files Arrow_Fdw Foreign Table PostgreSQL Tables Write out  Arrow_Fdw enables mapping Apache Arrow files as a PostgreSQL foreign table.  No importing, so Apache Arrow files are immediately visible.  pg2arrow generates an Apache Arrow file from query results of PostgreSQL. pg2arrow
  • 26. Arrow_Fdw - maps Apache Arrow files as a foreign table PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second26 Apache Arrow Files Arrow_Fdw Foreign Table PostgreSQL Tables Write out  Arrow_Fdw enables mapping Apache Arrow files as a PostgreSQL foreign table (see the sketch below).  No importing, so Apache Arrow files are immediately visible.  pg2arrow generates an Apache Arrow file from query results of PostgreSQL.  Arrow_Fdw can be writable, but only batched-INSERT is supported. pg2arrow
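A minimal sketch of what such a mapping can look like. The table, column names and file path are hypothetical; the foreign server name arrow_fdw is an assumption about the Arrow_Fdw setup, while the 'file' FDW option matches what the validation slide (p.29) shows.
CREATE FOREIGN TABLE flogdata (
    ts        timestamp,
    device_id int,
    signal    float8
)
SERVER arrow_fdw
OPTIONS (file '/opt/nvme/logdata_202002.arrow');

-- the mapped file is immediately visible; no import step is needed
SELECT device_id, count(*), avg(signal)
  FROM flogdata
 WHERE ts >= '2020-02-01'
 GROUP BY device_id;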
  • 27. SSD-to-GPU Direct SQL on Arrow_Fdw (1/3) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second27 ▌Why is Apache Arrow beneficial?  Less I/O to be loaded; only referenced columns  Higher utilization of GPU cores; by vector processing and a wide memory bus  Read-only structure; no MVCC checks are required at run-time It transfers ONLY Referenced Columns over the SSD-to-GPU Direct SQL mechanism PCIe Bus NVMe SSD GPU SSD-to-GPU P2P DMA WHERE-clause JOIN GROUP BY P2P data transfer only referenced columns GPU code supports Apache Arrow format as data source. Runs SQL workloads in parallel on thousands of cores. Write back the small results built as heap-tuples of PostgreSQL. Results metadata
  • 28. SSD-to-GPU Direct SQL on Arrow_Fdw (2/3) - Benchmark Results  Query Execution Throughput = (6.0B rows) / (Query Response Time [sec])  Combined usage of SSD-to-GPU Direct SQL and the columnar store pulled out extreme performance: 130-500M rows per second  Server configuration is identical to what we show on p.17. (1U, 1CPU, 1GPU, 3SSD) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second28
[Chart: Star Schema Benchmark (s=1000, DBsize: 879GB) on Tesla V100 x1 + DC P4600 x3; Number of rows processed per second [million rows/sec]]
Query:                    Q1_1  Q1_2  Q1_3  Q2_1  Q2_2  Q2_3  Q3_1  Q3_2  Q3_3  Q3_4  Q4_1  Q4_2  Q4_3
PostgreSQL 11.5:          16.3  16.0  16.0  15.7  15.5  15.6  15.6  15.9  15.7  15.7  15.5  15.6  15.4
PG-Strom 2.2 (Row data):  55.0  55.1  55.1  54.4  54.4  54.6  52.9  54.0  54.9  54.8  53.2  53.1  54.0
PG-Strom 2.2 + Arrow_Fdw: 481.0 494.2 499.6 246.6 261.7 282.0 156.2 179.0 183.6 185.0 138.3 139.1 152.6
  • 29. SSD-to-GPU Direct SQL on Arrow_Fdw (3/3) - Benchmark Validation
Foreign table "public.flineorder"
       Column       |     Type      |  Size
--------------------+---------------+---------
 lo_orderkey        | numeric       | 89.42GB
 lo_linenumber      | integer       | 22.37GB
 lo_custkey         | numeric       | 89.42GB
 lo_partkey         | integer       | 22.37GB  <-- ★Referenced by Q2_1
 lo_suppkey         | numeric       | 89.42GB  <-- ★Referenced by Q2_1
 lo_orderdate       | integer       | 22.37GB  <-- ★Referenced by Q2_1
 lo_orderpriority   | character(15) | 83.82GB
 lo_shippriority    | character(1)  | 5.7GB
 lo_quantity        | integer       | 22.37GB
 lo_extendedprice   | integer       | 22.37GB
 lo_ordertotalprice | integer       | 22.37GB
 lo_discount        | integer       | 22.37GB
 lo_revenue         | integer       | 22.37GB  <-- ★Referenced by Q2_1
 lo_supplycost      | integer       | 22.37GB
 lo_tax             | integer       | 22.37GB
 lo_commit_date     | character(8)  | 44.71GB
 lo_shipmode        | character(10) | 55.88GB
FDW options: (file '/opt/nvme/lineorder_s401.arrow') ... file size = 310GB
▌Q2_1 actually reads only 157GB of 681GB (23.0%) from the NVME-SSD. ▌Execution time of Q2_1 is 24.3s, so 157GB / 24.3s = 6.46GB/s ➔ Reasonable performance for 3x Intel DC P4600 on a single-CPU configuration. Almost equivalent physical data transfer, but no need to load unreferenced columns. PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second29
  • 30. Table Partitioning Efficient Data Deployment PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second30
  • 31. Log-data characteristics (2/2) – INSERT-only ✓ MVCC visibility checks are (relatively) not significant. ✓ Rows with old timestamps will never be inserted. INSERT UPDATE DELETE INSERT UPDATE DELETE Transactional Data Log Data with timestamp current 2019 2018 PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second31
  • 32. Configuration of PostgreSQL partition for log-data (1/2) ▌Mixture of PostgreSQL tables and Arrow foreign tables in the partition declaration ▌Log-data should have a timestamp and never be updated ➔ Old data can be moved to an Arrow foreign table for more efficient I/O logdata_201912 logdata_202001 logdata_202002 logdata_current logdata table (PARTITION BY timestamp) 2020-03-21 12:34:56 dev_id: 2345 signal: 12.4 Log-data with timestamp PostgreSQL tables Row data store Read-writable but slow Arrow foreign table Column data store Read-only but fast Unrelated child tables are skipped if the query contains a WHERE-clause on the timestamp E.g) WHERE timestamp > ‘2020-01-23’ AND device_id = 456 PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second32
  • 33. Configuration of PostgreSQL partition for log-data (2/2) logdata_201912 logdata_202001 logdata_202002 logdata_current logdata table (PARTITION BY timestamp) 2019-03-21 12:34:56 dev_id: 2345 signal: 12.4 logdata_201912 logdata_202001 logdata_202002 logdata_202003 logdata_current logdata table (PARTITION BY timestamp) a month later Extract data of 2020-03 PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second33 ▌Mixture of PostgreSQL tables and Arrow foreign tables in the partition declaration ▌Log-data should have a timestamp and never be updated ➔ Old data can be moved to an Arrow foreign table for more efficient I/O (see the partition sketch below) Log-data with timestamp
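A rough sketch of this layout with hypothetical table names, file paths and date ranges; since PostgreSQL 10 a foreign table can be attached as a partition, which is what makes the mixture above possible. The arrow_fdw server name is the same assumption as in the earlier sketch.
CREATE TABLE logdata (
    ts        timestamp NOT NULL,
    device_id int,
    signal    float8
) PARTITION BY RANGE (ts);

-- current month: ordinary (read-writable) PostgreSQL row table
CREATE TABLE logdata_current PARTITION OF logdata
    FOR VALUES FROM ('2020-03-01') TO ('2020-04-01');

-- closed month: read-only columnar data mapped via Arrow_Fdw
CREATE FOREIGN TABLE logdata_202002 PARTITION OF logdata
    FOR VALUES FROM ('2020-02-01') TO ('2020-03-01')
    SERVER arrow_fdw
    OPTIONS (file '/opt/nvme/logdata_202002.arrow');

-- partition pruning skips unrelated children, as described above
SELECT * FROM logdata
 WHERE ts > '2020-01-23' AND device_id = 456;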
  • 34. Asymmetric Partition-wise JOIN (1/4) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second34 lineorder lineorder_p0 lineorder_p1 lineorder_p2 remainder=0 remainder=1 remainder=2 customer date supplier parts tablespace: nvme0 tablespace: nvme1 tablespace: nvme2 Records from the partition leafs must be brought back to the CPU and processed there first! Scan Scan Scan Append Join Agg Query Results Scan A massive number of records must be processed on the CPU first
  • 35. Asymmetric Partition-wise JOIN (2/4)
postgres=# explain select * from ptable p, t1 where p.a = t1.aid;
                                     QUERY PLAN
----------------------------------------------------------------------------------
 Hash Join  (cost=2.12..24658.62 rows=49950 width=49)
   Hash Cond: (p.a = t1.aid)
   ->  Append  (cost=0.00..20407.00 rows=1000000 width=12)
         ->  Seq Scan on ptable_p0 p  (cost=0.00..5134.63 rows=333263 width=12)
         ->  Seq Scan on ptable_p1 p_1  (cost=0.00..5137.97 rows=333497 width=12)
         ->  Seq Scan on ptable_p2 p_2  (cost=0.00..5134.40 rows=333240 width=12)
   ->  Hash  (cost=1.50..1.50 rows=50 width=37)
         ->  Seq Scan on t1  (cost=0.00..1.50 rows=50 width=37)
(8 rows)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second35
  • 36. Asymmetric Partition-wise JOIN (3/4) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second36 lineorder lineorder_p0 lineorder_p1 lineorder_p2 remainder=0 remainder=1 remainder=2 customer date supplier parts Push down JOIN/GROUP BY, even if the smaller side is not a partitioned table. Join Append Agg Query Results Scan Scan PreAgg Join Scan PreAgg Join Scan PreAgg tablespace: nvme0 tablespace: nvme1 tablespace: nvme2 Only a very small number of partial results of JOIN/GROUP BY need to be gathered on the CPU.
  • 37. Asymmetric Partition-wise JOIN (4/4)
postgres=# set enable_partitionwise_join = on;
SET
postgres=# explain select * from ptable p, t1 where p.a = t1.aid;
                                     QUERY PLAN
----------------------------------------------------------------------------------
 Append  (cost=2.12..19912.62 rows=49950 width=49)
   ->  Hash Join  (cost=2.12..6552.96 rows=16647 width=49)
         Hash Cond: (p.a = t1.aid)
         ->  Seq Scan on ptable_p0 p  (cost=0.00..5134.63 rows=333263 width=12)
         ->  Hash  (cost=1.50..1.50 rows=50 width=37)
               ->  Seq Scan on t1  (cost=0.00..1.50 rows=50 width=37)
   ->  Hash Join  (cost=2.12..6557.29 rows=16658 width=49)
         Hash Cond: (p_1.a = t1.aid)
         ->  Seq Scan on ptable_p1 p_1  (cost=0.00..5137.97 rows=333497 width=12)
         ->  Hash  (cost=1.50..1.50 rows=50 width=37)
               ->  Seq Scan on t1  (cost=0.00..1.50 rows=50 width=37)
   ->  Hash Join  (cost=2.12..6552.62 rows=16645 width=49)
         Hash Cond: (p_2.a = t1.aid)
         ->  Seq Scan on ptable_p2 p_2  (cost=0.00..5134.40 rows=333240 width=12)
         ->  Hash  (cost=1.50..1.50 rows=50 width=37)
               ->  Seq Scan on t1  (cost=0.00..1.50 rows=50 width=37)
(16 rows)
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second37
  • 38. PCIe-bus level optimization (1/3) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second38 Supermicro SYS-4029TRT2 x96 lane PCIe switch x96 lane PCIe switch CPU2 CPU1 QPI Gen3 x16 Gen3 x16 for each slot Gen3 x16 for each slot Gen3 x16 ▌HPC Server – optimization for GPUDirect RDMA ▌I/O Expansion Box NEC ExpEther 40G (4slots edition) Network Switch 4 slots of PCIe Gen3 x8 PCIe Switch 40Gb Ethernet CPU NIC Extra I/O Boxes
  • 39. PCIe-bus level optimization (2/3) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second39 By P2P DMA over the PCIe switch, the major data traffic bypasses the CPU. CPU CPU PCIe switch SSD GPU PCIe switch SSD GPU PCIe switch SSD GPU PCIe switch SSD GPU SCAN SCAN SCAN SCAN JOIN JOIN JOIN JOIN GROUP BY GROUP BY GROUP BY GROUP BY Pre-processed Data (very small) GATHER GATHER
  • 40. PCIe-bus level optimization (3/3) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second40
$ pg_ctl restart
  :
LOG: - PCIe[0000:80]
LOG:   - PCIe(0000:80:02.0)
LOG:     - PCIe(0000:83:00.0)
LOG:       - PCIe(0000:84:00.0)
LOG:         - PCIe(0000:85:00.0) nvme0 (INTEL SSDPEDKE020T7)
LOG:       - PCIe(0000:84:01.0)
LOG:         - PCIe(0000:86:00.0) GPU0 (Tesla P40)
LOG:       - PCIe(0000:84:02.0)
LOG:         - PCIe(0000:87:00.0) nvme1 (INTEL SSDPEDKE020T7)
LOG:   - PCIe(0000:80:03.0)
LOG:     - PCIe(0000:c0:00.0)
LOG:       - PCIe(0000:c1:00.0)
LOG:         - PCIe(0000:c2:00.0) nvme2 (INTEL SSDPEDKE020T7)
LOG:       - PCIe(0000:c1:01.0)
LOG:         - PCIe(0000:c3:00.0) GPU1 (Tesla P40)
LOG:       - PCIe(0000:c1:02.0)
LOG:         - PCIe(0000:c4:00.0) nvme3 (INTEL SSDPEDKE020T7)
LOG:   - PCIe(0000:80:03.2)
LOG:     - PCIe(0000:e0:00.0)
LOG:       - PCIe(0000:e1:00.0)
LOG:         - PCIe(0000:e2:00.0) nvme4 (INTEL SSDPEDKE020T7)
LOG:       - PCIe(0000:e1:01.0)
LOG:         - PCIe(0000:e3:00.0) GPU2 (Tesla P40)
LOG:       - PCIe(0000:e1:02.0)
LOG:         - PCIe(0000:e4:00.0) nvme5 (INTEL SSDPEDKE020T7)
LOG: GPU<->SSD Distance Matrix
LOG:         GPU0   GPU1   GPU2
LOG: nvme0   ( 3)     7      7
LOG: nvme5     7      7    ( 3)
LOG: nvme4     7      7    ( 3)
LOG: nvme2     7    ( 3)     7
LOG: nvme1   ( 3)     7      7
LOG: nvme3     7    ( 3)     7
PG-Strom chooses the closest GPU for the tables to be scanned, according to the PCI-E device distance.
  • 41. Large-scale Benchmark PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second41
  • 42. Large-scale Benchmark (1/4) Capable of processing 100TB of log data on a single node in a reasonable time 100-120GB/s of effective data transfer capability with 4 units in parallel 25-30GB/s of effective data transfer capability per unit thanks to the columnar data structure QPI All of SSD-to-GPU Direct SQL, Apache Arrow and PCI-E bus optimization are used CPU2 CPU1 RAM RAM PCIe-SW PCIe-SW NVME0 NVME1 NVME2 NVME3 NVME4 NVME5 NVME6 NVME7 NVME8 NVME9 NVME10 NVME11 NVME12 NVME13 NVME14 NVME15 GPU0 GPU1 GPU2 GPU3 HBA0 HBA1 HBA2 HBA3 8.0-10GB/s of physical data transfer capability per GPU+4xSSD unit 42 PCIe-SW PCIe-SW JBOF unit-0 JBOF unit-1 PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second
  • 43. Large-scale Benchmark (2/4) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second43 Built a 4U HPC server configuration with 4 GPUs and 16 NVME-SSDs CPU2 CPU1 SerialCables dual PCI-ENC8G-08A (U.2 NVME JBOF; 8slots) x2 NVIDIA Tesla V100 x4 PCIe switch PCIe switch PCIe switch PCIe switch Gen3 x16 for each slot Gen3 x16 for each slot via SFF-8644 based PCIe x4 cables (3.2GB/s x 16 = max 51.2GB/s) Intel SSD DC P4510 (1.0TB) x16 SerialCables PCI-AD-x16HE x4 Supermicro SYS-4029GP-TRT SPECIAL THANKS
  • 44. Large-scale Benchmark (3/4) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second44 Built a 4U HPC server configuration with 4 GPUs and 16 NVME-SSDs
  • 45. Large-scale Benchmark (4/4) UPI Distributed 3.5TB test data into 4 partitions; 879GB for each CPU2 CPU1 RAM RAM PCIe-SW PCIe-SW NVME0 NVME1 NVME2 NVME3 NVME4 NVME5 NVME6 NVME7 NVME8 NVME9 NVME10 NVME11 NVME12 NVME13 NVME14 NVME15 GPU0 GPU1 GPU2 GPU3 HBA0 HBA1 HBA2 HBA3 45 PCIe-SW PCIe-SW PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second lineorder_p0 (879GB) lineorder_p1 (879GB) lineorder_p2 (879GB) lineorder_p3 (879GB) lineorder (---GB) Hash-Partition (N=4)
  • 46. Benchmark Results (1/2) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second46 A billion rows per second on a single-node PostgreSQL!!
[Chart: Results of Star Schema Benchmark [SF=4,000 (3.5TB; 24B rows), 4xGPU, 16xNVME-SSD]; Number of rows processed per second [million rows/sec]]
Query:                    Q1_1  Q1_2  Q1_3  Q2_1  Q2_2  Q2_3  Q3_1  Q3_2  Q3_3  Q3_4  Q4_1  Q4_2  Q4_3
PostgreSQL 11.5:          64.0  65.0  65.2  58.8  58.0  67.3  43.9  58.7  61.0  60.8  52.9  53.0  59.1
PG-Strom 2.2:             255.2 255.6 256.9 257.6 258.0 257.5 248.5 254.9 253.3 253.3 249.3 249.4 249.6
PG-Strom 2.2 + Arrow_Fdw: 1,681 1,714 1,719 1,058 1,090 1,127 753   801   816   812   658   664   675
  • 47. Benchmark Results (2/2) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second47 40GB/s Read from NVME-SSDs during SQL execution (95% of H/W limit) [Chart: NVME-SSD Read Throughput [MB/s] vs. Elapsed Time [sec]; series: /dev/md0 (GPU0), /dev/md1 (GPU1), /dev/md2 (GPU2), /dev/md3 (GPU3)]
  • 48. PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second48 Future Direction
  • 49. NVIDIA GPUDirect Storage (cuFile API) support PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second49 ▌NVIDIA GPUDirect Storage?  API & driver stack for direct reads from NVME to GPU ✓ Features are almost equivalent to NVME-Strom ✓ Ubuntu 18.04 & RHEL8/CentOS8 shall be supported at the initial release  NVIDIA will officially release the software in 2020. ▌Why is it beneficial?  Linux kernel driver that is tested/evaluated by NVIDIA’s QA process.  Involvement of a broader ecosystem with third-party solutions, like distributed filesystems, block storage, and software-defined storage  Broader OS support; Ubuntu 18.04. PG-Strom is here Ref: GPUDIRECT STORAGE: A DIRECT GPU-STORAGE DATA PATH https://on-demand.gputechconf.com/supercomputing/2019/pdf/sc1922-gpudirect-storage-transfer-data-directly-to-gpu-memory-alleviating-io-bottlenecks.pdf WIP
  • 50. PostGIS Functions & Operator support (1/2) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second50 ▌PostGIS  An extension to handle geographic objects and relational operations on them. ✓ Geographic objects: Point, LineString, Polygon, etc. ✓ Geographic relational operations: contains, crosses, touches, distance, etc.  Wise design for fast search ✓ Bounding-box, GiST-index, CPU parallelism  v1.0 was released in 2005, and its development has continued for over 15 years. © GAIA RESOURCES Geometric master data Sensor Data Massive data including location information Aggregating and summarizing data by prefecture, city, town, village and so on [Expected usage scenarios for IoT/M2M data] WIP
  • 51. PostGIS Functions & Operator support (2/2) PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second51 ▌Current status  geometry st_makepoint(float8,float8,...)  float8 st_distance(geometry,geometry)  bool st_dwithin(geometry,geometry,float8)  bool st_contains(geometry,geometry)  bool st_crosses(geometry,geometry) ▌Next: GiST (R-tree) Index support  Index-based nested-loop for up to millions of polygons x billions of points.  Runs the index search on thousands of GPU threads in parallel. ➔ Because of the GiST internal structure, GPU parallelism is likely to be efficient. WIP GiST(R-tree) Index Polygon-definition Table with Location data Index search by thousands of threads in parallel
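To illustrate the kind of workload this targets, here is a minimal query sketch using only the functions listed above. The table and column names are hypothetical, and whether a given expression is actually off-loaded to the GPU depends on the PG-Strom version.
-- count sensor readings that fall inside each administrative polygon
SELECT a.pref_name, count(*)
  FROM area_master a,      -- polygon master table (geometry column: geom)
       sensor_log  s       -- IoT log with lon/lat columns
 WHERE st_contains(a.geom, st_makepoint(s.lon, s.lat))
 GROUP BY a.pref_name;

-- nearby search within a given distance of a reference point
SELECT count(*)
  FROM sensor_log s
 WHERE st_dwithin(st_makepoint(s.lon, s.lat),
                  st_makepoint(139.7671, 35.6812),   -- hypothetical reference point
                  0.05);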
  • 52. Resources PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second52 ▌Repository  https://github.com/heterodb/pg-strom  https://heterodb.github.io/swdc/ ▌Document  http://heterodb.github.io/pg-strom/ ▌Contact  kaigai@heterodb.com  Tw: @kkaigai / @heterodb
  • 53.
  • 54. Backup Slides PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second54
  • 55. Apache Arrow Internal (1/3) ▌Apache Arrow Internal  Header • Fixed byte string: “ARROW1\0\0”  Schema Definition • # of columns and column definitions. • Data type/precision, column name, column number, ...  Record Batch • Internal data block that contains a certain # of rows. • E.g., if N=1M rows with (Int32, Float64), this record-batch begins with a 1M Int32 array, then a 1M Float64 array follows.  Dictionary Batch • Optional area for dictionary compression. • E.g) 1 = “Tokyo”, 2 = “Osaka”, 3 = “Kyoto” ➔ Int32 representation for frequently appearing words  Footer • Metadata - offset/length of RecordBatches and DictionaryBatches Header “ARROW1\0\0” Schema Definition DictionaryBatch-0 RecordBatch-0 RecordBatch-k Footer • DictionaryBatch[0] (offset, size) • RecordBatch[0] (offset, size) : • RecordBatch[k] (offset, size) Apache Arrow file PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second55
  • 56. Apache Arrow Internal (2/3) - writable Arrow_Fdw Header “ARROW1\0\0” Schema Definition DictionaryBatch-0 RecordBatch-0 RecordBatch-k Footer • DictionaryBatch[0] • RecordBatch[0] : • RecordBatch[k] Arrow file (before writes) Header “ARROW1\0\0” Schema Definition DictionaryBatch-0 RecordBatch-0 RecordBatch-k Footer (new revision) • DictionaryBatch[0] • RecordBatch[0] : • RecordBatch[k] • RecordBatch[k+1] Arrow file (after writes) RecordBatch-(k+1) Overwrite of the original Footer area Rows written by a single INSERT command PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second56
  • 57. Apache Arrow Internal (3/3) - writable Arrow_Fdw Header “ARROW1\0\0” Schema Definition DictionaryBatch-0 RecordBatch-0 RecordBatch-k Footer • DictionaryBatch[0] • RecordBatch[0] : • RecordBatch[k] Arrow file (before writes) Header “ARROW1\0\0” Schema Definition DictionaryBatch-0 RecordBatch-0 RecordBatch-k Footer (new revision) • DictionaryBatch[0] • RecordBatch[0] : • RecordBatch[k] • RecordBatch[k+1] Arrow file (after writes) RecordBatch-(k+1) Overwrite of the original Footer area Footer • DictionaryBatch[0] • RecordBatch[0] : • RecordBatch[k] Arrow_Fdw keeps the original Footer image until commit, for support of rollback. PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second57
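Tied to the two slides above, a single bulk INSERT into an Arrow_Fdw foreign table appends one new RecordBatch and rewrites the Footer. A minimal sketch, with the same hypothetical table names as the earlier examples:
BEGIN;
-- rows written by this one INSERT become RecordBatch-(k+1) in the mapped Arrow file
INSERT INTO flogdata
    SELECT ts, device_id, signal
      FROM logdata_current
     WHERE ts < '2020-03-01';
COMMIT;   -- on ROLLBACK, the preserved Footer image is restored instead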
  • 58. Pg2Arrow / MySQL2Arrow
$ pg2arrow --help
Usage: pg2arrow [OPTION] [database] [username]
General options:
  -d, --dbname=DBNAME      Database name to connect to
  -c, --command=COMMAND    SQL command to run
  -t, --table=TABLENAME    Table name to be dumped
      (-c and -t are exclusive, either of them must be given)
  -o, --output=FILENAME    result file in Apache Arrow format
      --append=FILENAME    result Apache Arrow file to be appended
      (--output and --append are exclusive. If neither of them are given,
       it creates a temporary file.)
Arrow format options:
  -s, --segment-size=SIZE  size of record batch for each
Connection options:
  -h, --host=HOSTNAME      database server host
  -p, --port=PORT          database server port
  -u, --user=USERNAME      database user name
  -w, --no-password        never prompt for password
  -W, --password           force password prompt
Other options:
      --dump=FILENAME      dump information of arrow file
      --progress           shows progress of the job
      --set=NAME:VALUE     config option to set before SQL execution
      --help               shows this message
✓ mysql2arrow also has almost equivalent capability
Pg2Arrow / MySQL2Arrow enable dumping SQL results as an Arrow file
Apache Arrow Data Files Arrow_Fdw Pg2Arrow PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second58
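For example, based only on the options shown in the help text above (database name, table and file paths are hypothetical):
$ pg2arrow -d ssbm -t lineorder -o /opt/nvme/lineorder.arrow                            # dump a whole table
$ pg2arrow -d ssbm -c 'SELECT * FROM logdata_current' --append /opt/nvme/logdata.arrow  # append query results
$ pg2arrow --dump /opt/nvme/lineorder.arrow                                             # show metadata of an existing Arrow file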