Massive log data processing using I/O optimized PostgreSQL
〜NVMe and GPU break through the storage performance limitation〜
HeteroDB,Inc
Chief Architect & CEO
KaiGai Kohei <kaigai@heterodb.com>
about HeteroDB
Corporate overview
 Name: HeteroDB, Inc
 Established: 4th-Jul-2017
 Location: Shinagawa, Tokyo, Japan
 Businesses: Production of an accelerated PostgreSQL database;
technical consulting in the GPU & DB region
With heterogeneous-computing technology in the database area,
we provide a useful, fast and cost-effective data analytics platform
for all the people who need the power of analytics.
CEO Profile
 KaiGai Kohei – He has contributed to PostgreSQL and Linux kernel
development in the OSS community for more than ten years, especially
to the security and database federation features of PostgreSQL.
 Awarded “Genius Programmer” by the IPA MITOH program (2007)
 Poster finalist at GPU Technology Conference 2017
Core technology – PG-Strom
PG-Strom: an extension module for PostgreSQL that accelerates SQL
workloads with the thousands of cores and wide-band memory of the GPU.
(Diagram: PG-Strom connects the GPU to big-data processing and machine-learning & statistics workloads)
Features of RDBMS
 High-availability / Clustering
 DB administration and backup
 Transaction control
 BI and visualization
 We can use products that support PostgreSQL as-is.
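To make the "extension module" point concrete, here is a minimal setup sketch. The deck itself does not show installation steps, so treat the parameter value and extension name below as the usual convention for a planner/executor extension, not as documentation; the sample query reuses the t0 table that appears later in this deck.

-- Minimal setup sketch (assumption: PG-Strom is installed and loaded the
-- usual way for a PostgreSQL extension).
--   postgresql.conf:  shared_preload_libraries = 'pg_strom'
CREATE EXTENSION pg_strom;

-- After that, ordinary SQL is offloaded to the GPU transparently whenever
-- the planner estimates the GPU path to be cheaper.
SELECT cat, count(*), avg(x) FROM t0 GROUP BY cat;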
GPU’s characteristics - mostly as a computing accelerator
Over 10 years of history in HPC, then massive popularization in machine learning
(Figure: NVIDIA Tesla V100; used for supercomputers (Tokyo Tech TSUBAME3.0), simulation, computer graphics and machine learning)
How can I/O workloads be accelerated by the GPU, which is a computing accelerator?
Where is the best “location” to process massive data?
 Data transportation is fundamentally expensive.
 SQL is a simple and flexible tool for data management.
 Long-standing familiar tool for engineers.
(Diagram: data pipeline from Data Generation (INSERT) → Data Accumulation → Data Processing → Data Visualization, together with Statistics & Machine-Learning and BI Tools;
PG-Strom for Big-Data: SSD-to-GPU Direct SQL, columnar data (Parquet);
PG-Strom for Analytics: PL/CUDA, Gstore_fdw, Pystrom,
e.g. CREATE FUNCTION my_logic( reggstore, text ) RETURNS matrix AS $$ ... $$ LANGUAGE ‘plcuda’;)
The database system is closest to the data:
simple & flexible data management without data transportation.
Of course, there is no “silver bullet” for every usage scenario
Small Tier
• Small & Middle Businesses or
Branch of Big Companies
• Data size: Less than 100GB
• Software: RDBMS
Middle Tier
• Global or domestic large company
• Data size: 1TB〜100TB
• Software: Small scale Hadoop, or
DWH solution
Top Tier
• Global Mega-Company
• Data size: 100TB, or often PB class
• Software: Large scale Hadoop cluster
Technical Challenges for Log Processing (1/3)
(Diagram: the three challenges are data management, I/O throughput, and computing)
Technical Challenges for Log Processing (2/3)
A distributed system puts less load on the hardware, but is more expensive to administer.
(Diagram: the goal is data management with a reasonable design)
Technical Challenges for Log Processing (3/3)
Massive log data processing using I/O optimized PostgreSQL
Ultra
Fast I/O
Parallel
Computing
Hardware Evolution pulls up single-node potential to big-data processing
GPU
NVME-
SSD
How does PostgreSQL utilize the GPU?
〜Architecture of PG-Strom〜
GPU code generation from SQL - Example of WHERE-clause
QUERY: SELECT cat, count(*), avg(x) FROM t0
WHERE x between y and y + 20.0 GROUP BY cat;
:
STATIC_FUNCTION(bool)
gpupreagg_qual_eval(kern_context *kcxt,
kern_data_store *kds,
size_t kds_index)
{
pg_float8_t KPARAM_1 = pg_float8_param(kcxt,1);
pg_float8_t KVAR_3 = pg_float8_vref(kds,kcxt,2,kds_index);
pg_float8_t KVAR_4 = pg_float8_vref(kds,kcxt,3,kds_index);
return EVAL((pgfn_float8ge(kcxt, KVAR_3, KVAR_4) &&
pgfn_float8le(kcxt, KVAR_3,
pgfn_float8pl(kcxt, KVAR_4, KPARAM_1))));
} :
e.g.) On-demand transformation of the numeric formula in the WHERE-clause into CUDA C code
Reference to input data
SQL expression in CUDA source code
Run-time
compiler
Parallel
Execution
EXPLAIN shows the query execution plan with custom GPU nodes
postgres=# EXPLAIN ANALYZE
SELECT cat,count(*),sum(ax) FROM tbl NATURAL JOIN t1 WHERE cid % 100 < 50 GROUP BY cat;
QUERY PLAN
---------------------------------------------------------------------------------------------------
GroupAggregate (cost=203498.81..203501.80 rows=26 width=20)
(actual time=1511.622..1511.632 rows=26 loops=1)
Group Key: tbl.cat
-> Sort (cost=203498.81..203499.26 rows=182 width=20)
(actual time=1511.612..1511.613 rows=26 loops=1)
Sort Key: tbl.cat
Sort Method: quicksort Memory: 27kB
-> Custom Scan (GpuPreAgg) (cost=203489.25..203491.98 rows=182 width=20)
(actual time=1511.554..1511.562 rows=26 loops=1)
Reduction: Local
Combined GpuJoin: enabled
-> Custom Scan (GpuJoin) on tbl (cost=13455.86..220069.26 rows=1797115 width=12)
(never executed)
Outer Scan: tbl (cost=12729.55..264113.41 rows=6665208 width=8)
(actual time=50.726..1101.414 rows=19995540 loops=1)
Outer Scan Filter: ((cid % 100) < 50)
Rows Removed by Outer Scan Filter: 10047462
Depth 1: GpuHashJoin (plan nrows: 6665208...1797115,
actual nrows: 9948078...2473997)
HashKeys: tbl.aid
JoinQuals: (tbl.aid = t1.aid)
KDS-Hash (size plan: 11.54MB, exec: 7125.12KB)
-> Seq Scan on t1 (cost=0.00..2031.00 rows=100000 width=12)
(actual time=0.016..15.407 rows=100000 loops=1)
Planning Time: 0.721 ms
Execution Time: 1595.815 ms
(19 rows)
A typical composition of an x86_64 server
(Diagram: CPU, RAM, GPU, SSD, HDD, N/W)
Data flow to process a massive amount of data
(Diagram: normal data flow; PostgreSQL data blocks travel from SSD over the PCIe bus to CPU/RAM, and only then to the GPU)
All the records, including junk, must be loaded onto RAM once, because software
cannot check the necessity of a row prior to loading it. As a result, the amount of
I/O traffic over the PCIe bus tends to be large.
Unless records are loaded to CPU/RAM once, over the PCIe bus, software cannot
check whether they are needed, even if they are “junk”.
Core Feature: SSD-to-GPU Direct SQL (v2.0)
(Diagram: PostgreSQL data blocks are loaded from the SSD straight to the GPU over the PCIe bus)
NVIDIA GPUDirect RDMA allows the data blocks on NVMe-SSD to be loaded to the GPU
using peer-to-peer DMA over the PCIe bus, bypassing CPU/RAM.
The WHERE-clause, JOIN and GROUP BY are run by the GPU to reduce the data size,
so only a small amount of data reaches the CPU.
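To make the idea concrete, here is a hedged sketch of how the feature is exercised from plain SQL: nothing changes at the query level, and PG-Strom decides internally whether the direct SSD-to-GPU path can be used for the scanned table. The table and column names below are hypothetical.

-- Hedged sketch: 'logtbl' and its columns are hypothetical. With PG-Strom
-- enabled, a scan-heavy aggregation like this may be planned as GpuPreAgg /
-- GpuScan nodes, and the storage reads may go through SSD-to-GPU Direct SQL.
EXPLAIN (ANALYZE, COSTS OFF)
SELECT date_trunc('hour', ts) AS hour, count(*)
  FROM logtbl                       -- assumed to be stored on NVMe-SSD
 WHERE status_code >= 500
 GROUP BY 1
 ORDER BY 1;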
Benchmark Results – single-node version
(Chart: Star Schema Benchmark on NVMe-SSD + md-raid0, query processing throughput [MB/sec] for Q1_1..Q4_3;
PgSQL 9.6 (SSDx3): roughly 1,900 to 2,170 MB/s per query;
PG-Strom 2.0 (SSDx3): roughly 5,480 to 6,280 MB/s per query, close to the H/W spec of 3x SSD)
SSD-to-GPU Direct SQL pulls out performance close to the H/W spec
 Measured with the Star Schema Benchmark, a set of typical batch / reporting workloads.
 CPU: Intel Xeon E5-2650v4, RAM: 128GB, GPU: NVIDIA Tesla P40, SSD: Intel 750 (400GB; SeqRead 2.2GB/s) x3
 The dataset size is 353GB (sf: 401), to ensure an I/O-bound workload.
Technology Basis - GPUDirect RDMA
▌P2P data transfer technology between the GPU and other PCIe devices, bypassing the CPU
 Originally designed for multi-node MPI over InfiniBand
 Also the infrastructure for Linux kernel drivers of other PCIe devices, including NVMe-SSDs.
Copyright (c) NVIDIA corporation, 2015
SSD-to-GPU Direct SQL - Software Stack
Filesystem
(ext4, xfs)
nvme_strom
kernel module
NVMe SSD
NVIDIA Tesla GPU
PostgreSQL
pg_strom
extension
read(2) ioctl(2)
Hardware
Layer
Operating
System
Software
Layer
Database
Software
Layer
blk-mq
nvme pcie nvme rdma
Network HBA
File-offset to NVME-SSD sector
number translation
NVMEoF Target
(JBOF)
NVMe
Request
■ Other software component
■ Our developed software
■ Hardware
NVME over Fabric
nvme_strom v2.0 supports
NVME-over-Fabric (RDMA).
Case Study: Log management and analytics solution
Queries that took 40 minutes in the former version now finish in about 30 seconds.
(Diagram: a database server with GPU + NVMe-SSD; when a security incident occurs, analysts search the raw log rows by SQL, e.g. “Aug 12 18:01:01 saba systemd: Started Session 21060 of user root.”)
• Fast trial & error
• Familiar interface
• Search on the raw row data
Workflow: understand the extent of the damage, identify the impact, report to the manager.
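As an illustration of that "search by SQL" workflow, the sketch below shows what such an investigation query could look like; the syslog table and its columns are hypothetical names standing in for the raw log rows mentioned above.

-- Hedged illustration only: 'syslog', 'ts', 'host' and 'message' are
-- hypothetical names for a table that holds the raw log rows as-is.
SELECT ts, host, message
  FROM syslog
 WHERE ts BETWEEN '2018-08-12 17:00' AND '2018-08-12 19:00'
   AND message LIKE '%Started Session%'
 ORDER BY ts;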
Run faster, beyond the limitation
Next bottleneck – a flood of data beyond the CPU's capacity
Skylake-SP allows P2P DMA routing of up to 8.5GB/s over the PCIe root complex.
(Diagram: a GPU and three SSDs (SSD-1/2/3, md-raid0) whose P2P DMA is routed by a Xeon Gold 6126T CPU)
Consideration for the hardware configuration
A PCIe switch (PLX) can take the routing load off the CPU.
(Diagram: four PLX switches, each pairing an SSD with a GPU; SCAN, JOIN and GROUP BY run inside each pair, and only a very small amount of data is gathered back to the two CPUs)
Hardware with PCIe switch (1/2) - HPC Server
(Diagram: Supermicro SYS-4029TRT2; two x96-lane PCIe switches, each attached to a CPU over PCIe Gen3 x16 and providing Gen3 x16 for each slot; CPU1 and CPU2 are linked by QPI)
Hardware with PCIe switch (2/2) - I/O expansion box
(Diagram: NEC ExpEther 40G I/O expansion box; 4 slots of PCIe Gen3 x8 behind an internal PCIe switch, connected to the host CPU/NIC through a 40Gb Ethernet network switch; extra I/O boxes can be added)
System configuration with I/O expansion boxes
(Diagram: the host system (x86_64 server) connects to PCIe I/O expansion boxes over PCIe-over-Ethernet via NIC/HBA; each box holds NVMe-SSDs with PostgreSQL tables and a GPU behind an internal PCIe switch; SSD-to-GPU P2P DMA handles the large data inside the box, and only the pre-processed small data returns to the host)
 A few GB/s of SQL execution performance per box (WHERE-clause, JOIN and GROUP BY run inside the box)
 Simplified DB operations and application development thanks to the simple single-node PostgreSQL configuration
 Capacity and performance are enhanced by adding boxes, which are visible as leaves of partitioned child tables on PostgreSQL (v2.1)
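The partitioned-table layout mentioned above can be sketched with standard PostgreSQL 11 declarative partitioning. All table, tablespace and path names below are hypothetical; the only assumption is one tablespace per I/O expansion box.

-- Hypothetical sketch: one tablespace per I/O expansion box, one hash
-- partition per tablespace (plain PostgreSQL 11 syntax).
CREATE TABLESPACE box0 LOCATION '/nvme/box0';
CREATE TABLESPACE box1 LOCATION '/nvme/box1';
CREATE TABLESPACE box2 LOCATION '/nvme/box2';

CREATE TABLE lineorder (
    lo_orderkey   bigint,
    lo_orderdate  int,
    lo_revenue    bigint
) PARTITION BY HASH (lo_orderkey);

CREATE TABLE lineorder_p0 PARTITION OF lineorder
    FOR VALUES WITH (MODULUS 3, REMAINDER 0) TABLESPACE box0;
CREATE TABLE lineorder_p1 PARTITION OF lineorder
    FOR VALUES WITH (MODULUS 3, REMAINDER 1) TABLESPACE box1;
CREATE TABLE lineorder_p2 PARTITION OF lineorder
    FOR VALUES WITH (MODULUS 3, REMAINDER 2) TABLESPACE box2;

A scan over lineorder then fans out to the three child tables, each of which can be processed inside its own I/O expansion box.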
Benchmark (1/2) - System configuration
(Diagram quantities: x1 server, x3 I/O expansion boxes, x6 SSDs, x3 GPUs)
NEC Express5800/R120h-2m
CPU: Intel Xeon E5-2603 v4 (6C, 1.7GHz)
RAM: 64GB
OS: Red Hat Enterprise Linux 7
(kernel: 3.10.0-862.9.1.el7.x86_64)
CUDA-9.2.148 + driver 396.44
DB: PostgreSQL 11beta3 + PG-Strom v2.1devel
NEC ExpEther 40G (4slots)
I/F: PCIe 3.0 x8 (x16 physical) ... 4slots
with internal PCIe switch
N/W: 40Gb-ethernet
Intel DC P4600 (2.0TB; HHHL)
SeqRead: 3200MB/s, SeqWrite: 1575MB/s
RandRead: 610k IOPS, RandWrite: 196k IOPS
I/F: PCIe 3.0 x4
NVIDIA Tesla P40
# of cores: 3840 (1.3GHz)
Device RAM: 24GB (347GB/s, GDDR5)
CC: 6.1 (Pascal, GP104)
I/F: PCIe 3.0 x16
SPECIAL THANKS FOR
v2.1
Benchmark (2/2) - Result of query execution performance
 13 SSBM queries against a 1055GB database in total (i.e. 351GB per I/O expansion box)
 Raw I/O data transfer without SQL execution was up to 9GB/s;
in other words, SQL execution was faster than a simple storage read with raw I/O.
(Chart: Star Schema Benchmark for PgSQL v11beta3 / PG-Strom v2.1devel on NEC ExpEther x3, query execution throughput [MB/s] for Q1_1..Q4_3;
PostgreSQL v11beta3: roughly 1,800 to 2,800 MB/s per query;
PG-Strom v2.1devel: roughly 11,500 to 13,500 MB/s per query, versus the raw I/O limitation of about 9GB/s)
max 13.5GB/s for query execution performance with 3x GPU/SSD units!!
v2.1
PostgreSQL is a successor of XXXX?
13.5GB/s with 3x I/O boxes; 30GB/s with 8x I/O boxes is expected.
Single-node PostgreSQL now offers DWH-appliance grade performance.
The Future Direction
〜How the latest hardware makes Big Data easy〜
NVME-over-Fabric and GPU RDMA support
(ready)
 More NVMe-SSDs than can be installed in a single node together with GPUs
 Flexibility to enhance the storage capacity on demand
(Diagram: the GPU/DB-node (CPU + HBA) connects over a 100Gb network to JBOF storage-nodes full of SSDs; SSD-to-GPU Direct SQL runs over 100Gb Ethernet with the RDMA protocol)
Direct Load of External Columnar Data File (Parquet)
 Importing large data into PostgreSQL takes a long time before analytics can start.
 Referencing external data files can skip the data import.
 A columnar format minimizes the amount of I/O to be loaded from storage.
 SSD-to-GPU Direct SQL on the columnar file optimizes the I/O efficiency.
Foreign tables allow reference to various kinds of external data sources,
and can be the infrastructure for SSD-to-GPU Direct SQL on columnar data files.
(Diagram: besides a regular table, foreign tables point to an external RDBMS server via postgres_fdw (Oracle, MySQL, ...), a CSV file via file_fdw, GPU device memory via Gstore_fdw, and a Parquet file via Parquet_fdw; loading from an external data source into a regular table takes a long time)
WIP
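Since Parquet_fdw is marked WIP above, the following is only a hypothetical sketch of how such a foreign table might be declared; the server name, the 'file' option and the path are assumptions, while the CREATE SERVER / CREATE FOREIGN TABLE statements themselves are standard PostgreSQL FDW syntax.

-- Hypothetical sketch only: 'parquet_fdw', the 'file' option and the file
-- path are assumptions; the FDW is still work in progress in this deck.
CREATE SERVER parquet_files FOREIGN DATA WRAPPER parquet_fdw;

CREATE FOREIGN TABLE weblog_2018_11 (
    ts      timestamp,
    host    text,
    status  int
) SERVER parquet_files
  OPTIONS (file '/opt/data/weblog_2018_11.parquet');

-- Queries on the foreign table skip the bulk-import step entirely.
SELECT host, count(*) FROM weblog_2018_11 WHERE status >= 500 GROUP BY host;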
Expected usage – IoT grade log data processing and analysis
A data management, analytics and machine-learning platform for log data that grows day by day
(Diagram: log sources such as manufacturing, logistics, mobile and home electronics feed massive log collection and management; the logs land on a JBoF (just a bunch of flash) reached over NVMe-over-Fabric (RDMA), are processed by PG-Strom, and are consumed through BI tools)
Why PG-Strom?
 It covers around 30-50TB of data with a single node by adding I/O expansion boxes.
 It processes the raw log data as-is, close to the maximum performance of the hardware.
 Users can keep using the familiar SQL statements and applications.
Summary
 PG-Strom
An extension of PostgreSQL that accelerates SQL execution with GPUs.
It pulls the maximum capability out of the latest hardware for big-data processing workloads.
 Core feature: SSD-to-GPU Direct SQL
It directly loads data blocks on NVMe-SSD to the GPU using P2P DMA, then runs the SQL
workloads on the GPU in the middle of the I/O path. By reducing the amount of data to be
processed, it improves the performance of I/O-bound jobs.
 Major usage?
 Large log processing: M2M/IoT rapidly grows the data size to be processed,
but proper usage of the latest hardware can cover most of the workloads
before a big architecture change to a distributed cluster.
 Simple single-node system, familiar SQL interface, 10GB/s〜 performance
 What I didn't talk about in this session
 Advanced analytics, machine learning & similarity search on GPU
 Let's save these for another occasion 
Backup Slides
Consistency with disk buffers of PostgreSQL / Linux kernel
(Diagram: an uninitialized file whose blocks BLK-100 .. BLK-109 have an unknown cache state before the checks below)
① PG-Strom checks PostgreSQL's shared buffer for the blocks to read
 whether the block is already cached in the shared buffer of PostgreSQL,
 and whether the block may not be all-visible according to the visibility map.
② NVME-Strom checks the OS's page cache for the blocks to read
 whether the block is already cached in the page cache of the Linux kernel;
 but in fast-SSD mode the page cache is not copied unless it is dirty.
③ Then it kicks SSD-to-GPU P2P DMA for the uncached blocks
(Diagram: the blocks to read are classified, e.g. BLK-101/105 cached by PG, BLK-104/108 cached by the OS, and the rest uncached; uncached blocks are transferred by SSD-to-GPU P2P DMA in BLCKSZ (=8KB) units, up to BLCKSZ * NChunks per request, while the cached blocks are staged in a userspace DMA buffer (RAM) and copied to device memory (GPU) through the CUDA API cuMemcpyHtoDAsync)
(Extra Info) about Star Schema Benchmark
Q1-1)
select sum(lo_extendedprice*lo_discount) as revenue
from lineorder,date1
where lo_orderdate = d_datekey
and d_year = 1993
and lo_discount between 1 and 3
and lo_quantity < 25;
Q4-1)
select d_year, c_nation,
sum(lo_revenue - lo_supplycost) as profit
from date1, customer, supplier, part, lineorder
where lo_custkey = c_custkey
and lo_suppkey = s_suppkey
and lo_partkey = p_partkey
and lo_orderdate = d_datekey
and c_region = 'AMERICA'
and s_region = 'AMERICA'
and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2')
group by d_year, c_nation
order by d_year, c_nation ;
(Diagram: star schema table sizes; lineorder 1051GB, customer 1.6GB, supplier 528MB, parts 206MB, date 416KB)
Weitere ähnliche Inhalte

Was ist angesagt?

Parallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAParallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAprithan
 
A Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINA Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINEDB
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareAltinity Ltd
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovAltinity Ltd
 
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOTricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOAltinity Ltd
 
20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGaiKohei KaiGai
 
Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)
Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)
Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)Gruter
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesAltinity Ltd
 
Upgrading To The New Map Reduce API
Upgrading To The New Map Reduce APIUpgrading To The New Map Reduce API
Upgrading To The New Map Reduce APITom Croucher
 
scalable machine learning
scalable machine learningscalable machine learning
scalable machine learningSamir Bessalah
 
Accelerate Reed-Solomon coding for Fault-Tolerance in RAID-like system
Accelerate Reed-Solomon coding for Fault-Tolerance in RAID-like systemAccelerate Reed-Solomon coding for Fault-Tolerance in RAID-like system
Accelerate Reed-Solomon coding for Fault-Tolerance in RAID-like systemShuai Yuan
 
Demystifying DataFrame and Dataset
Demystifying DataFrame and DatasetDemystifying DataFrame and Dataset
Demystifying DataFrame and DatasetKazuaki Ishizaki
 
ABCI: An Open Innovation Platform for Advancing AI Research and Deployment
ABCI: An Open Innovation Platform for Advancing AI Research and DeploymentABCI: An Open Innovation Platform for Advancing AI Research and Deployment
ABCI: An Open Innovation Platform for Advancing AI Research and DeploymentRyousei Takano
 
MapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementMapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementKyong-Ha Lee
 
Opportunities of ML-based data analytics in ABCI
Opportunities of ML-based data analytics in ABCIOpportunities of ML-based data analytics in ABCI
Opportunities of ML-based data analytics in ABCIRyousei Takano
 
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSPeterAndreasEntschev
 
Supermicro High Performance Enterprise Hadoop Infrastructure
Supermicro High Performance Enterprise Hadoop InfrastructureSupermicro High Performance Enterprise Hadoop Infrastructure
Supermicro High Performance Enterprise Hadoop Infrastructuretempledf
 
Supermicro cloudera hadoop
Supermicro cloudera hadoopSupermicro cloudera hadoop
Supermicro cloudera hadoopSupermicro_SMCI
 

Was ist angesagt? (20)

Parallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAParallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDA
 
Apache Nemo
Apache NemoApache Nemo
Apache Nemo
 
A Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINA Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAIN
 
Prashant de-ny-project-s1
Prashant de-ny-project-s1Prashant de-ny-project-s1
Prashant de-ny-project-s1
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei Milovidov
 
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOTricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
 
20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai
 
Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)
Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)
Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 
Upgrading To The New Map Reduce API
Upgrading To The New Map Reduce APIUpgrading To The New Map Reduce API
Upgrading To The New Map Reduce API
 
scalable machine learning
scalable machine learningscalable machine learning
scalable machine learning
 
Accelerate Reed-Solomon coding for Fault-Tolerance in RAID-like system
Accelerate Reed-Solomon coding for Fault-Tolerance in RAID-like systemAccelerate Reed-Solomon coding for Fault-Tolerance in RAID-like system
Accelerate Reed-Solomon coding for Fault-Tolerance in RAID-like system
 
Demystifying DataFrame and Dataset
Demystifying DataFrame and DatasetDemystifying DataFrame and Dataset
Demystifying DataFrame and Dataset
 
ABCI: An Open Innovation Platform for Advancing AI Research and Deployment
ABCI: An Open Innovation Platform for Advancing AI Research and DeploymentABCI: An Open Innovation Platform for Advancing AI Research and Deployment
ABCI: An Open Innovation Platform for Advancing AI Research and Deployment
 
MapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementMapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvement
 
Opportunities of ML-based data analytics in ABCI
Opportunities of ML-based data analytics in ABCIOpportunities of ML-based data analytics in ABCI
Opportunities of ML-based data analytics in ABCI
 
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
 
Supermicro High Performance Enterprise Hadoop Infrastructure
Supermicro High Performance Enterprise Hadoop InfrastructureSupermicro High Performance Enterprise Hadoop Infrastructure
Supermicro High Performance Enterprise Hadoop Infrastructure
 
Supermicro cloudera hadoop
Supermicro cloudera hadoopSupermicro cloudera hadoop
Supermicro cloudera hadoop
 

Ähnlich wie 20181116 Massive Log Processing using I/O optimized PostgreSQL

GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...Kohei KaiGai
 
20201006_PGconf_Online_Large_Data_Processing
20201006_PGconf_Online_Large_Data_Processing20201006_PGconf_Online_Large_Data_Processing
20201006_PGconf_Online_Large_Data_ProcessingKohei KaiGai
 
PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)Kohei KaiGai
 
20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storageKohei KaiGai
 
20181210 - PGconf.ASIA Unconference
20181210 - PGconf.ASIA Unconference20181210 - PGconf.ASIA Unconference
20181210 - PGconf.ASIA UnconferenceKohei KaiGai
 
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...Equnix Business Solutions
 
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAdvancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAlluxio, Inc.
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsKohei KaiGai
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_PlaceKohei KaiGai
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda enKohei KaiGai
 
SQream-GPU가속 초거대 정형데이타 분석용 SQL DB-제품소개-박문기@메가존클라우드
SQream-GPU가속 초거대 정형데이타 분석용 SQL DB-제품소개-박문기@메가존클라우드SQream-GPU가속 초거대 정형데이타 분석용 SQL DB-제품소개-박문기@메가존클라우드
SQream-GPU가속 초거대 정형데이타 분석용 SQL DB-제품소개-박문기@메가존클라우드문기 박
 
BlazingSQL & Graphistry - Netflow Demo
BlazingSQL & Graphistry - Netflow DemoBlazingSQL & Graphistry - Netflow Demo
BlazingSQL & Graphistry - Netflow DemoRodrigo Aramburu
 
GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020John Zedlewski
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)Kohei KaiGai
 
RAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceRAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceData Works MD
 
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfS51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfDLow6
 
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...Red_Hat_Storage
 
GPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data ArchivalGPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data ArchivalScyllaDB
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed_Hat_Storage
 

Ähnlich wie 20181116 Massive Log Processing using I/O optimized PostgreSQL (20)

GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
GPU/SSD Accelerates PostgreSQL - challenge towards query processing throughpu...
 
20201006_PGconf_Online_Large_Data_Processing
20201006_PGconf_Online_Large_Data_Processing20201006_PGconf_Online_Large_Data_Processing
20201006_PGconf_Online_Large_Data_Processing
 
PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)
 
20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage
 
20181210 - PGconf.ASIA Unconference
20181210 - PGconf.ASIA Unconference20181210 - PGconf.ASIA Unconference
20181210 - PGconf.ASIA Unconference
 
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
 
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAdvancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
 
RAPIDS Overview
RAPIDS OverviewRAPIDS Overview
RAPIDS Overview
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda en
 
SQream-GPU가속 초거대 정형데이타 분석용 SQL DB-제품소개-박문기@메가존클라우드
SQream-GPU가속 초거대 정형데이타 분석용 SQL DB-제품소개-박문기@메가존클라우드SQream-GPU가속 초거대 정형데이타 분석용 SQL DB-제품소개-박문기@메가존클라우드
SQream-GPU가속 초거대 정형데이타 분석용 SQL DB-제품소개-박문기@메가존클라우드
 
BlazingSQL & Graphistry - Netflow Demo
BlazingSQL & Graphistry - Netflow DemoBlazingSQL & Graphistry - Netflow Demo
BlazingSQL & Graphistry - Netflow Demo
 
GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)
 
RAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceRAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data Science
 
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfS51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
 
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw...
 
GPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data ArchivalGPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
 

Mehr von Kohei KaiGai

20221116_DBTS_PGStrom_History
20221116_DBTS_PGStrom_History20221116_DBTS_PGStrom_History
20221116_DBTS_PGStrom_HistoryKohei KaiGai
 
20221111_JPUG_CustomScan_API
20221111_JPUG_CustomScan_API20221111_JPUG_CustomScan_API
20221111_JPUG_CustomScan_APIKohei KaiGai
 
20211112_jpugcon_gpu_and_arrow
20211112_jpugcon_gpu_and_arrow20211112_jpugcon_gpu_and_arrow
20211112_jpugcon_gpu_and_arrowKohei KaiGai
 
20210731_OSC_Kyoto_PGStrom3.0
20210731_OSC_Kyoto_PGStrom3.020210731_OSC_Kyoto_PGStrom3.0
20210731_OSC_Kyoto_PGStrom3.0Kohei KaiGai
 
20210511_PGStrom_GpuCache
20210511_PGStrom_GpuCache20210511_PGStrom_GpuCache
20210511_PGStrom_GpuCacheKohei KaiGai
 
20201128_OSC_Fukuoka_Online_GPUPostGIS
20201128_OSC_Fukuoka_Online_GPUPostGIS20201128_OSC_Fukuoka_Online_GPUPostGIS
20201128_OSC_Fukuoka_Online_GPUPostGISKohei KaiGai
 
20201113_PGconf_Japan_GPU_PostGIS
20201113_PGconf_Japan_GPU_PostGIS20201113_PGconf_Japan_GPU_PostGIS
20201113_PGconf_Japan_GPU_PostGISKohei KaiGai
 
20200828_OSCKyoto_Online
20200828_OSCKyoto_Online20200828_OSCKyoto_Online
20200828_OSCKyoto_OnlineKohei KaiGai
 
20200806_PGStrom_PostGIS_GstoreFdw
20200806_PGStrom_PostGIS_GstoreFdw20200806_PGStrom_PostGIS_GstoreFdw
20200806_PGStrom_PostGIS_GstoreFdwKohei KaiGai
 
20200424_Writable_Arrow_Fdw
20200424_Writable_Arrow_Fdw20200424_Writable_Arrow_Fdw
20200424_Writable_Arrow_FdwKohei KaiGai
 
20191211_Apache_Arrow_Meetup_Tokyo
20191211_Apache_Arrow_Meetup_Tokyo20191211_Apache_Arrow_Meetup_Tokyo
20191211_Apache_Arrow_Meetup_TokyoKohei KaiGai
 
20191115-PGconf.Japan
20191115-PGconf.Japan20191115-PGconf.Japan
20191115-PGconf.JapanKohei KaiGai
 
20190926_Try_RHEL8_NVMEoF_Beta
20190926_Try_RHEL8_NVMEoF_Beta20190926_Try_RHEL8_NVMEoF_Beta
20190926_Try_RHEL8_NVMEoF_BetaKohei KaiGai
 
20190925_DBTS_PGStrom
20190925_DBTS_PGStrom20190925_DBTS_PGStrom
20190925_DBTS_PGStromKohei KaiGai
 
20190516_DLC10_PGStrom
20190516_DLC10_PGStrom20190516_DLC10_PGStrom
20190516_DLC10_PGStromKohei KaiGai
 
20190418_PGStrom_on_ArrowFdw
20190418_PGStrom_on_ArrowFdw20190418_PGStrom_on_ArrowFdw
20190418_PGStrom_on_ArrowFdwKohei KaiGai
 
20190314 PGStrom Arrow_Fdw
20190314 PGStrom Arrow_Fdw20190314 PGStrom Arrow_Fdw
20190314 PGStrom Arrow_FdwKohei KaiGai
 
20181212 - PGconf.ASIA - LT
20181212 - PGconf.ASIA - LT20181212 - PGconf.ASIA - LT
20181212 - PGconf.ASIA - LTKohei KaiGai
 
20181211 - PGconf.ASIA - NVMESSD&GPU for BigData
20181211 - PGconf.ASIA - NVMESSD&GPU for BigData20181211 - PGconf.ASIA - NVMESSD&GPU for BigData
20181211 - PGconf.ASIA - NVMESSD&GPU for BigDataKohei KaiGai
 
20180920_DBTS_PGStrom_JP
20180920_DBTS_PGStrom_JP20180920_DBTS_PGStrom_JP
20180920_DBTS_PGStrom_JPKohei KaiGai
 

Mehr von Kohei KaiGai (20)

20221116_DBTS_PGStrom_History
20221116_DBTS_PGStrom_History20221116_DBTS_PGStrom_History
20221116_DBTS_PGStrom_History
 
20221111_JPUG_CustomScan_API
20221111_JPUG_CustomScan_API20221111_JPUG_CustomScan_API
20221111_JPUG_CustomScan_API
 
20211112_jpugcon_gpu_and_arrow
20211112_jpugcon_gpu_and_arrow20211112_jpugcon_gpu_and_arrow
20211112_jpugcon_gpu_and_arrow
 
20210731_OSC_Kyoto_PGStrom3.0
20210731_OSC_Kyoto_PGStrom3.020210731_OSC_Kyoto_PGStrom3.0
20210731_OSC_Kyoto_PGStrom3.0
 
20210511_PGStrom_GpuCache
20210511_PGStrom_GpuCache20210511_PGStrom_GpuCache
20210511_PGStrom_GpuCache
 
20201128_OSC_Fukuoka_Online_GPUPostGIS
20201128_OSC_Fukuoka_Online_GPUPostGIS20201128_OSC_Fukuoka_Online_GPUPostGIS
20201128_OSC_Fukuoka_Online_GPUPostGIS
 
20201113_PGconf_Japan_GPU_PostGIS
20201113_PGconf_Japan_GPU_PostGIS20201113_PGconf_Japan_GPU_PostGIS
20201113_PGconf_Japan_GPU_PostGIS
 
20200828_OSCKyoto_Online
20200828_OSCKyoto_Online20200828_OSCKyoto_Online
20200828_OSCKyoto_Online
 
20200806_PGStrom_PostGIS_GstoreFdw
20200806_PGStrom_PostGIS_GstoreFdw20200806_PGStrom_PostGIS_GstoreFdw
20200806_PGStrom_PostGIS_GstoreFdw
 
20200424_Writable_Arrow_Fdw
20200424_Writable_Arrow_Fdw20200424_Writable_Arrow_Fdw
20200424_Writable_Arrow_Fdw
 
20191211_Apache_Arrow_Meetup_Tokyo
20191211_Apache_Arrow_Meetup_Tokyo20191211_Apache_Arrow_Meetup_Tokyo
20191211_Apache_Arrow_Meetup_Tokyo
 
20191115-PGconf.Japan
20191115-PGconf.Japan20191115-PGconf.Japan
20191115-PGconf.Japan
 
20190926_Try_RHEL8_NVMEoF_Beta
20190926_Try_RHEL8_NVMEoF_Beta20190926_Try_RHEL8_NVMEoF_Beta
20190926_Try_RHEL8_NVMEoF_Beta
 
20190925_DBTS_PGStrom
20190925_DBTS_PGStrom20190925_DBTS_PGStrom
20190925_DBTS_PGStrom
 
20190516_DLC10_PGStrom
20190516_DLC10_PGStrom20190516_DLC10_PGStrom
20190516_DLC10_PGStrom
 
20190418_PGStrom_on_ArrowFdw
20190418_PGStrom_on_ArrowFdw20190418_PGStrom_on_ArrowFdw
20190418_PGStrom_on_ArrowFdw
 
20190314 PGStrom Arrow_Fdw
20190314 PGStrom Arrow_Fdw20190314 PGStrom Arrow_Fdw
20190314 PGStrom Arrow_Fdw
 
20181212 - PGconf.ASIA - LT
20181212 - PGconf.ASIA - LT20181212 - PGconf.ASIA - LT
20181212 - PGconf.ASIA - LT
 
20181211 - PGconf.ASIA - NVMESSD&GPU for BigData
20181211 - PGconf.ASIA - NVMESSD&GPU for BigData20181211 - PGconf.ASIA - NVMESSD&GPU for BigData
20181211 - PGconf.ASIA - NVMESSD&GPU for BigData
 
20180920_DBTS_PGStrom_JP
20180920_DBTS_PGStrom_JP20180920_DBTS_PGStrom_JP
20180920_DBTS_PGStrom_JP
 

Kürzlich hochgeladen

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 

Kürzlich hochgeladen (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

20181116 Massive Log Processing using I/O optimized PostgreSQL

  • 1. Massivelogdataprocessing usingI/Ooptimized PostgreSQL 〜NVME and GPU breaks through the storage performance limitation〜 HeteroDB,Inc Chief Architect & CEO KaiGai Kohei <kaigai@heterodb.com>
  • 2. about HeteroDB Massive log data processing using I/O optimized PostgreSQL Corporate overview  Name HeteroDB,Inc  Established 4th-Jul-2017  Location Shinagawa, Tokyo, Japan  Businesses Production of accelerated PostgreSQL database Technical consulting on GPU&DB region By the heterogeneous-computing technology on the database area, we provides a useful, fast and cost-effective data analytics platform for all the people who need the power of analytics. CEO Profile  KaiGai Kohei – He has contributed for PostgreSQL and Linux kernel development in the OSS community more than ten years, especially, for security and database federation features of PostgreSQL.  Award of “Genius Programmer” by IPA MITOH program (2007)  The posters finalist at GPU Technology Conference 2017.
  • 3. Features of RDBMS  High-availability / Clustering  DB administration and backup  Transaction control  BI and visualization  We can use the products that support PostgreSQL as-is. Core technology – PG-Strom Massive log data processing using I/O optimized PostgreSQL PG-Strom: An extension module for PostgreSQL, to accelerate SQL workloads by the thousands cores and wide-band memory of GPU. GPU Big-data Processing PG-Strom Machine-learning & Statistics
  • 4. GPU’s characteristics - mostly as a computing accelerator Massive log data processing using I/O optimized PostgreSQL Over 10years history in HPC, then massive popularization in Machine-Learning NVIDIA Tesla V100 Super Computer (TITEC; TSUBAME3.0) Computer Graphics Machine-Learning How I/O workloads are accelerated by GPU that is a computing accelerator? Simulation
  • 5. Where is the best “location” to process massive data? Massive log data processing using I/O optimized PostgreSQL  Data transportation is fundamentally expensive.  SQL is a simple and flexible tool for data management.  Long-standing familiar tool for engineers. Data Visualization Data Processing Data Accumulation Data Generation Statistics & Machine- Learning PG-Strom for Big-Data PG-Strom for Analytics SSD-to-GPU Direct SQL Columnar Data (Parquet) CREATE FUNCTION my_logic( reggstore, text ) RETURNS matrix AS $$ $$ LANGUAGE ‘plcuda’; PL/CUDA INSERT Gstore_fdw Pystrom Database system is closest to the data. Simple & Flexible data management without data transportation. BI Tools
  • 6. DataSize Of course, mankind has no “silver bullets” for all the usage scenarios Massive log data processing using I/O optimized PostgreSQL Small Tier • Small & Middle Businesses or Branch of Big Companies • Data size: Less than 100GB • Software: RDBMS Middle Tier • Global or domestic large company • Data size: 1TB〜100TB • Software: Small scale Hadoop, or DWH solution Top Tier • Global Mega-Company • Data size: 100TB, or often PB class • Software: Large scale Hadoop cluster
  • 7. Data Management Technical Challenges for Log Processing (1/3) Massive log data processing using I/O optimized PostgreSQL I/O (Throughput) Computing
  • 8. Technical Challenges for Log Processing (2/3) Massive log data processing using I/O optimized PostgreSQL Distributed System has less H/W load, but more expensive in administration Data Management with Reasonable Design
  • 9. Data Management Technical Challenges for Log Processing (3/3) Massive log data processing using I/O optimized PostgreSQL Ultra Fast I/O Parallel Computing Hardware Evolution pulls up single-node potential to big-data processing GPU NVME- SSD
  • 10. Massive log data processing using I/O optimized PostgreSQL How PostgreSQL utilizes GPU? 〜Architecture of PG-Strom〜
  • 11. GPU code generation from SQL - Example of WHERE-clause Massive log data processing using I/O optimized PostgreSQL QUERY: SELECT cat, count(*), avg(x) FROM t0 WHERE x between y and y + 20.0 GROUP BY cat; : STATIC_FUNCTION(bool) gpupreagg_qual_eval(kern_context *kcxt, kern_data_store *kds, size_t kds_index) { pg_float8_t KPARAM_1 = pg_float8_param(kcxt,1); pg_float8_t KVAR_3 = pg_float8_vref(kds,kcxt,2,kds_index); pg_float8_t KVAR_4 = pg_float8_vref(kds,kcxt,3,kds_index); return EVAL((pgfn_float8ge(kcxt, KVAR_3, KVAR_4) && pgfn_float8le(kcxt, KVAR_3, pgfn_float8pl(kcxt, KVAR_4, KPARAM_1)))); } : E.g) Transformation of the numeric-formula in WHERE-clause to CUDA C code on demand Reference to input data SQL expression in CUDA source code Run-time compiler Parallel Execution
  • 12. EXPLAIN shows Query Execution plan with Custom GPU-node Massive log data processing using I/O optimized PostgreSQL postgres=# EXPLAIN ANALYZE SELECT cat,count(*),sum(ax) FROM tbl NATURAL JOIN t1 WHERE cid % 100 < 50 GROUP BY cat; QUERY PLAN --------------------------------------------------------------------------------------------------- GroupAggregate (cost=203498.81..203501.80 rows=26 width=20) (actual time=1511.622..1511.632 rows=26 loops=1) Group Key: tbl.cat -> Sort (cost=203498.81..203499.26 rows=182 width=20) (actual time=1511.612..1511.613 rows=26 loops=1) Sort Key: tbl.cat Sort Method: quicksort Memory: 27kB -> Custom Scan (GpuPreAgg) (cost=203489.25..203491.98 rows=182 width=20) (actual time=1511.554..1511.562 rows=26 loops=1) Reduction: Local Combined GpuJoin: enabled -> Custom Scan (GpuJoin) on tbl (cost=13455.86..220069.26 rows=1797115 width=12) (never executed) Outer Scan: tbl (cost=12729.55..264113.41 rows=6665208 width=8) (actual time=50.726..1101.414 rows=19995540 loops=1) Outer Scan Filter: ((cid % 100) < 50) Rows Removed by Outer Scan Filter: 10047462 Depth 1: GpuHashJoin (plan nrows: 6665208...1797115, actual nrows: 9948078...2473997) HashKeys: tbl.aid JoinQuals: (tbl.aid = t1.aid) KDS-Hash (size plan: 11.54MB, exec: 7125.12KB) -> Seq Scan on t1 (cost=0.00..2031.00 rows=100000 width=12) (actual time=0.016..15.407 rows=100000 loops=1) Planning Time: 0.721 ms Execution Time: 1595.815 ms (19 rows)
  • 13. A usual composition of x86_64 server Massive log data processing using I/O optimized PostgreSQL GPUSSD CPU RAM HDD N/W
  • 14. Data flow to process a massive amount of data Massive log data processing using I/O optimized PostgreSQL CPU RAM SSD GPU PCIe PostgreSQL Data Blocks Normal Data Flow All the records, including junks, must be loaded onto RAM once, because software cannot check necessity of the rows prior to the data loading. So, amount of the I/O traffic over PCIe bus tends to be large. Unless records are not loaded to CPU/RAM once, over the PCIe bus, software cannot check its necessity even if it is “junk”.
  • 15. Core Feature: SSD-to-GPU Direct SQL Massive log data processing using I/O optimized PostgreSQL CPU RAM SSD GPU PCIe PostgreSQL Data Blocks NVIDIA GPUDirect RDMA It loads the data blocks on the NVME-SSD directly to the GPU using peer-to-peer DMA over the PCIe bus, bypassing CPU/RAM. WHERE-clause, JOIN and GROUP BY are run on the GPU to reduce the data size, so only a small result reaches PostgreSQL. v2.0
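  For illustration, a typical log-scan query of the kind this feature accelerates; the table and column names (logs, ts, status, elapsed_ms) are hypothetical. The WHERE-clause filter and the aggregation are evaluated on the GPU while blocks stream from the SSD, so only the small grouped result ever reaches CPU/RAM.

    -- Hypothetical log table on NVME-SSD; only the aggregate crosses back to the CPU.
    SELECT date_trunc('hour', ts) AS hour,
           status,
           count(*)        AS requests,
           avg(elapsed_ms) AS avg_elapsed
      FROM logs
     WHERE ts >= '2018-08-01' AND ts < '2018-09-01'
       AND status >= 500              -- evaluated on the GPU; junk rows never hit RAM
     GROUP BY 1, 2
     ORDER BY 1, 2;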
  • 16. Benchmark Results – single-node version Massive log data processing using I/O optimized PostgreSQL
  Query processing throughput [MB/sec] - Star Schema Benchmark on NVMe-SSD + md-raid0 (3x SSD):
    Query:                 Q1_1   Q1_2   Q1_3   Q2_1   Q2_2   Q2_3   Q3_1   Q3_2   Q3_3   Q3_4   Q4_1   Q4_2   Q4_3
    PgSQL 9.6 (SSDx3):     2172.3 2159.6 2158.9 2086.0 2127.2 2104.3 1920.3 2023.4 2101.1 2126.9 1900.0 1960.3 2072.1
    PG-Strom 2.0 (SSDx3):  6149.4 6279.3 6282.5 5985.6 6055.3 6152.5 5479.3 6051.2 6061.5 6074.2 5813.7 5871.8 5800.1
  SSD-to-GPU Direct SQL pulls out an awesome performance, close to the H/W spec of the 3x SSD volume.  Measurement by the Star Schema Benchmark, which is a set of typical batch / reporting workloads.  CPU: Intel Xeon E5-2650v4, RAM: 128GB, GPU: NVIDIA Tesla P40, SSD: Intel 750 (400GB; SeqRead 2.2GB/s) x3  Size of the dataset is 353GB (sf: 401), to ensure an I/O-bound workload
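  As a rough sanity check (assuming each query has to scan most of the 353GB dataset): 353GB / ~2.1GB/s ≈ 170 seconds per query on vanilla PostgreSQL 9.6, versus 353GB / ~6.1GB/s ≈ 58 seconds with SSD-to-GPU Direct SQL, which is close to the ~6.6GB/s aggregate sequential-read spec of the three SSDs.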
  • 17. Technology Basis - GPUDirect RDMA Massive log data processing using I/O optimized PostgreSQL ▌P2P data transfer technology between the GPU and other PCIe devices, bypassing the CPU  Originally designed for multi-node MPI over Infiniband  Also the infrastructure for Linux kernel drivers of other PCIe devices, including NVME-SSDs. Copyright (c) NVIDIA corporation, 2015
  • 18. SSD-to-GPU Direct SQL - Software Stack Massive log data processing using I/O optimized PostgreSQL Filesystem (ext4, xfs) nvme_strom kernel module NVMe SSD NVIDIA Tesla GPU PostgreSQL pg_strom extension read(2) ioctl(2) Hardware Layer Operating System Software Layer Database Software Layer blk-mq nvme pcie nvme rdma Network HBA File-offset to NVME-SSD sector number translation NVMEoF Target (JBOF) NVMe Request ■ Other software component ■ Our developed software ■ Hardware NVME over Fabric nvme_strom v2.0 supports NVME-over-Fabric (RDMA).
  • 19. Case Study: Log management and analytics solution - queries that took 40min in the former version now finish in 30sec Massive log data processing using I/O optimized PostgreSQL Database server with GPU+NVME-SSD: search by SQL on a security incident • Fast try & error • Familiar interface • Search on raw row-data (e.g. "Aug 12 18:01:01 saba systemd: Started Session 21060 of user root.") Workflow: understanding the situation of damage → identifying the impact → reporting to the manager
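  A sketch of the kind of incident search meant here; the table and columns (syslog, ts, host, message) are hypothetical, and the search pattern is only an example.

    -- Fast try & error during incident response, directly on the raw rows.
    SELECT ts, host, message
      FROM syslog
     WHERE ts BETWEEN '2018-08-12 17:00' AND '2018-08-12 19:00'
       AND message LIKE '%Failed password%'
     ORDER BY ts;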
  • 20. Run faster, beyond the limitation
  • 21. Next bottleneck – flood of data beyond the CPU's capacity NVME and GPU accelerates PostgreSQL beyond the limitation -PGconf.EU 2018-21 Skylake-SP allows P2P DMA routing of up to 8.5GB/s over the PCIe root complex. GPU SSD-1 SSD-2 SSD-3 md-raid0 Xeon Gold 6126T routing by CPU
  • 22. Consideration for the hardware configuration Massive log data processing using I/O optimized PostgreSQL A PCIe switch (PLX) can take the routing load off the CPU: each SSD+GPU pair runs SCAN, JOIN and GROUP BY locally, and only a very small amount of pre-aggregated data is gathered by the CPU (see the sketch below). CPU CPU PLX SSD GPU PLX SSD GPU PLX SSD GPU PLX SSD GPU SCAN SCAN SCAN SCAN JOIN JOIN JOIN JOIN GROUP BY GROUP BY GROUP BY GROUP BY GATHER GATHER
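  Conceptually, the execution is equivalent to the SQL decomposition below (not what PG-Strom literally plans; the partition names logs_p0..p2 are hypothetical): each SSD+GPU pair produces a small partial aggregate, and the CPU only merges those.

    SELECT cat, sum(cnt) AS total_cnt, sum(sum_x) AS total_x
      FROM (
            SELECT cat, count(*) AS cnt, sum(x) AS sum_x FROM logs_p0 GROUP BY cat
            UNION ALL
            SELECT cat, count(*) AS cnt, sum(x) AS sum_x FROM logs_p1 GROUP BY cat
            UNION ALL
            SELECT cat, count(*) AS cnt, sum(x) AS sum_x FROM logs_p2 GROUP BY cat
           ) AS partial_results       -- tiny intermediate results gathered by the CPU
     GROUP BY cat;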
  • 23. Hardware with PCIe switch (1/2) - HPC Server Massive log data processing using I/O optimized PostgreSQL Supermicro SYS-4029TRT2 x96 lane PCIe switch x96 lane PCIe switch CPU2 CPU1 QPI Gen3 x16 Gen3 x16 for each slot Gen3 x16 for each slot Gen3 x16
  • 24. Hardware with PCIe switch (2/2) - I/O expansion box Massive log data processing using I/O optimized PostgreSQL NEC ExpEther 40G (4slots) 4 slots of PCIe Gen3 x8 PCIe Switch 40Gb Ethernet Network switch CPU NIC extra I/O boxes
  • 25. System configuration with I/O expansion boxes Massive log data processing using I/O optimized PostgreSQL PCIe I/O Expansion Box Host System (x86_64 server) NVMe SSD PostgreSQL Tables PostgreSQL Data Blocks Internal PCIe Switch SSD-to-GPU P2P DMA (large data size) GPU WHERE-clause JOIN GROUP BY PCIe over Ethernet Pre-processed small data A few GB/s of SQL execution performance per box NIC / HBA Simplified DB operations and APP development thanks to the simple single-node PostgreSQL configuration; capacity and performance are enhanced by adding boxes, which are visible as leaves of partitioned child tables on PostgreSQL (a partitioning sketch follows below) v2.1
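  A minimal partitioning sketch, assuming one tablespace per expansion box's NVME-SSD volume; all paths, table and partition names are hypothetical. Each leaf partition lives in one box, so its blocks are read by the GPU and SSDs behind the same PCIe switch.

    -- One tablespace per I/O expansion box (paths are hypothetical).
    CREATE TABLESPACE nvme_box1 LOCATION '/nvme/box1';
    CREATE TABLESPACE nvme_box2 LOCATION '/nvme/box2';
    CREATE TABLESPACE nvme_box3 LOCATION '/nvme/box3';

    -- Hash-partitioned log table (PostgreSQL 11 syntax); one leaf per box.
    CREATE TABLE logs (
        ts        timestamp NOT NULL,
        device_id int       NOT NULL,
        payload   jsonb
    ) PARTITION BY HASH (device_id);

    CREATE TABLE logs_p0 PARTITION OF logs
        FOR VALUES WITH (MODULUS 3, REMAINDER 0) TABLESPACE nvme_box1;
    CREATE TABLE logs_p1 PARTITION OF logs
        FOR VALUES WITH (MODULUS 3, REMAINDER 1) TABLESPACE nvme_box2;
    CREATE TABLE logs_p2 PARTITION OF logs
        FOR VALUES WITH (MODULUS 3, REMAINDER 2) TABLESPACE nvme_box3;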
  • 26. Benchmark (1/2) - System configuration Massive log data processing using I/O optimized PostgreSQL x 1 x 3 x 6 x 3 NEC Express5800/R120h-2m CPU: Intel Xeon E5-2603 v4 (6C, 1.7GHz) RAM: 64GB OS: Red Hat Enterprise Linux 7 (kernel: 3.10.0-862.9.1.el7.x86_64) CUDA-9.2.148 + driver 396.44 DB: PostgreSQL 11beta3 + PG-Strom v2.1devel NEC ExpEther 40G (4slots) I/F: PCIe 3.0 x8 (x16 physical) ... 4slots with internal PCIe switch N/W: 40Gb-ethernet Intel DC P4600 (2.0TB; HHHL) SeqRead: 3200MB/s, SeqWrite: 1575MB/s RandRead: 610k IOPS, RandWrite: 196k IOPS I/F: PCIe 3.0 x4 NVIDIA Tesla P40 # of cores: 3840 (1.3GHz) Device RAM: 24GB (347GB/s, GDDR5) CC: 6.1 (Pascal, GP104) I/F: PCIe 3.0 x16 SPECIAL THANKS FOR v2.1
  • 27. Benchmark (2/2) - Result of query execution performance Massive log data processing using I/O optimized PostgreSQL  13 SSBM queries against a 1055GB database in total (i.e., 351GB per I/O expansion box)  Raw I/O data transfer without SQL execution was up to 9GB/s; in other words, SQL execution was faster than a simple storage read with raw I/O.
  Query execution throughput [MB/s] - Star Schema Benchmark for PgSQL v11beta3 / PG-Strom v2.1devel on NEC ExpEther x3:
    Query:                 Q1_1   Q1_2   Q1_3   Q2_1   Q2_2   Q2_3   Q3_1   Q3_2   Q3_3   Q3_4   Q4_1   Q4_2   Q4_3
    PostgreSQL v11beta3:    2,388  2,477  2,493  2,502  2,739  2,831  1,865  2,268  2,442  2,418  1,789  1,848  2,202
    PG-Strom v2.1devel:    13,401 13,534 13,536 13,330 12,696 12,965 12,533 11,498 12,312 12,419 12,414 12,622 12,594
  Beyond the raw I/O limitation: max 13.5GB/s of query execution performance with 3x GPU/SSD units!! v2.1
  • 28. PostgreSQL is a successor of XXXX? Massive log data processing using I/O optimized PostgreSQL 13.5GB/s with 3x I/O boxes 30GB/s with 8x I/O boxes is EXPECTED Single-node PostgreSQL now offers DWH-appliance grade performance
  • 29. Massive log data processing using I/O optimized PostgreSQL The Future Direction 〜How latest hardware makes Big-Data easy〜
  • 30. NVME-over-Fabric and GPU RDMA support Massive log data processing using I/O optimized PostgreSQL ready Host System SSD SSD SSD HBA CPU 100Gb-Network JBOF  More NVME-SSDs than can be installed on a single node together with the GPUs  Flexibility to enhance the storage capacity on demand GPU/DB-node Storage-node SSD-to-GPU Direct SQL over 100Gb Ethernet with RDMA protocol (diagram: additional JBOF storage nodes, each with multiple SSDs behind an HBA)
  • 31. Direct Load of External Columnar Data Files (Parquet) Massive log data processing using I/O optimized PostgreSQL  Importing large data into PostgreSQL takes a long time before analytics can start.  Referencing external data files can skip the import step.  The columnar format minimizes the amount of I/O to be loaded from storage.  SSD-to-GPU Direct SQL on the columnar file optimizes the I/O efficiency. Foreign Tables allow references to various kinds of external data sources, and can be an infrastructure for SSD-to-GPU Direct SQL on columnar data files (a sketch follows below). Table Foreign Table (postgres_fdw) Foreign Table (file_fdw) Foreign Table (Gstore_fdw) Foreign Table (Parquet_fdw) External RDBMS server (Oracle, MySQL, ...) CSV File GPU Device Memory Parquet File External Data Source Data loading takes a long time  WIP
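  A hedged sketch of what such a foreign table could look like; the feature was still work-in-progress at the time, so the extension, server and option names below are assumptions for illustration rather than the actual interface.

    -- Assumed names: parquet_fdw / parquet_srv / OPTIONS (filename ...).
    CREATE EXTENSION IF NOT EXISTS parquet_fdw;
    CREATE SERVER parquet_srv FOREIGN DATA WRAPPER parquet_fdw;

    CREATE FOREIGN TABLE lineorder_2018 (
        lo_orderdate     date,
        lo_extendedprice numeric,
        lo_discount      numeric
    ) SERVER parquet_srv
      OPTIONS (filename '/nvme/parquet/lineorder_2018.parquet');

    -- Queried like a normal table; the columnar layout means only the
    -- referenced columns need to be loaded from storage.
    SELECT sum(lo_extendedprice * lo_discount) AS revenue
      FROM lineorder_2018
     WHERE lo_orderdate BETWEEN '2018-01-01' AND '2018-12-31';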
  • 32. Expected usage – IoT grade log data processing and analysis Massive log data processing using I/O optimized PostgreSQL As a data management, analytics and machine-learning platform for log data that grows daily Manufacturing Logistics Mobile Home electronics Why PG-Strom?  It covers around 30-50TB of data with a single node by adding I/O expansion boxes.  It allows processing of the raw log data as-is, at a speed that can even exceed raw storage-read performance.  Users can continue to use the familiar SQL statements and applications. BI Tools massive log collection and management JBoF: Just a Bunch of Flash NVME over Fabric (RDMA) PG-Strom
  • 33. Summary Massive log data processing using I/O optimized PostgreSQL  PG-Strom An extension of PostgreSQL that accelerates SQL execution by GPU. It pulls out the maximum capability of the latest hardware for big-data processing workloads.  Core feature: SSD-to-GPU Direct SQL It directly loads data blocks on the NVME-SSD to the GPU using P2P DMA, then runs SQL workloads on the GPU in the middle of the I/O path. By reducing the amount of data to be processed, it improves the performance of I/O-bound jobs.  Major usage?  Large log processing - M2M/IoT rapidly grows the data size to be processed; however, proper usage of the latest hardware can cover most of these workloads before a big architectural change to a distributed cluster becomes necessary.  Simple single-node system, familiar SQL interface, 10GB/s〜 performance  What I didn't talk about in this session  Advanced analytics, machine-learning & similarity search on GPU  Let us save that for another occasion 
  • 36. Consistency with disk buffers of PostgreSQL / Linux kernel Massive log data processing using I/O optimized PostgreSQL ① PG-Strom checks PostgreSQL's shared buffer for the blocks to read:  whether the block is already cached on the shared buffer of PostgreSQL, and  whether the block may not be all-visible according to the visibility-map. ② NVME-Strom checks the OS's page cache for the blocks to read:  whether the block is already cached on the page cache of the Linux kernel.  On fast-SSD mode, however, the page cache is not copied unless it is dirty. ③ Then it kicks SSD-to-GPU P2P DMA on the remaining uncached blocks. (Diagram: blocks BLK-100..BLK-109 of a file are classified as uncached / cached by PG / cached by OS; the cached blocks are written into a userspace DMA buffer in RAM and copied to GPU device memory via cuMemcpyHtoDAsync (CUDA API, userspace), while the uncached blocks are transferred by SSD-to-GPU P2P DMA; transfer size per request is BLCKSZ (=8KB) × NChunks.)
  • 37. (Extra Info) about the Star Schema Benchmark Massive log data processing using I/O optimized PostgreSQL
  Dataset sizes: lineorder 1051GB, customer 1.6GB, supplier 528MB, parts 206MB, date 416KB

    Q1-1)
    select sum(lo_extendedprice*lo_discount) as revenue
      from lineorder, date1
     where lo_orderdate = d_datekey
       and d_year = 1993
       and lo_discount between 1 and 3
       and lo_quantity < 25;

    Q4-1)
    select d_year, c_nation, sum(lo_revenue - lo_supplycost) as profit
      from date1, customer, supplier, part, lineorder
     where lo_custkey = c_custkey
       and lo_suppkey = s_suppkey
       and lo_partkey = p_partkey
       and lo_orderdate = d_datekey
       and c_region = 'AMERICA'
       and s_region = 'AMERICA'
       and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2')
     group by d_year, c_nation
     order by d_year, c_nation;