20181025_pgconfeu_lt_gstorefdw
1. PostgreSQL as a Machine-Learning Platform
〜Gstore_fdw and data collaboration〜
HeteroDB,Inc
Chief Architect & CEO
KaiGai Kohei <kaigai@heterodb.com>
2. about HeteroDB
PostgreSQL as Machine-Learning Platform -PGconf.EU 2018-2
Corporate overview
Name: HeteroDB,Inc
Established: 4th-Jul-2017
Headcount: 2 (KaiGai and Kashiwagi)
Location: Shinagawa, Tokyo, Japan
Businesses: Sales of accelerated database products;
technical consulting in the GPU & DB area
Using heterogeneous-computing technology in the database area,
we provide a useful, fast and cost-effective data analytics platform
for everyone who needs the power of analytics.
CEO Profile
KaiGai Kohei: He has contributed to PostgreSQL and Linux kernel
development in the OSS community for more than ten years, especially
to the security and database federation features of PostgreSQL.
Awarded “Genius Programmer” by the IPA MITOH program (2007).
Top-5 poster finalist at GPU Technology Conference 2017.
3. Friday 11:50 - 12:40
NVME and GPU accelerates
PostgreSQL
Features of RDBMS
✓ High-availability / Clustering
✓ DB administration and backup
✓ Transaction control
✓ BI and visualization
➔ We can use the products that
support PostgreSQL as-is.
Core technology – PG-Strom
PG-Strom: An extension module for PostgreSQL that accelerates SQL
workloads using the thousands of cores and high-bandwidth memory of a GPU.
GPU
Big-data Analytics
PG-Strom
Machine-learning & Statistics
4. GPU’s characteristics - mostly as a computing accelerator
Over 10 years of history in HPC, then massive popularization in machine learning
NVIDIA Tesla V100
Super Computer
(TITEC; TSUBAME3.0) Computer Graphics Machine-Learning
How does PG-Strom utilize the power of the GPU for in-database analytics?
Simulation
6. PGconf.SV 2016 at San Francisco
Acceleration of drug-discovery workloads
with in-database analytics approach
using PL/CUDA user defined function
7. PL/CUDA User Defined Function
Scan ➔ Pre-Process ➔ Analytics ➔ Post-Process ➔ Result
CREATE FUNCTION
my_logic(int, real[], real[])
RETURNS real[]
AS $$
/* Custom CUDA C code block (runs on GPU device) */
$$ LANGUAGE 'plcuda';
▌Don’t export the dataset to run analytics in external software.
▌All you pull out of the database is the “result” of the analytics.
8. PL/CUDA works on Drug-Discovery workloads (1/2)
Database chemical compounds set (D; 10M records scale)
Query chemical compounds set (Q; ~1000 records scale)
Calculation of their similarity: 10 billion combinations
Target protein: “similar compounds” will have a higher probability of being active
Similarity search on chemical compounds is a researcher’s daily job.
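The workload on this slide can be sketched on the CPU in a few lines of Python. This is only an illustration, not the PL/CUDA implementation: the slides do not name the similarity measure or the exact scoring rule, so the Tanimoto coefficient on bit fingerprints and the "mean of the k best matches" score are assumptions, and the function names and the tiny data set are hypothetical.

```python
def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient of two fingerprints stored as integer bitmaps."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


def knn_similarity(k, queries, database):
    """Score each database compound by the mean of its k best query matches."""
    scores = {}
    for key_id, fingerprint in database.items():
        best = sorted((tanimoto(fingerprint, q) for q in queries), reverse=True)[:k]
        scores[key_id] = sum(best) / len(best)
    # Most similar compounds first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical 8-bit fingerprints; real fingerprints are far wider.
queries = [0b10110010, 0b10010010, 0b00110011]
database = {1: 0b10110010, 2: 0b01001101, 3: 0b10010011}
ranking = knn_similarity(2, queries, database)

# Size of the full problem on the slide: |D| x |Q| similarity evaluations.
pairs = 10_000_000 * 1_000
```

Every database compound is compared with every query compound, which is why D = 10M and Q = 1000 lead to the 10 billion combinations quoted above.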
9. PL/CUDA works on Drug-Discovery workloads (2/2)
Similarity search of chemical compounds by k-NN method (k=3, D=10M)
Query response time [sec] by number of query compounds [Q]:
Q       CPU (E5-2670v3)   GPU (GTX1080)
10          30.25             13.00
50         145.29             13.23
100        295.95             13.59
500       1503.31             16.01
1000      3034.94             19.13
Yes, the GPU runs the workload more than 150 times faster
(x150 shorter response time at Q=1000)!
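The speed-up behind the "x150" claim can be checked directly from the numbers on the chart. The short sketch below only divides the CPU series by the GPU series; the variable names are illustrative.

```python
q_sizes = [10, 50, 100, 500, 1000]
cpu_sec = [30.25, 145.29, 295.95, 1503.31, 3034.94]  # CPU (E5-2670v3)
gpu_sec = [13.00, 13.23, 13.59, 16.01, 19.13]        # GPU (GTX1080)

# Speed-up of the GPU version for each query-set size
speedup = {q: cpu / gpu for q, cpu, gpu in zip(q_sizes, cpu_sec, gpu_sec)}

for q, ratio in speedup.items():
    print(f"Q={q:>4}: x{ratio:.1f}")
```

The ratio grows with Q and passes x150 only at Q=1000; for the smallest query set the GPU is little more than twice as fast.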
11. PL/CUDA works on Drug-Discovery workloads (2/2)
Chart: same k-NN similarity search as on slide 9 (k=3, D=10M; query response time [sec] vs. number of query compounds [Q]; CPU E5-2670v3 vs. GTX1080)
The CPU version consumes time in proportion to the scale of the calculation:
x100 longer execution time for a x100 larger calculation amount.
12. PL/CUDA works on Drug-Discovery workloads (2/2)
Chart: same k-NN similarity search as on slide 9 (k=3, D=10M; query response time [sec] vs. number of query compounds [Q]; CPU E5-2670v3 vs. GTX1080)
Why does the GPU version take relatively long for small workloads?
Less than x2 longer execution time for a x100 larger calculation amount.
13. Invocation of PL/CUDA functions
PREPARE knn_sim_rand_10m_gpu_v2(int) -- arg1:@k-value
AS
SELECT row_number() OVER (),
       fp.name,
       similarity
  FROM (SELECT float4_as_int4(key_id) key_id, similarity
          FROM matrix_unnest(
                 (SELECT rbind(knn_gpu_similarity($1, Q.matrix,
                                                  D.matrix))
                    FROM (SELECT cbind(array_matrix(id),
                                       array_matrix(bitmap)) matrix
                            FROM finger_print_query) Q,
                         (SELECT matrix
                            FROM finger_print_10m_matrix) D
                 )
               ) AS sim(key_id real, similarity real)
         ORDER BY similarity DESC) sim,
       finger_print_10m fp
 WHERE fp.id = sim.key_id
 LIMIT 1000;
Argument setup consumes most of the time
(10~11 sec per invocation)
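A rough way to see this fixed cost in the measurements is to fit a straight line through the GPU response times from the k-NN chart. The sketch below assumes those timings come from this query and that the cost is roughly "fixed part + per-query-compound part"; the helper `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit ys ~ intercept + slope * xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


q_sizes = [10, 50, 100, 500, 1000]
gpu_sec = [13.00, 13.23, 13.59, 16.01, 19.13]  # GTX1080 series from the chart

fixed_sec, sec_per_query = fit_line(q_sizes, gpu_sec)
print(f"fixed cost ~ {fixed_sec:.2f} s, "
      f"marginal cost ~ {sec_per_query * 1000:.2f} ms per query compound")
```

The fitted fixed part is about 13 seconds, so the 10~11 seconds of argument setup would account for most of the response time at every measured Q.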
16. Gstore_fdw - FDW on behalf of GPU device memory region
GPU world
Storage
SQL world
GPU device memory
Foreign Table
(gstore_fdw)
INSERT
UPDATE
DELETE
SELECT
Reference
by Zero-copy
✓ Data Format Conversion
✓ Data Compression
✓ Transaction Controls
PL/CUDA
User Defined
Function
17. Gstore_fdw manages persistent device memory (1/4)
CREATE FOREIGN TABLE ft (
id int,
x0 real,
x1 real,
x2 real,
x3 real,
x4 real,
x5 real,
x6 real,
x7 real,
x8 real,
x9 real
) SERVER gstore_fdw
OPTIONS (pinning '0', format 'pgstrom');
18. Gstore_fdw manages persistent device memory (2/4)
postgres=# INSERT INTO ft
(SELECT x, 100*random(), 100*random(), 100*random(),
100*random(), 100*random(), 100*random(),
100*random(), 100*random(), 100*random(),
100*random()
FROM generate_series(1,10000000) x);
LOG: alloc: preserved memory 440000320 bytes
INSERT 0 10000000
Acquires 440MB of GPU device memory, then
loads the written data chunk onto the GPU device
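The logged allocation size follows from the table definition: ten million rows, each with one int column and ten real columns of 4 bytes apiece. The arithmetic below reproduces it; reading the remaining 320 bytes as metadata such as headers is an assumption, since the slides do not say what they hold.

```python
INT4_BYTES = 4    # PostgreSQL "int"
FLOAT4_BYTES = 4  # PostgreSQL "real"

rows = 10_000_000
row_bytes = INT4_BYTES + 10 * FLOAT4_BYTES  # id + x0..x9
payload_bytes = rows * row_bytes            # raw column data

logged_bytes = 440_000_320                  # from "alloc: preserved memory"
overhead_bytes = logged_bytes - payload_bytes

print(payload_bytes, overhead_bytes)
```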
19. Gstore_fdw manages persistent device memory (3/4)
Before INSERT
$ nvidia-smi
Sun Nov 12 00:03:30 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.81 Driver Version: 384.81 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P40 Off | 00000000:02:00.0 Off | 0 |
| N/A 36C P0 52W / 250W | 171MiB / 22912MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 12438 C ...bgworker: PG-Strom GPU memory keeper 161MiB |
+-----------------------------------------------------------------------------+
20. Gstore_fdw manages persistent device memory (3/4)
After INSERT
$ nvidia-smi
Sun Nov 12 00:06:01 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.81 Driver Version: 384.81 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P40 Off | 00000000:02:00.0 Off | 0 |
| N/A 36C P0 51W / 250W | 591MiB / 22912MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 12438 C ...bgworker: PG-Strom GPU memory keeper 581MiB |
+-----------------------------------------------------------------------------+
GPU device memory stays preserved
even after the PostgreSQL session is closed
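The growth reported by nvidia-smi can be cross-checked against the allocation size logged by the INSERT. The sketch below compares the two; the variable names are illustrative.

```python
MIB = 1024 ** 2

before_mib, after_mib = 161, 581  # "PG-Strom GPU memory keeper" usage
logged_bytes = 440_000_320        # "alloc: preserved memory" from the INSERT

delta_mib = after_mib - before_mib
logged_mib = logged_bytes / MIB

print(f"nvidia-smi delta: {delta_mib} MiB, logged allocation: {logged_mib:.1f} MiB")
```

The two figures agree to within one MiB, which is consistent with the memory keeper process holding exactly the buffer allocated for the foreign table.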
23. CUDA Driver API - Interprocess device memory handling
CUresult cuIpcGetMemHandle(CUipcMemHandle *pHandle,
                           CUdeviceptr dptr);
Gets an interprocess memory handle for an existing device memory allocation.
➔ Gets a unique identifier of GPU device memory at the owner process
CUresult cuIpcOpenMemHandle(CUdeviceptr *pdptr,
                            CUipcMemHandle handle,
                            unsigned int flags);
Opens an interprocess memory handle exported from another process and returns a
device pointer usable in the local process.
➔ Opens the GPU device memory using the unique identifier at the other process
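The export/open pattern can be tried without a GPU by using host shared memory as a stand-in. The sketch below is an analogy only: it uses Python's `multiprocessing.shared_memory` rather than the CUDA driver API, the block name plays the role of the IPC handle, and the helper names are hypothetical. For brevity both sides run in one process here; in practice the consumer would be a separate process.

```python
import struct
from multiprocessing import shared_memory


def export_block(values):
    """Owner side: allocate a shared block, fill it, return (block, handle)."""
    payload = struct.pack(f"{len(values)}f", *values)
    block = shared_memory.SharedMemory(create=True, size=len(payload))
    block.buf[:len(payload)] = payload
    return block, block.name  # the name is the identifier handed to others


def open_block(handle, count):
    """Consumer side: attach to the existing block by handle and read floats."""
    peer = shared_memory.SharedMemory(name=handle)
    try:
        # Slice explicitly: the OS may round the block size up to a page.
        return list(struct.unpack(f"{count}f", bytes(peer.buf[:count * 4])))
    finally:
        peer.close()


owner, handle = export_block([1.0, 2.5, -3.0])
try:
    seen = open_block(handle, 3)  # no copy through a file or socket
finally:
    owner.close()
    owner.unlink()                # the owner controls the block's lifetime
```

As with the CUDA calls, the owner decides when the memory is released; the consumer only borrows a mapping of it.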
24. PostgreSQL as a Machine-Learning Platform
GPU world
Inter-Process
Data Collaboration
Storage
SQL world
GPU device
memory
Foreign Table
(gstore_fdw)
INSERT
UPDATE
DELETE
SELECT
User’s Custom
Python Scripts
IPC Handle
IPC Handle
Internal data structure that
is compatible to Python’s
analytics libraries
ndarray data
ndarray data
Gets a unique identifier of GPU device memory at the owner process
25. Step-1. Export IPC handle of GPU device memory
postgres=# select gstore_export_ipchandle('ft');
gstore_export_ipchandle
-------------------------------------------------------------
\x006b73020000000060110000000000000075020000000000000020000000
0000000000000000000000020000000000005b000000000000002000d0c1ff
00005c
(1 row)
The CUDA runtime returns a unique identifier that is 64 bytes long
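Because the handle has a fixed size, a client can sanity-check what it fetched before trying to open it. The sketch below is a hypothetical helper, not part of PG-Strom: it accepts either a bytea value as returned by a database driver or the hex text printed by psql, and insists on 64 bytes, the size of `CUipcMemHandle` in the CUDA driver API. The sample value is made up.

```python
IPC_HANDLE_BYTES = 64  # size of CUipcMemHandle in the CUDA driver API


def normalize_ipc_handle(value) -> bytes:
    """Return the handle as 64 raw bytes, accepting bytea objects or hex text."""
    if isinstance(value, (bytes, bytearray, memoryview)):
        raw = bytes(value)
    elif isinstance(value, str):
        text = "".join(value.split())  # drop psql line wrapping
        if text.startswith("\\x"):
            text = text[2:]            # strip the bytea hex prefix
        raw = bytes.fromhex(text)
    else:
        raise TypeError(f"unsupported handle type: {type(value).__name__}")
    if len(raw) != IPC_HANDLE_BYTES:
        raise ValueError(f"expected {IPC_HANDLE_BYTES} bytes, got {len(raw)}")
    return raw


# Hypothetical handle value, not the one shown on the slide.
sample = "\\x" + "ab" * 32 + "\n" + "cd" * 32
handle = normalize_ipc_handle(sample)
```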
26. Step-2. Open IPC handle in your Python script
#!/usr/bin/python
import psycopg2
import pystrom
# Connect to the PostgreSQL server
conn = psycopg2.connect("host=localhost dbname=postgres")
# Get the IPC handle of the foreign table 'ft'
curr = conn.cursor()
curr.execute("select gstore_export_ipchandle('ft')::bytea")
row = curr.fetchone()
conn.close()
# Get a cupy.ndarray object: a 2D matrix of float4 values
# consisting of the columns 'x', 'y' and 'z'
X = pystrom.ipc_import(row[0], ['x','y','z'])
27. Step-3. The data is already loaded on the GPU. Do analytics as you like.
At Python script:
>>> X
array([[0.05267062, 0.15842682, 0.95535886],
[0.8110889 , 0.75173104, 0.09625155],
[0.0950045 , 0.71161145, 0.6916123 ],
...,
[0.32576588, 0.8340051 , 0.82255083],
[0.12769088, 0.23999453, 0.28765103],
[0.07242639, 0.14565416, 0.7454422 ]], dtype=float32)
At PostgreSQL:
postgres=# SELECT * FROM ft LIMIT 5;
id | x | y | z
----+-----------+----------+-----------
1 | 0.0526706 | 0.158427 | 0.955359
2 | 0.811089 | 0.751731 | 0.0962516
3 | 0.0950045 | 0.711611 | 0.691612
4 | 0.051835 | 0.405314 | 0.0207166
5 | 0.598073 | 0.4739 | 0.492226
(5 rows)
28. Step-4. Everything stays in Python: run your analytics workloads on GPUs
◆ Dot Product
>>> cupy.dot(X[:,0],X[:,1])
array(24974.453, dtype=float32)
◆ Transpose Matrix
>>> cupy.transpose(X)
array([[0.8655484, 0.9804696, 0.43135548, ..., 0.58545816,
0.9951294, 0.14361869],
[0.12646914, 0.92461866, 0.14051293, ..., 0.5793936,
0.7182556 , 0.15441231],
[0.10312917, 0.2307432 , 0.6121663 , ..., 0.78983736,
0.19550513, 0.38183048]], dtype=float32)
29. Our vision for in-database analytics & machine-learning
Good bye CSV: data management is a suitable job for a DBMS
gstore_fdw
Data collaboration; data manipulation on the local side.
Python
Runs statistical analysis & machine-learning.
Data Scientists
Are responsible for both data management and data analytics,
including machine-learning.
Data Lake / Data Warehouse
postgres_fdw / xxx_fdw
Connects to remote databases for data import.
Filtering, pre-processing and other steps can run on the remote side.
30. Resources
▌PG-Strom
GitHub:
https://github.com/heterodb/pg-strom
Documentation:
http://heterodb.github.io/pg-strom/
▌System requirement
We plan to distribute a VM image for Microsoft Azure GPU instances,
likely by the end of November (coming soon!)
Or use your on-premise environment, of course.
https://github.com/heterodb/pg-strom/wiki/002:-HW-Validation-List
▌Contact
ML: pgstrom@heterodb.com
e-mail: kaigai@heterodb.com
Twitter: @kkaigai