Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off

Databases Have Forgotten 
About Single Node Performance,
A Wrongheaded Trade Off
Matvey Arye, PhD
Core Database Engineer
mat@timescale.com · github.com/timescale

Databases changed from  
scaling up to scaling out

But…
MapReduce
specializes in
long jobs that
touch lots of data
• Grep: 150 Seconds
• Sort: 890 Seconds
• Average job completion time of
Google production workloads: 634
Seconds
(Source: Google MapReduce Paper)

Network
Latencies • Connection setup/teardown
• TCP ramp-up
• Network transfer

Consensus is
Expensive • Dreaded two-phase commit
• Snapshot Isolation even harder
• Coordinators often bottlenecks

The high-latency of distributed databases is nothing new
Core
s
Twitter_rv uk_2007_5
Spark 128 1784s 8000s+
Giraph 128 200s 8000s+
GraphLab 128 242s 714s
Graphx 128 251s 800s
Laptop 1 153s 417s
Laptop* 1 15s 30s
Graph Connected Component Analysis
Source: Scalability! But at what COST? - McSherry et al.
Core
s
Twitter_r
v
uk_2007_5
Spark 128 857s 1759s
Giraph 128 596s 1235s
GraphLab 128 249s 833s
Graphx 128 419s 462s
Laptop 1 300s 651s
Page Rank (20 iterations)

But that was 10 years ago
The world has changed
160 GB Spinning Rust
2 Core CPU
4GB RAM
60 TB SSDs
64 Core CPU
TBs of RAM
Edge computing
GPUs, TPUs
Then Now

Financial &
Marketing
Industrial
Machines
Datacenter
& DevOps And data
needs have
become
real-time.Web / Mobile
Events
Transportation
& Logistics

Shorter latency
requirements mean  
we need to focus
again on scaling up

So how do we improve
single-node performance? 
(Lessons from TimescaleDB)

Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (premium LRS storage)
Each row has 12 columns (1 timestamp, indexed 1 host ID, 10 metrics)

Insight #1
Partition, even on a
single machine.

Partitioning on a single node.
Chunk (sub-table)
Hypertable
Space
Time 
(older)

Chunks are “right-sized”
Recent (hot) chunks ﬁt in memory

144K
METRICS / S
TimescaleDB vs. PostgreSQL
(single-row inserts)
14.4K
INSERTS / S
TimescaleDB 0.5, Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (LRS storage)

>20x
TimescaleDB vs. PostgreSQL
(batch inserts)
TimescaleDB 0.5, Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (LRS storage)
1.11M
METRICS / S

Other benefits to partitioning

Avoid querying chunks via constraint exclusion
SELECT time, device_id, temp FROM data 
WHERE time > ‘2017-08-22 18:18:00+00’

Efficient retention policies
Drop chunks, don’t delete rows avoids vacuuming

Insight #2
Partition, even on a
single machine 
across many disks

Single node: Scaling up via adding disks
• Faster inserts
• Parallelized queries
How Benefit
Chunks spread across many disks (elastically!)
either RAIDed or via distinct tablespaces

Insight #3
Query Optimizations

Avoid querying chunks via constraint exclusion (more)
SELECT time, device_id, temp FROM data 
WHERE time > now() - interval ’24 hours’ Won’t exclude
chunks in plain
PostgreSQL

Example Query Optimization
CREATE INDEX ON readings(time);
SELECT date_trunc(‘minute’, time) as bucket,
avg(cpu)
FROM readings
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;
Will this usethe index?

Example Query Optimization
Timescaleunderstands
time
CREATE INDEX ON readings(time);
SELECT date_trunc(‘minute’, time) as bucket,
avg(cpu)
FROM readings
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;

How to find bottlenecks?
Flame Graph
Function: postgres`standard_planner (110 samples, 90.91%)
postgres`0x10f97ed8e
li..
postgres`PostmasterMain
pos..
postgr..
postgres`..
po..
postgres`PortalRun
post..
postgres`build_simple_rel
libsystem_kernel.d..
postgres`RelationGetNumberOfBlock..
postgresàdd_base_rels_to_query
postgresàdd_base_rels_to_query
po..
postgres`mdnblocks
po..
post..postgres`FileSeek
postgres`mdnblocks
postgresòp.
postgres`grouping_planner
postgres`_mdnblocks
postgres`standard_ProcessUtility
pos..
timescaledb0.10.0dev.so`timescaledb_ddl_command_start
postgres`BackendStartup
postgres`FillPortalStore
postgres`get_relat..
postgres`set_base_rel_sizes
postgr..
postgr..postgres`FileAccess
po..
postgres`0x10
postgres`pr.
timescaledb0.10.0dev.so`prev_ProcessUtility
po..
postgr..
postgresèstimate_rel_size
postgres`_mdnblocks postgr..
postgr..
postgresèxec_simple_query
pos..
po..
po..
libsys..
po..
postgres`make_one_rel
po..
postgr..
postgres`query_planner
pos..
postgres`FileAccess
libsystem_kerne..
postgres`..
postgres`PostgresMain
li..
po..
postgres`RelationGetNumbe..
postgres`pr.
pos..
postgres`pr.
postgres`LruInsert
postgres`FileSeek po..
postgres`relation_excluded_by_constr..
postgresÈxplainQuery
pos..
post..
postgres`pg_plan_query
postgres`smgrnblocks
postgres`build_simple_rel
postgres`smgrnblocks
pos..
postgr..
postgres`subquery_planner
postgres`PortalRunUtility
postgres`planner
post..
postgres`ServerLoop
postgres`LruInsert
lib..
postgres`standard_planner
postgres`ProcessUtility
timescaledb0.10.0dev.so`timescaledb_planner
postgres`pr.
libdyld.dylib`start
postgr..
postgres`set_append_rel_size
postgres`0x10f89e519
postgr..
postgres`set_rel_size
postgres`get_relation_info
postgr..
postgresÈxplainOneQuery
po..
po..

Insight #4
HA != Horizontally-distributed

Single Node + Replicas = High-availability

Replication much
cheaper than
horizontal-
distribution
• Avoids consensus penalty
• Can more easily be asynchronous
• Still provides HA

Results vs Cassandra (5 nodes): Query
Type Scanned (h) Devices metrics
Timescale
query time (ms)
Cassandra
Multiplier
groupby-hour 24 all 1 27866.50 9.66
groupby-hour 24 all 5 35827.60 37.52
groupby-hour 24 all 10 44912.30 45.05
high-cpu 24 all 1 49,807 32.95
cpu-max 12 8 10 126.90 11.84
cpu-max 12 1 10 20.54 37.41
high-cpu 24 1 1 56 18.65
groupby-minute 12 1 1 33.80 0.91
groupby-minute 12 1 5 27.60 10.25
lastpoint all all 10 266.10 2285.30
groupby-orderby-limit all all 1 75.20 3181.62
Higher multiplier indicates worse Cassandra performance.

Insert metrics/s
Cassandra 150K
Timescale 745K
TimescaleDB vs.
Cassandra 
(5 nodes)
Read the Cassandra Benchmarks

Results vs Mongo: Query
Type Scanned (h) Devices metrics
Timescale
query time (ms)
Mongo
Multiplier
groupby-hour 24 all 1 29,968 7.63
high-cpu 24 all 1 51,323 5.24
cpu-max 12 8 10 260 1.14
cpu-max 12 1 10 28 1.32
high-cpu 24 1 1 95 1.17
groupby-minute 12 1 1 62 0.77
lastpoint all all 10 453 101.72
groupby-orderby-limit all all 1 149 1667.25
Higher multiplier indicates worse Mongo performance.

Insert metrics/s
Mongo 807K
Timescale 994K
TimescaleDB
vs. Mongo
Read the Mongo Benchmarks

Lessons for DB designers
1. Consider (insights!):
A. Partitioning even on single-node
• Even across disks
B. Performance analysis:
• High-level: Query optimization
• Low-level: Proﬁling
C. High-availability is possible on single-node
2. Concentrate on scale-up

Lessons for DB users
• Absolute performance as important as scaling numbers.
• Don’t go horizontally distributed unless you have to.
• HA not same as horizontal scalability.
• Replication cheaper than distribution.
• SQL and ACID are extremely useful.

Open Source (Apache 2.0)
• github.com/timescale/timescaledb
Join the Community
• slack.timescale.com

Timescale is hiring!
• Core C Developers
• R&D Engineers
• Solutions & Sales Engineers
• Technical Evangelists
• Customer Success
• Marketing
careers.timescale.com

Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Ähnlich wie Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off

Ähnlich wie Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off (20)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off