SlideShare ist ein Scribd-Unternehmen logo
1 von 43
Downloaden Sie, um offline zu lesen
CLICKHOUSE AND THE
MAGIC OF
MATERIALIZED VIEWS
Robert Hodges and the Altinity Engineering Team
Quick Introduction to Altinity
● Premier provider of software and services for ClickHouse
○ Enterprise support for ClickHouse and ecosystem projects
○ ClickHouse Feature Development (Geohashing, codecs, tiered
storage, log cleansing
)
○ Software (Kubernetes, cluster manager, tools & utilities)
○ POCs/Training
● Main US/Europe sponsor of ClickHouse community
Introduction to ClickHouse
Understands SQL
Runs on bare metal to cloud
Stores data in columns
Parallel and vectorized execution
Scales to many petabytes
Is Open source (Apache 2.0)
Is WAY fast!
a b c d
a b c d
a b c d
a b c d
Materialized
views for basic
sums
Computing sums on large datasets
Our favorite dataset!
1.31 billion rows of data
from taxi and limousine
rides in New York City
Question:
What is the average number
of persons per taxi in each
pickup location?
ClickHouse can solve it with a simple query
SELECT pickup_location_id,
sum(passenger_count) / pc_cnt AS pc_avg,
count() AS pc_cnt
FROM tripdata GROUP BY pickup_location_id
ORDER BY pc_cnt DESC LIMIT 10
...
10 rows in set. Elapsed: 1.818 sec. Processed 1.31
billion rows, 3.93 GB (721.16 million rows/s., 2.16
GB/s.)
Can we make it faster?
We can add a materialized view
CREATE MATERIALIZED VIEW tripdata_smt_mv
ENGINE = SummingMergeTree
PARTITION BY toYYYYMM(pickup_date)
ORDER BY (pickup_location_id, dropoff_location_id)
POPULATE
AS SELECT
pickup_date,
pickup_location_id,
dropoff_location_id,
sum(passenger_count) AS passenger_count_sum,
count() AS trips_count
FROM tripdata
GROUP BY
pickup_date,
pickup_location_id,
dropoff_location_id
Engine
How to
derive
data
Auto-
load
after
create
How
to
sum
Now we can query the materialized view
SELECT pickup_location_id,
sum(passenger_count_sum) / sum(trips_count) AS pc_avg,
sum(trips_count) AS pc_cnt
FROM tripdata_smt_mv GROUP BY pickup_location_id
ORDER BY pc_cnt DESC LIMIT 10
...
10 rows in set. Elapsed: 0.016 sec. Processed 2.04
million rows, 36.66 MB (127.04 million rows/s., 2.29
GB/s.)
Materialized view is 113 times faster
What’s going on under the covers?
cpu
(MergeTree)
.inner.tripdata_smt_mv
(SummingMergeTree)
tripdata_smt_mv
(materialized view)
INSERT
SELECT
Compressed size: ~43GB
Uncompressed size: ~104GB
Compressed size: ~13MB
Uncompressed size: ~44MB
SELECT
INSERT
(Trigger)
Materialized
views for SQL
aggregates
Let’s look at aggregates other than sums
SELECT min(fare_amount), avg(fare_amount),
max(fare_amount), sum(fare_amount), count()
FROM tripdata
WHERE (fare_amount > 0) AND (fare_amount < 500.)
...
1 rows in set. Elapsed: 4.640 sec. Processed 1.31
billion rows, 5.24 GB (282.54 million rows/s., 1.13
GB/s.)
Can we make all aggregates faster?
Let’s create a view for all aggregates
CREATE MATERIALIZED VIEW tripdata_agg_mv
ENGINE = SummingMergeTree
PARTITION BY toYYYYMM(pickup_date)
ORDER BY (pickup_location_id, dropoff_location_id)
POPULATE AS SELECT
pickup_date, pickup_location_id,
dropoff_location_id,
minState(fare_amount) as fare_amount_min,
avgState(fare_amount) as fare_amount_avg,
maxState(fare_amount) as fare_amount_max,
sumState(fare_amount) as fare_amount_sum,
countState() as fare_amount_count
FROM tripdata
WHERE (fare_amount > 0) AND (fare_amount < 500.)
GROUP BY
pickup_date, pickup_location_id, dropoff_location_id
Engine
How to
derive
data
Auto-
load
after
create
Agg.
order
Filter on
data
Digression: How aggregation works
fare_amount
avgState(fare_amount)
avgMerge(fare_amount_avg)
Source value
Partial
aggregate
Merged
aggregate
Now let’s select from the materialized view
SELECT minMerge(fare_amount_min) AS fare_amount_min,
avgMerge(fare_amount_avg) AS fare_amount_avg,
maxMerge(fare_amount_max) AS fare_amount_max,
sumMerge(fare_amount_sum) AS fare_amount_sum,
countMerge(fare_amount_count) AS fare_amount_count
FROM tripdata_agg_mv
...
1 rows in set. Elapsed: 0.017 sec. Processed 208.86
thousand rows, 23.88 MB (12.54 million rows/s., 1.43
GB/s.)
Materialized view is 274 times faster
Materialized
views for “last
point queries”
Last point problems are common in time series
Host
7023
Host
6522
CPU
Utilization
Host
9601
CPU
Utilization
CPU
UtilizationCPU
Utilization
CPU
UtilizationCPU
Utilization
CPU Table
Problem: Show the
current CPU utilization for
each host
CPU
Utilization
CPU
Utilization
ClickHouse can solve this using a subquery
SELECT t.hostname, tags_id, 100 - usage_idle usage
FROM (
SELECT tags_id, usage_idle
FROM cpu
WHERE (tags_id, created_at) IN
(SELECT tags_id, max(created_at)
FROM cpu GROUP BY tags_id)
) AS c
INNER JOIN tags AS t ON c.tags_id = t.id
ORDER BY
usage DESC,
t.hostname ASC
LIMIT 10
TABLE SCAN!
USE INDEX
OPTIMIZED
JOIN COST
SQL queries work but are ineïŹƒcient
OUTPUT:
┌─hostname──┬─tags_id─┬─usage─┐
│ host_1002 │ 9003 │ 100 │
│ host_1116 │ 9117 │ 100 │
│ host_1141 │ 9142 │ 100 │
│ host_1163 │ 9164 │ 100 │
│ host_1210 │ 9211 │ 100 │
│ host_1216 │ 9217 │ 100 │
│ host_1234 | 9235 │ 100 │
│ host_1308 │ 9309 │ 100 │
│ host_1419 │ 9420 │ 100 │
│ host_1491 │ 9492 │ 100 │
└───────────┮─────────┮───────┘
Using direct query on table:
10 rows in set. Elapsed: 0.566 sec.
Processed 32.87 million rows, 263.13
MB (53.19 million rows/s., 425.81
MB/s.)
Can we bring last
point performance
closer to real-time?
Create an explicit target for aggregate data
CREATE TABLE cpu_last_point_idle_agg (
created_date AggregateFunction(argMax, Date, DateTime),
max_created_at AggregateFunction(max, DateTime),
time AggregateFunction(argMax, String, DateTime),
tags_id UInt32,
usage_idle AggregateFunction(argMax, Float64, DateTime)
)
ENGINE = AggregatingMergeTree()
PARTITION BY tuple()
ORDER BY tags_id
TIP: This is a new way
to create materialized
views in ClickHouse
TIP: ‘tuple()’ means
don’t partition and
don’t sort data
argMaxState links columns with aggregates
CREATE MATERIALIZED VIEW cpu_last_point_idle_mv
TO cpu_last_point_idle_agg
AS SELECT
argMaxState(created_date, created_at) AS created_date,
maxState(created_at) AS max_created_at,
argMaxState(time, created_at) AS time,
tags_id,
argMaxState(usage_idle, created_at) AS usage_idle
FROM cpu
GROUP BY tags_id
Derive data
MV
table
Understanding how argMaxState works
created_at
maxState(created_at)
avgMerge(created_at)
Source
values
Partial
aggregates
Merged
aggregates
usage_idle
argMaxState(usage_idle,
created_at)
avgMaxMerge(usage_idle)
(Same row)
(Pick usage_idle from
aggregate with
matching created_at)
(Pick usage_idle value
from any row with
matching created_at)
Load materialized view using a SELECT
INSERT INTO cpu_last_point_idle_mv
SELECT
argMaxState(created_date, created_at) AS created_date,
maxState(created_at) AS max_created_at,
argMaxState(time, created_at) AS time,
Tags_id,
argMaxState(usage_idle, created_at) AS usage_idle
FROM cpu
GROUP BY tags_id
POPULATE keyword is not supported
Let’s hide the merge details with a view
CREATE VIEW cpu_last_point_idle_v AS
SELECT
argMaxMerge(created_date) AS created_date,
maxMerge(max_created_at) AS created_at,
argMaxMerge(time) AS time,
tags_id,
argMaxMerge(usage_idle) AS usage_idle
FROM cpu_last_point_idle_mv
GROUP BY tags_id
...Select again from the covering view
SELECT t.hostname, tags_id, 100 - usage_idle usage
FROM cpu_last_point_idle_v AS b
INNER JOIN tags AS t ON b.tags_id = t.id
ORDER BY usage DESC, t.hostname ASC
LIMIT 10
...
10 rows in set. Elapsed: 0.005 sec. Processed 14.00
thousand rows, 391.65 KB (2.97 million rows/s., 82.97
MB/s.)
Last point view is 113 times faster
Another view under the covers
cpu
(MergeTree)
cpu_last_point_idle_agg
(AggregatingMergeTree)
cpu_last_point_idle_mv
(Materialized View)
cpu_last_point_idle_v
(View)
INSERT
SELECT
SELECT
(Trigger)
Compressed size: ~3GB
Uncompressed size: ~12GB
Compressed size: ~29K (!!!)
Uncompressed size: ~264K (!!!)
Using views to
remember
unique visitors
It’s common to drop old data in large tables
CREATE TABLE traffic (
datetime DateTime,
date Date,
request_id UInt64,
cust_id UInt32,
sku UInt32
) ENGINE = MergeTree
PARTITION BY toYYYYMM(datetime)
ORDER BY (cust_id, date)
TTL datetime + INTERVAL 90 DAY
ClickHouse can aggregate unique values
-- Find which hours have most unique visits on
-- SKU #100
SELECT
toStartOfHour(datetime) as hour,
uniq(cust_id) as uniqs
FROM purchases
WHERE sku = 100
GROUP BY hour
ORDER BY uniqs DESC, hour ASC
LIMIT 10
How can we remember deleted visit data?
Create a target table for unique visit data
CREATE TABLE traffic_uniqs_agg (
hour DateTime,
cust_id_uniqs AggregateFunction(uniq, UInt32),
sku UInt32
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (sku, hour)
Need to partition
Engine
Agg. order
Aggregate for
unique values
Use uniqState to collect unique customers
CREATE MATERIALIZED VIEW traffic_uniqs_mv
TO traffic_uniqs_agg
AS SELECT
toStartOfHour(datetime) as hour,
uniqState(cust_id) as cust_id_uniqs,
Sku
FROM traffic
GROUP BY hour, sku
Now we can remember visitors indeïŹnitely!
SELECT
hour,
uniqMerge(cust_id_uniqs) as uniqs
FROM traffic_uniqs_mv
WHERE sku = 100
GROUP BY hour
ORDER BY uniqs DESC, hour ASC
LIMIT 10
Check view size regularly!
SELECT
table, count(*) AS columns,
sum(data_compressed_bytes) AS tc,
sum(data_uncompressed_bytes) AS tu
FROM system.columns
WHERE database = currentDatabase()
GROUP BY table
┌─table─────────────┬─columns─┬────────tc─┬─────────tu─┐
│ traffic_uniqs_agg │ 3 │ 52620883 │ 90630442 │
│ traffic │ 5 │ 626152255 │ 1463621588 │
│ traffic_uniqs_mv │ 3 │ 0 │ 0 │
└───────────────────┮─────────┮───────────┮────────────┘
Uniq hash
tables can be
quite large
Operational
Issues
Loading view data in production
Schema migration
Load past data in the future
CREATE MATERIALIZED VIEW sales_amount_mv
TO sales_amount_agg
AS SELECT
toStartOfHour(datetime) as hour,
sumState(amount) as amount_sum,
sku
FROM sales
WHERE
datetime >= toDateTime('2019-07-12 00:00:00')
GROUP BY hour, sku
How to
derive
data
Accept only future
rows from source
Creating a view that accepts future data
INSERT INTO sales_amount_agg
SELECT
toStartOfHour(datetime) as hour,
sumState(amount) as amount_sum,
sku
FROM sales
WHERE
datetime < toDateTime('2019-01-12 00:00:00')
GROUP BY hour, sku
Load to
view target
table
Accept only past
rows from source
Add an aggregate to MV with inner table
-- Detach view from server.
DETACH TABLE sales_amount_inner_mv
-- Update base table
ALTER TABLE `.inner.sales_amount_inner_mv` ADD COLUMN
`amount_avg` AggregateFunction(avg, Float32)
AFTER amount_sum
Check order
carefully!
TIP: Stop source
INSERTs to avoid
losing data or load
data later
Add column to MV with inner table (cont.)
-- Re-attach view with modifications.
ATTACH MATERIALIZED VIEW sales_amount_inner_mv
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (sku, hour)
AS SELECT
toStartOfHour(datetime) as hour,
sumState(amount) as amount_sum,
avgState(amount) as amount_avg,
sku
FROM sales
GROUP BY hour, sku
Empty until
new data
arrive
Specify table
‘TO’ option views are easier to change
-- Drop view
DROP TABLE sales_amount_mv
-- Update target table
ALTER TABLE sales_amount_agg ADD COLUMN
`amount_avg` AggregateFunction(avg, Float32)
AFTER amount_sum
-- Recreate view
CREATE MATERIALIZED VIEW sales_amount_mv TO sales_amount_agg
AS SELECT toStartOfHour(datetime) as hour,
sumState(amount) as amount_sum,
avgState(amount) as amount_avg,
sku
FROM sales GROUP BY hour, sku
Empty until
new data
arrive
Let’s add a dimension to the view
-- Drop view
DROP TABLE sales_amount_mv
-- Update target table
ALTER TABLE sales_amount_agg
ADD COLUMN cust_id UInt32 AFTER sku,
MODIFY ORDER BY (sku, hour, cust_id)
-- Recreate view
CREATE MATERIALIZED VIEW sales_amount_mv TO sales_amount_agg
AS SELECT
toStartOfHour(datetime) as hour,
sumState(amount) as amount_sum,
avgState(amount) as amount_avg,
sku, cust_id
FROM sales GROUP BY hour, sku, cust_id
Empty until
new data
arrive
Conclusion
Key takeaways
● Materialized Views are like automatic triggers
○ Move data from source table to target
● They solve important problems within the database:
○ Speed up query by pre-computing aggregates
○ Make ‘last point’ queries extremely fast
○ Remember aggregates after source rows are dropped
● AggregateFunctions hold partially aggregated data
○ Functions like avgState(X) to put data into the view
○ Functions like avgMerge(X) to get results out
○ Functions like argMaxState(X, Y) link values across columns
● Two “styles” of materialized views
○ TO external table or using implicit ‘inner’ table as target
● Loading with ïŹlters avoids data loss on busy systems
● Schema migration operations are easier with ‘TO’ style of MV
But wait, there’s even more!
● [Many] More use cases for views
○ Different sort orders
○ Using views as indexes
○ Data transformation
● More about table engines and materialized views
● Failures
○ Effects of sync behavior when there are multiple views per table
○ Common materialized view problems
● More tips on schema migration
Thank you!
Special Offer:
Contact us for a
1-hour consultation!
Contacts:
info@altinity.com
Visit us at:
https://www.altinity.com
Free Consultation:
https://blog.altinity.com/offer

Weitere Àhnliche Inhalte

Was ist angesagt?

Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouseAltinity Ltd
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides Altinity Ltd
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
 
A day in the life of a click house query
A day in the life of a click house queryA day in the life of a click house query
A day in the life of a click house queryCristinaMunteanu43
 
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, CloudflareClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, CloudflareAltinity Ltd
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesAltinity Ltd
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOAltinity Ltd
 
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...Altinity Ltd
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovAltinity Ltd
 
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Ltd
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Altinity Ltd
 
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...Altinity Ltd
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for ExperimentationGleb Kanterov
 
Better than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseBetter than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseAltinity Ltd
 
ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...
ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...
ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...Altinity Ltd
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevAltinity Ltd
 
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEODangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEOAltinity Ltd
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAltinity Ltd
 
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTOClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTOAltinity Ltd
 
Fast Insight from Fast Data: Integrating ClickHouse and Apache Kafka
Fast Insight from Fast Data: Integrating ClickHouse and Apache KafkaFast Insight from Fast Data: Integrating ClickHouse and Apache Kafka
Fast Insight from Fast Data: Integrating ClickHouse and Apache KafkaAltinity Ltd
 

Was ist angesagt? (20)

Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouse
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
A day in the life of a click house query
A day in the life of a click house queryA day in the life of a click house query
A day in the life of a click house query
 
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, CloudflareClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
 
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouse
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
 
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
Better than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseBetter than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouse
 
ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...
ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...
ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEODangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
 
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTOClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
 
Fast Insight from Fast Data: Integrating ClickHouse and Apache Kafka
Fast Insight from Fast Data: Integrating ClickHouse and Apache KafkaFast Insight from Fast Data: Integrating ClickHouse and Apache Kafka
Fast Insight from Fast Data: Integrating ClickHouse and Apache Kafka
 

Ähnlich wie ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity Engineering Team

John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres OpenJohn Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres OpenPostgresOpen
 
Scaling PostgreSQL With GridSQL
Scaling PostgreSQL With GridSQLScaling PostgreSQL With GridSQL
Scaling PostgreSQL With GridSQLJim Mlodgenski
 
Column store indexes and batch processing mode (nx power lite)
Column store indexes and batch processing mode (nx power lite)Column store indexes and batch processing mode (nx power lite)
Column store indexes and batch processing mode (nx power lite)Chris Adkin
 
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...InfluxData
 
Performance Tuning Cheat Sheet for MongoDB
Performance Tuning Cheat Sheet for MongoDBPerformance Tuning Cheat Sheet for MongoDB
Performance Tuning Cheat Sheet for MongoDBSeveralnines
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with PrometheusShiao-An Yuan
 
Google Cluster Innards
Google Cluster InnardsGoogle Cluster Innards
Google Cluster InnardsMartin Dvorak
 
Real-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleReal-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleSingleStore
 
Webinar: Strength in Numbers: Introduction to ClickHouse Cluster Performance
Webinar: Strength in Numbers: Introduction to ClickHouse Cluster PerformanceWebinar: Strength in Numbers: Introduction to ClickHouse Cluster Performance
Webinar: Strength in Numbers: Introduction to ClickHouse Cluster PerformanceAltinity Ltd
 
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...Altinity Ltd
 
A Rusty introduction to Apache Arrow and how it applies to a time series dat...
A Rusty introduction to Apache Arrow and how it applies to a  time series dat...A Rusty introduction to Apache Arrow and how it applies to a  time series dat...
A Rusty introduction to Apache Arrow and how it applies to a time series dat...Andrew Lamb
 
What’s new in 9.6, by PostgreSQL contributor
What’s new in 9.6, by PostgreSQL contributorWhat’s new in 9.6, by PostgreSQL contributor
What’s new in 9.6, by PostgreSQL contributorMasahiko Sawada
 
Extra performance out of thin air
Extra performance out of thin airExtra performance out of thin air
Extra performance out of thin airKonstantine Krutiy
 
Apache Cassandra at Macys
Apache Cassandra at MacysApache Cassandra at Macys
Apache Cassandra at MacysDataStax Academy
 
Beyond porting
Beyond portingBeyond porting
Beyond portingCass Everitt
 
NVIDIA HPC ă‚œăƒ•ăƒˆă‚Šă‚šă‚ąæ–œă‚èȘ­ăż
NVIDIA HPC ă‚œăƒ•ăƒˆă‚Šă‚šă‚ąæ–œă‚èȘ­ăżNVIDIA HPC ă‚œăƒ•ăƒˆă‚Šă‚šă‚ąæ–œă‚èȘ­ăż
NVIDIA HPC ă‚œăƒ•ăƒˆă‚Šă‚šă‚ąæ–œă‚èȘ­ăżNVIDIA Japan
 
MongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo SeattleMongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo SeattleMongoDB
 
Fun Teaching MongoDB New Tricks
Fun Teaching MongoDB New TricksFun Teaching MongoDB New Tricks
Fun Teaching MongoDB New TricksMongoDB
 
Performance measurement and tuning
Performance measurement and tuningPerformance measurement and tuning
Performance measurement and tuningAOE
 

Ähnlich wie ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity Engineering Team (20)

John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres OpenJohn Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
 
Scaling PostgreSQL With GridSQL
Scaling PostgreSQL With GridSQLScaling PostgreSQL With GridSQL
Scaling PostgreSQL With GridSQL
 
Column store indexes and batch processing mode (nx power lite)
Column store indexes and batch processing mode (nx power lite)Column store indexes and batch processing mode (nx power lite)
Column store indexes and batch processing mode (nx power lite)
 
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
 
Performance Tuning Cheat Sheet for MongoDB
Performance Tuning Cheat Sheet for MongoDBPerformance Tuning Cheat Sheet for MongoDB
Performance Tuning Cheat Sheet for MongoDB
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with Prometheus
 
Google Cluster Innards
Google Cluster InnardsGoogle Cluster Innards
Google Cluster Innards
 
Mongo db dla administratora
Mongo db dla administratoraMongo db dla administratora
Mongo db dla administratora
 
Real-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleReal-Time Analytics at Uber Scale
Real-Time Analytics at Uber Scale
 
Webinar: Strength in Numbers: Introduction to ClickHouse Cluster Performance
Webinar: Strength in Numbers: Introduction to ClickHouse Cluster PerformanceWebinar: Strength in Numbers: Introduction to ClickHouse Cluster Performance
Webinar: Strength in Numbers: Introduction to ClickHouse Cluster Performance
 
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...
 
A Rusty introduction to Apache Arrow and how it applies to a time series dat...
A Rusty introduction to Apache Arrow and how it applies to a  time series dat...A Rusty introduction to Apache Arrow and how it applies to a  time series dat...
A Rusty introduction to Apache Arrow and how it applies to a time series dat...
 
What’s new in 9.6, by PostgreSQL contributor
What’s new in 9.6, by PostgreSQL contributorWhat’s new in 9.6, by PostgreSQL contributor
What’s new in 9.6, by PostgreSQL contributor
 
Extra performance out of thin air
Extra performance out of thin airExtra performance out of thin air
Extra performance out of thin air
 
Apache Cassandra at Macys
Apache Cassandra at MacysApache Cassandra at Macys
Apache Cassandra at Macys
 
Beyond porting
Beyond portingBeyond porting
Beyond porting
 
NVIDIA HPC ă‚œăƒ•ăƒˆă‚Šă‚šă‚ąæ–œă‚èȘ­ăż
NVIDIA HPC ă‚œăƒ•ăƒˆă‚Šă‚šă‚ąæ–œă‚èȘ­ăżNVIDIA HPC ă‚œăƒ•ăƒˆă‚Šă‚šă‚ąæ–œă‚èȘ­ăż
NVIDIA HPC ă‚œăƒ•ăƒˆă‚Šă‚šă‚ąæ–œă‚èȘ­ăż
 
MongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo SeattleMongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo Seattle
 
Fun Teaching MongoDB New Tricks
Fun Teaching MongoDB New TricksFun Teaching MongoDB New Tricks
Fun Teaching MongoDB New Tricks
 
Performance measurement and tuning
Performance measurement and tuningPerformance measurement and tuning
Performance measurement and tuning
 

Mehr von Altinity Ltd

Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxBuilding an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxAltinity Ltd
 
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Altinity Ltd
 
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceBuilding an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceAltinity Ltd
 
Fun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdfFun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdfAltinity Ltd
 
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfCloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfAltinity Ltd
 
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Altinity Ltd
 
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Altinity Ltd
 
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdfOwn your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdfAltinity Ltd
 
ClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom AppsClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom AppsAltinity Ltd
 
Adventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree EngineAdventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree EngineAltinity Ltd
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache PinotAltinity Ltd
 
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Ltd
 
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...Altinity Ltd
 
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfOSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfAltinity Ltd
 
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...Altinity Ltd
 
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...Altinity Ltd
 
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...Altinity Ltd
 
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...Altinity Ltd
 
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...Altinity Ltd
 
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfOSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfAltinity Ltd
 

Mehr von Altinity Ltd (20)

Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxBuilding an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
 
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
 
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceBuilding an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open Source
 
Fun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdfFun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdf
 
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfCloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
 
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
 
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
 
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdfOwn your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
 
ClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom AppsClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom Apps
 
Adventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree EngineAdventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree Engine
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
 
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
 
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
 
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfOSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
 
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
 
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
 
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
 
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
 
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
 
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfOSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
 

KĂŒrzlich hochgeladen

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 

KĂŒrzlich hochgeladen (20)

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 

ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity Engineering Team

  • 1. CLICKHOUSE AND THE MAGIC OF MATERIALIZED VIEWS Robert Hodges and the Altinity Engineering Team
  • 2. Quick Introduction to Altinity ● Premier provider of software and services for ClickHouse ○ Enterprise support for ClickHouse and ecosystem projects ○ ClickHouse Feature Development (Geohashing, codecs, tiered storage, log cleansing
) ○ Software (Kubernetes, cluster manager, tools & utilities) ○ POCs/Training ● Main US/Europe sponsor of ClickHouse community
  • 3. Introduction to ClickHouse Understands SQL Runs on bare metal to cloud Stores data in columns Parallel and vectorized execution Scales to many petabytes Is Open source (Apache 2.0) Is WAY fast! a b c d a b c d a b c d a b c d
  • 5. Computing sums on large datasets Our favorite dataset! 1.31 billion rows of data from taxi and limousine rides in New York City Question: What is the average number of persons per taxi in each pickup location?
  • 6. ClickHouse can solve it with a simple query SELECT pickup_location_id, sum(passenger_count) / pc_cnt AS pc_avg, count() AS pc_cnt FROM tripdata GROUP BY pickup_location_id ORDER BY pc_cnt DESC LIMIT 10 ... 10 rows in set. Elapsed: 1.818 sec. Processed 1.31 billion rows, 3.93 GB (721.16 million rows/s., 2.16 GB/s.) Can we make it faster?
  • 7. We can add a materialized view CREATE MATERIALIZED VIEW tripdata_smt_mv ENGINE = SummingMergeTree PARTITION BY toYYYYMM(pickup_date) ORDER BY (pickup_location_id, dropoff_location_id) POPULATE AS SELECT pickup_date, pickup_location_id, dropoff_location_id, sum(passenger_count) AS passenger_count_sum, count() AS trips_count FROM tripdata GROUP BY pickup_date, pickup_location_id, dropoff_location_id Engine How to derive data Auto- load after create How to sum
  • 8. Now we can query the materialized view SELECT pickup_location_id, sum(passenger_count_sum) / sum(trips_count) AS pc_avg, sum(trips_count) AS pc_cnt FROM tripdata_smt_mv GROUP BY pickup_location_id ORDER BY pc_cnt DESC LIMIT 10 ... 10 rows in set. Elapsed: 0.016 sec. Processed 2.04 million rows, 36.66 MB (127.04 million rows/s., 2.29 GB/s.) Materialized view is 113 times faster
  • 9. What’s going on under the covers? cpu (MergeTree) .inner.tripdata_smt_mv (SummingMergeTree) tripdata_smt_mv (materialized view) INSERT SELECT Compressed size: ~43GB Uncompressed size: ~104GB Compressed size: ~13MB Uncompressed size: ~44MB SELECT INSERT (Trigger)
  • 11. Let’s look at aggregates other than sums SELECT min(fare_amount), avg(fare_amount), max(fare_amount), sum(fare_amount), count() FROM tripdata WHERE (fare_amount > 0) AND (fare_amount < 500.) ... 1 rows in set. Elapsed: 4.640 sec. Processed 1.31 billion rows, 5.24 GB (282.54 million rows/s., 1.13 GB/s.) Can we make all aggregates faster?
  • 12. Let’s create a view for all aggregates CREATE MATERIALIZED VIEW tripdata_agg_mv ENGINE = SummingMergeTree PARTITION BY toYYYYMM(pickup_date) ORDER BY (pickup_location_id, dropoff_location_id) POPULATE AS SELECT pickup_date, pickup_location_id, dropoff_location_id, minState(fare_amount) as fare_amount_min, avgState(fare_amount) as fare_amount_avg, maxState(fare_amount) as fare_amount_max, sumState(fare_amount) as fare_amount_sum, countState() as fare_amount_count FROM tripdata WHERE (fare_amount > 0) AND (fare_amount < 500.) GROUP BY pickup_date, pickup_location_id, dropoff_location_id Engine How to derive data Auto- load after create Agg. order Filter on data
  • 13. Digression: How aggregation works fare_amount avgState(fare_amount) avgMerge(fare_amount_avg) Source value Partial aggregate Merged aggregate
  • 14. Now let’s select from the materialized view SELECT minMerge(fare_amount_min) AS fare_amount_min, avgMerge(fare_amount_avg) AS fare_amount_avg, maxMerge(fare_amount_max) AS fare_amount_max, sumMerge(fare_amount_sum) AS fare_amount_sum, countMerge(fare_amount_count) AS fare_amount_count FROM tripdata_agg_mv ... 1 rows in set. Elapsed: 0.017 sec. Processed 208.86 thousand rows, 23.88 MB (12.54 million rows/s., 1.43 GB/s.) Materialized view is 274 times faster
  • 16. Last point problems are common in time series Host 7023 Host 6522 CPU Utilization Host 9601 CPU Utilization CPU UtilizationCPU Utilization CPU UtilizationCPU Utilization CPU Table Problem: Show the current CPU utilization for each host CPU Utilization CPU Utilization
  • 17. ClickHouse can solve this using a subquery SELECT t.hostname, tags_id, 100 - usage_idle usage FROM ( SELECT tags_id, usage_idle FROM cpu WHERE (tags_id, created_at) IN (SELECT tags_id, max(created_at) FROM cpu GROUP BY tags_id) ) AS c INNER JOIN tags AS t ON c.tags_id = t.id ORDER BY usage DESC, t.hostname ASC LIMIT 10 TABLE SCAN! USE INDEX OPTIMIZED JOIN COST
  • 18. SQL queries work but are ineïŹƒcient OUTPUT: ┌─hostname──┬─tags_id─┬─usage─┐ │ host_1002 │ 9003 │ 100 │ │ host_1116 │ 9117 │ 100 │ │ host_1141 │ 9142 │ 100 │ │ host_1163 │ 9164 │ 100 │ │ host_1210 │ 9211 │ 100 │ │ host_1216 │ 9217 │ 100 │ │ host_1234 | 9235 │ 100 │ │ host_1308 │ 9309 │ 100 │ │ host_1419 │ 9420 │ 100 │ │ host_1491 │ 9492 │ 100 │ └───────────┮─────────┮───────┘ Using direct query on table: 10 rows in set. Elapsed: 0.566 sec. Processed 32.87 million rows, 263.13 MB (53.19 million rows/s., 425.81 MB/s.) Can we bring last point performance closer to real-time?
  • 19. Create an explicit target for aggregate data CREATE TABLE cpu_last_point_idle_agg ( created_date AggregateFunction(argMax, Date, DateTime), max_created_at AggregateFunction(max, DateTime), time AggregateFunction(argMax, String, DateTime), tags_id UInt32, usage_idle AggregateFunction(argMax, Float64, DateTime) ) ENGINE = AggregatingMergeTree() PARTITION BY tuple() ORDER BY tags_id TIP: This is a new way to create materialized views in ClickHouse TIP: ‘tuple()’ means don’t partition and don’t sort data
  • 20. argMaxState links columns with aggregates CREATE MATERIALIZED VIEW cpu_last_point_idle_mv TO cpu_last_point_idle_agg AS SELECT argMaxState(created_date, created_at) AS created_date, maxState(created_at) AS max_created_at, argMaxState(time, created_at) AS time, tags_id, argMaxState(usage_idle, created_at) AS usage_idle FROM cpu GROUP BY tags_id Derive data MV table
  • 21. Understanding how argMaxState works created_at maxState(created_at) avgMerge(created_at) Source values Partial aggregates Merged aggregates usage_idle argMaxState(usage_idle, created_at) avgMaxMerge(usage_idle) (Same row) (Pick usage_idle from aggregate with matching created_at) (Pick usage_idle value from any row with matching created_at)
  • 22. Load materialized view using a SELECT INSERT INTO cpu_last_point_idle_mv SELECT argMaxState(created_date, created_at) AS created_date, maxState(created_at) AS max_created_at, argMaxState(time, created_at) AS time, Tags_id, argMaxState(usage_idle, created_at) AS usage_idle FROM cpu GROUP BY tags_id POPULATE keyword is not supported
  • 23. Let’s hide the merge details with a view CREATE VIEW cpu_last_point_idle_v AS SELECT argMaxMerge(created_date) AS created_date, maxMerge(max_created_at) AS created_at, argMaxMerge(time) AS time, tags_id, argMaxMerge(usage_idle) AS usage_idle FROM cpu_last_point_idle_mv GROUP BY tags_id
  • 24. ...Select again from the covering view SELECT t.hostname, tags_id, 100 - usage_idle usage FROM cpu_last_point_idle_v AS b INNER JOIN tags AS t ON b.tags_id = t.id ORDER BY usage DESC, t.hostname ASC LIMIT 10 ... 10 rows in set. Elapsed: 0.005 sec. Processed 14.00 thousand rows, 391.65 KB (2.97 million rows/s., 82.97 MB/s.) Last point view is 113 times faster
  • 25. Another view under the covers cpu (MergeTree) cpu_last_point_idle_agg (AggregatingMergeTree) cpu_last_point_idle_mv (Materialized View) cpu_last_point_idle_v (View) INSERT SELECT SELECT (Trigger) Compressed size: ~3GB Uncompressed size: ~12GB Compressed size: ~29K (!!!) Uncompressed size: ~264K (!!!)
  • 27. It’s common to drop old data in large tables CREATE TABLE traffic ( datetime DateTime, date Date, request_id UInt64, cust_id UInt32, sku UInt32 ) ENGINE = MergeTree PARTITION BY toYYYYMM(datetime) ORDER BY (cust_id, date) TTL datetime + INTERVAL 90 DAY
  • 28. ClickHouse can aggregate unique values -- Find which hours have most unique visits on -- SKU #100 SELECT toStartOfHour(datetime) as hour, uniq(cust_id) as uniqs FROM purchases WHERE sku = 100 GROUP BY hour ORDER BY uniqs DESC, hour ASC LIMIT 10 How can we remember deleted visit data?
  • 29. Create a target table for unique visit data CREATE TABLE traffic_uniqs_agg ( hour DateTime, cust_id_uniqs AggregateFunction(uniq, UInt32), sku UInt32 ) ENGINE = AggregatingMergeTree() PARTITION BY toYYYYMM(hour) ORDER BY (sku, hour) Need to partition Engine Agg. order Aggregate for unique values
  • 30. Use uniqState to collect unique customers CREATE MATERIALIZED VIEW traffic_uniqs_mv TO traffic_uniqs_agg AS SELECT toStartOfHour(datetime) as hour, uniqState(cust_id) as cust_id_uniqs, Sku FROM traffic GROUP BY hour, sku
  • 31. Now we can remember visitors indeïŹnitely! SELECT hour, uniqMerge(cust_id_uniqs) as uniqs FROM traffic_uniqs_mv WHERE sku = 100 GROUP BY hour ORDER BY uniqs DESC, hour ASC LIMIT 10
  • 32. Check view size regularly! SELECT table, count(*) AS columns, sum(data_compressed_bytes) AS tc, sum(data_uncompressed_bytes) AS tu FROM system.columns WHERE database = currentDatabase() GROUP BY table ┌─table─────────────┬─columns─┬────────tc─┬─────────tu─┐ │ traffic_uniqs_agg │ 3 │ 52620883 │ 90630442 │ │ traffic │ 5 │ 626152255 │ 1463621588 │ │ traffic_uniqs_mv │ 3 │ 0 │ 0 │ └───────────────────┮─────────┮───────────┮────────────┘ Uniq hash tables can be quite large
  • 33. Operational Issues Loading view data in production Schema migration
  • 34. Load past data in the future CREATE MATERIALIZED VIEW sales_amount_mv TO sales_amount_agg AS SELECT toStartOfHour(datetime) as hour, sumState(amount) as amount_sum, sku FROM sales WHERE datetime >= toDateTime('2019-07-12 00:00:00') GROUP BY hour, sku How to derive data Accept only future rows from source
  • 35. Creating a view that accepts future data INSERT INTO sales_amount_agg SELECT toStartOfHour(datetime) as hour, sumState(amount) as amount_sum, sku FROM sales WHERE datetime < toDateTime('2019-01-12 00:00:00') GROUP BY hour, sku Load to view target table Accept only past rows from source
  • 36. Add an aggregate to MV with inner table -- Detach view from server. DETACH TABLE sales_amount_inner_mv -- Update base table ALTER TABLE `.inner.sales_amount_inner_mv` ADD COLUMN `amount_avg` AggregateFunction(avg, Float32) AFTER amount_sum Check order carefully! TIP: Stop source INSERTs to avoid losing data or load data later
  • 37. Add column to MV with inner table (cont.) -- Re-attach view with modifications. ATTACH MATERIALIZED VIEW sales_amount_inner_mv ENGINE = AggregatingMergeTree() PARTITION BY toYYYYMM(hour) ORDER BY (sku, hour) AS SELECT toStartOfHour(datetime) as hour, sumState(amount) as amount_sum, avgState(amount) as amount_avg, sku FROM sales GROUP BY hour, sku Empty until new data arrive Specify table
  • 38. ‘TO’ option views are easier to change -- Drop view DROP TABLE sales_amount_mv -- Update target table ALTER TABLE sales_amount_agg ADD COLUMN `amount_avg` AggregateFunction(avg, Float32) AFTER amount_sum -- Recreate view CREATE MATERIALIZED VIEW sales_amount_mv TO sales_amount_agg AS SELECT toStartOfHour(datetime) as hour, sumState(amount) as amount_sum, avgState(amount) as amount_avg, sku FROM sales GROUP BY hour, sku Empty until new data arrive
  • 39. Let’s add a dimension to the view -- Drop view DROP TABLE sales_amount_mv -- Update target table ALTER TABLE sales_amount_agg ADD COLUMN cust_id UInt32 AFTER sku, MODIFY ORDER BY (sku, hour, cust_id) -- Recreate view CREATE MATERIALIZED VIEW sales_amount_mv TO sales_amount_agg AS SELECT toStartOfHour(datetime) as hour, sumState(amount) as amount_sum, avgState(amount) as amount_avg, sku, cust_id FROM sales GROUP BY hour, sku, cust_id Empty until new data arrive
  • 41. Key takeaways ● Materialized Views are like automatic triggers ○ Move data from source table to target ● They solve important problems within the database: ○ Speed up query by pre-computing aggregates ○ Make ‘last point’ queries extremely fast ○ Remember aggregates after source rows are dropped ● AggregateFunctions hold partially aggregated data ○ Functions like avgState(X) to put data into the view ○ Functions like avgMerge(X) to get results out ○ Functions like argMaxState(X, Y) link values across columns ● Two “styles” of materialized views ○ TO external table or using implicit ‘inner’ table as target ● Loading with ïŹlters avoids data loss on busy systems ● Schema migration operations are easier with ‘TO’ style of MV
  • 42. But wait, there’s even more! ● [Many] More use cases for views ○ Different sort orders ○ Using views as indexes ○ Data transformation ● More about table engines and materialized views ● Failures ○ Effects of sync behavior when there are multiple views per table ○ Common materialized view problems ● More tips on schema migration
  • 43. Thank you! Special Offer: Contact us for a 1-hour consultation! Contacts: info@altinity.com Visit us at: https://www.altinity.com Free Consultation: https://blog.altinity.com/offer