Webinar - MariaDB Temporal Tables: a demonstration

Federico Razzoli, Founder at Vettabase
MariaDB Temporal Tables:
A Demonstration
● Why track data changes?
● System-versioned tables
● Application-period tables
● Bitemporal tables
● A word on MindsDB
Agenda
Tracking Data Changes
● Auditing
● Travel back in time
● Compare today's situation with that of 6 months ago
● Statistics on data changes
● Find correlations
● History of an entity
● Debug
Tracking Data Changes: WHY?
● There are many ways to track data changes.
● Most commonly, they involve a consumer that reads the
binary log and sends changes to other technologies, like Kafka.
● Great for analytics, message queues, auditing.
● But the changes are:
○ Replicated asynchronously
○ Not available to the application
Tracking Data Changes
In-database methods for tracking data changes:
● Logging row versions into a table
● Logging each value change into a table
● Temporal tables
Tracking Data Changes
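For comparison, the first method (logging row versions into a table) is usually built with triggers. The following is only a minimal sketch: the `product` and `product_log` tables, their columns and the trigger names are hypothetical and not taken from the talk.

```sql
-- Hypothetical table whose changes we want to track.
CREATE TABLE product (
    id INT PRIMARY KEY,
    price DECIMAL(10, 2) NOT NULL
);

-- Hypothetical log table: one row per superseded row version.
CREATE TABLE product_log (
    id INT NOT NULL,
    price DECIMAL(10, 2) NOT NULL,
    operation ENUM('UPDATE', 'DELETE') NOT NULL,
    logged_at TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
);

-- Copy the old version of a row before it is overwritten.
CREATE TRIGGER product_log_update
    BEFORE UPDATE ON product
    FOR EACH ROW
    INSERT INTO product_log (id, price, operation)
    VALUES (OLD.id, OLD.price, 'UPDATE');

-- Copy the old version of a row before it is removed.
CREATE TRIGGER product_log_delete
    BEFORE DELETE ON product
    FOR EACH ROW
    INSERT INTO product_log (id, price, operation)
    VALUES (OLD.id, OLD.price, 'DELETE');
```

With this approach the versioning logic lives in triggers and log tables that have to be written and maintained for every tracked table.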
Advantages of Temporal Tables:
● The versioning logic is transparent
● Rotation can be automated
● Faster and more scalable than logging changes into custom tables
Tracking Data Changes
Temporal Tables Overview
Existing implementations (that I know about):
● Oracle 11g (2007)
● IBM Db2 (2012)
● SQL Server (2016)
● Snowflake
Temporal Tables Overview
Existing implementations (that I know about):
● PostgreSQL has a temporal tables extension
● CockroachDB
● CruxDB
● HBase (kind of)
Temporal Tables Overview
● MariaDB 10.3: system-versioned tables
● MariaDB 10.4: application-period tables
A table can implement both. It's called a bitemporal table.
Temporal Tables Overview
● Rows are versioned
● Every row version has 2 timestamps: the start and the end of that
version's validity
● INSERT, UPDATE, DELETE modify those timestamps in a
transparent way
● Plain SQL SELECTs only return current data
● Using temporal syntax, we can query past data
System-Versioned Tables
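A minimal sketch of the difference between plain and temporal SELECTs, assuming a hypothetical `article` table (the table, its columns and the timestamps are illustrative only):

```sql
-- Hypothetical system-versioned table.
CREATE TABLE article (
    id INT PRIMARY KEY,
    price DECIMAL(10, 2) NOT NULL
) WITH SYSTEM VERSIONING;

INSERT INTO article (id, price) VALUES (1, 10.00);
-- The old version (price 10.00) is kept as a historical row.
UPDATE article SET price = 12.00 WHERE id = 1;

-- Plain SELECT: only the current version of each row.
SELECT id, price FROM article;

-- The table as it was at a given point in time.
SELECT id, price
FROM article FOR SYSTEM_TIME AS OF TIMESTAMP '2022-01-01 00:00:00';

-- Every version that was valid at some point in the last 6 months.
SELECT id, price
FROM article FOR SYSTEM_TIME
    BETWEEN (NOW() - INTERVAL 6 MONTH) AND NOW();

-- All versions, current and historical.
SELECT id, price FROM article FOR SYSTEM_TIME ALL;
```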
● Works best to describe events with a start and an end
● Especially when some events cannot overlap
● Timestamps are written explicitly by the application
● But UPDATE and DELETE can automagically shrink or split
periods
● Apart from this, they are regular tables that you use with normal
SQL syntax
Application-Period Tables
● Not understanding this damages projects.
● If you work for a vendor, whether you want to say it or not, feel
free to correct any mistake I might make
Temporal Tables Overview
System-Versioned Tables
● Create a sysver table:
CREATE TABLE tbl_name (
    …
    valid_since TIMESTAMP(6) GENERATED ALWAYS AS ROW START INVISIBLE,
    valid_until TIMESTAMP(6) GENERATED ALWAYS AS ROW END INVISIBLE,
    PERIOD FOR SYSTEM_TIME(valid_since, valid_until)
)
WITH SYSTEM VERSIONING;
System-Versioned Tables
Best practices:
● You could omit the column names, but then you won't be able to
use them in queries
● You can use different names, but I recommend you always use
the same names
● You could use visible columns, but most of the time you don't
want to see them
System-Versioned Tables
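A short sketch of why named, invisible columns are convenient, assuming a hypothetical `employee` table (only the `valid_since` / `valid_until` names come from the slides): `SELECT *` stays uncluttered, while the timestamps remain available when you ask for them by name.

```sql
-- Hypothetical table using the column names recommended above.
CREATE TABLE employee (
    id INT PRIMARY KEY,
    salary INT NOT NULL,
    valid_since TIMESTAMP(6) GENERATED ALWAYS AS ROW START INVISIBLE,
    valid_until TIMESTAMP(6) GENERATED ALWAYS AS ROW END INVISIBLE,
    PERIOD FOR SYSTEM_TIME(valid_since, valid_until)
) WITH SYSTEM VERSIONING;

-- SELECT * does not return invisible columns.
SELECT * FROM employee;

-- Named explicitly, they can be selected, filtered and sorted on.
-- This returns the full history of one entity, oldest version first.
SELECT id, salary, valid_since, valid_until
FROM employee FOR SYSTEM_TIME ALL
WHERE id = 1
ORDER BY valid_since;
```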
● An existing table can be made sysver:
ALTER TABLE tbl_name
    ADD COLUMN valid_since TIMESTAMP(6) GENERATED ALWAYS AS ROW START INVISIBLE,
    ADD COLUMN valid_until TIMESTAMP(6) GENERATED ALWAYS AS ROW END INVISIBLE,
    ADD PERIOD FOR SYSTEM_TIME(valid_since, valid_until),
    ADD SYSTEM VERSIONING;
System-Versioned Tables
Best practices:
● Making one, multiple, or even all tables sysver is not a risky
operation - but you never know
● You can do this on a new replica that is used by analysts or
programs that read historical data
● For such replicas it's usually ok not to use an LTS version
● Once you're confident enough, you can make the change on the
master
System-Versioned Tables
● In both cases (new or existing table), it's practical to create one
or more separate partitions for historical data:
ALTER TABLE tbl_name
    PARTITION BY SYSTEM_TIME (
        PARTITION p_history1 HISTORY,
        … ,
        PARTITION p_current CURRENT
    );
System-Versioned Tables
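History partitions can also be rotated by time. A minimal sketch, assuming a hypothetical `sensor_reading` table, a one-month interval and three history partitions (all illustrative choices; keys are omitted for brevity):

```sql
-- Hypothetical table: historical rows are spread over monthly partitions,
-- current rows stay in p_current.
CREATE TABLE sensor_reading (
    id INT NOT NULL,
    reading INT NOT NULL
) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME INTERVAL 1 MONTH (
    PARTITION p_history1 HISTORY,
    PARTITION p_history2 HISTORY,
    PARTITION p_history3 HISTORY,
    PARTITION p_current CURRENT
);

-- Dropping the oldest history partition discards the history it holds
-- without touching current data.
ALTER TABLE sensor_reading DROP PARTITION p_history1;
```

Keeping history in its own partitions means that queries on current data do not have to scan historical rows, and old history can be discarded one partition at a time.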
How to delete history:
● Remove history before a point in time:
DELETE HISTORY FROM tbl_name
BEFORE SYSTEM_TIME '2020-01-01 00:00:00';
● Remove whole history:
DELETE HISTORY FROM tbl_name;
● Remove history and make the table non-sysver:
ALTER TABLE tbl_name DROP SYSTEM VERSIONING;
● Remove history and current data:
TRUNCATE TABLE tbl_name;
System-Versioned Tables
● GDPR and possibly some other regulations guarantee the
Right To Be Forgotten (RTBF)
● This means that we can't keep the whole history of columns that
contain Personally Identifiable Information (PII)
● To exclude these columns from a table history:
CREATE TABLE user (
    …
    email VARCHAR(100) NOT NULL WITHOUT SYSTEM VERSIONING,
    …
)
WITH SYSTEM VERSIONING;
Right To Be Forgotten
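A minimal sketch of how the exclusion can be checked, assuming a hypothetical `app_user` table (names and values are illustrative). It changes only the excluded column, so no historical version is expected to be created.

```sql
-- Hypothetical table: plan is versioned, email is not.
CREATE TABLE app_user (
    id INT PRIMARY KEY,
    plan VARCHAR(20) NOT NULL,
    email VARCHAR(100) NOT NULL WITHOUT SYSTEM VERSIONING
) WITH SYSTEM VERSIONING;

INSERT INTO app_user (id, plan, email)
VALUES (1, 'free', 'old@example.com');

-- Only the unversioned column changes: this is not expected
-- to produce a historical row.
UPDATE app_user SET email = 'new@example.com' WHERE id = 1;

-- Inspect which versions were actually kept.
SELECT id, plan, email FROM app_user FOR SYSTEM_TIME ALL;
```

It is worth running a check like this on your own MariaDB version, and also after changing a versioned column, to see exactly what the historical rows contain before relying on it for compliance.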
Application-Period Tables
● Creating an Application-Period table:
CREATE OR REPLACE TABLE reservation (
    uuid UUID DEFAULT UUID(),
    bungalow_name VARCHAR(100) NOT NULL,
    client_name VARCHAR(100) NOT NULL,
    start_date DATE,
    end_date DATE,
    PRIMARY KEY (uuid, start_date),
    PERIOD FOR reservation (start_date, end_date)
);
Application-Period Tables
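When events must not overlap, the period can be part of a unique constraint. The following is a sketch of a variant of the table above, assuming a MariaDB version that supports `WITHOUT OVERLAPS` (to my knowledge 10.5.3 or later); the table name `reservation_no_overlap` and the sample data are hypothetical.

```sql
CREATE OR REPLACE TABLE reservation_no_overlap (
    bungalow_name VARCHAR(100) NOT NULL,
    client_name VARCHAR(100) NOT NULL,
    start_date DATE,
    end_date DATE,
    PERIOD FOR reservation (start_date, end_date),
    -- The same bungalow cannot have two reservations covering the same day.
    UNIQUE (bungalow_name, reservation WITHOUT OVERLAPS)
);

INSERT INTO reservation_no_overlap
VALUES ('Heron', 'Alice', '2024-07-01', '2024-07-08');

-- Expected to be rejected with a duplicate-key error:
-- it overlaps Alice's stay in the same bungalow.
INSERT INTO reservation_no_overlap
VALUES ('Heron', 'Bob', '2024-07-05', '2024-07-12');

-- Expected to succeed: the end date is exclusive, so a stay can
-- start on the day the previous one ends.
INSERT INTO reservation_no_overlap
VALUES ('Heron', 'Carol', '2024-07-08', '2024-07-10');
```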
● If you don't use periods explicitly, it will be a regular table
● But you can manipulate periods with this syntax:
○ DELETE FROM <table_name>
    FOR PORTION OF <period_name>
    FROM <date1> TO <date2>
○ UPDATE <table_name>
    FOR PORTION OF <period_name>
    FROM <date1> TO <date2>
    SET <column> = <value>
Application-Period Tables
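A worked sketch of how these statements shrink or split periods, assuming a hypothetical `booking` table with a period named `stay` (names, dates and clients are illustrative only):

```sql
CREATE OR REPLACE TABLE booking (
    bungalow_name VARCHAR(100) NOT NULL,
    client_name VARCHAR(100) NOT NULL,
    start_date DATE,
    end_date DATE,
    PERIOD FOR stay (start_date, end_date)
);

-- One booking covering two weeks.
INSERT INTO booking
VALUES ('Heron', 'Alice', '2024-07-01', '2024-07-15');

-- Cancel the middle of the stay. The single row is expected to be
-- split into two: 2024-07-01 to 2024-07-05 and 2024-07-10 to 2024-07-15.
DELETE FROM booking
    FOR PORTION OF stay FROM '2024-07-05' TO '2024-07-10'
    WHERE bungalow_name = 'Heron';

-- Hand the first two days over to another client. Only that portion
-- is updated; the rest of the affected row keeps the old values.
UPDATE booking
    FOR PORTION OF stay FROM '2024-07-01' TO '2024-07-03'
    SET client_name = 'Bob'
    WHERE bungalow_name = 'Heron';

-- Three rows are expected at this point, one per resulting period.
SELECT bungalow_name, client_name, start_date, end_date
FROM booking
ORDER BY start_date;
```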
Bitemporal Tables
● Combine the syntaxes of sysver and application-period tables to
obtain a bitemporal table
● This table will store two separate pairs of timestamps:
○ When the row was physically inserted/deleted/updated
○ The boundaries of the represented period
Bitemporal Tables
Example:
● 2018/01/10 - Customer registers, she lives in Glasgow
● 2022/05/01 - Customer relocates to Inverness
● 2022/06/01 - Customer orders a product
● 2022/06/02 - Customer changes her address in her profile, and
correctly dates the change to 2022/05/01
Customer never received the parcel. Our temporal table allows us
to track this chronology and point out that
the customer communicated her address change too late.
Bitemporal Tables
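A sketch of how this chronology could be modelled and queried, assuming a hypothetical `customer_address` table that is populated live as the events happen (all table, column and period names are illustrative). Period columns cannot hold NULL, so an open-ended residence would need a far-future end date such as '9999-12-31'.

```sql
CREATE TABLE customer_address (
    customer_id INT NOT NULL,
    city VARCHAR(100) NOT NULL,
    -- Application period: when the customer actually lived there.
    lived_from DATE,
    lived_until DATE,
    -- System period: when the database held this version of the row.
    valid_since TIMESTAMP(6) GENERATED ALWAYS AS ROW START INVISIBLE,
    valid_until TIMESTAMP(6) GENERATED ALWAYS AS ROW END INVISIBLE,
    PERIOD FOR residence (lived_from, lived_until),
    PERIOD FOR SYSTEM_TIME (valid_since, valid_until)
) WITH SYSTEM VERSIONING;

-- What we know today about where she lived on the order date.
-- With the chronology above this would be expected to return Inverness.
SELECT city
FROM customer_address
WHERE customer_id = 1
  AND lived_from <= '2022-06-01'
  AND lived_until > '2022-06-01';

-- What the database said on the order date, before her correction
-- was recorded. This would be expected to return Glasgow.
SELECT city
FROM customer_address
    FOR SYSTEM_TIME AS OF TIMESTAMP '2022-06-01 23:59:59'
WHERE customer_id = 1
  AND lived_from <= '2022-06-01'
  AND lived_until > '2022-06-01';
```

The difference between the two results is what shows that the address change was communicated after the order was placed.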
A Note on MindsDB
● If you have built temporal tables, you have something similar to
(but slightly more complex than) a time series
● Did you know that you can query future data?
A Note on MindsDB
● MindsDB is an AI-based virtual database
● It connects to a huge range of external data sources,
including MariaDB
● It accepts SQL queries
● The results are calculated using Machine Learning algorithms
A Note on MindsDB
So, for example, if you have data about your sales in the last 2
years, you can obtain a forecast about the next 6 months
Vettabase is a MindsDB partner.
We maintain their MySQL integration.
A Note on MindsDB