4. Window Functions
• Window functions were introduced in SQL:2003 and last extended in SQL:2011, the latest version of the standard.
• A window function computes its result over a “window” of rows related to the current row, which can make query execution more efficient.
• Identified by the OVER clause
• Window functions
– can help eliminate expensive subqueries
– can help eliminate self-joins
– make queries more readable
– make queries faster
More efficient and readable queries, especially powerful when analyzing data
6. Window Functions
Data Series Example
SELECT
  time, value
FROM data_points
ORDER BY time;

SELECT
  time, value,
  avg(value) OVER (ORDER BY time
                   ROWS BETWEEN 6 PRECEDING
                            AND 6 FOLLOWING)
FROM data_points
ORDER BY time;
7. Window Functions
Data Series Example
Query without window function:

SELECT timestamp, transaction_id, customer_id, amount,
       (SELECT sum(amount)
        FROM transactions AS t2
        WHERE t2.customer_id = t1.customer_id AND
              t2.timestamp <= t1.timestamp) AS balance
FROM transactions AS t1
ORDER BY customer_id, timestamp;

Query using window function:

SELECT timestamp, transaction_id, customer_id, amount,
       sum(amount) OVER (PARTITION BY customer_id
                         ORDER BY timestamp
                         ROWS BETWEEN UNBOUNDED PRECEDING
                                  AND CURRENT ROW) AS balance
FROM transactions AS t1
ORDER BY customer_id, timestamp;
8. Window Functions
Example: given a set of bank transactions, compute the account
balance after each transaction.

*Test done in a developer environment

#Rows        Regular SQL  Regular SQL +    Window Functions
             (seconds)    Index (seconds)  (seconds)
10 000       0.29         0.01             0.02
100 000      2.91         0.09             0.16
1 000 000    29.1         2.86             3.04
10 000 000   346.3        90.97            43.17
100 000 000  4357.2       813.2            514.24
9. Window Functions
• MariaDB supports
– ROWS and RANGE-type frames
– "Streamable" window functions: ROW_NUMBER, RANK,
DENSE_RANK
– Window functions that can be streamed once the number of
rows in partition is known: PERCENT_RANK,
CUME_DIST, NTILE
– Aggregate functions that are currently supported as window
functions are: COUNT, SUM, AVG, BIT_OR, BIT_AND,
BIT_XOR.
– Aggregate functions with the DISTINCT specifier (e.g.
COUNT( DISTINCT x)) are not supported as window
functions.
Supported Functions
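As a sketch of how the ranking functions listed above behave (the employees table and its columns are illustrative, not part of MariaDB):

```sql
-- Rank rows within each partition; ROW_NUMBER assigns unique
-- positions, while RANK leaves gaps after ties.
SELECT department, name, salary,
       ROW_NUMBER() OVER (PARTITION BY department
                          ORDER BY salary DESC) AS row_num,
       RANK() OVER (PARTITION BY department
                    ORDER BY salary DESC) AS salary_rank
FROM employees;
```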
11. CTE
• Hierarchical and recursive queries were introduced in SQL:1999 and are implemented as common table expressions (CTEs)
• A common table expression is a named, temporary result set derived from a simple query
• A CTE
– is identified by a WITH clause
– Similar to derived tables in the FROM clause
– More expressive and provides cleaner code
– Can produce more efficient query plans
– Can be used with SELECT and EXPLAIN
Refer to a subquery expression many times in a query: a temporary table that only exists for the duration of the query.
12. Common Table Expression
Example
-- Define CTE
WITH sales_product_year AS (
  SELECT
    product,
    year(ship_date) AS year,
    SUM(price) AS total_amt
  FROM item_sales
  GROUP BY product, year
)
-- Use CTE
SELECT *
FROM
  sales_product_year CUR,
  sales_product_year PREV
WHERE
  CUR.product = PREV.product AND
  CUR.year = PREV.year + 1 AND
  CUR.total_amt > PREV.total_amt;
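The SQL:1999 feature set mentioned above also covers recursive queries. A minimal sketch of a recursive CTE, assuming a hypothetical employees(id, name, manager_id) table:

```sql
-- Walk an org chart top-down: the anchor part selects the root,
-- the recursive part joins children to rows found so far.
WITH RECURSIVE org AS (
  SELECT id, name, manager_id, 1 AS depth
  FROM employees
  WHERE manager_id IS NULL
  UNION ALL
  SELECT e.id, e.name, e.manager_id, org.depth + 1
  FROM employees AS e
  JOIN org ON e.manager_id = org.id
)
SELECT name, depth FROM org ORDER BY depth;
```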
14. JSON
• JSON (JavaScript Object Notation) is a text-based, platform-independent data exchange format
• MariaDB Server provides
– JSON functions, which give users great flexibility when working with JSON-based data
– storage in string-based data types like VARCHAR
• Storing JSON data in MariaDB Server combines relational data with the world of NoSQL
• MariaDB Server currently supports 24 JSON functions to evaluate, query and modify JSON-formatted strings
JSON and GeoJSON functions for MariaDB
15. JSON Example
Working with JSON Data
CREATE TABLE jsontable (
id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
jsonfield VARCHAR(1024),
category VARCHAR(20) as
(JSON_VALUE(jsonfield,'$.category')),
KEY jsonkey (category),
CHECK (JSON_VALID(jsonfield)));
Field for JSON data
Virtual column with index on a JSON key
Constraint to validate JSON format
16. JSON Example
Insert JSON Data
INSERT INTO jsontable (id,jsonfield) VALUES(NULL,
'{"category": "Software","product": "MaxScale",
"version": "2.0.0"}');
INSERT INTO jsontable (id,jsonfield) VALUES(NULL,
'{"category": "Service","service_type": "Training",
"class": "DBA"}');
SELECT * FROM jsontable\G
******************* 1. row ***********************
id: 1
jsonfield: {"category": "Software","product":
"MaxScale", "version": "2.0.0"}
category: Software
Insert like a string
JSON_VALUE-based virtual column entry
17. JSON Example
Validating JSON Format
INSERT INTO jsontable (id,jsonfield) VALUES(NULL,
'{"category" - "Software","product": "MaxScale",
"version": "2.0.0"}');
ERROR 4025 (23000): CONSTRAINT `CONSTRAINT_1`
failed for `test`.`jsontable`
Insert non-JSON format
Validation: JSON_VALID fails
18. JSON Function Examples
SELECT JSON_ARRAY(56, 3.1416, 'My name is "Foo"', NULL);
+--------------------------------------------------+
| JSON_ARRAY(56, 3.1416, 'My name is "Foo"', NULL) |
+--------------------------------------------------+
| [56, 3.1416, "My name is \"Foo\"", null]         |
+--------------------------------------------------+
SELECT JSON_OBJECT("id", 1, "name", "Monty");
+---------------------------------------+
| JSON_OBJECT("id", 1, "name", "Monty") |
+---------------------------------------+
| {"id": 1, "name": "Monty"} |
+---------------------------------------+
JSON array from listed values
JSON object from key/value pairs
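Two more of the 24 JSON functions mentioned earlier, shown as a sketch (the inline JSON documents are illustrative):

```sql
-- JSON_EXTRACT returns the JSON fragment at a path,
-- JSON_VALUE returns a scalar value at a path.
SELECT JSON_EXTRACT('{"id": 1, "items": [10, 20]}', '$.items');
-- [10, 20]
SELECT JSON_VALUE('{"id": 1, "items": [10, 20]}', '$.id');
-- 1
```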
20. GeoJSON
• With MariaDB Server 10.2 GeoJSON functions have been
added to the existing functions used to work with spatial
data types, like POINT, LINESTRING, POLYGON
• ST_AsGeoJSON is used to convert a geometric data type
into a GeoJSON format
• ST_GeomFromGeoJSON is used to convert a GeoJSON
based description into a geometric format
• The GeoJSON format follows the open standard for
encoding geometric features - http://geojson.org
GeoJSON functions for converting a geometry to a GeoJSON element or vice versa
21. GeoJSON Function Examples
SELECT ST_AsGeoJSON(ST_GeomFromText('POINT(5.3 7.2)'));
+-------------------------------------------------+
| ST_AsGeoJSON(ST_GeomFromText('POINT(5.3 7.2)')) |
+-------------------------------------------------+
| {"type": "Point", "coordinates": [5.3, 7.2]} |
+-------------------------------------------------+
SET @j = '{ "type": "Point", "coordinates": [5.3, 15.0]}';
SELECT ST_AsText(ST_GeomFromGeoJSON(@j));
+-----------------------------------+
| ST_AsText(ST_GeomFromGeoJSON(@j)) |
+-----------------------------------+
| POINT(5.3 15) |
+-----------------------------------+
ST_AsGeoJSON returns the given geometry as a GeoJSON element
ST_GeomFromGeoJSON returns a geometry object for a given GeoJSON input
23. New Replication Features
• Delayed Replication
– allows specifying a slave to lag behind the master
– CHANGE MASTER TO master_delay=3600;
• Restrict the speed of reading binlog from Master
– The option read_binlog_speed_limit can be used to
restrict the speed to read the binlog from the master
– Reduces the load on the master when slaves need to
catch up
• Compressed Binary Log
– Reduced Binary Log size and network traffic
– Binary Log events are compressed on the master
– The slave I/O thread uncompresses them before writing them into the Relay Log
– Compression is controlled by log_bin_compress and
log_bin_compress_min_len
New replication features help to reduce load on the master, disk space, and network bandwidth usage
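A brief sketch of how these settings might be applied; the values are illustrative, and read_binlog_speed_limit and the compression variables can also be set in the server configuration file:

```sql
-- On a slave (replication stopped): lag one hour behind the master,
-- as on the slide above
CHANGE MASTER TO master_delay = 3600;

-- On the master: compress binary log events; the minimum event
-- length shown here is an illustrative value
SET GLOBAL log_bin_compress = ON;
SET GLOBAL log_bin_compress_min_len = 256;
```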
25. Database Compatibility
• Multi-Trigger Support per Type
– Beginning with MariaDB Server 10.2, multiple triggers
of the same type (BEFORE, AFTER) can be created per
table
– The CREATE TRIGGER statement now allows defining [{ FOLLOWS | PRECEDES } other_trigger_name ] when creating a trigger
• CHECK constraints allow data validation per column
– CHECK constraints help a DBA enforce data consistency at the database server level without the need to implement triggers
• EXECUTE IMMEDIATE for dynamic SQL combines
prepare, execute and deallocate prepare into one
action
• DECIMAL increased from 30 to 38 digits
• Support of expressions for DEFAULT
Increased database compatibility by reducing limitations; workflows can now be built based on triggers
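The last two points can be sketched as follows (the query and table names are illustrative):

```sql
-- EXECUTE IMMEDIATE: prepare, execute and deallocate in one step
EXECUTE IMMEDIATE
  'SELECT COUNT(*) FROM information_schema.TABLES WHERE table_schema = ?'
  USING 'test';

-- DEFAULT with an expression
CREATE TABLE audit_log (
  id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
  note VARCHAR(100),
  expires DATETIME DEFAULT (CURRENT_TIMESTAMP + INTERVAL 7 DAY)
);
```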
26. CHECK Constraints
Examples
CREATE TABLE t1 (a INT CHECK (a>2), b INT CHECK (b>2),
CONSTRAINT a_greater CHECK (a>b));
CREATE TABLE t2 (name VARCHAR(30) CHECK
(CHAR_LENGTH(name)>2), start_date DATE,
end_date DATE CHECK (start_date IS NULL OR end_date IS
NULL OR start_date<end_date));
CREATE TABLE jsontable (
id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
jsonfield VARCHAR(1024),
category VARCHAR(20) as
(JSON_VALUE(jsonfield,'$.category')),
KEY jsonkey (category),
CHECK (JSON_VALID(jsonfield)));
Numeric constraints and comparisons
Date comparisons and character length
Validation by using functions
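Continuing the t1 example above, rows that violate a constraint are rejected:

```sql
INSERT INTO t1 (a, b) VALUES (5, 3);  -- accepted: a>2, b>2, a>b
INSERT INTO t1 (a, b) VALUES (3, 5);  -- rejected: a>b does not hold
-- ERROR 4025 (23000): CONSTRAINT `a_greater` failed for `test`.`t1`
```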
28. Enhancements from MySQL InnoDB 5.7
• Some of the InnoDB 5.7 enhancements are:
– Indexes for spatial data types
– VARCHAR size can be increased using an in-place
ALTER TABLE, if the length stays between 0 and 255 or
higher than 255
– improved DDL performance for InnoDB temporary
tables
– InnoDB internal improvements and better control via
parameters
– On Fusion-io Non-Volatile Memory (NVM) file
systems the doublewrite buffer is automatically disabled
for system tablespace files
– Optimized crash recovery (tablespace discovery)
MySQL Server 5.7 introduced enhancements to InnoDB; selected changes have been merged into MariaDB Server
29. MyRocks Storage Engine
• For workloads requiring higher compression and IO
efficiency
• Higher performance for web scale type applications
• LSM (Log-structured merge) architecture allows very
efficient data ingestion
• A selection of the available features
– Column Families
– Transactions and WriteBatchWithIndex
– Backup and Checkpoints
– Merge Operators
– Manual Compactions Run in Parallel with Automatic
Compactions
– Persistent Cache
– Bulk loading
– Delete files in range
– Pin iterator key/value
MyRocks in MariaDB: compression and IO efficiency based on Facebook’s MyRocks
31. Performance Optimizations
• Enhanced Performance for creating Connections
– The way to create new connections has been optimized
– This is especially good for applications that use non-persistent connections
• Indexes for Virtual Columns
– Before MariaDB Server 10.2, indexes could not be defined on virtual columns
– With index support, virtual columns can also be used efficiently for searching
• New Option to define a directory for InnoDB temporary
files
– By using a separate directory for InnoDB temporary files, separate disks and disk types can be used, reducing IO waits
Several changes have been added to MariaDB Server to improve performance
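A sketch of the temporary-file option; the path is illustrative, and the variable can also be set in the server configuration file:

```sql
-- Put InnoDB temporary files on a separate (e.g. faster) disk
SET GLOBAL innodb_tmpdir = '/mnt/fastdisk/innodb_tmp';
```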
33. New Security-related User Options
• Per User Server Load Limitations
– Limit the number of queries, updates or connections the user can make per hour
– Limit the number of simultaneous connections that the user can hold
• Enforced TLS connections
– SSL/TLS encryption options have been introduced
for users, permitting only encrypted connections for a
user
– CREATE and ALTER USER now include the optional
parameters SSL, X509, ISSUER, SUBJECT, CIPHER
Limit the load a user can add to the server; enforce secure connections
34. Security Syntax Enhancements
Examples
CREATE USER foo
WITH MAX_QUERIES_PER_HOUR 10
MAX_UPDATES_PER_HOUR 20
MAX_CONNECTIONS_PER_HOUR 30
MAX_USER_CONNECTIONS 40;
CREATE USER 'foo4'@'test'
REQUIRE ISSUER 'foo_issuer'
SUBJECT 'foo_subject'
CIPHER 'text';
Per-user server load limitation
Enforced TLS connections
36. New options for DBAs
• New functions for User Management
– ALTER USER
– SHOW CREATE USER
• Enhanced information from EXPLAIN
– EXPLAIN FORMAT=JSON for detailed information on sort_key and outer_ref_condition
• User defined variables added to Information Schema
– Information schema plugin to report all user defined
variables via the Information Schema
USER_VARIABLES Table
• Binary Log based Flashback
– The MariaDB Flashback feature allows DBAs to roll back
instances, databases or tables to a given timestamp
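A short sketch of the new user-management statements; the account name and the limit value are illustrative:

```sql
-- Show how an existing account was defined
SHOW CREATE USER 'foo'@'localhost';

-- Adjust an existing account, here a per-hour query limit
ALTER USER 'foo'@'localhost' WITH MAX_QUERIES_PER_HOUR 100;
```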
37. Summary - What’s New
Analytics SQL ● Window Functions
● Common Table Expressions (CTE)
JSON ● JSON Functions
● GeoJSON Functions
Replication ● Delayed Replication
● Restrict the speed of reading binlog from Master
● Compressed Binary Log
Database Compatibility ● Multi-Trigger Support
● CHECK Constraint Expression Support
● EXECUTE IMMEDIATE statement
● Support for DEFAULT with expressions
● DECIMAL increased from 30 to 38 digits
Storage Engine
Enhancements
● New Storage Engine MyRocks based on RocksDB from Facebook
● Enhancements from MySQL InnoDB 5.7
● Enable NUMA interleave for InnoDB
38. Summary - What’s New
Security ● Per User Server Load Limitations
● Enforced TLS Connections
Administration ● New functions for User Management
● Enhanced information from EXPLAIN
● User defined variables added to Information Schema
● Binary Log based Flashback
Performance ● Enhanced Performance for creating Connections
● Indexes for Virtual Columns
● New Option to define a directory for InnoDB temporary files
Server-Internal
Optimisations
● Internal Use of MariaDB Connector/C
● Optimizer Enhancements
● Non-Locking ANALYZE TABLE
Other Enhancements ● Lifted limitations for Virtual Computed Columns
● Subquery-Support for Views
● Multiple Use of Temporary Tables in Query
40. MariaDB MaxScale
Database Proxy platform for
● Scalability
● Security
● High Availability
● Data Streaming
● Latest GA Version 2.0
● Soon to be GA version 2.1
● Part of MariaDB Enterprise Offering
41. MariaDB MaxScale Concept
Generic Core
▪ Multi-threaded
▪ e-poll based
▪ Stateless
▪ Shares the thread pool
Flexible, easy to write plug-ins for
▪ Protocol support
▪ Authentication
▪ Database monitoring
▪ Load balancing and routing
▪ Query Transformation and logging
● Insulates client applications from the complexities of the backend database cluster
● Simplifies replication from one database to other databases
42. What is new in MariaDB MaxScale 2.1?
• Performance
– Up to 2.8x improvement over MaxScale 2.0
• Security
– Encrypted binlog server files
– SSL between binlog server and Master/Slave
– LDAP/GSSAPI Authentication support
– Selective Data Masking
– Maximum rows returned limit
– Prepared Statement filtering by database firewall
• Scalability
– Aurora Cluster support
– Consistent Critical Read with Master Pinning
– Query Cache Filter
– Streaming Insert Plugin
• Ease of use
– Dynamic Configuration of server, monitor, listener and firewall rules
43. Up to 2.8x Performance Gain

Chart: MaxScale 2.1 performance gain, QPS vs. number of connections, for read-only and read-write workloads.
44. Security in MaxScale 2.1
• Existing in MaxScale 2.0
– End to End SSL for transport layer security between applications, proxy & databases
– Black and white list with Database Firewall Filter for SQL Injection protection
– Connection Rate Limitation for DDoS protection
• New in MaxScale 2.1
– Encrypted binlog server files
– SSL between binlog server and Master/Slave
– LDAP/GSSAPI Authentication support
– Selective Data Masking
– Maximum rows returned limit
– Prepared Statement and Functions filtering by database firewall
45. Security
• Secured Binlog Server
– Encrypt binlog server files on MaxScale
– SSL between binlog server and Master/Slave
• Secured user access
– LDAP/GSSAPI for secure single sign-on across OS platforms (Windows, Linux), applications and databases
46. Security
• HIPAA/PCI Compliance: Selective Data Masking based on column name
– Database name, Table name classifier may be provided
• commerceDb.customerTbl.creditcardNum
• customerTbl.creditcardNum
• credicardNum
SELECT Name, creditcardNum, balance
FROM customerTbl
WHERE id=1001
Name        creditcardNum  balance
---------------------------------------
John Smith  xxxxxxxxxx     1201.07
47. Security
• DDoS Protection
• Maximum rows filter
– Returns zero rows to the client if the number of rows in the result set exceeds the configured maximum
– Returns zero rows to the client if the size of the result set exceeds the configured maximum size in KB
Example: with Max Rows Limit = 500, a query returning 1000 rows yields no data; the client receives "Query failed: 1141 Error: No rows returned".
48. Security
• SQL Injection Protection
– Prepared Statement filtering by database firewall
• Apply database firewall rules to prepared statements
– Function name filtering by database firewall
• Black/White list query based on specific function being called in the SQL
49. Scalability with MaxScale 2.1
• Existing in MaxScale 2.0
– Transaction Scaling to support user growth and simplify applications
• MariaDB Master/Slave and MariaDB Galera Cluster
– Load balancing
– Database aware dynamic query routing
– Traffic profile based routing
– Replication Scaling to support web-scale applications’ user base
• Binlog Server for horizontal scaling of slaves in Master/Slave architecture
– Multi-tenant database scaling to transparently grow tenants and data volume
• Schema sharding
– Connection Rate Limitation
• New in MaxScale 2.1
– Aurora Cluster Monitor
– Multi-master and Failover mode for MySQL Monitor
– Consistent Critical Read Filter: Master Pinning
50. Aurora Cluster Monitoring and Routing
• Aurora Cluster Monitor
– Detect read replicas and write node in Aurora Cluster
– Supports launchable scripts on monitored events like
other monitors
– Read-Write Splitting, Read Connect Routing and Schema Routing can now be used with Aurora Cluster
51. Consistent Critical Read
• Master Pinning: Consistent Critical Read Filter
– Detects a statement that would modify the database and routes all subsequent statements to the master server, where data is guaranteed to be up to date
52. Query Cache
• New in MaxScale 2.1
– Query Cache Filter
• Cache query results in MaxScale for a configurable timeout
• Within that timeout, results are returned from the cache instead of being fetched from the server
• Configured like any other filter; caching rules can be specified in a separate JSON file
In-memory LRU cache.
53. Batch Insert
• New in MaxScale 2.1
– Streaming Insert Plugin
• Convert all INSERT statements done inside an explicit transaction into LOAD DATA LOCAL INFILE
• Useful for client applications doing many inserts within a transaction
• Improves performance of data load
INSERT INTO test.t1 VALUES
  ('John Doe', 1234, '1234-5678-ABCD'),
  ('Jane Doe', 4321, '8765-4321-DCBA');

'John Doe', 1234, '1234-5678-ABCD'
'Jane Doe', 4321, '8765-4321-DCBA'

LOAD DATA LOCAL INFILE 'maxscale.data' INTO TABLE
test.t1 FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
54. Ease of Use
• New in MaxScale 2.1
– Dynamically configure server, listener, monitor
• Add/Remove new server to configuration
• Add/Remove new listener to configuration
• Add/Remove monitor to configuration
• No MaxScale restart required
– Dynamically configure database firewall rules
• Modify firewall rules while MaxScale is running
– Support for text wildcards in hostnames
– New global setting log_throttling to throttle flooding of log file
55. Future
• MariaDB 10.2 Support
• Cross Data Center HA enablement
• PAMD Authentication
• More performance enhancement
• Additional protocols
• Tighter integration with MariaDB
ColumnStore