MariaDB workshop
Alex Chistyakov, Git in Sky
Outline
- Tables and DDL
- Queries and DML
- Indexes and compound indexes
- Transactions and how they work, isolation levels
- Authorization and authentication, client protocol
- Basics of performance monitoring
- Notion of replication, types of replication
- Traditional replication in details
- Galera cluster and how it works
- MMM, PRM and query proxying
What’s in a box?
- Ubuntu 16.04.2
- Python 2.7.12
- MariaDB 10.0.29
- Sakila DB, Employees DB
- Percona Toolkit 2.2.16
- Anemometer
How to use Vagrant
- Create an empty folder
- Download https://goo.gl/ap6r6E there (rename it to
‘Vagrantfile’)
- Run ‘vagrant up’ in that folder
- Wait until a VM starts
- Run ‘vagrant ssh’ to get in
- My .mysql_history: https://goo.gl/AyrTW7
What is a table?
- A collection of related data
- Consists of columns and rows
The DDL
- Manipulates the database structure (also called schema)
DDL statements
- CREATE
- ALTER
- DROP
- TRUNCATE
- RENAME
How to create a table?
CREATE TABLE language (
language_id TINYINT UNSIGNED NOT NULL AUTO_INCREMENT,
name CHAR(20) NOT NULL,
last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (language_id)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;
Primary keys
- Identify a record uniquely
- So, adding two equal keys is not possible
- Can be natural like “passport number”
- Or surrogate
- Surrogate keys are auto-generated on the DB side
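The uniqueness rule can be tried out in a few lines. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MariaDB so it runs without a server (an assumption of this example; MariaDB would report the violation as error 1062 rather than raise this exception class).

```python
import sqlite3

# SQLite stands in for MariaDB here (assumption); the rule shown is the same.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE language ("
    " language_id INTEGER PRIMARY KEY,"  # surrogate key, generated by the DB
    " name TEXT NOT NULL)"
)
conn.execute("INSERT INTO language (name) VALUES ('English')")
first_id = conn.execute("SELECT language_id FROM language").fetchone()[0]

# Re-using an existing key value violates the primary key.
try:
    conn.execute(
        "INSERT INTO language (language_id, name) VALUES (?, 'Italian')",
        (first_id,),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(first_id, duplicate_rejected)
```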
Natural PKs can be composite
CREATE TABLE film_actor (
actor_id SMALLINT UNSIGNED NOT NULL,
film_id SMALLINT UNSIGNED NOT NULL,
last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (actor_id,film_id),
KEY idx_fk_film_id (`film_id`),
CONSTRAINT fk_film_actor_actor FOREIGN KEY (actor_id) REFERENCES actor (actor_id),
CONSTRAINT fk_film_actor_film FOREIGN KEY (film_id) REFERENCES film (film_id)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;
Autoincrement primary keys
- Are surrogate
- Are 1, 2, 3, 4 or 8 bytes long
- BTW, INT(10) is 4 bytes long (the “10” is only a display width)
- Are incremented on every INSERT
- Should always be used
- BTW, an InnoDB table is a clustered index* around its PK
- If no explicit PK (or NOT NULL unique key) exists, a hidden 6-byte row ID will be used
Exercise #1
- Create a table
A trivial SELECT query
- SELECT * FROM employees WHERE hire_date='1986-06-26'
- Please never use “SELECT *”; always select specific
columns!
- A slightly better version:
- SELECT emp_no, first_name, last_name FROM employees
WHERE hire_date='1986-06-26';
Using a single table is impractical*
- Four types of JOINs:
- INNER JOIN
- LEFT OUTER JOIN
- RIGHT OUTER JOIN
- CROSS JOIN
- Left and right outer joins are equivalent: swapping the table order turns one into the other
Left outer join example
- SELECT e.emp_no, first_name, last_name, salary FROM
employees e LEFT OUTER JOIN salaries s on e.emp_no =
s.emp_no WHERE hire_date='1986-06-26';
- This query selects an employee even if no payment
records exist in the salaries table
Aggregate queries and GROUP BY
- SELECT e.emp_no, first_name, last_name, SUM(salary)
FROM employees e LEFT OUTER JOIN salaries s on
e.emp_no = s.emp_no WHERE hire_date='1986-06-26'
GROUP BY e.emp_no;
How to get people w/no salary recs
- INSERT INTO employees(emp_no, first_name, last_name)
VALUES(600000, 'Alex', 'Chistyakov');
- Let’s count number of salary records using COUNT()
aggregate function
HAVING is like WHERE, but for groups
- SELECT e.emp_no, first_name, last_name,
COUNT(salary) FROM employees e LEFT OUTER JOIN
salaries s on e.emp_no = s.emp_no GROUP BY e.emp_no
HAVING COUNT(salary) = 0;
Another way to do the same
- SELECT e.emp_no, first_name, last_name FROM
employees e LEFT OUTER JOIN salaries s ON e.emp_no =
s.emp_no WHERE s.emp_no IS NULL;
- This query is usually cheaper*
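Both ways of finding employees without salary records can be compared side by side. The sketch below is a hedged illustration: it uses Python's sqlite3 module in place of MariaDB and a handful of made-up sample rows instead of the real Employees DB. The anti-join variant selects no aggregate, so it needs no GROUP BY.

```python
import sqlite3

# SQLite stands in for MariaDB; the rows are illustrative sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE salaries  (emp_no INTEGER, salary INTEGER);
INSERT INTO employees VALUES (10001, 'Georgi', 'Facello'),
                             (600000, 'Alex', 'Chistyakov');
INSERT INTO salaries VALUES (10001, 60117), (10001, 62102);
""")

# Variant 1: aggregate per employee, keep groups with zero salary rows.
having_rows = conn.execute("""
    SELECT e.emp_no, first_name, last_name, COUNT(salary)
    FROM employees e LEFT OUTER JOIN salaries s ON e.emp_no = s.emp_no
    GROUP BY e.emp_no
    HAVING COUNT(salary) = 0
""").fetchall()

# Variant 2: anti-join, keep only rows where the join found no match.
anti_join_rows = conn.execute("""
    SELECT e.emp_no, first_name, last_name
    FROM employees e LEFT OUTER JOIN salaries s ON e.emp_no = s.emp_no
    WHERE s.emp_no IS NULL
""").fetchall()

print(having_rows)
print(anti_join_rows)
```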
Exercise #2
- Write a SELECT query which gets all employees whose salary
records sum to more than 40000
Why indexes?
- Latency Numbers Every Programmer Should Know:
https://goo.gl/v4CEWU
- Indexes help to avoid unnecessary disk operations
How do indexes work?
- Index is a data structure optimized for search
- There are several types of indexes: hash indexes, B-tree
indexes
- Hash indexes find rows by exact match
- B-tree indexes can also find ranges
- Of these two types, InnoDB and Aria support B-tree indexes only
B-tree index
- “B” stands for “balanced”, not for “binary”
“SQL Tuning” by Dan Tow
- https://goo.gl/jRbD5H
- A must read for every DBA!
- Discusses how to build effective
indexes in great detail
- Unfortunately does not cover
aggregate functions and sorting
Column cardinality
- Cardinality is a measure of data uniqueness
- Columns with more unique values have higher cardinality
- Columns with few unique values have lower cardinality
A composite index
- Covers two or more columns
- Lets the engine find rows by applying filters one after another,
column by column
- Order of columns in a composite index matters!
Index selectivity
- The ability of a condition to filter rows out
- Is expressed as the number of rows left after filtering
divided by the total number of rows
- Lower values mean greater selectivity
- Some authors define selectivity the other way round: the total number of
rows divided by the resulting number of rows
Building a good composite index
- Columns with higher individual selectivity should go first
in a composite index
- Non-selective columns should go last
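The selectivity arithmetic and the resulting column order can be sketched in plain Python. The rows, the column names and the query conditions below are hypothetical; selectivity is taken as rows left after filtering divided by all rows, so a smaller number means a more selective condition.

```python
# Hypothetical sample rows: (gender, hire_year, last_name).
rows = [
    ("M", 1986, "Facello"), ("F", 1985, "Simmel"), ("M", 1987, "Bamford"),
    ("F", 1986, "Koblick"), ("M", 1989, "Maliniak"), ("F", 1989, "Preusig"),
    ("M", 1990, "Zielinski"), ("F", 1994, "Kalloufi"),
]
COLUMNS = {"gender": 0, "hire_year": 1, "last_name": 2}


def selectivity(column, value):
    """Rows left after `column = value`, divided by all rows (lower is better)."""
    idx = COLUMNS[column]
    matching = sum(1 for row in rows if row[idx] == value)
    return matching / len(rows)


# Conditions of a hypothetical query:
# WHERE gender = 'M' AND hire_year = 1986 AND last_name = 'Facello'
conditions = {"gender": "M", "hire_year": 1986, "last_name": "Facello"}
per_column = {col: selectivity(col, val) for col, val in conditions.items()}

# Most selective (smallest ratio) first, least selective last.
index_order = sorted(per_column, key=per_column.get)
print(per_column)
print(index_order)
```

With this data the suggested composite index would be (last_name, hire_year, gender); real tables need the same measurement on real value distributions.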
Functional indexes
- Original MySQL does not have functional indexes
- MariaDB adds support for virtual columns
- Functional indexes can be created over virtual columns
Virtual column example
- ALTER TABLE employees ADD lower_last_name
varchar(16) GENERATED ALWAYS AS (lower(last_name))
PERSISTENT;
- CREATE INDEX lower_last_name ON
employees(lower_last_name);
- SELECT e.emp_no, first_name, last_name FROM
employees e WHERE lower_last_name LIKE 'chistya%';
Let’s add %
- SELECT e.emp_no, first_name, last_name FROM
employees e WHERE lower_last_name LIKE '%chistya%';
- This will always lead to a full scan in current MariaDB
and MySQL implementations
- A full-text search engine should be used instead
- I recommend Sphinx or Solr
Using ORDER BY
- In most real-life cases the sort can’t be served by an index
- Dan Tow doesn’t consider these cases at all
- No good solution exists
Exercise #3
- Write a select which gets all salary records for the
employee w/emp_no = 10001 ordered by amount of the
salary record
- Create a covering index for this query
Things not to do in your life
- Please never ever do ORDER BY RAND()!
- How to do it properly: get a good random number on the
client side
- LIMIT 50 OFFSET 5000000 is the next thing not to do
- How to do it properly: “WHERE emp_no > $last_emp_no ORDER BY emp_no LIMIT 50”
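Paging by the last seen key instead of OFFSET can be sketched as follows. This is an illustration only: sqlite3 stands in for MariaDB, the rows are made up, and `next_page` is a hypothetical helper name.

```python
import sqlite3

# SQLite stands in for MariaDB; rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, last_name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(10000 + i, f"name{i}") for i in range(1, 8)],
)


def next_page(last_emp_no, page_size):
    """Fetch the page that follows `last_emp_no` by seeking on the primary key."""
    return conn.execute(
        "SELECT emp_no, last_name FROM employees"
        " WHERE emp_no > ? ORDER BY emp_no LIMIT ?",
        (last_emp_no, page_size),
    ).fetchall()


pages = []
last_seen = 0
while True:
    page = next_page(last_seen, 3)
    if not page:
        break
    pages.append([emp_no for emp_no, _ in page])
    last_seen = page[-1][0]  # remember where this page ended

print(pages)
```

Each page starts with an index seek on the primary key, so the cost does not grow with the page number the way OFFSET does.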
A bit of history
- MySQL has supported pluggable storage engines for years
- Two most notable were MyISAM and InnoDB
- MyISAM did not support transactions in any way
- InnoDB was transactional
MariaDB engines
- Many mysql.* tables are still in MyISAM format
- Aria storage engine emerged and is optionally
transactional in a crash-proof sense (does not support
explicit transactions though)
- InnoDB fully supports transactions
- I recommend using InnoDB
A bit of InnoDB internals
- /var/lib/mysql/ib_logfile[01] are InnoDB redo logs
- The redo log works as a circular buffer
- It’s not practical to set the InnoDB log size
(innodb_log_file_size) to more than 128M
- Changing it requires a restart
Generic recovery process
- Works the same way for any engine with WAL/redo
log/intent log/whatever
- The service starts after crash
- Log records are examined
- Finished transactions are applied to their final
destinations, unfinished ones are thrown out
- Aria performs these steps when in transaction mode too
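The recovery steps above can be sketched as a toy model. This is not InnoDB's actual log format (real redo records describe physical page changes); the record layout and the `recover` function are assumptions made for illustration.

```python
def recover(log):
    """Replay a redo log: apply committed transactions, drop unfinished ones."""
    pending = {}  # txn id -> list of (key, value) changes seen so far
    data = {}     # the "final destination" (table contents)
    for record in log:
        kind, txn = record[0], record[1]
        if kind == "write":
            _, _, key, value = record
            pending.setdefault(txn, []).append((key, value))
        elif kind == "commit":
            for key, value in pending.pop(txn, []):
                data[key] = value
    discarded = sorted(pending)  # transactions with no commit record
    return data, discarded


# Hypothetical log contents found after a crash.
log = [
    ("write", 1, "a", 10),
    ("write", 2, "b", 20),
    ("commit", 1),
    ("write", 3, "a", 30),
    ("commit", 3),
    ("write", 2, "c", 40),  # txn 2 never committed
]
data, discarded = recover(log)
print(data, discarded)
```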
COMMIT and auto-commit
- Every statement starts and commits an implicit transaction
by default
- SET autocommit = 0; disables this
- With autocommit disabled, the first statement opens a transaction implicitly; START TRANSACTION or BEGIN opens one explicitly
- COMMIT finishes it
- DDL statements perform COMMIT implicitly
ROLLBACK and savepoints
- ROLLBACK is used to abort a transaction
- Transactions can’t be nested but this behavior can be
emulated using savepoints
- SAVEPOINT label
- ROLLBACK TO label
- RELEASE SAVEPOINT label
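Emulating a nested transaction with a savepoint can be sketched like this. SQLite (through Python's sqlite3 module) is used as a stand-in for MariaDB because it accepts the same three statements; the table and the savepoint name `inner_step` are made up.

```python
import sqlite3

# isolation_level=None: the module does not open transactions on its own,
# so BEGIN/COMMIT/SAVEPOINT below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t1 (c1 INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t1 VALUES (1)")
conn.execute("SAVEPOINT inner_step")          # start of the "nested" part
conn.execute("INSERT INTO t1 VALUES (2)")
conn.execute("ROLLBACK TO inner_step")        # undo only the nested part
conn.execute("RELEASE SAVEPOINT inner_step")  # forget the savepoint
conn.execute("INSERT INTO t1 VALUES (3)")
conn.execute("COMMIT")

values = [row[0] for row in conn.execute("SELECT c1 FROM t1 ORDER BY c1")]
print(values)
```

The row inserted between SAVEPOINT and ROLLBACK TO is gone, while the work done before and after it is committed.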
A bit of InnoDB internals - MVCC
- MVCC stands for “Multiversion concurrency control”
- Records are declared dead but still occupy disk space
- The InnoDB system tablespace file never shrinks
- Without innodb_file_per_table, InnoDB keeps everything in that single file, which can’t be compacted
It’s possible to overcome this
- innodb_file_per_table=1
- Every table will occupy a separate file (two separate
files in fact)
- Beware of Unix file descriptors limits!
- ulimit -n 65535 somewhere before starting mysqld_safe
Long transactions can be evil
- DDL statements require an exclusive lock on table
metadata
- An explicit transaction holds a read lock on every table it
uses
- If the number of transactions per second is high enough, the
DDL statement will wait forever
Transactions: logical perspective
- The SQL standard defines 4 transaction isolation levels
- READ UNCOMMITTED
- READ COMMITTED
- REPEATABLE READ
- SERIALIZABLE
READ UNCOMMITTED
- The weakest level
- Allows dirty reads
- A transaction can read uncommitted data written by another
transaction
READ COMMITTED
- Non-repeatable reads are possible
- Phantom reads are possible
REPEATABLE READ
- The default isolation level
- Non-repeatable reads are not possible
- Phantom reads are possible
SERIALIZABLE
- The strongest level
- Non-repeatable reads and phantom reads are impossible
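The four levels can be summarized as a lookup table of the anomalies the SQL standard permits at each level. This is a plain-Python sketch; `weakest_level_preventing` is a hypothetical helper, and a real engine may be stricter than the standard requires.

```python
# Levels from weakest to strongest.
LEVELS = ["READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE"]

# Anomalies each level permits according to the SQL standard.
ALLOWED = {
    "READ UNCOMMITTED": {"dirty read", "non-repeatable read", "phantom read"},
    "READ COMMITTED": {"non-repeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}


def weakest_level_preventing(anomaly):
    """Return the weakest isolation level at which `anomaly` cannot happen."""
    if anomaly not in ALLOWED["READ UNCOMMITTED"]:
        raise ValueError(f"unknown anomaly: {anomaly}")
    for level in LEVELS:
        if anomaly not in ALLOWED[level]:
            return level


summary = {
    anomaly: weakest_level_preventing(anomaly)
    for anomaly in ("dirty read", "non-repeatable read", "phantom read")
}
print(summary)
```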
Exercise #4
- Open two different connections to the employees DB, set
autocommit to 0;
- Set isolation level to READ COMMITTED in both windows,
select total number of employees whose names started
with Alex in the 1st session, delete the employee with ID
499559 in the 2nd session (don’t forget to COMMIT),
repeat the query in the 1st session
Exercise #4
- Set isolation level to REPEATABLE READ in both windows,
select total number of employees whose names started
with Alex in the 1st session, delete the employee with ID
499517 in the 2nd session (don’t forget to COMMIT),
repeat the query in the 1st session
Exercise #4
- Set isolation level to REPEATABLE READ in both windows,
select total number of employees whose names started
with Alex in the 1st session, insert an employee called
Alexis Doe in the 2nd session (don’t forget to COMMIT),
repeat the query in the 1st session
Exercise #4
- Set isolation level to SERIALIZABLE in both windows,
select total number of employees whose names started
with Alex in the 1st session, insert an employee called
Alex Didnotfail in the 2nd session (don’t forget to
COMMIT), repeat the query in the 1st session
mysql.user table
- Stores user privileges
- Can (but should not) be manipulated directly
- FLUSH PRIVILEGES rereads effective rights from it
- Uses MyISAM storage
GRANT statement
- Creates user accounts
- Grants privileges to them
- Is documented at https://goo.gl/zBHTd4
A superuser
- Has ALL PRIVILEGES ON *.*
- Has a number of SUPER privileges
A list of privileges
- Privileges can be global, database level, table level,
column level, function level and procedure level
- A list is available in GRANT command documentation
Default client credentials
- Can be set in ~/.my.cnf file like this:
[client]
user = root
password = Pheexaigee8a
Using views to limit rights
- Create a view over selected columns of a privileged table
- Grant privileges to that view
Using stored procedures
- Create a stored procedure to perform AAA tasks
- Grant privileges to that stored procedure
MySQL wire protocol
- Is not encrypted unless TLS is enabled; only the password is scrambled with a per-connection salt
- Can’t be easily proxied on L3 because of that
Exercise #5
- Grant all privileges on the employees.salaries table to a
user called “manager” with password da5ca9aeNgee%, a
user can connect from any host
- Create a view on a table employees consisting of emp_no
and the first and last names and grant a read privilege on
it to a user called “reader” with password eLegah0aez8a
MySQL slow queries log
- The simplest way to do performance tuning
- Should be enabled in the MariaDB config file
- Slow queries will be written to a file for subsequent
analysis
Slow queries log config vars
- slow_query_log = on
- slow_query_log_file = /var/log/mysql/mariadb-slow.log
- long_query_time = 0.1
- log-queries-not-using-indexes
Analyzing the log w/Percona Tools
- pt-query-digest
- Documented at https://goo.gl/YCv1ya
- In the simplest case produces a textual report on most
time-consuming queries
Analyzing the log w/Anemometer
- Anemometer is a web-based slow query monitor created
at Box (https://github.com/box/Anemometer)
- Anemometer uses pt-query-digest to process the slow
query log internally
- Anemometer requires PHP, a webserver and a number of
other tools
- So, we use an Ansible role to simplify its deployment
Ansible role for Anemometer
- Ansible is a popular Configuration Management tool
- Ansible is written in Python and uses YAML as a
configuration description language
- A role for Anemometer is at https://goo.gl/us6V82
- This role works for Ubuntu 14.04 hosts and does not work
for 16.04 yet
- This is trivial to correct, expect a fix in a week
Demo time!
- Let’s analyze live queries in our Vagrant box
Partitioning and sharding
- Partitioning is the process of splitting a big table into smaller
subsets on the same server
- Partitioning works well for time-series data
- Sharding is the process of splitting a big table into a number
of independent tables on different servers
- Sharding requires serious modifications of the app code
Partitioning in MariaDB
- MariaDB inherits MySQL support for partitioning
- Partitioning is documented at https://goo.gl/1CwIKX
- Certain limitations apply:
- Queries are not parallelized
- Partitioned table can’t contain or be referenced by
foreign keys
Partitioning in the real life
- Is tricky to set up properly
- Is often misused (I personally have never seen MySQL
partitioning set up properly)
- I strongly recommend against using partitioning
Exercise #6
- Get familiar with the Anemometer tool
- Read and explain a query plan
What is replication?
- Storing the same data on multiple MariaDB servers
- Establishing a master/slave relationship between the
original and the copies
- Distributing data modifications from a master node to
slave nodes
Master and slave nodes
- The master node gets data modification queries
(INSERTs, UPDATEs and DELETEs)
- The master node sends data changes to slaves
- Slave nodes are meant to be read-only and get updates from the
master
- However, MySQL/MariaDB does not prohibit data modification on slave nodes by default
Types of replication
- Replication can be synchronous or asynchronous
- Replication can also be master-slave or master-master
- All 4 options are possible: “synchronous master-slave”,
“asynchronous master-slave”, “synchronous
master-master” and “asynchronous master-master”
- Asynchronous master-slave is the default MariaDB setting
Master-slave and master-master
- There is only a single master in a MS replication topology
- There is more than one master in a MM setup
- A master should propagate data changes to all hosts in
the replication topology
- So, every master is also a slave in a MM setup
Sync or async
- Async: a transaction on a master is finished as soon as
it’s written to a transaction log on a master
- Semisync: a transaction on a master is finished only
after it’s written to a transaction log on one of slaves
- Sync: a transaction on a master is finished when it’s
acknowledged and committed on all slaves
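The three definitions differ only in what the master waits for before reporting a transaction as finished. A toy decision function makes that explicit; the function and its parameters are hypothetical and do not model any real replication code.

```python
def commit_returns(mode, master_logged, slave_logged, slave_committed):
    """Tell whether the master may report the transaction as finished.

    slave_logged / slave_committed hold one boolean per slave.
    """
    if mode == "async":
        return master_logged
    if mode == "semisync":
        return master_logged and any(slave_logged)
    if mode == "sync":
        return master_logged and all(slave_committed)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical moment in time: two slaves, only the first one has caught up.
state = dict(
    master_logged=True,
    slave_logged=[True, False],
    slave_committed=[True, False],
)
results = {
    mode: commit_returns(mode, **state) for mode in ("async", "semisync", "sync")
}
print(results)
```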
Replication lag
- Replication lag is a delay between the same operations
on a master and on a slave
- Replication lag is meaningful for async replication only
- Replication lag should be minimized
Multi-master replication scalability
- Multi-master replication does not scale on writes!
- It’s a popular belief that it does (because there is more
than one master)
- But every master should perform exactly the same set of
write operations!
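The argument is simple arithmetic: whatever a master does not write itself, it still has to apply as replicated changes. A back-of-the-envelope sketch, assuming writes are spread evenly across masters (the function name and numbers are made up):

```python
def writes_applied_per_node(total_writes_per_sec, masters):
    """Writes one master must apply: its own share plus everyone else's."""
    own = total_writes_per_sec / masters
    replicated = total_writes_per_sec - own
    return own + replicated


load = {n: writes_applied_per_node(1000, n) for n in (1, 2, 4)}
print(load)
```

Adding masters changes where writes arrive, not how many each node has to execute.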
Multi-master tips and tricks
- Avoid writing to the same table on different masters!
- Split your schema to several non-related table sets
logically bound to different services if possible
- Work with these table sets on different masters
independently
The binary log
- The binary log stores data modification events (both DDL
and DML changes)
- The binary log is storage neutral (works for Aria, InnoDB,
etc.)
- The binary log is not a transaction log
- The binary log can store events in 3 different formats
Binary log formats
- SBR (statement-based replication)
- RBR (row-based replication)
- Mixed (stores statements or rows when appropriate)
- Mixed seems to be the best of both worlds
- But it is not, in fact (avoid using it)
Statement-based replication
- Stores INSERT/UPDATE/DELETE and
CREATE/DROP/TRUNCATE statements as is
- Requires less space in the log
- Is not 100% accurate for all statements
SBR non-determinism
- INSERT INTO t1(c1, mtime) VALUES(1, NOW())
- NOW() can be different on master and slave
- INSERT INTO t2(c1, c2) VALUES(1, RAND())
- RAND() is definitely different on master and slave
- Fixes are trivial - master should send exact values
- DELETE FROM t1 LIMIT 10; - fix is not trivial
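The RAND() case can be imitated with two independent random generators. This is a toy model of naive statement replay, not of the server itself: as the slide says, the fix is for the master to send exact values (the real binary log carries such context for NOW() and RAND()). The seeds and function name are arbitrary.

```python
import random


def apply_statement(rng):
    """Simulate INSERT INTO t2(c1, c2) VALUES(1, RAND()) on one server."""
    return (1, rng.random())


master_rng = random.Random(1)  # each server has its own RAND() state
slave_rng = random.Random(2)

# Statement-based: the slave re-executes the statement text.
master_row = apply_statement(master_rng)
sbr_slave_row = apply_statement(slave_rng)

# Row-based: the slave receives the values the master produced.
rbr_slave_row = master_row

print(master_row == sbr_slave_row, master_row == rbr_slave_row)
```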
SBR is broken (mixed is broken too)
- Error 1062 (Duplicate entry NNN for key X)
- But why?..I just inserted a bunch of rows!
- This is a bug somehow related to range locking on a
primary key on slave side
- There are a lot of instructions on the Internet, something
like “set slave-skip-errors to 1062”
Never trust random Internet guys
- Don’t do “slave-skip-errors”
- To fix this bug properly: NEVER USE SBR OR MIXED LOG FORMATS, USE RBR!
- The only problem is that RBR is broken too
The binary log concept is broken
- Correctly implemented binary log stores physical changes
to the storage layer (WAL records)
- MySQL historically used pluggable storage layers, some of
them were non-transactional
- The binary log is on the wrong abstraction layer
- This can’t be easily fixed
RBR is broken (much less than SBR)
- DELETE FROM t1; generates a lot of rows to be written to
the binary log
- The slave can begin lagging
- A slave SQL thread uses indexes to apply row deltas
- Having a primary key is essential!
- It’s better to use surrogate keys
libslave
- A library to mimic a MySQL slave
- https://github.com/tarantool/libslave
- Can be embedded into an app, allows an app to connect to
the MySQL master and read the binlog
Cascading replication topologies
- Replication can (and should) be cascaded (5 slaves on a single master
is a bad idea)
- A slave can be a master for a slave
- Config should be tweaked:
log-slave-updates=1
Replication rings
- If you absolutely need
master-master, you can have one
- Every master should have its own
key space
- auto_increment_offset=1
auto_increment_increment=10
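How the two settings carve out separate key spaces can be shown with a small generator. The sketch assumes each master is given a distinct auto_increment_offset (1, 2, 3 here) and the same auto_increment_increment; the master names are made up.

```python
from itertools import count, islice


def key_space(offset, increment):
    """Keys a master generates: offset, offset + increment, offset + 2*increment, ..."""
    return count(offset, increment)


INCREMENT = 10  # auto_increment_increment, shared by all masters
masters = {
    f"master{n}": list(islice(key_space(n, INCREMENT), 4))  # n = auto_increment_offset
    for n in (1, 2, 3)
}
print(masters)
```

With an increment of 10 the ring has room for up to ten masters before the key spaces would collide.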
Semisync replication
- Added since MariaDB 5.5, declared stable since 10.1.3
- Documented at https://goo.gl/wuiKfJ
- If a slave fails to acknowledge before a certain timeout,
a master switches to async automatically and switches
back when a slave catches up
Parallel replication
- Traditional MariaDB replication uses a single SQL thread
on the slave side
- Starting with 10.0.5 it’s possible to use several threads
- Documented at https://goo.gl/0p4SH9
Delayed replication
- Replication is not a backup!
- Delayed replication is (well, can be)
- Introduced in MariaDB 10.2.3
- Documented at https://goo.gl/BZguD9
- Replication delay can be achieved using pt-slave-delay
tool from Percona Toolkit
GTID
- Globally unique binlog events identification
- Introduced in 10.0.2
- Documented at https://goo.gl/xgJ27M
- Has a number of significant benefits: slave server can be
easily reconnected to another master, slave log position
is saved in a transactional way
WSREP
- WSREP (write set replication) is a library for distributing write sets
- The Galera cluster is built around that library
The Galera cluster
- Is InnoDB-only
- Is virtually synchronous
- Does not use traditional replication at all
A common Galera cluster setup
- Two master nodes and one arbiter node
- The arbiter node does not store anything
Questions?
- Please feel free to email me at alex@gitinsky.com
- My Skype ID is demeliorator
Thank you!
- Good luck in the wonderful world of MariaDB!

Advantages of Hiring UIUX Design Service Providers for Your Business
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

MariaDB workshop

  • 2. Outline - Tables and DDL - Queries and DML - Indexes and compound indexes - Transactions and how they work, isolation levels - Authorization and authentication, client protocol
  • 3. Outline - Basics of performance monitoring - Notion of replication, types of replication - Traditional replication in detail - Galera cluster and how it works - MMM, PRM and query proxying
  • 4. What’s in a box? - Ubuntu 16.04.2 - Python 2.7.12 - MariaDB 10.0.29 - Sakila DB, Employees DB - Percona Toolkit 2.2.16 - Anemometer
  • 5. How to use Vagrant - Create an empty folder - Download https://goo.gl/ap6r6E there (rename it to ‘Vagrantfile’) - Run ‘vagrant up’ in that folder - Wait until a VM starts - Run ‘vagrant ssh’ to get in - My .mysql_history: https://goo.gl/AyrTW7
  • 6. - Tables and DDL - Queries and DML - Indexes and compound indexes - Transactions and how they work, isolation levels - Authorization and authentication, client protocol
  • 7. What is a table? - A collection of related data
  • 8. What is a table? - A collection of related data - Consists of columns and rows
  • 9. What is a table?
  • 10. The DDL - Manipulates the database structure (also called schema)
  • 11. DDL statements - CREATE - ALTER - DROP - TRUNCATE - RENAME
  • 12. How to create a table? CREATE TABLE language ( language_id TINYINT UNSIGNED NOT NULL AUTO_INCREMENT, name CHAR(20) NOT NULL, last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (language_id) )ENGINE=InnoDB DEFAULT CHARSET=utf8;
  • 13. Primary keys - Identify a record uniquely - So, adding two equal keys is not possible - Can be natural like “passport number” - Or surrogate - Surrogate keys are auto-generated on the DB side
  • 14. Natural PKs can be composite CREATE TABLE film_actor ( actor_id SMALLINT UNSIGNED NOT NULL, film_id SMALLINT UNSIGNED NOT NULL, last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (actor_id,film_id), KEY idx_fk_film_id (`film_id`), CONSTRAINT fk_film_actor_actor FOREIGN KEY (actor_id) REFERENCES actor (actor_id), CONSTRAINT fk_film_actor_film FOREIGN KEY (film_id) REFERENCES film (film_id) )ENGINE=InnoDB DEFAULT CHARSET=utf8;
  • 15. Autoincrement primary keys - Are surrogate - Are 1, 2, 3, 4 or 8 bytes long - BTW, INT(10) is 4 bytes long - Are incremented on every INSERT - Should always be used - BTW, an InnoDB table is a clustered index* around its PK - If no explicit PK exists, a 6-byte row ID will be used
  • 17. - Tables and DDL - Queries and DML - Indexes and compound indexes - Transactions and how they work, isolation levels - Authorization and authentication, client protocol
  • 18. A trivial SELECT query - SELECT * FROM employees WHERE hire_date='1986-06-26' - Please, never use “SELECT *”, always select certain columns! - A slightly better version: - SELECT emp_no, first_name, last_name FROM employees WHERE hire_date='1986-06-26';
  • 19. Using a single table is impractical* - Four types of JOINs: - INNER JOIN - LEFT OUTER JOIN - RIGHT OUTER JOIN - CROSS JOIN - Left and right outer joins are equivalent up to the order of the tables
  • 20. Left outer join example - SELECT e.emp_no, first_name, last_name, salary FROM employees e LEFT OUTER JOIN salaries s on e.emp_no = s.emp_no WHERE hire_date='1986-06-26'; - This query selects an employee even if no payment records exist in the salaries table
  • 21. Aggregate queries and GROUP BY - SELECT e.emp_no, first_name, last_name, SUM(salary) FROM employees e LEFT OUTER JOIN salaries s on e.emp_no = s.emp_no WHERE hire_date='1986-06-26' GROUP BY e.emp_no;
  • 22. How to get people w/no salary recs - INSERT INTO employees(emp_no, first_name, last_name) VALUES(600000, 'Alex', 'Chistyakov'); - Let’s count number of salary records using COUNT() aggregate function
  • 23. HAVING is like WHERE - SELECT e.emp_no, first_name, last_name, COUNT(salary) FROM employees e LEFT OUTER JOIN salaries s on e.emp_no = s.emp_no GROUP BY e.emp_no HAVING COUNT(salary) = 0;
  • 24. Another way to do the same - SELECT e.emp_no, first_name, last_name FROM employees e LEFT OUTER JOIN salaries s ON e.emp_no = s.emp_no WHERE s.emp_no IS NULL; - No aggregate is needed here: the WHERE clause keeps only employees without a matching salaries row - This query is more efficient*
  • 25. Exercise #2 - Write a SELECT query that gets all employees whose salary records sum to more than 40000
  • 26. - Tables and DDL - Queries and DML - Indexes and compound indexes - Transactions and how they work, isolation levels - Authorization and authentication, client protocol
  • 27. Why indexes? - Latency Numbers Every Programmer Should Know: https://goo.gl/v4CEWU - Indexes help to avoid unnecessary disk operations
  • 28. How do indexes work? - An index is a data structure optimized for search - There are several types of indexes: hash indexes, B-tree indexes - Hash indexes allow finding exact rows - B-tree indexes allow finding ranges - InnoDB and Aria support B-tree indexes only
  • 29. B-tree index - “B” stands for “balanced”, not for “binary”
  • 30. “SQL Tuning” by Dan Tow - https://goo.gl/jRbD5H - A must read for every DBA! - Discusses how to build effective indexes in great detail - Unfortunately does not cover aggregate functions and sorting
  • 31. Column cardinality - Cardinality is a measure of data uniqueness - Columns with more unique values have higher cardinality - Columns with few unique values have lower cardinality
  • 32. A composite index - Covers two or more columns - Allows finding rows by successively applying a filter column by column - Order of columns in a composite index matters!
  • 33. Index selectivity - The ability of a certain condition to filter rows - Is expressed as the number of rows left after filtering divided by the total number of rows - Lower values mean greater selectivity - Some authors define selectivity as the total number of rows divided by the resulting number of rows
  • 34. Building a good composite index - Columns with higher individual selectivity should go first in a composite index - Non-selective columns should go last
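One rough way to compare candidate columns before ordering them in a composite index is to measure how many distinct values each column has relative to the row count. The sketch below assumes the Employees DB from the workshop box; the index name idx_last_name_gender and the sample last name are illustrative and do not come from the slides.

```sql
-- Ratio of distinct values to rows: closer to 1 means a more selective column.
SELECT COUNT(DISTINCT last_name) / COUNT(*) AS last_name_ratio,
       COUNT(DISTINCT gender)    / COUNT(*) AS gender_ratio
FROM employees;

-- Put the more selective column first (hypothetical index name).
CREATE INDEX idx_last_name_gender ON employees (last_name, gender);

-- Check whether the optimizer picks the new index for a two-column filter.
EXPLAIN
SELECT emp_no, first_name, last_name
FROM employees
WHERE last_name = 'Facello' AND gender = 'F';
```

If the key column of the EXPLAIN output names the new index, the filter is resolved through it rather than through a full table scan.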
  • 35. Functional indexes - Original MySQL does not have functional indexes - MariaDB adds support for virtual columns - Functional indexes can be created over virtual columns
  • 36. Virtual column example - ALTER TABLE employees ADD lower_last_name varchar(16) GENERATED ALWAYS AS (lower(last_name)) PERSISTENT; - CREATE INDEX lower_last_name ON employees(lower_last_name); - SELECT e.emp_no, first_name, last_name FROM employees e WHERE lower_last_name LIKE 'chistya%';
  • 37. Let’s add % - SELECT e.emp_no, first_name, last_name FROM employees e WHERE lower_last_name LIKE '%chistya%'; - This will always lead to a full scan in current MariaDB and MySQL implementations - Full Text Search engine should be used instead - I recommend Sphinx or Solr
  • 38. Using ORDER BY - In most real life cases can’t be covered by an index - Dan Tow doesn’t consider these cases at all - No good solution exists
  • 39. Exercise #3 - Write a select which gets all salary records for the employee w/emp_no = 10001 ordered by amount of the salary record - Create a covering index for this query
  • 40. Things not to do in your life - Please never ever do ORDER BY RAND()! - How to do it properly: get a good random number on the client side - LIMIT 50 OFFSET 5000000 is the next thing not to do - How to do it properly: “emp_no > $last_emp_no LIMIT 50”
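Both recommendations can be sketched in plain SQL against the employees table. The user variables stand in for values that the client application would keep or generate; 10050 and 123456 are assumed sample values.

```sql
-- First page: order by the primary key so the scan follows the index.
SELECT emp_no, first_name, last_name
FROM employees
ORDER BY emp_no
LIMIT 50;

-- The client remembers the last emp_no it received (assumed value).
SET @last_emp_no = 10050;

-- Next page: seek past the last seen key instead of skipping rows with OFFSET.
SELECT emp_no, first_name, last_name
FROM employees
WHERE emp_no > @last_emp_no
ORDER BY emp_no
LIMIT 50;

-- Random row: the number is produced on the client side (assumed value),
-- and the server only seeks to the nearest existing key.
SET @random_emp_no = 123456;

SELECT emp_no, first_name, last_name
FROM employees
WHERE emp_no >= @random_emp_no
ORDER BY emp_no
LIMIT 1;
```

The cost of each page stays roughly constant because the server never has to read and discard the skipped rows.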
  • 41. - Tables and DDL - Queries and DML - Indexes and compound indexes - Transactions and how they work, isolation levels - Authorization and authentication, client protocol
  • 42. A bit of history - MySQL supported pluggable storage engines for years - Two most notable were MyISAM and InnoDB - MyISAM did not support transactions in any way - InnoDB was transactional
  • 43. MariaDB engines - Many mysql.* tables are still in MyISAM format - Aria storage engine emerged and is optionally transactional in a crash-proof sense (does not support explicit transactions though) - InnoDB fully supports transactions - I recommend using InnoDB
  • 44. A bit of InnoDB internals - /var/lib/mysql/ib_logfile[01] are InnoDB redo logs - The redo log works as a circular buffer - It’s not practical to set the InnoDB log size (innodb_log_file_size) to more than 128M - This change requires restart
  • 45. Generic recovery process - Works the same way for any engine with WAL/redo log/intent log/whatever - The service starts after crash - Log records are examined - Finished transactions are applied to their final destinations, unfinished ones are thrown out - Aria performs these steps when in transaction mode too
  • 46. COMMIT and auto-commit - Every query starts and commits an implicit transaction by default - SET autocommit = 0; disables this - START TRANSACTION or BEGIN should be used then to start a transaction - And COMMIT to finish it - DDL statements perform COMMIT implicitly
  • 47. ROLLBACK and savepoints - ROLLBACK is used to abort a transaction - Transactions can’t be nested but this behavior can be emulated using savepoints - SAVEPOINT label - ROLLBACK TO label - RELEASE SAVEPOINT label
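A minimal sketch of emulating a nested transaction with a savepoint, assuming the language table from the Sakila DB shown earlier; the inserted names and the savepoint label are made up for illustration.

```sql
USE sakila;

START TRANSACTION;

INSERT INTO language (name) VALUES ('Esperanto');

-- Mark the point the "inner transaction" can be rolled back to.
SAVEPOINT before_second_insert;

INSERT INTO language (name) VALUES ('Klingon');

-- Undo only the work done after the savepoint; 'Esperanto' is still pending.
ROLLBACK TO SAVEPOINT before_second_insert;

-- The savepoint is no longer needed; releasing it does not commit anything.
RELEASE SAVEPOINT before_second_insert;

-- Only the first insert becomes permanent.
COMMIT;
```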
  • 48. A bit of InnoDB internals - MVCC - MVCC stands for “Multiversion concurrency control” - Records are declared dead but still occupy disk space - InnoDB storage file never shrinks - InnoDB uses a single file for everything by default and this file can’t be compacted
  • 49. It’s possible to overcome this - innodb_file_per_table=1 - Every table will occupy a separate file (two separate files in fact) - Beware of Unix file descriptors limits! - ulimit -n 65535 somewhere before starting mysqld_safe
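As a sketch, the setting can be checked and changed at runtime, and an existing table can then be rebuilt so that it moves into its own file. Depending on the server version the value may already be ON; to keep the setting across restarts it also has to be written to the config file.

```sql
-- Check the current value.
SHOW GLOBAL VARIABLES LIKE 'innodb_file_per_table';

-- Enable it for tables created or rebuilt from now on.
SET GLOBAL innodb_file_per_table = 1;

-- Rebuild an existing table so that it gets its own tablespace file.
-- The rebuild copies the whole table, so expect it to take time on big tables.
ALTER TABLE employees.salaries ENGINE=InnoDB;
```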
  • 50. Long transactions can be evil - DDL statements require an exclusive lock on table metadata - An explicit transaction holds a read lock on every table it uses - If number of transactions per second is high enough the DDL statement will wait forever
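When a DDL statement appears to hang, one way to look for the transaction that blocks it is to list the oldest open InnoDB transactions. This is a sketch using the information_schema.INNODB_TRX table; the LIMIT value is arbitrary.

```sql
-- Oldest open InnoDB transactions first.
SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id
FROM information_schema.INNODB_TRX
ORDER BY trx_started
LIMIT 10;

-- A blocked DDL statement shows up here in the state
-- "Waiting for table metadata lock".
SHOW FULL PROCESSLIST;
```

The trx_mysql_thread_id value matches the Id column of the process list, so the offending connection can be identified and, if necessary, terminated with KILL.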
  • 51. Transactions: logical perspective - The SQL standard defines 4 transaction isolation levels - READ UNCOMMITTED - READ COMMITTED - REPEATABLE READ - SERIALIZABLE
  • 52. READ UNCOMMITTED - The weakest level - Allows dirty reads - A transaction can see uncommitted data from another transaction
  • 53. READ COMMITTED - Non-repeatable reads are possible - Phantom reads are possible
  • 54. REPEATABLE READ - The default isolation level - Non-repeatable reads are not possible - Phantom reads are possible
  • 55. SERIALIZABLE - The strongest level - Non-repeatable reads and phantom reads are impossible
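The difference between the levels is easiest to see with two client sessions. The sketch below uses the Sakila language table rather than the exercise data; the inserted name is made up, and the comments mark which session runs each statement.

```sql
-- Session 1: choose the level for this connection and open a transaction.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT @@tx_isolation;
START TRANSACTION;
SELECT COUNT(*) FROM sakila.language;

-- Session 2: change the data and commit (autocommit is on by default).
INSERT INTO sakila.language (name) VALUES ('Esperanto');

-- Session 1: repeat the read inside the same transaction.
-- Under READ COMMITTED the count includes the new row;
-- under REPEATABLE READ a plain SELECT keeps returning the first snapshot.
SELECT COUNT(*) FROM sakila.language;
COMMIT;
```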
  • 56. Exercise #4 - Open two different connections to the employees DB, set autocommit to 0; - Set isolation level to READ COMMITTED in both windows, select total number of employees whose names start with Alex in the 1st session, delete the employee with ID 499559 in the 2nd session (don’t forget to COMMIT), repeat the query in the 1st session
  • 57. Exercise #4 - Set isolation level to REPEATABLE READ in both windows, select total number of employees whose names start with Alex in the 1st session, delete the employee with ID 499517 in the 2nd session (don’t forget to COMMIT), repeat the query in the 1st session
  • 58. Exercise #4 - Set isolation level to REPEATABLE READ in both windows, select total number of employees whose names start with Alex in the 1st session, insert an employee called Alexis Doe in the 2nd session (don’t forget to COMMIT), repeat the query in the 1st session
  • 59. Exercise #4 - Set isolation level to SERIALIZABLE in both windows, select total number of employees whose names start with Alex in the 1st session, insert an employee called Alex Didnotfail in the 2nd session (don’t forget to COMMIT), repeat the query in the 1st session
  • 60. - Tables and DDL - Queries and DML - Indexes and compound indexes - Transactions and how they work, isolation levels - Authorization and authentication, client protocol
  • 61. mysql.user table - Stores user privileges - Can (but should not) be manipulated directly - FLUSH PRIVILEGES rereads effective rights from it - Uses MyISAM storage
  • 62. GRANT statement - Creates user accounts - Grants privileges to them - Is documented at https://goo.gl/zBHTd4
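A minimal sketch of the statement in use. The account name report_user, its password and the choice of the departments table are hypothetical; the IDENTIFIED BY form creates the account if it does not exist yet.

```sql
-- Create the account (if needed) and give it read-only access to one table.
GRANT SELECT ON employees.departments
  TO 'report_user'@'localhost' IDENTIFIED BY 'change-me';

-- Inspect what the account is allowed to do.
SHOW GRANTS FOR 'report_user'@'localhost';

-- Take the privilege back when it is no longer needed.
REVOKE SELECT ON employees.departments FROM 'report_user'@'localhost';
```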
  • 63. A superuser - Has ALL PRIVILEGES ON *.* - Has a number of SUPER privileges
  • 64. A list of privileges - Privileges can be global, database level, table level, column level, function level and procedure level - A list is available in GRANT command documentation
  • 65. Default client credentials - Can be set in ~/.my.cnf file like this: [client] user = root password = Pheexaigee8a
  • 66. Using views to limit rights - Create a view using a privileged table columns - Grant privileges to that view
  • 67. Using stored procedures - Create a stored procedure to perform AAA tasks - Grant privileges to that stored procedure
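  • 68. MySQL wire protocol - Authenticates with a challenge-response handshake based on a per-session scramble (the traffic itself is encrypted only when TLS is enabled) - Can’t be easily proxied on L3 because of that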
  • 69. Exercise #5 - Grant all privileges on the employees.salaries table to a user called “manager” with password da5ca9aeNgee%, a user can connect from any host - Create a view on a table employees consisting of emp_no and the first and last names and grant a read privilege on it to a user called “reader” with password eLegah0aez8a
  • 70. - Basics of performance monitoring - Notion of replication, types of replication - Traditional replication in detail - Galera cluster and how it works - MMM, PRM and query proxying
  • 71. MySQL slow queries log - The simplest way to do performance tuning - Should be enabled in the MariaDB config file - Slow queries will be written to a file for subsequent analysis
  • 72. Slow queries log config vars - slow_query_log = on - slow_query_log_file = /var/log/mysql/mariadb-slow.log - long_query_time = 0.1 - log-queries-not-using-indexes
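The same variables can also be switched on at runtime, which is handy for a quick experiment in the Vagrant box. This is a sketch: runtime changes are lost on restart, and the new global long_query_time applies only to connections opened afterwards.

```sql
-- Turn the slow query log on without restarting the server.
SET GLOBAL slow_query_log = ON;
SET GLOBAL slow_query_log_file = '/var/log/mysql/mariadb-slow.log';

-- Log everything slower than 100 ms, plus queries that use no index.
SET GLOBAL long_query_time = 0.1;
SET GLOBAL log_queries_not_using_indexes = ON;

-- Verify the result.
SHOW GLOBAL VARIABLES LIKE 'slow_query%';
SHOW GLOBAL VARIABLES LIKE 'long_query_time';
```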
  • 73. Analyzing the log w/Percona Tools - pt-query-digest - Documented at https://goo.gl/YCv1ya - In the simplest case produces a textual report on most time-consuming queries
  • 74. Analyzing the log w/Anemometer - Anemometer is a web-based slow query monitor created at Box (https://github.com/box/Anemometer) - Anemometer uses pt-query-digest to process the slow query log internally - Anemometer requires PHP, a webserver and a number of other tools - So, we use an Ansible role to simplify its deployment
  • 75. Ansible role for Anemometer - Ansible is a popular Configuration Management tool - Ansible is written in Python and uses YAML as a configuration description language - A role for Anemometer is at https://goo.gl/us6V82 - This role works for Ubuntu 14.04 hosts and does not work for 16.04 yet - This is trivial to correct, expect a fix in a week
  • 76. Demo time! - Let’s analyze live queries in our Vagrant box
  • 77. Partitioning and sharding - Partitioning is a process of splitting a big table into smaller subsets on the same server - Partitioning works well for time-series data - Sharding is a process of splitting a big table into a number of unrelated tables on different servers - Sharding requires serious modifications of the app code
  • 78. Partitioning in MariaDB - MariaDB inherits MySQL support for partitioning - Partitioning is documented at https://goo.gl/1CwIKX - Certain limitations apply: - Queries are not parallelized - Partitioned table can’t contain or be referenced by foreign keys
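For reference, this is roughly what range partitioning of time-series data looks like. The table metrics_example and its columns are hypothetical; note that the partitioning column has to be part of every unique key, which is why the primary key is composite.

```sql
CREATE TABLE metrics_example (
  id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  recorded_at DATETIME NOT NULL,
  value       DOUBLE NOT NULL,
  PRIMARY KEY (id, recorded_at)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(recorded_at)) (
  PARTITION p2016 VALUES LESS THAN (TO_DAYS('2017-01-01')),
  PARTITION p2017 VALUES LESS THAN (TO_DAYS('2018-01-01')),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- Dropping a whole partition removes old data without a row-by-row DELETE.
ALTER TABLE metrics_example DROP PARTITION p2016;
```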
  • 79. Partitioning in the real life - Is tricky to set up properly - Is often misused (I personally have never seen MySQL partitioning set up properly) - I strongly recommend not to use partitioning
  • 80. Exercise #6 - Get familiar with the Anemometer tool - Read and explain a query plan
  • 81. - Basics of performance monitoring - Notion of replication, types of replication - Traditional replication in detail - Galera cluster and how it works - MMM, PRM and query proxying
  • 82. What is replication? - Storing the same data on multiple MariaDB servers - Establishing a master/slave relationship between the original and the copies - Distributing data modifications from a master node to slave nodes
  • 83. Master and slave nodes - The master node gets data modification queries (INSERTs, UPDATEs and DELETEs) - The master node sends data changes to slaves - Slave nodes are read-only and get updates from the master - Data modification on slave nodes is not prohibited in the MySQL/MariaDB world
  • 84. Types of replication - Replication can be synchronous or asynchronous - Replication can also be master-slave or master-master - All 4 options are possible: “synchronous master-slave”, “asynchronous master-slave”, “synchronous master-master” and “asynchronous master-master” - Asynchronous master-slave is the default MariaDB setting
  • 85. Master-slave and master-master - There is only a single master in a MS replication topology - There is more than one master in a MM setup - A master should propagate data changes to all hosts in the replication topology - So, every master is also a slave in a MM setup
  • 86. Sync or async - Async: a transaction on a master is finished as soon as it’s written to a transaction log on a master - Semisync: a transaction on a master is finished only after it’s written to a transaction log on one of the slaves - Sync: a transaction on a master is finished when it’s acknowledged and committed on all slaves
  • 87. Replication lag - Replication lag is a delay between the same operations on a master and on a slave - Replication lag is meaningful for async replication only - Replication lag should be minimized
  • 88. Multi-master replication scalability - Multi-master replication does not scale on writes! - It’s a popular belief that it does (because there is more than one master) - But every master should perform exactly the same set of write operations!
  • 89. Multi-master tips and tricks - Avoid writing to the same table on different masters! - Split your schema to several non-related table sets logically bound to different services if possible - Work with these table sets on different masters independently
  • 90. - Basics of performance monitoring - Notion of replication, types of replication - Traditional replication in detail - Galera cluster and how it works - MMM, PRM and query proxying
  • 91. The binary log - The binary log stores data modification events (both DDL and DML changes) - The binary log is storage neutral (works for Aria, InnoDB, etc.) - The binary log is not a transaction log - The binary log can store events in 3 different formats
  • 92. Binary log formats - SBR (statement-based replication) - RBR (row-based replication) - Mixed (stores statements or rows when appropriate) - Mixed seems to be the best of both worlds - But it is not, in fact (avoid using it)
  • 93. Statement-based replication - Stores INSERT/UPDATE/DELETE and CREATE/DROP/TRUNCATE statements as is - Requires less space in the log - Is not 100% accurate for all statements
  • 94. SBR non-determinism - INSERT INTO t1(c1, mtime) VALUES(1, NOW()) - NOW() can be different on master and slave - INSERT INTO t2(c1, c2) VALUES(1, RAND()) - RAND() is definitely different on master and slave - Fixes are trivial - master should send exact values - DELETE FROM t1 LIMIT 10; - fix is not trivial
  • 95. SBR is broken (mixed is broken too) - Error 1062 (Duplicate entry NNN for key X) - But why?.. I just inserted a bunch of rows! - This is a bug somehow related to range locking on a primary key on the slave side - There are a lot of instructions on the Internet, along the lines of “set slave-skip-errors to 1062”
  • 96. Never trust random Internet guys - Don’t do “slave-skip-errors” - To fix this bug properly: NEVER USE SBR OR MIXED LOG FORMATS, USE RBR! - The only problem is that RBR is broken too
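Switching the binary log to the row-based format can be sketched as follows. The global change affects only connections opened afterwards, and the setting should also go into the config file (binlog_format = ROW) to survive a restart.

```sql
-- See which format the server uses right now.
SHOW GLOBAL VARIABLES LIKE 'binlog_format';

-- Switch to row-based logging for new connections.
SET GLOBAL binlog_format = 'ROW';

-- Switch the current connection as well.
SET SESSION binlog_format = 'ROW';
```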
  • 97. The binary log concept is broken - Correctly implemented binary log stores physical changes to the storage layer (WAL records) - MySQL historically used pluggable storage layers, some of them were non-transactional - The binary log is on the wrong abstraction layer - This can’t be easily fixed
  • 98. RBR is broken (much less than SBR) - DELETE FROM t1; generates a lot of rows to be written to the binary log - The slave can begin lagging - A slave SQL thread uses indexes to apply row deltas - Having a primary key is essential! - It’s better to use surrogate keys
  • 99. libslave - A library to mimic a MySQL slave - https://github.com/tarantool/libslave - Can be embedded into an app, allows an app to connect to the MySQL master and read the binlog
  • 100. Cascading replication topologies - Replication can (and should) be cascaded (5 slaves on a single master is a bad idea) - A slave can be a master for a slave - Config should be tweaked: log-slave-updates=1
  • 101. Replication rings - If you absolutely need master-master, you can have one - Every master should have its own key space - auto_increment_offset=1 auto_increment_increment=10
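A sketch of giving two masters disjoint key spaces at runtime. The increment of 10 follows the slide and leaves room for up to ten masters; which server gets which offset is an arbitrary choice here. The settings apply to connections opened afterwards and should also be written to each server's config file.

```sql
-- On the first master: generated keys will be 1, 11, 21, ...
SET GLOBAL auto_increment_increment = 10;
SET GLOBAL auto_increment_offset = 1;

-- On the second master: generated keys will be 2, 12, 22, ...
SET GLOBAL auto_increment_increment = 10;
SET GLOBAL auto_increment_offset = 2;

-- Check the effective values on either server.
SHOW GLOBAL VARIABLES LIKE 'auto_increment%';
```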
  • 102. Semisync replication - Available since MariaDB 5.5, declared stable since 10.1.3 - Documented at https://goo.gl/wuiKfJ - If a slave fails to acknowledge before a certain timeout, a master switches to async automatically and switches back when a slave catches up
  • 103. Parallel replication - Traditional MariaDB replication uses a single SQL thread on the slave side - Starting with 10.0.5 it’s possible to use several threads - Documented at https://goo.gl/0p4SH9
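A minimal sketch of enabling several applier threads on a slave. The thread count of 4 is an arbitrary example, and the slave has to be stopped while the value is changed.

```sql
-- Run on the slave.
STOP SLAVE;

-- Size of the worker thread pool used to apply replicated events.
SET GLOBAL slave_parallel_threads = 4;

START SLAVE;

-- Confirm the setting.
SHOW GLOBAL VARIABLES LIKE 'slave_parallel_threads';
```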
  • 104. Delayed replication - Replication is not a backup! - Delayed replication is (well, can be) - Introduced in MariaDB 10.2.3 - Documented at https://goo.gl/BZguD9 - Replication delay can be achieved using pt-slave-delay tool from Percona Toolkit
  • 105. GTID - Globally unique binlog events identification - Introduced in 10.0.2 - Documented at https://goo.gl/xgJ27M - Has a number of significant benefits: slave server can be easily reconnected to another master, slave log position is saved in a transactional way
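Repointing a slave to another master with GTID can be sketched like this. The host name is hypothetical, and the sketch assumes that the replication user and password stay the same as before.

```sql
-- Run on the slave that should follow a different master.
STOP SLAVE;

-- No binlog file name or position is needed: the slave resumes
-- from the last GTID it has applied.
CHANGE MASTER TO
  MASTER_HOST = 'new-master.example.com',
  MASTER_USE_GTID = slave_pos;

START SLAVE;

-- The position the slave has reached, expressed as a GTID.
SELECT @@gtid_slave_pos;
SHOW SLAVE STATUS;
```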
  • 106. - Basics of performance monitoring - Notion of replication, types of replication - Traditional replication in detail - Galera cluster and how it works - MMM, PRM and query proxying
  • 107. WSREP - WSREP stands for “write set replication”; it is a library for distributing write sets - The Galera cluster is built around that library
  • 108. The Galera cluster - Is InnoDB-only - Is semisync - Does not use traditional replication at all
  • 109. A common Galera cluster setup - Two master nodes and one arbiter node - The arbiter node does not store anything
  • 110. - Basics of performance monitoring - Notion of replication, types of replication - Traditional replication in detail - Galera cluster and how it works - MMM, PRM and query proxying
  • 111. Questions? - Please feel free to email me at alex@gitinsky.com - My Skype ID is demeliorator
  • 112. Thank you! - Good luck in the wonderful world of MariaDB!