MySQL 5.7 New Features for Developers: a session given at DOAG (Oracle user group conference) in 2016. A similar version was also presented at the Israel MySQL User Group in November 2016.
This presentation reviews new features in MySQL 5.7: the optimizer, the InnoDB engine, the native JSON data type, and the Performance and sys schemas.
2. Who am I?
• Zohar Elkayam, CTO at Brillix
• Programmer, DBA, team leader, database trainer, public speaker, and a senior consultant for over 18 years
• Oracle ACE Associate
• Part of ilOUG, the Israel Oracle User Group
• Blogger: www.realdbamagic.com and www.ilDBA.co.il
http://brillix.co.il
3. About Brillix
• We offer complete, integrated end-to-end solutions based on best-of-breed innovations in database, security and big data technologies
• We provide complete end-to-end 24x7 expert remote database services
• We offer professional customized on-site trainings, delivered by our top-notch world recognized instructors
5. Agenda
• Optimizer, Performance and InnoDB changes
• Native JSON datatype
• The Performance Schema and SYS Schema
• Other features we should know
• What are we waiting for in MySQL 8.0?
6. Versions Guide
• MySQL 5.7 was released in October 2015
• Current version: 5.7.16 (released October 2016)
  • More than 200 new features!
• The next major version will be MySQL 8.0 (currently a DMR, a Development Milestone Release)
9. Optimizer Changes
• Parser and Optimizer refactoring
• Readability, maintainability and stability
• Separate parsing, optimizing, execution stages
• Easier feature additions, with lessened risk
• New hint framework: easier to manage, new hints
• Cost-based optimizer
• Configurable and tunable: mysql.server_cost and mysql.engine_cost
• API for where data resides: on disk or in cache
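As a rough illustration of the new hint framework, the sketch below uses optimizer hints that exist in 5.7. The tables `orders` and `customers` and their columns are hypothetical; only the hint names and the `/*+ ... */` syntax come from MySQL.

```sql
-- Hypothetical tables: orders(id, customer_id, total), customers(id, name)

-- New-style hints go in a /*+ ... */ comment right after SELECT.
-- NO_ICP: do not use index condition pushdown for table alias o.
-- NO_BNL: do not use block nested-loop join buffering for table alias c.
SELECT /*+ NO_ICP(o) NO_BNL(c) */ c.name, o.total
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.total > 100;

-- Hints are scoped to a single statement, so no session settings
-- (such as optimizer_switch) need to be changed and restored.
SELECT /*+ MAX_EXECUTION_TIME(1000) */ COUNT(*) FROM orders;
```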
11. Cost-based Query Optimization
• Assign cost to operations
• Assign cost to partial or alternative plans
• Search for the plan with the lowest cost ("best plan")
• Cost-based optimization controls:
  • Access method
  • Join order
  • Subquery strategy
13. Optimizer Cost Model: Performance Improvements
[Chart: DBT-3 (Size Factor 10, CPU bound). Execution time relative to 5.6 (%), MySQL 5.6 vs. MySQL 5.7, for queries Q3, Q7, Q8, Q9 and Q12.]
5 out of 22 queries get a much improved query plan (others remain the same)
Source: MySQL 5.7: 20 Years in the Making! By Geir Høydalsvik, Sr. Director, MySQL Engineering
14. Optimizer Cost Model: Performance Improvements
[Charts: DBT-3 (Size Factor 10). Execution time relative to 5.6 (%), 5.6 vs. 5.7, for queries Q2 and Q18. Two panels: CPU bound and disk bound.]
2 out of 22 queries get a significantly improved query plan (others remain the same)
Source: MySQL 5.7: 20 Years in the Making! By Geir Høydalsvik, Sr. Director, MySQL Engineering
15. Adjustable Cost Constants (Experimental!)
• We can change the cost factors by changing system tables:
• Use mysql.engine_cost and mysql.server_cost to change default values
• Use the FLUSH command to make the server aware of new values (only new connections will see updated cost constants)
UPDATE mysql.engine_cost SET cost_value = 2
WHERE cost_name = 'io_block_read_cost';
FLUSH OPTIMIZER_COSTS;
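A minimal sketch of how the override above could be inspected and reverted. It assumes a user with privileges on the mysql schema; a NULL cost_value means the server uses its built-in default.

```sql
-- Inspect the current cost constants (NULL = built-in default in use)
SELECT cost_name, cost_value FROM mysql.server_cost;
SELECT engine_name, device_type, cost_name, cost_value FROM mysql.engine_cost;

-- Revert the earlier override by setting the value back to NULL
UPDATE mysql.engine_cost
SET cost_value = NULL
WHERE cost_name = 'io_block_read_cost';

-- Reload the cost tables; sessions that are already connected
-- keep the constants they started with
FLUSH OPTIMIZER_COSTS;
```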
16. The Query Rewrite Plugin
• New pre and post parse query rewrite APIs
• Users can write their own plug-ins
• Provides a post-parse query plugin
• Rewrite problematic queries without the need to make application changes
• Add hints
• Modify join order
• Improve problematic queries from ORMs, third party apps, etc.
• Eliminates many legacy use cases for proxies
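A short sketch of adding a rule to the bundled Rewriter plugin. It assumes the plugin has already been installed (for example by running the install_rewriter.sql script that ships with the server); the rule itself is a toy example rather than a realistic fix.

```sql
-- Add a rule: '?' in the pattern is a placeholder that matches a literal
INSERT INTO query_rewrite.rewrite_rules (pattern, replacement)
VALUES ('SELECT ?', 'SELECT ? + 1');

-- Load the rules table into the plugin's in-memory cache
CALL query_rewrite.flush_rewrite_rules();

-- Check that the rule loaded; the message column is filled in on errors
SELECT id, pattern, replacement, enabled, message
FROM query_rewrite.rewrite_rules;

-- A matching statement is now rewritten before execution
SELECT 10;
SHOW WARNINGS;  -- a note reports that the query was rewritten
```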
17. InnoDB Improvements: Temporary Tables
• Temp tables are no longer stored in normal system tables
• Definitions can be maintained in-memory (without persisting to the disk)
• Locking constraints can be relaxed since only one client can see these tables.
18. InnoDB Improvements: Native Partitioning
• Eliminates previous limitations
• Eliminates resource usage problems
  • No longer uses the ha_partition handler
  • Reduces memory usage by up to 90%
  • MySQL 5.7.9 tries to upgrade old partitions to native partitioning, or we can use the ALTER TABLE ... UPGRADE PARTITIONING command
• Transportable tablespace support (5.7.4)
19. InnoDB Bulk Load Performance
• Bulk loading is now used when creating or rebuilding indexes
• Much faster INDEX creation:
  • 2-3x performance improvement for ADD/CREATE INDEX operations
  • 2-5% improvement for standard INSERT operations
20. InnoDB Page level Compression
• Not the same as Table Compression (5.1 feature)
• Reduces IO for better performance
• The compressed data is written to disk, where the hole punching mechanism then releases empty blocks from the end of the page.
• If compression fails, data is written out as-is.
• Supports Zlib and LZ4 compression
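A minimal sketch of enabling page-level compression. The table name page_comp_demo is hypothetical. The feature assumes a file-per-table tablespace and an operating system and file system that support sparse files and hole punching.

```sql
-- Create a table whose pages are compressed with zlib
CREATE TABLE page_comp_demo (
  id INT PRIMARY KEY,
  payload TEXT
) COMPRESSION="zlib";

-- Switch the algorithm; this only affects pages written from now on
ALTER TABLE page_comp_demo COMPRESSION="lz4";

-- Rebuild the table so that existing pages use the new setting too
OPTIMIZE TABLE page_comp_demo;

-- Turn page compression off again
ALTER TABLE page_comp_demo COMPRESSION="None";
```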
23. The JSON Native Datatype
• New native data type: JSON
• Supports all JSON internal types
• Numbers, strings, bool
• Objects, JSON arrays
• Also supports extended data types: date, time, datetime, and timestamp
• Efficient access: optimized for read intensive workload
• Performance: fast access to array cells by creating indexes
• MySQL Document Store (5.7.12)
24. Why Not Just Use TEXT/Varchar?
• Document validation: parse and validation on insert
• Efficient binary format
  • Allows quicker access to object members and array elements
  • Binary format of JSON type is very efficient at searching
  • Storing as TEXT performs over 10x worse at traversal
• Built-in handling functions
25. Built-in JSON Functions
• 5.7 has built-in functions to CREATE, SEARCH, MODIFY and RETURN JSON documents and JSON values
• For a complete list of the JSON functions:
https://dev.mysql.com/doc/refman/5.7/en/json-function-reference.html
26. Create JSON_OBJECT and JSON_ARRAY
• Use JSON_OBJECT to create JSON from tables:
• JSON_ARRAY is a function that generates a JSON array from a list of values
mysql> select JSON_OBJECT('id', id, 'firstName', first_name, 'lastName',
last_name) from employees_old;
+-----------------------------------------------------------------------+
| JSON_OBJECT('id', id, 'firstName', first_name, 'lastName', last_name) |
+-----------------------------------------------------------------------+
| {"id": 1, "lastName": "Elkayam", "firstName": "Tamar"}                |
| {"id": 2, "lastName": "Elkayam", "firstName": "Efrat"}                |
| {"id": 3, "lastName": "Elkayam", "firstName": "Zohar"}                |
| {"id": 4, "lastName": "Elkayam", "firstName": "Ido"}                  |
+-----------------------------------------------------------------------+
4 rows in set (0.00 sec)
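The slide shows JSON_OBJECT only, so here is a small companion sketch for JSON_ARRAY. The second query reuses the employees_old table from the example above and nests an array inside an object.

```sql
-- Build a JSON array from a list of values
SELECT JSON_ARRAY(1, 'abc', NULL, TRUE);
-- [1, "abc", null, true]

-- The two functions can be nested to build richer documents per row
SELECT JSON_OBJECT('id', id,
                   'names', JSON_ARRAY(first_name, last_name)) AS doc
FROM employees_old;
```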
27. Extracting a Value: JSON_EXTRACT
• JSON_EXTRACT: extracts data from a JSON document
  • Uses JSON Path
  • Supports two shorthand operators:
JSON_EXTRACT(column_name, '$.type')
column_name->"$.type" (extract)
column_name->>"$.type" (extract + unquote)
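A minimal sketch of the three forms. The user variable and the items table are hypothetical. Note that the -> and ->> shorthands work on columns only, so the variable examples use the function form.

```sql
SET @doc = '{"type": "book", "tags": ["mysql", "json"]}';

SELECT JSON_EXTRACT(@doc, '$.type');                -- "book" (still a JSON string)
SELECT JSON_EXTRACT(@doc, '$.tags[0]');             -- "mysql"
SELECT JSON_UNQUOTE(JSON_EXTRACT(@doc, '$.type'));  -- book (plain string)

-- Column shorthands, assuming a hypothetical table items(data JSON)
CREATE TABLE items (data JSON);
INSERT INTO items VALUES (@doc);

SELECT data->"$.type"  AS quoted,     -- same as JSON_EXTRACT(data, '$.type')
       data->>"$.type" AS unquoted    -- same as JSON_UNQUOTE(JSON_EXTRACT(...))
FROM items;
```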
28. Searching In JSON
• Use JSON_SEARCH to find paths with certain data
• Use JSON_CONTAINS and JSON_CONTAINS_PATH to know if data exists in the document
• Use the extract shorthand syntax to locate the rows containing data
JSON_SEARCH(column_name, 'one'|'all', 'value')
WHERE column_name->"$.type" = 'value'
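The search functions can be tried without a table. The documents below are illustrative values held in user variables.

```sql
SET @j = '{"a": 1, "b": 2, "c": {"d": 4}}';

-- Does the value 1 exist at path $.a?
SELECT JSON_CONTAINS(@j, '1', '$.a');                -- 1

-- 'one': at least one path exists; 'all': every path must exist
SELECT JSON_CONTAINS_PATH(@j, 'one', '$.a', '$.e');  -- 1
SELECT JSON_CONTAINS_PATH(@j, 'all', '$.a', '$.e');  -- 0

SET @k = '["abc", [{"k": "10"}, "def"], {"x": "abc"}, {"y": "bcd"}]';

-- Return the path of the first match, or of all matches
SELECT JSON_SEARCH(@k, 'one', 'abc');                -- "$[0]"
SELECT JSON_SEARCH(@k, 'all', 'abc');                -- ["$[0]", "$[2].x"]
```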
29. Modifying JSON Documents
• We can modify existing JSON with various functions
  • JSON_ARRAY_APPEND (named JSON_APPEND before 5.7.9): appends values to the end of arrays within a JSON document
  • JSON_INSERT/JSON_ARRAY_INSERT: inserts data into a JSON document / into an array within it
  • JSON_MERGE: merges JSON documents
  • JSON_REPLACE: replaces existing values at a path
  • JSON_SET: replaces the value at a path, or adds it if it does not exist
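The difference between JSON_SET, JSON_INSERT and JSON_REPLACE is easiest to see side by side. This sketch uses an illustrative document in a user variable.

```sql
SET @j = '{"a": 1, "b": [2, 3]}';

-- JSON_SET: replaces existing paths and adds missing ones
SELECT JSON_SET(@j, '$.a', 10, '$.c', 'new');      -- {"a": 10, "b": [2, 3], "c": "new"}

-- JSON_INSERT: only adds missing paths, existing values are kept
SELECT JSON_INSERT(@j, '$.a', 10, '$.c', 'new');   -- {"a": 1, "b": [2, 3], "c": "new"}

-- JSON_REPLACE: only replaces existing paths, missing ones are ignored
SELECT JSON_REPLACE(@j, '$.a', 10, '$.c', 'new');  -- {"a": 10, "b": [2, 3]}

-- JSON_ARRAY_APPEND: append a value to the array found at a path
SELECT JSON_ARRAY_APPEND(@j, '$.b', 4);            -- {"a": 1, "b": [2, 3, 4]}

-- JSON_MERGE: concatenate two arrays
SELECT JSON_MERGE('[1, 2]', '[true, false]');      -- [1, 2, true, false]
```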
30. Performance: Functional Indexes with JSON
• Indexing of Documents using Generated Columns
• InnoDB supports indexes on both STORED and VIRTUAL (default) Generated Columns
• New expression analyzer automatically uses the best "functional" index available
ALTER TABLE employees ADD id NUMERIC AS (data->>"$.id");
CREATE INDEX emp_id ON employees(id);
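An end-to-end sketch of the same idea, assuming a hypothetical employees_json table: a virtual generated column exposes one JSON member, and a secondary index on it lets lookups avoid a full scan.

```sql
-- Hypothetical table: the JSON document plus an indexed generated column
CREATE TABLE employees_json (
  data   JSON,
  emp_id INT AS (data->>"$.id") VIRTUAL,  -- computed from the document
  KEY idx_emp_id (emp_id)                 -- secondary index on the virtual column
);

INSERT INTO employees_json (data) VALUES
  ('{"id": 1, "firstName": "Tamar"}'),
  ('{"id": 2, "firstName": "Efrat"}');

-- The plan should list idx_emp_id as the key used for this lookup
EXPLAIN SELECT data FROM employees_json WHERE emp_id = 2;
```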
32. Performance Schema
• Performance monitoring schema
• Includes various tables to help us with instrumentation and performance monitoring
  • Memory Instrumentation
  • Statement Instrumentation
  • Transactions and Locks
  • Additional Information: replication, stored routines, etc.
33. What's New in Performance Schema
• User variables, status variables (session/global)
• Reduced footprint, memory usage and overhead
• Total of 35 new tables: 5.6 had 52, 5.7 has 87 (with 983 instruments!)
• But using the performance schema might be a bit too complicated...
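A few direct Performance Schema queries, as a sketch of what the raw tables look like. It assumes the Performance Schema is enabled (the default in 5.7); the variable name @demo_var is made up.

```sql
-- Top statements by total time (timer values are in picoseconds)
SELECT digest_text, count_star, sum_timer_wait
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 5;

-- New in 5.7: user-defined variables are visible per thread
SET @demo_var = 42;
SELECT thread_id, variable_name, variable_value
FROM performance_schema.user_variables_by_thread;

-- New in 5.7: status variables as tables (session and global scope)
SELECT * FROM performance_schema.session_status
WHERE variable_name LIKE 'Bytes%';
SELECT * FROM performance_schema.global_status
WHERE variable_name = 'Threads_connected';
```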
34. The SYS Schema
• A set of objects that helps DBAs and developers interpret data collected by the Performance Schema
• Provides helper objects that answer common performance, health, usage, and monitoring questions
• Introduced in 5.6 as part of an extension to MySQL (available at GitHub: https://github.com/mysql/mysql-sys)
• MySQL 5.7.7+ includes the SYS schema by default
35. MySQL SYS Summary Views
• Reference set of views solving various administrator use cases
• Simple views, you can create/copy your own, sys is not "locked down"
• Built upon both performance_schema and INFORMATION_SCHEMA
• Both formatted and raw views are available
• All raw views are prefixed with x$
• Raw views are there for tools to poll
• Formatted views are for humans and the command line
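To make the formatted/raw distinction concrete, the sketch below queries the same sys view in both forms, plus two other views that tend to be useful. It assumes MySQL 5.7.7 or later, where sys is installed by default.

```sql
-- Formatted view: latencies are rendered as human-readable strings
SELECT query, exec_count, total_latency
FROM sys.statement_analysis
LIMIT 5;

-- Raw view (x$ prefix): the same data as plain numbers in picoseconds,
-- which is easier for tools to sort and aggregate
SELECT query, exec_count, total_latency
FROM sys.x$statement_analysis
ORDER BY total_latency DESC
LIMIT 5;

-- Other reference views
SELECT * FROM sys.schema_unused_indexes;  -- indexes with no recorded use
SELECT * FROM sys.innodb_lock_waits;      -- who is blocking whom right now
```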
36. What Data Can We See In Summary Views?
• SYS schema comes with multiple dimensions for summary views:
  • User/Host summary views
  • IO summary views
  • Schema analysis
  • Wait and Lock wait analysis
  • Statement analysis
  • Memory views
37. Formatter and Helper Functions
• SYS Schema also provides formatter and helper functions:
• Make output human readable
• Format time appropriately
• Format bytes appropriately
• Truncate output for large width values for CLI
• Extract object names
• Check instrumentation state
• Performance Schema Config Helper Procedures
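A quick tour of some of these helpers, as a sketch. The input values and the file path are made up for illustration; the routines themselves ship with the sys schema.

```sql
-- Format a byte count and a time value (format_time expects picoseconds)
SELECT sys.format_bytes(1048576);
SELECT sys.format_time(3501000000);

-- Truncate a long statement for narrow terminals
SELECT sys.format_statement(
  'SELECT first_name, last_name, hire_date, salary FROM employees WHERE id = 1');

-- Extract the schema name from a data file path (hypothetical path)
SELECT sys.extract_schema_from_file_name('/var/lib/mysql/employees/employee.ibd');

-- Check whether an instrument is enabled by default
SELECT sys.ps_is_instrument_default_enabled('statement/sql/select');

-- Config helper procedure: list the currently enabled Performance Schema
-- configuration, including instruments and threads
CALL sys.ps_setup_show_enabled(TRUE, TRUE);
```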
41. Generated Columns (5.7.5)
• Virtual columns: functional columns calculated on the fly, or stored results
• Generated columns can use JSON extracts
• Support indexes (functional indexes)
• Example by Percona:
https://www.percona.com/blog/2015/04/29/generated-virtual-columns-in-mysql-5-7-labs/
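A small sketch showing both kinds of generated column side by side, using a hypothetical triangle table. VIRTUAL columns are computed when rows are read; STORED columns are computed on write and take up storage.

```sql
CREATE TABLE triangle (
  sidea DOUBLE,
  sideb DOUBLE,
  -- hypotenuse, computed on the fly whenever the row is read
  sidec DOUBLE AS (SQRT(sidea * sidea + sideb * sideb)) VIRTUAL,
  -- area, computed on insert/update and stored with the row
  area  DOUBLE AS (sidea * sideb / 2) STORED
);

-- Only the base columns are supplied; generated columns fill themselves in
INSERT INTO triangle (sidea, sideb) VALUES (3, 4), (6, 8);

SELECT sidea, sideb, sidec, area FROM triangle;
-- sidec is 5 and 10, area is 6 and 24
```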
42. Server Side Timeouts (5.7.5)
• Interrupts the execution of a statement when it runs longer than a set limit (in milliseconds)
  • Defined globally for the server, per session, or for individual SELECT statements
  • The variable was named MAX_STATEMENT_TIME in early 5.7 releases and was renamed MAX_EXECUTION_TIME in 5.7.8
• Example
SET GLOBAL MAX_EXECUTION_TIME=1000;
SET SESSION MAX_EXECUTION_TIME=2000;
SELECT /*+ MAX_EXECUTION_TIME(1000) */ * FROM my_table;
43. What Are We Waiting For In MySQL 8.0?
• Improved UTF-8 support: the default will be utf8mb4 instead of latin1
• Transactional data dictionary
• Common Table Expressions (CTE)
• Invisible mode for indexes and other object types
44. What Are We Waiting For In MySQL 8.0? (cont.)
• Performance changes
  • Cost-based optimizer statistics
  • Remove buffer pool mutexes
  • Redo log improvements for scalability
  • Memcached improvements to support multiple get and range scan
• Bug fixes: for example, persistent auto increment (bug #199, created on 27 March 2003)
45. What Features Did We NOT Talk About?
• MySQL Document Store (The X Plugin and X Protocol)
• Security Improvements
• GIS Improvements
• Replication Improvements
• More (200+ new features!): see http://www.thecompletelistoffeatures.com/ for more details!
47. Summary
• Performance, Optimizer and other InnoDB features are really important
• Use JSON and other document store features to do more with MySQL
• Learn how to use SYS Schema to leverage internal knowledge to your advantage
UNION ALL queries no longer use temporary tables
Improved optimizations for queries with IN expressions
Improved optimizations for full-text queries
More efficient sorting
As you may know, MySQL supports a plugin API that enables creation of server components. Plugins can be loaded at server startup, or loaded and unloaded at runtime without restarting the server.
In 5.7, we provide you two rewrite APIs so that you can write your own plugin.
Pre parse API: interface is query text, you rewrite queries by replacing text with other text
Post parse API: you have to modify the parse tree. May not be as difficult as it sounds. We offer a parser service through the plugin API. It contains methods to invoke parsing, get a normalized query text from parser tree, walk the parse tree.
In addition to providing interfaces for writing your own plugin, we also provide a production quality query rewrite plugin which rewrites queries without the need to make application changes. You can use this plugin to add hints, modify join order and more.
Our query rewrite plugin uses the post parse plug-in interface, because it gives next to zero performance overhead. I will explain how this works.
The main point of choosing post parsing is that digest is computed during parsing, and we use the digest to match incoming query with rules.
Insert/Update/Delete workload: (2-4x performance gain)
insert workload: insert n entries: sorted, n entries: reverse sorted, n entries: random (total 3n entries)
delete workload: delete n entries: first initiate insert workload to load table, delete n entries using primary key, delete n entries using secondary index, delete n entries using primary key such that complete table is empty.
update workload: update n entries: first initiate insert workload to load table, update n entries using primary key, update n entries using secondary index, update all 3n entries 2 times. (no explicit key specified).
In the past, partitioning required the use of the ha_partition handler, which requires one handler object per partition instead of one per table.
Partitioned tables are also easier to move using the Transportable Tablespace feature.
Before "native partitions" (5.7.6), each partition consumed a chunk of memory. InnoDB native partitioning removes that overhead and reduces memory requirements by up to 90%.
ALTER TABLE ... UPGRADE PARTITIONING - locks the table!
As of 5.7.4, OPTIMIZE TABLE uses online DDL (ALGORITHM=INPLACE) for both regular and partitioned InnoDB tables. The table rebuild, triggered by OPTIMIZE TABLE and performed under the cover by ALTER TABLE ... FORCE, is now performed using online DDL (ALGORITHM=INPLACE) and only locks the table for a brief interval, which reduces downtime for concurrent DML operations.