Character encoding
Breaking and unbreaking your data
Maciej Dobrzanski
maciek@psce.com | @mushupl
Brussels, 1 Feb 2015
01.02.2015 Follow us on Twitter @dbasquare www.psce.com
Character Encoding
• Binary representation of glyphs
• Each character can be represented by 1 or more bytes
• Popular schemes
• ASCII
• Unicode
• UTF-8, UTF-16, UTF-32
• Language specific character sets
• US (Latin US)
• Europe (Latin 1, Latin 2)
• Asia (EUC-KR, GB18030)
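To make "1 or more bytes per character" concrete, here is a small sketch in Python (not MySQL) that counts the bytes the same characters need under the three Unicode encodings listed above. The sample characters are arbitrary choices for illustration.

```python
# Byte length of the same characters under different Unicode encodings.
samples = ["A", "ó", "東"]

def byte_lengths(ch):
    # The "-le" variants are used so that no byte-order mark is prepended.
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}

for ch in samples:
    print(ch, byte_lengths(ch))
```

UTF-8 grows from one to three bytes across these samples, while UTF-32 always spends four.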
Character Encoding
• Character set defines the visual interpretation of binary information
• One glyph can be associated with several numeric codes
• One numeric code may be used to represent several different glyphs
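Both directions of that ambiguity can be sketched in a few lines of Python. The byte 0xF1 and the letter 'ń' are illustrative picks, and Python's iso8859_1 / iso8859_2 codecs are used here as approximations of what MySQL calls latin1 and latin2.

```python
# One numeric code, several glyphs: the same byte read under two character sets.
code = bytes([0xF1])
as_latin1 = code.decode("iso8859_1")  # Western European reading
as_latin2 = code.decode("iso8859_2")  # Central European reading

# One glyph, several numeric codes: the same letter under two encodings.
n_acute_codes = {
    "iso8859_2": "ń".encode("iso8859_2"),
    "utf-8": "ń".encode("utf-8"),
}

print(as_latin1, as_latin2)
print({name: raw.hex() for name, raw in n_acute_codes.items()})
```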
Please state the nature of the emergency
• Application configuration
• Database configuration
• Table/column definitions
Problem #1: We are all born Swedish
• MySQL uses latin1 by default
• MySQL 5.7 too
• Is anyone actually aware of that?
• Why Swedish?
• latin1_swedish_ci is the default collation
Problem #1
• Let’s build an application
mysql> SELECT @@global.character_set_server, @@session.character_set_client;
+-------------------------------+--------------------------------+
| @@global.character_set_server | @@session.character_set_client |
+-------------------------------+--------------------------------+
| latin1 | latin1 |
+-------------------------------+--------------------------------+
1 row in set (0.00 sec)
mysql> CREATE SCHEMA fosdem;
Query OK, 1 row affected (0.00 sec)
mysql> USE fosdem;
mysql> CREATE TABLE locations (city VARCHAR(30) NOT NULL);
Query OK, 0 rows affected (0.15 sec)
mysql> SHOW CREATE TABLE locations\G
*************************** 1. row ***************************
Table: locations
Create Table: CREATE TABLE `locations` (
`city` varchar(30) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
Problem #1
• Everything is correct… NOT!
mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from locations;
+--------------------+
| city |
+--------------------+
| Berlin |
| Kraków |
| 東京都 |
+--------------------+
3 rows in set (0.00 sec)
mysql> SET NAMES latin1;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from locations;
+-----------+
| city |
+-----------+
| Berlin |
| Kraków |
| 東京都 |
+-----------+
3 rows in set (0.00 sec)
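Why the declared connection character set changes what you see can be sketched outside MySQL. The Python below assumes the scenario this slide implies: the application sent UTF-8 bytes over a connection declared as latin1, so the server stored them byte for byte in a latin1 column. The function names are hypothetical, and Python's latin-1 codec only approximates MySQL's latin1 (which is closer to cp1252).

```python
def stored_bytes(text):
    # The application encodes in UTF-8, but the connection is declared latin1,
    # so the server stores the bytes unchanged in the latin1 column.
    return text.encode("utf-8")

def read_with_names_utf8(raw):
    # SET NAMES utf8: the server transcodes latin1 -> utf8, treating every
    # stored byte as a separate latin1 character.
    return raw.decode("latin-1")

def read_with_names_latin1(raw):
    # SET NAMES latin1: no transcoding; a UTF-8 terminal decodes the raw bytes.
    return raw.decode("utf-8")

raw = stored_bytes("Kraków")
print(read_with_names_utf8(raw))    # KrakÃ³w
print(read_with_names_latin1(raw))  # Kraków
```

Under these assumptions the "correct" utf8 connection is the one that shows garbage, which is the sign that the stored data is mislabelled.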
Problem #1
• Let’s fix this
• Or can we ignore it?
• Ruby may not like it
# grep character-set-server /etc/mysql/my.cnf
character-set-server = utf8
mysql> SELECT @@global.character_set_server, @@session.character_set_client;
+-------------------------------+--------------------------------+
| @@global.character_set_server | @@session.character_set_client |
+-------------------------------+--------------------------------+
| utf8 | utf8 |
+-------------------------------+--------------------------------+
1 row in set (0.00 sec)
...we are fixing our tables here...
mysql> SHOW CREATE TABLE locations\G
*************************** 1. row ***************************
Table: locations
Create Table: CREATE TABLE `locations` (
`city` varchar(30) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
Problem #1: The good news
• It’s usually fixable
Problem #2: Settings, defaults, inheritance
• Where do you set character sets in MySQL?
• Session settings
• character_set_server
• character_set_client
• character_set_connection
• character_set_database
• character_set_results
• Schema level defaults
• Table level defaults
• Column charsets
Problem #2
• Having fixed our problem #1, we continue to develop our application
mysql> SELECT @@session.character_set_server, @@session.character_set_client;
+--------------------------------+--------------------------------+
| @@session.character_set_server | @@session.character_set_client |
+--------------------------------+--------------------------------+
| utf8 | utf8 |
+--------------------------------+--------------------------------+
1 row in set (0.00 sec)
mysql> USE fosdem;
mysql> CREATE TABLE people (first_name VARCHAR(30) NOT NULL, last_name VARCHAR(30) NOT NULL);
Query OK, 0 rows affected (0.13 sec)
Problem #2
• Why is the table character set latin1?
mysql> SELECT @@session.character_set_server, @@session.character_set_client;
+--------------------------------+--------------------------------+
| @@session.character_set_server | @@session.character_set_client |
+--------------------------------+--------------------------------+
| utf8 | utf8 |
+--------------------------------+--------------------------------+
1 row in set (0.00 sec)
mysql> USE fosdem;
mysql> SHOW CREATE TABLE people\G
*************************** 1. row ***************************
Table: people
Create Table: CREATE TABLE `people` (
`first_name` varchar(30) NOT NULL,
`last_name` varchar(30) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
Problem #2
• What’s all this, then?
mysql> SHOW SESSION VARIABLES LIKE 'character_set_%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> SHOW CREATE DATABASE fosdem\G
*************************** 1. row ***************************
Database: fosdem
Create Database: CREATE DATABASE `fosdem` /*!40100 DEFAULT CHARACTER SET latin1 */
1 row in set (0.00 sec)
Problem #2
• Can we fix this?
mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT last_name, HEX(last_name) FROM people;
+------------+----------------------+
| last_name | HEX(last_name) |
+------------+----------------------+
| Lemon | 4C656D6F6E |
| Müller | 4DFC6C6C6572 |
| Dobrza?ski | 446F62727A613F736B69 |
+------------+----------------------+
3 rows in set (0.00 sec)
mysql> SET NAMES latin2;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT last_name, HEX(last_name) FROM people;
+------------+----------------------+
| last_name | HEX(last_name) |
+------------+----------------------+
| Lemon | 4C656D6F6E |
| Müller | 4DFC6C6C6572 |
| Dobrza?ski | 446F62727A613F736B69 |
+------------+----------------------+
3 rows in set (0.00 sec)
• We can’t! :-(
• 0x3F is '?', so my 'ń' was lost
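The loss can be reproduced byte for byte in Python. This sketch assumes the server behaved like a lossy encoder: a character with no latin1 equivalent is replaced by '?'. The function name is hypothetical, and Python's latin-1 codec stands in for MySQL's latin1.

```python
def store_in_latin1_column(text):
    # Characters that latin1 cannot represent are replaced by '?' (0x3F),
    # which is an irreversible substitution.
    return text.encode("latin-1", errors="replace")

for name in ("Lemon", "Müller", "Dobrzański"):
    raw = store_in_latin1_column(name)
    print(name, raw.hex().upper())
```

The hex strings match the HEX(last_name) column above: 'ü' survives as 0xFC because latin1 has it, while 'ń' collapses into 0x3F and no later SET NAMES can bring it back.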
Problem #2: The bad news
• It may not be enough to configure the server correctly
• A mismatch between client and server can permanently break data
• Implicit conversion inside MySQL server
Problem #2: Settings, defaults, inheritance
• Where do you set character sets in MySQL?
• Session settings
• character_set_server
• character_set_client
• character_set_connection
• character_set_database
• character_set_results
• Schema level defaults – affect new tables
• Table level defaults – affect new columns
• Column charsets
Problem #2: Settings, defaults, inheritance
master [localhost] {msandbox} ((none)) > SELECT @@global.character_set_server, @@session.character_set_client;
+-------------------------------+--------------------------------+
| @@global.character_set_server | @@session.character_set_client |
+-------------------------------+--------------------------------+
| latin1 | utf8 |
+-------------------------------+--------------------------------+
1 row in set (0.00 sec)
master [localhost] {msandbox} ((none)) > CREATE SCHEMA fosdem\G
Query OK, 1 row affected (0.00 sec)
master [localhost] {msandbox} ((none)) > SHOW CREATE SCHEMA fosdem\G
*************************** 1. row ***************************
Database: fosdem
Create Database: CREATE DATABASE `fosdem` /*!40100 DEFAULT CHARACTER SET latin1 */
1 row in set (0.00 sec)
Problem #2: Settings, defaults, inheritance
master [localhost] {msandbox} ((none)) > USE fosdem;
Database changed
master [localhost] {msandbox} (fosdem) > CREATE TABLE test (a VARCHAR(300), INDEX (a));
Query OK, 0 rows affected (0.62 sec)
master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`a` varchar(300) DEFAULT NULL,
KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
Problem #2: Settings, defaults, inheritance
master [localhost] {msandbox} (fosdem) > ALTER TABLE test DEFAULT CHARSET = utf8;
Query OK, 0 rows affected (0.08 sec)
Records: 0 Duplicates: 0 Warnings: 0
master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`a` varchar(300) CHARACTER SET latin1 DEFAULT NULL,
KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
Problem #2: Settings, defaults, inheritance
master [localhost] {msandbox} (fosdem) > ALTER TABLE test ADD b VARCHAR(10);
Query OK, 0 rows affected (0.74 sec)
Records: 0 Duplicates: 0 Warnings: 0
master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`a` varchar(300) CHARACTER SET latin1 DEFAULT NULL,
`b` varchar(10) DEFAULT NULL,
KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
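The inheritance seen in this session can be summarised with a toy model. The Python below is not how MySQL is implemented; the Schema and Table classes and their methods are hypothetical names used only to mirror the observed behaviour: defaults are copied downwards at creation time and never re-evaluated afterwards.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Table:
    default_charset: str
    # Column name -> character set, fixed when the column is created.
    columns: Dict[str, str] = field(default_factory=dict)

    def add_column(self, name: str, charset: Optional[str] = None) -> None:
        # A new column takes the table default unless one is given explicitly.
        self.columns[name] = charset or self.default_charset

    def alter_default_charset(self, charset: str) -> None:
        # Only the default changes; existing columns keep their character set.
        self.default_charset = charset

@dataclass
class Schema:
    default_charset: str

    def create_table(self, charset: Optional[str] = None) -> Table:
        # A new table takes the schema default unless one is given explicitly.
        return Table(charset or self.default_charset)

server_charset = "latin1"
schema = Schema(server_charset)     # CREATE SCHEMA fosdem
test = schema.create_table()        # CREATE TABLE test (...)
test.add_column("a")
test.alter_default_charset("utf8")  # ALTER TABLE test DEFAULT CHARSET = utf8
test.add_column("b")                # ALTER TABLE test ADD b VARCHAR(10)
print(test.columns)                 # {'a': 'latin1', 'b': 'utf8'}
```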
I f**ckd up. What do I do?
• Let’s start with what you shouldn’t do
• Keep calm and don’t start by changing something
• Analyze the situation
• Why did the problem occur in the first place?
• Reassess the damage
• Is it consistent?
• Are all rows broken in the same way?
• Are some rows bad, but others are okay?
• Are all bad in several different ways?
• Is it actually repairable?
• Yes, if no character mapping occurred during writes (e.g. UTF-8 data sent over a latin1 connection into latin1 columns)
I f**ckd up. What else shouldn’t I do, then?
• Do not rush things as you may easily go from bad to worse
• Do not start fixing this on a replication slave
• You can’t fix this by fixing tables one by one on a live database
• Unless you really have everything in one table
• Do not use: ALTER TABLE … DEFAULT CHARSET = …
• It only changes the default character set for new columns
• Do not use: ALTER TABLE … CONVERT TO CHARACTER SET …
• It’s not for fixing broken encoding
• Do not use: ALTER TABLE … MODIFY col_name … CHARACTER SET …
I f**ckd up. So how do I fix it?
• What needs to be fixed?
• Schema default character set
• ALTER SCHEMA fosdem DEFAULT CHARSET = utf8
• Tables with text columns: CHAR, VARCHAR, TEXT, TINYTEXT, LONGTEXT
• What about ENUM?
• Use INFORMATION_SCHEMA to grab a list
• What about other tables?
• They too (eventually), but it’s not critical
SELECT CONCAT(c.table_schema, '.', c.table_name) AS candidate_table
FROM information_schema.columns c
WHERE c.table_schema = 'fosdem'
AND c.column_type REGEXP '^(.*CHAR|.*TEXT|ENUM)(\\(.+\\))?$'
GROUP BY candidate_table;
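To check which column types such a filter would pick up, the pattern can be tried in Python first. This is a sketch: Python's re module stands in for MySQL's REGEXP, the optional suffix is assumed to mean a parenthesised length or value list, and is_candidate is a hypothetical helper name.

```python
import re

# Case-insensitive, because information_schema reports types in lower case.
TEXT_TYPE = re.compile(r"^(.*CHAR|.*TEXT|ENUM)(\(.+\))?$", re.IGNORECASE)

def is_candidate(column_type):
    # True when the column type carries a character set worth checking.
    return TEXT_TYPE.match(column_type) is not None

for column_type in ("varchar(30)", "longtext", "enum('a','b')", "int(11)", "varbinary(255)"):
    print(column_type, is_candidate(column_type))
```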
I f**ckd up. So how do I fix it?
• Option 1 – requires downtime
• Dump and restore
• Dump the data preserving the bad configuration and drop the old database
bash# mysqldump -u root -p --skip-set-charset --default-character-set=latin1 fosdem > fosdem.sql
mysql> DROP SCHEMA fosdem;
• Correct table definitions in the dump file
• Edit DEFAULT CHARSET in all CREATE TABLE statements
• Create the database again and import the data back
mysql> CREATE SCHEMA fosdem DEFAULT CHARSET utf8;
bash# mysql -u root -p --default-character-set=utf8 fosdem < fosdem.sql
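The "correct table definitions" step is usually done with a text editor or sed; purely as an illustration, it could also be scripted. The sketch below is an assumption about what the edit looks like, not the author's tooling: fix_table_charsets is a hypothetical helper, it also rewrites column-level CHARACTER SET clauses (an extension beyond the slide), it assumes each INSERT statement sits on a single line, and it ignores COLLATE clauses.

```python
import re

def fix_table_charsets(dump_sql, old="latin1", new="utf8"):
    # Rewrite charset declarations in DDL lines only; INSERT lines are data
    # and must pass through untouched.
    pattern = re.compile(rf"\b(DEFAULT CHARSET=|CHARACTER SET ){old}\b")
    fixed = []
    for line in dump_sql.splitlines():
        if not line.startswith("INSERT INTO"):
            line = pattern.sub(rf"\g<1>{new}", line)
        fixed.append(line)
    return "\n".join(fixed)

sample = "\n".join([
    "CREATE TABLE `people` (",
    "  `last_name` varchar(30) CHARACTER SET latin1 NOT NULL",
    ") ENGINE=InnoDB DEFAULT CHARSET=latin1;",
    "INSERT INTO `people` VALUES ('DEFAULT CHARSET=latin1');",
])
print(fix_table_charsets(sample))
```

Always run something like this on a copy of the dump and diff the result before importing.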
I f**ckd up. So how do I fix it?
• Option 2 – requires downtime
• Perform a two step conversion with ALTER TABLE
• Original encoding -> VARBINARY/BLOB -> Target encoding
• Conversion from/to BINARY/BLOB removes character set context
• How?
• Stop applications
• On each table, for each text column perform:
ALTER TABLE tbl MODIFY col_name VARBINARY(255);
ALTER TABLE tbl MODIFY col_name VARCHAR(255) CHARACTER SET utf8;
• You may specify multiple columns per ALTER TABLE
• Fix the problems (application and/or db configs)
• Restart applications
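Why the detour through a binary type matters can be sketched at the byte level. In the Python below, a column is modelled as raw bytes plus a character set label; both function names are hypothetical, and the codecs only approximate MySQL's latin1 and utf8. A direct conversion preserves the (wrong) characters, while the binary detour preserves the bytes and only swaps the label.

```python
def convert_charset(raw, old, new):
    # Character-preserving conversion: interpret the bytes with the old label,
    # then re-encode the same characters in the new character set.
    return raw.decode(old).encode(new)

def relabel_via_binary(raw, new):
    # Step 1 (text -> VARBINARY) drops the label and leaves the bytes alone.
    # Step 2 (VARBINARY -> text) attaches the new label; the bytes must already
    # be valid in it, which this decode call checks (it raises otherwise).
    raw.decode(new)
    return raw

raw = "Kraków".encode("utf-8")  # UTF-8 bytes sitting in a latin1 column

direct = convert_charset(raw, "latin-1", "utf-8")
two_step = relabel_via_binary(raw, "utf-8")

print(direct.decode("utf-8"))    # KrakÃ³w
print(two_step.decode("utf-8"))  # Kraków
```

The direct path bakes the mojibake into the new column; the two-step path recovers the original text.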
I f**ckd up. So how do I fix it?
• Option 3 – online character set fix; no downtime*
• Thanks to our plugin for pt-online-schema-change
• and a tiny patch for pt-online-schema-change that goes with the plugin
• How?
• Start pt-online-schema-change on all tables – one by one
• Do not rotate tables (--no-swap-tables) or drop pt-osc triggers
• Wait until all tables have been converted
• Stop applications
• Fix the problems (application and/or db configs)
• Rotate tables – takes just 1 minute
• Restart applications
• Et voilà
GOTCHAs!
• Data space requirements may change during conversion
• latin1 uses 1 byte per character; utf8 has to reserve up to 3 bytes per character
• VARCHAR/TEXT fit up to 64KB – it won’t fit 65536 multi-byte characters
• Key length limit is 767 bytes
• Data type and/or index length changes may be required
• Test and plan this ahead
• There may be more problems than you think
• Detect irrecoverable problems with a simple stored function
CREATE FUNCTION `cnv_test_conversion` (`value_before` LONGTEXT, `value_after` LONGTEXT) RETURNS tinyint(1)
BEGIN
RETURN (IFNULL(CONVERT(CONVERT(`value_before` USING latin1) USING binary), "") =
IFNULL(CONVERT(`value_after` USING binary), ""));
END;;
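One reading of the function above, sketched in Python: the conversion of a value is lossless when the bytes of the old value under latin1 equal the bytes of the new value under its own character set (utf8 is assumed here). The name conversion_ok is hypothetical and the codecs only approximate MySQL's.

```python
def conversion_ok(value_before, value_after):
    # NULLs are treated as empty strings, mirroring the IFNULL(..., "") calls.
    before = b"" if value_before is None else value_before.encode("latin-1", errors="replace")
    after = b"" if value_after is None else value_after.encode("utf-8")
    return before == after

# Mislabelled but intact data: the latin1 bytes are exactly the UTF-8 bytes.
print(conversion_ok("Krak\u00c3\u00b3w", "Kraków"))  # True
# Data already degraded to '?': the bytes no longer match.
print(conversion_ok("Dobrza?ski", "Dobrzański"))     # False
```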
GOTCHAs!
master [localhost] {msandbox} (fosdem) > ALTER TABLE test MODIFY a VARCHAR(300) CHARACTER SET utf8;
Query OK, 0 rows affected, 1 warning (1.23 sec)
Records: 0 Duplicates: 0 Warnings: 1
master [localhost] {msandbox} (fosdem) > SHOW WARNINGS\G
*************************** 1. row ***************************
Level: Warning
Code: 1071
Message: Specified key was too long; max key length is 767 bytes
1 row in set (0.00 sec)
master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`a` varchar(300) DEFAULT NULL,
`b` varchar(10) DEFAULT NULL,
KEY `a` (`a`(255))
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
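The index prefix of 255 in the output above follows from simple arithmetic, sketched here in Python. The 767-byte limit and the bytes-per-character figures come from the slides; the helper names are hypothetical.

```python
MAX_KEY_BYTES = 767
BYTES_PER_CHAR = {"latin1": 1, "utf8": 3}

def index_fits(length, charset):
    # An index on the full column must reserve the worst-case byte count.
    return length * BYTES_PER_CHAR[charset] <= MAX_KEY_BYTES

def max_indexable_chars(charset):
    # Longest prefix, in characters, that still fits under the key limit.
    return MAX_KEY_BYTES // BYTES_PER_CHAR[charset]

print(index_fits(300, "latin1"))    # True
print(index_fits(300, "utf8"))      # False
print(max_indexable_chars("utf8"))  # 255
```

VARCHAR(300) needs up to 900 bytes in utf8, so the server silently shortened the key to the first 255 characters.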
How to do it right?
• Set character-set-server during initial configuration
• When creating new schemas, always specify the desired charset
• CREATE SCHEMA fosdem DEFAULT CHARSET = utf8
• ALTER SCHEMA fosdem DEFAULT CHARSET = utf8
• When creating new tables, also explicitly specify the charset
• CREATE TABLE people (…) DEFAULT CHARSET = utf8
• And don’t forget to configure applications too
• You can try to force charset on the clients
• init-connect = "SET NAMES utf8"
• It might also break applications that don’t want to talk to MySQL using utf8
Oh, and one more thing…
• We are sharing WebScaleSQL packages with the MySQL Community!
• Check out http://www.psce.com/blog for details
• Follow @dbasquare to receive updates
WebScaleSQL
What is WebScaleSQL?
WebScaleSQL is a collaboration among engineers from several companies
such as Facebook, Twitter, Google or Linkedin, that face the same challenges
in deploying MySQL at scale, and seek greater performance from a database
technology tailored for their needs.

Weitere ähnliche Inhalte

Was ist angesagt?

Performance Schema for MySQL Troubleshooting
 Performance Schema for MySQL Troubleshooting Performance Schema for MySQL Troubleshooting
Performance Schema for MySQL TroubleshootingSveta Smirnova
 
Troubleshooting MySQL Performance
Troubleshooting MySQL PerformanceTroubleshooting MySQL Performance
Troubleshooting MySQL PerformanceSveta Smirnova
 
Troubleshooting MySQL Performance
Troubleshooting MySQL PerformanceTroubleshooting MySQL Performance
Troubleshooting MySQL PerformanceSveta Smirnova
 
Full Table Scan: friend or foe
Full Table Scan: friend or foeFull Table Scan: friend or foe
Full Table Scan: friend or foeMauro Pagano
 
Basic MySQL Troubleshooting for Oracle DBAs
Basic MySQL Troubleshooting for Oracle DBAsBasic MySQL Troubleshooting for Oracle DBAs
Basic MySQL Troubleshooting for Oracle DBAsSveta Smirnova
 
SQL Tuning, takes 3 to tango
SQL Tuning, takes 3 to tangoSQL Tuning, takes 3 to tango
SQL Tuning, takes 3 to tangoMauro Pagano
 
ANALYZE for executable statements - a new way to do optimizer troubleshooting...
ANALYZE for executable statements - a new way to do optimizer troubleshooting...ANALYZE for executable statements - a new way to do optimizer troubleshooting...
ANALYZE for executable statements - a new way to do optimizer troubleshooting...Sergey Petrunya
 
Adapting to Adaptive Plans on 12c
Adapting to Adaptive Plans on 12cAdapting to Adaptive Plans on 12c
Adapting to Adaptive Plans on 12cMauro Pagano
 
Percona live-2012-optimizer-tuning
Percona live-2012-optimizer-tuningPercona live-2012-optimizer-tuning
Percona live-2012-optimizer-tuningSergey Petrunya
 
Histograms in 12c era
Histograms in 12c eraHistograms in 12c era
Histograms in 12c eraMauro Pagano
 
Fosdem2012 mariadb-5.3-query-optimizer-r2
Fosdem2012 mariadb-5.3-query-optimizer-r2Fosdem2012 mariadb-5.3-query-optimizer-r2
Fosdem2012 mariadb-5.3-query-optimizer-r2Sergey Petrunya
 
Chasing the optimizer
Chasing the optimizerChasing the optimizer
Chasing the optimizerMauro Pagano
 
UKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL TuningUKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL TuningFromDual GmbH
 
New features-in-mariadb-and-mysql-optimizers
New features-in-mariadb-and-mysql-optimizersNew features-in-mariadb-and-mysql-optimizers
New features-in-mariadb-and-mysql-optimizersSergey Petrunya
 
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013Sergey Petrunya
 
MariaDB 10.0 Query Optimizer
MariaDB 10.0 Query OptimizerMariaDB 10.0 Query Optimizer
MariaDB 10.0 Query OptimizerSergey Petrunya
 
Preparse Query Rewrite Plugins
Preparse Query Rewrite PluginsPreparse Query Rewrite Plugins
Preparse Query Rewrite PluginsSveta Smirnova
 
Introduction into MySQL Query Tuning for Dev[Op]s
Introduction into MySQL Query Tuning for Dev[Op]sIntroduction into MySQL Query Tuning for Dev[Op]s
Introduction into MySQL Query Tuning for Dev[Op]sSveta Smirnova
 
SQL Plan Directives explained
SQL Plan Directives explainedSQL Plan Directives explained
SQL Plan Directives explainedMauro Pagano
 
MariaDB: Engine Independent Table Statistics, including histograms
MariaDB: Engine Independent Table Statistics, including histogramsMariaDB: Engine Independent Table Statistics, including histograms
MariaDB: Engine Independent Table Statistics, including histogramsSergey Petrunya
 

Was ist angesagt? (20)

Performance Schema for MySQL Troubleshooting
 Performance Schema for MySQL Troubleshooting Performance Schema for MySQL Troubleshooting
Performance Schema for MySQL Troubleshooting
 
Troubleshooting MySQL Performance
Troubleshooting MySQL PerformanceTroubleshooting MySQL Performance
Troubleshooting MySQL Performance
 
Troubleshooting MySQL Performance
Troubleshooting MySQL PerformanceTroubleshooting MySQL Performance
Troubleshooting MySQL Performance
 
Full Table Scan: friend or foe
Full Table Scan: friend or foeFull Table Scan: friend or foe
Full Table Scan: friend or foe
 
Basic MySQL Troubleshooting for Oracle DBAs
Basic MySQL Troubleshooting for Oracle DBAsBasic MySQL Troubleshooting for Oracle DBAs
Basic MySQL Troubleshooting for Oracle DBAs
 
SQL Tuning, takes 3 to tango
SQL Tuning, takes 3 to tangoSQL Tuning, takes 3 to tango
SQL Tuning, takes 3 to tango
 
ANALYZE for executable statements - a new way to do optimizer troubleshooting...
ANALYZE for executable statements - a new way to do optimizer troubleshooting...ANALYZE for executable statements - a new way to do optimizer troubleshooting...
ANALYZE for executable statements - a new way to do optimizer troubleshooting...
 
Adapting to Adaptive Plans on 12c
Adapting to Adaptive Plans on 12cAdapting to Adaptive Plans on 12c
Adapting to Adaptive Plans on 12c
 
Percona live-2012-optimizer-tuning
Percona live-2012-optimizer-tuningPercona live-2012-optimizer-tuning
Percona live-2012-optimizer-tuning
 
Histograms in 12c era
Histograms in 12c eraHistograms in 12c era
Histograms in 12c era
 
Fosdem2012 mariadb-5.3-query-optimizer-r2
Fosdem2012 mariadb-5.3-query-optimizer-r2Fosdem2012 mariadb-5.3-query-optimizer-r2
Fosdem2012 mariadb-5.3-query-optimizer-r2
 
Chasing the optimizer
Chasing the optimizerChasing the optimizer
Chasing the optimizer
 
UKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL TuningUKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL Tuning
 
New features-in-mariadb-and-mysql-optimizers
New features-in-mariadb-and-mysql-optimizersNew features-in-mariadb-and-mysql-optimizers
New features-in-mariadb-and-mysql-optimizers
 
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
 
MariaDB 10.0 Query Optimizer
MariaDB 10.0 Query OptimizerMariaDB 10.0 Query Optimizer
MariaDB 10.0 Query Optimizer
 
Preparse Query Rewrite Plugins
Preparse Query Rewrite PluginsPreparse Query Rewrite Plugins
Preparse Query Rewrite Plugins
 
Introduction into MySQL Query Tuning for Dev[Op]s
Introduction into MySQL Query Tuning for Dev[Op]sIntroduction into MySQL Query Tuning for Dev[Op]s
Introduction into MySQL Query Tuning for Dev[Op]s
 
SQL Plan Directives explained
SQL Plan Directives explainedSQL Plan Directives explained
SQL Plan Directives explained
 
MariaDB: Engine Independent Table Statistics, including histograms
MariaDB: Engine Independent Table Statistics, including histogramsMariaDB: Engine Independent Table Statistics, including histograms
MariaDB: Engine Independent Table Statistics, including histograms
 

Andere mochten auch

Data encoding techniques for reducing energyb consumption in network on-chip
Data encoding techniques for reducing energyb consumption in network on-chipData encoding techniques for reducing energyb consumption in network on-chip
Data encoding techniques for reducing energyb consumption in network on-chipLogicMindtech Nologies
 
Data encoding and Metadata for Streams
Data encoding and Metadata for StreamsData encoding and Metadata for Streams
Data encoding and Metadata for Streamsunivalence
 
CCNA
CCNACCNA
CCNAniict
 
Encoding in Data Communication DC8
Encoding in Data Communication DC8Encoding in Data Communication DC8
Encoding in Data Communication DC8koolkampus
 
Asynchronous and synchronous
Asynchronous and synchronousAsynchronous and synchronous
Asynchronous and synchronousAkhil .B
 
Synchronous and-asynchronous-data-transfer
Synchronous and-asynchronous-data-transferSynchronous and-asynchronous-data-transfer
Synchronous and-asynchronous-data-transferAnuj Modi
 
Ccna Presentation
Ccna PresentationCcna Presentation
Ccna Presentationbcdran
 
Data Encoding
Data EncodingData Encoding
Data EncodingLuka M G
 

Andere mochten auch (9)

Data encoding techniques for reducing energyb consumption in network on-chip
Data encoding techniques for reducing energyb consumption in network on-chipData encoding techniques for reducing energyb consumption in network on-chip
Data encoding techniques for reducing energyb consumption in network on-chip
 
Data encoding and Metadata for Streams
Data encoding and Metadata for StreamsData encoding and Metadata for Streams
Data encoding and Metadata for Streams
 
Data encoding
Data encodingData encoding
Data encoding
 
CCNA
CCNACCNA
CCNA
 
Encoding in Data Communication DC8
Encoding in Data Communication DC8Encoding in Data Communication DC8
Encoding in Data Communication DC8
 
Asynchronous and synchronous
Asynchronous and synchronousAsynchronous and synchronous
Asynchronous and synchronous
 
Synchronous and-asynchronous-data-transfer
Synchronous and-asynchronous-data-transferSynchronous and-asynchronous-data-transfer
Synchronous and-asynchronous-data-transfer
 
Ccna Presentation
Ccna PresentationCcna Presentation
Ccna Presentation
 
Data Encoding
Data EncodingData Encoding
Data Encoding
 

Ähnlich wie Character Encoding - MySQL DevRoom - FOSDEM 2015

Introduction To Lamp P2
Introduction To Lamp P2Introduction To Lamp P2
Introduction To Lamp P2Amzad Hossain
 
MySQL Idiosyncrasies That Bite
MySQL Idiosyncrasies That BiteMySQL Idiosyncrasies That Bite
MySQL Idiosyncrasies That BiteRonald Bradford
 
MySQL Idiosyncrasies That Bite SF
MySQL Idiosyncrasies That Bite SFMySQL Idiosyncrasies That Bite SF
MySQL Idiosyncrasies That Bite SFRonald Bradford
 
15 protips for mysql users pfz
15 protips for mysql users   pfz15 protips for mysql users   pfz
15 protips for mysql users pfzJoshua Thijssen
 
OSMC 2008 | Monitoring MySQL by Geert Vanderkelen
OSMC 2008 | Monitoring MySQL by Geert VanderkelenOSMC 2008 | Monitoring MySQL by Geert Vanderkelen
OSMC 2008 | Monitoring MySQL by Geert VanderkelenNETWAYS
 
MySQL Idiosyncrasies That Bite 2010.07
MySQL Idiosyncrasies That Bite 2010.07MySQL Idiosyncrasies That Bite 2010.07
MySQL Idiosyncrasies That Bite 2010.07Ronald Bradford
 
How to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraHow to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraSveta Smirnova
 
MySQL Kitchen : spice up your everyday SQL queries
MySQL Kitchen : spice up your everyday SQL queriesMySQL Kitchen : spice up your everyday SQL queries
MySQL Kitchen : spice up your everyday SQL queriesDamien Seguy
 
Big Data Analytics with MariaDB ColumnStore
Big Data Analytics with MariaDB ColumnStoreBig Data Analytics with MariaDB ColumnStore
Big Data Analytics with MariaDB ColumnStoreMariaDB plc
 
My SQL Idiosyncrasies That Bite OTN
My SQL Idiosyncrasies That Bite OTNMy SQL Idiosyncrasies That Bite OTN
My SQL Idiosyncrasies That Bite OTNRonald Bradford
 
Bt0075, rdbms and my sql
Bt0075, rdbms and my sqlBt0075, rdbms and my sql
Bt0075, rdbms and my sqlsmumbahelp
 
Percona Live 4/15/15: Transparent sharding database virtualization engine (DVE)
Percona Live 4/15/15: Transparent sharding database virtualization engine (DVE)Percona Live 4/15/15: Transparent sharding database virtualization engine (DVE)
Percona Live 4/15/15: Transparent sharding database virtualization engine (DVE)Tesora
 
Applied Partitioning And Scaling Your Database System Presentation
Applied Partitioning And Scaling Your Database System PresentationApplied Partitioning And Scaling Your Database System Presentation
Applied Partitioning And Scaling Your Database System PresentationRichard Crowley
 
MySQL 5.7 innodb_enhance_partii_20160527
MySQL 5.7 innodb_enhance_partii_20160527MySQL 5.7 innodb_enhance_partii_20160527
MySQL 5.7 innodb_enhance_partii_20160527Saewoong Lee
 
My sql 5.7-upcoming-changes-v2
My sql 5.7-upcoming-changes-v2My sql 5.7-upcoming-changes-v2
My sql 5.7-upcoming-changes-v2Morgan Tocker
 

Ähnlich wie Character Encoding - MySQL DevRoom - FOSDEM 2015 (20)

MySQL SQL Tutorial
MySQL SQL TutorialMySQL SQL Tutorial
MySQL SQL Tutorial
 
Introduction To Lamp P2
Introduction To Lamp P2Introduction To Lamp P2
Introduction To Lamp P2
 
MySQL Idiosyncrasies That Bite
MySQL Idiosyncrasies That BiteMySQL Idiosyncrasies That Bite
MySQL Idiosyncrasies That Bite
 
MySQL Idiosyncrasies That Bite SF
MySQL Idiosyncrasies That Bite SFMySQL Idiosyncrasies That Bite SF
MySQL Idiosyncrasies That Bite SF
 
Curso de MySQL 5.7
Curso de MySQL 5.7Curso de MySQL 5.7
Curso de MySQL 5.7
 
15 protips for mysql users pfz
15 protips for mysql users   pfz15 protips for mysql users   pfz
15 protips for mysql users pfz
 
OSMC 2008 | Monitoring MySQL by Geert Vanderkelen
OSMC 2008 | Monitoring MySQL by Geert VanderkelenOSMC 2008 | Monitoring MySQL by Geert Vanderkelen
OSMC 2008 | Monitoring MySQL by Geert Vanderkelen
 
MySQL Idiosyncrasies That Bite 2010.07
MySQL Idiosyncrasies That Bite 2010.07MySQL Idiosyncrasies That Bite 2010.07
MySQL Idiosyncrasies That Bite 2010.07
 
MySQLinsanity
MySQLinsanityMySQLinsanity
MySQLinsanity
 
Explain
ExplainExplain
Explain
 
How to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraHow to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with Galera
 
MySQL Kitchen : spice up your everyday SQL queries
MySQL Kitchen : spice up your everyday SQL queriesMySQL Kitchen : spice up your everyday SQL queries
MySQL Kitchen : spice up your everyday SQL queries
 
Big Data Analytics with MariaDB ColumnStore
Big Data Analytics with MariaDB ColumnStoreBig Data Analytics with MariaDB ColumnStore
Big Data Analytics with MariaDB ColumnStore
 
Mysql basics1
Mysql basics1Mysql basics1
Mysql basics1
 
My SQL Idiosyncrasies That Bite OTN
My SQL Idiosyncrasies That Bite OTNMy SQL Idiosyncrasies That Bite OTN
My SQL Idiosyncrasies That Bite OTN
 
Bt0075, rdbms and my sql
Bt0075, rdbms and my sqlBt0075, rdbms and my sql
Bt0075, rdbms and my sql
 
Percona Live 4/15/15: Transparent sharding database virtualization engine (DVE)
Percona Live 4/15/15: Transparent sharding database virtualization engine (DVE)Percona Live 4/15/15: Transparent sharding database virtualization engine (DVE)
Percona Live 4/15/15: Transparent sharding database virtualization engine (DVE)
 
Applied Partitioning And Scaling Your Database System Presentation
Applied Partitioning And Scaling Your Database System PresentationApplied Partitioning And Scaling Your Database System Presentation
Applied Partitioning And Scaling Your Database System Presentation
 
MySQL 5.7 innodb_enhance_partii_20160527
MySQL 5.7 innodb_enhance_partii_20160527MySQL 5.7 innodb_enhance_partii_20160527
MySQL 5.7 innodb_enhance_partii_20160527
 
My sql 5.7-upcoming-changes-v2
My sql 5.7-upcoming-changes-v2My sql 5.7-upcoming-changes-v2
My sql 5.7-upcoming-changes-v2
 

Character Encoding - MySQL DevRoom - FOSDEM 2015

  • 1. Character encoding: Breaking and unbreaking your data
      Maciej Dobrzanski
      maciek@psce.com | @mushupl
      Brussels, 1 Feb 2015
      01.02.2015 Follow us on Twitter @dbasquare www.psce.com
  • 2. Character Encoding
      • Binary representation of glyphs
      • Each character can be represented by 1 or more bytes
      • Popular schemes
          • ASCII
          • Unicode
              • UTF-8, UTF-16, UTF-32
      • Language specific character sets
          • US (Latin US)
          • Europe (Latin 1, Latin 2)
          • Asia (EUC-KR, GB18030)
  • 3. Character Encoding
      • Character set defines the visual interpretation of binary information
      • One glyph can be associated with several numeric codes
      • One numeric code may be used to represent several different glyphs
  • 4. Please state the nature of the emergency
      • Application configuration
      • Database configuration
      • Table/column definitions
  • 5. Problem #1: We are all born Swedish
      • MySQL uses latin1 by default
          • MySQL 5.7 too
      • Is anyone actually aware of that?
      • Why Swedish?
          • latin1_swedish_ci is the default collation
  • 6. Problem #1
      • Let’s build an application
      mysql> SELECT @@global.character_set_server, @@session.character_set_client;
      +-------------------------------+--------------------------------+
      | @@global.character_set_server | @@session.character_set_client |
      +-------------------------------+--------------------------------+
      | latin1                        | latin1                         |
      +-------------------------------+--------------------------------+
      1 row in set (0.00 sec)
      mysql> CREATE SCHEMA fosdem;
      Query OK, 1 row affected (0.00 sec)
      mysql> USE fosdem;
      mysql> CREATE TABLE locations (city VARCHAR(30) NOT NULL);
      Query OK, 0 rows affected (0.15 sec)
      mysql> SHOW CREATE TABLE locations\G
      *************************** 1. row ***************************
             Table: locations
      Create Table: CREATE TABLE `locations` (
        `city` varchar(30) NOT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1
      1 row in set (0.00 sec)
  • 7. Problem #1
  • 8. Problem #1
  • 9. Problem #1
      • Everything is correct… NOT!
      mysql> SET NAMES utf8;
      Query OK, 0 rows affected (0.00 sec)
      mysql> select * from locations;
      +--------------------+
      | city               |
      +--------------------+
      | Berlin             |
      | Kraków             |
      | 東京都             |
      +--------------------+
      3 rows in set (0.00 sec)
      mysql> SET NAMES latin1;
      Query OK, 0 rows affected (0.00 sec)
      mysql> select * from locations;
      +-----------+
      | city      |
      +-----------+
      | Berlin    |
      | Kraków    |
      | 東京都    |
      +-----------+
      3 rows in set (0.00 sec)
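The two result sets on slide 9 come from the same stored bytes read under two different connection character sets. A minimal Python sketch of that effect, assuming the application sent UTF-8 bytes over a connection declared as latin1. Python's `latin-1` codec only approximates MySQL's latin1, and the function name `misread_as_latin1` is hypothetical:

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as if each byte were one latin1 character."""
    return text.encode("utf-8").decode("latin-1")


# Pure ASCII survives untouched; anything outside ASCII turns into several characters.
for city in ("Berlin", "Kraków"):
    print(city, "->", misread_as_latin1(city))
# Berlin -> Berlin
# Kraków -> KrakÃ³w
```

ASCII-only rows look fine either way, which is why this kind of breakage can go unnoticed for a long time.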
  • 10. Problem #1
      • Let’s fix this
          • Or can we ignore it?
          • Ruby may not like it
      # grep character-set-server /etc/mysql/my.cnf
      character-set-server = utf8
      mysql> SELECT @@global.character_set_server, @@session.character_set_client;
      +-------------------------------+--------------------------------+
      | @@global.character_set_server | @@session.character_set_client |
      +-------------------------------+--------------------------------+
      | utf8                          | utf8                           |
      +-------------------------------+--------------------------------+
      1 row in set (0.00 sec)
      ...we are fixing our tables here...
      mysql> SHOW CREATE TABLE locations\G
      *************************** 1. row ***************************
             Table: locations
      Create Table: CREATE TABLE `locations` (
        `city` varchar(30) NOT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8
      1 row in set (0.00 sec)
  • 11. Problem #1: The good news
      • It’s usually fixable
  • 12. Problem #2: Settings, defaults, inheritance
      • Where do you set character sets in MySQL?
          • Session settings
              • character_set_server
              • character_set_client
              • character_set_connection
              • character_set_database
              • character_set_results
          • Schema level defaults
          • Table level defaults
          • Column charsets
  • 13. Problem #2
      • Having fixed our problem #1, we continue to develop our application
      mysql> SELECT @@session.character_set_server, @@session.character_set_client;
      +--------------------------------+--------------------------------+
      | @@session.character_set_server | @@session.character_set_client |
      +--------------------------------+--------------------------------+
      | utf8                           | utf8                           |
      +--------------------------------+--------------------------------+
      1 row in set (0.00 sec)
      mysql> USE fosdem;
      mysql> CREATE TABLE people (first_name VARCHAR(30) NOT NULL, last_name VARCHAR(30) NOT NULL);
      Query OK, 0 rows affected (0.13 sec)
  • 14. Problem #2
  • 15. Problem #2
  • 16. Problem #2
      • Why is the table character set latin1?
      mysql> SELECT @@session.character_set_server, @@session.character_set_client;
      +--------------------------------+--------------------------------+
      | @@session.character_set_server | @@session.character_set_client |
      +--------------------------------+--------------------------------+
      | utf8                           | utf8                           |
      +--------------------------------+--------------------------------+
      1 row in set (0.00 sec)
      mysql> USE fosdem;
      mysql> SHOW CREATE TABLE people\G
      *************************** 1. row ***************************
             Table: people
      Create Table: CREATE TABLE `people` (
        `first_name` varchar(30) NOT NULL,
        `last_name` varchar(30) NOT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1
      1 row in set (0.00 sec)
  • 17. Problem #2
      • What’s all this, then?
      mysql> SHOW SESSION VARIABLES LIKE 'character_set_%';
      +--------------------------+----------------------------+
      | Variable_name            | Value                      |
      +--------------------------+----------------------------+
      | character_set_client     | utf8                       |
      | character_set_connection | utf8                       |
      | character_set_database   | latin1                     |
      | character_set_filesystem | binary                     |
      | character_set_results    | utf8                       |
      | character_set_server     | utf8                       |
      | character_set_system     | utf8                       |
      | character_sets_dir       | /usr/share/mysql/charsets/ |
      +--------------------------+----------------------------+
      8 rows in set (0.00 sec)
      mysql> SHOW CREATE DATABASE fosdem\G
      *************************** 1. row ***************************
             Database: fosdem
      Create Database: CREATE DATABASE `fosdem` /*!40100 DEFAULT CHARACTER SET latin1 */
      1 row in set (0.00 sec)
  • 18. Problem #2
      • Can we fix this?
      mysql> SET NAMES utf8;
      Query OK, 0 rows affected (0.00 sec)
      mysql> SELECT last_name, HEX(last_name) FROM people;
      +------------+----------------------+
      | last_name  | HEX(last_name)       |
      +------------+----------------------+
      | Lemon      | 4C656D6F6E           |
      | Müller     | 4DFC6C6C6572         |
      | Dobrza?ski | 446F62727A613F736B69 |
      +------------+----------------------+
      3 rows in set (0.00 sec)
      mysql> SET NAMES latin2;
      Query OK, 0 rows affected (0.00 sec)
      mysql> SELECT last_name, HEX(last_name) FROM people;
      +------------+----------------------+
      | last_name  | HEX(last_name)       |
      +------------+----------------------+
      | Lemon      | 4C656D6F6E           |
      | Müller     | 4DFC6C6C6572         |
      | Dobrza?ski | 446F62727A613F736B69 |
      +------------+----------------------+
      3 rows in set (0.00 sec)
      • We can’t! :-(
      • 0x3F is '?', so my 'ń' was lost
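The hex dumps on slide 18 can be reproduced outside MySQL. The sketch below assumes the server converted correctly declared utf8 input into a latin1 column, and uses Python's `latin-1` codec with `errors="replace"` as a stand-in for that implicit conversion. The helper name `to_latin1_lossy` is hypothetical:

```python
def to_latin1_lossy(text: str) -> bytes:
    """Convert text to latin1 bytes, substituting '?' for characters latin1 cannot represent."""
    return text.encode("latin-1", errors="replace")


for name in ("Lemon", "Müller", "Dobrzański"):
    stored = to_latin1_lossy(name)
    # Show what would be stored and its hex form, as HEX(last_name) does on the slide.
    print(stored.decode("latin-1"), stored.hex().upper())
# Lemon 4C656D6F6E
# Müller 4DFC6C6C6572
# Dobrza?ski 446F62727A613F736B69
```

'ü' exists in latin1 and survives as the single byte 0xFC, while 'ń' does not and is replaced by 0x3F. Once that happens the original character is no longer in the stored data, so no later connection setting can bring it back.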
  • 19. Problem #2: The bad news
      • It may not be enough to configure the server correctly
      • A mismatch between client and server can permanently break data
          • Implicit conversion inside MySQL server
  • 20. Problem #2: Settings, defaults, inheritance
      • Where do you set character sets in MySQL?
          • Session settings
              • character_set_server
              • character_set_client
              • character_set_connection
              • character_set_database
              • character_set_results
          • Schema level defaults – affect new tables
          • Table level defaults – affect new columns
          • Column charsets
  • 21. Problem #2: Settings, defaults, inheritance
      master [localhost] {msandbox} ((none)) > SELECT @@global.character_set_server, @@session.character_set_client;
      +-------------------------------+--------------------------------+
      | @@global.character_set_server | @@session.character_set_client |
      +-------------------------------+--------------------------------+
      | latin1                        | utf8                           |
      +-------------------------------+--------------------------------+
      1 row in set (0.00 sec)
      master [localhost] {msandbox} ((none)) > CREATE SCHEMA fosdem\G
      Query OK, 1 row affected (0.00 sec)
      master [localhost] {msandbox} ((none)) > SHOW CREATE SCHEMA fosdem\G
      *************************** 1. row ***************************
             Database: fosdem
      Create Database: CREATE DATABASE `fosdem` /*!40100 DEFAULT CHARACTER SET latin1 */
      1 row in set (0.00 sec)
  • 22. Problem #2: Settings, defaults, inheritance
      master [localhost] {msandbox} ((none)) > USE fosdem;
      Database changed
      master [localhost] {msandbox} (fosdem) > CREATE TABLE test (a VARCHAR(300), INDEX (a));
      Query OK, 0 rows affected (0.62 sec)
      master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
      *************************** 1. row ***************************
             Table: test
      Create Table: CREATE TABLE `test` (
        `a` varchar(300) DEFAULT NULL,
        KEY `a` (`a`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1
      1 row in set (0.00 sec)
  • 23. Problem #2: Settings, defaults, inheritance
      master [localhost] {msandbox} (fosdem) > ALTER TABLE test DEFAULT CHARSET = utf8;
      Query OK, 0 rows affected (0.08 sec)
      Records: 0 Duplicates: 0 Warnings: 0
      master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
      *************************** 1. row ***************************
             Table: test
      Create Table: CREATE TABLE `test` (
        `a` varchar(300) CHARACTER SET latin1 DEFAULT NULL,
        KEY `a` (`a`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8
      1 row in set (0.00 sec)
  • 24. Problem #2: Settings, defaults, inheritance
      master [localhost] {msandbox} (fosdem) > ALTER TABLE test ADD b VARCHAR(10);
      Query OK, 0 rows affected (0.74 sec)
      Records: 0 Duplicates: 0 Warnings: 0
      master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
      *************************** 1. row ***************************
             Table: test
      Create Table: CREATE TABLE `test` (
        `a` varchar(300) CHARACTER SET latin1 DEFAULT NULL,
        `b` varchar(10) DEFAULT NULL,
        KEY `a` (`a`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8
      1 row in set (0.00 sec)
  • 25. I f**ckd up. What do I do?
      • Let’s start with what you shouldn’t do
      • Keep calm and don’t start by changing something
      • Analyze the situation
          • Why did the problem occur in the first place?
      • Reassess the damage
          • Is it consistent?
              • Are all rows broken in the same way?
              • Are some rows bad, but others are okay?
              • Are all bad in several different ways?
          • Is it actually repairable?
              • No character mapping occurred during writes (e.g. unicode over latin1/latin1)
  • 26. I f**ckd up. What else shouldn’t I do, then?
      • Do not rush things as you may easily go from bad to worse
      • Do not start fixing this on a replication slave
      • You can’t fix this by fixing tables one by one on a live database
          • Unless you really have everything in one table
      • Do not use: ALTER TABLE … DEFAULT CHARSET = …
          • It only changes the default character set for new columns
      • Do not use: ALTER TABLE … CONVERT TO CHARACTER SET …
          • It’s not for fixing broken encoding
      • Do not use: ALTER TABLE … MODIFY col_name … CHARACTER SET …
  • 27. I f**ckd up. So how do I fix it?
      • What needs to be fixed?
          • Schema default character set
              • ALTER SCHEMA fosdem DEFAULT CHARSET = utf8
          • Tables with text columns: CHAR, VARCHAR, TEXT, TINYTEXT, LONGTEXT
              • What about ENUM?
              • Use INFORMATION_SCHEMA to grab a list
          • What about other tables?
              • They too (eventually), but it’s not critical
      SELECT CONCAT(c.table_schema, '.', c.table_name) AS candidate_table
        FROM information_schema.columns c
       WHERE c.table_schema = 'fosdem'
         AND c.column_type REGEXP '^(.*CHAR|.*TEXT|ENUM)((.+))?$'
       GROUP BY candidate_table;
  • 28. I f**ckd up. So how do I fix it?
      • Option 1 – requires downtime
      • Dump and restore
          • Dump the data preserving the bad configuration and drop the old database
      bash# mysqldump -u root -p --skip-set-charset --default-character-set=latin1 fosdem > fosdem.sql
      mysql> DROP SCHEMA fosdem;
          • Correct table definitions in the dump file
              • Edit DEFAULT CHARSET in all CREATE TABLE statements
          • Create the database again and import the data back
      mysql> CREATE SCHEMA fosdem DEFAULT CHARSET utf8;
      bash# mysql -u root -p --default-character-set=utf8 fosdem < fosdem.sql
  • 29. I f**ckd up. So how do I fix it?
      • Option 2 – requires downtime
      • Perform a two step conversion with ALTER TABLE
          • Original encoding -> VARBINARY/BLOB -> Target encoding
          • Conversion from/to BINARY/BLOB removes character set context
      • How?
          • Stop applications
          • On each table, for each text column perform:
      ALTER TABLE tbl MODIFY col_name VARBINARY(255);
      ALTER TABLE tbl MODIFY col_name VARCHAR(255) CHARACTER SET utf8;
          • You may specify multiple columns per ALTER TABLE
          • Fix the problems (application and/or db configs)
          • Restart applications
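The reason the detour through VARBINARY works can be shown without a database. In the sketch below, a string holds UTF-8 bytes that were labelled latin1. Step 1 drops the label and keeps the raw bytes, and step 2 attaches the correct one. This is only a Python analogue of the two ALTER TABLE statements, the function name `relabel_latin1_as_utf8` is hypothetical, and Python's `latin-1` codec approximates MySQL's latin1:

```python
def relabel_latin1_as_utf8(mislabelled: str) -> str:
    """Reinterpret text whose bytes are really UTF-8 but were read as latin1."""
    raw = mislabelled.encode("latin-1")  # step 1: back to raw bytes, no character set context
    return raw.decode("utf-8")           # step 2: read the same bytes under the target charset


broken = "Kraków".encode("utf-8").decode("latin-1")  # what a mislabelled column yields
print(broken, "->", relabel_latin1_as_utf8(broken))
# KrakÃ³w -> Kraków
```

In this sketch, bytes that are not valid UTF-8 raise `UnicodeDecodeError` in step 2, which is the case where rows are broken in more than one way and need individual attention.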
  • 30. I f**ckd up. So how do I fix it?
      • Option 3 – online character set fix; no downtime*
      • Thanks to our plugin for pt-online-schema-change
          • and a tiny patch for pt-online-schema-change that goes with the plugin
      • How?
          • Start pt-online-schema-change on all tables – one by one
              • Do not rotate tables (--no-swap-tables) or drop pt-osc triggers
          • Wait until all tables have been converted
          • Stop applications
          • Fix the problems (application and/or db configs)
          • Rotate tables – takes just 1 minute
          • Restart applications
          • Et voilà
  • 31. GOTCHAs!
      • Data space requirements may change during conversion
          • Latin1 uses 1 byte per character, utf8 will need to assume 3 bytes
          • VARCHAR/TEXT fit up to 64KB – it won’t fit 65536 multi-byte characters
          • Key length limit is 767 bytes
      • Data type and/or index length changes may be required
          • Test and plan this ahead
      • There may be more problems than you think
          • Detect irrecoverable problems with a simple stored function
      CREATE FUNCTION `cnv_test_conversion` (`value_before` LONGTEXT, `value_after` LONGTEXT)
      RETURNS tinyint(1)
      BEGIN
        RETURN (IFNULL(CONVERT(CONVERT(`value_before` USING latin1) USING binary), "")
                = IFNULL(CONVERT(`value_after` USING binary), ""));
      END;;
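The comparison that `cnv_test_conversion` makes on slide 31 can be read as "the bytes before conversion equal the bytes after". A rough Python analogue follows, under stated assumptions: the name `conversion_preserved_bytes` is hypothetical, `None` stands in for SQL NULL, and a character that latin1 cannot encode is treated as a failure here, whereas MySQL's CONVERT would substitute '?' instead.

```python
from typing import Optional


def conversion_preserved_bytes(value_before: Optional[str], value_after: Optional[str]) -> bool:
    """Return True if the latin1 bytes of the old value equal the utf8 bytes of the new value."""
    before = value_before or ""  # mirrors IFNULL(..., "")
    after = value_after or ""
    try:
        before_bytes = before.encode("latin-1")
    except UnicodeEncodeError:
        # Assumption of this sketch: an unencodable character counts as a mismatch.
        return False
    return before_bytes == after.encode("utf-8")


print(conversion_preserved_bytes("KrakÃ³w", "Kraków"))          # True: same bytes, new label
print(conversion_preserved_bytes("Dobrza?ski", "Dobrzański"))   # False: 'ń' was already lost
```

Running a check of this kind against a converted copy of each table is one way to find the rows that cannot be repaired mechanically before the tables are swapped.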
  • 32. GOTCHAs!
      master [localhost] {msandbox} (fosdem) > ALTER TABLE test MODIFY a VARCHAR(300) CHARACTER SET utf8;
      Query OK, 0 rows affected, 1 warning (1.23 sec)
      Records: 0 Duplicates: 0 Warnings: 1
      master [localhost] {msandbox} (fosdem) > SHOW WARNINGS\G
      *************************** 1. row ***************************
        Level: Warning
         Code: 1071
      Message: Specified key was too long; max key length is 767 bytes
      1 row in set (0.00 sec)
      master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
      *************************** 1. row ***************************
             Table: test
      Create Table: CREATE TABLE `test` (
        `a` varchar(300) DEFAULT NULL,
        `b` varchar(10) DEFAULT NULL,
        KEY `a` (`a`(255))
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8
      1 row in set (0.00 sec)
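The silent change from KEY `a` (`a`) to KEY `a` (`a`(255)) on slide 32 follows from simple arithmetic on the 767-byte key limit and the 3 bytes per character that utf8 must reserve. A small sketch of that calculation, where the constant and function names are hypothetical:

```python
MAX_KEY_BYTES = 767  # key length limit quoted on the slides


def max_indexable_chars(bytes_per_char: int) -> int:
    """Longest index prefix, in characters, that fits within the key length limit."""
    return MAX_KEY_BYTES // bytes_per_char


def needs_prefix(column_chars: int, bytes_per_char: int) -> bool:
    """True if a full-column index would exceed the key length limit."""
    return column_chars * bytes_per_char > MAX_KEY_BYTES


print(max_indexable_chars(1))   # 767: latin1, so VARCHAR(300) fits as a whole
print(max_indexable_chars(3))   # 255: utf8, matching the prefix length on slide 32
print(needs_prefix(300, 3))     # True: 900 bytes would be needed
```

An index shortened to a prefix no longer covers the full column, so checking every indexed text column this way before the conversion is part of planning ahead.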
  • 33. How to do it right?
      • Set character-set-server during initial configuration
      • When creating new schemas, always specify the desired charset
          • CREATE SCHEMA fosdem DEFAULT CHARSET = utf8
          • ALTER SCHEMA fosdem DEFAULT CHARSET = utf8
      • When creating new tables, also explicitly specify the charset
          • CREATE TABLE people (…) DEFAULT CHARSET = utf8
      • And don’t forget to configure applications too
      • You can try to force charset on the clients
          • init-connect = "SET NAMES utf8"
          • It might also break applications that don’t want to talk to MySQL using utf8
  • 34. Oh, and one more thing…
  • 35. WebScaleSQL
      • We are sharing WebScaleSQL packages with the MySQL Community!
      • Check out http://www.psce.com/blog for details
      • Follow @dbasquare to receive updates
      What is WebScaleSQL?
      WebScaleSQL is a collaboration among engineers from several companies, such as Facebook, Twitter, Google and LinkedIn, that face the same challenges in deploying MySQL at scale and seek greater performance from a database technology tailored to their needs.