InnoDB Compression
Getting it ready for Facebook scale


Nizam Ordulu nizam.ordulu@fb.com
Software Engineer, database engineering @Facebook
4/11/12
Why use compression

▪   Save disk space.
▪   Buy fewer servers.
▪   Buy better disks (SSD) without too much
    increase in cost.
▪   Reduce IOPS.
Database Size
IOPS
Sysbench Benchmarks
Sysbench
Default table schema for sysbench
CREATE TABLE `sbtest` (
     `id` int(10) unsigned NOT NULL auto_increment,
     `k` int(10) unsigned NOT NULL default '0',
     `c` char(120) NOT NULL default '',
     `pad` char(60) NOT NULL default '',
     PRIMARY KEY (`id`),
     KEY `k` (`k`)
);
In-memory benchmark
Configuration
▪   Buffer pool size = 1G.
▪   16 tables.
▪   250K rows on each table.
▪   Uncompressed db size = 1.1G.
▪   Compressed db size = 600M.
▪   In-memory benchmark.
▪   16 threads.
In-memory benchmark
Load Time
[Bar chart: load time in seconds (y-axis 0 to 80) for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed.]
In-memory benchmark
Database size after load
[Bar chart: database size in MB (y-axis 0 to 1200) for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed.]
In-memory benchmark
Transactions per second for reads (oltp.lua, read-only)
[Bar chart: read-only transactions per second (y-axis 0 to 8000) for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed.]
In-memory benchmark
Inserts per second (insert.lua)
[Bar chart: inserts per second (y-axis 0 to 60000) for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed (4X).]
IO-bound benchmark for inserts
Inserts per second (insert.lua)
[Bar chart: inserts per second (y-axis 0 to 60000) for mysql-uncompressed and fb-mysql-uncompressed.]
InnoDB Compression
Basics
▪   16K pages are compressed to 1K, 2K, 4K, or 8K blocks.
▪   Block size is specified during table creation.
▪   8K is safest if data is not too compressible.
▪   BLOBs and VARCHARs increase compressibility.
▪   In-memory workloads may require larger buffer pool.
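A rough way to get a feel for the block-size choice is to compress a 16K buffer with plain zlib and see which block size it would fit into. This is only a sketch: the helper name `smallest_fitting_block` and the sample pages are made up for illustration, and InnoDB's own page compression stores additional page metadata and the modification log in the same block, so this estimate is optimistic.

```python
import os
import zlib

PAGE_SIZE = 16 * 1024                    # uncompressed page size from the slide
BLOCK_SIZES = [1024, 2048, 4096, 8192]   # candidate compressed block sizes, in bytes


def smallest_fitting_block(page: bytes):
    """Return the smallest block size that holds the zlib-compressed page, or None."""
    compressed_len = len(zlib.compress(page))
    for size in BLOCK_SIZES:
        if compressed_len <= size:
            return size
    return None


# Highly repetitive page: compresses to far less than 1K.
repetitive_page = b"abcd" * (PAGE_SIZE // 4)
# Random bytes are essentially incompressible: no block size fits.
random_page = os.urandom(PAGE_SIZE)

print(smallest_fitting_block(repetitive_page))
print(smallest_fitting_block(random_page))
```

Data that lands in the `None` case corresponds to what the deck calls a compression failure, which is why 8K is the conservative choice for data of unknown compressibility.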
InnoDB Compression
Example
CREATE TABLE `sbtest1` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `k` int(10) unsigned NOT NULL DEFAULT '0',
 `c` char(120) NOT NULL DEFAULT '',
 `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED
KEY_BLOCK_SIZE=8;
InnoDB Compression
Page Modification Log (mlog)
▪   InnoDB does not recompress a page on every update.
▪   Updates are appended to the modification log.
▪   The mlog is located at the bottom of the compressed page.
▪   When the mlog is full, the page is recompressed.
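The mlog behaviour described above can be sketched with a toy model. Everything here is an assumption made for illustration: the class `ToyCompressedPage`, its byte accounting, and the use of plain `zlib.compress` do not mirror InnoDB's on-page layout. In the real engine a failed recompression leads to a page split rather than a rejected insert.

```python
import zlib


class ToyCompressedPage:
    """Toy model: a compressed image and a modification log share one fixed block."""

    def __init__(self, block_size: int = 8192):
        self.block_size = block_size
        self.records = []                     # logical (uncompressed) page content
        self.compressed = zlib.compress(b"")  # compressed image of the records
        self.mlog = []                        # updates appended since the last recompression
        self.recompressions = 0

    def _mlog_bytes(self) -> int:
        return sum(len(r) for r in self.mlog)

    def insert(self, record: bytes) -> bool:
        """Apply an update; return False on a compression failure."""
        self.records.append(record)
        free = self.block_size - len(self.compressed) - self._mlog_bytes()
        if len(record) <= free:
            self.mlog.append(record)          # cheap path: append to the mlog only
            return True
        # The mlog is full: recompress all records and start with an empty mlog.
        self.recompressions += 1
        image = zlib.compress(b"".join(self.records))
        if len(image) > self.block_size:
            self.records.pop()                # image does not fit: compression failure
            return False
        self.compressed, self.mlog = image, []
        return True


page = ToyCompressedPage(block_size=1024)
for i in range(100):
    page.insert(b"row-%04d-" % i + b"x" * 20)
print(page.recompressions, len(page.mlog))
```

Most inserts take the cheap append path; zlib only runs when the log space is exhausted, which is the point of the design.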
InnoDB Compression
Page Modification Log Example
InnoDB Compression
Compression failures are bad
▪   Compression failures:
    ▪   waste CPU cycles,
    ▪   cause mutex contention.
InnoDB Compression
Unzip LRU
▪   A compressed block is decompressed when it is read.
▪   Both the compressed and the uncompressed copy are kept in memory.
▪   Any update to the page is applied to both copies.
▪   When it is time to evict a page:
    ▪   Evict an uncompressed copy if the system is IO-bound.
    ▪   Evict a page from the normal LRU if the system is CPU-bound.
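The eviction rule above can be illustrated with two ordered maps. This is a simplified sketch: `ToyBufferPool` and the `io_bound` flag are hypothetical, and the heuristics InnoDB uses to decide whether it is IO-bound or CPU-bound are not modelled.

```python
import zlib
from collections import OrderedDict


class ToyBufferPool:
    """Toy model of the two lists named on the slide; not InnoDB's real heuristics."""

    def __init__(self):
        self.lru = OrderedDict()        # page_id -> compressed copy (every cached page)
        self.unzip_lru = OrderedDict()  # page_id -> uncompressed copy (a subset)

    def read(self, page_id, compressed: bytes):
        """Cache the compressed copy and keep a decompressed copy alongside it."""
        self.lru[page_id] = compressed
        self.lru.move_to_end(page_id)
        self.unzip_lru[page_id] = zlib.decompress(compressed)
        self.unzip_lru.move_to_end(page_id)

    def evict(self, io_bound: bool):
        """Free memory following the slide's rule; returns (what, page_id) or None."""
        if io_bound and self.unzip_lru:
            # IO-bound: drop only the uncompressed copy and keep the compressed one,
            # so the page can be rebuilt with CPU work instead of a disk read.
            page_id, _ = self.unzip_lru.popitem(last=False)
            return ("uncompressed copy", page_id)
        if not self.lru:
            return None
        # CPU-bound: evict the whole page from the normal LRU.
        page_id, _ = self.lru.popitem(last=False)
        self.unzip_lru.pop(page_id, None)
        return ("whole page", page_id)


pool = ToyBufferPool()
for pid in (1, 2, 3):
    pool.read(pid, zlib.compress(b"page %d" % pid))
first = pool.evict(io_bound=True)
second = pool.evict(io_bound=False)
print(first, second)
```

The trade-off is memory for CPU: keeping only compressed copies fits more pages in the buffer pool but costs a decompression on the next access.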
InnoDB Compression
Compressed pages written to redo log
▪   Compressed pages are written to redo log.
▪   Reasons for doing this:
    ▪   Reuse redo logs even if the zlib version changes.
    ▪   Protect against nondeterminism in compression.
▪   Increase in redo log writes.
▪   Increase in checkpoint frequency.
InnoDB Compression
Official advice on tuning compression
If the number of “successful” compression operations
(COMPRESS_OPS_OK) is a high percentage of the total
number of compression operations (COMPRESS_OPS), then the
system is likely performing well. If the ratio is low, then InnoDB is
reorganizing, recompressing, and splitting B-tree nodes more
often than is desirable. In this case, avoid compressing some
tables, or increase KEY_BLOCK_SIZE for some of the
compressed tables. You might turn off compression for tables
that cause the number of “compression failures” in your
application to be more than 1% or 2% of the total. (Such a failure
ratio might be acceptable during a temporary operation such as a
data load).
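The ratio in this advice is simple arithmetic over the two counters. The sketch below assumes the counters have already been read into integers; the function names and the sample numbers are hypothetical, and the 2% default threshold is taken from the upper end of the quoted guidance.

```python
def compression_failure_rate(compress_ops: int, compress_ops_ok: int) -> float:
    """Fraction of compression operations that failed."""
    if compress_ops == 0:
        return 0.0
    return (compress_ops - compress_ops_ok) / compress_ops


def needs_attention(compress_ops: int, compress_ops_ok: int, threshold: float = 0.02) -> bool:
    """True when the failure rate exceeds the threshold (2% by default)."""
    return compression_failure_rate(compress_ops, compress_ops_ok) > threshold


# Hypothetical counters: 1000 operations, of which 590 succeeded.
rate = compression_failure_rate(1000, 590)
print(f"{rate:.0%}", needs_attention(1000, 590))
```

A table flagged this way is a candidate for a larger KEY_BLOCK_SIZE or for no compression at all, as the quoted advice suggests.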
Facebook Improvements
Finding bugs and testing new features
▪   Expanded mtr test suite with crash-recovery and stress tests.
▪   Simulated compression failures.
▪   Fixed the bugs revealed by the tests and production servers.
Facebook Improvements
Table level compression statistics
▪   Added the following columns to table_statistics:
     ▪   COMPRESS_OPS,
     ▪   COMPRESS_OPS_OK,
     ▪   COMPRESS_USECS,
     ▪   UNCOMPRESS_OPS,
     ▪   UNCOMPRESS_USECS.
Facebook Improvements
Removal of compressed pages from redo log
▪   Removed compressed page images from redo log.
▪   Introduced a new log record for compression.
Facebook Improvements
Adaptive padding
▪   Put less data on each page to prevent compression failures.
▪   pad = 16K – (maximum data size allowed on the uncompressed copy)
Facebook Improvements
Adaptive padding
▪   Algorithm to determine pad per table:
    ▪   Increase the pad until the compression failure rate reaches
        the specified level.
    ▪   Decrease padding if the failure rate is too low.
▪   Adapts to the compressibility of data over time.
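The feedback loop above might look roughly like the following. This is a sketch under stated assumptions, not the Facebook patch: the function names, the step size, the target and lower failure rates, and the cap on the pad are all invented for illustration. Only the relation pad = 16K minus the maximum data size comes from the slides.

```python
PAGE_SIZE = 16 * 1024  # uncompressed page size in bytes


def adjust_pad(pad: int, failure_rate: float, target_rate: float = 0.05,
               low_rate: float = 0.01, step: int = 128,
               max_pad: int = PAGE_SIZE // 2) -> int:
    """Return the next pad for a table given its recent compression failure rate."""
    if failure_rate > target_rate:
        # Too many failures: leave more of each page empty.
        return min(pad + step, max_pad)
    if failure_rate < low_rate:
        # Failures are rare: reclaim space by shrinking the pad.
        return max(pad - step, 0)
    return pad


def max_data_size(pad: int) -> int:
    """Bytes of data allowed on the uncompressed copy: 16K minus the pad."""
    return PAGE_SIZE - pad


print(adjust_pad(0, 0.41))    # failure rate above target, so the pad grows
print(max_data_size(2432))    # data limit for a pad of 2432 bytes
```

Re-running `adjust_pad` periodically per table is what lets the pad follow changes in data compressibility over time.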
Facebook Improvements
Adaptive padding on insert benchmark
▪   Padding value for sbtable is 2432.
▪   Compression failure rate:
    ▪   mysql: 41%.
    ▪   fb-mysql: 5%.
[Bar chart: inserts per second (y-axis 0 to 35000) for mysql-compressed and fb-mysql-compressed.]
Facebook Improvements
Compression ops in insert benchmark
[Bar chart: number of compression operations (y-axis 0 to 1400000), with series compress_ops_ok and compress_ops_fail, for mysql-compressed and fb-mysql-compressed.]
Facebook Improvements
Time spent for compression ops in insert benchmark
[Bar chart: time in seconds (y-axis 0 to 1200), with series compress_time(s) and decompress_time(s), for mysql-compressed and fb-mysql-compressed.]
Facebook Improvements
Other improvements
▪   Reduced the share of empty allocated pages from 10-15% to 2-5%.
▪   Cache memory allocations for:
    ▪   compression buffers,
    ▪   decompression buffers,
    ▪   buffer page descriptors.
▪   Hardware accelerated checksum for compressed pages.
▪   Remove adler32 calls from zlib functions.
Facebook Improvements
Future work
▪   Make page_zip_compress() more efficient.
▪   Test larger page sizes: 32K, 64K.
▪   Prefix compression.
▪   Other compression algorithms: snappy, quicklz, etc.
▪   3X compression in production.
Questions
nizam.ordulu@fb.com

Editor's notes

  1. Introduction; interruptions are OK, questions at the end.
  2. Use existing servers for a longer time.
  3. Linear growth until the first arrow. Drops correspond to compression of servers in batches. Percentages are computed from the current size and the predicted uncompressed size.
  4. For reference, these 3 arrows correspond to the same times as the previous arrows.
  5. I chose sysbench because it's a common benchmark framework.
  6. We could guess that this table would be compressible even before looking at the data.
  7. Grabbed the latest 5.1 source code from Launchpad. 4 versions: stock MySQL uncompressed, stock MySQL compressed, MySQL with the FB patch uncompressed, MySQL with the FB patch compressed.
  8. Note that even though compressed MySQL with the FB patch has higher throughput, it doesn't increase the disk space used by the database in this case.
  9. Just making sure the read-only performance is OK.
  10. This is the main difference in terms of performance.
  11. The results are not peculiar to in-memory workloads.
  12. Naïve way to implement compression: compress before flushing to disk. A less naïve but inefficient way: keep a compressed copy in memory and recompress on every update. What InnoDB does: modification log. Note that this would not be necessary for LSM-based architectures.
  13. Mention the assumptions about the compressibility of a page. Master-slave method for checking consistency.