The Problem
                          The Tests
                      Breakthroughs




Scaling MySQL writes through partitioning

     Philip Tellis / philip@bluesmoon.info


          IPC Spring – 2010-06-01 – Berlin




     IPC Spring – 2010-06-01 – Berlin   Scaling MySQL writes through partitioning


$ finger philip



      Philip Tellis
      geek
      yahoo
      @bluesmoon
      http://bluesmoon.info/
      philip@bluesmoon.info




The Problem: Our data, DB infrastructure, Performance


Web requests




     Millions of beacons from a web page
     No response required
     Can be batch processed
     A very small amount of data loss is acceptable






Large volume




     2000 requests/second on most days
     up to 8000 requests/second on some days
     200MM requests/day
     Some data is fake or abusive
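
As a rough check on these figures, the sketch below converts a sustained request rate into a daily total. The function name is illustrative, and the calculation assumes the rate holds for a full 86,400-second day, which real traffic will not do exactly.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def requests_per_day(rate_per_second):
    """Daily request count if the given rate were sustained all day."""
    return rate_per_second * SECONDS_PER_DAY


typical = requests_per_day(2000)  # 172,800,000
peak = requests_per_day(8000)     # 691,200,000
print(f"typical: {typical / 1e6:.1f}MM/day, peak: {peak / 1e6:.1f}MM/day")
```

A steady 2000 requests/second gives about 173MM requests/day, the same order as the 200MM/day quoted above once busier periods are included.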






Access patterns




     Lots of writes throughout the day
     One huge read at the end of the day
     Summarise data and throw out the details
     Many reads of summary data over several months








      Why not use a data warehouse?








I like to get the most out of my hardware






Hardware setup



     MySQL 5.1
     Multi-master replication across two colos, 1 remote slave per master
     Only one master writable at any point in time
     4GB RAM (later 16GB)
     Big disk: RAID 10, 6 x 1TB at 7200rpm






DB config


    innodb_buffer_pool_size = 2078M
    innodb_flush_log_at_trx_commit = 1
    innodb_log_buffer_size = 8M
    innodb_max_dirty_pages_pct = 90
    innodb_doublewrite = 1, innodb_support_xa = 1
    sync_binlog = 0
    key_buffer_size = 32M, myisam_sort_buffer_size = 512k
    transaction_isolation = REPEATABLE-READ






Data setup




     Each row is 120 bytes + InnoDB overhead
     innodb_file_per_table so we can see how the table grows
     No AUTO_INCREMENT fields
     PRIMARY KEY derived from data + one other index
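
A back-of-envelope sketch of how quickly raw row data alone reaches the size of the configured buffer pool. It assumes the 120-byte row size, the 2078M innodb_buffer_pool_size and the 2000 inserts/second rate quoted in the slides. It ignores InnoDB overhead and index pages, so a real table would reach the limit sooner. The function name is illustrative.

```python
ROW_BYTES = 120                      # row size from the slide, excluding InnoDB overhead
BUFFER_POOL_BYTES = 2078 * 1024**2   # innodb_buffer_pool_size = 2078M
INSERTS_PER_SECOND = 2000            # typical incoming rate


def rows_that_fit(pool_bytes, row_bytes):
    """Number of raw rows whose combined size fits in the buffer pool."""
    return pool_bytes // row_bytes


rows = rows_that_fit(BUFFER_POOL_BYTES, ROW_BYTES)
hours = rows / INSERTS_PER_SECOND / 3600
print(f"{rows:,} rows, about {hours:.1f} hours at {INSERTS_PER_SECOND} inserts/s")
```

Under these assumptions the raw data passes the buffer pool size after roughly 18 million rows, or about two and a half hours of typical traffic.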






Schema



    page identifier - INT
    timestamp - INT
    one-way hash of IP - CHAR(32)
    page performance information
    request country
    useragent






Test requirements




     Insert records until the system breaks down
     Find out why it broke down
     Find out how to make it not break down
     Find out how fast we can insert records (must be > 2000 inserts/second)






How I tested




     The insertion script measured insertion speed vs. number of records
     Number of records roughly translates to table size
     On the DB box we measured disk performance and table size
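
The slides do not show the insertion script itself. The following is only a minimal sketch of the measurement idea: insert in fixed-size steps and record the insert rate against the running row count. It uses Python's built-in sqlite3 with an in-memory database as a stand-in so that it runs without a MySQL server. The table name, column names and function name are all hypothetical.

```python
import sqlite3
import time


def measure_insert_rate(conn, total_rows, report_every):
    """Insert rows in steps and return a list of (rows_so_far, rows_per_second)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS beacons ("
        "page_id INTEGER, ts INTEGER, PRIMARY KEY (page_id, ts))"
    )
    samples = []
    inserted = 0
    while inserted < total_rows:
        upper = min(inserted + report_every, total_rows)
        batch = [(i % 1000, i) for i in range(inserted, upper)]  # synthetic, unique keys
        start = time.perf_counter()
        with conn:  # commit once per step
            conn.executemany("INSERT INTO beacons (page_id, ts) VALUES (?, ?)", batch)
        elapsed = time.perf_counter() - start
        inserted += len(batch)
        rate = len(batch) / elapsed if elapsed > 0 else float("inf")
        samples.append((inserted, rate))
    return samples


conn = sqlite3.connect(":memory:")
for rows_so_far, rate in measure_insert_rate(conn, total_rows=50_000, report_every=10_000):
    print(f"{rows_so_far:>8} rows  {rate:>12.0f} inserts/s")
```

Plotting the second column against the first gives the kind of insert rate vs. table size curve the tests rely on. The absolute numbers from SQLite say nothing about MySQL.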






Test 1




The Tests: Basic tests, Going crazy, Insights


Test 2 - Drop the secondary index






Test 3 - innodb_buffer_pool_size=1000






Realisation



     Max table size directly proportional to innodb_buffer_pool_size
     Extra index reduces insertion rate
     Extra index reduces max table size
     Possible solution: increase RAM and innodb_buffer_pool_size
     But this only postpones the problem






Test 4 - innodb_flush_log_at_trx_commit=2






Test 5 - innodb_max_dirty_pages_pct=60






Test 6 - Let’s try MyISAM






Test 7 - Inserts in a transaction






Other stuff we tried




      innodb_doublewrite=0 – no effect
      Server side prepared statements – no effect
      transaction_isolation=READ-COMMITTED – no effect
      innodb_support_xa=0 – 12% increase in insertion rate
      Combination of the best options – negligible effect






What we knew at this point




     Sticking with InnoDB
     We need a large buffer pool
     We need to drop extra indices
     innodb_flush_log_at_trx_commit = 2 is good enough
     Transactions are good






Our big problem


     Insert rate was barely reaching the rate of incoming data!
     Still breaks down before getting a day’s worth of data




Breakthroughs: Bulk inserts, Partitioning, Long running test


Test 8 - Single bulk insert






Bulk insert specifications




     40,000 records in one insert statement
     Use INSERT IGNORE
     4-6 seconds per statement
     PRIMARY KEY drops duplicates
     We still have a breakdown when the table outgrows the buffer pool
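
A minimal sketch of how a client might assemble one multi-row INSERT IGNORE statement. The table name, column names and helper functions are hypothetical; the slides do not show the actual statement. The sketch uses the `%s` placeholder style of common Python MySQL drivers and only builds the statement and its parameter list, without connecting to a server.

```python
def build_bulk_insert(table, columns, row_count):
    """Return a single multi-row INSERT IGNORE statement with %s placeholders."""
    if row_count < 1:
        raise ValueError("row_count must be at least 1")
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    return (
        f"INSERT IGNORE INTO {table} ({', '.join(columns)}) VALUES "
        + ", ".join([one_row] * row_count)
    )


def flatten(rows):
    """Flatten [(a, b), (c, d)] into [a, b, c, d] to match the placeholders."""
    return [value for row in rows for value in row]


rows = [(1, 1275379200), (2, 1275379201), (1, 1275379200)]  # last row is a duplicate
sql = build_bulk_insert("beacons", ["page_id", "ts"], len(rows))
params = flatten(rows)
print(sql)
```

The table and column names are interpolated directly, so they must come from trusted code rather than user input. A statement carrying 40,000 rows also has to fit within the server's max_allowed_packet setting.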






Handling insert failures




      Typically if one record is bad, the entire insert fails
      INSERT IGNORE solves this problem
      Bonus is easy recovery from hardware/network failures
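
The recovery point can be illustrated with a small sketch: when duplicates are dropped by the primary key, replaying a batch after a failure leaves the table unchanged. The sketch uses SQLite's INSERT OR IGNORE as a stand-in for MySQL's INSERT IGNORE so that it runs without a server. The two are not identical (MySQL's IGNORE also downgrades some other errors to warnings), and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE beacons (page_id INTEGER, ts INTEGER, PRIMARY KEY (page_id, ts))"
)


def load_batch(conn, rows):
    """Insert a batch, skipping rows whose primary key already exists; return row count."""
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO beacons (page_id, ts) VALUES (?, ?)", rows
        )
    return conn.execute("SELECT COUNT(*) FROM beacons").fetchone()[0]


batch = [(1, 100), (2, 100), (1, 100)]  # third row duplicates the first
after_first = load_batch(conn, batch)
after_replay = load_batch(conn, batch)  # e.g. a retry after a network failure
print(after_first, after_replay)        # prints "2 2"
```

Because a replay is harmless, the loader does not need to track exactly which rows of a failed batch were committed.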






Test 9 - bulk inserts + partitioning






What happened?




    Split the table into partitions
    Each partition < 0.5 × innodb_buffer_pool_size
    Current and next partition fit in memory at any time
    Partition key is based on incoming data and not on SELECTs
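
The sizing rule above can be turned into a small calculation. This is a sketch only: the function name is illustrative, the 24 GiB table and 8 GiB buffer pool in the example are made-up figures rather than numbers from the talk, and rows are assumed to spread evenly across partitions.

```python
import math


def min_partitions(table_bytes, buffer_pool_bytes, fraction=0.5):
    """Smallest partition count keeping each partition under fraction * buffer pool."""
    limit = fraction * buffer_pool_bytes
    # floor + 1 keeps every partition strictly below the limit (even spread assumed)
    return math.floor(table_bytes / limit) + 1


GIB = 1024**3
print(min_partitions(table_bytes=24 * GIB, buffer_pool_bytes=8 * GIB))  # prints 7
```

With each partition below half the buffer pool, the partition currently being written and the next one can be resident at the same time, which is the condition stated above.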






Schema



 CREATE TABLE (
     ...
 ) PARTITION BY RANGE( ( time DIV 3600 ) MOD 24 ) (
     PARTITION p0 VALUES LESS THAN (2),
     PARTITION p1 VALUES LESS THAN (4),
     ...
     PARTITION p10 VALUES LESS THAN (22),
     PARTITION p11 VALUES LESS THAN (24)
 );
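
To see which partition a given row lands in, the partitioning expression can be mirrored in a few lines. This sketch assumes `time` holds a Unix timestamp in seconds, and that the partitions elided by "..." continue in two-hour steps (p2 through p9). The function name is illustrative.

```python
def partition_for(ts):
    """Return the partition selected by RANGE((time DIV 3600) MOD 24)."""
    hour_of_day = (ts // 3600) % 24   # same as (time DIV 3600) MOD 24 for ts >= 0
    return f"p{hour_of_day // 2}"     # p0 covers hours 0-1, p1 hours 2-3, ..., p11 hours 22-23


midnight = 1275350400                         # 2010-06-01 00:00:00 UTC
print(partition_for(midnight))                # p0
print(partition_for(midnight + 14 * 3600))    # p7
```

Because the key wraps every 24 hours, consecutive inserts fall into the same two-hour partition and the same twelve partitions are reused each day.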






Test 10 - Ran for 7 days






Still running




      Terabytes of data
      Around 8500 inserts per second
      Potentially 700+ MM inserts per day






A Note on read performance




     Data volume before and after is very different
     Current read rate is approx 8000 records per second






What about a key-value store?



     Some summarisation queries are complex but not impossible with a k/v store
     Summary data is still relational in nature
     Avoid sharing resources (RAM/CPU) between two separate data stores
     We have not yet discarded this idea






Lots of critics



 The title should be "how to get poor performance
 by using a completely inappropriate tool"


                    Dude, that’s a really impressive bit of engineering.
                    However, you failed the interview.



 it’s just another case of RDBMS blinding people
 for all other possible solutions





Summary




    Bulk inserts push up your insert rate
    Partitioning lets you insert more records
    Partition based on incoming data key for fast inserts
    http://tech.bluesmoon.info/2009/09/scaling-writes-in-mysql.html






Photo credits


     Disused warehouse on Huddersfield Broad Canal / by TDR1
     http://www.flickr.com/photos/tdr1/3578203727/
     Hardware store dog / by sstrudeau
     http://www.flickr.com/photos/sstrudeau/330379020/
     North Dakota, Broken Down Van / by mattdente
     http://www.flickr.com/photos/mattdente/46944898/
     One red tree / by EssjayNZ
     http://www.flickr.com/photos/essjay/155223631/
     The Leaning Tree / by stage88
     http://www.flickr.com/photos/stage88/3179612722/





contact me



     Philip Tellis
     yahoo
     geek
     @bluesmoon
     http://bluesmoon.info/
     slideshare.net/bluesmoon
     philip@bluesmoon.info




 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 

Recently uploaded (20)

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 

Scaling MySQL writes through Partitioning - IPC Spring Edition

  • 1. Scaling MySQL writes through partitioning
      • Philip Tellis / philip@bluesmoon.info
      • IPC Spring – 2010-06-01 – Berlin
      • Sections: The Problem, The Tests, Breakthroughs
  • 2. $ finger philip
      • Philip Tellis
      • geek
      • yahoo
      • @bluesmoon
      • http://bluesmoon.info/
      • philip@bluesmoon.info
  • 3. Web requests
      • Millions of beacons from a web page
      • No response required
      • Can be batch processed
      • Very small amounts of data loss are acceptable
  • 4. Large volume
      • 2000 requests/second on most days
      • Up to 8000 requests/second on some days
      • 200MM requests/day
      • Some data is fake or abusive
  • 5. Access patterns
      • Lots of writes throughout the day
      • One huge read at the end of the day
      • Summarise data and throw out the details
      • Many reads of summary data over several months
  • 6. Why not use a data warehouse?
  • 7. I like to get the most out of my hardware
  • 8. Hardware setup
      • MySQL 5.1
      • Multi-master replication in two colos, 1 remote slave per master
      • Only one master writable at any point in time
      • 4GB RAM (later 16GB), big disk with RAID 10
      • RAID 10 7200rpm 6x1TB
  • 9. DB config
      • innodb_buffer_pool_size = 2078M
      • innodb_flush_log_at_trx_commit = 1
      • innodb_log_buffer_size = 8M
      • innodb_max_dirty_pages_pct = 90
      • innodb_doublewrite = 1, innodb_support_xa = 1
      • sync_binlog = 0
      • key_buffer_size = 32M, myisam_sort_buffer_size = 512k
      • transaction_isolation = REPEATABLE-READ
  • 10. Data setup
      • Each row 120 bytes + InnoDB overhead
      • innodb_file_per_table so we can see how the table grows
      • No autoincrement fields
      • PRIMARY KEY derived from data + one other index
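The figures on the slides so far are enough for a rough sizing estimate. The sketch below is only back-of-envelope arithmetic: it counts the 120-byte row payload alone (InnoDB and index overhead are not included, so real tables are larger), and it reads the "M" suffix in `innodb_buffer_pool_size = 2078M` as MiB. The function names are made up for illustration.

```python
# Back-of-envelope sizing from the figures on the slides.
ROW_BYTES = 120                       # per-row payload; InnoDB overhead NOT included
ROWS_PER_DAY = 200_000_000            # "200MM requests/day"
BUFFER_POOL_BYTES = 2078 * 1024 ** 2  # innodb_buffer_pool_size = 2078M, read as MiB


def daily_payload_bytes(rows=ROWS_PER_DAY, row_bytes=ROW_BYTES):
    """Raw payload written per day, ignoring index and storage-engine overhead."""
    return rows * row_bytes


def rows_fitting_in(pool_bytes=BUFFER_POOL_BYTES, row_bytes=ROW_BYTES):
    """Upper bound on the number of rows whose payload alone fits in the pool."""
    return pool_bytes // row_bytes


if __name__ == "__main__":
    print("daily payload: {:.1f} GB".format(daily_payload_bytes() / 1e9))
    print("rows fitting in pool (payload only): {:,}".format(rows_fitting_in()))
```

Under these assumptions a day of data is roughly an order of magnitude larger than the configured buffer pool, which is the gap the tests that follow run into.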
  • 11. Schema
      • page identifier - INT
      • timestamp - INT
      • one-way hash of IP - CHAR(32)
      • page performance information
      • request country
      • useragent
  • 12. Test requirements
      • Insert records until the system breaks down
      • Find out why it broke down
      • Find out how to make it not break down
      • Find out how fast we can insert records (must be >2000 i/s)
  • 16. How I tested
      • Insertion script measured insertion speed vs. number of records
      • Number of records roughly translates to table size
      • On the DB box we measure disk performance and table size
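The talk does not show the insertion script itself. As a rough illustration of the measurement it describes (insert rate sampled against the number of records already in the table), here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in so it runs without a MySQL server; the table name, the three columns, and the composite primary key are hypothetical, and numbers from SQLite say nothing about InnoDB behaviour.

```python
import sqlite3
import time


def measure_insert_rate(conn, total_rows, batch_rows):
    """Insert total_rows rows; return [(rows_so_far, rows_per_second), ...] per batch."""
    # Hypothetical schema: the slides only say the primary key is derived from the data.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS beacons ("
        " page_id INTEGER, ts INTEGER, ip_hash TEXT,"
        " PRIMARY KEY (page_id, ts, ip_hash))"
    )
    samples = []
    inserted = 0
    while inserted < total_rows:
        n = min(batch_rows, total_rows - inserted)
        # Synthetic rows; 1275350400 is just an arbitrary base timestamp.
        rows = [
            (i % 1000, 1275350400 + i, "{:032x}".format(i))
            for i in range(inserted, inserted + n)
        ]
        start = time.perf_counter()
        with conn:  # commit once per batch
            conn.executemany("INSERT INTO beacons VALUES (?, ?, ?)", rows)
        elapsed = time.perf_counter() - start
        inserted += n
        rate = n / elapsed if elapsed > 0 else float("inf")
        samples.append((inserted, rate))
    return samples


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    for rows_so_far, rate in measure_insert_rate(db, 50_000, 10_000):
        print("{:>8} rows  {:>12.0f} rows/s".format(rows_so_far, rate))
```

Plotting the second column against the first gives the kind of rate-versus-table-size curve the tests below refer to.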
  • 17. Test 1
  • 18. Test 2 - Drop the secondary index
  • 19. Test 3 - innodb_buffer_pool_size=1000
  • 20. Realisation
      • Max table size directly proportional to innodb_buffer_pool_size
      • Extra index reduces insertion rate
      • Extra index reduces max table size
      • Possible solution: increase RAM and innodb_buffer_pool_size
      • But this only postpones the problem
  • 22. Test 4 - innodb_flush_log_at_trx_commit=2
  • 23. Test 5 - innodb_max_dirty_pages_pct=60
  • 24. Test 6 - Let’s try MyISAM
  • 25. Test 7 - Inserts in a transaction
  • 26. Other stuff we tried
      • innodb_doublewrite=0 – no effect
      • Server side prepared statements – no effect
      • transaction_isolation=READ-COMMITTED – no effect
      • innodb_support_xa=0 – 12% increase in insertion rate
      • Combination of the best options – negligible effect
  • 27. What we knew at this point
      • Sticking with InnoDB
      • We need a large buffer pool
      • We need to drop extra indices
      • innodb_flush_log_at_trx_commit = 2 is good enough
      • Transactions are good
  • 28. Our big problem
      • Insert rate was barely reaching the rate of incoming data!
      • Still breaks down before getting a day’s worth of data
  • 29. Test 8 - Single bulk insert
  • 30. Bulk insert specifications
      • 40,000 records in one insert statement
      • Use INSERT IGNORE
      • 4-6 seconds per statement
      • PRIMARY KEY drops duplicates
      • We still have a breakdown when we cross the buffer pool
  • 31. Handling insert failures
      • Typically if one record is bad, the entire insert fails
      • INSERT IGNORE solves this problem
      • Bonus is easy recovery from hardware/network failures
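The bulk insert described on the two slides above can be sketched as a helper that splits records into chunks of 40,000 and builds one multi-row `INSERT IGNORE` statement per chunk. This is an illustration, not the script used in the talk: the function names are invented, the `%s` placeholders follow the convention of the common Python MySQL drivers, and table and column names are interpolated directly, so they must come from trusted code rather than user input.

```python
def chunked(records, size=40_000):
    """Yield successive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def build_bulk_insert(table, columns, rows):
    """Return (sql, params) for a single multi-row INSERT IGNORE statement."""
    if not rows:
        raise ValueError("rows must not be empty")
    if any(len(row) != len(columns) for row in rows):
        raise ValueError("every row needs one value per column")
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT IGNORE INTO {} ({}) VALUES {}".format(
        table,
        ", ".join(columns),
        ", ".join([one_row] * len(rows)),
    )
    # Flatten the rows so the values line up with the placeholders.
    params = [value for row in rows for value in row]
    return sql, params


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    records = [(1, 1275350400), (2, 1275350401), (1, 1275350400)]
    for batch in chunked(records, size=2):
        sql, params = build_bulk_insert("beacons", ["page_id", "ts"], batch)
        print(sql, params)
```

Because `IGNORE` turns duplicate-key errors into warnings, a batch can be re-sent after a failure: rows already stored are skipped by the primary key instead of aborting the statement.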
  • 32. Test 9 - bulk inserts + partitioning
  • 33. What happened?
      • Split the table into partitions
      • Each partition < 0.5 × innodb_buffer_pool_size
      • Current and next partition fit in memory at any time
      • Partition key is based on incoming data and not on SELECTs
  • 34. Schema
      CREATE TABLE ( ... )
      PARTITION BY RANGE( ( time DIV 3600 ) MOD 24 ) (
          PARTITION p0 VALUES LESS THAN (2),
          PARTITION p1 VALUES LESS THAN (4),
          ...
          PARTITION p10 VALUES LESS THAN (22),
          PARTITION p11 VALUES LESS THAN (24)
      );
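To see which partition a given row lands in, the partitioning expression from the schema slide can be mirrored in a few lines. The sketch assumes `time` is a Unix timestamp in seconds and that the partitions elided on the slide continue the same two-hour pattern; the function name is made up.

```python
def partition_for(ts, hours_per_partition=2):
    """Mirror PARTITION BY RANGE((time DIV 3600) MOD 24) with two-hour ranges."""
    hour_of_day = (ts // 3600) % 24  # same value as (time DIV 3600) MOD 24
    # p0 holds hours 0-1 (VALUES LESS THAN (2)), p1 holds hours 2-3, and so on.
    return "p{}".format(hour_of_day // hours_per_partition)


if __name__ == "__main__":
    for ts in (0, 7199, 7200, 79200, 86399, 86400):
        print(ts, partition_for(ts))
```

Since beacons arrive roughly in time order, consecutive inserts hit the same partition for two hours at a stretch, which is what keeps the active partition inside the buffer pool.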
  • 35. Test 10 - Ran for 7 days
  • 36. Still running
      • Terabytes of data
      • Around 8500 inserts per second
      • Potentially 700+ MM inserts per day
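The per-day figure follows from the per-second rate by simple arithmetic, and the same arithmetic relates it to the bulk-insert numbers quoted earlier (40,000 records per statement at 4-6 seconds each). A small sketch, with invented function names:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def rows_per_day(rows_per_second):
    """Daily total for a sustained per-second insert rate."""
    return rows_per_second * SECONDS_PER_DAY


def batch_rate(rows_per_statement, seconds_per_statement):
    """Rows per second achieved by one bulk statement."""
    return rows_per_statement / seconds_per_statement


if __name__ == "__main__":
    print("{:,} rows/day at 8500 rows/s".format(rows_per_day(8500)))
    print("{:,.0f} to {:,.0f} rows/s per 40,000-row statement".format(
        batch_rate(40_000, 6), batch_rate(40_000, 4)))
```

At 8500 inserts per second this comes to about 734 million rows a day, in line with the "700+ MM" on the slide and well above the 200MM requests/day the system has to absorb.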
  • 37. A note on read performance
      • Data volume before and after is very different
      • Current read rate is approx 8000 records per second
  • 38. What about a key-value store?
      • Some summarisation queries are complex but not impossible with a k/v store
      • Summary data is still relational in nature
      • Avoid sharing resources (RAM/CPU) between two separate data stores
      • We have not yet discarded this idea
  • 39. Lots of critics
      • The title should be "how to get poor performance by using a completely inappropriate tool"
      • Dude, that’s a really impressive bit of engineering. However, you failed the interview.
      • it’s just another case of RDBMS blinding people for all other possible solutions
  • 40. Summary
      • Bulk inserts push up your insert rate
      • Partitioning lets you insert more records
      • Partition based on incoming data key for fast inserts
      • http://tech.bluesmoon.info/2009/09/scaling-writes-in-mysql.html
  • 41. Photo credits
      • Disused warehouse on Huddersfield Broad Canal / by TDR1 http://www.flickr.com/photos/tdr1/3578203727/
      • Hardware store dog / by sstrudeau http://www.flickr.com/photos/sstrudeau/330379020/
      • North Dakota, Broken Down Van / by mattdente http://www.flickr.com/photos/mattdente/46944898/
      • One red tree / by EssjayNZ http://www.flickr.com/photos/essjay/155223631/
      • The Leaning Tree / by stage88 http://www.flickr.com/photos/stage88/3179612722/
  • 42. Contact me
      • Philip Tellis
      • yahoo
      • geek
      • @bluesmoon
      • http://bluesmoon.info/
      • slideshare.net/bluesmoon
      • philip@bluesmoon.info