MySQL Scale-Out by application partitioning

Oli Sennhauser
Rebenweg 6
CH – 8610 Uster
Switzerland
oli.sennhauser@bluewin.ch



Introduction
Eventually every database system hits its limits. Especially on the
Internet, where millions of users can theoretically access your database
simultaneously, your IO system will eventually become a bottleneck.

Conventional solutions
In general, as a first step, MySQL Replication is used to scale out in
such a situation. MySQL Replication scales very well when you have a high
read/write (r/w) ratio. The higher the better.

[Figure: several web servers in front of several application servers,
which write to one MySQL Replication master (write) and read from several
MySQL Replication slaves (readonly).]

But such a MySQL Replication system (let us call it “MySQL Replication
cluster” [4] rather than “MySQL Cluster” in this paper) also hits its
limits when you have a huge amount of (write) access.
Because database systems access the disk randomly, it is not the
throughput of your IO system that is relevant but the number of IO
operations per second (random seeks). You can scale this in a very limited
way by adding more disks to your IO system, but here too you eventually
hit a limit (price).

Scale-out possibilities
So we have to think about other possibilities to scale out. One
possibility would be to use MySQL Cluster. This solution can be very fast
because it is not IO bound. But it has some other limits, such as the
amount of available RAM and joins not performing too well. If these
limitations were not applicable, MySQL Cluster would be a good and
performant solution.
Another promising but more complex solution with nearly no scale-out
limits is application partitioning. If and when you get into the top-1000
rank on Alexa [1], you have to think about such solutions.

Application partitioning
What does “application partitioning” mean? Application partitioning means
the following:

“Application partitioning distributes application processing across all
system resources...”

There are 2 different kinds of application partitioning: horizontal and
vertical application partitioning.

Horizontal application partitioning
Horizontal application partitioning is also known as Multi-Tier-Computing
[2], which means splitting the system into the database back end, the
application server (middle tier), the web server, and the client doing the
display. Nowadays this is common sense and good practice.
But with horizontal application partitioning you still have not avoided
the IO bottleneck on the database back end.
[Figure: horizontal application partitioning. A client and a monolithic
system (containing database back end, application logic and presentation
logic) are split into web client, web server, application server / middle
tier, and MySQL database back end.]

Vertical application partitioning
With vertical application partitioning you can scale out your system to a
nearly unlimited degree. The more loosely coupled your vertical
application partitions are, the better the whole system scales [3].

But what does vertical application partitioning mean? Suppose, for
example, you have an on-line contact website with 1 million users. Some of
them, let's say 20%, are actively searching for contacts with other
people. Each of these actively searching users makes 10 contact requests
per day. This gives approximately 2 million changes per day in the back
end (2'000'000 / 86'400 s, about 23 changes per second). In general one
contact request results in more than one change in the database, and
people do this contact search mostly during peak hours (1/3 of the day).
This can easily result in several hundred changes per second on the
database during peak time. But your I/O system is roughly limited by this
formula:

           250 I/O's /s per disk * #disks = #I/O /s

When you are using MySQL Replication, some caching mechanisms (MySQL query
cache, block buffer cache, file system cache, battery buffered disk cache,
etc.) can help, and when you follow the concept of “relaxation of
constraints” you can increase this amount of I/O by some factors. You can
handle these 1 million users on an optimized MySQL Replication cluster
(when you have tuned it properly).
But what happens when you want to scale by a factor of 10 or even 100?
With 10 million users your system definitely hits its limits. How do we
scale here?
In this case we can only scale if we split up one MySQL Replication
cluster into several pieces. This splitting can be done, for example, by
user (user_id).

[Figure: a single stack of WS, AS, M and S is split into WS and AS in
front of four MySQL Replication clusters M1/S1, M2/S2, M3/S3 and M4/S4.]

The splitting should be done by the entity with the smallest possible
interaction. Otherwise a lot of synchronization work has to be done
between the database nodes concerned. It should also be considered that
some data can or must be kept redundantly (general static information, for
example geographic information). This can also be done by a separate
replication tree:

[Figure: WS and AS in front of the MySQL Replication clusters M1/S1, M2/S2
and M3/S3, plus a separate replication tree M'/S'.]

The disadvantage of this solution is that you have to (keep) open at least
1 connection from each application server (AS) to each Master (Mn) and
Slave (Sn) of each MySQL Replication cluster. So the limitation of this
system is roughly (where “:” denotes division):

        #Conn./Server : #AS Conn. = #Replication clusters

                       1000 : 50 = 20

When this limit too has been hit, a much more sophisticated solution with
distribution of the users in the AS and WS tier has to be considered:

[Figure: the Internet in front of three separate stacks, each consisting
of its own WS, AS, master and slave (WS/AS/M1/S1, WS/AS/M2/S2,
WS/AS/M3/S3).]

But in this concept something like an “asynchronous inter MySQL
Replication Cluster” protocol has to be established.

How to partition an entity
An entity can be split up in several different ways:

Partition by RANGE
Users are distributed to their MySQL Replication cluster, for example by
their user_id. For every 1 million users you have to provide a new MySQL
Replication cluster:

[Figure: the AS routes to one MySQL Replication cluster per range:
user_id < 1'000'000, user_id < 2'000'000, user_id < 3'000'000, ...]
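
The RANGE rule (one MySQL Replication cluster per 1 million user_ids) can be
sketched in a few lines of Python. This is only an illustration under
assumptions that are not in the paper: the function names, the 1-based
cluster numbering and the list of upper bounds are hypothetical.

```python
from bisect import bisect_right

# Assumption: every cluster holds 1'000'000 consecutive user_ids,
# as in the example of the text.
USERS_PER_CLUSTER = 1_000_000


def cluster_for_user(user_id: int, users_per_cluster: int = USERS_PER_CLUSTER) -> int:
    """Return the 1-based cluster number for equally sized ranges (DIV)."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return user_id // users_per_cluster + 1


def cluster_for_user_by_bounds(user_id: int, upper_bounds: list[int]) -> int:
    """Return the 1-based cluster number for explicit, sorted upper bounds.

    Cluster i holds all user_ids strictly below upper_bounds[i - 1] that
    are not already covered by an earlier cluster.
    """
    index = bisect_right(upper_bounds, user_id)
    if index == len(upper_bounds):
        # No cluster provisioned yet for this range: add a new one first.
        raise LookupError(f"no cluster provisioned for user_id {user_id}")
    return index + 1


# Hypothetical bounds matching "user_id < 1'000'000, < 2'000'000, < 3'000'000".
BOUNDS = [1_000_000, 2_000_000, 3_000_000]
```

With these bounds, user_id 999'999 is routed to cluster 1 and user_id
1'000'000 to cluster 2. Growing the system only means appending a new
upper bound; no existing user has to move.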
Advantages of RANGE:
•   No redistribution of users during growth needed. You only have to add
    a new MySQL Replication cluster.
•   Slightly improves locality of reference [5].
•   Easy to understand.
•   Easy to locate data.
•   Likelihood of hot-spots is low.
•   Simple distribution logic can be implemented.
Disadvantages of RANGE:
•   On the “old” MySQL Replication clusters it is likely that you get less
    and less activity. So you either have to waste hardware resources
    (which is not too serious, because these machines are depreciated
    after some years and “only” consume some power and space in your IT
    center) or you have to migrate users from the oldest MySQL Replication
    clusters once in a while (which causes a lot of traffic and probably
    some downtime on these 2 machines).
•   Resource balancing causes a lot of migration work.
•   When resource balancing is done, simple distribution logic does not
    apply anymore. Then a lookup mechanism is needed.

Partition by certain CHARACTERISTICS
Users are distributed by certain characteristics, for example last name,
birth date or country.

[Figure: the AS routes to three MySQL Replication clusters by last name:
last_name BETWEEN 'A' AND 'I', last_name BETWEEN 'J' AND 'R', last_name
BETWEEN 'S' AND 'Z'.]

Advantages:
•   Easy to understand.
•   Easy to locate data.
Disadvantages:
•   You can get “hot-spots”, for example on the server with the last names
    starting with “S” or with some countries like US, JP, D etc., and get
    unused resources on servers with, for example, birth date February
    29th, last names with “X” and “Y” or countries like the Principality
    of Liechtenstein, Monaco or Andorra. This can cause a necessity for
    redistribution of data.
•   This can be avoided by merging some of the values into one MySQL
    Replication cluster, but then some look-up table must exist.
•   Resource balancing is difficult.

Partitioning by HASH/MODULO
An entity can also be split up by some other functions like MODULO. The
MySQL Replication cluster is determined by either:

           Cluster = user_id MOD #Clusters

or

           Cluster = HASH(last_name) MOD #Clusters

Splitting up by DIV is already discussed in “Partition by RANGE”.

Advantages of HASH:
•   Random distribution, thus no hot-spots.
Disadvantages of HASH:
•   For rebalancing the whole system must be migrated!
•   Hot-spots can happen if done wrong, for example
    HASH(country) MOD #Clusters.

Advantages of MOD:
•   Deterministic distribution, target cluster is easily visible “by
    hand”.
Disadvantages of MOD:
•   For rebalancing the whole system must be migrated!

Partition by LOAD (with lookup table)
A dynamic way to partition users is measuring the load of each MySQL
Replication cluster (somehow) and distributing new users accordingly
(similar to a load balancer). For this, a more or less static lookup table
is needed to determine, for every user, on which MySQL Replication cluster
the user is located.

[Figure: the AS consults a lookup table and routes to four MySQL
Replication clusters with 50% load, 90% load, 60% load and 20% load.]

Advantages:
•   A new MySQL Replication cluster is automatically loaded more until it
    reaches saturation.
•   No data redistribution is needed.
Disadvantages:
•   When old users are not removed after some posting peaks can happen on
    the old systems.

Literature
[1] Alexa top 500 ranking: http://www.alexa.com
[2] Multi-Tier-Computing:
    http://en.wikipedia.org/wiki/Three-tier_%28computing%29
[3] Loose coupling: http://en.wikipedia.org/wiki/Loose_coupling
[4] Cluster: http://en.wikipedia.org/wiki/Computer_cluster
[5] Locality of reference:
    http://en.wikipedia.org/wiki/Locality_of_reference
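
The two rules from “Partitioning by HASH/MODULO” (user_id MOD #Clusters and
HASH(last_name) MOD #Clusters) can be sketched as follows. The function names
and the choice of SHA-256 as the hash are assumptions made for this
illustration; the paper does not prescribe a hash function. Python's built-in
hash() is deliberately not used for strings, because it may differ between
processes.

```python
import hashlib
from typing import Iterable


def cluster_by_modulo(user_id: int, num_clusters: int) -> int:
    """Cluster = user_id MOD #Clusters (clusters are numbered from 0)."""
    return user_id % num_clusters


def cluster_by_hash(last_name: str, num_clusters: int) -> int:
    """Cluster = HASH(last_name) MOD #Clusters, with a stable hash.

    Assumption: SHA-256 stands in for HASH(); any hash that is stable
    across application servers would do.
    """
    digest = hashlib.sha256(last_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_clusters


def moved_fraction(user_ids: Iterable[int], old_clusters: int, new_clusters: int) -> float:
    """Fraction of users whose target cluster changes after resizing."""
    ids = list(user_ids)
    moved = sum(
        1
        for user_id in ids
        if cluster_by_modulo(user_id, old_clusters)
        != cluster_by_modulo(user_id, new_clusters)
    )
    return moved / len(ids)
```

The helper moved_fraction makes the rebalancing disadvantage visible: with
user_ids 0 to 999, growing from 4 to 5 clusters changes the target cluster
for 80% of the users, which is why nearly the whole system has to be
migrated.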

TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Kürzlich hochgeladen (20)

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Application Partitioning Wp

MySQL Scale-Out by application partitioning

Oli Sennhauser
Rebenweg 6
CH – 8610 Uster
Switzerland
oli.sennhauser@bluewin.ch

Introduction

Eventually every database system hits its limits. Especially on the Internet, where millions of users can theoretically access your database simultaneously, your IO system will eventually become a bottleneck.

Conventional solutions

In general, as a first step, MySQL Replication is used to scale out in such a situation. MySQL Replication scales very well when you have a high read/write (r/w) ratio. The higher the better.

[Figure: web servers and application servers in front of one MySQL Replication master (write) and several MySQL Replication slaves (read-only).]

But even such a MySQL Replication system (let us call it "MySQL Replication cluster" [4] rather than "MySQL Cluster" in this paper) hits its limits when you have a huge amount of (write) access. Because database systems access the disk randomly, it is not the throughput of your IO system that is relevant but the number of IO operations per second (random seeks). You can scale this in a very limited way by adding more disks to your IO system, but here too you eventually hit a limit (price).

Scale-out possibilities

So we have to think about other possibilities to scale out. One possibility would be to use MySQL Cluster. This solution can be very fast because it is not IO bound. But it has some other limits, such as the amount of available RAM and joins not performing too well. If these limitations are not applicable, MySQL Cluster would be a good and performant solution.

Another promising but more complex solution with nearly no scale-out limits is application partitioning. If and when you get into the top-1000 rank on Alexa [1], you have to think about such solutions.

Application partitioning

What does "application partitioning" mean? Application partitioning means the following:

"Application partitioning distributes application processing across all system resources..."

There are 2 different kinds of application partitioning: horizontal and vertical application partitioning.

Horizontal application partitioning

Horizontal application partitioning is also known as Multi-Tier-Computing [2], which means splitting the system into the database back end, the application server (middle tier), the web server, and the client doing the display. Nowadays this is common sense and good practice. But with horizontal application partitioning you still have not avoided the IO bottleneck on the database back end.
[Figure: a monolithic system (containing database back end, application logic and presentation logic) compared with horizontal application partitioning into client (web client), web server (WS), application server / middle tier (AS) and MySQL database back end.]

Vertical application partitioning

With vertical application partitioning you can scale out your system to a nearly unlimited degree. The more loosely coupled your vertical application partitions are, the better the whole system scales [3].

But what does vertical application partitioning actually mean? For example, suppose you have an on-line contact website with 1 million users. Some of them, let's say 20%, are actively searching for contacts with other people. Each of these actively searching users makes 10 contact requests per day. This gives approximately 2 million changes per day in the back end (23 changes per second). In general one contact request results in more than one change in the database, and people also do this contact search during peak hours (1/3 of the day). This can easily result in several hundred changes per second on the database during peak time. But your I/O system is roughly limited by this formula:

250 I/Os per second per disk * #disks = #I/Os per second

When you are using MySQL Replication, some caching mechanisms (MySQL query cache, block buffer cache, file system cache, battery-buffered disk cache, etc.) can help, and when you follow the concept of "relaxation of constraints" you can increase this amount of I/O by some factors. You can handle these 1 million users on an optimized MySQL Replication cluster (when you have tuned it properly).

But what happens when you want to scale by a factor of 10 or even 100? With 10 million users your system definitely hits its limits. How do we scale here? In this case we can only scale if we split up one MySQL Replication cluster into several pieces. This splitting can be done, for example, by user (user_id).

[Figure: two web servers (WS) and two application servers (AS); the single master M with slave S is split into masters M1 to M4 with slaves S1 to S4.]

The splitting should be done by the entity with the smallest possible interaction. Otherwise a lot of synchronization work has to be done between the database nodes concerned. It should also be considered that some data can or must be kept redundant (general static information such as geographic information). This can also be done by a separate replication tree:

[Figure: an application server (AS) connected to masters M1, M2, M3 with slaves S1, S2, S3, plus a separate replication tree M' with slave S'.]

The disadvantage of this solution is that you have to (keep) open at least 1 connection from each application server (AS) to each Master (Mn) and Slave (Sn) of each MySQL Replication cluster. So the limitation of this system is roughly:

#Conn./Server : #AS Conn. = #Replication clusters

1000 : 50 = 20

When this limit has been hit too, a much more sophisticated solution that distributes the users in the AS and WS tiers has to be considered:

[Figure: the Internet in front of three separate stacks, each with its own web server (WS), application server (AS), master (M1, M2, M3) and slave (S1, S2, S3).]

But in this concept something like an "asynchronous inter MySQL Replication Cluster" protocol has to be established.

How to partition an entity

An entity can be split up in several different ways:

Partition by RANGE

Users are distributed to their MySQL Replication cluster, for example by their user_id. For every 1 million users you have to provide a new MySQL Replication cluster:

[Figure: an application server (AS) routing to clusters by range: user_id < 1'000'000, user_id < 2'000'000, user_id < 3'000'000, ...]
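The range rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cluster_for_user`, the 0-based cluster index and the constant name are hypothetical and do not come from the paper; the only value taken from the text is the bucket size of 1 million users per MySQL Replication cluster.

```python
# Assumption: clusters are numbered from 0 and each holds a fixed-size
# range of user_ids (1 million per cluster, as in the text).
USERS_PER_CLUSTER = 1_000_000


def cluster_for_user(user_id: int, users_per_cluster: int = USERS_PER_CLUSTER) -> int:
    """Return the 0-based index of the replication cluster that holds user_id."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    # Integer division (DIV) maps 0..999'999 to cluster 0,
    # 1'000'000..1'999'999 to cluster 1, and so on.
    return user_id // users_per_cluster


if __name__ == "__main__":
    for uid in (0, 999_999, 1_000_000, 2_500_000):
        print(uid, "->", cluster_for_user(uid))
```

Because the mapping depends only on the user_id and a constant, growth means adding a new cluster for the next range; no existing user has to move.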
Advantages of partitioning by RANGE:

• No redistribution of users is needed during growth. You only have to add a new MySQL Replication cluster.
• Slightly improves locality of reference [5].
• Easy to understand.
• Easy to locate data.
• Likelihood of hot-spots is low.
• Simple distribution logic can be implemented.

Disadvantages of partitioning by RANGE:

• On the "old" MySQL Replication clusters it is likely that you get less and less activity. So you either have to waste hardware resources (which is not too serious because these machines are depreciated after some years and "only" consume some power and space in your IT center) or you have to migrate users from the oldest MySQL Replication clusters once in a while (which causes a lot of traffic and probably some downtime on these 2 machines).
• Resource balancing causes a lot of migration work.
• Once resource balancing has been done, the simple distribution logic does not apply anymore. Then a lookup mechanism is needed.

Partition by a certain CHARACTERISTIC

Users are distributed by certain characteristics, for example last name, birth date or country.

[Figure: an application server (AS) routing to three clusters: last_name BETWEEN 'A' AND 'I', last_name BETWEEN 'J' AND 'R', last_name BETWEEN 'S' AND 'Z'.]

Advantages:

• Easy to understand.
• Easy to locate data.

Disadvantages:

• You can get "hot-spots", for example on the server with the last names starting with "S" or with some countries like US, JP, D etc., and unused resources on servers with, for example, birth date February 29th, last names starting with "X" and "Y", or countries like the Principality of Liechtenstein, Monaco or Andorra. This can make a redistribution of data necessary.
• This can be avoided by merging some of the values into one MySQL Replication cluster, but then some look-up table must exist.
• Resource balancing is difficult.

Partitioning by HASH/MODULO

An entity can also be split up by some other functions like MODULO. The MySQL Replication cluster is determined by either:

Cluster = user_id MOD #Clusters

or

Cluster = HASH(last_name) MOD #Clusters

Splitting up by DIV is already discussed in "Partition by RANGE".

Advantages of HASH:

• Random distribution, thus no hot-spots.

Disadvantages of HASH:

• For rebalancing, the whole system must be migrated!
• Hot-spots can happen if it is done wrong, for example HASH(country) MOD #Clusters.

Advantages of MOD:

• Deterministic distribution; the target cluster is easily visible "by hand".

Disadvantages of MOD:

• For rebalancing, the whole system must be migrated!

Partition by LOAD (with lookup table)

A dynamic way to partition users is to measure the load of each MySQL Replication cluster (somehow) and to distribute new users accordingly (similar to a load balancer). For this, a more or less static lookup table is needed that records, for every user, on which MySQL Replication cluster the user is located.

[Figure: an application server (AS) with a lookup table in front of four clusters at 90% load, 50% load, 60% load and 20% load.]

Advantages:

• A new MySQL Replication cluster automatically receives more load until it reaches saturation.
• No data redistribution is needed.

Disadvantages:

• If old users are not removed, posting peaks can still happen on the old systems.

Literature

[1] Alexa top 500 ranking: http://www.alexa.com
[2] Multi-Tier-Computing: http://en.wikipedia.org/wiki/Three-tier_%28computing%29
[3] Loose coupling: http://en.wikipedia.org/wiki/Loose_coupling
[4] Cluster: http://en.wikipedia.org/wiki/Computer_cluster
[5] Locality of reference: http://en.wikipedia.org/wiki/Locality_of_reference