SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Downloaden Sie, um offline zu lesen
Memory-Based Cloud
Architectures



Jan Schaffner

Enterprise Platform and Integration Concepts Group
Definition




             Cloud Computing
                    =
             Data Center + API
What to take home from this talk?
Answers to four questions:


    !  Why are memory based architectures great for cloud computing?


    !  How predictable is the behavior of an in-memory column database?


    !  Does virtualization have a negative impact on in-memory databases?


    !  How do I assign tenants to servers in order to manage fault-tolerance and
       scalability?
First question




        Why are memory based architectures
            great for cloud computing?
Numbers everyone should know
"  L1 cache reference                    0.5 ns
"  Branch mispredict                     5 ns
"  L2 cache reference                    7 ns
"  Mutex lock/unlock                     25 ns
"  Main memory reference                 100 ns (in 2008)
"  Compress 1K bytes with Zippy          3,000 ns
"  Send 2K bytes over 1 Gbps network     20,000 ns
"  Read 1 MB sequentially from memory    250,000 ns
"  Round trip within same datacenter     500,000 ns (in 2008)
"  Disk seek                             10,000,000 ns
"  Read 1 MB sequentially from network   10,000,000 ns
"  Read 1 MB sequentially from disk      20,000,000 ns
"  Send packet CA ! Netherlands ! CA     150,000,000 ns
                                                            Source: Jeff Dean
Memory should be the
system of record

"  Typically disks have been the system of record
   !  Slow ! wrap them in complicated caching and distributed file
      systems to make them perform
   !  Memory used as cache all over the place but it can be
      invalidated when something changes on disk
"  Bandwidth:
   !  Disk:                   120 MB/s/controller
   !  DRAM (x86 + FSB):       10.4 GB/s/board
   !  DRAM (Nehalem):         25.6 GB/s/socket
"  Latency:
   !  Disk:           13 milliseconds (up to seconds when queuing)
   !  InfiniBand:     1-2 microseconds
   !  DRAM:           5 nanoseconds
High-end networks vs. disks


                        Maximum bandwidths:

       Hard Disk                 100-120 MB/s
       SSD                       250 MB/s
       Serial ATA II             600 MB/s
       10 GB Ethernet            1204 MB/s
       InfiniBand                1250 MB/s (4 channels)
       PCIe Flash Storage        1400 MB/s
       PCIe 3.0                  32 GB/s
       DDR3-1600                 25.6 GB/s (dual channel)
Designing a database for the cloud

"  Disks are the limiting factor in contemporary database systems
   !  Sharing a high performance disk on a machine/cluster/cloud is
      fine/troublesome/miserable
   !  While one guy is fetching 100 MB/s, everyone else is waiting


"  Claim: Two machines + network is better than one machine + disk
   !  Log to disk on a single node:
      > 10,000 !s     (not predictable)
   !  Transactions only in memory but on two nodes:
      < 600 !s        (more predictable)


"  Concept: Design to the strengths of cloud (redundancy) rather than
   their weaknesses (shared anything)
Design choices for a cloud database

"  No disks (in-memory delta tables + async snapshots)
"  Multi-master replication
    !  Two copies of the data
    !  Load balancing both reads and (monotonic) writes
    !  (Eventual) consistency achieved via MVCC (+ Paxos, later)
"  High-end hardware
    !  Nehalem for high memory bandwidth
    !  Fast interconnect
"  Virtualization
    !  Ease of deployment/administration
    !  Consolidation/multi-tenancy
Why consolidation?

"  In-memory column databases are ideal for mixed workload
   processing
"  But: In a SaaS environment it seems costly to give everybody
   their private NewDB box


"  How much consolidation is possible?
   !  3 years worth of sales records from our favorite
      Fortune 500 retail company
   !  360 million records
   !  Less than 3 GB in compressed columns in memory
   !  Next door is a machine with 1 TB of DRAM
   !  (Beware of overhead)
Multi-tenancy in the database –
four different options
"  No multi-tenancy – one VM per tenant
    !  Ex.: RightNow has 3000 tenants in 200 databases (2007):
       3000 vs. 200 Amazon VMs cost $2,628,000 vs. $175,200/year
    !  Very strong isolation


"  Shared machine – one database process per tenant
                                                                           T1
    !  Scheduler, session manager and transaction manager need
       live inside the individual DB processes: IPC for synchronization          T3
                                                                           T2
    !  Good for custom extensions, good isolation


"  Shared process – one schema instance per tenant
                                                                           T1
    !  Must support large numbers of tables                                      T3
                                                                           T2
    !  Must support online schema extension and evolution


"  Shared table – use a tenant_id column and partitioning
                                                                          T1, T2, T3
    !  Bad for custom extensions, bad isolation
    !  Hard to backup/restore/migrate individual tenants
Putting it all together:
Rock cluster architecture




                                      &))-./(012%   &))-./(012%
                     89!:%3;<*+7%
                                        3+,4+,%       3+,4+,%

 Extract data from
 external system          67)1,*+,%                       =-5<*+,%   Cluster membership,
                                           Rock                      Tenant placement
                                                          >(<*+,%
 Load balance
 between replicas                "15*+,%       "15*+,%

 Forward writes to
 other replicas         &'()*+,%         &'()*+,%        &'()*+,%


                         !"#$%             !"#$%          !"#$%
Second question




       How predictable is the behavior of an
          in-memory column database?
What does “predictable” mean?
"  Traditionally, database people are concerned with the questions of type
   “how do I make a query faster?”


"  In a SaaS environment, the question is
   “how do I get a fixed (low) response time as cheap as possible?”
    !  Look at throughput
    !  Look at quantiles (e.g. 99-th percentile)


"  Example formulation of desired performance:
    !  Response time goal “1 second in the 99-th percentile”
    !  Average response time around 200 ms
    !  Less than 1% of all queries exceed 1,000 ms
    !  Results in a maximum number of concurrent queries before
       response time goal is violated
each test contained several tenants of a particular tenant size so that
                                 20% of the available main memory has always been used for tenant data
                                 preloaded into main memory before each test run and a six minute warm
System capacity
                                 performed before each run. The test data was generated by the Star Schem
"  Fixed amount of data split equally among all tenants
                    Data Generator [36].
                 200
                                   Figure 4.2 shows the Measured
                                                        same graph using a logarithmic scale. This grap
                 180                               Approx. Function
                 160             linear which means that the relation can be described as
                 140                                             Can be expressed as:
  Requests / s




                 120                                         log(f (tSize )) = m · log(tSize ) + n
                 100
                  80             where n is the intercept with the y-axis f (0) and m is the gradient. T
                  60             n and the gradient m can be estimated using regression and the Least-S
                  40
                                 (see chapter 2.3.2) or simply by using a slope triangle. The gradient fo
                  20
                       20   40   60 ≈80 100 120 140(0) = n ≈ 3.6174113. This equation can, then, be
                                 m −0.945496 and f 160 180 200 220
                                        Tenant Size in MB
                                 equation (4.1).
"  Capacity " bytes scanned per second
   (there is a small overhead when processing more requests)
"  In-memory databases behave very linearly!
Workload
"  Tenants generally have different rates and sizes
"  For a given set of T tenants (on one server) define




"  When Workload = 1
    !  System runs at it’s maximum throughput level
    !  Further increase of workload will result in violation of
       response time goal
Response time
"  Different amounts of data and different request rates (“assorted mix”)
"  Workload is varied by scaling the request rates

                           2500
                                     Tenant Data 1.5 GB
                                     Tenant Data 2.0 GB
 99-th Perc. Value in ms




                           2000      Tenant Data 2.6 GB
                                     Tenant Data 3.2 GB
                           1500               Prediction


                           1000

                            500

                              0
                               0.2    0.3   0.4   0.5   0.6   0.7   0.8   0.9   1   1.1
                                                        Workload
Impact of writes
"  Added periodic batch writes (fact table grows by 0.5% every 5 minutes)


                             2000                  Without Writes
                                                       With Writes
   99-th Perc. Value in ms



                                         Prediction without Writes
                             1500           Prediction with Writes


                             1000


                              500


                                0
                                 0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1   1.1
                                                         Workload
Why is predictability good?
"  Ability to plan and perform resource intensive tasks during normal
   operations:
    !  Upgrades
    !  Merges
    !  Migrations of tenants in the cluster (e.g. to dynamically
       re-balance the load situation in the cluster)




                                                             Cost breakdown for
                                                             migration of tenants
Definition




             Cloud Computing
                    =
             Data Center + API
Third question




     Does virtualization have a negative impact
             on in-memory databases?
Impact of virtualization
                        "  Run multi-tenant OLAP benchmark on either:
                               !  one TREX instance directly on the physical host vs.
                               !  one TREX instance inside VM on the physical host


                        "  Overhead is approximately 7% (both in response time and throughput)

                        2600                                                                       5.5
                        2400                                                                        5
                        2200
Average response time




                                                                                                   4.5




                                                                              Queries per second
                        2000
                                                                                                    4
                        1800
                                                                                                   3.5
                        1600
                                                                                                    3
                        1400
                                                                                                   2.5
                        1200
                        1000                                                                        2
                         800                                                                       1.5
                                                virtual           physical                                               virtual          physical
                         600                                                                        1
                               0    2    4         6          8         10   12                          0   2   4          6         8         10   12
                                             Client Threads                                                          Client Threads
Impact of virtualization (contd.)
"  Virtualization is often used to get “better” system utilization
    !  What happens when a physical machine is split into multiple VMs?
    !  Burning CPU cycles does not hurt ! memory bandwidth is the
       limiting factor
                                             160 %
         Response Time as a Percentage of



                                                     Xeon E5450
          Response Time with 1 Active Slot



                                             150 %   Xeon X5650
                                             140 %
                                             130 %
                                             120 %
                                             110 %
                                             100 %
                                             90 %
                                             80 %
                                                      1               2               3          4
                                                                  Concurrently Active VM Slots
Fourth question




        How do I assign tenants to servers
        in order to manage fault-tolerance
                  and scalability?
Why is it good to have multiple
copies of the data?
"  Scalability beyond a certain number of concurrently active users
"  High availability during normal operations
"  Alternating execution of resource-intensive operations (e.g. merge)
"  Rolling upgrades without downtime
"  Data migration without downtime


"  Reminder: Two in-memory copies allow faster writes and are more
   predictable than one in-memory copy plus disk


"  Downsides:
    !  Response time goal might be violated during recovery
    !  You need to plan for twice the capacity
Tenant placement

         Conventional                        Interleaved
        Mirrored Layout                        Layout

       T1               T1             T1                T1
                                              T3               T5
       T2               T2             T2                T4



       T3               T3             T2                T4
                                              T6                T3
       T4               T4             T5                T6


 If a node fails, all work moves to   If a node fails, work moves to
one other node. The system must          many other nodes. Allows
    be 100% over-provisioned.          higher utilization of nodes.
Handcrafted best case
"  Perfect placement:
                                       1 1 4 4 7 7           1 4 7 1 2 3
    !  100 tenants
                                       2 2 5 5 8 8           2 5 8 4 5 6
    !  2 copies/tenant                 3 3 6 6 9 9           3 6 9 7 8 9
    !  All tenants have same size
                                           Mirrored           Interleaved
    !  10 tenants/server


!  Perfect balancing (same load on all tenants):
    !  6M rows (204 MB compressed) of data per tenant
    !  The same (increasing) number of users per tenant
    !  No writes
                              Mirrored        Interleaved    Improvement
      No failures            4218 users        4506 users         7%
    Periodic single          2265 users        4250 users        88%
       failures
                               Throughput before violating
                                   response time goal
Requirements for placement algorithm
"  An optimal placement algorithm needs to cope with
   multiple (conflicting) goals:
    !  Balance load across servers
    !  Achieve good interleaving


"  Use migrations consciously for online layout improvements
   (no big bang cluster re-organization)


"  Take usage patterns into account
    !  Request rates double during last week before end of quarter
    !  Time-zones, Christmas, etc.
Conclusion
"  Answers to four questions:


    !  Why are memory based architectures great for cloud computing?


    !  How predictable is the behavior of an in-memory column database?


    !  Does virtualization have a negative impact on in-memory databases?


    !  How do I assign tenants to servers in order to manage fault-tolerance and
       scalability?




"  Questions?

Weitere ähnliche Inhalte

Was ist angesagt?

Replication, Durability, and Disaster Recovery
Replication, Durability, and Disaster RecoveryReplication, Durability, and Disaster Recovery
Replication, Durability, and Disaster RecoverySteven Francia
 
Google Spanner : our understanding of concepts and implications
Google Spanner : our understanding of concepts and implicationsGoogle Spanner : our understanding of concepts and implications
Google Spanner : our understanding of concepts and implicationsHarisankar H
 
Comparisons of no sql databases march 2014
Comparisons of no sql databases march 2014Comparisons of no sql databases march 2014
Comparisons of no sql databases march 2014nkabra
 
An Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed DatabaseAn Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed DatabaseBenjamin Bengfort
 
Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)Rick Branson
 
brief introduction of drbd in SLE12SP2
brief introduction of drbd in SLE12SP2brief introduction of drbd in SLE12SP2
brief introduction of drbd in SLE12SP2Nick Wang
 
Wide area frequency easurement system iitb
Wide area frequency easurement system iitbWide area frequency easurement system iitb
Wide area frequency easurement system iitbPanditNitesh
 
Introduction to hadoop and hdfs
Introduction to hadoop and hdfsIntroduction to hadoop and hdfs
Introduction to hadoop and hdfsshrey mehrotra
 
Hands on MapR -- Viadea
Hands on MapR -- ViadeaHands on MapR -- Viadea
Hands on MapR -- Viadeaviadea
 
Mysql story in poi dedup
Mysql story in poi dedupMysql story in poi dedup
Mysql story in poi dedupfeng lee
 
Apache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsApache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsJulien Anguenot
 
My SQL Portal Database (Cluster)
My SQL Portal Database (Cluster)My SQL Portal Database (Cluster)
My SQL Portal Database (Cluster)Nicholas Adu Gyamfi
 
Understanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraUnderstanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraDataStax
 
C* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick BransonC* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick BransonDataStax Academy
 
SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012
SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012
SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012Chris Richardson
 
Hanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduceHanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduceHanborq Inc.
 

Was ist angesagt? (20)

Replication, Durability, and Disaster Recovery
Replication, Durability, and Disaster RecoveryReplication, Durability, and Disaster Recovery
Replication, Durability, and Disaster Recovery
 
Google Spanner : our understanding of concepts and implications
Google Spanner : our understanding of concepts and implicationsGoogle Spanner : our understanding of concepts and implications
Google Spanner : our understanding of concepts and implications
 
Comparisons of no sql databases march 2014
Comparisons of no sql databases march 2014Comparisons of no sql databases march 2014
Comparisons of no sql databases march 2014
 
An Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed DatabaseAn Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed Database
 
Spanner osdi2012
Spanner osdi2012Spanner osdi2012
Spanner osdi2012
 
Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)
 
brief introduction of drbd in SLE12SP2
brief introduction of drbd in SLE12SP2brief introduction of drbd in SLE12SP2
brief introduction of drbd in SLE12SP2
 
Wide area frequency easurement system iitb
Wide area frequency easurement system iitbWide area frequency easurement system iitb
Wide area frequency easurement system iitb
 
Cassandra勉強会
Cassandra勉強会Cassandra勉強会
Cassandra勉強会
 
Introduction to hadoop and hdfs
Introduction to hadoop and hdfsIntroduction to hadoop and hdfs
Introduction to hadoop and hdfs
 
Hands on MapR -- Viadea
Hands on MapR -- ViadeaHands on MapR -- Viadea
Hands on MapR -- Viadea
 
Mysql story in poi dedup
Mysql story in poi dedupMysql story in poi dedup
Mysql story in poi dedup
 
Apache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsApache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentials
 
My SQL Portal Database (Cluster)
My SQL Portal Database (Cluster)My SQL Portal Database (Cluster)
My SQL Portal Database (Cluster)
 
Understanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraUnderstanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache Cassandra
 
Drbd
DrbdDrbd
Drbd
 
C* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick BransonC* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick Branson
 
SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012
SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012
SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012
 
Hanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduceHanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduce
 
Spanner (may 19)
Spanner (may 19)Spanner (may 19)
Spanner (may 19)
 

Andere mochten auch

Trinity: A Distributed Graph Engine on a Memory Cloud
Trinity: A Distributed Graph Engine on a Memory CloudTrinity: A Distributed Graph Engine on a Memory Cloud
Trinity: A Distributed Graph Engine on a Memory CloudQian Lin
 
مقدمة الى الأندرويد
مقدمة الى الأندرويدمقدمة الى الأندرويد
مقدمة الى الأندرويدAmal Wishah
 
الحوسبة السحابية
الحوسبة السحابيةالحوسبة السحابية
الحوسبة السحابيةMamoun Matar
 
File systems for mobile phones or handheld devices
File systems for mobile phones or handheld devices File systems for mobile phones or handheld devices
File systems for mobile phones or handheld devices Ram Kumar K R
 
Cloud computing دور الحوسبة السحابية فى المكتبات الرقمية ونظم الارشفة الالكتر...
Cloud computing دور الحوسبة السحابية فى المكتبات الرقمية ونظم الارشفة الالكتر...Cloud computing دور الحوسبة السحابية فى المكتبات الرقمية ونظم الارشفة الالكتر...
Cloud computing دور الحوسبة السحابية فى المكتبات الرقمية ونظم الارشفة الالكتر...Essam Obaid
 
حاسب ثالث ثانوي الفصل الأول كتاب الطالب النظام الفصلي
حاسب ثالث ثانوي الفصل الأول كتاب الطالب النظام الفصليحاسب ثالث ثانوي الفصل الأول كتاب الطالب النظام الفصلي
حاسب ثالث ثانوي الفصل الأول كتاب الطالب النظام الفصليabuhamad999
 
عرض عن الحوسبة السحابية
عرض عن الحوسبة السحابيةعرض عن الحوسبة السحابية
عرض عن الحوسبة السحابيةmousa5122
 
Mobile App Development- Project Management Process
Mobile App Development- Project Management ProcessMobile App Development- Project Management Process
Mobile App Development- Project Management ProcessBagaria Swati
 

Andere mochten auch (8)

Trinity: A Distributed Graph Engine on a Memory Cloud
Trinity: A Distributed Graph Engine on a Memory CloudTrinity: A Distributed Graph Engine on a Memory Cloud
Trinity: A Distributed Graph Engine on a Memory Cloud
 
مقدمة الى الأندرويد
مقدمة الى الأندرويدمقدمة الى الأندرويد
مقدمة الى الأندرويد
 
الحوسبة السحابية
الحوسبة السحابيةالحوسبة السحابية
الحوسبة السحابية
 
File systems for mobile phones or handheld devices
File systems for mobile phones or handheld devices File systems for mobile phones or handheld devices
File systems for mobile phones or handheld devices
 
Cloud computing دور الحوسبة السحابية فى المكتبات الرقمية ونظم الارشفة الالكتر...
Cloud computing دور الحوسبة السحابية فى المكتبات الرقمية ونظم الارشفة الالكتر...Cloud computing دور الحوسبة السحابية فى المكتبات الرقمية ونظم الارشفة الالكتر...
Cloud computing دور الحوسبة السحابية فى المكتبات الرقمية ونظم الارشفة الالكتر...
 
حاسب ثالث ثانوي الفصل الأول كتاب الطالب النظام الفصلي
حاسب ثالث ثانوي الفصل الأول كتاب الطالب النظام الفصليحاسب ثالث ثانوي الفصل الأول كتاب الطالب النظام الفصلي
حاسب ثالث ثانوي الفصل الأول كتاب الطالب النظام الفصلي
 
عرض عن الحوسبة السحابية
عرض عن الحوسبة السحابيةعرض عن الحوسبة السحابية
عرض عن الحوسبة السحابية
 
Mobile App Development- Project Management Process
Mobile App Development- Project Management ProcessMobile App Development- Project Management Process
Mobile App Development- Project Management Process
 

Ähnlich wie Memory-Based Cloud Architectures

Cassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large NodesCassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large Nodesaaronmorton
 
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward
 
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...ScyllaDB
 
Designs, Lessons and Advice from Building Large Distributed Systems
Designs, Lessons and Advice from Building Large Distributed SystemsDesigns, Lessons and Advice from Building Large Distributed Systems
Designs, Lessons and Advice from Building Large Distributed SystemsDaehyeok Kim
 
High Performance Cloud Computing
High Performance Cloud ComputingHigh Performance Cloud Computing
High Performance Cloud ComputingAmazon Web Services
 
High Performance Cloud Computing
High Performance Cloud ComputingHigh Performance Cloud Computing
High Performance Cloud ComputingAmazon Web Services
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentationEdward Capriolo
 
Memory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and VirtualizationMemory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and VirtualizationBigstep
 
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...Ben Stopford
 
MySQL NDB Cluster 8.0 SQL faster than NoSQL
MySQL NDB Cluster 8.0 SQL faster than NoSQL MySQL NDB Cluster 8.0 SQL faster than NoSQL
MySQL NDB Cluster 8.0 SQL faster than NoSQL Bernd Ocklin
 
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaHadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaCloudera, Inc.
 
Ndb cluster 80_ycsb_mem
Ndb cluster 80_ycsb_memNdb cluster 80_ycsb_mem
Ndb cluster 80_ycsb_memmikaelronstrom
 
NYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentNYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentSpeedment, Inc.
 
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBeyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBen Stopford
 
Advanced databases ben stopford
Advanced databases   ben stopfordAdvanced databases   ben stopford
Advanced databases ben stopfordBen Stopford
 
Project Tungsten Phase II: Joining a Billion Rows per Second on a Laptop
Project Tungsten Phase II: Joining a Billion Rows per Second on a LaptopProject Tungsten Phase II: Joining a Billion Rows per Second on a Laptop
Project Tungsten Phase II: Joining a Billion Rows per Second on a LaptopDatabricks
 
IMCSummit 2015 - Day 2 IT Business Track - 4 Myths about In-Memory Databases ...
IMCSummit 2015 - Day 2 IT Business Track - 4 Myths about In-Memory Databases ...IMCSummit 2015 - Day 2 IT Business Track - 4 Myths about In-Memory Databases ...
IMCSummit 2015 - Day 2 IT Business Track - 4 Myths about In-Memory Databases ...In-Memory Computing Summit
 
User-space Network Processing
User-space Network ProcessingUser-space Network Processing
User-space Network ProcessingRyousei Takano
 
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudJourney to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudCeph Community
 

Ähnlich wie Memory-Based Cloud Architectures (20)

Cassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large NodesCassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large Nodes
 
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
 
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...
 
Designs, Lessons and Advice from Building Large Distributed Systems
Designs, Lessons and Advice from Building Large Distributed SystemsDesigns, Lessons and Advice from Building Large Distributed Systems
Designs, Lessons and Advice from Building Large Distributed Systems
 
High Performance Cloud Computing
High Performance Cloud ComputingHigh Performance Cloud Computing
High Performance Cloud Computing
 
High Performance Cloud Computing
High Performance Cloud ComputingHigh Performance Cloud Computing
High Performance Cloud Computing
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentation
 
Memory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and VirtualizationMemory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and Virtualization
 
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
 
MySQL NDB Cluster 8.0 SQL faster than NoSQL
MySQL NDB Cluster 8.0 SQL faster than NoSQL MySQL NDB Cluster 8.0 SQL faster than NoSQL
MySQL NDB Cluster 8.0 SQL faster than NoSQL
 
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaHadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
 
Ndb cluster 80_ycsb_mem
Ndb cluster 80_ycsb_memNdb cluster 80_ycsb_mem
Ndb cluster 80_ycsb_mem
 
NYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentNYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ Speedment
 
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBeyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
 
Advanced databases ben stopford
Advanced databases   ben stopfordAdvanced databases   ben stopford
Advanced databases ben stopford
 
Kinetic basho public
Kinetic basho publicKinetic basho public
Kinetic basho public
 
Project Tungsten Phase II: Joining a Billion Rows per Second on a Laptop
Project Tungsten Phase II: Joining a Billion Rows per Second on a LaptopProject Tungsten Phase II: Joining a Billion Rows per Second on a Laptop
Project Tungsten Phase II: Joining a Billion Rows per Second on a Laptop
 
IMCSummit 2015 - Day 2 IT Business Track - 4 Myths about In-Memory Databases ...
IMCSummit 2015 - Day 2 IT Business Track - 4 Myths about In-Memory Databases ...IMCSummit 2015 - Day 2 IT Business Track - 4 Myths about In-Memory Databases ...
IMCSummit 2015 - Day 2 IT Business Track - 4 Myths about In-Memory Databases ...
 
User-space Network Processing
User-space Network ProcessingUser-space Network Processing
User-space Network Processing
 
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudJourney to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
 

Mehr von 小新 制造

Storage: Alternate Futures
Storage: Alternate FuturesStorage: Alternate Futures
Storage: Alternate Futures小新 制造
 
架构大数据 挑战、现状与展望
架构大数据 挑战、现状与展望架构大数据 挑战、现状与展望
架构大数据 挑战、现状与展望小新 制造
 
Altibase管理培训 优化篇 v1.1
Altibase管理培训 优化篇 v1.1Altibase管理培训 优化篇 v1.1
Altibase管理培训 优化篇 v1.1小新 制造
 
Altibase管理培训 管理篇
Altibase管理培训 管理篇Altibase管理培训 管理篇
Altibase管理培训 管理篇小新 制造
 
Altibase管理培训 安装篇
Altibase管理培训 安装篇Altibase管理培训 安装篇
Altibase管理培训 安装篇小新 制造
 
Ibm solid db overview v6.3 20090320
Ibm solid db overview v6.3 20090320Ibm solid db overview v6.3 20090320
Ibm solid db overview v6.3 20090320小新 制造
 
Oracle clusterware overview_11g_en
Oracle clusterware overview_11g_enOracle clusterware overview_11g_en
Oracle clusterware overview_11g_en小新 制造
 

Mehr von 小新 制造 (10)

Storage: Alternate Futures
Storage: Alternate FuturesStorage: Alternate Futures
Storage: Alternate Futures
 
内存数据库[1]
内存数据库[1]内存数据库[1]
内存数据库[1]
 
架构大数据 挑战、现状与展望
架构大数据 挑战、现状与展望架构大数据 挑战、现状与展望
架构大数据 挑战、现状与展望
 
Altibase管理培训 优化篇 v1.1
Altibase管理培训 优化篇 v1.1Altibase管理培训 优化篇 v1.1
Altibase管理培训 优化篇 v1.1
 
Altibase管理培训 管理篇
Altibase管理培训 管理篇Altibase管理培训 管理篇
Altibase管理培训 管理篇
 
Altibase管理培训 安装篇
Altibase管理培训 安装篇Altibase管理培训 安装篇
Altibase管理培训 安装篇
 
Altibase介绍
Altibase介绍Altibase介绍
Altibase介绍
 
Ibm solid db_基础
Ibm solid db_基础Ibm solid db_基础
Ibm solid db_基础
 
Ibm solid db overview v6.3 20090320
Ibm solid db overview v6.3 20090320Ibm solid db overview v6.3 20090320
Ibm solid db overview v6.3 20090320
 
Oracle clusterware overview_11g_en
Oracle clusterware overview_11g_enOracle clusterware overview_11g_en
Oracle clusterware overview_11g_en
 

Kürzlich hochgeladen

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 

Kürzlich hochgeladen (20)

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

Memory-Based Cloud Architectures

  • 1. Memory-Based Cloud Architectures Jan Schaffner Enterprise Platform and Integration Concepts Group
  • 2. Definition Cloud Computing = Data Center + API
  • 3. What to take home from this talk? Answers to four questions: !  Why are memory based architectures great for cloud computing? !  How predictable is the behavior of an in-memory column database? !  Does virtualization have a negative impact on in-memory databases? !  How do I assign tenants to servers in order to manage fault-tolerance and scalability?
  • 4. First question Why are memory based architectures great for cloud computing?
  • 5. Numbers everyone should know "  L1 cache reference 0.5 ns "  Branch mispredict 5 ns "  L2 cache reference 7 ns "  Mutex lock/unlock 25 ns "  Main memory reference 100 ns (in 2008) "  Compress 1K bytes with Zippy 3,000 ns "  Send 2K bytes over 1 Gbps network 20,000 ns "  Read 1 MB sequentially from memory 250,000 ns "  Round trip within same datacenter 500,000 ns (in 2008) "  Disk seek 10,000,000 ns "  Read 1 MB sequentially from network 10,000,000 ns "  Read 1 MB sequentially from disk 20,000,000 ns "  Send packet CA ! Netherlands ! CA 150,000,000 ns Source: Jeff Dean
  • 6. Memory should be the system of record "  Typically disks have been the system of record !  Slow ! wrap them in complicated caching and distributed file systems to make them perform !  Memory used as cache all over the place but it can be invalidated when something changes on disk "  Bandwidth: !  Disk: 120 MB/s/controller !  DRAM (x86 + FSB): 10.4 GB/s/board !  DRAM (Nehalem): 25.6 GB/s/socket "  Latency: !  Disk: 13 milliseconds (up to seconds when queuing) !  InfiniBand: 1-2 microseconds !  DRAM: 5 nanoseconds
  • 7. High-end networks vs. disks Maximum bandwidths: Hard Disk 100-120 MB/s SSD 250 MB/s Serial ATA II 600 MB/s 10 GB Ethernet 1204 MB/s InfiniBand 1250 MB/s (4 channels) PCIe Flash Storage 1400 MB/s PCIe 3.0 32 GB/s DDR3-1600 25.6 GB/s (dual channel)
  • 8. Designing a database for the cloud "  Disks are the limiting factor in contemporary database systems !  Sharing a high performance disk on a machine/cluster/cloud is fine/troublesome/miserable !  While one guy is fetching 100 MB/s, everyone else is waiting "  Claim: Two machines + network is better than one machine + disk !  Log to disk on a single node: > 10,000 !s (not predictable) !  Transactions only in memory but on two nodes: < 600 !s (more predictable) "  Concept: Design to the strengths of cloud (redundancy) rather than their weaknesses (shared anything)
  • 9. Design choices for a cloud database "  No disks (in-memory delta tables + async snapshots) "  Multi-master replication !  Two copies of the data !  Load balancing both reads and (monotonic) writes !  (Eventual) consistency achieved via MVCC (+ Paxos, later) "  High-end hardware !  Nehalem for high memory bandwidth !  Fast interconnect "  Virtualization !  Ease of deployment/administration !  Consolidation/multi-tenancy
  • 10. Why consolidation? "  In-memory column databases are ideal for mixed workload processing "  But: In a SaaS environment it seems costly to give everybody their private NewDB box "  How much consolidation is possible? !  3 years worth of sales records from our favorite Fortune 500 retail company !  360 million records !  Less than 3 GB in compressed columns in memory !  Next door is a machine with 1 TB of DRAM !  (Beware of overhead)
  • 11. Multi-tenancy in the database – four different options "  No multi-tenancy – one VM per tenant !  Ex.: RightNow has 3000 tenants in 200 databases (2007): 3000 vs. 200 Amazon VMs cost $2,628,000 vs. $175,200/year !  Very strong isolation "  Shared machine – one database process per tenant T1 !  Scheduler, session manager and transaction manager need live inside the individual DB processes: IPC for synchronization T3 T2 !  Good for custom extensions, good isolation "  Shared process – one schema instance per tenant T1 !  Must support large numbers of tables T3 T2 !  Must support online schema extension and evolution "  Shared table – use a tenant_id column and partitioning T1, T2, T3 !  Bad for custom extensions, bad isolation !  Hard to backup/restore/migrate individual tenants
  • 12. Putting it all together: Rock cluster architecture &))-./(012% &))-./(012% 89!:%3;<*+7% 3+,4+,% 3+,4+,% Extract data from external system 67)1,*+,% =-5<*+,% Cluster membership, Rock Tenant placement >(<*+,% Load balance between replicas "15*+,% "15*+,% Forward writes to other replicas &'()*+,% &'()*+,% &'()*+,% !"#$% !"#$% !"#$%
  • 13. Second question How predictable is the behavior of an in-memory column database?
  • 14. What does “predictable” mean? "  Traditionally, database people are concerned with the questions of type “how do I make a query faster?” "  In a SaaS environment, the question is “how do I get a fixed (low) response time as cheap as possible?” !  Look at throughput !  Look at quantiles (e.g. 99-th percentile) "  Example formulation of desired performance: !  Response time goal “1 second in the 99-th percentile” !  Average response time around 200 ms !  Less than 1% of all queries exceed 1,000 ms !  Results in a maximum number of concurrent queries before response time goal is violated
  • 15. each test contained several tenants of a particular tenant size so that 20% of the available main memory has always been used for tenant data preloaded into main memory before each test run and a six minute warm System capacity performed before each run. The test data was generated by the Star Schem "  Fixed amount of data split equally among all tenants Data Generator [36]. 200 Figure 4.2 shows the Measured same graph using a logarithmic scale. This grap 180 Approx. Function 160 linear which means that the relation can be described as 140 Can be expressed as: Requests / s 120 log(f (tSize )) = m · log(tSize ) + n 100 80 where n is the intercept with the y-axis f (0) and m is the gradient. T 60 n and the gradient m can be estimated using regression and the Least-S 40 (see chapter 2.3.2) or simply by using a slope triangle. The gradient fo 20 20 40 60 ≈80 100 120 140(0) = n ≈ 3.6174113. This equation can, then, be m −0.945496 and f 160 180 200 220 Tenant Size in MB equation (4.1). "  Capacity " bytes scanned per second (there is a small overhead when processing more requests) "  In-memory databases behave very linearly!
  • 16. Workload "  Tenants generally have different rates and sizes "  For a given set of T tenants (on one server) define "  When Workload = 1 !  System runs at it’s maximum throughput level !  Further increase of workload will result in violation of response time goal
  • 17. Response time "  Different amounts of data and different request rates (“assorted mix”) "  Workload is varied by scaling the request rates 2500 Tenant Data 1.5 GB Tenant Data 2.0 GB 99-th Perc. Value in ms 2000 Tenant Data 2.6 GB Tenant Data 3.2 GB 1500 Prediction 1000 500 0 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 Workload
  • 18. Impact of writes "  Added periodic batch writes (fact table grows by 0.5% every 5 minutes) 2000 Without Writes With Writes 99-th Perc. Value in ms Prediction without Writes 1500 Prediction with Writes 1000 500 0 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 Workload
  • 19. Why is predictability good? "  Ability to plan and perform resource intensive tasks during normal operations: !  Upgrades !  Merges !  Migrations of tenants in the cluster (e.g. to dynamically re-balance the load situation in the cluster) Cost breakdown for migration of tenants
  • 20. Definition Cloud Computing = Data Center + API
  • 21. Third question Does virtualization have a negative impact on in-memory databases?
  • 22. Impact of virtualization "  Run multi-tenant OLAP benchmark on either: !  one TREX instance directly on the physical host vs. !  one TREX instance inside VM on the physical host "  Overhead is approximately 7% (both in response time and throughput) 2600 5.5 2400 5 2200 Average response time 4.5 Queries per second 2000 4 1800 3.5 1600 3 1400 2.5 1200 1000 2 800 1.5 virtual physical virtual physical 600 1 0 2 4 6 8 10 12 0 2 4 6 8 10 12 Client Threads Client Threads
  • 23. Impact of virtualization (contd.) "  Virtualization is often used to get “better” system utilization !  What happens when a physical machine is split into multiple VMs? !  Burning CPU cycles does not hurt ! memory bandwidth is the limiting factor 160 % Response Time as a Percentage of Xeon E5450 Response Time with 1 Active Slot 150 % Xeon X5650 140 % 130 % 120 % 110 % 100 % 90 % 80 % 1 2 3 4 Concurrently Active VM Slots
  • 24. Fourth question How do I assign tenants to servers in order to manage fault-tolerance and scalability?
  • 25. Why is it good to have multiple copies of the data? "  Scalability beyond a certain number of concurrently active users "  High availability during normal operations "  Alternating execution of resource-intensive operations (e.g. merge) "  Rolling upgrades without downtime "  Data migration without downtime "  Reminder: Two in-memory copies allow faster writes and are more predictable than one in-memory copy plus disk "  Downsides: !  Response time goal might be violated during recovery !  You need to plan for twice the capacity
  • 26. Tenant placement Conventional Interleaved Mirrored Layout Layout T1 T1 T1 T1 T3 T5 T2 T2 T2 T4 T3 T3 T2 T4 T6 T3 T4 T4 T5 T6 If a node fails, all work moves to If a node fails, work moves to one other node. The system must many other nodes. Allows be 100% over-provisioned. higher utilization of nodes.
  • 27. Handcrafted best case "  Perfect placement: 1 1 4 4 7 7 1 4 7 1 2 3 !  100 tenants 2 2 5 5 8 8 2 5 8 4 5 6 !  2 copies/tenant 3 3 6 6 9 9 3 6 9 7 8 9 !  All tenants have same size Mirrored Interleaved !  10 tenants/server !  Perfect balancing (same load on all tenants): !  6M rows (204 MB compressed) of data per tenant !  The same (increasing) number of users per tenant !  No writes Mirrored Interleaved Improvement No failures 4218 users 4506 users 7% Periodic single 2265 users 4250 users 88% failures Throughput before violating response time goal
  • 28. Requirements for placement algorithm "  An optimal placement algorithm needs to cope with multiple (conflicting) goals: !  Balance load across servers !  Achieve good interleaving "  Use migrations consciously for online layout improvements (no big bang cluster re-organization) "  Take usage patterns into account !  Request rates double during last week before end of quarter !  Time-zones, Christmas, etc.
  • 29. Conclusion "  Answers to four questions: !  Why are memory based architectures great for cloud computing? !  How predictable is the behavior of an in-memory column database? !  Does virtualization have a negative impact on in-memory databases? !  How do I assign tenants to servers in order to manage fault-tolerance and scalability? "  Questions?