SlideShare ist ein Scribd-Unternehmen logo
1 von 37
Scaling your Database in the Cloud Spring2011 Presented by: Cory Isaacson, CEO CodeFutures Corporation http://www.dbshards.com View the video of this presentation here!
Introduction Who I am Cory Isaacson, CEO of CodeFutures Providers of dbShards Author of Software Pipelines Partnerships: Rightscale The leading Cloud Management Platform Leaders in database scalability, performance, and high-availability for the cloud… …based on real-world experience with dozens of cloud-based applications …social networking, gaming, data collection, mobile, analytics Objective is to provide useful experience you can apply to scaling (and managing) your database tier… …especially for high volume applications
Challenges of cloud computing Cloud provides highly attractive service environment Flexible, scales with need (up or down) No need for dedicated IT staff, fixed facility costs Pay-as-you-go model Cloud services occasionally fail Partial network outages Server failures… …by their nature cloud servers are “transient” Disk volume issues Cloud-based resources are constrained CPU I/O Rates… …the “Cloud I/O Barrier”
Typical Application Architecture
Scaling in the Cloud Scaling Load Balancers is easy Stateless routing to app server Can add redundant Load Balancers if needed DNS round-robin or intelligent routing for larger sites If one goes down… …failover to another Scaling Application Servers is easy Stateless Sessions can easily transition to another server Add or remove servers as need dictates If one goes down… …failover to another
Scaling in the Cloud Scaling the Database tier is hard “Stateful” by definition (and necessity…) Large, integrated data sets… 10s of GBs to TBs (or more…) Difficult to move, reload I/O dependent… …adversely affected by cloud service failures …and slow cloud I/O If one goes down… …ouch! Databases form the “last mile” of true application scalability Initially simple optimization produces the best result Implement a follow-on scalability strategy for long-term performance goals… …plus a high-availability strategy is a must
All CPUs wait at the same speed… The Cloud I/O Barrier
More Database scalability challenges Databases have many other challenges that limit scalability ACID compliance… …especially Consistency …user contention Operational challenges Failover Planned, unplanned Maintenance Index rebuild Restore Space reclamation Lifecycle Reliable Backup/Restore Monitoring Application Updates Management
Database slowdown is not linear…
Challenges apply to all types of databases Traditional RDBMS (MySQL, Postgres, Oracle…) I/O bound Multi-user, lock contention High-availability Lifecycle management NoSQLDatabases In-memory, Caching, Document…) Reliability, High-availability Limits of a single server …and a single thread Data dumps to disk Lifecycle Management No matter what the technology, big databases are hard to manage… …elastic scaling is a real challenge …degradation from growth in size and volume is a certainty Application-specific database requirements add to the challenge …database design is key… …balance performance vs. convenience vs. data size
The Laws of Databases Law #1: Small Databases are fast… Law #2: Big Databases are slow… Law #3: Keep databases small
What is the answer? Database sharding is the only effective method for achieving scale, elasticity, reliability and easy management… …regardless of your database technology
What is Database Sharding? “Horizontal partitioning is a database design principle whereby rows of a database table are held separately... Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.” Wikipedia
The key to Database Sharding…
Database Sharding Architecture
Database Sharding Architecture
Database Sharding… the results
Why does Database Sharding work? Maximize CPU/Memory per database instance… …as compared to database size Reduce the size of index trees Speeds writes dramatically No contention between servers Locking, disk, memory, CPU Allows for intelligent parallel processing… …Go Fish queries across shards Keep CPUs busy and productive
Breaking the Cloud I/O Barrier
Database types Traditional RDBMS Proven SQL language… …although ORM can change that Durable, transactional, dependable… …perceived as inflexible NoSQL Typically in-memory… …the real speed advantage Various flavors… Key/Value Store Document database Hybrid Store complex documents using a key In-memory and disk based
Black box vs. Relational Sharding Both utilize sharding on a data key… …typically modulus on a value or consistent hash Black box sharding is automatic…  …attempts to evenly shard across all available servers …no developer visibility or control …can work acceptably for simple, non-indexed NoSQL data stores …easily supports single-row/object results Relational sharding is defined by the developer… …selective sharding of large data …explicit developer control and visibility …tunable as the database grows and matures …more efficient for result sets and searchable queries …preserves data relationships
How Database Sharding works for NoSQL
Relational Sharding Shard-Tree Root Table Shard-Tree Child Tables Global Tables
How Relational Sharding works
How Relational Sharding works Shard key recognition in SQL SELECT * FROM customerWHERE customer_id = 1234 INSERT INTO customer(customer_id, first_name, last_name, addr_line1,…)VALUES(2345, ‘John’, ‘Jones’, ‘123 B Street’,…) UPDATE customerSET addr_line1 = ‘456 C Avenue’WHERE customer_id = 4567
What about Cross-Shard result sets?
Cross-shard result set example Go Fish (no shard key) SELECT country_id, count(*) FROM customerGROUP BY country_id
More on Cross-Shard result sets… Black Box approach requires “scatter gather” for multi-row/object result set… …common with NoSQL engines …forces use of denormalized lists …must be maintained by developer/application code …Map Reduce processing helps with this (non-realtime) Relational sharding provides access to meaningful result sets of related data… …aggregation, sort easier to perform …logical search operations more natural …related rows are stored together
Make your Application Shard-Safe For sharding to be efficient, every sharded table should include a shard key. SELECT WHERE clause statements should include a shard key where possible so that the query goes directly to a single shard rather than being executed against multiple shards in parallel. INSERT, UPDATE and DELETE WHERE CLAUSE statements must include a shard key so that the data is written to the correct shard. Avoid cross-shard inner joins. For example, joining one customer to another is not feasible as both customers may be in different shards. This can be accomplished, but requires complex logic and can adversely affect performance.
What about High-Availability? Can you afford to take your databases offline: …for scheduled maintenance?  …for unplanned failure?  …can you accept some lost transactions? By definition Database Sharding adds failure points to the data tier A proven High-Availability strategy is a must… …system outages…especially in the cloud …planned maintenance…a necessity for all database engines
Traditional asynch replication is inadequate… ,[object Object]
No rapid failover for continuous operation
Maintenance difficult in production environments
Used by many NoSQL engines,[object Object]
…3 active-active instances for in-memory databases
Support for…
…fail-down

Weitere ähnliche Inhalte

Was ist angesagt?

NoSQL Data Architecture Patterns
NoSQL Data ArchitecturePatternsNoSQL Data ArchitecturePatterns
NoSQL Data Architecture PatternsMaynooth University
 
SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?Venu Anuganti
 
Polyglot Persistence - Two Great Tastes That Taste Great Together
Polyglot Persistence - Two Great Tastes That Taste Great TogetherPolyglot Persistence - Two Great Tastes That Taste Great Together
Polyglot Persistence - Two Great Tastes That Taste Great TogetherJohn Wood
 
The Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot PersistenceThe Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot PersistenceAbdelmonaim Remani
 
Chapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesChapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesMaynooth University
 
Scaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQLScaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQLRichard Schneeman
 
Infinspan: In-memory data grid meets NoSQL
Infinspan: In-memory data grid meets NoSQLInfinspan: In-memory data grid meets NoSQL
Infinspan: In-memory data grid meets NoSQLManik Surtani
 
The Future of Distributed Databases
The Future of Distributed DatabasesThe Future of Distributed Databases
The Future of Distributed DatabasesNuoDB
 
Introduction to NuoDB
Introduction to NuoDBIntroduction to NuoDB
Introduction to NuoDBSandun Perera
 
SQL vs. NoSQL. It's always a hard choice.
SQL vs. NoSQL. It's always a hard choice.SQL vs. NoSQL. It's always a hard choice.
SQL vs. NoSQL. It's always a hard choice.Denis Reznik
 
SQL vs NoSQL: Big Data Adoption & Success in the Enterprise
SQL vs NoSQL: Big Data Adoption & Success in the EnterpriseSQL vs NoSQL: Big Data Adoption & Success in the Enterprise
SQL vs NoSQL: Big Data Adoption & Success in the EnterpriseAnita Luthra
 
Demystfying nosql databases
Demystfying nosql databasesDemystfying nosql databases
Demystfying nosql databasesMike King
 
NoSql - mayank singh
NoSql - mayank singhNoSql - mayank singh
NoSql - mayank singhMayank Singh
 
Application Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a ServiceApplication Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a ServiceWSO2
 
2012 10 24_briefing room
2012 10 24_briefing room2012 10 24_briefing room
2012 10 24_briefing roomNuoDB
 
NoSQL-Database-Concepts
NoSQL-Database-ConceptsNoSQL-Database-Concepts
NoSQL-Database-ConceptsBhaskar Gunda
 
Big Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingBig Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingDATAVERSITY
 

Was ist angesagt? (20)

NoSQL Data Architecture Patterns
NoSQL Data ArchitecturePatternsNoSQL Data ArchitecturePatterns
NoSQL Data Architecture Patterns
 
SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?
 
Polyglot Persistence - Two Great Tastes That Taste Great Together
Polyglot Persistence - Two Great Tastes That Taste Great TogetherPolyglot Persistence - Two Great Tastes That Taste Great Together
Polyglot Persistence - Two Great Tastes That Taste Great Together
 
The Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot PersistenceThe Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot Persistence
 
Chapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesChapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choices
 
Scaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQLScaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQL
 
Infinspan: In-memory data grid meets NoSQL
Infinspan: In-memory data grid meets NoSQLInfinspan: In-memory data grid meets NoSQL
Infinspan: In-memory data grid meets NoSQL
 
The Future of Distributed Databases
The Future of Distributed DatabasesThe Future of Distributed Databases
The Future of Distributed Databases
 
Introduction to NuoDB
Introduction to NuoDBIntroduction to NuoDB
Introduction to NuoDB
 
NoSQL Consepts
NoSQL ConseptsNoSQL Consepts
NoSQL Consepts
 
SQL vs. NoSQL. It's always a hard choice.
SQL vs. NoSQL. It's always a hard choice.SQL vs. NoSQL. It's always a hard choice.
SQL vs. NoSQL. It's always a hard choice.
 
SQL vs NoSQL: Big Data Adoption & Success in the Enterprise
SQL vs NoSQL: Big Data Adoption & Success in the EnterpriseSQL vs NoSQL: Big Data Adoption & Success in the Enterprise
SQL vs NoSQL: Big Data Adoption & Success in the Enterprise
 
Demystfying nosql databases
Demystfying nosql databasesDemystfying nosql databases
Demystfying nosql databases
 
NoSql - mayank singh
NoSql - mayank singhNoSql - mayank singh
NoSql - mayank singh
 
Application Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a ServiceApplication Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a Service
 
2012 10 24_briefing room
2012 10 24_briefing room2012 10 24_briefing room
2012 10 24_briefing room
 
SQL & NoSQL
SQL & NoSQLSQL & NoSQL
SQL & NoSQL
 
Nosql data models
Nosql data modelsNosql data models
Nosql data models
 
NoSQL-Database-Concepts
NoSQL-Database-ConceptsNoSQL-Database-Concepts
NoSQL-Database-Concepts
 
Big Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingBig Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data Modeling
 

Andere mochten auch

Geospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNAGeospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNAnormanbarker
 
"How Sharding turned MySQL into the Internet de-facto database standard?", Mo...
"How Sharding turned MySQL into the Internet de-facto database standard?", Mo..."How Sharding turned MySQL into the Internet de-facto database standard?", Mo...
"How Sharding turned MySQL into the Internet de-facto database standard?", Mo...Moshe Kaplan
 
MongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL DatabaseMongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL DatabaseRuben Inoto Soto
 
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...DataStax Academy
 
An Intro to NoSQL Databases
An Intro to NoSQL DatabasesAn Intro to NoSQL Databases
An Intro to NoSQL DatabasesRajith Pemabandu
 
Planning A Cloud Implementation
Planning A Cloud ImplementationPlanning A Cloud Implementation
Planning A Cloud ImplementationRex Wang
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2Fabio Fumarola
 
5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2Fabio Fumarola
 
Database Sharding At Netlog
Database Sharding At NetlogDatabase Sharding At Netlog
Database Sharding At NetlogJurriaan Persyn
 
MySQL Sharding: Tools and Best Practices for Horizontal Scaling
MySQL Sharding: Tools and Best Practices for Horizontal ScalingMySQL Sharding: Tools and Best Practices for Horizontal Scaling
MySQL Sharding: Tools and Best Practices for Horizontal ScalingMats Kindahl
 

Andere mochten auch (11)

Geospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNAGeospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNA
 
"How Sharding turned MySQL into the Internet de-facto database standard?", Mo...
"How Sharding turned MySQL into the Internet de-facto database standard?", Mo..."How Sharding turned MySQL into the Internet de-facto database standard?", Mo...
"How Sharding turned MySQL into the Internet de-facto database standard?", Mo...
 
MongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL DatabaseMongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL Database
 
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
 
An Intro to NoSQL Databases
An Intro to NoSQL DatabasesAn Intro to NoSQL Databases
An Intro to NoSQL Databases
 
Planning A Cloud Implementation
Planning A Cloud ImplementationPlanning A Cloud Implementation
Planning A Cloud Implementation
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2
 
5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2
 
Database Sharding At Netlog
Database Sharding At NetlogDatabase Sharding At Netlog
Database Sharding At Netlog
 
MySQL Sharding: Tools and Best Practices for Horizontal Scaling
MySQL Sharding: Tools and Best Practices for Horizontal ScalingMySQL Sharding: Tools and Best Practices for Horizontal Scaling
MySQL Sharding: Tools and Best Practices for Horizontal Scaling
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 

Ähnlich wie Scaling SQL and NoSQL Databases in the Cloud

Scaling Your Database In The Cloud
Scaling Your Database In The CloudScaling Your Database In The Cloud
Scaling Your Database In The CloudCory Isaacson
 
CodeFutures - Scaling Your Database in the Cloud
CodeFutures - Scaling Your Database in the CloudCodeFutures - Scaling Your Database in the Cloud
CodeFutures - Scaling Your Database in the CloudRightScale
 
Application architecture for the rest of us - php xperts devcon 2012
Application architecture for the rest of us -  php xperts devcon 2012Application architecture for the rest of us -  php xperts devcon 2012
Application architecture for the rest of us - php xperts devcon 2012M N Islam Shihan
 
Handling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsHandling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsDirecti Group
 
IMC Summit 2016 Breakout - Pandurang Naik - Demystifying In-Memory Data Grid,...
IMC Summit 2016 Breakout - Pandurang Naik - Demystifying In-Memory Data Grid,...IMC Summit 2016 Breakout - Pandurang Naik - Demystifying In-Memory Data Grid,...
IMC Summit 2016 Breakout - Pandurang Naik - Demystifying In-Memory Data Grid,...In-Memory Computing Summit
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabasesAdi Challa
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
2014.11.14 Data Opportunities with Azure
2014.11.14 Data Opportunities with Azure2014.11.14 Data Opportunities with Azure
2014.11.14 Data Opportunities with AzureMarco Parenzan
 
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Alluxio, Inc.
 
Cosmos DB Real-time Advanced Analytics Workshop
Cosmos DB Real-time Advanced Analytics WorkshopCosmos DB Real-time Advanced Analytics Workshop
Cosmos DB Real-time Advanced Analytics WorkshopDatabricks
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle CoherenceBen Stopford
 
EOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperEOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperDavid Walker
 
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Clustrix
 
SpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople
 
Pass chapter meeting dec 2013 - compression a hidden gem for io heavy databas...
Pass chapter meeting dec 2013 - compression a hidden gem for io heavy databas...Pass chapter meeting dec 2013 - compression a hidden gem for io heavy databas...
Pass chapter meeting dec 2013 - compression a hidden gem for io heavy databas...Charley Hanania
 

Ähnlich wie Scaling SQL and NoSQL Databases in the Cloud (20)

Scaling Your Database In The Cloud
Scaling Your Database In The CloudScaling Your Database In The Cloud
Scaling Your Database In The Cloud
 
CodeFutures - Scaling Your Database in the Cloud
CodeFutures - Scaling Your Database in the CloudCodeFutures - Scaling Your Database in the Cloud
CodeFutures - Scaling Your Database in the Cloud
 
Application architecture for the rest of us - php xperts devcon 2012
Application architecture for the rest of us -  php xperts devcon 2012Application architecture for the rest of us -  php xperts devcon 2012
Application architecture for the rest of us - php xperts devcon 2012
 
Handling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsHandling Data in Mega Scale Systems
Handling Data in Mega Scale Systems
 
Nosql seminar
Nosql seminarNosql seminar
Nosql seminar
 
IMC Summit 2016 Breakout - Pandurang Naik - Demystifying In-Memory Data Grid,...
IMC Summit 2016 Breakout - Pandurang Naik - Demystifying In-Memory Data Grid,...IMC Summit 2016 Breakout - Pandurang Naik - Demystifying In-Memory Data Grid,...
IMC Summit 2016 Breakout - Pandurang Naik - Demystifying In-Memory Data Grid,...
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
 
Azure and cloud design patterns
Azure and cloud design patternsAzure and cloud design patterns
Azure and cloud design patterns
 
NoSQL
NoSQLNoSQL
NoSQL
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Cassandra
CassandraCassandra
Cassandra
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
 
2014.11.14 Data Opportunities with Azure
2014.11.14 Data Opportunities with Azure2014.11.14 Data Opportunities with Azure
2014.11.14 Data Opportunities with Azure
 
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
 
Cosmos DB Real-time Advanced Analytics Workshop
Cosmos DB Real-time Advanced Analytics WorkshopCosmos DB Real-time Advanced Analytics Workshop
Cosmos DB Real-time Advanced Analytics Workshop
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
 
EOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperEOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - Paper
 
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
 
SpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud Computing
 
Pass chapter meeting dec 2013 - compression a hidden gem for io heavy databas...
Pass chapter meeting dec 2013 - compression a hidden gem for io heavy databas...Pass chapter meeting dec 2013 - compression a hidden gem for io heavy databas...
Pass chapter meeting dec 2013 - compression a hidden gem for io heavy databas...
 

Mehr von RightScale

10 Must-Have Automated Cloud Policies for IT Governance
10 Must-Have Automated Cloud Policies for IT Governance10 Must-Have Automated Cloud Policies for IT Governance
10 Must-Have Automated Cloud Policies for IT GovernanceRightScale
 
Kubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOpsKubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOpsRightScale
 
Optimize Software, SaaS, and Cloud with Flexera and RightScale
Optimize Software, SaaS, and Cloud with Flexera and RightScaleOptimize Software, SaaS, and Cloud with Flexera and RightScale
Optimize Software, SaaS, and Cloud with Flexera and RightScaleRightScale
 
Prepare Your Enterprise Cloud Strategy for 2019: 7 Things to Think About Now
Prepare Your Enterprise Cloud Strategy for 2019: 7 Things to Think About NowPrepare Your Enterprise Cloud Strategy for 2019: 7 Things to Think About Now
Prepare Your Enterprise Cloud Strategy for 2019: 7 Things to Think About NowRightScale
 
How to Set Up a Cloud Cost Optimization Process for your Enterprise
How to Set Up a Cloud Cost Optimization Process for your EnterpriseHow to Set Up a Cloud Cost Optimization Process for your Enterprise
How to Set Up a Cloud Cost Optimization Process for your EnterpriseRightScale
 
Multi-Cloud Management with RightScale CMP (Demo)
Multi-Cloud Management with RightScale CMP (Demo)Multi-Cloud Management with RightScale CMP (Demo)
Multi-Cloud Management with RightScale CMP (Demo)RightScale
 
Comparing Cloud VM Types and Prices: AWS vs Azure vs Google vs IBM
Comparing Cloud VM Types and Prices: AWS vs Azure vs Google vs IBMComparing Cloud VM Types and Prices: AWS vs Azure vs Google vs IBM
Comparing Cloud VM Types and Prices: AWS vs Azure vs Google vs IBMRightScale
 
How to Allocate and Report Cloud Costs with RightScale Optima
How to Allocate and Report Cloud Costs with RightScale OptimaHow to Allocate and Report Cloud Costs with RightScale Optima
How to Allocate and Report Cloud Costs with RightScale OptimaRightScale
 
Should You Move Between AWS, Azure, or Google Clouds? Considerations, Pros an...
Should You Move Between AWS, Azure, or Google Clouds? Considerations, Pros an...Should You Move Between AWS, Azure, or Google Clouds? Considerations, Pros an...
Should You Move Between AWS, Azure, or Google Clouds? Considerations, Pros an...RightScale
 
Using RightScale CMP with Cloud Provider Tools
Using RightScale CMP with Cloud Provider ToolsUsing RightScale CMP with Cloud Provider Tools
Using RightScale CMP with Cloud Provider ToolsRightScale
 
Best Practices for Multi-Cloud Security and Compliance
Best Practices for Multi-Cloud Security and ComplianceBest Practices for Multi-Cloud Security and Compliance
Best Practices for Multi-Cloud Security and ComplianceRightScale
 
Automating Multi-Cloud Policies for AWS, Azure, Google, and More
Automating Multi-Cloud Policies for AWS, Azure, Google, and MoreAutomating Multi-Cloud Policies for AWS, Azure, Google, and More
Automating Multi-Cloud Policies for AWS, Azure, Google, and MoreRightScale
 
The 5 Stages of Cloud Management for Enterprises
The 5 Stages of Cloud Management for EnterprisesThe 5 Stages of Cloud Management for Enterprises
The 5 Stages of Cloud Management for EnterprisesRightScale
 
9 Ways to Reduce Cloud Storage Costs
9 Ways to Reduce Cloud Storage Costs9 Ways to Reduce Cloud Storage Costs
9 Ways to Reduce Cloud Storage CostsRightScale
 
Serverless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBMServerless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBMRightScale
 
Best Practices for Cloud Managed Services Providers: The Path to CMP Success
Best Practices for Cloud Managed Services Providers: The Path to CMP SuccessBest Practices for Cloud Managed Services Providers: The Path to CMP Success
Best Practices for Cloud Managed Services Providers: The Path to CMP SuccessRightScale
 
Cloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBMCloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBMRightScale
 
2018 Cloud Trends: RightScale State of the Cloud Report
2018 Cloud Trends: RightScale State of the Cloud Report2018 Cloud Trends: RightScale State of the Cloud Report
2018 Cloud Trends: RightScale State of the Cloud ReportRightScale
 
Got a Multi-Cloud Strategy? How RightScale CMP Helps
Got a Multi-Cloud Strategy? How RightScale CMP HelpsGot a Multi-Cloud Strategy? How RightScale CMP Helps
Got a Multi-Cloud Strategy? How RightScale CMP HelpsRightScale
 
How to Manage Cloud Costs with RightScale Optima
How to Manage Cloud Costs with RightScale OptimaHow to Manage Cloud Costs with RightScale Optima
How to Manage Cloud Costs with RightScale OptimaRightScale
 

Mehr von RightScale (20)

10 Must-Have Automated Cloud Policies for IT Governance
10 Must-Have Automated Cloud Policies for IT Governance10 Must-Have Automated Cloud Policies for IT Governance
10 Must-Have Automated Cloud Policies for IT Governance
 
Kubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOpsKubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOps
 
Optimize Software, SaaS, and Cloud with Flexera and RightScale
Optimize Software, SaaS, and Cloud with Flexera and RightScaleOptimize Software, SaaS, and Cloud with Flexera and RightScale
Optimize Software, SaaS, and Cloud with Flexera and RightScale
 
Prepare Your Enterprise Cloud Strategy for 2019: 7 Things to Think About Now
Prepare Your Enterprise Cloud Strategy for 2019: 7 Things to Think About NowPrepare Your Enterprise Cloud Strategy for 2019: 7 Things to Think About Now
Prepare Your Enterprise Cloud Strategy for 2019: 7 Things to Think About Now
 
How to Set Up a Cloud Cost Optimization Process for your Enterprise
How to Set Up a Cloud Cost Optimization Process for your EnterpriseHow to Set Up a Cloud Cost Optimization Process for your Enterprise
How to Set Up a Cloud Cost Optimization Process for your Enterprise
 
Multi-Cloud Management with RightScale CMP (Demo)
Multi-Cloud Management with RightScale CMP (Demo)Multi-Cloud Management with RightScale CMP (Demo)
Multi-Cloud Management with RightScale CMP (Demo)
 
Comparing Cloud VM Types and Prices: AWS vs Azure vs Google vs IBM
Comparing Cloud VM Types and Prices: AWS vs Azure vs Google vs IBMComparing Cloud VM Types and Prices: AWS vs Azure vs Google vs IBM
Comparing Cloud VM Types and Prices: AWS vs Azure vs Google vs IBM
 
How to Allocate and Report Cloud Costs with RightScale Optima
How to Allocate and Report Cloud Costs with RightScale OptimaHow to Allocate and Report Cloud Costs with RightScale Optima
How to Allocate and Report Cloud Costs with RightScale Optima
 
Should You Move Between AWS, Azure, or Google Clouds? Considerations, Pros an...
Should You Move Between AWS, Azure, or Google Clouds? Considerations, Pros an...Should You Move Between AWS, Azure, or Google Clouds? Considerations, Pros an...
Should You Move Between AWS, Azure, or Google Clouds? Considerations, Pros an...
 
Using RightScale CMP with Cloud Provider Tools
Using RightScale CMP with Cloud Provider ToolsUsing RightScale CMP with Cloud Provider Tools
Using RightScale CMP with Cloud Provider Tools
 
Best Practices for Multi-Cloud Security and Compliance
Best Practices for Multi-Cloud Security and ComplianceBest Practices for Multi-Cloud Security and Compliance
Best Practices for Multi-Cloud Security and Compliance
 
Automating Multi-Cloud Policies for AWS, Azure, Google, and More
Automating Multi-Cloud Policies for AWS, Azure, Google, and MoreAutomating Multi-Cloud Policies for AWS, Azure, Google, and More
Automating Multi-Cloud Policies for AWS, Azure, Google, and More
 
The 5 Stages of Cloud Management for Enterprises
The 5 Stages of Cloud Management for EnterprisesThe 5 Stages of Cloud Management for Enterprises
The 5 Stages of Cloud Management for Enterprises
 
9 Ways to Reduce Cloud Storage Costs
9 Ways to Reduce Cloud Storage Costs9 Ways to Reduce Cloud Storage Costs
9 Ways to Reduce Cloud Storage Costs
 
Serverless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBMServerless Comparison: AWS vs Azure vs Google vs IBM
Serverless Comparison: AWS vs Azure vs Google vs IBM
 
Best Practices for Cloud Managed Services Providers: The Path to CMP Success
Best Practices for Cloud Managed Services Providers: The Path to CMP SuccessBest Practices for Cloud Managed Services Providers: The Path to CMP Success
Best Practices for Cloud Managed Services Providers: The Path to CMP Success
 
Cloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBMCloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBM
 
2018 Cloud Trends: RightScale State of the Cloud Report
2018 Cloud Trends: RightScale State of the Cloud Report2018 Cloud Trends: RightScale State of the Cloud Report
2018 Cloud Trends: RightScale State of the Cloud Report
 
Got a Multi-Cloud Strategy? How RightScale CMP Helps
Got a Multi-Cloud Strategy? How RightScale CMP HelpsGot a Multi-Cloud Strategy? How RightScale CMP Helps
Got a Multi-Cloud Strategy? How RightScale CMP Helps
 
How to Manage Cloud Costs with RightScale Optima
How to Manage Cloud Costs with RightScale OptimaHow to Manage Cloud Costs with RightScale Optima
How to Manage Cloud Costs with RightScale Optima
 

Kürzlich hochgeladen

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 

Kürzlich hochgeladen (20)

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 

Scaling SQL and NoSQL Databases in the Cloud

  • 1. Scaling your Database in the Cloud Spring2011 Presented by: Cory Isaacson, CEO CodeFutures Corporation http://www.dbshards.com View the video of this presentation here!
  • 2. Introduction Who I am Cory Isaacson, CEO of CodeFutures Providers of dbShards Author of Software Pipelines Partnerships: Rightscale The leading Cloud Management Platform Leaders in database scalability, performance, and high-availability for the cloud… …based on real-world experience with dozens of cloud-based applications …social networking, gaming, data collection, mobile, analytics Objective is to provide useful experience you can apply to scaling (and managing) your database tier… …especially for high volume applications
  • 3. Challenges of cloud computing Cloud provides highly attractive service environment Flexible, scales with need (up or down) No need for dedicated IT staff, fixed facility costs Pay-as-you-go model Cloud services occasionally fail Partial network outages Server failures… …by their nature cloud servers are “transient” Disk volume issues Cloud-based resources are constrained CPU I/O Rates… …the “Cloud I/O Barrier”
  • 5. Scaling in the Cloud Scaling Load Balancers is easy Stateless routing to app server Can add redundant Load Balancers if needed DNS round-robin or intelligent routing for larger sites If one goes down… …failover to another Scaling Application Servers is easy Stateless Sessions can easily transition to another server Add or remove servers as need dictates If one goes down… …failover to another
  • 6. Scaling in the Cloud Scaling the Database tier is hard “Stateful” by definition (and necessity…) Large, integrated data sets… 10s of GBs to TBs (or more…) Difficult to move, reload I/O dependent… …adversely affected by cloud service failures …and slow cloud I/O If one goes down… …ouch! Databases form the “last mile” of true application scalability Initially simple optimization produces the best result Implement a follow-on scalability strategy for long-term performance goals… …plus a high-availability strategy is a must
  • 7. All CPUs wait at the same speed… The Cloud I/O Barrier
  • 8. More Database scalability challenges Databases have many other challenges that limit scalability ACID compliance… …especially Consistency …user contention Operational challenges Failover Planned, unplanned Maintenance Index rebuild Restore Space reclamation Lifecycle Reliable Backup/Restore Monitoring Application Updates Management
  • 9. Database slowdown is not linear…
  • 10. Challenges apply to all types of databases Traditional RDBMS (MySQL, Postgres, Oracle…) I/O bound Multi-user, lock contention High-availability Lifecycle management NoSQLDatabases In-memory, Caching, Document…) Reliability, High-availability Limits of a single server …and a single thread Data dumps to disk Lifecycle Management No matter what the technology, big databases are hard to manage… …elastic scaling is a real challenge …degradation from growth in size and volume is a certainty Application-specific database requirements add to the challenge …database design is key… …balance performance vs. convenience vs. data size
  • 11. The Laws of Databases Law #1: Small Databases are fast… Law #2: Big Databases are slow… Law #3: Keep databases small
  • 12. What is the answer? Database sharding is the only effective method for achieving scale, elasticity, reliability and easy management… …regardless of your database technology
  • 13. What is Database Sharding? “Horizontal partitioning is a database design principle whereby rows of a database table are held separately... Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.” Wikipedia
  • 14. The key to Database Sharding…
  • 18. Why does Database Sharding work? Maximize CPU/Memory per database instance… …as compared to database size Reduce the size of index trees Speeds writes dramatically No contention between servers Locking, disk, memory, CPU Allows for intelligent parallel processing… …Go Fish queries across shards Keep CPUs busy and productive
  • 19. Breaking the Cloud I/O Barrier
  • 20. Database types Traditional RDBMS Proven SQL language… …although ORM can change that Durable, transactional, dependable… …perceived as inflexible NoSQL Typically in-memory… …the real speed advantage Various flavors… Key/Value Store Document database Hybrid Store complex documents using a key In-memory and disk based
  • 21. Black box vs. Relational Sharding Both utilize sharding on a data key… …typically modulus on a value or consistent hash Black box sharding is automatic… …attempts to evenly shard across all available servers …no developer visibility or control …can work acceptably for simple, non-indexed NoSQL data stores …easily supports single-row/object results Relational sharding is defined by the developer… …selective sharding of large data …explicit developer control and visibility …tunable as the database grows and matures …more efficient for result sets and searchable queries …preserves data relationships
  • 22. How Database Sharding works for NoSQL
  • 23. Relational Sharding Shard-Tree Root Table Shard-Tree Child Tables Global Tables
  • 25. How Relational Sharding works Shard key recognition in SQL SELECT * FROM customerWHERE customer_id = 1234 INSERT INTO customer(customer_id, first_name, last_name, addr_line1,…)VALUES(2345, ‘John’, ‘Jones’, ‘123 B Street’,…) UPDATE customerSET addr_line1 = ‘456 C Avenue’WHERE customer_id = 4567
  • 26. What about Cross-Shard result sets?
  • 27. Cross-shard result set example Go Fish (no shard key) SELECT country_id, count(*) FROM customerGROUP BY country_id
  • 28. More on Cross-Shard result sets… Black Box approach requires “scatter gather” for multi-row/object result set… …common with NoSQL engines …forces use of denormalized lists …must be maintained by developer/application code …Map Reduce processing helps with this (non-realtime) Relational sharding provides access to meaningful result sets of related data… …aggregation, sort easier to perform …logical search operations more natural …related rows are stored together
  • 29. Make your Application Shard-Safe For sharding to be efficient, every sharded table should include a shard key. SELECT WHERE clause statements should include a shard key where possible so that the query goes directly to a single shard rather than being executed against multiple shards in parallel. INSERT, UPDATE and DELETE WHERE CLAUSE statements must include a shard key so that the data is written to the correct shard. Avoid cross-shard inner joins. For example, joining one customer to another is not feasible as both customers may be in different shards. This can be accomplished, but requires complex logic and can adversely affect performance.
  • 30. What about High-Availability? Can you afford to take your databases offline: …for scheduled maintenance? …for unplanned failure? …can you accept some lost transactions? By definition Database Sharding adds failure points to the data tier A proven High-Availability strategy is a must… …system outages…especially in the cloud …planned maintenance…a necessity for all database engines
  • 31.
  • 32. No rapid failover for continuous operation
  • 33. Maintenance difficult in production environments
  • 34.
  • 35. …3 active-active instances for in-memory databases
  • 41.
  • 42. Database Sharding…elastic shards Expand the number of shards… …divide a single shard into N new shards …or rebalance existing shards across N new shards Contract the number of shards… …consolidate N shards into a single shard Due to large data sizes, this takes time… …regardless of data architecture Requires High-Availability to ensure no downtime… …perform scaling on live replica in background …fail back to active instances
  • 43. Database Sharding…the future Ability to leverage proven database engines… …SQL …NoSQL …Caching Hybrid database infrastructures using the best of all technologies In-memory and disk-based solutions co-existing Allow developers to select the best database engine for a given set of application requirements… …seamless context-switching within the application …use the API of choice Improved management… …monitoring …configuration “on-the-fly” …dynamic elastic shards based on demand
  • 44. Database Sharding summary Database Sharding is the only effective tool for scaling your database tier in the cloud… …or anywhere Use Database Sharding for any type of engine… …SQL …NoSQL …Cache Use the best engine for a given application requirement… …avoid the “one-size-fits-all” trap …each engine has its strengths to capitalize on Ensure your High-Availability is proven and bulletproof… …especially critical on the cloud …must support failure and maintenance
  • 45. Questions/Answers Cory Isaacson CodeFutures Corporation cory.isaacson@codefutures.com http://www.dbshards.com

Hinweis der Redaktion

  1. Writes are linear, reads can be faster – depending on your database architecture.
  2. How did your database perform 6 months or a year ago?
  3. NoSQL is attractive due to simple API, ease of use. But beware, you really need a solid data design for exactly what you want to do. Relationships are not handled by the database, they are handled by the developer.
  4. Difference between Black box sharding/NoSQL and App Aware sharding with an RDBMS – you can get sets of related data from the same shard, otherwise need to retrieve a row at a time and consolidate