Scaling Data On Public Clouds

        Liran Zelkha, Founder
    Liran.zelkha@scalebase.com
About Us
• ScaleBase is a new startup targeting the
  database-as-a-service market (DBaaS)
• We offer unlimited database scalability and
  availability using our Database Load Balancer
• We currently run in beta mode – contact me if
  you want to join
Problem Of Data
• Flickr just hit 5B pictures
• Facebook > 0.5B users
• FarmVille has more monthly players than the
  population of France
Monday's Keynote
•   More data
•   More users
•   More complex actions
•   Shorter response times
Scalability Pain

[Figure: infrastructure cost ($) over time. Curves: predicted demand, actual demand, traditional hardware capacity, automated virtualization capacity. Annotations: "Large capital expenditure", "Opportunity cost", "You just lost customers".]

   http://media.amazonwebservices.com/pdf/IBMWebinarDeck_Final.pdf
CAP vs. ACID
• CAP = Consistency, Availability, Partition
  Tolerance
• ACID = Atomicity, Consistency, Isolation,
  Durability

• Atomicity – Chain of actions treated as one
  whole, inseparable action
• Isolation – Consistent query snapshots, read
  across writes, 4 levels are supported
ScaleBase
Database Scaling In A Box
• The first truly elastic, fault-tolerant
  SQL-based data layer
• Enables linear scaling of any SQL-based
  database
[Diagram: applications and legacy clients connect to ScaleBase, which sits in front of the database instances]
ScaleBase
[Diagram: application/web servers connect through ScaleBase to shared-nothing DB machines (MySQL? Oracle?) running on commodity hardware; the result is scalable and highly available]
THE REQUIREMENTS FOR DATA SLA
IN PUBLIC CLOUD ENVIRONMENTS
What We Need
• Availability
• Consistency
• Scalability
Brewer's (CAP) Theorem
• It is impossible for a distributed computer
  system to simultaneously provide all three of
  the following guarantees:
  – Consistency (all nodes see the same data at the
    same time)
  – Availability (node failures do not prevent survivors
    from continuing to operate)
  – Partition Tolerance (the system continues to
    operate despite arbitrary message loss)

                            http://en.wikipedia.org/wiki/CAP_theorem
What It Means




http://guyharrison.squarespace.com/blog/2010/6/13/consistency-models-in-non-relational-databases.html
Reading More On CAP
• This is an excellent read, and some of my
  samples are from this blog
  – http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
ACHIEVING DATA SCALABILITY
WITH RELATIONAL DATABASES
Databases And CAP
• ACID – Consistency
• Availability – tons of solutions, most of them
  not cloud oriented
  – Oracle RAC
  – MySQL Proxy
  – Etc.
  – Replication based solutions can solve at least read
    availability and scalability (see Azure SQL)
Database Cloud Solutions
• Amazon RDS
• NaviSite Oracle RAC
• Joyent + Zeus
So Where Is The Problem?
• Scaling problems (usually write but also read)
• Schema change problems
• BigData problems
Scaling Up
• Issues with scaling up when the dataset is just
  too big
• RDBMS were not designed to be distributed
• Began to look at multi-node database
  solutions
• Known as ‘scaling out’ or ‘horizontal scaling’
• Different approaches include:
  – Master-slave
  – Sharding
Scaling RDBMS – Master/Slave
• Master-Slave
  – All writes are written to the master. All reads
    performed against the replicated slave databases
  – Critical reads may be incorrect as writes may not
    have been propagated down
  – Large data sets can pose problems as master
    needs to duplicate data to slaves
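A minimal sketch of the master/slave routing rule described above: writes go to the master, reads rotate over the slaves. The class name `ReadWriteRouter`, the node names, and the "starts with SELECT" test are assumptions made for illustration; a real router would also have to send critical reads to the master, because the slaves may lag behind.

```java
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical router: writes go to the master, reads rotate over the slaves. */
public class ReadWriteRouter {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger next = new AtomicInteger();

    public ReadWriteRouter(String master, List<String> slaves) {
        this.master = master;
        this.slaves = List.copyOf(slaves);
    }

    /** Returns the name of the node that should execute the given SQL statement. */
    public String route(String sql) {
        // Simplifying assumption: anything that is not a SELECT counts as a write.
        boolean isRead = sql.trim().toUpperCase(Locale.ROOT).startsWith("SELECT");
        if (!isRead || slaves.isEmpty()) {
            return master;
        }
        // Round-robin over the slaves; floorMod keeps the index valid after int overflow.
        int i = Math.floorMod(next.getAndIncrement(), slaves.size());
        return slaves.get(i);
    }

    public static void main(String[] args) {
        ReadWriteRouter router = new ReadWriteRouter("master", List.of("slave1", "slave2"));
        System.out.println(router.route("INSERT INTO emp VALUES (1)")); // master
        System.out.println(router.route("SELECT * FROM emp"));          // slave1
        System.out.println(router.route("SELECT * FROM emp"));          // slave2
    }
}
```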
Scaling RDBMS - Sharding
• Partition or sharding
  – Scales well for both reads and writes
  – Not transparent, application needs to be partition-
    aware
  – Can no longer have relationships/joins across
    partitions
  – Loss of referential integrity across shards
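To make "the application needs to be partition-aware" concrete, here is a small sketch of a range-based shard lookup that application code would have to consult before every query. The class `RangeShardMap`, the shard names, and the key ranges are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical lookup table mapping key ranges to shard names. */
public class RangeShardMap {
    // Inclusive lower bound of each key range -> shard that owns the range.
    private final TreeMap<Long, String> lowerBounds = new TreeMap<>();

    public void addRange(long lowerBoundInclusive, String shard) {
        lowerBounds.put(lowerBoundInclusive, shard);
    }

    /** Finds the shard whose range contains the key. */
    public String shardFor(long key) {
        // floorEntry returns the entry with the greatest lower bound <= key.
        Map.Entry<Long, String> entry = lowerBounds.floorEntry(key);
        if (entry == null) {
            throw new IllegalArgumentException("No shard covers key " + key);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        RangeShardMap shards = new RangeShardMap();
        shards.addRange(0L, "shard-a");          // keys 0 .. 999,999
        shards.addRange(1_000_000L, "shard-b");  // keys 1,000,000 and up
        System.out.println(shards.shardFor(42L));         // shard-a
        System.out.println(shards.shardFor(1_500_000L));  // shard-b
    }
}
```

Every data access path has to go through a lookup like this, which is why sharding done in the application is not transparent.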
Other ways to scale RDBMS
• Multi-Master replication
• INSERT only, not UPDATES/DELETES
• No JOINs, thereby reducing query time
  – This involves de-normalizing data
• In-memory databases
ACHIEVING DATA SLA WITH NOSQL
NoSQL
• A term used to designate databases which
  differ from classic relational databases in
  some way. These data stores may not require
  fixed table schemas, and usually
  avoid join operations and typically scale
  horizontally. Academics and papers typically
  refer to these databases as structured storage,
  a term which would include classic relational
  databases as a subset.
                             http://en.wikipedia.org/wiki/NoSQL
NoSQL Types
• Key/Value
  – A big hash table
  – Examples: Voldemort, Amazon Dynamo
• Big Table
  – Big table, column families
  – Examples: HBase, Cassandra
• Document based
  – Collections of collections
  – Examples: CouchDB, MongoDB
• Each solves a different problem
NO-SQL




  http://browsertoolkit.com/fault-tolerance.png
Pros/Cons
• Pros:
   –   Performance
   –   BigData
   –   Most solutions are open source
   –   Data is replicated to nodes and is therefore fault-tolerant (partitioning)
   –   Don't require a schema
   –   Can scale up and down
• Cons:
   –   Code change
   –   Limited framework support
   –   Not ACID
   –   Ecosystem (BI, backup)
   –   There is always a database at the backend
   –   Some APIs are just too simple
Amazon S3 Code Sample
// Open a connection using the account credentials
AWSAuthConnection conn = new AWSAuthConnection(awsAccessKeyId, awsSecretAccessKey, secure, server, format);

// Create the bucket
Response response = conn.createBucket(bucketName, location, null);

final String text = "this is a test";

// Store the text as an object under the given key
response = conn.put(bucketName, key, new S3Object(text.getBytes(), null), null);
Cassandra Code Sample
CassandraClient cl = pool.getClient();
KeySpace ks = cl.getKeySpace("Keyspace1");

// insert value
ColumnPath cp = new ColumnPath("Standard1", null,
        "testInsertAndGetAndRemove".getBytes("utf-8"));
for (int i = 0; i < 100; i++) {
    ks.insert("testInsertAndGetAndRemove_" + i, cp,
            ("testInsertAndGetAndRemove_value_" + i).getBytes("utf-8"));
}

// get value
for (int i = 0; i < 100; i++) {
    Column col = ks.getColumn("testInsertAndGetAndRemove_" + i, cp);
    String value = new String(col.getValue(), "utf-8");
}

// remove value
for (int i = 0; i < 100; i++) {
    ks.remove("testInsertAndGetAndRemove_" + i, cp);
}
Cassandra Code Sample – Cont’
// removing a row that does not exist should not throw
try {
    ks.remove("testInsertAndGetAndRemove_not_exist", cp);
} catch (Exception e) {
    fail("remove not exist row should not throw exceptions");
}

// get already removed value
for (int i = 0; i < 100; i++) {
    try {
        Column col = ks.getColumn("testInsertAndGetAndRemove_" + i, cp);
        fail("the value should already have been deleted");
    } catch (NotFoundException e) {
        // expected: the column was removed earlier
    } catch (Exception e) {
        fail("threw another exception, should be NotFoundException: " + e.toString());
    }
}

pool.releaseClient(cl);
pool.close();
Cassandra Statistics
• Facebook Search
• MySQL > 50 GB Data
  – Writes Average : ~300 ms
  – Reads Average : ~350 ms
• Rewritten with Cassandra > 50 GB Data
  – Writes Average : 0.12 ms
  – Reads Average : 15 ms
MongoDB
Mongo m = new Mongo();

DB db = m.getDB( "mydb" );
Set<String> colls = db.getCollectionNames();

for (String s : colls) {
       System.out.println(s);
}
MongoDB – Cont’
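// "coll" is not defined on the slide; assumed here to come from the db handle,
// with "testCollection" as a placeholder collection name
DBCollection coll = db.getCollection("testCollection");

BasicDBObject doc = new BasicDBObject();

doc.put("name", "MongoDB");
doc.put("type", "database");
doc.put("count", 1);

// nested document
BasicDBObject info = new BasicDBObject();

info.put("x", 203);
info.put("y", 102);

doc.put("info", info);

coll.insert(doc);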
THE BOTTOM LINE
Data SLA
• There is no golden hammer
   – See http://sourcemaking.com/antipatterns/golden-hammer
• Choose your tool wisely, based on what you need
• Usually
   – Start with RDBMS (shortest TTM, which is what we
     really care about)
   – When scale issues occur – start moving to NoSQL
     based on your needs
• You can get Data Scalability in the cloud – just
  think before you code!!!
A BIT MORE ON SCALEBASE
How ScaleBase Works
• ScaleBase takes an application
  database and splits its data
  across multiple, separate
  instances (a technique called
  sharding)
• Queries and DML are either:
   – Directed to the correct instance, or
   – Executed simultaneously across
     several instances
• Results are aggregated and
  returned to the original
  application
[Diagram: Application → ScaleBase → Database instances]
Example
Original table:
ID    First name   Last name
100   Steven       King
101   Neena        Kochhar
102   Lex          De Haan
103   Alexander    Hunold
104   Bruce        Ernst
105   David        Austin
106   Valli        Pataballa

After the split, first instance:
ID    First name   Last name
102   Lex          De Haan
105   David        Austin

Second instance:
ID    First name   Last name
100   Steven       King
103   Alexander    Hunold
106   Valli        Pataballa

Third instance:
ID    First name   Last name
101   Neena        Kochhar
104   Bruce        Ernst
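The slide does not say which rule produced the three smaller tables, but the grouping is consistent with a simple hash split on ID modulo 3. The sketch below reproduces that grouping under this assumption; the class and method names are hypothetical and this is not ScaleBase's actual code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hash split: a row goes to instance number (id mod instanceCount). */
public class HashSplitExample {
    static int instanceFor(int id, int instanceCount) {
        return Math.floorMod(id, instanceCount);
    }

    /** Distributes the ids over instanceCount buckets, keeping input order. */
    static List<List<Integer>> split(List<Integer> ids, int instanceCount) {
        List<List<Integer>> instances = new ArrayList<>();
        for (int i = 0; i < instanceCount; i++) {
            instances.add(new ArrayList<>());
        }
        for (int id : ids) {
            instances.get(instanceFor(id, instanceCount)).add(id);
        }
        return instances;
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(100, 101, 102, 103, 104, 105, 106);
        // prints [[102, 105], [100, 103, 106], [101, 104]]
        System.out.println(split(ids, 3));
    }
}
```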
ScaleBase Supports
• 3 table types
    – Master
    – Global
    – Split
•   Splitting according to Hash, List, Range
•   Rebalance, addition and removal of machines
•   Instance replication backup: Shadow and Master
•   Fully consistent 2-phase commit
•   Joins, Foreign Keys, Subqueries
•   DML, DDL, Batch updates, Prepared Statements
•   Aggregations, Group By, Order By, Auto Numbering,
    Timestamps
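For readers unfamiliar with the 2-phase commit listed above, here is a bare-bones sketch of the protocol's control flow. The `Participant` interface and `InMemoryParticipant` class are hypothetical stand-ins for database instances; timeouts, crash recovery, and logging, which a real coordinator needs, are left out.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseCommitSketch {
    /** Hypothetical participant; a real one would wrap a database connection. */
    interface Participant {
        boolean prepare();  // phase 1: vote yes (true) or no (false)
        void commit();      // phase 2 when everyone voted yes
        void rollback();    // phase 2 when someone voted no
    }

    /** Returns true if every participant committed, false if the transaction was aborted. */
    static boolean run(List<? extends Participant> participants) {
        List<Participant> votedYes = new ArrayList<>();
        // Phase 1: collect votes and stop at the first "no".
        for (Participant p : participants) {
            if (p.prepare()) {
                votedYes.add(p);
            } else {
                // The "no" voter is assumed to have aborted locally; undo the rest.
                for (Participant q : votedYes) {
                    q.rollback();
                }
                return false;
            }
        }
        // Phase 2: everyone voted yes, so commit everywhere.
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    /** Test double that records the last state it reached. */
    static class InMemoryParticipant implements Participant {
        final boolean vote;
        String state = "init";
        InMemoryParticipant(boolean vote) { this.vote = vote; }
        public boolean prepare() { state = vote ? "prepared" : "aborted"; return vote; }
        public void commit() { state = "committed"; }
        public void rollback() { state = "rolled back"; }
    }

    public static void main(String[] args) {
        InMemoryParticipant a = new InMemoryParticipant(true);
        InMemoryParticipant b = new InMemoryParticipant(false);
        System.out.println(run(List.of(a, b)));      // false
        System.out.println(a.state + ", " + b.state); // rolled back, aborted
    }
}
```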
Sample Code
SELECT site_owner_id, count(*) FROM google.user_clicks
WHERE country = 'BRAZIL'
GROUP BY site_owner_id


• site_owner_id is the split key
• Perform the query on all DBs
• Simple aggregation of results

• No Code Change
Sample Code
SELECT country, count(*) FROM google.user_clicks
GROUP BY country



• Perform the query on all DBs
• Aggregation of the aggregations

• No Code Change
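"Aggregation of the aggregations" can be sketched as follows: each database returns its own country counts, and the partial counts for the same country are summed. The class `CountMerger` and the sample numbers are made up for illustration; the slides do not show how ScaleBase implements this step.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical merge step for per-database GROUP BY ... count(*) results. */
public class CountMerger {
    /** Sums the partial counts that share the same GROUP BY key. */
    static Map<String, Long> merge(List<Map<String, Long>> perDatabase) {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> partial : perDatabase) {
            for (Map.Entry<String, Long> e : partial.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up partial results from two database instances.
        Map<String, Long> db1 = Map.of("BRAZIL", 10L, "FRANCE", 4L);
        Map<String, Long> db2 = Map.of("BRAZIL", 7L, "ISRAEL", 3L);
        // prints {BRAZIL=17, FRANCE=4, ISRAEL=3}
        System.out.println(merge(List.of(db1, db2)));
    }
}
```

Summing works for count(*) and SUM; an average would need the per-database sums and counts rather than the per-database averages.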
Sample Code
PreparedStatement pstmt = conn.prepareStatement("INSERT INTO emp VALUES(?,?,?,?,?)");
for (int i = 0; i < 10; i++) {
    pstmt.setInt(1, 300 + i);
    pstmt.setString(2, "Something" + i);
    pstmt.setDate(3, new Date(System.currentTimeMillis()));  // java.sql.Date
    pstmt.setInt(4, i);
    pstmt.setLong(5, i);
    pstmt.addBatch();  // queue this row
}
int[] result = pstmt.executeBatch();  // one update count per queued row



• Is split key dynamic or static?
• Each command is added to the correct DB,
  execution is on all relevant DBs

• No Code Change
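One way to picture "each command is added to the correct DB" is to group the batched rows by the database that owns their split key. The sketch below assumes the first column (the values 300 to 309) is the split key and that databases are chosen by key modulo the database count; both are assumptions for illustration, and `BatchSplitter` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical splitter that turns one batch into one sub-batch per database. */
public class BatchSplitter {
    /** Groups split-key values by database index (key mod dbCount). */
    static Map<Integer, List<Integer>> groupByDatabase(List<Integer> splitKeys, int dbCount) {
        Map<Integer, List<Integer>> batches = new TreeMap<>();
        for (int key : splitKeys) {
            int db = Math.floorMod(key, dbCount);
            batches.computeIfAbsent(db, d -> new ArrayList<>()).add(key);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            keys.add(300 + i);  // same ids as the JDBC batch
        }
        // Each map entry is a sub-batch to execute on one database.
        System.out.println(groupByDatabase(keys, 4));
    }
}
```

Only databases that received at least one row appear in the result, which matches "execution is on all relevant DBs".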
ScaleBase Solution
• Elastic, high-availability SQL database scaling
  solution
  – Complete
  – Transparent
  – Super scalable
  – Out of the box
  – Non-intrusive
  – Flexible
  – Manageable
With ScaleBase
• Pay much less for hardware and Database licenses
• Get more power, better data spreading and better
  availability
• Real linear scalability
• Go for real grid, cloud and virtualization

• ScaleBase:
   – Is NOT an RDBMS. It works with any secure, highly available,
     archivable RDBMS (Oracle, DB2, MySQL, any!)
   – Does NOT require schema structure modifications
   – Does NOT require additional sophisticated hardware
Moving To ScaleBase
• Implementing ScaleBase is done in minutes
• Just direct your application to your ScaleBase
  instance
• Point ScaleBase at your original database and
  at the target database instances
• ScaleBase will automatically migrate your schema
  and data to the new servers
• After all data is transferred, ScaleBase will start
  working with the target database instances, giving
  you linear scalability – with no downtime!
Where ScaleBase Fits In
• Cloud databases
  – Use SQL databases in the cloud, and get Scale Out
    features and high availability
• High scale applications
  – Use your existing code, without migrating to
    NOSQL, and get unlimited SQL scalability
CUSTOMER CASE STUDIES
Public Cloud
• Scenario:
  – A startup developing a complex iPhone application
    running on EC2
  – Requires a high-scale option for its SQL-based database
• Problem:
  – Amazon RDS offers only a scale-up option, no scale-out
• Solution:
  – Customer switched their Amazon RDS-based
    application to ScaleBase on EC2
  – Gained transparent, linear scaling
  – Running 4 RDS instances behind ScaleBase
Private Cloud
• Scenario:
   – A company selling devices that 'ping' home every 5 minutes
   – 8-digit number of devices sold
• Problem:
   – Evaluated MySQL, Oracle - no single machine can support
     the required number of devices
   – Clustering options too expensive, limited scalability
• Solution:
   – Customer moved to ScaleBase with no code changes
   – Gained linear scaling. Runs 4 MySQL databases behind
     ScaleBase

Weitere ähnliche Inhalte

Was ist angesagt?

Summit 2011 infra_dbms
Summit 2011 infra_dbmsSummit 2011 infra_dbms
Summit 2011 infra_dbmsPini Cohen
 
Geek Sync | Data in the Cloud: Understanding Amazon Database Services with Vi...
Geek Sync | Data in the Cloud: Understanding Amazon Database Services with Vi...Geek Sync | Data in the Cloud: Understanding Amazon Database Services with Vi...
Geek Sync | Data in the Cloud: Understanding Amazon Database Services with Vi...IDERA Software
 
Apache Hadoop on Virtual Machines
Apache Hadoop on Virtual MachinesApache Hadoop on Virtual Machines
Apache Hadoop on Virtual MachinesDataWorks Summit
 
Toronto jaspersoft meetup
Toronto jaspersoft meetupToronto jaspersoft meetup
Toronto jaspersoft meetupPatrick McFadin
 
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend MicroHBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend MicroCloudera, Inc.
 
cloud conference 2013 - Infrastructure as a Service in Amazon Web Services
cloud conference 2013 - Infrastructure as a Service in Amazon Web Servicescloud conference 2013 - Infrastructure as a Service in Amazon Web Services
cloud conference 2013 - Infrastructure as a Service in Amazon Web ServicesVMEngine
 
Database backed coherence cache
Database backed coherence cacheDatabase backed coherence cache
Database backed coherence cachearagozin
 
Acquia Managed Cloud: Highly Available Architecture for Highly Unpredictable ...
Acquia Managed Cloud: Highly Available Architecture for Highly Unpredictable ...Acquia Managed Cloud: Highly Available Architecture for Highly Unpredictable ...
Acquia Managed Cloud: Highly Available Architecture for Highly Unpredictable ...Acquia
 
Scaling the Platform for Your Startup
Scaling the Platform for Your StartupScaling the Platform for Your Startup
Scaling the Platform for Your StartupAmazon Web Services
 
Leveraging Azure Databricks to minimize time to insight by combining Batch an...
Leveraging Azure Databricks to minimize time to insight by combining Batch an...Leveraging Azure Databricks to minimize time to insight by combining Batch an...
Leveraging Azure Databricks to minimize time to insight by combining Batch an...Microsoft Tech Community
 
Using Distributed In-Memory Computing for Fast Data Analysis
Using Distributed In-Memory Computing for Fast Data AnalysisUsing Distributed In-Memory Computing for Fast Data Analysis
Using Distributed In-Memory Computing for Fast Data AnalysisScaleOut Software
 
Power BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle CloudPower BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle CloudKellyn Pot'Vin-Gorman
 
Minnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with CassandraMinnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with CassandraJeff Bollinger
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012jbellis
 
Concevoir une application scalable dans le Cloud
Concevoir une application scalable dans le CloudConcevoir une application scalable dans le Cloud
Concevoir une application scalable dans le CloudStéphanie Hertrich
 
NoSQL and AWS Dynamodb
NoSQL and AWS DynamodbNoSQL and AWS Dynamodb
NoSQL and AWS DynamodbEduardo Bohrer
 
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)Bob Pusateri
 
Database & Technology 2 _ Marting Lambert _ Mixed Workloads Why and How.pdf
Database & Technology 2 _ Marting Lambert _ Mixed Workloads Why and How.pdfDatabase & Technology 2 _ Marting Lambert _ Mixed Workloads Why and How.pdf
Database & Technology 2 _ Marting Lambert _ Mixed Workloads Why and How.pdfInSync2011
 
C*ollege Credit: Is My App a Good Fit for Cassandra?
C*ollege Credit: Is My App a Good Fit for Cassandra?C*ollege Credit: Is My App a Good Fit for Cassandra?
C*ollege Credit: Is My App a Good Fit for Cassandra?DataStax
 

Was ist angesagt? (20)

Summit 2011 infra_dbms
Summit 2011 infra_dbmsSummit 2011 infra_dbms
Summit 2011 infra_dbms
 
Geek Sync | Data in the Cloud: Understanding Amazon Database Services with Vi...
Geek Sync | Data in the Cloud: Understanding Amazon Database Services with Vi...Geek Sync | Data in the Cloud: Understanding Amazon Database Services with Vi...
Geek Sync | Data in the Cloud: Understanding Amazon Database Services with Vi...
 
Apache Hadoop on Virtual Machines
Apache Hadoop on Virtual MachinesApache Hadoop on Virtual Machines
Apache Hadoop on Virtual Machines
 
Toronto jaspersoft meetup
Toronto jaspersoft meetupToronto jaspersoft meetup
Toronto jaspersoft meetup
 
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend MicroHBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro
 
cloud conference 2013 - Infrastructure as a Service in Amazon Web Services
cloud conference 2013 - Infrastructure as a Service in Amazon Web Servicescloud conference 2013 - Infrastructure as a Service in Amazon Web Services
cloud conference 2013 - Infrastructure as a Service in Amazon Web Services
 
Database backed coherence cache
Database backed coherence cacheDatabase backed coherence cache
Database backed coherence cache
 
Acquia Managed Cloud: Highly Available Architecture for Highly Unpredictable ...
Acquia Managed Cloud: Highly Available Architecture for Highly Unpredictable ...Acquia Managed Cloud: Highly Available Architecture for Highly Unpredictable ...
Acquia Managed Cloud: Highly Available Architecture for Highly Unpredictable ...
 
Scaling the Platform for Your Startup
Scaling the Platform for Your StartupScaling the Platform for Your Startup
Scaling the Platform for Your Startup
 
Leveraging Azure Databricks to minimize time to insight by combining Batch an...
Leveraging Azure Databricks to minimize time to insight by combining Batch an...Leveraging Azure Databricks to minimize time to insight by combining Batch an...
Leveraging Azure Databricks to minimize time to insight by combining Batch an...
 
Using Distributed In-Memory Computing for Fast Data Analysis
Using Distributed In-Memory Computing for Fast Data AnalysisUsing Distributed In-Memory Computing for Fast Data Analysis
Using Distributed In-Memory Computing for Fast Data Analysis
 
Power BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle CloudPower BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle Cloud
 
Hadoop on Virtual Machines
Hadoop on Virtual MachinesHadoop on Virtual Machines
Hadoop on Virtual Machines
 
Minnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with CassandraMinnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with Cassandra
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012
 
Concevoir une application scalable dans le Cloud
Concevoir une application scalable dans le CloudConcevoir une application scalable dans le Cloud
Concevoir une application scalable dans le Cloud
 
NoSQL and AWS Dynamodb
NoSQL and AWS DynamodbNoSQL and AWS Dynamodb
NoSQL and AWS Dynamodb
 
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
 
Database & Technology 2 _ Marting Lambert _ Mixed Workloads Why and How.pdf
Database & Technology 2 _ Marting Lambert _ Mixed Workloads Why and How.pdfDatabase & Technology 2 _ Marting Lambert _ Mixed Workloads Why and How.pdf
Database & Technology 2 _ Marting Lambert _ Mixed Workloads Why and How.pdf
 
C*ollege Credit: Is My App a Good Fit for Cassandra?
C*ollege Credit: Is My App a Good Fit for Cassandra?C*ollege Credit: Is My App a Good Fit for Cassandra?
C*ollege Credit: Is My App a Good Fit for Cassandra?
 

Andere mochten auch

Building Eclipse Plugins
Building Eclipse PluginsBuilding Eclipse Plugins
Building Eclipse PluginsLiran Zelkha
 
שטפונות בנגב
שטפונות בנגבשטפונות בנגב
שטפונות בנגבLiran Zelkha
 
Eclipse Plug-in Develompent Tips And Tricks
Eclipse Plug-in Develompent Tips And TricksEclipse Plug-in Develompent Tips And Tricks
Eclipse Plug-in Develompent Tips And TricksChris Aniszczyk
 
OC4J to WebLogic Server Migration5
OC4J to WebLogic Server Migration5OC4J to WebLogic Server Migration5
OC4J to WebLogic Server Migration5Liran Zelkha
 
L0016 - The Structure of an Eclipse Plug-in
L0016 - The Structure of an Eclipse Plug-inL0016 - The Structure of an Eclipse Plug-in
L0016 - The Structure of an Eclipse Plug-inTonny Madsen
 
Creating a Plug-In Architecture
Creating a Plug-In ArchitectureCreating a Plug-In Architecture
Creating a Plug-In Architectureondrejbalas
 
Data SLA in the public cloud
Data SLA in the public cloudData SLA in the public cloud
Data SLA in the public cloudLiran Zelkha
 

Andere mochten auch (8)

Building Eclipse Plugins
Building Eclipse PluginsBuilding Eclipse Plugins
Building Eclipse Plugins
 
שטפונות בנגב
שטפונות בנגבשטפונות בנגב
שטפונות בנגב
 
PDE builds or Maven
PDE builds or MavenPDE builds or Maven
PDE builds or Maven
 
Eclipse Plug-in Develompent Tips And Tricks
Eclipse Plug-in Develompent Tips And TricksEclipse Plug-in Develompent Tips And Tricks
Eclipse Plug-in Develompent Tips And Tricks
 
OC4J to WebLogic Server Migration5
OC4J to WebLogic Server Migration5OC4J to WebLogic Server Migration5
OC4J to WebLogic Server Migration5
 
L0016 - The Structure of an Eclipse Plug-in
L0016 - The Structure of an Eclipse Plug-inL0016 - The Structure of an Eclipse Plug-in
L0016 - The Structure of an Eclipse Plug-in
 
Creating a Plug-In Architecture
Creating a Plug-In ArchitectureCreating a Plug-In Architecture
Creating a Plug-In Architecture
 
Data SLA in the public cloud
Data SLA in the public cloudData SLA in the public cloud
Data SLA in the public cloud
 

Ähnlich wie Scaling data on public clouds

Running your database in the cloud presentation
Running your database in the cloud presentationRunning your database in the cloud presentation
Running your database in the cloud presentationAravindharamanan S
 
Running your database in the cloud presentation
Running your database in the cloud presentationRunning your database in the cloud presentation
Running your database in the cloud presentationAravindharamanan S
 
Running your database in the cloud presentation
Running your database in the cloud presentationRunning your database in the cloud presentation
Running your database in the cloud presentationManish Singh
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesshnkr_rmchndrn
 
Cloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web AppsCloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web AppsMark Slingsby
 
Scaling the Platform for Your Startup
Scaling the Platform for Your StartupScaling the Platform for Your Startup
Scaling the Platform for Your StartupAmazon Web Services
 
A Tour of Azure SQL Databases (NOVA SQL UG 2020)
A Tour of Azure SQL Databases  (NOVA SQL UG 2020)A Tour of Azure SQL Databases  (NOVA SQL UG 2020)
A Tour of Azure SQL Databases (NOVA SQL UG 2020)Timothy McAliley
 
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...Fwdays
 
SQL and NoSQL in SQL Server
SQL and NoSQL in SQL ServerSQL and NoSQL in SQL Server
SQL and NoSQL in SQL ServerMichael Rys
 
No sql solutions - 공개용
No sql solutions - 공개용No sql solutions - 공개용
No sql solutions - 공개용Byeongweon Moon
 
Storage Systems For Scalable systems
Storage Systems For Scalable systemsStorage Systems For Scalable systems
Storage Systems For Scalable systemselliando dias
 
Migrating enterprise workloads to AWS
Migrating enterprise workloads to AWSMigrating enterprise workloads to AWS
Migrating enterprise workloads to AWSTom Laszewski
 
ScaleBase Webinar: Scaling MySQL - Sharding Made Easy!
ScaleBase Webinar: Scaling MySQL - Sharding Made Easy!ScaleBase Webinar: Scaling MySQL - Sharding Made Easy!
ScaleBase Webinar: Scaling MySQL - Sharding Made Easy!ScaleBase
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...Qian Lin
 
Aws for Startups Building Cloud Enabled Apps
Aws for Startups Building Cloud Enabled AppsAws for Startups Building Cloud Enabled Apps
Aws for Startups Building Cloud Enabled AppsAmazon Web Services
 
MySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion QueriesMySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion QueriesBernd Ocklin
 
Building a highly scalable and available cloud application
Building a highly scalable and available cloud applicationBuilding a highly scalable and available cloud application
Building a highly scalable and available cloud applicationNoam Sheffer
 
Building Scalable Databases on AWS - AWS Summit 2012 - NYC
Building Scalable Databases on AWS - AWS Summit 2012 - NYCBuilding Scalable Databases on AWS - AWS Summit 2012 - NYC
Building Scalable Databases on AWS - AWS Summit 2012 - NYCAmazon Web Services
 
Scaling the Platform for Your Startup - Startup Talks June 2015
Scaling the Platform for Your Startup - Startup Talks June 2015Scaling the Platform for Your Startup - Startup Talks June 2015
Scaling the Platform for Your Startup - Startup Talks June 2015Amazon Web Services
 

Ähnlich wie Scaling data on public clouds (20)

Running your database in the cloud presentation
Running your database in the cloud presentationRunning your database in the cloud presentation
Running your database in the cloud presentation
 
Running your database in the cloud presentation
Running your database in the cloud presentationRunning your database in the cloud presentation
Running your database in the cloud presentation
 
Running your database in the cloud presentation
Running your database in the cloud presentationRunning your database in the cloud presentation
Running your database in the cloud presentation
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skies
 
Cloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web AppsCloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web Apps
 
Scaling the Platform for Your Startup
Scaling the Platform for Your StartupScaling the Platform for Your Startup
Scaling the Platform for Your Startup
 
A Tour of Azure SQL Databases (NOVA SQL UG 2020)
A Tour of Azure SQL Databases  (NOVA SQL UG 2020)A Tour of Azure SQL Databases  (NOVA SQL UG 2020)
A Tour of Azure SQL Databases (NOVA SQL UG 2020)
 
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Scaling data on public clouds

  • 1. Scaling Data On Public Clouds Liran Zelkha, Founder Liran.zelkha@scalebase.com
  • 2. About Us • ScaleBase is a new startup targeting the database-as-a-service market (DBaaS) • We offer unlimited database scalability and availability using our Database Load Balancer • We currently run in beta mode – contact me if you want to join
  • 3. Problem Of Data • Flickr just hit 5B pictures • Facebook > 0.5B users • FarmVille has more monthly players than the population of France
  • 4. Monday's Keynote • More data • More users • More complex actions • Shorter response times
  • 5. Scalability Pain [Figure: infrastructure cost ($) over time. Curves: predicted demand, actual demand, traditional hardware, automated virtualization. Annotations: "Large capital expenditure", "Opportunity cost", "You just lost customers".] Source: http://media.amazonwebservices.com/pdf/IBMWebinarDeck_Final.pdf
  • 6. CAP vs. ACID • CAP = Consistency, Availability, Partition Tolerance • ACID = Atomicity, Consistency, Isolation, Durability • Atomicity – A chain of actions is treated as one inseparable action • Isolation – Consistent query snapshots, read across writes; 4 isolation levels are supported
  • 7. ScaleBase Database Scaling In A Box • The first truly elastic, fault-tolerant SQL-based data layer • Enables linear scaling of any SQL-based database [Diagram labels: Applications, Legacy clients, ScaleBase, Database instances]
  • 8. ScaleBase [Diagram labels: Application/Web Servers; ScaleBase; Shared Nothing DB Machines, Commodity Hardware; MySQL? Oracle?; Scalable and highly available]
  • 9. THE REQUIREMENTS FOR DATA SLA IN PUBLIC CLOUD ENVIRONMENTS
  • 10. What We Need • Availability • Consistency • Scalability
  • 11. Brewer's (CAP) Theorem • It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: – Consistency (all nodes see the same data at the same time) – Availability (node failures do not prevent survivors from continuing to operate) – Partition Tolerance (the system continues to operate despite arbitrary message loss) http://en.wikipedia.org/wiki/CAP_theorem
  • 13. Reading More On CAP • This is an excellent read, and some of my samples are from this blog – http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
  • 14. ACHIEVING DATA SCALABILITY WITH RELATIONAL DATABASES
  • 15. Databases And CAP • ACID – Consistency • Availability – tons of solutions, most of them not cloud oriented – Oracle RAC – MySQL Proxy – Etc. – Replication based solutions can solve at least read availability and scalability (see Azure SQL)
  • 16. Database Cloud Solutions • Amazon RDS • NaviSite Oracle RAC • Joyent + Zeus
  • 17. So Where Is The Problem? • Scaling problems (usually write but also read) • Schema change problems • BigData problems
  • 18. Scaling Up • Issues with scaling up when the dataset is just too big • RDBMS were not designed to be distributed • Began to look at multi-node database solutions • Known as ‘scaling out’ or ‘horizontal scaling’ • Different approaches include: – Master-slave – Sharding
  • 19. Scaling RDBMS – Master/Slave • Master-Slave – All writes go to the master. All reads are performed against the replicated slave databases – Critical reads may return stale data, because writes may not yet have propagated to the slaves – Large data sets can pose problems, as the master needs to duplicate data to the slaves
  • 20. Scaling RDBMS - Sharding • Partitioning or sharding – Scales well for both reads and writes – Not transparent, application needs to be partition-aware – Can no longer have relationships/joins across partitions – Loss of referential integrity across shards
  • 21. Other ways to scale RDBMS • Multi-Master replication • INSERT only, no UPDATEs/DELETEs • No JOINs, thereby reducing query time – This involves de-normalizing data • In-memory databases
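The master/slave split on slide 19 can be sketched as a small statement router. This is only an illustration under stated assumptions: the class `ReadWriteRouter`, its `criticalRead` flag, and the server names are hypothetical and do not come from the deck or from any particular product. Writes go to the master, ordinary reads rotate over the slaves, and reads that cannot tolerate replication lag are sent to the master as well.

```java
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical router: writes go to the master, reads rotate over the slaves. */
class ReadWriteRouter {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger next = new AtomicInteger();

    ReadWriteRouter(String master, List<String> slaves) {
        this.master = master;
        this.slaves = List.copyOf(slaves);
    }

    /** Returns the name of the server that should run the statement. */
    String route(String sql, boolean criticalRead) {
        boolean isRead = sql.trim().toLowerCase(Locale.ROOT).startsWith("select");
        // Writes, and reads that cannot tolerate replication lag, use the master.
        if (!isRead || criticalRead || slaves.isEmpty()) {
            return master;
        }
        // Other reads are spread round-robin over the replicated slaves.
        int i = Math.floorMod(next.getAndIncrement(), slaves.size());
        return slaves.get(i);
    }

    public static void main(String[] args) {
        ReadWriteRouter r = new ReadWriteRouter("master", List.of("slave-1", "slave-2"));
        System.out.println(r.route("INSERT INTO emp VALUES (1)", false));   // master
        System.out.println(r.route("SELECT * FROM emp", false));            // slave-1
        System.out.println(r.route("SELECT * FROM emp", false));            // slave-2
        System.out.println(r.route("SELECT balance FROM accounts", true));  // master
    }
}
```

Sending critical reads to the master avoids the stale-read problem at the cost of putting that read load back on the single write node, which is one reason this approach mostly helps read scalability.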
  • 22. ACHIEVING DATA SLA WITH NOSQL
  • 23. NoSQL • A term used to designate databases which differ from classic relational databases in some way. These data stores may not require fixed table schemas, and usually avoid join operations and typically scale horizontally. Academics and papers typically refer to these databases as structured storage, a term which would include classic relational databases as a subset. http://en.wikipedia.org/wiki/NoSQL
  • 24. NoSQL Types • Key/Value – A big hash table – Examples: Voldemort, Amazon Dynamo • Big Table – Big table, column families – Examples: HBase, Cassandra • Document based – Collections of collections – Examples: CouchDB, MongoDB • Each solves a different problem
  • 26. Pros/Cons • Pros: – Performance – BigData – Most solutions are open source – Data is replicated to nodes and is therefore fault-tolerant (partitioning) – Don't require a schema – Can scale up and down • Cons: – Code change – Limited framework support – Not ACID – Ecosystem (BI, Backup) – There is always a database at the backend – Some APIs are just too simple
  • 27. Amazon S3 Code Sample
        AWSAuthConnection conn = new AWSAuthConnection(awsAccessKeyId, awsSecretAccessKey, secure, server, format);
        Response response = conn.createBucket(bucketName, location, null);
        final String text = "this is a test";
        response = conn.put(bucketName, key, new S3Object(text.getBytes(), null), null);
  • 28. Cassandra Code Sample
        CassandraClient cl = pool.getClient();
        KeySpace ks = cl.getKeySpace("Keyspace1");
        // insert value
        ColumnPath cp = new ColumnPath("Standard1", null, "testInsertAndGetAndRemove".getBytes("utf-8"));
        for (int i = 0; i < 100; i++) {
            ks.insert("testInsertAndGetAndRemove_" + i, cp, ("testInsertAndGetAndRemove_value_" + i).getBytes("utf-8"));
        }
        // get value
        for (int i = 0; i < 100; i++) {
            Column col = ks.getColumn("testInsertAndGetAndRemove_" + i, cp);
            String value = new String(col.getValue(), "utf-8");
        }
        // remove value
        for (int i = 0; i < 100; i++) {
            ks.remove("testInsertAndGetAndRemove_" + i, cp);
        }
  • 29. Cassandra Code Sample – Cont’
        try {
            ks.remove("testInsertAndGetAndRemove_not_exist", cp);
        } catch (Exception e) {
            fail("remove not exist row should not throw exceptions");
        }
        // get already removed value
        for (int i = 0; i < 100; i++) {
            try {
                Column col = ks.getColumn("testInsertAndGetAndRemove_" + i, cp);
                fail("the value should already being deleted");
            } catch (NotFoundException e) {
                // expected: the row was removed
            } catch (Exception e) {
                fail("throw out other exception, should be NotFoundException." + e.toString());
            }
        }
        pool.releaseClient(cl);
        pool.close();
  • 30. Cassandra Statistics • Facebook Search • MySQL > 50 GB Data – Writes Average : ~300 ms – Reads Average : ~350 ms • Rewritten with Cassandra > 50 GB Data – Writes Average : 0.12 ms – Reads Average : 15 ms
  • 31. MongoDB
        Mongo m = new Mongo();
        DB db = m.getDB("mydb");
        Set<String> colls = db.getCollectionNames();
        for (String s : colls) {
            System.out.println(s);
        }
  • 32. MongoDB – Cont’
        // coll is not defined on the slides; it is assumed to be a collection obtained from db earlier
        BasicDBObject doc = new BasicDBObject();
        doc.put("name", "MongoDB");
        doc.put("type", "database");
        doc.put("count", 1);
        BasicDBObject info = new BasicDBObject();
        info.put("x", 203);
        info.put("y", 102);
        doc.put("info", info);
        coll.insert(doc);
  • 34. Data SLA • There is no golden hammer – See http://sourcemaking.com/antipatterns/golden-hammer • Choose your tool wisely, based on what you need • Usually – Start with RDBMS (shortest time to market, which is what we really care about) – When scale issues occur, start moving to NoSQL based on your needs • You can get Data Scalability in the cloud – just think before you code!
  • 35. A BIT MORE ON SCALEBASE
  • 36. How ScaleBase Works • ScaleBase takes an application database and splits its data across multiple, separated instances (a technique called Sharding) • Queries and DML are either: – Directed to the correct instance, or – Executed simultaneously across several instances • Results are aggregated and returned to the original application [Diagram labels: Application, ScaleBase, Database instances]
  • 37. Example
        Original table (ID, First name, Last name):
          100 Steven King
          101 Neena Kochhar
          102 Lex De Haan
          103 Alexander Hunold
          104 Bruce Ernst
          105 David Austin
          106 Valli Pataballa
        The same rows split across three database instances:
          – 100 Steven King, 103 Alexander Hunold, 106 Valli Pataballa
          – 101 Neena Kochhar, 104 Bruce Ernst
          – 102 Lex De Haan, 105 David Austin
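The grouping in the slide 37 example is consistent with splitting on ID modulo 3. The slide does not say which split function was used, so the modulo rule below is an assumption, and `HashSplitDemo` and its method names are hypothetical. A minimal sketch that reproduces the three groups:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: assign rows to shards by ID modulo the shard count (assumed split function). */
class HashSplitDemo {
    /** Shard index for a numeric split key. */
    static int shardFor(int id, int shardCount) {
        return Math.floorMod(id, shardCount);
    }

    /** Groups the given IDs by shard index. */
    static Map<Integer, List<Integer>> split(List<Integer> ids, int shardCount) {
        Map<Integer, List<Integer>> shards = new TreeMap<>();
        for (int id : ids) {
            shards.computeIfAbsent(shardFor(id, shardCount), k -> new ArrayList<>()).add(id);
        }
        return shards;
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(100, 101, 102, 103, 104, 105, 106);
        // Prints {0=[102, 105], 1=[100, 103, 106], 2=[101, 104]}
        System.out.println(split(ids, 3));
    }
}
```

A plain modulo is simple but remaps most keys when the number of instances changes, so rebalancing after adding or removing machines generally needs either data movement or a different scheme such as range or list splitting.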
  • 38. ScaleBase Supports • 3 table types – Master – Global – Split • Splitting according to Hash, List, Range • Rebalance, addition and removal of machines • Instance replication backup: Shadow and Master • Fully consistent 2-Phase Commit • Joins, Foreign Keys, Subqueries • DML, DDL, Batch updates, Prepared Statements • Aggregations, Group By, Order By, Auto Numbering, Timestamps
  • 39. Sample Code
        SELECT site_owner_id, count(*)
        FROM google.user_clicks
        WHERE country = 'BRAZIL'
        GROUP BY site_owner_id
      • site_owner_id is the split key • Perform the query on all DBs • Simple aggregation of results • No Code Change
  • 40. Sample Code
        SELECT country, count(*)
        FROM google.user_clicks
        GROUP BY country
      • Perform the query on all DBs • Aggregation of the aggregations • No Code Change
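The "aggregation of the aggregations" on slide 40 can be pictured as follows: each database instance runs the GROUP BY locally, and the per-instance counts for the same country are then summed. The sketch below is not ScaleBase's implementation; `CountMerger` is a hypothetical name and the partial results are made-up numbers used only to show the merge step.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: combine per-shard GROUP BY ... COUNT(*) results into one result. */
class CountMerger {
    /** Sums per-shard counts that share the same GROUP BY key. */
    static Map<String, Long> merge(List<Map<String, Long>> perShard) {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> partial : perShard) {
            partial.forEach((country, count) -> total.merge(country, count, Long::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up partial results, as if each shard had run the GROUP BY locally.
        List<Map<String, Long>> perShard = List.of(
                Map.of("BRAZIL", 10L, "FRANCE", 4L),
                Map.of("BRAZIL", 7L),
                Map.of("FRANCE", 1L, "ISRAEL", 3L));
        System.out.println(merge(perShard)); // {BRAZIL=17, FRANCE=5, ISRAEL=3}
    }
}
```

In the slide 39 query the GROUP BY column is the split key, so each group lives on exactly one instance and the results can simply be concatenated. Here the same country can appear on several instances, which is why a second summing pass is needed. Counts and sums merge this way directly; an average would need the per-shard sum and count rather than the per-shard average.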
  • 41. Sample Code
        PreparedStatement pstmt = conn.prepareStatement("INSERT INTO emp VALUES(?,?,?,?,?)");
        for (int i = 0; i < 10; i++) {
            pstmt.setInt(1, 300 + i);
            pstmt.setString(2, "Something" + i);
            pstmt.setDate(3, new Date(System.currentTimeMillis())); // Date here is java.sql.Date
            pstmt.setInt(4, i);
            pstmt.setLong(5, i);
            pstmt.addBatch();
        }
        int[] result = pstmt.executeBatch();
      • Is split key dynamic or static? • Each command is added to the correct DB, execution is on all relevant DBs • No Code Change
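One way to picture "each command is added to the correct DB" for the slide 41 batch: group the batched rows by target instance, run one sub-batch per instance, then put the per-instance update counts back into the order of the original batch so the caller still receives a single `int[]`. Everything below is a sketch under assumptions. The first column is taken to be the split key, the split function is modulo 4 (four instances, as in the deck's case studies), and `BatchSplitter` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: split a batch by shard and restore the original order of the results. */
class BatchSplitter {
    /** Groups batch positions by target shard (split key modulo shard count, an assumption). */
    static Map<Integer, List<Integer>> positionsByShard(int[] splitKeys, int shardCount) {
        Map<Integer, List<Integer>> byShard = new LinkedHashMap<>();
        for (int pos = 0; pos < splitKeys.length; pos++) {
            int shard = Math.floorMod(splitKeys[pos], shardCount);
            byShard.computeIfAbsent(shard, k -> new ArrayList<>()).add(pos);
        }
        return byShard;
    }

    /** Puts per-shard update counts back into the order of the original batch. */
    static int[] reassemble(Map<Integer, List<Integer>> positions,
                            Map<Integer, int[]> countsByShard, int batchSize) {
        int[] result = new int[batchSize];
        for (Map.Entry<Integer, List<Integer>> e : positions.entrySet()) {
            int[] counts = countsByShard.get(e.getKey());
            List<Integer> pos = e.getValue();
            for (int i = 0; i < pos.size(); i++) {
                result[pos.get(i)] = counts[i];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] keys = new int[10];
        for (int i = 0; i < 10; i++) {
            keys[i] = 300 + i; // same key values as the loop on the slide
        }
        // Prints {0=[0, 4, 8], 1=[1, 5, 9], 2=[2, 6], 3=[3, 7]}
        System.out.println(positionsByShard(keys, 4));
    }
}
```

Only the instances that appear as keys in the grouping need to execute anything, which matches the slide's "execution is on all relevant DBs".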
  • 42. ScaleBase Solution • Elastic SQL Database Scaling high-availability solution – Complete – Transparent – Super scalable – Out of the box – Non-intrusive – Flexible – Manageable
  • 43. With ScaleBase • Pay much less for hardware and Database licenses • Get more power, better data spreading and better availability • Real linear scalability • Go for real grid, cloud and virtualization • ScaleBase: – Is NOT an RDBMS. It works with any secure, highly available, archivable RDBMS (Oracle, DB2, MySQL, any!) – Does NOT require schema structure modifications – Does NOT require additional sophisticated hardware
  • 44. Moving To ScaleBase • Implementing ScaleBase is done in minutes • Just direct your application to your ScaleBase instance • Point ScaleBase at your original database and at the target database instances • ScaleBase will automatically migrate your schema and data to the new servers • After all data is transferred, ScaleBase will start working with the target database instances, giving you linear scalability with no downtime!
  • 45. Where ScaleBase Fits In • Cloud databases – Use SQL databases in the cloud, and get Scale Out features and high availability • High scale applications – Use your existing code, without migrating to NoSQL, and get unlimited SQL scalability
  • 47. Public Cloud • Scenario: – A startup developing a complex iPhone application running on EC2 – Requires high scaling option for SQL based database • Problem: – Amazon RDS offers only Scale Up option, no Scale Out • Solution: – Customer switched their Amazon RDS-based application to ScaleBase on EC2 – Gained transparent, linear, scaling – Running 4 RDS instances behind ScaleBase
  • 48. Private Cloud • Scenario: – A company selling devices that ‘ping’ home every 5 minutes – 8 digit number of devices sold • Problem: – Evaluated MySQL, Oracle - no single machine can support the required number of devices – Clustering options too expensive, limited scalability • Solution: – Customer moved to ScaleBase with no code changes – Gained linear scaling. Runs 4 MySQL databases behind ScaleBase