SQL, NoSQL, NewSQL?
 What's a developer to do?


Chris Richardson

Author of POJOs in Action
Founder of the original CloudFoundry.com

@crichardson
crichardson@vmware.com
Overall presentation goal


 The joy and pain of
    building Java
 applications that use
 NoSQL and NewSQL

About Chris




        http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/




About Chris


   Developer Advocate for
     CloudFoundry.com


 Signup at CloudFoundry.com
 using promo code EgyptJUG
Agenda
  Why NoSQL? NewSQL?
  Persisting entities
  Implementing queries




     3/18/12   Copyright (c) 2011 Chris Richardson. All rights reserved.
                                                                           Slide 10
Food to Go

  Take-out food delivery
   service
  “Launched” in 2006
  Used a relational
   database (naturally)




Success  Growth challenges
  Increasing domain model complexity
  Increasing traffic
  Increasing data volume
  Distribute across data centers


But relational databases make
         this difficult…
Problem: Complex object graphs

  [Figure: an object graph to be mapped onto a table with columns ID, COL1, COL2, COL…]

  Poor performance
  •  Many inserts
  •  Many joins
Problem: Semi-structured data

  Customer attribute table

  Customer_Id        Name     Value
  1                  Region   CA
  1                  Type     Bank
  …                  …        …




•  Lack of constraints
•  Poor query performance, e.g. multiple outer joins




Problem: Semi-structured data

     Customer table

Id   Name             Street          …     Other_Attributes
1    Acme Inc         180 Main              XML/JSON/Blob
2    Failed Bank      1 Wall Street
…    …                …




                                          Can’t be queried



Problem: Schema evolution

   Id          First_Name   Last_Name
   1           Maria        Doe
   2           John         Smith
   …           …            …
   9948429292 Ben           Grayson


Locks?
Application downtime?

   Id       First_Name      Last_Name   DOB
   1        Maria           Doe         10/14/38
   …
Problem: Scaling
  Moore’s law is your friend
                   BUT
  Scaling reads:
    Master/slave
    But beware of consistency issues
  Scaling writes
    Extremely difficult/impossible/expensive
    Vertical scaling is limited and requires $$
    Horizontal scaling is limited/requires $$
Problem: distribution



  [Figure: Datacenter 1 and Datacenter 2 each run an App and a DB; the two DBs are synchronized over a WAN]


Many databases don’t support this out of the box

Solution: Buy high end technology




   http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG
Solution: Hire more developers
  Application-level sharding
  Build your own replication middleware
  …




http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_4_series/madone_4_5
Solution: Use NoSQL
  Benefits
    Higher performance
    Higher scalability
    Richer data-model
    Schema-less

  Drawbacks
    Limited transactions
    Relaxed consistency
    Unconstrained data
MongoDB
  Document-oriented database
    JSON-style documents: Lists, Maps, primitives
    Schema-less
  Transaction = update of a single document
  Rich query language for dynamic queries
  Tunable writes: speed vs. reliability
  Highly scalable and available




MongoDB use cases
  Use cases
    High volume writes
    Complex data
    Semi-structured data
  Who is using it?
      Shutterfly, Foursquare
      Bit.ly, Intuit,
      SourceForge, NY Times
      GILT Groupe, Evite,
      SugarCRM

Apache Cassandra
  Column-oriented database/Extensible
   row store
    Think Row ~= java.util.SortedMap
  Transaction = update of a row
  Fast writes = append to a log
  Tunable reads/writes: consistency vs.
   latency/availability
  Extremely scalable
    Transparent and dynamic clustering
    Rack and datacenter aware data replication
  CQL = “SQL”-like DDL and DML
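
The consistency side of "tunable reads/writes" is often summarized by a quorum rule of thumb: with N replicas, a read that consults R replicas overlaps a write acknowledged by W replicas whenever R + W > N. The sketch below only encodes that arithmetic; the class and method names are illustrative and are not part of any Cassandra API.

```java
// Sketch of the quorum rule of thumb for replicated stores.
// Names are hypothetical; this is arithmetic only, not Cassandra client code.
class TunableConsistency {

    // True when every read set must share at least one replica with every write set.
    static boolean readsOverlapWrites(int replicas, int readReplicas, int writeReplicas) {
        return readReplicas + writeReplicas > replicas;
    }

    // Smallest majority of the replicas.
    static int quorum(int replicas) {
        return replicas / 2 + 1;
    }
}
```

For example, with three replicas, quorum reads and quorum writes (2 + 2 > 3) overlap, while reading and writing a single replica (1 + 1) does not, trading consistency for lower latency.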
Cassandra use cases
  Use cases
  •    Big data
  •    Multiple Data Center distributed database
  •    Persistent cache
  •    (Write intensive) Logging
  •    High-availability (writes)
  Who is using it
    Digg, Facebook, Twitter, Reddit, Rackspace
    Cloudkick, Cisco, SimpleGeo, Ooyala, OpenX
    “The largest production cluster has over 100
     TB of data in over 150 machines.” –
     Cassandra web site

Other NoSQL databases
Type                                   Examples

Extensible columns/Column-oriented     HBase, SimpleDB, DynamoDB
Graph                                  Neo4j
Key-value                              Redis, Membase
Document                               CouchDB

http://nosql-database.org/ lists 122+ NoSQL databases
Solution: Use NewSQL
  Relational databases with SQL and
   ACID transactions
                                      AND
  New and improved architecture
  Radically better scalability and
   performance

  NewSQL vendors: ScaleDB,
   NimbusDB, …, VoltDB
Stonebraker’s motivations

                   “…Current databases are designed for
                           1970s hardware …”

                    Stonebraker: http://www.slideshare.net/VoltDB/sql-myths-webinar




  Significant overhead in “…logging, latching,
   locking, B-tree, and buffer management
                  operations…”
  SIGMOD 08: Through the looking glass: http://dl.acm.org/citation.cfm?id=1376713




About VoltDB
  Open-source
  In-memory relational database
  Durability through replication; snapshots
   and logging
  Transparent partitioning
  Fast and scalable
     …VoltDB is very scalable; it should scale to 120
      partitions, 39 servers, and 1.6 million complex
     transactions per second at over 300 CPU cores…
      http://www.mysqlperformanceblog.com/2011/02/28/is-voltdb-really-as-scalable-as-they-claim/




The future is polyglot persistence




                                                                           e.g. Netflix
                                                                           •  RDBMS
                                                                           •  SimpleDB
                                                                           •  Cassandra
                                                                           •  Hadoop/HBase




   IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg



Spring Data is here to help




                                  For

                    NoSQL databases

http://www.springsource.org/spring-data



Spring Data sub-projects
  Various sub-projects
      SQL: Spring Data JPA, JDBC extensions
      Commons: Polyglot persistence
      Key-Value: Redis, Riak
      Document: MongoDB
      Graph: Neo4j
      GORM for NoSQL
  What you get:
    Wrapper classes analogous to JDBC
     template
    Generic Repository
    Cross-store persistence
    …
Proceed with caution
  Don’t commit to a
   NoSQL DB until you
   have done a
   significant POC
  Encapsulate your data
   access code so you
   can switch
  Hope that one day
   you won’t need ACID
   (or complex queries)
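
One way to read "encapsulate your data access code so you can switch" is to have the application depend only on a repository interface. The sketch below is a simplified illustration under that assumption: `RestaurantRepository`, `InMemoryRestaurantRepository` and `RestaurantLookup` are hypothetical names, and `Restaurant` is cut down to two fields.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the domain class; the real one has more fields.
class Restaurant {
    final int id;
    final String name;

    Restaurant(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// The application codes against this interface only.
interface RestaurantRepository {
    void add(Restaurant restaurant);
    Restaurant findDetailsById(int id);
}

// Hypothetical in-memory implementation; a MongoDB-, Cassandra- or
// VoltDB-backed class could replace it without touching the callers.
class InMemoryRestaurantRepository implements RestaurantRepository {
    private final Map<Integer, Restaurant> byId = new HashMap<>();

    @Override
    public void add(Restaurant restaurant) {
        byId.put(restaurant.id, restaurant);
    }

    @Override
    public Restaurant findDetailsById(int id) {
        return byId.get(id);
    }
}

// Caller that never sees which store is behind the interface.
class RestaurantLookup {
    private final RestaurantRepository repository;

    RestaurantLookup(RestaurantRepository repository) {
        this.repository = repository;
    }

    String nameOf(int id) {
        Restaurant r = repository.findDetailsById(id);
        return r == null ? null : r.name;
    }
}
```

An in-memory implementation like this is also handy for a proof of concept or for unit tests before committing to a particular database.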
Agenda
  Why NoSQL? NewSQL?
  Persisting entities
  Implementing queries




Food to Go – Place Order use case

1.  Customer enters delivery address and
    delivery time
2.  System displays available restaurants
3.  Customer picks restaurant
4.  System displays menu
5.  Customer selects menu items
6.  Customer places order



Food to Go – Domain model (partial)


class Restaurant {
  long id;
  String name;
  Set<String> serviceArea;
  Set<TimeRange> openingHours;
  List<MenuItem> menuItems;
}

class TimeRange {
  long id;
  int dayOfWeek;
  int openingTime;
  int closingTime;
}

class MenuItem {
  String name;
  double price;
}
Database schema
RESTAURANT table

ID                   Name                  …
1                    Ajanta
2                    Montclair Eggshop

RESTAURANT_ZIPCODE table

Restaurant_id            zipcode
1                        94707
1                        94619
2                        94611
2                        94619

RESTAURANT_TIME_RANGE table

Restaurant_id   dayOfWeek           openTime     closeTime
1               Monday              1130         1430
1               Monday              1730         2130
2               Tuesday             1130         …
How to implement the repository?
      public interface AvailableRestaurantRepository {

          void add(Restaurant restaurant);
          Restaurant findDetailsById(int id);
          …
      }




   [Figure: the Restaurant aggregate (Restaurant, TimeRange, MenuItem): how should it be persisted?]
MongoDB: persisting restaurants is easy
Server > Database: Food To Go > Collection: Restaurants

     {
         "_id" : ObjectId("4bddc2f49d1505567c6220a0"),
         "name": "Ajanta",
         "serviceArea": ["94619", "99999"],
         "openingHours": [
            {
               "dayOfWeek": 1,
               "open": 1130,
               "close": 1430
            },
            {
               "dayOfWeek": 2,
               "open": 1130,
               "close": 1430
            }, …
         ]
     }

BSON = binary JSON
Sequence of bytes on disk => fast i/o
Using the MongoDB CLI
> r = {name: 'Ajanta'}
> db.restaurants.save(r)
> r
{ "_id" : ObjectId("4e555dd9646e338dca11710c"), "name" : "Ajanta" }

>   r = db.restaurants.findOne({name:"Ajanta"})
{   "_id" : ObjectId("4e555dd9646e338dca11710c"), "name" : "Ajanta" }
>   r.type = "Indian"
>   db.restaurants.save(r)

> db.restaurants.update({name:"Ajanta"},
                    {$set: {name:"Ajanta Restaurant"},
                     $push: { menuItems: {name: "Chicken Vindaloo"}}})
> db.restaurants.find()
{ "_id" : ObjectId("4e555dd9646e338dca11710c"), "menuItems" :
    [ { "name" : "Chicken Vindaloo" } ], "name" : "Ajanta Restaurant",
    "type" : "Indian" }
> db.restaurants.remove({_id: r._id})
Spring Data for Mongo code
@Repository
public class AvailableRestaurantRepositoryMongoDbImpl
          implements AvailableRestaurantRepository {

 public static final String AVAILABLE_RESTAURANTS_COLLECTION = "availableRestaurants";

 @Autowired
 private MongoTemplate mongoTemplate;

 @Override
 public void add(Restaurant restaurant) {
   mongoTemplate.insert(restaurant, AVAILABLE_RESTAURANTS_COLLECTION);
 }

 @Override
  public Restaurant findDetailsById(int id) {
     return mongoTemplate.findOne(new Query(where("_id").is(id)),
                                   Restaurant.class,
                                   AVAILABLE_RESTAURANTS_COLLECTION);
  }
}


Spring Configuration
@Configuration
public class MongoConfig extends AbstractDatabaseConfig {

 @Value("#{mongoDbProperties.databaseName}")
 private String mongoDbDatabase;

@Bean
public Mongo mongo() throws UnknownHostException, MongoException {
  return new Mongo(databaseHostName);
}

 @Bean
 public MongoTemplate mongoTemplate(Mongo mongo) throws Exception {
    MongoTemplate mongoTemplate = new MongoTemplate(mongo, mongoDbDatabase);
    mongoTemplate.setWriteConcern(WriteConcern.SAFE);
    mongoTemplate.setWriteResultChecking(WriteResultChecking.EXCEPTION);
    return mongoTemplate;
  }
}




Cassandra data model
           Column           Column
   Row                       Value
            Name                     Timestamp
   Key

                                                                    Keyspace
                                                            Column Family

     K1      N1        V1     TS1    N2   V2     TS2   N3    V3   TS3




     K2      N1        V1     TS1    N2   V2     TS2   N3    V3   TS3




Column name/value: number, string, Boolean, timestamp, and composite

Cassandra– inserting/updating data
                                                                   Column Family

        K1    N1   V1   TS1   N2   V2   TS2   N3   V3   TS3




    …



Idempotent= transaction                 CF.insert(key=K1, (N4, V4, TS4), …)


                                                                   Column Family

        K1    N1   V1   TS1   N2   V2   TS2   N3   V3   TS3   N4   V4   TS4




    …


Cassandra– retrieving data
                                                              Column Family

    K1   N1   V1   TS1   N2   V2   TS2   N3   V3   TS3   N4   V4   TS4




…


                    CF.slice(key=K1, startColumn=N2, endColumn=N4)



 K1                      N2   V2   TS2   N3   V3   TS3   N4   V4   TS4




         Cassandra has secondary indexes but they
             aren’t helpful for these use cases
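
The insert and slice operations above can be imitated with the "Row ~= java.util.SortedMap" analogy. This is an in-memory model only, under the assumption that column names are strings and timestamps are ignored; `ColumnFamilySketch` is a hypothetical name, not a Cassandra or Hector class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

// In-memory model: row key -> columns sorted by column name. Timestamps are omitted.
class ColumnFamilySketch {
    private final Map<String, NavigableMap<String, String>> rows = new HashMap<>();

    // Writing the same column again leaves the row in the same state,
    // which is what makes the insert idempotent.
    void insert(String key, String columnName, String value) {
        rows.computeIfAbsent(key, k -> new TreeMap<>()).put(columnName, value);
    }

    // Returns the columns of one row whose names lie in [startColumn, endColumn],
    // in sorted order. startColumn must not sort after endColumn.
    SortedMap<String, String> slice(String key, String startColumn, String endColumn) {
        NavigableMap<String, String> row = rows.get(key);
        if (row == null) {
            return new TreeMap<>();
        }
        return row.subMap(startColumn, true, endColumn, true);
    }
}
```

A slice only ever looks inside one row, so anything that must be range-queried together has to live under the same row key.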

Option #1: Use a column per attribute

       Column Name = path/expression to access property value

Column Family: RestaurantDetails

Row key   Column name                  Column value
1         name                         Ajanta
1         type                         indian
1         serviceArea[0]               94619
1         serviceArea[1]               94707
1         openingHours[0].dayOfWeek    Monday
1         openingHours[0].open         1130
1         openingHours[0].close        1430

2         name                         Egg shop
2         type                         Break Fast
2         serviceArea[0]               94611
2         serviceArea[1]               94619
2         openingHours[0].dayOfWeek    Monday
2         openingHours[0].open         0830
2         openingHours[0].close        1430
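
Option #1 amounts to flattening an object into path-named columns. The sketch below shows the idea for a few properties only; `ColumnPerAttribute.toColumns` is a hypothetical helper, and a real mapper would also walk nested objects such as the opening hours.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper: turns a few restaurant properties into
// (column name -> column value) pairs, using path expressions as names.
class ColumnPerAttribute {

    static SortedMap<String, String> toColumns(String name, String type, List<String> serviceArea) {
        SortedMap<String, String> columns = new TreeMap<>();
        columns.put("name", name);
        columns.put("type", type);
        // One column per list element, indexed in the column name.
        for (int i = 0; i < serviceArea.size(); i++) {
            columns.put("serviceArea[" + i + "]", serviceArea.get(i));
        }
        return columns;
    }
}
```

The reverse mapping has to parse those path expressions again, which is one reason a single serialized column can be simpler.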
Option #2: Use a single column
      Column value = serialized object graph, e.g. JSON

Column Family: RestaurantDetails

Row key   Column name   Column value
1         attributes    { name: "Ajanta", … }
2         attributes    { name: "Montclair Eggshop", … }

  ✔
Cassandra code
public class AvailableRestaurantRepositoryCassandraKeyImpl
          implements AvailableRestaurantRepository {

@Autowired
private CassandraTemplate cassandraTemplate;   // home-grown wrapper class

public void add(Restaurant restaurant) {
  cassandraTemplate.insertEntity(keyspace,
                                 RESTAURANT_DETAILS_CF,
                                 restaurant);
}

public Restaurant findDetailsById(int id) {
  String key = Integer.toString(id);
  return cassandraTemplate.findEntity(Restaurant.class,
         keyspace, key, RESTAURANT_DETAILS_CF);
  …
}

…                                                            http://en.wikipedia.org/wiki/Hector

Using VoltDB
  Use the original schema
  Standard SQL statements

                   BUT YOU MUST


  Write stored procedures and invoke
   them using proprietary interface
  Partition your data


About VoltDB stored procedures
  Key part of VoltDB
  Replication = executing stored
   procedure on replica
  Logging = log stored procedure
   invocation
  Stored procedure invocation =
   transaction



About partitioning

 Partition column


                                                     RESTAURANT table

 ID                  Name                                                       …
 1                   Ajanta
 2                   Eggshop
 …




Example cluster


 Partition 1a                            Partition 2a                                       Partition 3a



   ID           Name     …                  ID            Name              …                 ID           Name     …
   1            Ajanta                      2             Eggshop                             …            ..
   …                                        …                                                 …



 Partition 3b                            Partition 1b                                       Partition 2b



   ID           Name     …                ID             Name          …                     ID        Name          …
   …            ..                        1              Ajanta                              2         Eggshop
   …                                      …                                                  …



VoltDB Server 1                      VoltDB Server 2                                       VoltDB Server 3


Single partition procedure: FAST
   SELECT * FROM RESTAURANT WHERE ID = 1


 High-performance lock free code



  ID   Name     …                  ID            Name              …                ID   Name         …
  1    Ajanta                      2             Eggshop                            …    ..
  …                                …                                                …




        …                                           …                                         …



VoltDB Server 1             VoltDB Server 2                                       VoltDB Server 3

Multi-partition procedure: SLOWER
   SELECT * FROM RESTAURANT WHERE NAME = 'Ajanta'


                    Communication/Coordination overhead


  ID   Name     …                      ID            Name              …                ID   Name         …
  1    Ajanta                          2             Eggshop                            …    ..
  …                                    …                                                …




        …                                               …                                         …



VoltDB Server 1                 VoltDB Server 2                                       VoltDB Server 3
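
The difference between the single-partition and multi-partition cases can be illustrated with a toy model. This is not VoltDB code: `PartitionedTable` is a hypothetical class, partitions are plain maps, and the partition is chosen by a simple modulo on the ID.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a RESTAURANT table partitioned on its ID column.
class PartitionedTable {
    private final List<Map<Integer, String>> partitions = new ArrayList<>();
    int partitionsTouched; // partitions visited by the most recent query

    PartitionedTable(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    private Map<Integer, String> partitionFor(int id) {
        return partitions.get(Math.floorMod(id, partitions.size()));
    }

    void insert(int id, String name) {
        partitionFor(id).put(id, name);
    }

    // WHERE ID = ?  : the partition column pins the query to one partition.
    String findById(int id) {
        partitionsTouched = 1;
        return partitionFor(id).get(id);
    }

    // WHERE NAME = ?  : NAME is not the partition column, so every partition is scanned.
    List<Integer> findIdsByName(String name) {
        partitionsTouched = 0;
        List<Integer> ids = new ArrayList<>();
        for (Map<Integer, String> partition : partitions) {
            partitionsTouched++;
            for (Map.Entry<Integer, String> row : partition.entrySet()) {
                if (row.getValue().equals(name)) {
                    ids.add(row.getKey());
                }
            }
        }
        return ids;
    }
}
```

In a real cluster the cost of the second query is not just the scan but the coordination between servers, which this model does not attempt to show.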


Chosen partitioning scheme

<partitions>
   <partition   table="restaurant" column="id"/>
   <partition   table="service_area" column="restaurant_id"/>
   <partition   table="menu_item" column="restaurant_id"/>
   <partition   table="time_range" column="restaurant_id"/>
   <partition   table="available_time_range" column="restaurant_id"/>
</partitions>




  Performance is excellent: much
  faster than MySQL

Stored procedure – AddRestaurant
@ProcInfo( singlePartition = true, partitionInfo = "Restaurant.id: 0")
public class AddRestaurant extends VoltProcedure {
    public final SQLStmt insertRestaurant =
              new SQLStmt("INSERT INTO Restaurant VALUES (?,?);");
    public final SQLStmt insertServiceArea =
              new SQLStmt("INSERT INTO service_area VALUES (?,?);");
    public final SQLStmt insertOpeningTimes =
              new SQLStmt("INSERT INTO time_range VALUES (?,?,?,?);");
    public final SQLStmt insertMenuItem =
              new SQLStmt("INSERT INTO menu_item VALUES (?,?,?);");
public long run(int id, String name, String[] serviceArea, long[] daysOfWeek, long[] openingTimes,
                                 long[] closingTimes, String[] names, double[] prices) {
    voltQueueSQL(insertRestaurant, id, name);
    for (String zipCode : serviceArea)
      voltQueueSQL(insertServiceArea, id, zipCode);
    for (int i = 0; i < daysOfWeek.length ; i++)
      voltQueueSQL(insertOpeningTimes, id, daysOfWeek[i], openingTimes[i], closingTimes[i]);
    for (int i = 0; i < names.length ; i++)
      voltQueueSQL(insertMenuItem, id, names[i], prices[i]);
    voltExecuteSQL(true);
     return 0;
    }
}
VoltDb repository – add()
@Repository
public class AvailableRestaurantRepositoryVoltdbImpl
           implements AvailableRestaurantRepository {

    @Autowired
    private VoltDbTemplate voltDbTemplate;

    @Override
    public void add(Restaurant restaurant) {
      invokeRestaurantProcedure("AddRestaurant", restaurant);
    }

    private void invokeRestaurantProcedure(String procedureName, Restaurant restaurant) {
      // Flatten the Restaurant aggregate into arrays
      Object[] serviceArea = restaurant.getServiceArea().toArray();
      long[][] openingHours = toArray(restaurant.getOpeningHours());
      Object[][] menuItems = toArray(restaurant.getMenuItems());

      voltDbTemplate.update(procedureName, restaurant.getId(), restaurant.getName(),
                                serviceArea, openingHours[0], openingHours[1],
                                openingHours[2], menuItems[0], menuItems[1]);
    }
    …
}
VoltDbTemplate wrapper class
public class VoltDbTemplate {

   private Client client;   // VoltDB client API

   public VoltDbTemplate(Client client) {
     this.client = client;
   }

   public void update(String procedureName, Object... params) {
     try {
       ClientResponse x =
               client.callProcedure(procedureName, params);
       …
     } catch (Exception e) {
       throw new RuntimeException(e);
     }
   }
}
VoltDb server configuration
voltdb-project.xml:

<?xml version="1.0"?>
<project>
  <info>
     <name>Food To Go</name>
     ...
  </info>
  <database>
     <schemas>
         <schema path='schema.sql' />
     </schemas>
     <partitions>
           <partition table="restaurant" column="id"/>
            ...
     </partitions>
     <procedures>
        <procedure class='net.chrisrichardson.foodToGo.newsql.voltdb.procs.AddRestaurant' />
        ...
     </procedures>
  </database>
</project>

deployment.xml:

<deployment>
  <cluster hostcount="1"
           sitesperhost="5" kfactor="0" />
</deployment>


voltcompiler target/classes \
   src/main/resources/sql/voltdb-project.xml foodtogo.jar


bin/voltdb leader localhost catalog foodtogo.jar deployment deployment.xml
Performance
 Benchmarking is still work in
 progress but so far
 http://www.youtube.com/watch?v=b2F-DItXtZs




                                    Mongo                            Cassandra                    VoltDB
Insert for PK                       Awesome                          Fast*                        Awesome
Find by PK                          Awesome                          Fast                         Incredible




                          * Cassandra can be clustered for improved write performance




Agenda
  Why NoSQL? NewSQL?
  Persisting entities
  Implementing queries




Finding available restaurants
Available restaurants =
      Serve the zip code of the delivery address
AND
      Are open at the delivery time

public interface AvailableRestaurantRepository {

    List<AvailableRestaurant>
          findAvailableRestaurants(Address deliveryAddress,
                                   Date deliveryTime); …
}
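
Before looking at any particular database, the availability rule itself can be written as a plain in-memory predicate. The sketch below is an illustration only: `OpenRange`, `CandidateRestaurant` and `AvailabilityFilter` are hypothetical names, days are encoded as integers (1 = Monday is an assumption), and times are integers such as 1815.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OpenRange {
    final int dayOfWeek;
    final int openingTime;
    final int closingTime;

    OpenRange(int dayOfWeek, int openingTime, int closingTime) {
        this.dayOfWeek = dayOfWeek;
        this.openingTime = openingTime;
        this.closingTime = closingTime;
    }
}

class CandidateRestaurant {
    final String name;
    final Set<String> serviceArea;
    final List<OpenRange> openingHours;

    CandidateRestaurant(String name, List<String> zipCodes, List<OpenRange> openingHours) {
        this.name = name;
        this.serviceArea = new HashSet<>(zipCodes);
        this.openingHours = openingHours;
    }

    // Available = serves the zip code AND has one time range covering the delivery time.
    boolean isAvailable(String zipCode, int dayOfWeek, int timeOfDay) {
        if (!serviceArea.contains(zipCode)) {
            return false;
        }
        for (OpenRange range : openingHours) {
            if (range.dayOfWeek == dayOfWeek
                    && range.openingTime <= timeOfDay
                    && timeOfDay <= range.closingTime) {
                return true;
            }
        }
        return false;
    }
}

class AvailabilityFilter {
    // Returns the names of the available restaurants, in input order.
    static List<String> findAvailable(List<CandidateRestaurant> all,
                                      String zipCode, int dayOfWeek, int timeOfDay) {
        List<String> names = new ArrayList<>();
        for (CandidateRestaurant r : all) {
            if (r.isAvailable(zipCode, dayOfWeek, timeOfDay)) {
                names.add(r.name);
            }
        }
        return names;
    }
}
```

Scanning every restaurant like this does not scale, which is why the rest of the section is about pushing the same predicate into each database's query mechanism.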




Finding available restaurants on Monday,
6.15pm for 94619 zip

Straightforward three-way join:

select r.*
from restaurant r
 inner join restaurant_time_range tr
   on r.id = tr.restaurant_id
 inner join restaurant_zipcode sa
   on r.id = sa.restaurant_id
where '94619' = sa.zip_code
and tr.day_of_week = 'monday'
and tr.openingtime <= 1815
and 1815 <= tr.closingtime
MongoDB = easy to query
Find a restaurant that serves the 94619 zip code and is open at 6.15pm on a Monday:

{
    serviceArea: "94619",
    openingHours: {
      $elemMatch : {
           "dayOfWeek" : "Monday",
           "open": {$lte: 1815},
           "close": {$gte: 1815}
      }
    }
}

DBCursor cursor = collection.find(qbeObject);
while (cursor.hasNext()) {
   DBObject o = cursor.next();
   …
}


db.availableRestaurants.ensureIndex({serviceArea: 1})
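
Why the query needs $elemMatch can be shown without a database. $elemMatch requires a single array element to satisfy all the conditions, whereas separate conditions on fields of an array may each be satisfied by a different element. The sketch below models both behaviours in memory; `OpeningHours` and `ElemMatchSketch` are hypothetical names and this is not MongoDB driver code.

```java
import java.util.List;

class OpeningHours {
    final String dayOfWeek;
    final int open;
    final int close;

    OpeningHours(String dayOfWeek, int open, int close) {
        this.dayOfWeek = dayOfWeek;
        this.open = open;
        this.close = close;
    }
}

class ElemMatchSketch {

    // Like $elemMatch: one element must satisfy all three conditions.
    static boolean sameElementMatches(List<OpeningHours> hours, String day, int time) {
        for (OpeningHours h : hours) {
            if (h.dayOfWeek.equals(day) && h.open <= time && h.close >= time) {
                return true;
            }
        }
        return false;
    }

    // Like three independent conditions: each may be met by a different element.
    static boolean anyElementsMatch(List<OpeningHours> hours, String day, int time) {
        boolean dayOk = false;
        boolean openOk = false;
        boolean closeOk = false;
        for (OpeningHours h : hours) {
            dayOk |= h.dayOfWeek.equals(day);
            openOk |= h.open <= time;
            closeOk |= h.close >= time;
        }
        return dayOk && openOk && closeOk;
    }
}
```

For a restaurant open Monday 1130 to 1430 and Tuesday 1730 to 2130, a Monday 1815 query is wrongly satisfied by the independent conditions (Monday from the first range, the closing time from the second) but correctly rejected by the same-element check.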

MongoTemplate-based code
@Repository
public class AvailableRestaurantRepositoryMongoDbImpl
                               implements AvailableRestaurantRepository {

@Autowired private MongoTemplate mongoTemplate;

@Override
public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress,
                                                          Date deliveryTime) {
 int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
 int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);

    Query query = new Query(where("serviceArea").is(deliveryAddress.getZip())
         .and("openingHours").elemMatch(where("dayOfWeek").is(dayOfWeek)
                .and("openingTime").lte(timeOfDay)
                .and("closingTime").gte(timeOfDay)));

    return mongoTemplate.find(query, AvailableRestaurant.class,
                            AVAILABLE_RESTAURANTS_COLLECTION);
}

              mongoTemplate.ensureIndex("availableRestaurants",
                 new Index().on("serviceArea", Order.ASCENDING));
BUT how to do this with Cassandra??!
  How can Cassandra support a query that has
     A 3-way join
     Multiple =
     > and <
  ?

 We need to implement an index

              Queries, rather than the data
              model, drive NoSQL
              database design
... And use a slice operation

columnFamily.slice(key=keyVal, startColumn=startVal, endColumn=endVal)

             is equivalent to

             select *
             from columnFamily
             where key = keyVal
               and col >= startVal
               and col <= endVal
Simplification #1: Denormalization
Restaurant_id   Day_of_week   Open_time   Close_time   Zip_code

1               Monday        1130        1430         94707
1               Monday        1130        1430         94619
1               Monday        1730        2130         94707
1               Monday        1730        2130         94619
2               Monday        0700        1430         94619
…



        SELECT restaurant_id
        FROM time_range_zip_code
        WHERE day_of_week = 'Monday'
          AND zip_code = 94619
          AND 1815 < close_time
          AND open_time < 1815

        Simpler query:
          No joins
          Two = and two <
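
The denormalized table is the cross product of a restaurant's time ranges and zip codes. A minimal sketch of producing those rows follows; `Denormalizer` is a hypothetical helper and the rows are emitted as comma-separated text purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds the rows of the denormalized table.
class Denormalizer {

    // Each timeRanges entry is {dayOfWeek, openTime, closeTime}, kept as text.
    // Emits one "restaurantId,day,open,close,zip" row per (time range, zip code) pair.
    static List<String> rows(int restaurantId, List<String[]> timeRanges, List<String> zipCodes) {
        List<String> rows = new ArrayList<>();
        for (String[] range : timeRanges) {
            for (String zip : zipCodes) {
                rows.add(restaurantId + "," + range[0] + "," + range[1] + "," + range[2] + "," + zip);
            }
        }
        return rows;
    }
}
```

The row count grows as (time ranges × zip codes) per restaurant, which is the storage price paid for removing the joins.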

Simplification #2: Application filtering


SELECT restaurant_id, open_time
 FROM time_range_zip_code
 WHERE day_of_week = 'Monday'
   AND zip_code = 94619
   AND 1815 < close_time

 The application then filters the returned rows on open_time < 1815.

 Even simpler query
 •  No joins
 •  Two = and one <
Simplification #3: Eliminate multiple =’s with
concatenation

  Restaurant_id    Zip_dow        Open_time   Close_time

  1                94707:Monday   1130        1430
  1                94619:Monday   1130        1430
  1                94707:Monday   1730        2130
  1                94619:Monday   1730        2130
  2                94619:Monday   0700        1430
  …


 SELECT restaurant_id, open_time
  FROM time_range_zip_code
  WHERE zip_code_day_of_week = '94619:Monday'   -- key (the Zip_dow column above)
    AND 1815 < close_time                       -- range
Column family with composite column
   names as an index
      Restaurant_id     Zip_dow              Open_time            Close_time

      1                 94707:Monday         1130                 1430
      1                 94619:Monday         1130                 1430
      1                 94707:Monday         1730                 2130
      1                 94619:Monday         1730                 2130
      2                 94619:Monday         0700                 1430
      …




Column Family: AvailableRestaurants

Row key          Column name (close_time, open_time, restaurant_id)   Column value
94619:Monday     (1430,0700,2)                                        JSON for Egg
94619:Monday     (1430,1130,1)                                        JSON for Ajanta
94619:Monday     (2130,1730,1)                                        JSON for Ajanta
Querying with a slice
Column Family: AvailableRestaurants

Row key          Column name      Column value
94619:Monday     (1430,0700,2)    JSON for Egg
94619:Monday     (1430,1130,1)    JSON for Ajanta
94619:Monday     (2130,1730,1)    JSON for Ajanta

 slice(key = 94619:Monday, sliceStart = (1815, *, *), sliceEnd = (2359, *, *))

returns:

94619:Monday     (2130,1730,1)    JSON for Ajanta

18:15 is after 17:30 -> {Ajanta}
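The slice-then-filter lookup can be sketched in memory as follows. This is an illustration only: `AvailabilityIndex` and `ColumnName` are hypothetical names, a `TreeMap` stands in for the column family, the wildcard components are approximated with `Integer.MIN_VALUE`/`Integer.MAX_VALUE`, and both slice bounds and the open-time check are assumed to be inclusive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class AvailabilityIndex {
    /** Composite column name (closeTime, openTime, restaurantId), ordered component by component. */
    static final class ColumnName implements Comparable<ColumnName> {
        final int closeTime;
        final int openTime;
        final int restaurantId;

        ColumnName(int closeTime, int openTime, int restaurantId) {
            this.closeTime = closeTime;
            this.openTime = openTime;
            this.restaurantId = restaurantId;
        }

        @Override
        public int compareTo(ColumnName other) {
            int c = Integer.compare(closeTime, other.closeTime);
            if (c == 0) c = Integer.compare(openTime, other.openTime);
            if (c == 0) c = Integer.compare(restaurantId, other.restaurantId);
            return c;
        }
    }

    // Row key "zip:day" -> columns sorted by composite name.
    private final Map<String, NavigableMap<ColumnName, String>> rows = new HashMap<>();

    void add(String zipCode, String dayOfWeek, int openTime, int closeTime, int restaurantId, String json) {
        rows.computeIfAbsent(zipCode + ":" + dayOfWeek, k -> new TreeMap<>())
            .put(new ColumnName(closeTime, openTime, restaurantId), json);
    }

    /** Assumes 0 <= timeOfDay <= 2359 (HHmm). */
    List<String> findAvailable(String zipCode, String dayOfWeek, int timeOfDay) {
        List<String> result = new ArrayList<>();
        NavigableMap<ColumnName, String> row = rows.get(zipCode + ":" + dayOfWeek);
        if (row == null) {
            return result;
        }
        // Slice: every column whose closing time is at or after the delivery time.
        ColumnName start = new ColumnName(timeOfDay, Integer.MIN_VALUE, Integer.MIN_VALUE);
        ColumnName end = new ColumnName(2359, Integer.MAX_VALUE, Integer.MAX_VALUE);
        for (Map.Entry<ColumnName, String> e : row.subMap(start, true, end, true).entrySet()) {
            // Application-side filter: the restaurant must already be open.
            if (e.getKey().openTime <= timeOfDay) {
                result.add(e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        AvailabilityIndex index = new AvailabilityIndex();
        index.add("94619", "Monday", 700, 1430, 2, "Egg");
        index.add("94619", "Monday", 1130, 1430, 1, "Ajanta");
        index.add("94619", "Monday", 1730, 2130, 1, "Ajanta");
        System.out.println(index.findAvailable("94619", "Monday", 1815)); // prints [Ajanta]
    }
}
```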
Needs a few pages of code
 // Maintaining the index: one column per (zip code, time range) of the restaurant
 private void insertAvailability(Restaurant restaurant) {
   for (String zipCode : (Set<String>) restaurant.getServiceArea()) {
     for (TimeRange tr : (Set<TimeRange>) restaurant.getOpeningHours()) {
       String dayOfWeek = format2(tr.getDayOfWeek());
       String openingTime = format4(tr.getOpeningTime());
       String closingTime = format4(tr.getClosingTime());
       String restaurantId = format8(restaurant.getId());

       String key = formatKey(zipCode, dayOfWeek);
       String columnValue = toJson(restaurant);

       // Composite column name: (closingTime, openingTime, restaurantId)
       Composite columnName = new Composite();
       columnName.add(0, closingTime);
       columnName.add(1, openingTime);
       columnName.add(2, restaurantId);

       ColumnFamilyUpdater<String, Composite> updater
           = compositeCloseTemplate.createUpdater(key);
       updater.setString(columnName, columnValue);
       compositeCloseTemplate.update(updater);
     }
   }
 }

 // Querying the index: slice on closing time, then filter on opening time in the application
 @Override
 public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
   int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);
   int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
   String zipCode = deliveryAddress.getZip();
   String key = formatKey(zipCode, format2(dayOfWeek));

   HSlicePredicate<Composite> predicate = new HSlicePredicate<Composite>(new CompositeSerializer());
   Composite start = new Composite();
   Composite finish = new Composite();
   start.addComponent(0, format4(timeOfDay), ComponentEquality.GREATER_THAN_EQUAL);
   finish.addComponent(0, format4(2359), ComponentEquality.GREATER_THAN_EQUAL);
   predicate.setRange(start, finish, false, 100);

   final List<AvailableRestaurantIndexEntry> closingAfter = new ArrayList<AvailableRestaurantIndexEntry>();

   ColumnFamilyRowMapper<String, Composite, Object> mapper = new ColumnFamilyRowMapper<String, Composite, Object>() {

     @Override
     public Object mapRow(ColumnFamilyResult<String, Composite> results) {
       for (Composite columnName : results.getColumnNames()) {
         String openTime = columnName.get(1, new StringSerializer());
         String restaurantId = columnName.get(2, new StringSerializer());
         closingAfter.add(new AvailableRestaurantIndexEntry(openTime, restaurantId, results.getString(columnName)));
       }
       return null;
     }
   };

   compositeCloseTemplate.queryColumns(key, predicate, mapper);

   List<AvailableRestaurant> result = new LinkedList<AvailableRestaurant>();
   for (AvailableRestaurantIndexEntry trIdAndAvailableRestaurant : closingAfter) {
     if (trIdAndAvailableRestaurant.isOpenBefore(timeOfDay))
       result.add(trIdAndAvailableRestaurant.getAvailableRestaurant());
   }

   return result;
 }
What did I just do to query the data?
  Wrote code to maintain an index
  Reduced performance due to extra
   writes




Mongo vs. Cassandra
[Diagram: two data centers, DC1 and DC2, each with its own client]

MongoDB
  DC1: Shard A Master, DC1 Client
  DC2: Shard B Master, DC2 Client
  Link between DC1 and DC2: Remote

Cassandra
  DC1: Cassandra, DC1 Client
  DC2: Cassandra, DC2 Client
  Link between DC1 and DC2: Async or Sync
VoltDB - attempt #1
@ProcInfo( singlePartition = false)
public class FindAvailableRestaurants extends VoltProcedure { ... }

ERROR 10:12:03,251 [main] COMPILER: Failed to plan for statement
type(findAvailableRestaurants_with_join) select r.* from restaurant
r,time_range tr, service_area sa Where ? = sa.zip_code and r.id
=tr.restaurant_id and r.id = sa.restaurant_id and tr.day_of_week=?
and tr.open_time <= ? and ? <= tr.close_time Error: "Unable to plan
for statement. Likely statement is joining two partitioned tables in a
multi-partition statement. This is not supported at this time."
ERROR 10:12:03,251 [main] COMPILER: Catalog compilation failed.




                       Bummer!
VoltDB - attempt #2
@ProcInfo( singlePartition = true, partitionInfo = "Restaurant.id: 0")
public class AddRestaurant extends VoltProcedure {

  public final SQLStmt insertAvailable =
      new SQLStmt("INSERT INTO available_time_range VALUES (?,?,?, ?, ?, ?);");

  public long run(....) {
    ...
    for (int i = 0; i < daysOfWeek.length ; i++) {
      voltQueueSQL(insertOpeningTimes, id, daysOfWeek[i], openingTimes[i], closingTimes[i]);
      // One denormalized row per (time range, zip code)
      for (String zipCode : serviceArea) {
        voltQueueSQL(insertAvailable, id, daysOfWeek[i], openingTimes[i],
                     closingTimes[i], zipCode, name);
      }
    }
    ...
    voltExecuteSQL(true);
    return 0;
  }
}

public final SQLStmt findAvailableRestaurants_denorm = new SQLStmt(
    "select restaurant_id, name from available_time_range tr " +
    "where ? = tr.zip_code " +
    "and tr.day_of_week=? " +
    "and tr.open_time <= ? " +
    " and ? <= tr.close_time ");


                   Works but queries are only slightly
                          faster than MySQL!
VoltDB - attempt #3
<partitions>
   ...
   <partition table="available_time_range" column="zip_code"/>
</partitions>

@ProcInfo( singlePartition = false, ...)
public class AddRestaurant extends VoltProcedure { ... }

@ProcInfo( singlePartition = true,
             partitionInfo = "available_time_range.zip_code: 0")
public class FindAvailableRestaurants extends VoltProcedure { ... }


Queries are really fast but inserts are not :-(

Partitioning scheme: optimal for some use
cases but not others
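One way to see the trade-off: with available_time_range partitioned on zip_code, a find for one delivery zip code needs a single partition, while adding a restaurant writes one row per zip code in its service area and so may touch several. The sketch below only simulates that routing; `hashCode` modulo the partition count is a stand-in assumption, not VoltDB's actual hashing, and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ZipCodePartitioning {
    /** Stand-in for the database's partitioning function (assumption: simple hash modulo). */
    static int partitionFor(String zipCode, int partitionCount) {
        return Math.floorMod(zipCode.hashCode(), partitionCount);
    }

    /** A find for one delivery zip code reads from exactly one partition. */
    static Set<Integer> partitionsTouchedByFind(String zipCode, int partitionCount) {
        Set<Integer> touched = new TreeSet<>();
        touched.add(partitionFor(zipCode, partitionCount));
        return touched;
    }

    /** Adding a restaurant writes a row per zip code, so it can span several partitions. */
    static Set<Integer> partitionsTouchedByAdd(List<String> serviceArea, int partitionCount) {
        Set<Integer> touched = new TreeSet<>();
        for (String zipCode : serviceArea) {
            touched.add(partitionFor(zipCode, partitionCount));
        }
        return touched;
    }

    public static void main(String[] args) {
        int partitions = 3;
        System.out.println("find touches " + partitionsTouchedByFind("94619", partitions));
        System.out.println("add touches  " + partitionsTouchedByAdd(Arrays.asList("94619", "94707"), partitions));
    }
}
```

Partitioning on the restaurant id instead would reverse the picture: inserts become single-partition and the find has to visit every partition.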
Performance
Benchmarking is still a work in progress, but so far:

                               Mongo      Cassandra   VoltDB
Insert for find available      Awesome    Ok*         Ok
Find available restaurants     Ok         Ok          Awesome

* Cassandra can be clustered for improved write performance
Is NewSQL/NoSQL a good fit for Food To Go?

                                   Cassandra                        Mongo                         VoltDB
Representing restaurants           Easy                             Very easy                     Easy
PK-lookup                          Easy                             Easy                          Easy
Find available restaurants query   Lots of code                     Easy                          Tricky
Use as SOR (system of record)      Yes, if distribution required    Yes, for single datacenter    Yes
Summary…
  Relational databases are great BUT there
   are limitations
  Each NoSQL database solves some
   problems BUT
      Limited transactions: NoSQL = NoACID
      One day needing ACID -> major rewrite
      Query-driven, denormalized database design
      …
  NewSQL databases such as VoltDB provide
   SQL, ACID transactions and incredible
   performance BUT
      Not all operations are fast
      Non-JDBC API
… Summary
  Very carefully pick the NewSQL/
   NoSQL DB for your application
  Consider a polyglot persistence
   architecture
  Encapsulate your data access code so
   you can switch
  Startups = avoid NewSQL/NoSQL for
   shorter time to market?
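To make "encapsulate your data access code" concrete, here is a minimal sketch of hiding the store behind an interface. All names (`AvailabilityRepository`, `InMemoryAvailabilityRepository`, `DeliveryService`) are hypothetical and not from the talk; the in-memory class merely stands in for a MongoDB, Cassandra or VoltDB implementation of the same interface.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical data-access boundary: callers never see which database sits behind it. */
interface AvailabilityRepository {
    void addAvailability(String restaurantName, String zipCode, String dayOfWeek, int openTime, int closeTime);

    List<String> findAvailableRestaurants(String zipCode, String dayOfWeek, int timeOfDay);
}

/** In-memory implementation; a database-backed one would implement the same two methods. */
class InMemoryAvailabilityRepository implements AvailabilityRepository {
    private static final class Entry {
        final String restaurantName;
        final String zipCode;
        final String dayOfWeek;
        final int openTime;
        final int closeTime;

        Entry(String restaurantName, String zipCode, String dayOfWeek, int openTime, int closeTime) {
            this.restaurantName = restaurantName;
            this.zipCode = zipCode;
            this.dayOfWeek = dayOfWeek;
            this.openTime = openTime;
            this.closeTime = closeTime;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    @Override
    public void addAvailability(String restaurantName, String zipCode, String dayOfWeek, int openTime, int closeTime) {
        entries.add(new Entry(restaurantName, zipCode, dayOfWeek, openTime, closeTime));
    }

    @Override
    public List<String> findAvailableRestaurants(String zipCode, String dayOfWeek, int timeOfDay) {
        List<String> result = new ArrayList<>();
        for (Entry e : entries) {
            if (e.zipCode.equals(zipCode) && e.dayOfWeek.equals(dayOfWeek)
                    && e.openTime <= timeOfDay && timeOfDay <= e.closeTime) {
                result.add(e.restaurantName);
            }
        }
        return result;
    }
}

/** Business code depends only on the interface, so the store can be switched later. */
class DeliveryService {
    private final AvailabilityRepository repository;

    DeliveryService(AvailabilityRepository repository) {
        this.repository = repository;
    }

    boolean canDeliver(String zipCode, String dayOfWeek, int timeOfDay) {
        return !repository.findAvailableRestaurants(zipCode, dayOfWeek, timeOfDay).isEmpty();
    }
}
```

Swapping databases then means writing one new implementation of the interface rather than touching every caller.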


Thank you!




 Signup at CloudFoundry.com
 using promo code EgyptJUG
           My contact info:
           chris.richardson@springsource.com
           @crichardson



Weitere ähnliche Inhalte

Was ist angesagt?

Cidr11 paper32
Cidr11 paper32Cidr11 paper32
Cidr11 paper32jujukoko
 
NoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture PatternsNoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture PatternsDATAVERSITY
 
Implementation of nosql for robotics
Implementation of nosql for roboticsImplementation of nosql for robotics
Implementation of nosql for roboticsJoão Gabriel Lima
 
Database Tendency
Database TendencyDatabase Tendency
Database Tendencygrandis_au
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012jbellis
 
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,..."Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...lisapaglia
 
Characterization of hadoop jobs using unsupervised learning
Characterization of hadoop jobs using unsupervised learningCharacterization of hadoop jobs using unsupervised learning
Characterization of hadoop jobs using unsupervised learningJoão Gabriel Lima
 
In-memory Database and MySQL Cluster
In-memory Database and MySQL ClusterIn-memory Database and MySQL Cluster
In-memory Database and MySQL Clustergrandis_au
 
Exchange 2010 ha ctd
Exchange 2010 ha ctdExchange 2010 ha ctd
Exchange 2010 ha ctdKaliyan S
 
The End of an Architectural Era Michael Stonebraker
The End of an Architectural Era Michael StonebrakerThe End of an Architectural Era Michael Stonebraker
The End of an Architectural Era Michael Stonebrakerugur candan
 
Massively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache CassandraMassively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache Cassandrajbellis
 
MySQL Cluster no PayPal
MySQL Cluster no PayPalMySQL Cluster no PayPal
MySQL Cluster no PayPalMySQL Brasil
 
Chapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesChapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesMaynooth University
 
CUBRID Cluster Introduction
CUBRID Cluster IntroductionCUBRID Cluster Introduction
CUBRID Cluster IntroductionCUBRID
 
Scaling up and accelerating Drupal 8 with NoSQL
Scaling up and accelerating Drupal 8 with NoSQLScaling up and accelerating Drupal 8 with NoSQL
Scaling up and accelerating Drupal 8 with NoSQLOSInet
 
Spring Data NHJUG April 2012
Spring Data NHJUG April 2012Spring Data NHJUG April 2012
Spring Data NHJUG April 2012trisberg
 
3/15 - Intro to Spring Data Neo4j
3/15 - Intro to Spring Data Neo4j3/15 - Intro to Spring Data Neo4j
3/15 - Intro to Spring Data Neo4jNeo4j
 
Performance of persistent apps on Container-Native Storage for Red Hat OpenSh...
Performance of persistent apps on Container-Native Storage for Red Hat OpenSh...Performance of persistent apps on Container-Native Storage for Red Hat OpenSh...
Performance of persistent apps on Container-Native Storage for Red Hat OpenSh...Principled Technologies
 
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDBBig Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDBBigDataCloud
 

Was ist angesagt? (20)

Cidr11 paper32
Cidr11 paper32Cidr11 paper32
Cidr11 paper32
 
NoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture PatternsNoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture Patterns
 
Implementation of nosql for robotics
Implementation of nosql for roboticsImplementation of nosql for robotics
Implementation of nosql for robotics
 
Database Tendency
Database TendencyDatabase Tendency
Database Tendency
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012
 
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,..."Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
 
Characterization of hadoop jobs using unsupervised learning
Characterization of hadoop jobs using unsupervised learningCharacterization of hadoop jobs using unsupervised learning
Characterization of hadoop jobs using unsupervised learning
 
NuoDB Product Brochure
NuoDB Product BrochureNuoDB Product Brochure
NuoDB Product Brochure
 
In-memory Database and MySQL Cluster
In-memory Database and MySQL ClusterIn-memory Database and MySQL Cluster
In-memory Database and MySQL Cluster
 
Exchange 2010 ha ctd
Exchange 2010 ha ctdExchange 2010 ha ctd
Exchange 2010 ha ctd
 
The End of an Architectural Era Michael Stonebraker
The End of an Architectural Era Michael StonebrakerThe End of an Architectural Era Michael Stonebraker
The End of an Architectural Era Michael Stonebraker
 
Massively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache CassandraMassively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache Cassandra
 
MySQL Cluster no PayPal
MySQL Cluster no PayPalMySQL Cluster no PayPal
MySQL Cluster no PayPal
 
Chapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesChapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choices
 
CUBRID Cluster Introduction
CUBRID Cluster IntroductionCUBRID Cluster Introduction
CUBRID Cluster Introduction
 
Scaling up and accelerating Drupal 8 with NoSQL
Scaling up and accelerating Drupal 8 with NoSQLScaling up and accelerating Drupal 8 with NoSQL
Scaling up and accelerating Drupal 8 with NoSQL
 
Spring Data NHJUG April 2012
Spring Data NHJUG April 2012Spring Data NHJUG April 2012
Spring Data NHJUG April 2012
 
3/15 - Intro to Spring Data Neo4j
3/15 - Intro to Spring Data Neo4j3/15 - Intro to Spring Data Neo4j
3/15 - Intro to Spring Data Neo4j
 
Performance of persistent apps on Container-Native Storage for Red Hat OpenSh...
Performance of persistent apps on Container-Native Storage for Red Hat OpenSh...Performance of persistent apps on Container-Native Storage for Red Hat OpenSh...
Performance of persistent apps on Container-Native Storage for Red Hat OpenSh...
 
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDBBig Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
 

Andere mochten auch

Steam Learn : Varnish or How to reduce the load of your web server
Steam Learn : Varnish or How to reduce the load of your web serverSteam Learn : Varnish or How to reduce the load of your web server
Steam Learn : Varnish or How to reduce the load of your web serverinovia
 
Varnish
VarnishVarnish
VarnishAdyax
 
SQL, NoSQL, NewSQL? What's a developer to do?
SQL, NoSQL, NewSQL? What's a developer to do?SQL, NoSQL, NewSQL? What's a developer to do?
SQL, NoSQL, NewSQL? What's a developer to do?Chris Richardson
 
VarnishCache入門Rev2.1
VarnishCache入門Rev2.1VarnishCache入門Rev2.1
VarnishCache入門Rev2.1Iwana Chan
 
NewSQL Database Overview
NewSQL Database OverviewNewSQL Database Overview
NewSQL Database OverviewSteve Min
 
Netflix Global Cloud Architecture
Netflix Global Cloud ArchitectureNetflix Global Cloud Architecture
Netflix Global Cloud ArchitectureAdrian Cockcroft
 

Andere mochten auch (7)

Steam Learn : Varnish or How to reduce the load of your web server
Steam Learn : Varnish or How to reduce the load of your web serverSteam Learn : Varnish or How to reduce the load of your web server
Steam Learn : Varnish or How to reduce the load of your web server
 
Varnish
VarnishVarnish
Varnish
 
SQL, NoSQL, NewSQL? What's a developer to do?
SQL, NoSQL, NewSQL? What's a developer to do?SQL, NoSQL, NewSQL? What's a developer to do?
SQL, NoSQL, NewSQL? What's a developer to do?
 
VarnishCache入門Rev2.1
VarnishCache入門Rev2.1VarnishCache入門Rev2.1
VarnishCache入門Rev2.1
 
NewSQL Database Overview
NewSQL Database OverviewNewSQL Database Overview
NewSQL Database Overview
 
Global Netflix Platform
Global Netflix PlatformGlobal Netflix Platform
Global Netflix Platform
 
Netflix Global Cloud Architecture
Netflix Global Cloud ArchitectureNetflix Global Cloud Architecture
Netflix Global Cloud Architecture
 

Ähnlich wie SQL, NoSQL or NewSQL? Choosing the right database

Prepare Your Data For The Cloud
Prepare Your Data For The CloudPrepare Your Data For The Cloud
Prepare Your Data For The CloudIndicThreads
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesshnkr_rmchndrn
 
Evolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/SpecialistEvolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/SpecialistTony Rogerson
 
New Data Technologies, Graph Computing and Relationship Discovery in the Ente...
New Data Technologies, Graph Computing and Relationship Discovery in the Ente...New Data Technologies, Graph Computing and Relationship Discovery in the Ente...
New Data Technologies, Graph Computing and Relationship Discovery in the Ente...InfiniteGraph
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQLDon Demcsak
 
If NoSQL is your answer, you are probably asking the wrong question.
If NoSQL is your answer, you are probably asking the wrong question.If NoSQL is your answer, you are probably asking the wrong question.
If NoSQL is your answer, you are probably asking the wrong question.Lukas Smith
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabasesAdi Challa
 
DAT322_The Nanoservices Architecture That Powers BBC Online
DAT322_The Nanoservices Architecture That Powers BBC OnlineDAT322_The Nanoservices Architecture That Powers BBC Online
DAT322_The Nanoservices Architecture That Powers BBC OnlineAmazon Web Services
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisArnab Mitra
 
HPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL EcosystemHPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL EcosystemAdam Marcus
 
The NoSQL Ecosystem
The NoSQL Ecosystem The NoSQL Ecosystem
The NoSQL Ecosystem yarapavan
 
Database Virtualization: The Next Wave of Big Data
Database Virtualization: The Next Wave of Big DataDatabase Virtualization: The Next Wave of Big Data
Database Virtualization: The Next Wave of Big Dataexponential-inc
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...Qian Lin
 
AquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks PresentationAquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks PresentationAquaQ Analytics
 
Webinar: The Future of SQL
Webinar: The Future of SQLWebinar: The Future of SQL
Webinar: The Future of SQLCrate.io
 
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloudJeff Hung
 

Ähnlich wie SQL, NoSQL or NewSQL? Choosing the right database (20)

Prepare Your Data For The Cloud
Prepare Your Data For The CloudPrepare Your Data For The Cloud
Prepare Your Data For The Cloud
 
Preparing your data for the cloud
Preparing your data for the cloudPreparing your data for the cloud
Preparing your data for the cloud
 
Preparing yourdataforcloud
Preparing yourdataforcloudPreparing yourdataforcloud
Preparing yourdataforcloud
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skies
 
Evolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/SpecialistEvolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/Specialist
 
New Data Technologies, Graph Computing and Relationship Discovery in the Ente...
New Data Technologies, Graph Computing and Relationship Discovery in the Ente...New Data Technologies, Graph Computing and Relationship Discovery in the Ente...
New Data Technologies, Graph Computing and Relationship Discovery in the Ente...
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQL
 
If NoSQL is your answer, you are probably asking the wrong question.
If NoSQL is your answer, you are probably asking the wrong question.If NoSQL is your answer, you are probably asking the wrong question.
If NoSQL is your answer, you are probably asking the wrong question.
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
 
DAT322_The Nanoservices Architecture That Powers BBC Online
DAT322_The Nanoservices Architecture That Powers BBC OnlineDAT322_The Nanoservices Architecture That Powers BBC Online
DAT322_The Nanoservices Architecture That Powers BBC Online
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
HPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL EcosystemHPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL Ecosystem
 
The NoSQL Ecosystem
The NoSQL Ecosystem The NoSQL Ecosystem
The NoSQL Ecosystem
 
Database Virtualization: The Next Wave of Big Data
Database Virtualization: The Next Wave of Big DataDatabase Virtualization: The Next Wave of Big Data
Database Virtualization: The Next Wave of Big Data
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
 
AquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks PresentationAquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks Presentation
 
Redis by-hari
Redis by-hariRedis by-hari
Redis by-hari
 
No sql
No sqlNo sql
No sql
 
Webinar: The Future of SQL
Webinar: The Future of SQLWebinar: The Future of SQL
Webinar: The Future of SQL
 
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
 

Mehr von Chris Richardson

The microservice architecture: what, why, when and how?
The microservice architecture: what, why, when and how?The microservice architecture: what, why, when and how?
The microservice architecture: what, why, when and how?Chris Richardson
 
More the merrier: a microservices anti-pattern
More the merrier: a microservices anti-patternMore the merrier: a microservices anti-pattern
More the merrier: a microservices anti-patternChris Richardson
 
YOW London - Considering Migrating a Monolith to Microservices? A Dark Energy...
YOW London - Considering Migrating a Monolith to Microservices? A Dark Energy...YOW London - Considering Migrating a Monolith to Microservices? A Dark Energy...
YOW London - Considering Migrating a Monolith to Microservices? A Dark Energy...Chris Richardson
 
Dark Energy, Dark Matter and the Microservices Patterns?!
Dark Energy, Dark Matter and the Microservices Patterns?!Dark Energy, Dark Matter and the Microservices Patterns?!
Dark Energy, Dark Matter and the Microservices Patterns?!Chris Richardson
 
Dark energy, dark matter and microservice architecture collaboration patterns
Dark energy, dark matter and microservice architecture collaboration patternsDark energy, dark matter and microservice architecture collaboration patterns
Dark energy, dark matter and microservice architecture collaboration patternsChris Richardson
 
Scenarios_and_Architecture_SkillsMatter_April_2022.pdf
Scenarios_and_Architecture_SkillsMatter_April_2022.pdfScenarios_and_Architecture_SkillsMatter_April_2022.pdf
Scenarios_and_Architecture_SkillsMatter_April_2022.pdfChris Richardson
 
Using patterns and pattern languages to make better architectural decisions
Using patterns and pattern languages to make better architectural decisions Using patterns and pattern languages to make better architectural decisions
Using patterns and pattern languages to make better architectural decisions Chris Richardson
 
iSAQB gathering 2021 keynote - Architectural patterns for rapid, reliable, fr...
iSAQB gathering 2021 keynote - Architectural patterns for rapid, reliable, fr...iSAQB gathering 2021 keynote - Architectural patterns for rapid, reliable, fr...
iSAQB gathering 2021 keynote - Architectural patterns for rapid, reliable, fr...Chris Richardson
 
Events to the rescue: solving distributed data problems in a microservice arc...
Events to the rescue: solving distributed data problems in a microservice arc...Events to the rescue: solving distributed data problems in a microservice arc...
Events to the rescue: solving distributed data problems in a microservice arc...Chris Richardson
 
A pattern language for microservices - June 2021




SQL, NoSQL or NewSQL? Choosing the right database

  • 1. SQL, NoSQL, NewSQL? What's a developer to do? Chris Richardson Author of POJOs in Action Founder of the original CloudFoundry.com @crichardson crichardson@vmware.com
  • 2. Overall presentation goal: the joy and pain of building Java applications that use NoSQL and NewSQL
  • 8. About Chris: http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/
  • 9. About Chris: Developer Advocate for CloudFoundry.com. Sign up at CloudFoundry.com using promo code EgyptJUG
  • 10. Agenda   Why NoSQL? NewSQL?   Persisting entities   Implementing queries 3/18/12 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 10
  • 11. Food to Go: take-out food delivery service; “launched” in 2006; used a relational database (naturally)
  • 12. Success → growth challenges: increasing domain model complexity; increasing traffic; increasing data volume; distribution across data centers. But relational databases make this difficult…
  • 13. Problem: Complex object graphs (object graph ?! table with columns ID, COL1, COL2, COL…). Poor performance: many inserts; many joins
  • 14. Problem: Semi-structured data
      Customer attribute table
        Customer_Id | Name   | Value
        1           | Region | CA
        1           | Type   | Bank
        …           | …      | …
      Lack of constraints; poor query performance, e.g. multiple outer joins
  • 15. Problem: Semi-structured data
      Customer table
        Id | Name        | Street        | … | Other_Attributes
        1  | Acme Inc    | 180 Main      |   | XML/JSON/Blob
        2  | Failed Bank | 1 Wall Street |   | …
        …  | …           | …             |   |
      Other_Attributes can’t be queried
  • 16. Problem: Schema evolution
      Before:
        Id         | First_Name | Last_Name
        1          | Maria      | Doe
        2          | John       | Smith
        …          | …          | …
        9948429292 | Ben        | Grayson
      After adding a DOB column (locks? application downtime?):
        Id | First_Name | Last_Name | DOB
        1  | Maria      | Doe       | 10/14/38
        …
  • 17. Problem: Scaling
      - Moore’s law is your friend, BUT
      - Scaling reads: master/slave, but beware of consistency issues
      - Scaling writes: extremely difficult/impossible/expensive
      - Vertical scaling is limited and requires $$
      - Horizontal scaling is limited/requires $$
  • 18. Problem: distribution. Each of Datacenter 1 and Datacenter 2 runs an App and a DB; the two DBs synchronize over the WAN. Many databases don’t support this out of the box
  • 19. Solution: Buy high end technology http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG
  • 20. Solution: Hire more developers   Application-level sharding   Build your own replication middleware   … http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_4_series/madone_4_5
  • 21. Solution: Use NoSQL
      - Benefits: higher performance; higher scalability; richer data-model; schema-less
      - Drawbacks: limited transactions; relaxed consistency; unconstrained data
  • 22. MongoDB
      - Document-oriented database
      - JSON-style documents: lists, maps, primitives
      - Schema-less
      - Transaction = update of a single document
      - Rich query language for dynamic queries
      - Tunable writes: speed vs. reliability
      - Highly scalable and available
  • 23. MongoDB use cases
      - Use cases: high-volume writes; complex data; semi-structured data
      - Who is using it? Shutterfly, Foursquare, Bit.ly, Intuit, SourceForge, NY Times, GILT Groupe, Evite, SugarCRM
  • 24. Apache Cassandra
      - Column-oriented database/extensible row store
      - Think Row ~= java.util.SortedMap
      - Transaction = update of a row
      - Fast writes = append to a log
      - Tunable reads/writes: consistency vs. latency/availability
      - Extremely scalable
      - Transparent and dynamic clustering
      - Rack- and datacenter-aware data replication
      - CQL = “SQL”-like DDL and DML
  • 25. Cassandra use cases
      - Use cases: big data; multi-datacenter distributed database; persistent cache; (write-intensive) logging; high availability (writes)
      - Who is using it? Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, SimpleGeo, Ooyala, OpenX
      - “The largest production cluster has over 100 TB of data in over 150 machines.” – Cassandra web site
  • 26. Other NoSQL databases
        Type                               | Examples
        Extensible columns/Column-oriented | Hbase, SimpleDB, DynamoDB
        Graph                              | Neo4j
        Key-value                          | Redis, Membase
        Document                           | CouchDb
      http://nosql-database.org/ lists 122+ NoSQL databases
  • 27. Solution: Use NewSQL
      - Relational databases with SQL and ACID transactions AND
      - New and improved architecture
      - Radically better scalability and performance
      - NewSQL vendors: ScaleDB, NimbusDB, …, VoltDB
  • 28. Stonebraker’s motivations
      - “…Current databases are designed for 1970s hardware…” (Stonebraker: http://www.slideshare.net/VoltDB/sql-myths-webinar)
      - Significant overhead in “…logging, latching, locking, B-tree, and buffer management operations…” (SIGMOD 08, “Through the looking glass”: http://dl.acm.org/citation.cfm?id=1376713)
  • 29. About VoltDB
      - Open-source
      - In-memory relational database
      - Durability through replication; snapshots and logging
      - Transparent partitioning
      - Fast and scalable
      - “…VoltDB is very scalable; it should scale to 120 partitions, 39 servers, and 1.6 million complex transactions per second at over 300 CPU cores…” http://www.mysqlperformanceblog.com/2011/02/28/is-voltdb-really-as-scalable-as-they-claim/
  • 30. The future is polyglot persistence, e.g. Netflix: RDBMS, SimpleDB, Cassandra, Hadoop/Hbase (IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg)
  • 31. Spring Data is here to help for NoSQL databases: http://www.springsource.org/spring-data
  • 32. Spring Data sub-projects
      - Various sub-projects: SQL (Spring Data JPA, JDBC extensions); Commons (polyglot persistence); Key-Value (Redis, Riak); Document (MongoDB); Graph (Neo4j); GORM for NoSQL
      - What you get: wrapper classes analogous to JDBC template; generic Repository; cross-store persistence; …
  • 33. Proceed with caution
      - Don’t commit to a NoSQL DB until you have done a significant POC
      - Encapsulate your data access code so you can switch
      - Hope that one day you won’t need ACID (or complex queries)
  • 34. Agenda   Why NoSQL? NewSQL?   Persisting entities   Implementing queries
  • 35. Food to Go – Place Order use case
      1. Customer enters delivery address and delivery time
      2. System displays available restaurants
      3. Customer picks restaurant
      4. System displays menu
      5. Customer selects menu items
      6. Customer places order
  • 36. Food to Go – Domain model (partial)
        class Restaurant {
          long id;
          String name;
          Set<String> serviceArea;
          Set<TimeRange> openingHours;
          List<MenuItem> menuItems;
        }

        class TimeRange {
          long id;
          int dayOfWeek;
          int openingTime;
          int closingTime;
        }

        class MenuItem {
          String name;
          double price;
        }
  • 37. Database schema
      RESTAURANT table
        ID | Name              | …
        1  | Ajanta            |
        2  | Montclair Eggshop |
      RESTAURANT_ZIPCODE table
        Restaurant_id | zipcode
        1             | 94707
        1             | 94619
        2             | 94611
        2             | 94619
      RESTAURANT_TIME_RANGE table
        Restaurant_id | dayOfWeek | openTime | closeTime
        1             | Monday    | 1130     | 1430
        1             | Monday    | 1730     | 2130
        2             | Tuesday   | 1130     | …
  • 38. How to implement the repository?
        public interface AvailableRestaurantRepository {
          void add(Restaurant restaurant);
          Restaurant findDetailsById(int id);
          …
        }
      Restaurant aggregate (Restaurant, TimeRange, MenuItem) → ?
  • 39. MongoDB: persisting restaurants is easy
      Server → Database: Food To Go → Collection: Restaurants
        {
          "_id" : ObjectId("4bddc2f49d1505567c6220a0"),
          "name": "Ajanta",
          "serviceArea": ["94619", "99999"],
          "openingHours": [
            { "dayOfWeek": 1, "open": 1130, "close": 1430 },
            { "dayOfWeek": 2, "open": 1130, "close": 1430 },
            …
          ]
        }
      BSON = binary JSON: a sequence of bytes on disk → fast i/o
  • 40. Using the MongoDB CLI
        > r = {name: 'Ajanta'}
        > db.restaurants.save(r)
        > r
        { "_id" : ObjectId("4e555dd9646e338dca11710c"), "name" : "Ajanta" }
        > r = db.restaurants.findOne({name:"Ajanta"})
        { "_id" : ObjectId("4e555dd9646e338dca11710c"), "name" : "Ajanta" }
        > r.type = "Indian"
        > db.restaurants.save(r)
        > db.restaurants.update({name:"Ajanta"}, {$set: {name:"Ajanta Restaurant"}, $push: { menuItems: {name: "Chicken Vindaloo"}}})
        > db.restaurants.find()
        { "_id" : ObjectId("4e555dd9646e338dca11710c"), "menuItems" : [ { "name" : "Chicken Vindaloo" } ], "name" : "Ajanta Restaurant", "type" : "Indian" }
        > db.restaurants.remove({_id: r._id})
  • 41. Spring Data for Mongo code
        @Repository
        public class AvailableRestaurantRepositoryMongoDbImpl implements AvailableRestaurantRepository {

          public static String AVAILABLE_RESTAURANTS_COLLECTION = "availableRestaurants";

          @Autowired
          private MongoTemplate mongoTemplate;

          @Override
          public void add(Restaurant restaurant) {
            mongoTemplate.insert(restaurant, AVAILABLE_RESTAURANTS_COLLECTION);
          }

          @Override
          public Restaurant findDetailsById(int id) {
            return mongoTemplate.findOne(new Query(where("_id").is(id)),
                Restaurant.class, AVAILABLE_RESTAURANTS_COLLECTION);
          }
        }
  • 42. Spring Configuration
        @Configuration
        public class MongoConfig extends AbstractDatabaseConfig {

          @Value("#{mongoDbProperties.databaseName}")
          private String mongoDbDatabase;

          @Bean
          public Mongo mongo() throws UnknownHostException, MongoException {
            return new Mongo(databaseHostName);
          }

          @Bean
          public MongoTemplate mongoTemplate(Mongo mongo) throws Exception {
            MongoTemplate mongoTemplate = new MongoTemplate(mongo, mongoDbDatabase);
            mongoTemplate.setWriteConcern(WriteConcern.SAFE);
            mongoTemplate.setWriteResultChecking(WriteResultChecking.EXCEPTION);
            return mongoTemplate;
          }
        }
  • 43. Cassandra data model
      Keyspace → Column Family → rows; each row has a Row Key and columns of (Column Name, Column Value, Timestamp)
        K1: (N1, V1, TS1) (N2, V2, TS2) (N3, V3, TS3)
        K2: (N1, V1, TS1) (N2, V2, TS2) (N3, V3, TS3)
      Column name/value: number, string, Boolean, timestamp, and composite
  • 44. Cassandra – inserting/updating data (idempotent = transaction)
      Column Family before:
        K1: (N1, V1, TS1) (N2, V2, TS2) (N3, V3, TS3) …
      CF.insert(key=K1, (N4, V4, TS4), …)
      Column Family after:
        K1: (N1, V1, TS1) (N2, V2, TS2) (N3, V3, TS3) (N4, V4, TS4) …
  • 45. Cassandra – retrieving data
      Column Family:
        K1: (N1, V1, TS1) (N2, V2, TS2) (N3, V3, TS3) (N4, V4, TS4) …
      CF.slice(key=K1, startColumn=N2, endColumn=N4) returns:
        K1: (N2, V2, TS2) (N3, V3, TS3) (N4, V4, TS4)
      Cassandra has secondary indexes but they aren’t helpful for these use cases
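The insert and slice operations above can be mimicked with an in-memory sorted map, following the "Row ~= java.util.SortedMap" analogy from the Apache Cassandra slide. This is only a rough sketch: `ColumnFamilySketch`, `insert` and `slice` are hypothetical names, not the Cassandra or Hector client API, and column timestamps are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for one column family; NOT a Cassandra client API.
class ColumnFamilySketch {
    // Row key -> columns of that row, kept sorted by column name (timestamps omitted).
    private final Map<String, NavigableMap<String, String>> rows = new HashMap<>();

    // Writing the same column twice leaves the row in the same state, so a retry is harmless.
    void insert(String rowKey, String columnName, String columnValue) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(columnName, columnValue);
    }

    // Returns the columns of one row whose names lie between startColumn and endColumn, inclusive.
    NavigableMap<String, String> slice(String rowKey, String startColumn, String endColumn) {
        NavigableMap<String, String> row = rows.get(rowKey);
        if (row == null) {
            return new TreeMap<>();
        }
        return row.subMap(startColumn, true, endColumn, true);
    }

    public static void main(String[] args) {
        ColumnFamilySketch cf = new ColumnFamilySketch();
        cf.insert("K1", "N1", "V1");
        cf.insert("K1", "N2", "V2");
        cf.insert("K1", "N3", "V3");
        cf.insert("K1", "N4", "V4");
        cf.insert("K1", "N4", "V4"); // repeated insert: the row does not change
        System.out.println(cf.slice("K1", "N2", "N4")); // prints {N2=V2, N3=V3, N4=V4}
    }
}
```

Because columns are sorted by name, a slice is a cheap range read within one row, which is why the choice of column names matters so much in the options that follow.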
  • 46. Option #1: Use a column per attribute
      Column name = path/expression to access property value
      Column Family: RestaurantDetails
        Row 1: name = Ajanta; type = indian; serviceArea[0] = 94619; serviceArea[1] = 94707; openingHours[0].dayOfWeek = Monday; openingHours[0].open = 1130; openingHours[0].close = 1430
        Row 2: name = Egg shop; type = Break Fast; serviceArea[0] = 94611; serviceArea[1] = 94619; openingHours[0].dayOfWeek = Monday; openingHours[0].open = 0830; openingHours[0].close = 1430
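A minimal sketch of how such column names could be generated, assuming a simplified restaurant with a name, a type, a service area and opening hours. The class and method names (`ColumnPerAttributeSketch`, `toColumns`) are hypothetical and the result is a plain map, not a Cassandra mutation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper that flattens an object graph into path-named columns.
class ColumnPerAttributeSketch {
    // Hypothetical value object; field names follow the column names on the slide.
    static final class TimeRange {
        final String dayOfWeek;
        final int open;
        final int close;

        TimeRange(String dayOfWeek, int open, int close) {
            this.dayOfWeek = dayOfWeek;
            this.open = open;
            this.close = close;
        }
    }

    // Builds one column per attribute; the column name is the path to the property.
    static SortedMap<String, String> toColumns(String name, String type,
                                               List<String> serviceArea,
                                               List<TimeRange> openingHours) {
        SortedMap<String, String> columns = new TreeMap<>();
        columns.put("name", name);
        columns.put("type", type);
        for (int i = 0; i < serviceArea.size(); i++) {
            columns.put("serviceArea[" + i + "]", serviceArea.get(i));
        }
        for (int i = 0; i < openingHours.size(); i++) {
            TimeRange tr = openingHours.get(i);
            String prefix = "openingHours[" + i + "].";
            columns.put(prefix + "dayOfWeek", tr.dayOfWeek);
            columns.put(prefix + "open", String.valueOf(tr.open));
            columns.put(prefix + "close", String.valueOf(tr.close));
        }
        return columns;
    }

    public static void main(String[] args) {
        SortedMap<String, String> columns = toColumns("Ajanta", "indian",
                Arrays.asList("94619", "94707"),
                Arrays.asList(new TimeRange("Monday", 1130, 1430)));
        columns.forEach((column, value) -> System.out.println(column + " = " + value));
    }
}
```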
  • 47. Option #2: Use a single column ✔
      Column value = serialized object graph, e.g. JSON
      Column Family: RestaurantDetails
        Row 1: attributes = { name: “Ajanta”, … }
        Row 2: attributes = { name: “Eggshop”, … }
      Example value for row 2: attributes: { name: “Montclair Eggshop”, … }
  • 48. Cassandra code
        public class AvailableRestaurantRepositoryCassandraKeyImpl implements AvailableRestaurantRepository {

          // Home-grown wrapper class; the field is not final so that it can be injected
          @Autowired
          private CassandraTemplate cassandraTemplate;

          public void add(Restaurant restaurant) {
            cassandraTemplate.insertEntity(keyspace, RESTAURANT_DETAILS_CF, restaurant);
          }

          public Restaurant findDetailsById(int id) {
            String key = Integer.toString(id);
            return cassandraTemplate.findEntity(Restaurant.class, keyspace, key, RESTAURANT_DETAILS_CF);
          }
          …
        }
      http://en.wikipedia.org/wiki/Hector
  • 49. Using VoltDB
      - Use the original schema
      - Standard SQL statements
      - BUT YOU MUST: write stored procedures and invoke them using a proprietary interface; partition your data
  • 50. About VoltDB stored procedures
      - Key part of VoltDB
      - Replication = executing stored procedure on replica
      - Logging = log stored procedure invocation
      - Stored procedure invocation = transaction
  • 51. About partitioning: the RESTAURANT table (ID, Name, …; rows 1 Ajanta, 2 Eggshop, …) with ID as the partition column
  • 52. Example cluster
        VoltDB Server 1: Partition 1a (1 Ajanta, …), Partition 3b (…)
        VoltDB Server 2: Partition 2a (2 Eggshop, …), Partition 1b (1 Ajanta, …)
        VoltDB Server 3: Partition 3a (…), Partition 2b (2 Eggshop, …)
  • 53. Single partition procedure: FAST
        SELECT * FROM RESTAURANT WHERE ID = 1
      High-performance lock-free code (diagram: RESTAURANT partitions on VoltDB Servers 1, 2 and 3)
  • 54. Multi-partition procedure: SLOWER
        SELECT * FROM RESTAURANT WHERE NAME = 'Ajanta'
      Communication/coordination overhead (diagram: RESTAURANT partitions on VoltDB Servers 1, 2 and 3)
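The difference between the two kinds of procedure can be illustrated with a toy router. The hash rule below is an assumption made for illustration only (VoltDB's actual partitioning function is internal to the server), and `PartitionRoutingSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model of partition routing; NOT VoltDB code.
class PartitionRoutingSketch {
    static final int PARTITION_COUNT = 3;

    // Assumed routing rule: hash the partition column (RESTAURANT.ID) onto a partition.
    static int partitionFor(int restaurantId) {
        return Math.floorMod(Integer.hashCode(restaurantId), PARTITION_COUNT);
    }

    // WHERE ID = ? constrains the partition column, so exactly one partition does the work.
    static List<Integer> partitionsForFindById(int restaurantId) {
        return Collections.singletonList(partitionFor(restaurantId));
    }

    // WHERE NAME = ? says nothing about the partition column, so every partition must be asked.
    static List<Integer> partitionsForFindByName(String name) {
        List<Integer> all = new ArrayList<>();
        for (int p = 0; p < PARTITION_COUNT; p++) {
            all.add(p);
        }
        return all;
    }

    public static void main(String[] args) {
        System.out.println("find by id 1 touches partitions " + partitionsForFindById(1));
        System.out.println("find by name touches partitions " + partitionsForFindByName("Ajanta"));
    }
}
```

The cost of the second query grows with the number of partitions, which is the communication/coordination overhead the slide refers to.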
  • 55. Chosen partitioning scheme
        <partitions>
          <partition table="restaurant" column="id"/>
          <partition table="service_area" column="restaurant_id"/>
          <partition table="menu_item" column="restaurant_id"/>
          <partition table="time_range" column="restaurant_id"/>
          <partition table="available_time_range" column="restaurant_id"/>
        </partitions>
      Performance is excellent: much faster than MySQL
  • 56. Stored procedure – AddRestaurant
        @ProcInfo(singlePartition = true, partitionInfo = "Restaurant.id: 0")
        public class AddRestaurant extends VoltProcedure {

          public final SQLStmt insertRestaurant = new SQLStmt("INSERT INTO Restaurant VALUES (?,?);");
          public final SQLStmt insertServiceArea = new SQLStmt("INSERT INTO service_area VALUES (?,?);");
          public final SQLStmt insertOpeningTimes = new SQLStmt("INSERT INTO time_range VALUES (?,?,?,?);");
          public final SQLStmt insertMenuItem = new SQLStmt("INSERT INTO menu_item VALUES (?,?,?);");

          public long run(int id, String name, String[] serviceArea, long[] daysOfWeek,
                          long[] openingTimes, long[] closingTimes, String[] names, double[] prices) {
            // Queue all inserts, then execute them as one batch
            voltQueueSQL(insertRestaurant, id, name);
            for (String zipCode : serviceArea)
              voltQueueSQL(insertServiceArea, id, zipCode);
            for (int i = 0; i < daysOfWeek.length; i++)
              voltQueueSQL(insertOpeningTimes, id, daysOfWeek[i], openingTimes[i], closingTimes[i]);
            for (int i = 0; i < names.length; i++)
              voltQueueSQL(insertMenuItem, id, names[i], prices[i]);
            voltExecuteSQL(true);
            return 0;
          }
        }
  • 57. VoltDb repository – add()
        @Repository
        public class AvailableRestaurantRepositoryVoltdbImpl implements AvailableRestaurantRepository {

          @Autowired
          private VoltDbTemplate voltDbTemplate;

          @Override
          public void add(Restaurant restaurant) {
            invokeRestaurantProcedure("AddRestaurant", restaurant);
          }

          private void invokeRestaurantProcedure(String procedureName, Restaurant restaurant) {
            // Flatten the Restaurant into arrays
            Object[] serviceArea = restaurant.getServiceArea().toArray();
            long[][] openingHours = toArray(restaurant.getOpeningHours());
            Object[][] menuItems = toArray(restaurant.getMenuItems());
            voltDbTemplate.update(procedureName, restaurant.getId(), restaurant.getName(), serviceArea,
                openingHours[0], openingHours[1], openingHours[2], menuItems[0], menuItems[1]);
          }
          …
        }
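The slide does not show the `toArray` helper, so the following is only a guess at its shape, assuming it turns the opening hours into three parallel arrays (days of week, opening times, closing times) in the order the stored procedure's `run` method expects. `FlattenSketch` and its `TimeRange` are hypothetical stand-ins.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical sketch of flattening an aggregate into parallel arrays.
class FlattenSketch {
    // Stand-in for the TimeRange class of the domain model.
    static final class TimeRange {
        final int dayOfWeek;
        final int openingTime;
        final int closingTime;

        TimeRange(int dayOfWeek, int openingTime, int closingTime) {
            this.dayOfWeek = dayOfWeek;
            this.openingTime = openingTime;
            this.closingTime = closingTime;
        }
    }

    // Returns { daysOfWeek[], openingTimes[], closingTimes[] }, one entry per time range.
    static long[][] toArray(Collection<TimeRange> openingHours) {
        long[][] columns = new long[3][openingHours.size()];
        int i = 0;
        for (TimeRange tr : openingHours) {
            columns[0][i] = tr.dayOfWeek;
            columns[1][i] = tr.openingTime;
            columns[2][i] = tr.closingTime;
            i++;
        }
        return columns;
    }

    public static void main(String[] args) {
        long[][] flat = toArray(Arrays.asList(
                new TimeRange(1, 1130, 1430),
                new TimeRange(1, 1730, 2130)));
        System.out.println(Arrays.deepToString(flat)); // prints [[1, 1], [1130, 1730], [1430, 2130]]
    }
}
```

Parallel arrays are used here because a stored procedure's parameters are primitives, strings and arrays rather than object graphs, as the `run` signature of `AddRestaurant` shows.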
  • 58. VoltDbTemplate wrapper class
        public class VoltDbTemplate {

          // VoltDB client API
          private Client client;

          public VoltDbTemplate(Client client) {
            this.client = client;
          }

          public void update(String procedureName, Object... params) {
            try {
              ClientResponse x = client.callProcedure(procedureName, params);
              …
            } catch (Exception e) {
              throw new RuntimeException(e);
            }
          }
        }
  • 59. VoltDb server configuration
      Project definition (voltdb-project.xml):
        <?xml version="1.0"?>
        <project>
          <info>
            <name>Food To Go</name>
            ...
          </info>
          <database>
            <schemas>
              <schema path='schema.sql' />
            </schemas>
            <partitions>
              <partition table="restaurant" column="id"/>
              ...
            </partitions>
            <procedures>
              <procedure class='net.chrisrichardson.foodToGo.newsql.voltdb.procs.AddRestaurant' />
              ...
            </procedures>
          </database>
        </project>

      Deployment (deployment.xml):
        <deployment>
          <cluster hostcount="1" sitesperhost="5" kfactor="0" />
        </deployment>

      Compile the catalog and start the server:
        voltcompiler target/classes src/main/resources/sql/voltdb-project.xml foodtogo.jar
        bin/voltdb leader localhost catalog foodtogo.jar deployment deployment.xml
  • 60. Performance
      Benchmarking is still a work in progress, but so far:
      http://www.youtube.com/watch?v=b2F-DItXtZs

                        Mongo      Cassandra   VoltDB
      Insert for PK     Awesome    Fast*       Awesome
      Find by PK        Awesome    Fast        Incredible

      * Cassandra can be clustered for improved write performance
  • 61. Agenda
      Why NoSQL? NewSQL?
      Persisting entities
      Implementing queries
  • 62. Finding available restaurants
      Available restaurants = Serve the zip code of the delivery address AND Are open at the delivery time

      public interface AvailableRestaurantRepository {
        List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime);
        …
      }
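As a reference point for the database-specific versions, the availability rule can be written as a plain in-memory predicate. This is only a sketch: `Slot`, `RestaurantInfo` and the HHmm integer time encoding (for example 1815) are assumptions made for illustration, not the talk's domain classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class AvailabilitySketch {

    // Hypothetical opening slot; times are HHmm integers such as 1815.
    static final class Slot {
        final String dayOfWeek;
        final int open;
        final int close;

        Slot(String dayOfWeek, int open, int close) {
            this.dayOfWeek = dayOfWeek;
            this.open = open;
            this.close = close;
        }
    }

    // Hypothetical, simplified restaurant: a name, the zip codes served, and opening slots.
    static final class RestaurantInfo {
        final String name;
        final Set<String> serviceArea;
        final List<Slot> openingHours;

        RestaurantInfo(String name, Set<String> serviceArea, List<Slot> openingHours) {
            this.name = name;
            this.serviceArea = serviceArea;
            this.openingHours = openingHours;
        }
    }

    // Available = serves the zip code AND has a slot on that day containing the time.
    static boolean isAvailable(RestaurantInfo r, String zipCode, String dayOfWeek, int timeOfDay) {
        if (!r.serviceArea.contains(zipCode)) {
            return false;
        }
        for (Slot s : r.openingHours) {
            if (s.dayOfWeek.equals(dayOfWeek) && s.open <= timeOfDay && timeOfDay <= s.close) {
                return true;
            }
        }
        return false;
    }

    static List<String> findAvailable(List<RestaurantInfo> all, String zipCode, String dayOfWeek, int timeOfDay) {
        List<String> names = new ArrayList<>();
        for (RestaurantInfo r : all) {
            if (isAvailable(r, zipCode, dayOfWeek, timeOfDay)) {
                names.add(r.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        RestaurantInfo ajanta = new RestaurantInfo("Ajanta",
                new HashSet<>(Arrays.asList("94707", "94619")),
                Arrays.asList(new Slot("Monday", 1130, 1430), new Slot("Monday", 1730, 2130)));
        System.out.println(findAvailable(Arrays.asList(ajanta), "94619", "Monday", 1815));
    }
}
```

Every database in the rest of the talk has to evaluate this same two-part condition; what changes is how much of it the database can do for you.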
  • 63. Finding available restaurants on Monday, 6.15pm for 94619 zip
      A straightforward three-way join:
        select r.*
        from restaurant r
          inner join restaurant_time_range tr on r.id = tr.restaurant_id
          inner join restaurant_zipcode sa on r.id = sa.restaurant_id
        where '94619' = sa.zip_code
          and tr.day_of_week = 'monday'
          and tr.openingtime <= 1815
          and 1815 <= tr.closingtime
  • 64. MongoDB = easy to query
      Find a restaurant that serves the 94619 zip code and is open at 6.15pm on a Monday:
        {
          serviceArea: "94619",
          openingHours: {
            $elemMatch: {
              "dayOfWeek": "Monday",
              "open": {$lte: 1815},
              "close": {$gte: 1815}
            }
          }
        }

      Running the query from Java:
        DBCursor cursor = collection.find(qbeObject);
        while (cursor.hasNext()) {
          DBObject o = cursor.next();
          …
        }

      Index on the service area:
        db.availableRestaurants.ensureIndex({serviceArea: 1})
  • 65. MongoTemplate-based code
      @Repository
      public class AvailableRestaurantRepositoryMongoDbImpl implements AvailableRestaurantRepository {

        // Not final: a field-injected @Autowired member cannot be a final field
        @Autowired
        private MongoTemplate mongoTemplate;

        @Override
        public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
          int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
          int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);

          Query query = new Query(where("serviceArea").is(deliveryAddress.getZip())
              .and("openingHours").elemMatch(where("dayOfWeek").is(dayOfWeek)
                  .and("openingTime").lte(timeOfDay)
                  .and("closingTime").gte(timeOfDay)));

          return mongoTemplate.find(AVAILABLE_RESTAURANTS_COLLECTION, query, AvailableRestaurant.class);
        }

      Index creation:
        mongoTemplate.ensureIndex("availableRestaurants", new Index().on("serviceArea", Order.ASCENDING));
  • 66. BUT how to do this with Cassandra??!
      How can Cassandra support a query that has:
        A 3-way join?
        Multiple = conditions?
        > and < conditions?
      We need to implement an index
      Queries, instead of the data model, drive NoSQL database design
  • 67. ... And use a slice operation
      columnFamily.slice(key=keyVal, startColumn=startVal, endColumn=endVal)
      is equivalent to
      select * from columnFamily where key = keyVal and col >= startVal and col <= endVal
  • 68. Simplification #1: Denormalization
      Restaurant_id   Day_of_week   Open_time   Close_time   Zip_code
      1               Monday        1130        1430         94707
      1               Monday        1130        1430         94619
      1               Monday        1730        2130         94707
      1               Monday        1730        2130         94619
      2               Monday        0700        1430         94619
      …

      SELECT restaurant_id
      FROM time_range_zip_code
      WHERE day_of_week = 'Monday'
        AND zip_code = 94619
        AND 1815 < close_time
        AND open_time < 1815

      Simpler query:
        No joins
        Two = and two <
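The denormalized table holds one row per (time range, zip code) pair for each restaurant. A minimal sketch of producing those rows is below; the `Row` class and the parallel-array parameters are hypothetical and chosen only to mirror the table columns.

```java
import java.util.ArrayList;
import java.util.List;

final class DenormalizeSketch {

    // One row of the time_range_zip_code table (hypothetical in-memory form).
    static final class Row {
        final int restaurantId;
        final String dayOfWeek;
        final int openTime;
        final int closeTime;
        final String zipCode;

        Row(int restaurantId, String dayOfWeek, int openTime, int closeTime, String zipCode) {
            this.restaurantId = restaurantId;
            this.dayOfWeek = dayOfWeek;
            this.openTime = openTime;
            this.closeTime = closeTime;
            this.zipCode = zipCode;
        }

        @Override
        public String toString() {
            return restaurantId + " " + dayOfWeek + " " + openTime + " " + closeTime + " " + zipCode;
        }
    }

    // Cross product of a restaurant's time ranges and the zip codes it serves.
    static List<Row> denormalize(int restaurantId, String[] daysOfWeek, int[] openTimes,
                                 int[] closeTimes, String[] zipCodes) {
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < daysOfWeek.length; i++) {
            for (String zipCode : zipCodes) {
                rows.add(new Row(restaurantId, daysOfWeek[i], openTimes[i], closeTimes[i], zipCode));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Row> rows = denormalize(1,
                new String[] {"Monday", "Monday"},
                new int[] {1130, 1730},
                new int[] {1430, 2130},
                new String[] {"94707", "94619"});
        for (Row row : rows) {
            System.out.println(row);
        }
    }
}
```

The row count grows as time ranges × zip codes, which is the write-side cost of removing the joins.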
  • 69. Simplification #2: Application filtering
      SELECT restaurant_id, open_time
      FROM time_range_zip_code
      WHERE day_of_week = 'Monday'
        AND zip_code = 94619
        AND 1815 < close_time

      The remaining condition, open_time < 1815, is checked by the application on the returned rows.

      Even simpler query:
      •  No joins
      •  Two = and one <
  • 70. Simplification #3: Eliminate multiple ='s with concatenation
      Restaurant_id   Zip_dow        Open_time   Close_time
      1               94707:Monday   1130        1430
      1               94619:Monday   1130        1430
      1               94707:Monday   1730        2130
      1               94619:Monday   1730        2130
      2               94619:Monday   0700        1430
      …

      SELECT restaurant_id, open_time
      FROM time_range_zip_code
      WHERE zip_code_day_of_week = '94619:Monday'   -- key
        AND 1815 < close_time                       -- range
  • 71. Column family with composite column names as an index
      Restaurant_id   Zip_dow        Open_time   Close_time
      1               94707:Monday   1130        1430
      1               94619:Monday   1130        1430
      1               94707:Monday   1730        2130
      1               94619:Monday   1730        2130
      2               94619:Monday   0700        1430
      …

      Column Family: AvailableRestaurants
        Row key: 94619:Monday
          Column name (close, open, restaurant id)   Column value
          (1430,0700,2)                              JSON FOR EGG
          (1430,1130,1)                              JSON FOR AJANTA
          (2130,1730,1)                              JSON FOR AJANTA
  • 72. Querying with a slice
      Column Family: AvailableRestaurants
        Row key: 94619:Monday
          (1430,0700,2) → JSON FOR EGG
          (1430,1130,1) → JSON FOR AJANTA
          (2130,1730,1) → JSON FOR AJANTA

      slice(key = 94619:Monday, sliceStart = (1815, *, *), sliceEnd = (2359, *, *))

      Result:
        Row key: 94619:Monday
          (2130,1730,1) → JSON FOR AJANTA

      18:15 is after 17:30 → {Ajanta}
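The slice-plus-filter idea can be tried out without a Cassandra cluster. The sketch below is an in-memory stand-in, not Cassandra or Hector code: a `TreeMap` plays the role of one row's sorted columns, zero-padded strings of the form `close:open:restaurantId` play the role of composite column names, and `subMap` plays the role of the slice. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CompositeSliceSketch {

    // Row key ("zip:day") -> sorted columns ("close:open:restaurantId" -> JSON value).
    private final Map<String, TreeMap<String, String>> rows = new HashMap<>();

    void insert(String zipCode, String dayOfWeek, int open, int close, int restaurantId, String json) {
        // Zero padding makes lexical order match numeric order, closing time first.
        String columnName = String.format("%04d:%04d:%08d", close, open, restaurantId);
        rows.computeIfAbsent(zipCode + ":" + dayOfWeek, k -> new TreeMap<>()).put(columnName, json);
    }

    List<String> findAvailable(String zipCode, String dayOfWeek, int timeOfDay) {
        List<String> result = new ArrayList<>();
        TreeMap<String, String> columns = rows.get(zipCode + ":" + dayOfWeek);
        if (columns == null) {
            return result;
        }
        // "Slice": every column whose closing time lies between timeOfDay and 2359.
        String start = String.format("%04d", timeOfDay);
        String end = "2359:~"; // '~' sorts after digits and ':', so all 2359 columns are included
        for (Map.Entry<String, String> column : columns.subMap(start, true, end, true).entrySet()) {
            // Application filtering: keep only entries that opened at or before timeOfDay.
            int open = Integer.parseInt(column.getKey().split(":")[1]);
            if (open <= timeOfDay) {
                result.add(column.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        CompositeSliceSketch index = new CompositeSliceSketch();
        index.insert("94619", "Monday", 700, 1430, 2, "Egg");
        index.insert("94619", "Monday", 1130, 1430, 1, "Ajanta");
        index.insert("94619", "Monday", 1730, 2130, 1, "Ajanta");
        System.out.println(index.findAvailable("94619", "Monday", 1815));
    }
}
```

Putting the closing time first in the column name is what lets one contiguous range answer the "closes after" half of the query; the "opens before" half still has to be checked in application code.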
  • 73. Needs a few pages of code
      Maintaining the index:
        private void insertAvailability(Restaurant restaurant) {
          for (String zipCode : (Set<String>) restaurant.getServiceArea()) {
            for (TimeRange tr : (Set<TimeRange>) restaurant.getOpeningHours()) {
              String dayOfWeek = format2(tr.getDayOfWeek());
              String openingTime = format4(tr.getOpeningTime());
              String closingTime = format4(tr.getClosingTime());
              String restaurantId = format8(restaurant.getId());

              String key = formatKey(zipCode, dayOfWeek);
              String columnValue = toJson(restaurant);

              Composite columnName = new Composite();
              columnName.add(0, closingTime);
              columnName.add(1, openingTime);
              columnName.add(2, restaurantId);

              ColumnFamilyUpdater<String, Composite> updater = compositeCloseTemplate.createUpdater(key);
              updater.setString(columnName, columnValue);
              compositeCloseTemplate.update(updater);
            }
          }
        }

      Querying the index:
        @Override
        public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
          int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);
          int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
          String zipCode = deliveryAddress.getZip();
          String key = formatKey(zipCode, format2(dayOfWeek));

          HSlicePredicate<Composite> predicate = new HSlicePredicate<Composite>(new CompositeSerializer());
          Composite start = new Composite();
          Composite finish = new Composite();
          start.addComponent(0, format4(timeOfDay), ComponentEquality.GREATER_THAN_EQUAL);
          finish.addComponent(0, format4(2359), ComponentEquality.GREATER_THAN_EQUAL);
          predicate.setRange(start, finish, false, 100);

          final List<AvailableRestaurantIndexEntry> closingAfter = new ArrayList<AvailableRestaurantIndexEntry>();

          ColumnFamilyRowMapper<String, Composite, Object> mapper = new ColumnFamilyRowMapper<String, Composite, Object>() {
            @Override
            public Object mapRow(ColumnFamilyResult<String, Composite> results) {
              for (Composite columnName : results.getColumnNames()) {
                String openTime = columnName.get(1, new StringSerializer());
                String restaurantId = columnName.get(2, new StringSerializer());
                closingAfter.add(new AvailableRestaurantIndexEntry(openTime, restaurantId, results.getString(columnName)));
              }
              return null;
            }
          };

          compositeCloseTemplate.queryColumns(key, predicate, mapper);

          List<AvailableRestaurant> result = new LinkedList<AvailableRestaurant>();
          for (AvailableRestaurantIndexEntry trIdAndAvailableRestaurant : closingAfter) {
            if (trIdAndAvailableRestaurant.isOpenBefore(timeOfDay))
              result.add(trIdAndAvailableRestaurant.getAvailableRestaurant());
          }
          return result;
        }
  • 74. What did I just do to query the data?
      Wrote code to maintain an index
      Reduced performance due to extra writes
  • 75. Mongo vs. Cassandra
      [Diagram, two data centers DC1 and DC2, each with its own client.
       MongoDB: Shard A master and Shard B master split across DC1 and DC2; one access path is labelled "Remote".
       Cassandra: a Cassandra node in each of DC1 and DC2, replicating "Async Or Sync".]
  • 76. VoltDB - attempt #1
      @ProcInfo(singlePartition = false)
      public class FindAvailableRestaurants extends VoltProcedure { ... }

      ERROR 10:12:03,251 [main] COMPILER: Failed to plan for statement type(findAvailableRestaurants_with_join)
        select r.* from restaurant r,time_range tr, service_area sa
        Where ? = sa.zip_code and r.id =tr.restaurant_id and r.id = sa.restaurant_id
        and tr.day_of_week=? and tr.open_time <= ? and ? <= tr.close_time
      Error: "Unable to plan for statement. Likely statement is joining two partitioned tables in a multi-partition statement. This is not supported at this time."
      ERROR 10:12:03,251 [main] COMPILER: Catalog compilation failed.

      Bummer!
  • 77. VoltDB - attempt #2
      Insert into a denormalized table:
        @ProcInfo(singlePartition = true, partitionInfo = "Restaurant.id: 0")
        public class AddRestaurant extends VoltProcedure {

          public final SQLStmt insertAvailable =
              new SQLStmt("INSERT INTO available_time_range VALUES (?,?,?, ?, ?, ?);");

          public long run(....) {
            ...
            for (int i = 0; i < daysOfWeek.length; i++) {
              voltQueueSQL(insertOpeningTimes, id, daysOfWeek[i], openingTimes[i], closingTimes[i]);
              for (String zipCode : serviceArea) {
                voltQueueSQL(insertAvailable, id, daysOfWeek[i], openingTimes[i], closingTimes[i], zipCode, name);
              }
            }
            ...
            voltExecuteSQL(true);
            return 0;
          }
        }

      Query the denormalized table:
        public final SQLStmt findAvailableRestaurants_denorm = new SQLStmt(
            "select restaurant_id, name from available_time_range tr " +
            "where ? = tr.zip_code " +
            "and tr.day_of_week=? " +
            "and tr.open_time <= ? " +
            " and ? <= tr.close_time ");

      Works but queries are only slightly faster than MySQL!
  • 78. VoltDB - attempt #3
      <partitions>
        ...
        <partition table="available_time_range" column="zip_code"/>
      </partitions>

      @ProcInfo(singlePartition = false, ...)
      public class AddRestaurant extends VoltProcedure { ... }

      @ProcInfo(singlePartition = true, partitionInfo = "available_time_range.zip_code: 0")
      public class FindAvailableRestaurants extends VoltProcedure { ... }

      Queries are really fast but inserts are not
      Partitioning scheme: optimal for some use cases but not others
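One way to see why this partitioning scheme favours the query over the insert is to count the partitions each operation touches. The sketch below is an illustration only: `partitionFor` uses a plain hash-modulo rule chosen for the example, which is an assumption and not VoltDB's actual partitioning function.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class PartitionRoutingSketch {

    // Illustrative routing rule (an assumption): hash the partition key, take it modulo the partition count.
    static int partitionFor(Object partitionKey, int partitionCount) {
        return Math.floorMod(partitionKey.hashCode(), partitionCount);
    }

    // The set of partitions an operation must visit for the given partition keys.
    static Set<Integer> partitionsTouched(Iterable<?> partitionKeys, int partitionCount) {
        Set<Integer> touched = new TreeSet<>();
        for (Object key : partitionKeys) {
            touched.add(partitionFor(key, partitionCount));
        }
        return touched;
    }

    public static void main(String[] args) {
        int partitions = 5;
        // A find for one zip code is routed to exactly one partition.
        System.out.println(partitionsTouched(Arrays.asList("94619"), partitions));
        // Adding a restaurant writes one row per zip code served, possibly on several partitions.
        System.out.println(partitionsTouched(Arrays.asList("94619", "94707", "94608"), partitions));
    }
}
```

With `available_time_range` partitioned on `zip_code`, the find carries a single partition key, while adding a restaurant writes rows for every zip code it serves and so can no longer run as a single-partition procedure.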
  • 79. Performance
      Benchmarking is still a work in progress, but so far:

                                    Mongo      Cassandra   VoltDB
      Insert for find available     Awesome    Ok*         Ok
      Find available restaurants    Ok         Ok          Awesome

      * Cassandra can be clustered for improved write performance
  • 80. Is NewSQL/NoSQL a good fit for Food To Go?
                                          Cassandra                       Mongo                         VoltDB
      Representing restaurants            Easy                            Very easy                     Easy
      PK-lookup                           Easy                            Easy                          Easy
      Find available restaurants query    Lots of code                    Easy                          Tricky
      Use as system of record (SOR)       Yes, if distribution required   Yes, for single datacenter    Yes
  • 81. Summary…
      Relational databases are great BUT there are limitations
      Each NoSQL database solves some problems BUT
        Limited transactions: NoSQL = NoACID
        One day needing ACID → major rewrite
        Query-driven, denormalized database design
        …
      NewSQL databases such as VoltDB provide SQL, ACID transactions and incredible performance BUT
        Not all operations are fast
        Non-JDBC API
  • 82. … Summary
      Very carefully pick the NewSQL/NoSQL DB for your application
      Consider a polyglot persistence architecture
      Encapsulate your data access code so you can switch
      Startups = avoid NewSQL/NoSQL for shorter time to market?
  • 83. Thank you!
      Sign up at CloudFoundry.com using promo code EgyptJUG
      My contact info: chris.richardson@springsource.com, @crichardson