SlideShare a Scribd company logo
1 of 78
To be relational, or not ?
             That's not the question!
   @sbtourist aka sergio bossa
   @alexsnaps aka alex snaps

                                          sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Agenda

   Act 1 – Relational databases:
      a difficult love affair.
   Interlude 1 – Problems in the modern era:
       big data and the CAP theorem.
   Act 2 : Non-relational databases: love is in the air.
   Interlude 2 : Relational or non-relational?
       Not the correct question.
   Act 3 : Building scalable apps.
   The End
                                                           sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
About us

   Sergio Bossa
         Software Engineer at Bwin Italy.
         Long time open source contributor.
         (Micro)Blogger - @sbtourist.
   Alex Snaps
         Software engineer at Terracotta …
         … after 10 years of industry experience.
         www.codespot.net — @alexsnaps
                                                    sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Act I
             Relational databases
   A difficult love affair …



                                    sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
The relational model

   Defines constraints
         Finite model
           aka relation variable, relation or table
         Candidate keys
         Foreign keys
   Queries are relations themselves
         Heading & body
   SQL differs slightly from the "pure" relational model
                                                           sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
The ACID guarantees

   Let people easily reason about the problem.
         Atomic.
               We see all changes, or no changes at all.
         Consistent.
               Changes respect our rules and constraints.
         Isolated.
               We see all changes as independently happening.
         Durable.
               We keep the effect of our changes forever.
   Fits a simplified model of our reality.

                                                                sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
The SQL ubiquity

   SQL is everywhere
         Still trying to figure out why my blog
           uses a relational database to be honest
   SQL is known by everyone
         Raise your hand if you've never written a SQL query
   ... and if you don't want to
         ORM are there for you
         ActiveRecord if you're into Rails
   Simple persistence for all our objects!
                                                     sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Interlude 1
   Problems of  the Modern Era



                                 sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Big Data

   What's Big Data for you?
   Not a question of quantity.
         Gigabytes?
         Terabytes?
         Petabytes?
   It's all about supporting your business growth.
         Growth in terms of schema evolution.
         Growth in terms of data storage.
   Growth in terms of data processing.
   Is your data stack capable of handling such a growth?

                                                           sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
CAP Theorem

   Year 2000: Formulated by Eric Brewer.
   Year 2002: Demonstrated by Lynch and Gilbert.
   Nowadays: A religion for many distributed system guys.
   CAP Theorem in a few words:
         Consistency.
         Availability.
         Partition-Tolerance.
         Pick (at most) two.
   More later.

                                                            sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Act II
   Non-relational databases
   Love is in the air

                              sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Non-relational databases


                           From the origins to the current explosion...
                                      How do they differ?




                                                                   sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Data Model (1)


     Column-family.
           Key-identified rows
             with a sparse number of columns.
           Columns grouped in families.
           Multiple families for the same key.
           Dynamically add/remove columns.
           Efficiently access same-family columns




                                                    sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Data Model (2)


           Graph.
                 Vertices represent your data.
                 Edges represent meaningful relations between nodes.
                 Key/Value properties attached to both.
                 Indexed properties.
                 Efficient traversal.




                                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Data Model (3)


           Document.
                 Schemaless documents.
                        With denormalized data.
                 Stored as a whole unit.
                 Clients can update/query contents.




                                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Data Model (4)


           Key/Value.
                 Opaque values.
                 Maximize flexibility.
                 Efficiently store and retrieve by key.




                                                          sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Consistency Model (1)


           Strict (Sequential) Consistency.
                 Every read/write operation act on either:
                        The last value read.
                        The last value written.




                                                     sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Consistency Model (2)


           Eventual Consistency.
                 All read/write operations will eventually reach
                   consistent state.
                        Stale data may be served.
                        Versions may diverge.
                        Repair may be needed.




                                                       sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Partitioning (1)


           Client-side partitioning.
                 Every server is self-contained.
                 Clients partition data per-request.




                                                       sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Partitioning (2)


 Server-side partitioning.
       Servers automatically partition data.
       Consistent-hashing, ring-based
         is the most used.




                                               sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Replication (1)


           Master/Slave.
                 Master propagates changes to slaves.
                        Log replication.
                        Statement replication.
                 Slaves may take over master
                   in case of failures.



                                                   sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Replication (2)


           N-Replication.
                 Nodes are all peers.
                        No master, no slaves.
                 Each node replicates its data
                  to a subset (N) of nodes.




                                                 sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Processing


               A distributed system is built by:
                           Moving data toward its behavior.
                                                              ... or ...
                           Moving behavior toward its data.
               An efficient distributed system is built by:
                           Moving behavior toward its data.
               Map/Reduce is the most common and efficient.




                                                                           sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
CAP - Problem

   CAP Theorem.
         Consistency.
         Availability.
         Partition tolerance.
         Pick two.
         Do you remember?
   Makes sense only
     when dealing with partitions/failures ...
                                                 sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
CAP - Trade-offs

           Consistency + Availability.
                 Requests will wait until partitions heal.
                        Full consistency.
                        Full availability.
                        Max latency.
           Consistency + Partition tolerance.
                 Some requests will act on partial data.
                 Some requests will be refused.
                        Full consistency.
                        Reduced availability.
                        Min latency.
           Availability + Partition tolerance.
                 All requests will be fulfilled.
                        Sacrifice consistency.
                        Max availability.
                        Min latency.

                                                             sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Interlude II
   Relational or non-relational? 
   Not the correct question!

                                    sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Relational or non-relational? Not the correct question!

   Freedom to build the right solution for your problems.
   Freedom to scale your solution as your problems scale.
   Know your use case.
         Understand the problem domain, then choose technology.
   Know your data.
         Understand your data and data access patterns, then choose technology.
   Know your tools.
         Understand available tools, don't go blind.
   Pick the simple solution.
         Choose the simpler technology that works for your problem.
   Build on that.

                                                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Act III
   Building scalable apps



                            sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Building scalable apps
   Relational databases



                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Scaling out...


           Adding app servers will work...
             ... for a little while!
           But at some point we need to scale the database as well

                                                                RDBMS
                                                        WRITE

                                                                                 READ




                                                                  PowerBook G4
                                         PowerBook G4                                   PowerBook G4




                                                                                                       sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Master-Slave

   One master
         gets all the writes
         replicates to slaves
   Slaves (& master)
         gets the read operations

   We didn't really scale writes 
   Static topology
   SPOF remains
                                       sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Master-Master


           Multiple masters
                 Writes & reads all participants
           Writes are replicated to masters
           Synchronously
                 Expensive 2PC
           Asynchronously
                 Conflicts have to be resolved

           Complex
           Static topology
           Limited performance again
           Solved SPOF, sort of…


                                                   sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Vertical Partitioning


           Data is split across multiple database servers based
               on
                 The functional area

           Joins are moved to the application
                 Not relational anymore

           SPOF back
           What about one functional area growing "out of hand" ?


                                                                    sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Horizontal partitioning


           Data is split across multiple database servers based on
                 Key sharding

           Joins are moved to the application
                 Not relational anymore

           SPOF back
           What about one functional area growing "out of hand" ?
           Routing required
                 Where's Order#123 ?

                                                                     sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Building scalable apps
   Caching



                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Caching


           If going to the database is so
               expensive...
               ... we should just avoid it!
           Put a cache in front of the database

           Data remains closer to processing unit
           If needed, distribute

                                                    sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Distributed Caching


           What if data isn't perfectly partitioned ?
           How do we keep this all in sync ?                               RDBMS
                                                                   WRITE

           Peer-to-peer ?                                                                   READ




                                                   Cache                   Cache                     Cache

                                                                             PowerBook G4
                                                    PowerBook G4                                        PowerBook G4




                                                                            sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Distributed Caching


           Cached data remains close to the
              processing unit                                RDBMS



           Central unit                                       L2



              orchestrates it all             L1              L1                                L1




                                                               PowerBook G4
                                              PowerBook G4                                       PowerBook G4




                                                                              sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Distributed Caching




                                                   RDBMS




                                                    L2



                                   L1               L1              L1




                                                     PowerBook G4
                                    PowerBook G4                    PowerBook G4




                                                                                   sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Distributed Caching


                             SLOWER                         LARGER

                                              RAM




                                              L2


                                      L1       L1     L1



                                      Core    Core   Core
                             FASTER                         SMALLER




                                                               sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Distributed Caching


                            SLOWER                                                  LARGER




                                                    RDBMS




                                                     L2



                                     L1              L1              L1




                            FASTER   PowerBook G4
                                                      PowerBook G4
                                                                     PowerBook G4




                                                                                    SMALLER




                                                                                     sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Distributed Caching
                                                    We've scaled our reads
                                                    But what about writes ?
                            SLOWER                                                            LARGER




                                                              RDBMS




                                                               L2



                                     L1                        L1              L1




                            FASTER   PowerBook G4
                                                                PowerBook G4
                                                                               PowerBook G4




                                                                                              SMALLER




                                                                                               sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind Cache

   Rather than write changes directly to the slowest participant
         Write to fastest durable store (persistent queue)
               required for recovery in the face of failure
         Only write to database later
               in batches and/or coalesced
               while still controlling
                        the lag
                        the load on the database
   In a distributed environment handling failures 
       we enforce happens at least once 
         loosens the contract vs. "once and only once"!

                                                                   sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-through


                                           RDBMS




                                           Cache




                                        Application
                                          code




                                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-through


                                           RDBMS




                                           Cache




                                        Application
                                          code




                                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-through


                                           RDBMS




                                           Cache




                                        Application
                                          code




                                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-through


                                           RDBMS




                                           Cache




                                        Application
                                          code




                                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind


                                          RDBMS




                                          Cache




                                       Application
                                         code




                                                     sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind


                                          RDBMS




                                          Cache




                                       Application
                                         code




                                                     sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind


                                          RDBMS




                                          Cache      Writer




                                       Application
                                         code




                                                       sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind


                                          RDBMS




                                          Cache      Writer




                                       Application
                                         code




                                                       sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind


                                          RDBMS




                                          Cache      Writer




                                       Application
                                         code




                                                       sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind


                                          RDBMS




                                          Cache      Writer




                                       Application
                                         code




                                                       sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind


                                          RDBMS




                                          Cache      Writer




                                       Application
                                         code




                                                       sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind


                                          RDBMS




                                          Cache      Writer




                                       Application
                                         code




                                                       sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind


                                          RDBMS




                                          Cache      Writer




                                       Application
                                         code




                                                       sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind


                                          RDBMS




                                                     Writer
                                          Cache       Writer




                                       Application
                                         code




                                                        sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Write-behind


                                          RDBMS




                                                     Writer
                                          Cache       Writer




                                       Application
                                         code




                                                        sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Building scalable apps
   Non-relational databases



                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
The case for non-relationals


           We've seen how to scale our relational database.
           We've seen how to add caching.
           We've seen how to make caching scale.
           So why going non-relational?
                 A matter of use case




                                                              sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Rich Data

   Frequent schema changes.
         Relational databases
          cannot easily handle table modifications.
         Column, Document and Graph databases can.
   Tracking/Processing of large relations.
         Relational databases keep and
          traverse table relations by foreign-key joins.
               Joins are expensive.
         Graph databases provide cheap and
          fast traversal operations.

                                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Runtime Data

   Poorly structured data.
         Relational model
           doesn't fit unstructured data.
               Unless you want BLOBs.
         Non-relational databases provide more flexible models.
   Throw-away data.
         Relational databases provide maximum data durability.
         But not all kind of data are the same.
         When you can afford to possibly lose some data, non-relational
          databases let you trade durability for higher flexibility and
          performance.

                                                         sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Massive Data

   Large quantity of data.
         Relational databases are typically single instance.
               Otherwise cost too much.
         Choose a partitioned non-relational database.
   High volume of writes.
         Scaling writes with relational databases is difficult.
               Even if using write-behind caching.
         Choose a partitioned non-relational database.
         Eventual consistency may also help.
   Complex, aggregated, queries.
         Complex aggregations and queries
           are expensive in standard relational databases.
               Remember joins?
         Choose a non-relational database supporting distributed data processing.
               Hint: Map/Reduce.


                                                                  sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Building scalable apps
   Multi-paradigm



                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
A case for multi-paradigm: CQRS




                                                          sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
CQRS - Explained

           Command Store.
                  Strictly modeled after the problem domain.
                  Commands model domain changes.
                  Domain changes cause events publishing.
           Query Store.
                  Fed by published events.
                  Strictly modeled after the user interface.
                  Possibly denormalized to accommodate queries.
           Offline Store.
                  Fed by published events.
                  Strictly modeled after processing needs.
                            Business Intelligence.
                            Reporting.
                            Statistical aggregations, …

                                                                  sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
What about stores implementation?



                                          sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Introducing Ehcache


           Started in 2003 by Greg Luck
           Apache 2.0 License
           Integrated by lots of projects, products
           Hibernate Provider implemented 2003
           Web Caching 2004
           Distributed Caching 2006
           REST and SOAP APIs 2008
           Acquired
               by Terracotta Sept. 2009

                                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Introducing Ehcache 2.4


           New Search API
           New consistency modes
           New transactional modes

           Still:
                    Small memory footprint
                    Cache Writers
                    JTA
                    Bulk load
           Grows with your application
               with only two lines of configuration
                    Scale Up - BigMemory (100’s of Gig, in process, NO GC)
                    Scale Out - Clustering Platform (Up to 2 Terabytes, HA)


                                                                              sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Ehcache & Terracotta

   Terabyte scale with minimal footprint
   Server array with striping for linear scale
   High availability
   High performance data persistence
   In-memory performance as you scale



                                                 sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Introducing Terrastore

   Document Store.
         Ubiquitous.
         Consistent.
         Distributed.
         Scalable.
   Written in Java.
   Based on Terracotta.
   Open Source.
   Apache-licensed.
                                                 sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Terrastore Architecture




                                                  sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
CQRS - Implemented




                                             sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
CQRS - Implemented

   Command Store.
         Express your domain as an object model.
         Process and store it with Ehcache and Terracotta.
         Optionally write it back to a relational database.
   Query Store.
         Map queries to user (screen) views.
         Map views to JSON documents.
         Store and get them back with Terrastore.
   Offline Store.
         Collect data from input commands.
         Keep data denormalized.
         Aggregate and process with Terrastore Map/Reduce.

                                                              sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
The end
   Be a polyglot!



                        sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Conclusion

   Be polyglot
         Go out and play with all this
         Know your options (all of them)
   Understand your domain
         It’s current and future requirements
   Be wise, and choose well!



                                                sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
Creative Commons by wwworks




                                                      sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011
More info

   www.ehcache.org
   www.terracotta.org
   code.google.com/p/terrastore




                                    sergio bossa & alex snaps - @sbtourist & @alexsnaps
Saturday 5 March 2011

More Related Content

More from Codemotion

Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...Codemotion
 
Pompili - From hero to_zero: The FatalNoise neverending story
Pompili - From hero to_zero: The FatalNoise neverending storyPompili - From hero to_zero: The FatalNoise neverending story
Pompili - From hero to_zero: The FatalNoise neverending storyCodemotion
 
Pastore - Commodore 65 - La storia
Pastore - Commodore 65 - La storiaPastore - Commodore 65 - La storia
Pastore - Commodore 65 - La storiaCodemotion
 
Pennisi - Essere Richard Altwasser
Pennisi - Essere Richard AltwasserPennisi - Essere Richard Altwasser
Pennisi - Essere Richard AltwasserCodemotion
 
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...Codemotion
 
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019Codemotion
 
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019Codemotion
 
Francesco Baldassarri - Deliver Data at Scale - Codemotion Amsterdam 2019 -
Francesco Baldassarri  - Deliver Data at Scale - Codemotion Amsterdam 2019 - Francesco Baldassarri  - Deliver Data at Scale - Codemotion Amsterdam 2019 -
Francesco Baldassarri - Deliver Data at Scale - Codemotion Amsterdam 2019 - Codemotion
 
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...Codemotion
 
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...Codemotion
 
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...Codemotion
 
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...Codemotion
 
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019Codemotion
 
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019Codemotion
 
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019Codemotion
 
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...Codemotion
 
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...Codemotion
 
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019Codemotion
 
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019Codemotion
 
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019Codemotion
 

More from Codemotion (20)

Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
 
Pompili - From hero to_zero: The FatalNoise neverending story
Pompili - From hero to_zero: The FatalNoise neverending storyPompili - From hero to_zero: The FatalNoise neverending story
Pompili - From hero to_zero: The FatalNoise neverending story
 
Pastore - Commodore 65 - La storia
Pastore - Commodore 65 - La storiaPastore - Commodore 65 - La storia
Pastore - Commodore 65 - La storia
 
Pennisi - Essere Richard Altwasser
Pennisi - Essere Richard AltwasserPennisi - Essere Richard Altwasser
Pennisi - Essere Richard Altwasser
 
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
 
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
 
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
 
Francesco Baldassarri - Deliver Data at Scale - Codemotion Amsterdam 2019 -
Francesco Baldassarri  - Deliver Data at Scale - Codemotion Amsterdam 2019 - Francesco Baldassarri  - Deliver Data at Scale - Codemotion Amsterdam 2019 -
Francesco Baldassarri - Deliver Data at Scale - Codemotion Amsterdam 2019 -
 
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
 
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
 
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
 
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
 
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
 
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
 
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
 
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
 
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
 
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
 
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
 
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
 

Recently uploaded

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Recently uploaded (20)

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

To be relational, or not to be relational? That's *not* the question!

  • 1. To be relational, or not ? That's not the question! @sbtourist aka sergio bossa @alexsnaps aka alex snaps sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 2. Agenda Act 1 – Relational databases: a difficult love affair. Interlude 1 – Problems in the modern era: big data and the CAP theorem. Act 2 : Non-relational databases: love is in the air. Interlude 2 : Relational or non-relational? Not the correct question. Act 3 : Building scalable apps. The End sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 3. About us Sergio Bossa Software Engineer at Bwin Italy. Long time open source contributor. (Micro)Blogger - @sbtourist. Alex Snaps Software engineer at Terracotta … … after 10 years of industry experience. www.codespot.net — @alexsnaps sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 4. Act I Relational databases A difficult love affair … sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 5. The relational model Defines constraints Finite model aka relation variable, relation or table Candidate keys Foreign keys Queries are relations themselves Heading & body SQL differs slightly from the "pure" relational model sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 6. The ACID guarantees Let people easily reason about the problem. Atomic. We see all changes, or no changes at all. Consistent. Changes respect our rules and constraints. Isolated. We see all changes as independently happening. Durable. We keep the effect of our changes forever. Fits a simplified model of our reality. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 7. The SQL ubiquity SQL is everywhere Still trying to figure out why my blog uses a relational database to be honest SQL is known by everyone Raise your hand if you've never written a SQL query ... and if you don't want to ORM are there for you ActiveRecord if you're into Rails Simple persistence for all our objects! sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 8. Interlude 1 Problems of  the Modern Era sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 9. Big Data What's Big Data for you? Not a question of quantity. Gigabytes? Terabytes? Petabytes? It's all about supporting your business growth. Growth in terms of schema evolution. Growth in terms of data storage. Growth in terms of data processing. Is your data stack capable of handling such a growth? sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 10. CAP Theorem Year 2000: Formulated by Eric Brewer. Year 2002: Demonstrated by Lynch and Gilbert. Nowadays: A religion for many distributed system guys. CAP Theorem in a few words: Consistency. Availability. Partition-Tolerance. Pick (at most) two. More later. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 11. Act II Non-relational databases Love is in the air sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 12. Non-relational databases From the origins to the current explosion... How do they differ? sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 13. Data Model (1) Column-family. Key-identified rows with a sparse number of columns. Columns grouped in families. Multiple families for the same key. Dynamically add/remove columns. Efficiently access same-family columns sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 14. Data Model (2) Graph. Vertices represent your data. Edges represent meaningful relations between nodes. Key/Value properties attached to both. Indexed properties. Efficient traversal. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 15. Data Model (3) Document. Schemaless documents. With denormalized data. Stored as a whole unit. Clients can update/query contents. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 16. Data Model (4) Key/Value. Opaque values. Maximize flexibility. Efficiently store and retrieve by key. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 17. Consistency Model (1) Strict (Sequential) Consistency. Every read/write operation act on either: The last value read. The last value written. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 18. Consistency Model (2) Eventual Consistency. All read/write operations will eventually reach consistent state. Stale data may be served. Versions may diverge. Repair may be needed. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 19. Partitioning (1) Client-side partitioning. Every server is self-contained. Clients partition data per-request. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 20. Partitioning (2) Server-side partitioning. Servers automatically partition data. Consistent-hashing, ring-based is the most used. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 21. Replication (1) Master/Slave. Master propagates changes to slaves. Log replication. Statement replication. Slaves may take over master in case of failures. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 22. Replication (2) N-Replication. Nodes are all peers. No master, no slaves. Each node replicates its data to a subset (N) of nodes. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 23. Processing A distributed system is built by: Moving data toward its behavior. ... or ... Moving behavior toward its data. An efficient distributed system is built by: Moving behavior toward its data. Map/Reduce is the most common and efficient. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 24. CAP - Problem CAP Theorem. Consistency. Availability. Partition tolerance. Pick two. Do you remember? Makes sense only when dealing with partitions/failures ... sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 25. CAP - Trade-offs Consistency + Availability. Requests will wait until partitions heal. Full consistency. Full availability. Max latency. Consistency + Partition tolerance. Some requests will act on partial data. Some requests will be refused. Full consistency. Reduced availability. Min latency. Availability + Partition tolerance. All requests will be fulfilled. Sacrifice consistency. Max availability. Min latency. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 26. Interlude II Relational or non-relational?  Not the correct question! sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 27. Relational or non-relational? Not the correct question! Freedom to build the right solution for your problems. Freedom to scale your solution as your problems scale. Know your use case. Understand the problem domain, then choose technology. Know your data. Understand your data and data access patterns, then choose technology. Know your tools. Understand available tools, don't go blind. Pick the simple solution. Choose the simpler technology that works for your problem. Build on that. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 28. Act III Building scalable apps sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 29. Building scalable apps Relational databases sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 30. Scaling out... Adding app servers will work... ... for a little while! But at some point we need to scale the database as well RDBMS WRITE READ PowerBook G4 PowerBook G4 PowerBook G4 sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 31. Master-Slave One master gets all the writes replicates to slaves Slaves (& master) gets the read operations We didn't really scale writes  Static topology SPOF remains sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 32. Master-Master Multiple masters Writes & reads all participants Writes are replicated to masters Synchronously Expensive 2PC Asynchronously Conflicts have to be resolved Complex Static topology Limited performance again Solved SPOF, sort of… sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 33. Vertical Partitioning Data is split across multiple database servers based on The functional area Joins are moved to the application Not relational anymore SPOF back What about one functional area growing "out of hand" ? sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 34. Horizontal partitioning Data is split across multiple database servers based on Key sharding Joins are moved to the application Not relational anymore SPOF back What about one functional area growing "out of hand" ? Routing required Where's Order#123 ? sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 35. Building scalable apps Caching sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 36. Caching If going to the database is so expensive... ... we should just avoid it! Put a cache in front of the database Data remains closer to processing unit If needed, distribute sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 37. Distributed Caching What if data isn't perfectly partitioned ? How do we keep this all in sync ? RDBMS WRITE Peer-to-peer ? READ Cache Cache Cache PowerBook G4 PowerBook G4 PowerBook G4 sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 38. Distributed Caching Cached data remains close to the processing unit RDBMS Central unit L2 orchestrates it all L1 L1 L1 PowerBook G4 PowerBook G4 PowerBook G4 sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 39. Distributed Caching RDBMS L2 L1 L1 L1 PowerBook G4 PowerBook G4 PowerBook G4 sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 40. Distributed Caching SLOWER LARGER RAM L2 L1 L1 L1 Core Core Core FASTER SMALLER sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 41. Distributed Caching SLOWER LARGER RDBMS L2 L1 L1 L1 FASTER PowerBook G4 PowerBook G4 PowerBook G4 SMALLER sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 42. Distributed Caching We've scaled our reads But what about writes ? SLOWER LARGER RDBMS L2 L1 L1 L1 FASTER PowerBook G4 PowerBook G4 PowerBook G4 SMALLER sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 43. Write-behind Cache Rather than write changes directly to the slowest participant Write to fastest durable store (persistent queue) required for recovery in the face of failure Only write to database later in batches and/or coalesced while still controlling the lag the load on the database In a distributed environment handling failures  we enforce happens at least once  loosens the contract vs. "once and only once"! sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 44. Write-through RDBMS Cache Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 45. Write-through RDBMS Cache Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 46. Write-through RDBMS Cache Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 47. Write-through RDBMS Cache Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 48. Write-behind RDBMS Cache Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 49. Write-behind RDBMS Cache Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 50. Write-behind RDBMS Cache Writer Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 51. Write-behind RDBMS Cache Writer Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 52. Write-behind RDBMS Cache Writer Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 53. Write-behind RDBMS Cache Writer Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 54. Write-behind RDBMS Cache Writer Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 55. Write-behind RDBMS Cache Writer Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 56. Write-behind RDBMS Cache Writer Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 57. Write-behind RDBMS Writer Cache Writer Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 58. Write-behind RDBMS Writer Cache Writer Application code sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 59. Building scalable apps Non-relational databases sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 60. The case for non-relationals We've seen how to scale our relational database. We've seen how to add caching. We've seen how to make caching scale. So why going non-relational? A matter of use case sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 61. Rich Data Frequent schema changes. Relational databases cannot easily handle table modifications. Column, Document and Graph databases can. Tracking/Processing of large relations. Relational databases keep and traverse table relations by foreign-key joins. Joins are expensive. Graph databases provide cheap and fast traversal operations. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 62. Runtime Data Poorly structured data. Relational model doesn't fit unstructured data. Unless you want BLOBs. Non-relational databases provide more flexible models. Throw-away data. Relational databases provide maximum data durability. But not all kind of data are the same. When you can afford to possibly lose some data, non-relational databases let you trade durability for higher flexibility and performance. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 63. Massive Data Large quantity of data. Relational databases are typically single instance. Otherwise cost too much. Choose a partitioned non-relational database. High volume of writes. Scaling writes with relational databases is difficult. Even if using write-behind caching. Choose a partitioned non-relational database. Eventual consistency may also help. Complex, aggregated, queries. Complex aggregations and queries are expensive in standard relational databases. Remember joins? Choose a non-relational database supporting distributed data processing. Hint: Map/Reduce. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 64. Building scalable apps Multi-paradigm sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 65. A case for multi-paradigm: CQRS sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 66. CQRS - Explained Command Store. Strictly modeled after the problem domain. Commands model domain changes. Domain changes cause events publishing. Query Store. Fed by published events. Strictly modeled after the user interface. Possibly denormalized to accommodate queries. Offline Store. Fed by published events. Strictly modeled after processing needs. Business Intelligence. Reporting. Statistical aggregations, … sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 67. What about stores implementation? sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 68. Introducing Ehcache Started in 2003 by Greg Luck Apache 2.0 License Integrated by lots of projects, products Hibernate Provider implemented 2003 Web Caching 2004 Distributed Caching 2006 REST and SOAP APIs 2008 Acquired by Terracotta Sept. 2009 sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 69. Introducing Ehcache 2.4 New Search API New consistency modes New transactional modes Still: Small memory footprint Cache Writers JTA Bulk load Grows with your application with only two lines of configuration Scale Up - BigMemory (100’s of Gig, in process, NO GC) Scale Out - Clustering Platform (Up to 2 Terabytes, HA) sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 70. Ehcache & Terracotta Terabyte scale with minimal footprint Server array with striping for linear scale High availability High performance data persistence In-memory performance as you scale sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 71. Introducing Terrastore Document Store. Ubiquitous. Consistent. Distributed. Scalable. Written in Java. Based on Terracotta. Open Source. Apache-licensed. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 72. Terrastore Architecture sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 73. CQRS - Implemented sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 74. CQRS - Implemented Command Store. Express your domain as an object model. Process and store it with Ehcache and Terracotta. Optionally write it back to a relational database. Query Store. Map queries to user (screen) views. Map views to JSON documents. Store and get them back with Terrastore. Offline Store. Collect data from input commands. Keep data denormalized. Aggregate and process with Terrastore Map/Reduce. sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 75. The end Be a polyglot! sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 76. Conclusion Be polyglot Go out and play with all this Know your options (all of them) Understand your domain It’s current and future requirements Be wise, and choose well! sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 77. Creative Commons by wwworks sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011
  • 78. More info www.ehcache.org www.terracotta.org code.google.com/p/terrastore sergio bossa & alex snaps - @sbtourist & @alexsnaps Saturday 5 March 2011