LEARNING TO RELAX:
   CouchDB for Beginners
        Windy City DB

OUTLINE

• Introduction and Overview
• CouchDB Basics
• Special Topics in Relaxation: Scaling CouchDB
• Use Cases in the Wild
• Takeaways



Windy City DB, June 26, 2010
HI
• Alan Hoffman
  • @_hoffman
  • alan@cloudant.com

• Experimental particle physicist

• Background: machine learning, big data analysis, distributed systems

• Co-founder of Cloudant (Hosted Couch)

• Not a committer, but...
COUCH: THE BIG PICTURE
• Apache project

• Schema-free document database management system

• Robust, concurrent, fault-tolerant

• RESTful JSON API

• Custom persistent views using MapReduce

• Bi-directional incremental replication

• Futon web admin console
WHO CARES?

                The internet happened, and we ignored it.
                    In retrospect, that was a mistake.


                    -Bill Warner (Avid, Wildfire, Techstars)
                                Summer, 2008



           Disruptive technologies enable new business

DOCUMENTS
[Figure: sample JSON document with callouts: Primary Key; MVCC & Insta-cache; Nested Structures; Binary Attachments]

•   Reserved fields are prefixed with an underscore
•   MVCC _rev deterministically generated from doc content
•   Binary attachments
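The document on the slide is not recoverable from this transcript, so here is a minimal sketch of what such a document could look like. Only the underscore-prefixed field names (_id, _rev, _attachments) are CouchDB's; every other field, the revision value, and the attachment metadata are made up for illustration.

```python
import json

# Hypothetical document. The _rev value is a placeholder: the server assigns it.
doc = {
    "_id": "mydocid",                                  # primary key
    "_rev": "1-0123456789abcdef0123456789abcdef",      # MVCC token (placeholder)
    "type": "talk",
    "speaker": {"name": "Alan Hoffman", "twitter": "@_hoffman"},  # nested structure
    "tags": ["couchdb", "relax"],
    "_attachments": {                                  # binary attachment stub
        "slides.pdf": {"content_type": "application/pdf", "length": 1024, "stub": True}
    },
}

def reserved_fields(d):
    """Return the reserved (underscore-prefixed) top-level fields of a document."""
    return sorted(k for k in d if k.startswith("_"))

print(json.dumps(doc, indent=2))
print(reserved_fields(doc))
```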
RESTFUL API
 •   Create
     PUT /mydb/mydocid
 •   Retrieve
     GET /mydb/mydocid
 •   Update
     PUT /mydb/mydocid
 •   Delete
     DELETE /mydb/mydocid

     GET /mydb/_all_docs?include_docs=true

 “Built of the Web
  Completely embraces... HTTP”
     -Jacob Kaplan-Moss, October 2007

 http://wiki.apache.org/couchdb/Reference
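The four verbs above map directly onto plain HTTP requests. The sketch below only builds the requests with Python's standard library and does not send them; the base URL (a local CouchDB on its default port 5984), the document bodies, and the revision strings are assumptions. An update has to carry the current _rev, and a delete names the revision in the query string.

```python
import json
import urllib.request

BASE = "http://localhost:5984"  # assumption: local CouchDB on the default port

def couch_request(method, path, body=None):
    """Build (but do not send) an HTTP request against the CouchDB API."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(BASE + path, data=data, headers=headers, method=method)

create = couch_request("PUT", "/mydb/mydocid", {"title": "Learning to Relax"})
retrieve = couch_request("GET", "/mydb/mydocid")
# Placeholder revisions: real values come back from the server on each write.
update = couch_request("PUT", "/mydb/mydocid", {"_rev": "1-abc", "title": "Relaxed"})
delete = couch_request("DELETE", "/mydb/mydocid?rev=2-def")
all_docs = couch_request("GET", "/mydb/_all_docs?include_docs=true")

for req in (create, retrieve, update, delete, all_docs):
    print(req.get_method(), req.full_url)
```

Passing any of these to urllib.request.urlopen would send it to a running server.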
VIEWS
  [Figure: diagram labeled map, reduce, key, value]


  • Docs can be indexed by any attribute using views. Custom, persistent
    representations of the data.
  • Each view must have a map function and may also have a reduce function
  • View indices are stored in B-trees for efficient lookup by map key
  • Stored in special documents called _design documents
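As a rough illustration of the bullets above, the sketch below shows a design document holding a view, with the map and reduce functions stored as JavaScript source strings, followed by Python stand-ins that mimic what the view computes. The design document name, view name, and the "speaker" field are hypothetical, and the Python functions are a simulation, not CouchDB's view server.

```python
# A design document holding one view. The JavaScript is stored as strings.
design_doc = {
    "_id": "_design/talks",  # hypothetical name
    "views": {
        "by_speaker": {
            "map": "function(doc) { if (doc.speaker) { emit(doc.speaker, 1); } }",
            "reduce": "function(keys, values, rereduce) { return sum(values); }",
        }
    },
}

def map_fn(doc):
    """Python stand-in for the map function: emit (key, value) rows."""
    if "speaker" in doc:
        yield (doc["speaker"], 1)

def reduce_fn(values):
    """Python stand-in for the reduce function."""
    return sum(values)

def build_view(docs):
    """Run map over every doc and sort the rows by key, as a view index would."""
    return sorted(row for doc in docs for row in map_fn(doc))

docs = [
    {"_id": "t1", "speaker": "chris"},
    {"_id": "t2", "speaker": "alan"},
    {"_id": "t3", "speaker": "alan"},
    {"_id": "t4"},  # no speaker: the map function emits nothing
]
rows = build_view(docs)
total = reduce_fn([value for _, value in rows])
print(rows, total)
```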

INCREMENTAL
• Computing a view can be expensive, so CouchDB saves
  the result in a B-tree and keeps it up-to-date
• Only new docs or changed docs get ‘re-indexed’
• Leaf nodes store map results, inner nodes store reductions
  of children
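The incremental idea can be sketched in a few lines: remember the last change sequence that was indexed, and on the next update run the map function only over documents changed since then. This is a toy model under stated assumptions: a dict stands in for the B-tree, inner-node reductions are not modeled, and the class and field names are invented for illustration.

```python
class IncrementalView:
    """Toy view index that re-maps only docs changed since the last update."""

    def __init__(self, map_fn):
        self.map_fn = map_fn
        self.rows_by_doc = {}  # doc id -> list of (key, value) rows
        self.last_seq = 0      # highest change sequence already indexed
        self.reindexed = 0     # how many docs have been (re)mapped in total

    def update(self, changes):
        """changes: iterable of (seq, doc) pairs in increasing seq order."""
        for seq, doc in changes:
            if seq <= self.last_seq:
                continue  # already indexed, skip
            if doc.get("_deleted"):
                self.rows_by_doc.pop(doc["_id"], None)
            else:
                self.rows_by_doc[doc["_id"]] = list(self.map_fn(doc))
            self.last_seq = seq
            self.reindexed += 1

    def rows(self):
        return sorted(r for rows in self.rows_by_doc.values() for r in rows)


view = IncrementalView(lambda d: [(d["kind"], 1)])
log = [(1, {"_id": "a", "kind": "x"}), (2, {"_id": "b", "kind": "y"})]
view.update(log)
first_pass = view.reindexed

log.append((3, {"_id": "a", "kind": "z"}))  # doc "a" changes
view.update(log)                            # only seq 3 is processed
second_pass = view.reindexed - first_pass
print(view.rows(), first_pass, second_pass)
```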




                http://horicky.blogspot.com/2008/10/couchdb-implementation.html
ROBUST

•   Never overwrite previously committed data

•   Append-only B+trees, ‘copy-on-write’

•   Server crash, power failure? Just restart CouchDB -- there is no “repair”

•   Take snapshots with “cp”

•   ACID at the single document level

J.C. Anderson
REPLICATION
[Figure: replication from source to target, with progress shown; started with one click]

The beauty of MVCC
CouchDB => “Cloud ready”
REPLICATION
•   Peer-based, bi-directional replication using normal HTTP
•   Mediated by a replicator process which can
    live on the source, target, or somewhere else
    entirely
•   Replicate a subset of documents in a DB
    meeting criteria defined in a custom filter
    function
•   Applications (_design documents) replicate
    along with the data
•   Ideal for offline applications: “ground
    computing”
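A minimal sketch of what triggering replication over HTTP could look like: a POST to the _replicate resource of whichever server hosts the replicator, naming a source and a target. The host names below are placeholders and the requests are only built, not sent. Bi-directional replication is simply one request in each direction.

```python
import json
import urllib.request

def replication_request(replicator_url, source, target, continuous=False):
    """Build (but do not send) a POST to the _replicate resource."""
    body = {"source": source, "target": target}
    if continuous:
        body["continuous"] = True
    return urllib.request.Request(
        replicator_url.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

LOCAL = "http://localhost:5984"            # assumption: replicator runs here
REMOTE_DB = "http://example.org:5984/mydb"  # placeholder remote database

push = replication_request(LOCAL, "mydb", REMOTE_DB)  # local -> remote
pull = replication_request(LOCAL, REMOTE_DB, "mydb")  # remote -> local
print(push.full_url, json.loads(push.data))
```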

FILTERED REPLICATION

•   Write the filter function
•   Embed it in a design doc
•   Specify in the replication request
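The slide's screenshots did not survive, so the three steps are sketched here under stated assumptions: the design document name ("app"), filter name ("by_type"), the "type" field, and the hosts are all hypothetical. The filter itself is JavaScript source stored in the design doc; the replication request refers to it as "designdoc/filtername" and can pass parameters that the filter reads from req.query.

```python
import json

# Step 1: the filter function (JavaScript source). It returns true for
# documents that should be replicated.
FILTER_JS = "function(doc, req) { return doc.type === req.query.type; }"

# Step 2: embed it in a design doc under "filters".
design_doc = {"_id": "_design/app", "filters": {"by_type": FILTER_JS}}

def filter_name(ddoc, name):
    """Name used in the replication request: '<design doc name>/<filter name>'."""
    return ddoc["_id"].split("/", 1)[1] + "/" + name

# Step 3: specify the filter (and its parameters) in the replication request body.
replication_body = {
    "source": "mydb",
    "target": "http://example.org:5984/mydb",  # placeholder host
    "filter": filter_name(design_doc, "by_type"),
    "query_params": {"type": "talk"},
}
print(json.dumps(replication_body, indent=2))
```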
MULTI-COUCH SETUPS
[Figure: three replication topologies: Master-Slave, Master-Master, Robust Multi-Master]
CONFLICTS
[Figure: PUT /a/foo on one database and PUT /b/foo on another; after they replicate, the result is a conflict]

     •   Replication can introduce conflicts in a multi-master setup
     •   CouchDB deterministically chooses a winner but the loser is saved with
         the document as a conflicting rev
     •   Conflicting revs are replicated; both source and target will agree on
         winning and losing revs
     •   Compaction does not remove conflicting leaf revs; losing revs stay until
         the application resolves the conflict by deleting them
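To make "deterministically chooses a winner" concrete, here is a simplified sketch of one such rule, based on my understanding of CouchDB's behavior rather than on anything in this deck: prefer non-deleted revisions, then the higher revision generation, then the lexicographically greater revision hash. Because the rule depends only on the revisions themselves, every replica ranks them the same way. The revision strings are made up.

```python
def rev_key(leaf):
    """Sort key for a leaf revision: (not deleted, generation, hash)."""
    generation, _, digest = leaf["rev"].partition("-")
    return (not leaf.get("deleted", False), int(generation), digest)

def pick_winner(leaves):
    """Return (winner, losers). The order in which leaves arrived is irrelevant."""
    ranked = sorted(leaves, key=rev_key, reverse=True)
    return ranked[0], ranked[1:]

# The same two conflicting revisions, seen in a different order on each side.
leaves_on_a = [{"rev": "2-aaa"}, {"rev": "2-bbb"}]
leaves_on_b = [{"rev": "2-bbb"}, {"rev": "2-aaa"}]

winner_a, losers_a = pick_winner(leaves_on_a)
winner_b, losers_b = pick_winner(leaves_on_b)
print(winner_a["rev"], [l["rev"] for l in losers_a])
```

Against a live server, requesting a document with ?conflicts=true lists the losing revisions so the application can resolve them.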
BUILDING A BIG COUCH




                Why CouchDB [Doesn’t] Doesn’t Scale
WHAT WE TALK ABOUT WHEN WE
          TALK ABOUT SCALING
•   Horizontal scaling: more servers create more capacity
•   Transparent to the application: adding more capacity should not affect
    the business logic of the application.
•   No single point of failure.

    Physics joke: “Pseudo Scalars”

                http://adam.heroku.com/past/2009/7/6/sql_databases_dont_scale/
COUCHDB LOUNGE
•   Proxy-based partitioning and clustering application
•   Designed originally for use at Meebo
•   Uses consistent hashing to partition docs across nodes
•   Dumbproxy - nginx module that handles simple GETs and PUTs
•   Smartproxy - A Twisted/Python daemon that handles view requests
•   Want to know more? R. Leeds (tilgovi)
    http://tilgovi.github.com/couchdb-lounge/

[Figure: PUT/GET requests pass through Dumbproxy (nginx); GET /_design/... view requests pass through Smartproxy]
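Consistent hashing, which the slide names as Lounge's partitioning method, can be sketched generically as follows. This is not Lounge's code: the hash function (MD5), the number of points per node, and the node names are all assumptions made for illustration. The property worth noticing is that adding a node only moves documents onto the new node; nothing is reshuffled between existing nodes.

```python
import bisect
import hashlib

class HashRing:
    """Generic consistent-hash ring mapping document ids to node names."""

    def __init__(self, nodes, points_per_node=64):
        # Each node owns several points on the ring to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def node_for(self, doc_id):
        """Walk clockwise from the doc's hash to the next point on the ring."""
        idx = bisect.bisect(self._points, self._hash(doc_id)) % len(self._ring)
        return self._ring[idx][1]


ring3 = HashRing(["couch1", "couch2", "couch3"])
ring4 = HashRing(["couch1", "couch2", "couch3", "couch4"])

ids = [f"doc{i}" for i in range(200)]
moved = [d for d in ids if ring3.node_for(d) != ring4.node_for(d)]
print(len(moved), "of", len(ids), "docs move when couch4 joins")
```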
OPEN CLOUDANT
•   Clustering in a ring (a la Dynamo)
•   Any node can handle a request
•   O(1) lookup
•   Quorum system (N, R, W)
•   Views distributed like documents
•   Distributed Erlang
•   Masterless

✓ Horizontally scalable
✓ No SPOF
✓ Transparent to the application

[Figure: PUT http://alan.cloudant.com/dbname/blah?w=2 passes through a load balancer to a ring of 24 nodes, with N=3, W=2, R=2 and hash(blah) = E. Each node holds a run of adjacent partitions (Node 1: A B C D; Node 2: B C D E; Node 3: C D E F; Node 4: D E F G; Node 24: X Y Z A)]

Coming soon to a github near you!
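The quorum parameters on the slide (N=3, W=2, R=2) can be illustrated with a small Dynamo-style sketch. How Cloudant implements this is not shown in the deck, so the function names and the four-node ring below are assumptions. N is the number of copies of each document, W the number of acknowledgements a write waits for, and R the number of copies a read consults; when R + W > N, every read set overlaps every write set.

```python
def preference_list(ring, start, n):
    """The n consecutive nodes, clockwise from index `start`, that store a doc."""
    return [ring[(start + i) % len(ring)] for i in range(n)]

def quorum_met(acks, needed):
    """A request succeeds once `needed` replicas have answered."""
    return acks >= needed

def overlapping(n, r, w):
    """True when any read quorum must intersect any write quorum."""
    return r + w > n

N, R, W = 3, 2, 2                      # values from the slide
ring = ["node1", "node2", "node3", "node4"]  # hypothetical, smaller than the slide's ring

replicas = preference_list(ring, start=3, n=N)  # wraps around the ring
print(replicas, overlapping(N, R, W))
```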
IN THE WILD




• 15+ million deployments
• 3 books
• Vibrant, open community
• Active commercial support
• 1.0 imminent
CASE #1: REALTIME ANALYTICS

•   Analytics on high-rate advertising data
•   ETL analysis workflow too slow for their customers (24 hr cycle)
    •   Needed a realtime solution
•   Complicated SQL stored procedures for social graph analysis
    required 40+ postgres tables
•   Replaced it all with a single CouchDB document type and two
    views:
    •   group level collation to bin data at multiple granularities => customers get
        updated results in seconds, not hours
    •   single view (30 lines of JS) for graph analysis.
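The customer's actual views are not shown, but group level collation can be sketched like this: the map step emits a time key with one component per granularity, and the query picks how many leading components to group on. In CouchDB that choice is the group_level query parameter on a reduced view; below it is simulated in Python, and the "timestamp" and "clicks" fields are hypothetical.

```python
from collections import defaultdict

def map_fn(doc):
    """Emit a [year, month, day, hour] key so one view serves many granularities."""
    year, month, day, hour = doc["timestamp"]
    yield ((year, month, day, hour), doc["clicks"])

def query(docs, group_level):
    """Sum values over keys truncated to their first `group_level` components."""
    totals = defaultdict(int)
    for doc in docs:
        for key, value in map_fn(doc):
            totals[key[:group_level]] += value
    return dict(sorted(totals.items()))

events = [
    {"timestamp": (2010, 6, 26, 9), "clicks": 3},
    {"timestamp": (2010, 6, 26, 10), "clicks": 4},
    {"timestamp": (2010, 6, 27, 9), "clicks": 5},
]

per_day = query(events, group_level=3)
per_month = query(events, group_level=2)
print(per_day, per_month)
```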

MONEY QUOTE

           Migrating to CouchDB really opened a lot of doors
           for us product-wise. The time delay between data
          arriving in our systems and becoming available to our
        customers went from 24 hours to less than 30 min - on
         similar hardware - even while we greatly increased the
             level of granularity that our processing provided




CASE #2: EASYBIB
•   Online bibliography service, ~10 years old, initially built on MySQL
    (and Coldfusion)
•   Had suffered through many migrations
•   Choice: massive sharding and replication of MySQL v. “another
    option”
•   Why Couch:
    •   Schema Free (replacing 40 - 50 tables with 3 DBs)
    •   Easily scalable
    •   Strong community support


                “In your best Borat voice: ‘Great Success!’”
CASE #3: MEEBO
• “All your friends and networks, from wherever you are.”
• Why Couch?
   • No schema (and ergo, no schema migrations)
   • Replication
   • Could deal with queries that would break on a sharded RDBMS
   • REST interface -- easy to re-use existing tools and libraries
   • Easy to write a proxy layer that keeps sharding out of the app
     logic
• Wishes? Speed, API stability, native clustering
PARAPHRASING THE MASSES
• Why CouchDB?
   •   Simple, robust, concurrent, fun, scalable, powerful
   •   Successful in production, active community, industry adoption
• Why Not Couch?
   •   Missing Features
       •   ad hoc queries: True, by design
       •   authz/authn: Included in 0.11
       •   doesn’t scale: Lounge, Pillow, Open Cloudant, etc.
   •   Too New -- API still changing, still alpha: 0.11 feature freeze and 1.0 imminent
   •   “Too Slow”: Perhaps, but...
DESERVING OF MORE TIME
• CouchApp: HTML+JS framework for building
   lightweight, portable apps and serving them directly
   from CouchDB
   •   http://github.com/couchapp/couchapp/

• External indexers like CouchDB-Lucene
   •   http://github.com/rnewson/couchdb-lucene

• The plethora of client libraries and tools...
TRY IT OUT


Hosted Free:
Cloudant.com


                             Easy Offline:
                             CouchDBX



THANK YOU
• Books
  • CouchDB: The Definitive Guide. J. Chris Anderson, Jan
    Lehnardt, Noah Slater
  • Beginning CouchDB. Joe Lennon
• Web
  • http://wiki.apache.org/couchdb/
  • http://planet.couchdb.org/
• IRC
  • Freenode #couchdb
  • Freenode #cloudant


relax
AUTHZ/AUTHN
• Remember, Couch acts like a web service
• Authentication:
   •   0.11+ ships with support for OAuth, cookie, and basic
   •   Handlers specified in a config file
   •   Users defined in authentication database (“_users” by default)
• Authorization
   •   3 levels: DB reader, DB admin, Server Admin
   •   Per DB roles defined in security document


EXAMPLES
[Figure: User Document example]

[Figure: Security Document example]

Caution! Do not leave arrays blank

http://wiki.apache.org/couchdb/Security_Features_Overview
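The two example documents on this slide were images and are lost, so the sketch below reconstructs their general shape from my understanding of the 0.11-era layout; treat the field names as an assumption and check them against the wiki page above. The user names, roles, password, and salt are invented. In line with the slide's caution, none of the arrays is left empty.

```python
import hashlib

def user_doc(name, password, salt, roles):
    """Build a user document for the authentication database (_users by default)."""
    return {
        "_id": "org.couchdb.user:" + name,
        "type": "user",
        "name": name,
        "roles": roles,
        "salt": salt,
        # Assumed scheme: SHA-1 over the password followed by the salt.
        "password_sha": hashlib.sha1((password + salt).encode("utf-8")).hexdigest(),
    }

# Per-database security document: who may administer and who may read.
security_doc = {
    "admins": {"names": ["alan"], "roles": ["ops"]},
    "readers": {"names": ["windy"], "roles": ["attendee"]},
}

alan = user_doc("alan", password="secret", salt="abc", roles=["attendee"])
print(alan["_id"], sorted(security_doc))
```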
DRAWBACKS

• “Futon -- difficult to use for installations that have a lot
   of DBs (1000+)”
• “Tools for managing design docs are deficient”
• “Client libraries too focused on Couch as the ‘M’ in
   MVC apps.”
• “Couch 1.0 is a moving target”

Top 10 Hubspot Development Companies in 2024
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 

Learn to Relax with CouchDB for Beginners

  • 1. LEARNING TO RELAX: CouchDB for Beginners Windy City DB 1
  • 2. OUTLINE • Introduction and Overview • CouchDB Basics • Special Topics in Relaxation: Scaling CouchDB • Use Cases In the Wild • Takeaways Windy City DB 2 June 26, 2010
  • 3. HI • Alan Hoffman • @_hoffman • alan@cloudant.com • Experimental particle physicist • Background: machine learning, big data analysis, distributed systems • Co-founder of Cloudant (Hosted Couch) • Not a committer, but...
  • 4. COUCH: THE BIG PICTURE • Apache project • Schema-free document database management system • Robust, concurrent, fault-tolerant • RESTful JSON API • Custom persistent views using MapReduce • Bi-directional incremental replication • Futon web admin console
  • 5. WHO CARES? The internet happened, and we ignored it. In retrospect, that was a mistake. -Bill Warner (Avid, Wildfire, Techstars) Summer, 2008 Disruptive technologies enable new business
  • 6. DOCUMENTS (figure callouts: Primary Key; MVCC & Insta-cache; Nested Structures; Binary Attachments) • Reserved fields are prefixed with an underscore • MVCC _rev deterministically generated from doc content • Binary attachments
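The MVCC bullets on this slide can be illustrated with a toy in-memory store. This is a minimal sketch, not CouchDB's implementation: `make_rev`, `ToyStore` and `Conflict` are hypothetical names, and hashing sorted JSON with MD5 is an assumption chosen only to show a revision id that is a pure function of the document content.

```python
import hashlib
import json


def make_rev(generation, body):
    """Build a rev string "<generation>-<hash>" from the document content."""
    # The previous _rev is not part of the content being hashed.
    content = {k: v for k, v in body.items() if k != "_rev"}
    encoded = json.dumps(content, sort_keys=True).encode("utf-8")
    return f"{generation}-{hashlib.md5(encoded).hexdigest()}"


class Conflict(Exception):
    """Raised when a write does not carry the current _rev (CouchDB answers 409)."""


class ToyStore:
    """Hypothetical store that accepts an update only against the latest revision."""

    def __init__(self):
        self.docs = {}

    def put(self, doc):
        current = self.docs.get(doc["_id"])
        if current is not None and doc.get("_rev") != current["_rev"]:
            raise Conflict(doc["_id"])
        generation = 1 if current is None else int(current["_rev"].split("-")[0]) + 1
        stored = dict(doc)
        stored["_rev"] = make_rev(generation, doc)
        self.docs[doc["_id"]] = stored
        return stored["_rev"]
```

A writer that sends a stale `_rev` gets `Conflict` instead of silently overwriting someone else's update, which is the optimistic-concurrency behaviour the slide refers to.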
  • 7. RESTFUL API • Create: PUT /mydb/mydocid • Retrieve: GET /mydb/mydocid • Update: PUT /mydb/mydocid • Delete: DELETE /mydb/mydocid • GET /mydb/_all_docs?include_docs=true “Built of the Web” “Completely embraces... HTTP” -Jacob Kaplan-Moss, October 2007 http://wiki.apache.org/couchdb/Reference
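The four verbs on this slide map directly onto plain HTTP requests. The sketch below only builds the requests with Python's standard library and does not send them; the base URL, the document body and the placeholder revision `1-abc` are assumptions for illustration.

```python
import json
import urllib.request

BASE = "http://localhost:5984"  # assumption: a local CouchDB on its default port


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request against the CouchDB API."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(BASE + path, data=data, method=method)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


create = build_request("PUT", "/mydb/mydocid", {"title": "relax"})
retrieve = build_request("GET", "/mydb/mydocid")
# An update is a PUT that carries the current _rev; "1-abc" is a placeholder.
update = build_request("PUT", "/mydb/mydocid", {"title": "relaxed", "_rev": "1-abc"})
# A delete also names the revision being deleted.
delete = build_request("DELETE", "/mydb/mydocid?rev=1-abc")
all_docs = build_request("GET", "/mydb/_all_docs?include_docs=true")
```

Passing any of these objects to `urllib.request.urlopen` would perform the call against a running server.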
  • 8. VIEWS (diagram labels: map, reduce, key, value) • Docs can be indexed by any attribute using views: custom, persistent representations of the data • Each view must have a map function and may also have a reduce function • View indices are stored in B-trees for efficient lookup by map key • View definitions are stored in special documents called _design documents
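A view's map and reduce steps can be simulated in a few lines. Real CouchDB view functions are stored in a design document (JavaScript by default); the Python below is only a sketch of the data flow, and the sample documents and function names are made up for illustration.

```python
from collections import defaultdict

docs = [
    {"_id": "1", "type": "talk", "city": "Chicago", "minutes": 30},
    {"_id": "2", "type": "talk", "city": "Boston", "minutes": 45},
    {"_id": "3", "type": "talk", "city": "Chicago", "minutes": 20},
]


def map_fn(doc):
    """Emit (key, value) rows for one document, like emit(key, value) in a view."""
    if doc.get("type") == "talk":
        yield doc["city"], doc["minutes"]


def reduce_fn(values):
    """Combine all values that share a key."""
    return sum(values)


def query_view(documents):
    """Run map over every doc, sort rows by key, then reduce per key."""
    rows = sorted((key, value) for doc in documents for key, value in map_fn(doc))
    grouped = defaultdict(list)
    for key, value in rows:
        grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}
```

Here `query_view(docs)` groups talk minutes by city; the sorted rows stand in for the key-ordered B-tree the slide mentions.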
  • 9. INCREMENTAL • Computing a view can be expensive, so CouchDB saves the result in a B-tree and keeps it up-to-date • Only new docs or changed docs get ‘re-indexed’ • Leaf nodes store map results, inner nodes store reductions of children http://horicky.blogspot.com/2008/10/couchdb-implementation.html
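The "only new or changed docs get re-indexed" idea can be sketched as follows. This is an illustration under simplifying assumptions: `IncrementalView` is a hypothetical class, it keeps rows in a dict rather than a B-tree, and it detects changes by comparing `_rev` values instead of reading a sequence index.

```python
class IncrementalView:
    """Toy view index that re-maps only documents whose _rev changed."""

    def __init__(self, map_fn):
        self.map_fn = map_fn
        self.rows_by_doc = {}   # doc id -> list of (key, value) rows
        self.indexed_rev = {}   # doc id -> _rev seen at the last indexing pass
        self.map_calls = 0      # counts how much mapping work was actually done

    def update(self, docs):
        for doc in docs:
            if self.indexed_rev.get(doc["_id"]) == doc["_rev"]:
                continue  # unchanged since the last update: keep the stored rows
            self.map_calls += 1
            self.rows_by_doc[doc["_id"]] = list(self.map_fn(doc))
            self.indexed_rev[doc["_id"]] = doc["_rev"]

    def rows(self):
        """Return all stored rows in key order."""
        return sorted(row for rows in self.rows_by_doc.values() for row in rows)


def by_city(doc):
    yield doc["city"], 1
```

After a first `update` over two documents, changing one of them and calling `update` again runs the map function once more, not twice.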
  • 10. ROBUST • Never overwrite previously committed data • Append-only B+trees, ‘copy-on-write’ • Server crash, power failure? Just restart CouchDB -- there is no “repair” • Take snapshots with “cp” J.C. Anderson • ACID at the single document level
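The append-only idea can be shown with a toy log file. This is a sketch, not CouchDB's storage format: CouchDB appends B+tree nodes to its database file, whereas the hypothetical `AppendOnlyStore` below appends one JSON line per write.

```python
import json
import os


class AppendOnlyStore:
    """Toy store: each write appends a record; nothing is overwritten in place."""

    def __init__(self, path):
        self.path = path

    def put(self, doc_id, body):
        # Mode "a" only ever adds bytes at the end of the file.
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"id": doc_id, "body": body}) + "\n")

    def get(self, doc_id):
        latest = None
        if os.path.exists(self.path):
            with open(self.path, encoding="utf-8") as log:
                for line in log:
                    record = json.loads(line)
                    if record["id"] == doc_id:
                        latest = record["body"]  # later records supersede earlier ones
        return latest
```

Because committed bytes are never modified, "restarting" is just opening the same file again, and copying the file at any moment yields a usable snapshot of everything written up to that point.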
  • 11. REPLICATION source target progress The beauty of MVCC one click CouchDB => “Cloud ready”
  • 12. REPLICATION • Peer-based, bi-directional replication using normal HTTP • Mediated by a replicator process which can live on the source, target, or somewhere else entirely • Replicate a subset of documents in a DB meeting criteria defined in a custom filter function • Applications (_design documents) replicate along with the data • Ideal for offline applications: “ground computing”
  • 13. FILTERED REPLICATION • Write the filter function • Embed it in a design doc • Specify it in the replication request
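The three steps on this slide can be written out as data. The sketch below builds the design document and the replication request body as Python dicts; the database names, the design document name `app`, the filter name `by_type` and the target URL are all assumptions for illustration.

```python
import json

# Step 1: the filter function is JavaScript source, stored as a string.
# It returns true for documents that should be replicated.
filter_source = (
    "function(doc, req) {"
    " return doc.type === req.query.type;"
    "}"
)

# Step 2: embed it in a design document under "filters".
design_doc = {
    "_id": "_design/app",
    "filters": {"by_type": filter_source},
}

# Step 3: name it as "<design doc name>/<filter name>" in the replication request.
replication_request = {
    "source": "mydb",
    "target": "http://example.com:5984/mydb_copy",
    "filter": "app/by_type",
    "query_params": {"type": "talk"},
}

body = json.dumps(replication_request)  # would be POSTed to /_replicate
```

The values in `query_params` reach the filter function through `req.query`, so one filter can serve several replication jobs.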
  • 14. MULTI-COUCH SETUPS Master-Slave Robust Multi-Master Master-Master
  • 15. CONFLICTS (diagram: PUT /a/foo, PUT /b/foo, replicate => conflict) • Replication can introduce conflicts in a multi-master setup • CouchDB deterministically chooses a winner but the loser is saved with the document as a conflicting rev • Conflicting revs are replicated; both source and target will agree on winning and losing revs • Compacting the DB removes all losing revs
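Why both sides agree on the winner can be sketched with a deterministic ordering over revision ids. The rule below (highest generation first, ties broken by the greater hash string) is an assumption about the general shape of the algorithm and ignores deleted revisions; `pick_winner` is a hypothetical helper, not a CouchDB API.

```python
def pick_winner(revs):
    """Choose a winning revision from a list of "<generation>-<hash>" strings.

    Assumed rule: the highest generation wins and ties are broken by the
    greater hash string, so every node picks the same winner independently.
    """
    def sort_key(rev):
        generation, _, digest = rev.partition("-")
        return int(generation), digest

    ordered = sorted(revs, key=sort_key, reverse=True)
    return ordered[0], ordered[1:]  # (winner, losing revs kept as conflicts)
```

Since the choice depends only on the set of revision ids and not on arrival order, source and target reach the same answer without coordinating.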
  • 16. BUILDING A BIG COUCH Why CouchDB ^Doesn’t Doesn’t Scale
  • 17. WHAT WE TALK ABOUT WHEN WE TALK ABOUT SCALING • Horizontal scaling: more servers create more capacity • Transparent to the application: adding more capacity should not affect the business logic of the application • No single point of failure Physics Joke! Pseudo Scalars http://adam.heroku.com/past/2009/7/6/sql_databases_dont_scale/
  • 18. COUCHDB LOUNGE • Proxy-based partitioning and clustering • Designed originally for use at Meebo • Uses consistent hashing to partition docs across nodes • Dumbproxy - nginx module that handles simple GETs and PUTs • Smartproxy - a Twisted/Python daemon that handles view requests • Want to know more? R. Leeds (tilgovi) http://tilgovi.github.com/couchdb-lounge/ (diagram: application PUT/GET => Dumbproxy (nginx); GET /_design/... => Smartproxy)
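Consistent hashing, as named on this slide, can be sketched with a sorted ring of hash positions. This is a generic illustration and not Lounge's actual code: `HashRing`, the MD5 hash and the 64 virtual points per node are all assumptions.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical ring: a doc belongs to the first node point at or after its hash."""

    def __init__(self, nodes, points_per_node=64):
        # Several points per node spread each node's share around the ring.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self._positions = [position for position, _ in self._ring]

    def node_for(self, doc_id):
        # Wrap around to the first point when the hash is past the last one.
        index = bisect.bisect(self._positions, _hash(doc_id)) % len(self._ring)
        return self._ring[index][1]
```

The useful property is that adding a node only moves documents onto the new node; documents never shuffle between the existing ones, which keeps repartitioning cheap.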
  • 19. OPEN CLOUDANT • Clustering in a ring (a la Dynamo) • Any node can handle a request • O(1) lookup • Quorum system (N, R, W) • Views distributed like documents • Distributed Erlang • Masterless ✓ Horizontally scalable ✓ No SPOF ✓ Transparent to the application (ring diagram: PUT http://alan.cloudant.com/dbname/blah?w=2 passes through a load balancer to a ring of nodes holding lettered partitions; hash(blah) = E; N=3, W=2, R=2) Coming soon to a github near you!
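The (N, R, W) quorum arithmetic from this slide fits in two small functions. This is a sketch of the general Dynamo-style rule only; the function names are hypothetical and nothing here reflects Cloudant's internal code.

```python
def write_succeeds(acks, w):
    """A write is acknowledged once at least W replicas have stored it."""
    return acks >= w


def quorums_overlap(n, r, w):
    """True when every read set of R replicas intersects every write set of W."""
    return r + w > n


N, R, W = 3, 2, 2  # the values shown on the slide
```

With N=3, W=2 and R=2 the overlap condition holds, so any two replicas read must include at least one that took part in the latest acknowledged write, and a write can still succeed with one replica down.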
  • 20. IN THE WILD • 15+ million deployments • Active commercial support • 3 books • 1.0 imminent • Vibrant, open community
  • 21. CASE #1: REALTIME ANALYTICS • Analytics on high-rate advertising data • ETL analysis workflow too slow for their customers (24 hr cycle) • Needed a realtime solution • Complicated SQL stored procedures for social graph analysis required 40+ Postgres tables • Replaced it all with a single CouchDB document type and two views: • group-level collation to bin data at multiple granularities => customers get updated results in seconds, not hours • single view (30 lines of JS) for graph analysis
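Binning at multiple granularities with group levels can be simulated as follows. In CouchDB this is done by emitting array keys from the map function and querying the view with the `group_level` parameter; the Python below only mimics that behaviour, and the sample rows are invented.

```python
from collections import defaultdict

# Rows as a map function might emit them: key = [year, month, day], value = count.
rows = [
    ((2010, 6, 25), 3),
    ((2010, 6, 26), 5),
    ((2010, 7, 1), 2),
]


def reduce_at_level(rows, group_level):
    """Sum values after truncating each key to its first group_level elements."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key[:group_level]] += value
    return dict(sorted(totals.items()))
```

One view therefore answers per-year, per-month and per-day questions: only the `group_level` in the query changes, not the index.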
  • 22. MONEY QUOTE Migrating to CouchDB really opened a lot of doors for us product-wise. The time delay between data arriving in our systems and becoming available to our customers went from 24 hours to less than 30 min - on similar hardware - even while we greatly increased the level of granularity that our processing provided
  • 23. CASE #2: EASYBIB • Online bibliography service, ~10 years old, initially built on MySQL (and ColdFusion) • Had suffered through many migrations • Choice: massive sharding and replication of MySQL vs. “another option” • Why Couch: • Schema Free (replacing 40 - 50 tables with 3 DBs) • Easily scalable • Strong community support “In your best Borat voice: ‘Great Success!’”
  • 24. CASE #3: MEEBO • “All your friends and networks, from wherever you are.” • Why Couch? • No Schema (and ergo, no schema migrations) • Replication • Could deal with queries that would break on a sharded RDBMS • REST interface -- easy to re-use existing tools and libraries • Easy to write a proxy layer that keeps sharding out of the app logic • Wishes? Speed, API stability, native clustering
  • 25. PARAPHRASING THE MASSES • Why CouchDB? • Simple, robust, concurrent, fun • successful in production • Why Not Couch? • Missing Features • ad hoc queries • authz/authn • doesn’t scale • Too New -- api still changing, still alpha • “Too Slow”
  • 26. PARAPHRASING THE MASSES • Why CouchDB? • Simple, robust, concurrent, fun, scalable, powerful • successful in production, active community, industry adoption • Why Not Couch? • Missing Features • ad hoc queries • authz/authn • doesn’t scale • Too New -- api still changing, still alpha • “Too Slow”
  • 27. PARAPHRASING THE MASSES • Why CouchDB? • Simple, robust, concurrent, fun, scalable, powerful • successful in production, active community, industry adoption • Why Not Couch? • Missing Features • ad hoc queries: True, by design • authz/authn: Included in 0.11 • doesn’t scale: Lounge, Pillow, Open Cloudant, etc. • Too New -- api still changing, still alpha: 0.11 feature freeze and 1.0 imminent • “Too Slow”: Perhaps, but...
  • 28. DESERVING OF MORE TIME • CouchApp: HTML+JS framework for building lightweight, portable apps and serving them directly from CouchDB • http://github.com/couchapp/couchapp/ • External indexers like CouchDB-Lucene • http://github.com/rnewson/couchdb-lucene • The plethora of client libraries and tools...
  • 29. TRY IT OUT Hosted Free: Cloudant.com Easy Offline: CouchDBX
  • 30. THANK YOU • Books • CouchDB: The Definitive Guide. J. Chris Anderson, Jan Lehnardt, Noah Slater • Beginning CouchDB. Joe Lennon • Web • http://wiki.apache.org/couchdb/ • http://planet.couchdb.org/ • IRC • Freenode #couchdb • Freenode #cloudant
  • 31. relax
  • 32. AUTHZ/AUTHN • Remember, Couch acts like a web service • Authentication: • 0.11+ ships with support for OAuth, cookie, and basic • Handlers specified in a config file • Users defined in authentication database (“_users” by default) • Authorization • 3 levels: DB reader, DB admin, Server Admin • Per DB roles defined in security document
  • 33. EXAMPLES User Document • Security Document • Caution! Do not leave arrays blank http://wiki.apache.org/couchdb/Security_Features_Overview
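The two example documents on this slide did not survive as text, so the sketch below reconstructs their general shape as Python dicts. Everything in it is an assumption based on the 0.11-era layout described on the wiki page above: the field names, the user names, the roles, and the salted SHA-1 password hash should all be checked against that page before use.

```python
import hashlib

password, salt = "secret", "abc"  # placeholders for illustration only

# A user document, stored in the authentication database ("_users" by default).
user_doc = {
    "_id": "org.couchdb.user:alice",
    "type": "user",
    "name": "alice",
    "roles": ["analyst"],
    "salt": salt,
    # Assumed scheme: SHA-1 of the password concatenated with the salt.
    "password_sha": hashlib.sha1((password + salt).encode("utf-8")).hexdigest(),
}

# A per-database security document with the two database-level roles.
# The arrays are deliberately non-empty, following the slide's caution.
security_doc = {
    "admins": {"names": ["alice"], "roles": ["dbadmin"]},
    "readers": {"names": ["bob"], "roles": ["analyst"]},
}
```

The third level on the previous slide, server admin, is configured on the server rather than in either of these documents.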
  • 34. DRAWBACKS • “Futon -- difficult to use for installations that have a lot of DBs (1000+)” • “Tools for managing design docs are deficient” • “Client libraries too focused on Couch as the ‘M’ in MVC apps.” • “Couch 1.0 is a moving target”

Editor's Notes