Open source, high performance database




                                   Jared Rosoff
                                     @forjared
                                                  1
Challenges                 Opportunities
• Variably typed data      • Agility
• Complex objects          • Cost reduction
• High transaction rates   • Simplification
• Large data size
• High availability




                                              2
AGILE DEVELOPMENT
• Iterative and continuous
• New and emerging apps

VOLUME AND TYPE OF DATA
• Trillions of records
• 10’s of millions of queries per second
• Volume of data
• Semi-structured and unstructured data

NEW ARCHITECTURES
• Systems scaling horizontally, not vertically
• Commodity servers
• Cloud Computing



                                                                       3
DEVELOPER PRODUCTIVITY DECREASES
• Needed to add new software layers of ORM, Caching, Sharding and Message Queue
• Polymorphic, semi-structured and unstructured data not well supported

COST OF DATABASE INCREASES
• Increased database licensing cost
• Vertical, not horizontal, scaling
• High cost of SAN

Timeline: PROJECT START, LAUNCH, +30 DAYS, +90 DAYS, +6 MONTHS, +1 YEAR
Steps along the timeline: DENORMALIZE DATA MODEL, STOP USING JOINS, CUSTOM CACHING LAYER, CUSTOM SHARDING
                                                                                                               4
How we got here




                  5
•   Variably typed data
•   Complex data objects
•   High transaction rates
•   Large data size
•   High availability
•   Agile development




                             6
• Metadata management
• EAV anti-pattern
• Sparse table anti-
  pattern




                        7
Entity   Attribute      Value
1        Type           Book
1        Title          Inside Intel
1        Author         Andy Grove
2        Type           Laptop
2        Manufacturer   Dell
2        Model          Inspiron
2        RAM            8gb
3        Type           Cereal

• Difficult to query
• Difficult to index
• Expensive to query
                                                         8
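To make the "difficult to query" point concrete, here is a minimal sketch in plain JavaScript of the work an application has to do to turn EAV rows back into usable objects. The `rows` array copies the table above; the `toDocuments` helper is hypothetical and only for illustration.

```javascript
// EAV rows copied from the table above.
const rows = [
  { entity: 1, attribute: "Type", value: "Book" },
  { entity: 1, attribute: "Title", value: "Inside Intel" },
  { entity: 1, attribute: "Author", value: "Andy Grove" },
  { entity: 2, attribute: "Type", value: "Laptop" },
  { entity: 2, attribute: "Manufacturer", value: "Dell" },
  { entity: 2, attribute: "Model", value: "Inspiron" },
  { entity: 2, attribute: "RAM", value: "8gb" },
  { entity: 3, attribute: "Type", value: "Cereal" },
];

// Hypothetical helper: pivot the rows into one object per entity.
// Every read of a whole entity needs this kind of reassembly.
function toDocuments(eavRows) {
  const docs = new Map(); // entity id -> reassembled object
  for (const { entity, attribute, value } of eavRows) {
    if (!docs.has(entity)) docs.set(entity, { _id: entity });
    docs.get(entity)[attribute] = value;
  }
  return [...docs.values()];
}

const documents = toDocuments(rows);
console.log(documents);
```

In a document model each of these objects would be stored as it is, so no pivot step is needed at read time.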
Type     Title           Author       Manufacturer   Model      RAM   Screen Width
Book     Inside Intel    Andy Grove
Laptop                                Dell           Inspiron   8gb   12”
TV                                    Panasonic      Viera            52”
MP3      Margin Walker   Fugazi       Dischord

• Constant schema changes
• Space inefficient
• Overloaded fields


                                                                                    9
Type     Manufacturer   Model      RAM   Screen Width
Laptop   Dell           Inspiron   8gb   12”

Manufacturer   Model   Screen Width
Panasonic      Viera   52”

Title           Author   Manufacturer
Margin Walker   Fugazi   Dischord

• Querying is difficult
• Hard to add more types


                                                                             10
Complex
 objects


       11
Challenges
• Constant load from client
• Far in excess of single server capacity
• Can never take the system down

[Diagram: many concurrent queries arriving at a single database]




                                                                       12
Challenges
• Adding more storage over time
• Aging out data that’s no longer needed
• Minimizing resource overhead of “cold” data

            Recent Data                      Old Data



            Fast Storage                  Archival Storage




                           Add Capacity

                                                             13
14
15
Variably typed data     • Rigid schemas

Complex objects         • Normalization can be hard
                        • Dependent on joins

High transaction rate   • Vertical scaling
                        • Poor data locality

High availability       • Difficult to maintain consistency & HA
                        • HA a bolt-on to many RDBMS

Agile development       • Schema changes
                        • Monolithic data model

                                                               16
A new data model




                   17
> var post = { author: "Jared",
               date: new Date(),
               text: "NoSQL Now 2012",
               tags: ["NoSQL", "MongoDB"] }

> db.posts.save(post)




                                         18
>db.posts.find()

 { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
   author : "Jared",
   date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
   text : "NoSQL Now 2012",
   tags : [ "NoSQL", "MongoDB" ] }

Notes:
 - _id is unique, but can be anything you’d like


                                                       19
Create index on any Field in Document

 // 1 means ascending, -1 means descending

 > db.posts.ensureIndex({author: 1})

 > db.posts.find({author: 'Jared'})

 { _id    : ObjectId("4c4ba5c0672c685e5e8aabf3"),
   author : "Jared",
   ... }
                                                    20
• Conditional Operators
   – $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type
   – $lt, $lte, $gt, $gte

  // find posts with any tags
  > db.posts.find( {tags: {$exists: true }} )

  // find posts matching a regular expression (author starts with "Jar", case-insensitive)
  > db.posts.find( {author: /^Jar/i } )

  // count posts by author
  > db.posts.find( {author: 'Jared'} ).count()
                                                                    21
• $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit

> comment = { author: "Brendan",
              date: new Date(),
              text: "I want a freakin pony" }

// the update document must be an object wrapping the $push operator
> db.posts.update( { _id: "..." },
                   { $push: { comments: comment } } );




                                                               22
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Jared",
  date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
  text : "NoSQL Now 2012",
  tags : [ "NoSQL", "MongoDB" ],
  comments : [
    {
        author : "Brendan",
        date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
        text : "I want a freakin pony"
    }
  ]}
                                                            23
// Index nested documents
> db.posts.ensureIndex( { "comments.author": 1 } )
> db.posts.find( { "comments.author": "Brendan" } )

// Index on tags
> db.posts.ensureIndex( { tags: 1 } )
> db.posts.find( { tags: "MongoDB" } )

// geospatial index
> db.posts.ensureIndex( { "author.location": "2d" } )
> db.posts.find( { "author.location": { $near : [22,42] } } )
                                                             24
db.posts.aggregate(
  { $project : {
      author : 1,
      tags : 1
  } },
  { $unwind : "$tags" },
  // group by the unwound tag value and collect the distinct authors per tag
  { $group : {
      _id : { tags : "$tags" },
      authors : {
          $addToSet : "$author"
  }}}
);
                                  25
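The pipeline above can be read as three plain steps: keep two fields, emit one row per tag, then collect the distinct authors for each tag. The sketch below mirrors that logic in ordinary JavaScript so it can be run without a server. The sample `posts` array and the `authorsByTag` name are made up for illustration and are not part of MongoDB.

```javascript
// Made-up sample data standing in for the posts collection.
const posts = [
  { author: "Jared", text: "NoSQL Now 2012", tags: ["NoSQL", "MongoDB"] },
  { author: "Brendan", text: "Ponies", tags: ["MongoDB"] },
  { author: "Jared", text: "Sharding", tags: ["MongoDB"] },
];

// Hypothetical in-memory equivalent of $project + $unwind + $group/$addToSet.
function authorsByTag(docs) {
  const groups = new Map(); // tag -> Set of distinct authors
  for (const { author, tags = [] } of docs) { // $project: only author and tags
    for (const tag of tags) {                 // $unwind: one row per tag
      if (!groups.has(tag)) groups.set(tag, new Set());
      groups.get(tag).add(author);            // $addToSet: duplicates ignored
    }
  }
  return [...groups].map(([tag, authors]) => ({
    _id: { tags: tag },
    authors: [...authors],
  }));
}

const result = authorsByTag(posts);
console.log(JSON.stringify(result, null, 2));
```

The server-side pipeline does the same work inside the database, so only the grouped result crosses the network.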
It’s highly available




                        26
Write
                  Primary
         Read
                             Asynchronous
Driver




                 Secondary   Replication
         Read


                 Secondary
         Read



                                            27
Primary
Driver




                Secondary
         Read


                Secondary
         Read



                            28
Primary

         Write               Automatic
Driver




                  Primary    Leader Election
         Read

                 Secondary
         Read



                                               29
Secondary
         Read

         Write
Driver




                  Primary
         Read

                 Secondary
         Read



                             30
With tunable
consistency




               31
Write
Driver



                  Primary
         Read

                 Secondary


                 Secondary




                             32
Write
                  Primary
Driver




                 Secondary
         Read


                 Secondary
         Read



                             33
Durability




             34
•   Fire and forget
•   Wait for error
•   Wait for fsync
•   Wait for journal sync
•   Wait for replication




                            35
Driver to Primary:   write
Primary:             apply in memory




                                             36
Driver to Primary:   write, then getLastError
Primary:             apply in memory




                                            37
Driver to Primary:   write, then getLastError with j:true
Primary:             apply in memory
                     write to journal




                                             38
Driver to Primary:      write, then getLastError with w:2
Primary:                apply in memory
Primary to Secondary:   replicate




                                                        39
Value          Meaning
<n:integer>    Replicate to N members of replica set
"majority"     Replicate to a majority of replica set members
<m:modeName>   Use custom error mode name




                                                        40
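As a rough illustration of the first two rows of the table, the sketch below checks whether a write has been acknowledged by enough members. It is a simplified model, not driver code: `isSatisfied`, `acks` and `setSize` are hypothetical names, and custom mode names are left out.

```javascript
// Hypothetical helper: has the write reached enough members for this w value?
//   w       - an integer, or the string "majority"
//   acks    - members (primary included) that have applied the write
//   setSize - total members in the replica set (assumed to be all data-bearing)
function isSatisfied(w, acks, setSize) {
  const needed = w === "majority" ? Math.floor(setSize / 2) + 1 : w;
  return acks >= needed;
}

console.log(isSatisfied(2, 1, 3));          // only the primary has the write
console.log(isSatisfied(2, 2, 3));          // primary plus one secondary
console.log(isSatisfied("majority", 2, 5)); // 2 of 5 is not a majority
```

A larger `w` trades write latency for a stronger guarantee that the write survives a failover.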
{ _id: "someSet",
   members: [
     { _id:0, host:"A", tags: { dc: "ny"}},
     { _id:1, host:"B", tags: { dc: "ny"}},
     { _id:2, host:"C", tags: { dc: "sf"}},
     { _id:3, host:"D", tags: { dc: "sf"}},
     { _id:4, host:"E", tags: { dc: "cloud"}}
   ],
   settings: {
       // These are the modes you can use in write concern
       getLastErrorModes: {
           veryImportant: { dc: 3 },
           sortOfImportant: { dc: 2 }
       }
   }
 }




                                                              41
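A mode such as `{ dc: 2 }` is met once the write has reached members carrying two distinct values of the `dc` tag. The sketch below models that rule in plain JavaScript using the configuration above. `modeSatisfied` and the list of acknowledging hosts are hypothetical and only illustrate the counting.

```javascript
// Members and modes copied from the replica set configuration above.
const members = [
  { host: "A", tags: { dc: "ny" } },
  { host: "B", tags: { dc: "ny" } },
  { host: "C", tags: { dc: "sf" } },
  { host: "D", tags: { dc: "sf" } },
  { host: "E", tags: { dc: "cloud" } },
];
const modes = {
  veryImportant: { dc: 3 },
  sortOfImportant: { dc: 2 },
};

// Hypothetical check: count the distinct tag values among the hosts that
// acknowledged the write and compare against the number the mode requires.
function modeSatisfied(mode, ackedHosts) {
  return Object.entries(mode).every(([tag, required]) => {
    const values = new Set(
      members
        .filter((m) => ackedHosts.includes(m.host) && m.tags[tag] !== undefined)
        .map((m) => m.tags[tag])
    );
    return values.size >= required;
  });
}

console.log(modeSatisfied(modes.sortOfImportant, ["A", "B"]));    // both in ny
console.log(modeSatisfied(modes.sortOfImportant, ["A", "C"]));    // ny and sf
console.log(modeSatisfied(modes.veryImportant, ["A", "C", "E"])); // ny, sf, cloud
```

Note that two acknowledgements from the same data center count once, which is what makes these modes useful for cross-site durability.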
• Priority is between 0 and 1000
• Highest-priority member that is up to date wins
   – Up to date == within 10 seconds of primary
• If a higher-priority member catches up, it will force an election and win




            Primary         Secondary       Secondary
             priority = 3    priority = 2     priority = 1




                                                                   42
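The priority rules above can be sketched as a small selection function. This is a simplified model under stated assumptions, not the real election protocol: it ignores voting and majorities, measures lag against the newest reachable member rather than the old primary, and the `up` and `optime` fields (optime in seconds) are hypothetical.

```javascript
const MAX_LAG_SECONDS = 10; // the "up to date" window from the slide

// Hypothetical model: pick the highest-priority reachable member that is
// within MAX_LAG_SECONDS of the newest reachable member.
// Members with priority 0 are never eligible to become primary.
function pickPrimary(members) {
  const reachable = members.filter((m) => m.up);
  if (reachable.length === 0) return null;
  const newest = Math.max(...reachable.map((m) => m.optime));
  const eligible = reachable.filter(
    (m) => m.priority > 0 && newest - m.optime <= MAX_LAG_SECONDS
  );
  if (eligible.length === 0) return null;
  return eligible.reduce((best, m) => (m.priority > best.priority ? m : best));
}

// The old primary (A) is down and B is 15 seconds behind, so C is chosen.
const members = [
  { host: "A", priority: 3, optime: 100, up: false },
  { host: "B", priority: 2, optime: 85, up: true },
  { host: "C", priority: 1, optime: 100, up: true },
];
const winner = pickPrimary(members);
console.log(winner.host);
```

Once B catches up to within the window, running the same function again would select B, which matches the "force an election and win" bullet.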
• Lags behind master by configurable time delay
• Automatically hidden from clients
• Protects against operator errors
   – Accidentally delete database
   – Application corrupts data




                                                  43
• Vote in elections
• Don’t store a copy of data
• Use as tie breaker




                               44
Data Center




Primary




              45
Data Center


 Zone 1      Zone 2         Zone 3



Primary   Secondary      Secondary




                                     46
Data Center


 Zone 1      Zone 2          Zone 3



Primary   Secondary      Secondary
                         hidden = true



                           backups




                                         47
Active Data Center          Standby Data Center


  Zone 1                 Zone 2



Primary            Secondary        Secondary
priority = 1         priority = 1    priority = 0




                                                          48
West Coast DC    Central DC     East Coast DC
                   Zone 1           Zone 1


                 Primary        Secondary
                 priority = 2     priority = 1

                   Zone 2           Zone 2
  Arbiter
                Secondary       Secondary
                 priority = 2    priority = 1




                                                 49
Sharding




           50
client   client   client   client

                                             config
         mongos            mongos
                                             config
                                             config

                                             Config
                                             Servers
mongod            mongod            mongod
mongod            mongod            mongod
mongod            mongod            mongod

Shard             Shard             Shard



                                                       51
> db.runCommand( { shardcollection: "test.users",
                        key: { email: 1 }} )



{
    name: "Jared",
    email: "jsr@10gen.com",
}
{
    name: "Scott",
    email: "scott@10gen.com",
}
{
    name: "Dan",
    email: "dan@10gen.com",
}


                                                         52
-∞   +∞




         53
-∞                                              +∞


     dan@10gen.com            scott@10gen.com

              jsr@10gen.com




                                                    54
Split!



-∞                                              +∞


     dan@10gen.com            scott@10gen.com

              jsr@10gen.com




                                                    55
This is a chunk          Split!          This is a chunk


-∞                                                     +∞


            dan@10gen.com            scott@10gen.com

                     jsr@10gen.com




                                                           56
-∞                                              +∞


     dan@10gen.com            scott@10gen.com

              jsr@10gen.com




                                                    57
-∞                                              +∞


     dan@10gen.com            scott@10gen.com

              jsr@10gen.com




                                                    58
Split!



-∞                                              +∞


     dan@10gen.com            scott@10gen.com

              jsr@10gen.com




                                                    59
Min Key               Max Key              Shard
-∞                    dan@10gen.com        1
dan@10gen.com         jsr@10gen.com        1
jsr@10gen.com         scott@10gen.com      1
scott@10gen.com       +∞                   1


     • Stored in the config servers
     • Cached in MongoS
     • Used to route requests and keep cluster balanced




                                                          60
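Routing by shard key amounts to finding the chunk whose range contains the key. The sketch below models the chunk table above in plain JavaScript, treating each range as inclusive of its min key and exclusive of its max key. `MIN_KEY`, `MAX_KEY` and `findChunk` are hypothetical names for illustration, not the mongos implementation.

```javascript
// Sentinels standing in for -∞ and +∞ in the chunk table.
const MIN_KEY = Symbol("-inf");
const MAX_KEY = Symbol("+inf");

// Chunk table copied from the slide (every chunk still lives on shard 1).
const chunks = [
  { min: MIN_KEY, max: "dan@10gen.com", shard: 1 },
  { min: "dan@10gen.com", max: "jsr@10gen.com", shard: 1 },
  { min: "jsr@10gen.com", max: "scott@10gen.com", shard: 1 },
  { min: "scott@10gen.com", max: MAX_KEY, shard: 1 },
];

// Hypothetical lookup: return the chunk with min <= key < max.
function findChunk(table, key) {
  return table.find(
    (c) =>
      (c.min === MIN_KEY || c.min <= key) &&
      (c.max === MAX_KEY || key < c.max)
  );
}

const chunk = findChunk(chunks, "jsr@10gen.com");
console.log(chunk.min, chunk.max, "on shard", chunk.shard);
```

A query that does not include the shard key gives the router nothing to look up, which is why it has to be sent to every shard.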
mongos
                                                                                  config
                                      balancer
                                                                                  config
Chunks!
                                                                                  config




  1    2    3    4    13    14   15   16         25    26   27   28   37    38   39   40

  5    6    7    8    17    18   19   20         29    30   31   32   41    42   43   44

  9    10   11   12   21    22   23   24         33    34   35   36   45    46   47   48


      Shard 1              Shard 2                    Shard 3              Shard 4



                                                                                           61
mongos
                                                                                  config
                                      balancer
                                                                                  config

                    Imbalance
                     Imbalance                                                    config




1    2    3    4

5    6    7    8

9    10   11   12     21    22   23   24         33    34   35   36   45    46   47   48


    Shard 1                Shard 2                    Shard 3              Shard 4



                                                                                           62
mongos
                                                                                 config
                                     balancer
                                                                                 config

                               Move chunk 1                                      config
                               to Shard 2




1    2    3    4

5    6    7    8

9    10   11   12   21    22    23   24         33    34   35   36   45    46   47   48


    Shard 1              Shard 2                     Shard 3              Shard 4



                                                                                          63
mongos
                                                                                config
                                    balancer
                                                                                config

                                                                                config




1    2    3    4

5    6    7    8

9    10   11   12   21    22   23   24         33    34   35   36   45    46   47   48


    Shard 1              Shard 2                    Shard 3              Shard 4



                                                                                         64
mongos
                                                                                config
                                    balancer
                                                                                config
                                         Chunk 1 now lives
                                            on Shard 2                          config




     2    3    4

5    6    7    8    1

9    10   11   12   21    22   23   24         33    34   35   36   45    46   47   48


    Shard 1              Shard 2                    Shard 3              Shard 4



                                                                                         65
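The balancing round shown in these slides can be sketched as: find the most and least loaded shards, and move one chunk if the gap is large enough. The code below is a toy model in plain JavaScript. The `threshold` value and the `balanceOnce` name are assumptions for illustration and do not reflect the real balancer's settings.

```javascript
// Chunk placement from the "Imbalance" slide: Shard 1 holds 12 chunks,
// the other shards hold 4 each.
const shards = {
  "Shard 1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
  "Shard 2": [21, 22, 23, 24],
  "Shard 3": [33, 34, 35, 36],
  "Shard 4": [45, 46, 47, 48],
};

// Hypothetical single balancing round: move one chunk from the fullest
// shard to the emptiest one when the difference reaches `threshold`.
function balanceOnce(placement, threshold = 2) {
  const names = Object.keys(placement);
  const most = names.reduce((a, b) =>
    placement[b].length > placement[a].length ? b : a
  );
  const least = names.reduce((a, b) =>
    placement[b].length < placement[a].length ? b : a
  );
  if (placement[most].length - placement[least].length < threshold) return null;
  const chunk = placement[most].shift(); // take the first chunk
  placement[least].push(chunk);
  return { chunk, from: most, to: least };
}

const move = balanceOnce(shards);
console.log(move);
```

Repeating the round until it returns `null` would spread the chunks evenly across the four shards.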
By shard key              Routed                   db.users.find({email: "jsr@10gen.com"})
Sorted by shard key       Routed in order          db.users.find().sort({email: -1})
Find by non shard key     Scatter Gather           db.users.find({state: "CA"})
Sorted by non shard key   Distributed merge sort   db.users.find().sort({state: 1})



                                                                   66
Inserts   Requires shard key   db.users.insert({
                                 name: "Jared",
                                 email: "jsr@10gen.com"})

Removes   Routed               db.users.remove({
                                 email: "jsr@10gen.com"})
          Scattered            db.users.remove({name: "Jared"})

Updates   Routed               db.users.update(
                                 {email: "jsr@10gen.com"},
                                 {$set: { state: "CA"}})
          Scattered            db.users.update(
                                 {state: "FZ"},
                                 {$set:{ state: "CA"}}, false, true )




                                                              67
1. Query arrives at MongoS
2. MongoS routes query to a single shard
3. Shard returns results of query
4. Results returned to client

[Diagram: client, mongos, and Shard 1, Shard 2, Shard 3, with the query routed to one shard]




                                                                  68
1. Query arrives at MongoS
2. MongoS broadcasts query to all shards
3. Each shard returns results for query
4. Results combined and returned to client

[Diagram: client, mongos, and Shard 1, Shard 2, Shard 3, with the query sent to every shard]




                                                                         69
1. Query arrives at MongoS
2. MongoS broadcasts query to all shards
3. Each shard locally sorts results
4. Results returned to mongos
5. MongoS merge sorts individual results
6. Combined sorted result returned to client

[Diagram: client, mongos, and Shard 1, Shard 2, Shard 3, with sorted results merged at mongos]




                                                                            70
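Step 5 is a merge of several already-sorted lists. The sketch below shows one way to do that in plain JavaScript, assuming each shard has returned its results sorted by `state`. The sample data and the `mergeSorted` name are made up for illustration and are not the mongos implementation.

```javascript
// Hypothetical k-way merge: repeatedly take the smallest head element
// among the per-shard result lists, each of which is already sorted.
function mergeSorted(shardResults, compare) {
  const positions = shardResults.map(() => 0); // next unread index per shard
  const merged = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < shardResults.length; i++) {
      if (positions[i] >= shardResults[i].length) continue; // shard exhausted
      if (
        best === -1 ||
        compare(shardResults[i][positions[i]], shardResults[best][positions[best]]) < 0
      ) {
        best = i;
      }
    }
    if (best === -1) return merged; // every shard exhausted
    merged.push(shardResults[best][positions[best]++]);
  }
}

const byState = (a, b) => (a.state < b.state ? -1 : a.state > b.state ? 1 : 0);

// Made-up results, each list locally sorted by state (step 3).
const fromShards = [
  [{ name: "Jared", state: "CA" }, { name: "Dan", state: "NY" }],
  [{ name: "Ann", state: "AZ" }, { name: "Scott", state: "TX" }],
  [{ name: "Bo", state: "FL" }],
];

const sorted = mergeSorted(fromShards, byState);
console.log(sorted.map((u) => u.state).join(" "));
```

Because each shard sorts its own slice, the router only compares the head of each list instead of re-sorting the whole result.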
Use cases




            71
Content Management       Operational Intelligence      Meta Data Management

            User Data Management         High Volume Data Feeds




                                                                          72
Machine Generated Data   • More machines, more sensors, more data
                         • Variably structured

Stock Market Data        • High frequency trading

Social Media Firehose    • Multiple sources of data
                         • Each changes their format constantly

                                                73
• Flexible document model can adapt to changes in sensor format
• Asynchronous writes
• Write to memory with periodic disk flush
• Scale writes over multiple shards

[Diagram: multiple data sources writing into a sharded cluster]



                                                     74
Ad Targeting              • Large volume of state about users
                          • Very strict latency requirements

Real time dashboards      • Expose report data to millions of customers
                          • Report on large volumes of data
                          • Reports that update in real time

Social Media Monitoring   • What are people talking about?
                                                               75
• Low latency reads
• Parallelize queries across replicas and shards
• In database aggregation
• Flexible schema adapts to changing input data
• Can use same cluster to collect, store, and report on data

[Diagram: API and Dashboards reading from the cluster]
                                                          76
Intuit relies on a MongoDB-powered real-time analytics tool for small businesses to
derive interesting and actionable patterns from their customers’ website traffic

Problem
• Intuit hosts more than 500,000 websites
• Wanted to collect and analyze data to recommend conversion and lead generation improvements to customers
• With 10 years' worth of user data, it took several days to process the information using a relational database

Impact
• In one week Intuit was able to become proficient in MongoDB development
• Developed application features more quickly for MongoDB than for relational databases
• MongoDB was 2.5 times faster than MySQL

   We did a prototype for one week, and within one week we had made big progress. Very big progress. It
   was so amazing that we decided, “Let’s go with this.” -Nirmala Ranganathan, Intuit

                                                                                                          77
1   See Ad
2   See Ad
3   Click
4   Convert

• Rich profiles collecting multiple complex actions
• Scale out to support high throughput of activities tracked
• Dynamic schemas make it easy to track vendor specific attributes
• Indexing and querying to support matching, frequency capping

{ cookie_id: "1234512413243",
  advertiser: {
    apple: {
      actions: [
        { impression: 'ad1', time: 123 },
        { impression: 'ad2', time: 232 },
        { click: 'ad2', time: 235 },
        { add_to_cart: 'laptop',
          sku: 'asdf23f',
          time: 254 },
        { purchase: 'laptop', time: 354 }
      ]
    }
  }
}
                                                                                    78
Data Archiving          • Meta data about artifacts
                        • Content in the library

Information discovery   • Have data sources that you don’t have access to
                        • Store meta-data about those sources and figure out which ones have the content

Biometrics              • Retina scans
                        • Finger prints


                                                              79
• Indexing and rich query API for easy searching and sorting
• Flexible data model for similar, but different objects

    db.archives.find({ "country": "Egypt" });

{ type: "Artefact",        { ISBN: "00e8da9b",
  medium: "Ceramic",         type: "Book",
  country: "Egypt",          country: "Egypt",
  year: "3000 BC"            title: "Ancient Egypt"
}                          }




                                                                        80
Shutterfly uses MongoDB to safeguard more than six billion images for millions of
customers in the form of photos and videos, and turn everyday pictures into keepsakes

Problem
• Managing 20TB of data (six billion images for millions of customers) partitioning by function
• Home-grown key value store on top of their Oracle database offered sub-par performance
• Codebase for this hybrid store became hard to manage
• High licensing, HW costs

Why MongoDB
• JSON-based data structure
• Provided Shutterfly with an agile, high performance, scalable solution at a low cost
• Works seamlessly with Shutterfly’s services-based architecture

Impact
• 500% cost reduction and 900% performance improvement compared to previous Oracle implementation
• Accelerated time-to-market for nearly a dozen projects on MongoDB
• Improved performance by reducing average latency for inserts from 400ms to 2ms

   The “really killer reason” for using MongoDB is its rich JSON-based data structure, which offers Shutterfly
   an agile approach to develop software. With MongoDB, the Shutterfly team can quickly develop and
   deploy new applications, especially Web 2.0 and social features. -Kenny Gorman, Director of Data Services
                                                                                                                 81
News Site                • Comments and user generated content
                         • Personalization of content, layout

Multi-Device rendering   • Generate layout on the fly for each device that connects
                         • No need to cache static pages

Sharing                  • Store large objects
                         • Simple modeling of metadata

                                                      82
• Flexible data model for similar, but different objects
• Geo spatial indexing for location based searches
• GridFS for large object storage
• Horizontal scalability for large data sets

{ camera: "Nikon d4",
  location: [ -122.418333, 37.775 ]
}

{ camera: "Canon 5d mkII",
  people: [ "Jim", "Carol" ],
  taken_on: ISODate("2012-03-07T18:32:35.002Z")
}

{ origin: "facebook.com/photos/xwdf23fsdf",
  license: "Creative Commons CC0",
  size: {
    dimensions: [ 124, 52 ],
    units: "pixels"
  }
}
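
The three photo documents on this slide have different fields yet can sit side by side in one collection. As a minimal sketch of what that means for queries, the snippet below keeps them in a plain in-memory array (no MongoDB server or driver is involved) and filters them by a field that only some documents carry. The names `photos`, `haversineKm` and `photosNear` are hypothetical, and the linear distance scan only imitates the result of a location-based search; it does not show how a geospatial index works internally.

```javascript
// In-memory stand-in for a photo collection; no MongoDB server or driver is used.
const photos = [
  { camera: "Nikon d4", location: [-122.418333, 37.775] },
  { camera: "Canon 5d mkII", people: ["Jim", "Carol"],
    taken_on: new Date("2012-03-07T18:32:35.002Z") },
  { origin: "facebook.com/photos/xwdf23fsdf", license: "Creative Commons CC0",
    size: { dimensions: [124, 52], units: "pixels" } },
];

// Great-circle distance in kilometres between two [longitude, latitude] pairs.
function haversineKm([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * 6371 * Math.asin(Math.sqrt(a));
}

// Linear scan that mimics a location-based search over documents that have a location.
function photosNear(docs, center, maxKm) {
  return docs.filter(
    (doc) => Array.isArray(doc.location) && haversineKm(doc.location, center) <= maxKm
  );
}

// Documents that lack the queried field are simply skipped by the filter.
const taggedWithJim = photos.filter((doc) => (doc.people || []).includes("Jim"));
const nearSanFrancisco = photosNear(photos, [-122.42, 37.77], 5);

console.log(taggedWithJim.length, nearSanFrancisco.length);
```

Neither filter needs a schema change or a placeholder value for the documents that do not have `people` or `location`, which is the point the slide makes about similar but different objects.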



                                                                                                    83
Wordnik uses MongoDB as the foundation for its “live” dictionary that stores its entire text corpus: 3.5TB of data in 20 billion records

Problem
• Analyze a staggering amount of data for a system built on a continuous stream of high-quality text pulled from online sources
• Adding too much data too quickly resulted in outages; tables locked for tens of seconds during inserts
• Initially launched entirely on MySQL but quickly hit performance roadblocks

Why MongoDB
• Migrated 5 billion records in a single day with zero downtime
• MongoDB powers every website request: 20m API calls per day
• Ability to eliminate the memcached layer, creating a simplified system that required fewer resources and was less prone to error

Impact
• Reduced code by 75% compared to MySQL
• Fetch time cut from 400ms to 60ms
• Sustained insert speed of 8k words per second, with frequent bursts of up to 50k per second
• Significant cost savings and 15% reduction in servers



   Life with MongoDB has been good for Wordnik. Our code is faster, more flexible and dramatically smaller.
   Since we don’t spend time worrying about the database, we can spend more time writing code for our
   application. -Tony Tam, Vice President of Engineering and Technical Co-founder
                                                                                                              84
Social Graphs
• Scale out to large graphs
• Easy to search and process

Identity Management
• Authentication, Authorization and Accounting

                                           85
Social Graph
• Native support for Arrays makes it easy to store connections inside user profile
• Sharding partitions user profiles across available servers
• Documents enable disk locality of all profile data for a user
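
The social graph points on this slide can be illustrated with a small self-contained sketch: each user profile holds its connections in an array, and whole profiles are spread across shards by their `_id`. Everything here is an assumption made for illustration: the profile fields, the `toyHash` function (which is not the hashing MongoDB uses) and the `partition` and `addFriend` helpers. No MongoDB driver is involved.

```javascript
// Hypothetical user profiles: connections live in an array inside each document.
const profiles = [
  { _id: "alice", name: "Alice", friends: ["bob", "carol"] },
  { _id: "bob", name: "Bob", friends: ["alice"] },
  { _id: "carol", name: "Carol", friends: ["alice"] },
];

// Toy string hash (not MongoDB's hashing); only used to pick a shard deterministically.
function toyHash(key) {
  let h = 0;
  for (const ch of key) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0;
  }
  return h;
}

// Partition whole profiles across shards by their _id.
function partition(docs, shardCount) {
  const shards = Array.from({ length: shardCount }, () => []);
  for (const doc of docs) {
    shards[toyHash(doc._id) % shardCount].push(doc);
  }
  return shards;
}

// Adding a connection touches a single document on a single shard.
function addFriend(shards, userId, friendId) {
  const shard = shards[toyHash(userId) % shards.length];
  const user = shard.find((doc) => doc._id === userId);
  if (user && !user.friends.includes(friendId)) {
    user.friends.push(friendId);
  }
  return user;
}

const shards = partition(profiles, 2);
const updated = addFriend(shards, "bob", "carol");
console.log(shards.map((s) => s.map((doc) => doc._id)), updated.friends);
```

Because a profile and its connection array travel together, reading or updating one user's connections only involves the shard that owns that profile.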
                                                                            86
Review




         87
• Variety, Velocity and Volume make it difficult
• Documents are easier for many use cases




                                                   88
• Distributed by default
• High availability
• Cloud deployment




                           89
•   Document oriented data model
•   Highly available deployments
•   Strong consistency model
•   Horizontally scalable architecture




                                         90

Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 

Kürzlich hochgeladen (20)

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 

MongoDB at NoSQL Now! 2012: Benefits and Challenges of Using MongoDB in the Enterprise

  • 1. Open source, high performance database. Jared Rosoff, @forjared
  • 2. Challenges: variably typed data; complex objects; high transaction rates; large data size; high availability. Opportunities: agility; cost reduction; simplification.
  • 3. Agile development: iterative and continuous; new and emerging apps. Volume and type of data: trillions of records; tens of millions of queries per second; volume of data; semi-structured and unstructured data. New architectures: systems scaling horizontally, not vertically; commodity servers; cloud computing.
  • 4. Timeline: project start, launch, +30 days, +90 days, +6 months, +1 year. Steps along the way: denormalize data model; stop using joins; custom caching layer; custom sharding. Developer productivity decreases: needed to add new software layers of ORM, caching, sharding and message queue; polymorphic, semi-structured and unstructured data not well supported. Cost of database increases: increased database licensing cost; vertical, not horizontal, scaling; high cost of SAN.
  • 5. How we got here
  • 6. Variably typed data • Complex data objects • High transaction rates • Large data size • High availability • Agile development
  • 7. Metadata management • EAV anti-pattern • Sparse table anti-pattern
  • 8. Entity-attribute-value table: difficult to query, difficult to index, expensive to query.
        Entity | Attribute    | Value
        1      | Type         | Book
        1      | Title        | Inside Intel
        1      | Author       | Andy Grove
        2      | Type         | Laptop
        2      | Manufacturer | Dell
        2      | Model        | Inspiron
        2      | RAM          | 8gb
        3      | Type         | Cereal
  • 9. Sparse table: constant schema changes, space inefficient, overloaded fields.
        Type   | Title         | Author     | Manufacturer | Model    | RAM | Screen Width
        Book   | Inside Intel  | Andy Grove |              |          |     |
        Laptop |               |            | Dell         | Inspiron | 8gb | 12”
        TV     |               |            | Panasonic    | Viera    |     | 52”
        MP3    | Margin Walker | Fugazi     | Dischord     |          |     |
  • 10. Separate tables per type: querying is difficult, hard to add more types.
        Type   | Manufacturer | Model    | RAM | Screen Width
        Laptop | Dell         | Inspiron | 8gb | 12”

        Manufacturer | Model | Screen Width
        Panasonic    | Viera | 52”

        Title         | Author | Manufacturer
        Margin Walker | Fugazi | Dischord
  • 12. Challenges • Constant load from client • Far in excess of single server capacity • Can never take the system down. Diagram: many queries converging on one database
  • 13. Challenges • Adding more storage over time • Aging out data that’s no longer needed • Minimizing resource overhead of “cold” data. Diagram: recent data on fast storage; old data on archival storage; add capacity
  • 14. 14
  • 15. 15
  • 16. Variably typed data: rigid schemas. Complex objects: normalization can be hard; dependent on joins. High transaction rate: vertical scaling; poor data locality. High availability: difficult to maintain consistency & HA; HA a bolt-on to many RDBMS. Agile development: schema changes; monolithic data model.
  • 17. A new data model
  • 18. var post = { author: "Jared", date: new Date(), text: "NoSQL Now 2012", tags: ["NoSQL", "MongoDB"] }
        > db.posts.save(post)
  • 19. > db.posts.find()
        { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "Jared", date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)", text : "NoSQL Now 2012", tags : [ "NoSQL", "MongoDB" ] }
        Notes: _id is unique, but can be anything you’d like
  • 20. Create index on any field in document
        // 1 means ascending, -1 means descending
        > db.posts.ensureIndex({author: 1})
        > db.posts.find({author: 'Jared'})
        { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "Jared", ... }
  • 21. Conditional operators – $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type – $lt, $lte, $gt, $gte
        // find posts with any tags
        > db.posts.find( {tags: {$exists: true }} )
        // find posts matching a regular expression
        > db.posts.find( {author: /^Jar*/i } )
        // count posts by author
        > db.posts.find( {author: 'Jared'} ).count()
  • 22. Update operators: $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit
        > comment = { author: "Brendan", date: new Date(), text: "I want a freakin pony" }
        > db.posts.update( { _id: "..." }, { $push: { comments: comment } } );
  • 23. { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
          author : "Jared",
          date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
          text : "NoSQL Now 2012",
          tags : [ "NoSQL", "MongoDB" ],
          comments : [
            { author : "Brendan",
              date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
              text : "I want a freakin pony" }
          ] }
  • 24. // Index nested documents
        > db.posts.ensureIndex( { "comments.author": 1 } )
        > db.posts.find( { 'comments.author': 'Brendan' } )
        // Index on tags
        > db.posts.ensureIndex( { tags: 1 } )
        > db.posts.find( { tags: 'MongoDB' } )
        // geospatial index
        > db.posts.ensureIndex( { "author.location": "2d" } )
        > db.posts.find( { "author.location" : { $near : [22,42] } } )
  • 25. db.posts.aggregate(
          { $project : { author : 1, tags : 1 } },
          { $unwind : "$tags" },
          { $group : { _id : { tags : "$tags" }, authors : { $addToSet : "$author" } } }
        );
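To make the pipeline on slide 25 easier to follow, here is a minimal plain-JavaScript sketch of what the three stages compute: keep author and tags, emit one record per tag, then collect the distinct authors for each tag. It does not talk to MongoDB; the function name `authorsByTag` and the sample posts are hypothetical, and a real server does not guarantee the order of the groups.

```javascript
// Plain-JavaScript sketch of $project -> $unwind -> $group with $addToSet.
// authorsByTag and the sample posts are hypothetical, for illustration only.
function authorsByTag(posts) {
  const groups = new Map(); // tag -> Set of authors
  for (const post of posts) {
    // $project: keep only author and tags.
    const { author, tags = [] } = post;
    // $unwind: one record per tag; a post without tags yields none.
    for (const tag of tags) {
      if (!groups.has(tag)) groups.set(tag, new Set());
      // $addToSet: a Set ignores duplicate authors.
      groups.get(tag).add(author);
    }
  }
  // $group output: one document per distinct tag.
  return [...groups].map(([tag, authors]) => ({
    _id: { tags: tag },
    authors: [...authors],
  }));
}

const posts = [
  { author: "Jared", text: "NoSQL Now 2012", tags: ["NoSQL", "MongoDB"] },
  { author: "Brendan", text: "Ponies", tags: ["MongoDB"] },
  { author: "Jared", text: "Sharding", tags: ["MongoDB"] },
];

console.log(JSON.stringify(authorsByTag(posts), null, 2));
```

With these sample posts the "MongoDB" group lists Jared once, even though two of his posts carry that tag.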
  • 27. Diagram: Driver; Primary (write, read); asynchronous replication; Secondary (read); Secondary (read)
  • 28. Diagram: Driver; Primary; Secondary (read); Secondary (read)
  • 29. Diagram: automatic leader election; Driver; Primary; Primary (write, read); Secondary (read)
  • 30. Diagram: Driver; Secondary (read); Primary (write, read); Secondary (read)
  • 32. Diagram: Driver; Primary (write, read); Secondary; Secondary
  • 33. Diagram: Driver; Primary (write); Secondary (read); Secondary (read)
  • 35. Fire and forget • Wait for error • Wait for fsync • Wait for journal sync • Wait for replication
  • 36. Diagram: Driver sends write; Primary applies in memory
  • 37. Diagram: Driver sends write, then getLastError; Primary applies in memory
  • 38. Diagram: Driver sends write, then getLastError with j:true; Primary applies in memory and writes to journal
  • 39. Diagram: Driver sends write, then getLastError with w:2; Primary applies in memory and replicates to Secondary
  • 40. Value        | Meaning
        <n:integer>  | Replicate to N members of replica set
        "majority"   | Replicate to a majority of replica set members
        <m:modeName> | Use custom error mode name
  • 41. { _id: "someSet",
          members: [
            { _id: 0, host: "A", tags: { dc: "ny" } },
            { _id: 1, host: "B", tags: { dc: "ny" } },
            { _id: 2, host: "C", tags: { dc: "sf" } },
            { _id: 3, host: "D", tags: { dc: "sf" } },
            { _id: 4, host: "E", tags: { dc: "cloud" } }
          ],
          settings: {
            getLastErrorModes: {
              veryImportant: { dc: 3 },
              sortOfImportant: { dc: 2 }
            }
          }
        }
        These are the modes you can use in write concern.
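A small sketch may help in reading the custom modes on slide 41. In a tag-based mode such as `{ dc: 3 }`, the number counts distinct values of the `dc` tag among the members that have acknowledged the write, not the number of members. The helper `satisfiesMode` below is hypothetical and only models that rule in plain JavaScript; it is not a driver or server API.

```javascript
// Members and modes copied from the slide's configuration.
const members = [
  { host: "A", tags: { dc: "ny" } },
  { host: "B", tags: { dc: "ny" } },
  { host: "C", tags: { dc: "sf" } },
  { host: "D", tags: { dc: "sf" } },
  { host: "E", tags: { dc: "cloud" } },
];
const modes = {
  veryImportant: { dc: 3 },
  sortOfImportant: { dc: 2 },
};

// Hypothetical helper: true when the acknowledging hosts cover at least
// the required number of distinct values for every tag in the mode.
function satisfiesMode(mode, ackedHosts) {
  return Object.entries(mode).every(([tag, needed]) => {
    const values = new Set(
      members
        .filter((m) => ackedHosts.includes(m.host) && m.tags[tag] !== undefined)
        .map((m) => m.tags[tag])
    );
    return values.size >= needed;
  });
}

console.log(satisfiesMode(modes.sortOfImportant, ["A", "B"]));    // false: both hosts are in ny
console.log(satisfiesMode(modes.sortOfImportant, ["A", "C"]));    // true: ny and sf
console.log(satisfiesMode(modes.veryImportant, ["A", "C", "E"])); // true: ny, sf and cloud
```

Under this reading, `veryImportant` can only be met once the write has reached all three data centers, which is why host E alone in "cloud" matters.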
  • 42. Priority: between 0..1000 • Highest member that is up to date wins – Up to date == within 10 seconds of primary • If a higher priority member catches up, it will force election and win. Diagram: Primary (priority = 3); Secondary (priority = 2); Secondary (priority = 1)
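The priority rule on slide 42 can be sketched as a simple selection function. This is an illustration under stated assumptions, not the server's election protocol: it ignores voting and majorities, the member records (`up`, `lagSeconds`) and the function name `pickPrimary` are hypothetical, and the 10-second window is taken from the slide.

```javascript
// Hypothetical sketch: choose the highest-priority member that is up and
// within maxLagSeconds of the primary. Priority 0 members never qualify.
function pickPrimary(members, maxLagSeconds = 10) {
  const eligible = members.filter(
    (m) => m.up && m.priority > 0 && m.lagSeconds <= maxLagSeconds
  );
  if (eligible.length === 0) return null;
  return eligible.reduce((best, m) => (m.priority > best.priority ? m : best));
}

const replicaSet = [
  { host: "A", priority: 3, up: true, lagSeconds: 25 },
  { host: "B", priority: 2, up: true, lagSeconds: 2 },
  { host: "C", priority: 1, up: true, lagSeconds: 0 },
];

// A has the highest priority but lags too far behind, so B is chosen.
console.log(pickPrimary(replicaSet).host);

// Once A catches up, its higher priority wins.
replicaSet[0].lagSeconds = 1;
console.log(pickPrimary(replicaSet).host);
```

The second call mirrors the slide's last bullet: a higher-priority member that catches up takes over.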
  • 43. Lags behind master by configurable time delay • Automatically hidden from clients • Protects against operator errors – Accidentally delete database – Application corrupts data
  • 44. Vote in elections • Don’t store a copy of data • Use as tie breaker
  • 46. Diagram: one data center; Zone 1, Zone 2, Zone 3; Primary; Secondary; Secondary
  • 47. Diagram: one data center; Zone 1, Zone 2, Zone 3; Primary; Secondary; Secondary; one secondary has hidden = true and is used for backups
  • 48. Diagram: Active Data Center (Zone 1, Zone 2) and Standby Data Center; Primary (priority = 1); Secondary (priority = 1); Secondary (priority = 0)
  • 49. Diagram: West Coast DC, Central DC, East Coast DC; Primary (priority = 2); Secondary (priority = 2); Arbiter; Secondary (priority = 1); Secondary (priority = 1)
  • 50. Sharding
  • 51. Diagram: four clients; two mongos routers; three config servers; three shards, each made up of three mongod processes
  • 52. > db.runCommand( { shardcollection: "test.users", key: { email: 1 } } )
        { name: "Jared", email: "jsr@10gen.com" }
        { name: "Scott", email: "scott@10gen.com" }
        { name: "Dan", email: "dan@10gen.com" }
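  • 53. Diagram: key range from -∞ to +∞
  • 54. Diagram: key range from -∞ to +∞ holding dan@10gen.com, scott@10gen.com, jsr@10gen.com
  • 55. Diagram: Split! Key range from -∞ to +∞ holding dan@10gen.com, scott@10gen.com, jsr@10gen.com
  • 56. Diagram: Split! Each side of the split is a chunk. Key range from -∞ to +∞ holding dan@10gen.com, scott@10gen.com, jsr@10gen.com
  • 57. Diagram: key range from -∞ to +∞ holding dan@10gen.com, scott@10gen.com, jsr@10gen.com
  • 58. Diagram: key range from -∞ to +∞ holding dan@10gen.com, scott@10gen.com, jsr@10gen.com
  • 59. Diagram: Split! Key range from -∞ to +∞ holding dan@10gen.com, scott@10gen.com, jsr@10gen.com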
  • 60. Min Key         | Max Key         | Shard
        -∞              | dan@10gen.com   | 1
        dan@10gen.com   | jsr@10gen.com   | 1
        jsr@10gen.com   | scott@10gen.com | 1
        scott@10gen.com | +∞              | 1
        Stored in the config servers • Cached in MongoS • Used to route requests and keep cluster balanced
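The chunk table on slide 60 is what the router consults to decide where a key lives. Below is a minimal sketch of that lookup, assuming each chunk covers keys from its min key (inclusive) up to its max key (exclusive). The function name `findChunk` is hypothetical, `null` stands in for -∞ and +∞, and plain JavaScript string comparison stands in for the server's key ordering.

```javascript
// Chunk table from the slide; null represents -∞ (as min) or +∞ (as max).
const chunks = [
  { min: null, max: "dan@10gen.com", shard: 1 },
  { min: "dan@10gen.com", max: "jsr@10gen.com", shard: 1 },
  { min: "jsr@10gen.com", max: "scott@10gen.com", shard: 1 },
  { min: "scott@10gen.com", max: null, shard: 1 },
];

// Hypothetical helper: return the chunk whose range [min, max) contains key.
function findChunk(table, key) {
  return table.find(
    (c) => (c.min === null || key >= c.min) && (c.max === null || key < c.max)
  );
}

const hit = findChunk(chunks, "jsr@10gen.com");
console.log(hit.min, "->", hit.max, "on shard", hit.shard);
```

A key equal to a boundary, such as jsr@10gen.com here, falls in the chunk that starts at that boundary, so every key maps to exactly one chunk.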
  • 61. Diagram: mongos with balancer; three config servers. Chunks! Shard 1: chunks 1-12; Shard 2: chunks 13-24; Shard 3: chunks 25-36; Shard 4: chunks 37-48
  • 62. Diagram: mongos with balancer; three config servers. Imbalance. Shard 1: chunks 1-12; Shard 2: chunks 21-24; Shard 3: chunks 33-36; Shard 4: chunks 45-48
  • 63. Diagram: mongos with balancer; three config servers. Move chunk 1 to Shard 2. Shard 1: chunks 1-12; Shard 2: chunks 21-24; Shard 3: chunks 33-36; Shard 4: chunks 45-48
  • 64. Diagram: mongos with balancer; three config servers. Shard 1: chunks 1-12; Shard 2: chunks 21-24; Shard 3: chunks 33-36; Shard 4: chunks 45-48
  • 65. Diagram: mongos with balancer; three config servers. Chunk 1 now lives on Shard 2. Shard 1: chunks 2-12; Shard 2: chunks 1 and 21-24; Shard 3: chunks 33-36; Shard 4: chunks 45-48
  • 66. Query type              | Routing                | Example
        By shard key            | Routed                 | db.users.find({ email: "jsr@10gen.com" })
        Sorted by shard key     | Routed in order        | db.users.find().sort({ email: -1 })
        Find by non shard key   | Scatter gather         | db.users.find({ state: "CA" })
        Sorted by non shard key | Distributed merge sort | db.users.find().sort({ state: 1 })
  • 67. Operation | Routing            | Example
        Inserts   | Requires shard key | db.users.insert({ name: "Jared", email: "jsr@10gen.com" })
        Removes   | Routed             | db.users.remove({ email: "jsr@10gen.com" })
        Removes   | Scattered          | db.users.remove({ name: "Jared" })
        Updates   | Routed             | db.users.update({ email: "jsr@10gen.com" }, { $set: { state: "CA" } })
        Updates   | Scattered          | db.users.update({ state: "FZ" }, { $set: { state: "CA" } }, false, true)
  • 68. 1. Query arrives at MongoS 2. MongoS routes query to a single shard 3. Shard returns results of query 4. Results returned to client. Diagram: mongos; Shard 1, Shard 2, Shard 3
  • 69. 1. Query arrives at MongoS 2. MongoS broadcasts query to all shards 3. Each shard returns results for query 4. Results combined and returned to client. Diagram: mongos; Shard 1, Shard 2, Shard 3
  • 70. 1. Query arrives at MongoS 2. MongoS broadcasts query to all shards 3. Each shard locally sorts results 4. Results returned to mongos 5. MongoS merge sorts individual results 6. Combined sorted result returned to client. Diagram: mongos; Shard 1, Shard 2, Shard 3
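Step 5 of slide 70, where the router merge sorts the per-shard results, can be sketched as a k-way merge over lists that are each already sorted. This is a plain-JavaScript illustration rather than the router's actual code; `mergeSorted`, `byState`, and the sample users (including the invented state values and the user "Ann") are hypothetical.

```javascript
// k-way merge: repeatedly take the smallest head element among the
// per-shard result lists, each of which is already sorted.
function mergeSorted(shardResults, compare) {
  const positions = shardResults.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < shardResults.length; i++) {
      if (positions[i] >= shardResults[i].length) continue; // shard exhausted
      if (
        best === -1 ||
        compare(shardResults[i][positions[i]], shardResults[best][positions[best]]) < 0
      ) {
        best = i;
      }
    }
    if (best === -1) return merged; // every shard exhausted
    merged.push(shardResults[best][positions[best]++]);
  }
}

// Comparator matching sort({ state: 1 }).
const byState = (a, b) => (a.state < b.state ? -1 : a.state > b.state ? 1 : 0);

// Each inner array is one shard's locally sorted result (step 3).
const perShard = [
  [{ name: "Dan", state: "CA" }, { name: "Scott", state: "NY" }],
  [{ name: "Jared", state: "CA" }, { name: "Ann", state: "TX" }],
  [],
];

console.log(mergeSorted(perShard, byState).map((u) => u.state).join(","));
```

Because each shard sorts locally, the router only ever compares the current head of each list, so it never needs to re-sort the combined result.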
  • 71. Use cases
  • 72. Content Management • Operational Intelligence • Meta Data Management • User Data Management • High Volume Data Feeds
  • 73. Machine generated data: more machines, more sensors, more data; variably structured. Stock market data: high frequency trading. Social media firehose: multiple sources of data; each changes their format constantly.
  • 74. Flexible document model can adapt to changes in sensor format • Asynchronous writes • Write to memory with periodic disk flush • Scale writes over multiple shards. Diagram: multiple data sources
  • 75. Ad targeting: large volume of state about users; very strict latency requirements. Real time dashboards: expose report data to millions of customers; report on large volumes of data; reports that update in real time. Social media monitoring: what are people talking about?
  • 76. Parallelize queries across replicas and shards • Low latency reads • In database aggregation • Flexible schema adapts to changing input data • Can use same cluster to collect, store, and report on data. Diagram: API; Dashboards
  • 77. Intuit relies on a MongoDB-powered real-time analytics tool for small businesses to derive interesting and actionable patterns from their customers’ website traffic.
        Problem / Why MongoDB (the slide lists the same points under both headings): Intuit hosts more than 500,000 websites; wanted to collect and analyze data to recommend conversion and lead generation improvements to customers; with 10 years’ worth of user data, it took several days to process the information using a relational database.
        Impact: in one week Intuit was able to become proficient in MongoDB development; developed application features more quickly for MongoDB than for relational databases; MongoDB was 2.5 times faster than MySQL.
        We did a prototype for one week, and within one week we had made big progress. Very big progress. It was so amazing that we decided, “Let’s go with this.” -Nirmala Ranganathan, Intuit
  • 78. Rich profiles collecting multiple complex actions • Scale out to support high throughput of activities tracked • Dynamic schemas make it easy to track vendor specific attributes • Indexing and querying to support matching, frequency capping. Steps: 1 See Ad; 2 See Ad; 3 Click; 4 Convert
        { cookie_id: "1234512413243",
          advertiser: {
            apple: {
              actions: [
                { impression: 'ad1', time: 123 },
                { impression: 'ad2', time: 232 },
                { click: 'ad2', time: 235 },
                { add_to_cart: 'laptop', sku: 'asdf23f', time: 254 },
                { purchase: 'laptop', time: 354 }
              ]
            }
          }
        }
  • 79. Data archiving: meta data about artifacts; content in the library. Information discovery: have data sources that you don’t have access to; stores meta-data on those stores and figure out which ones have the content. Biometrics: retina scans; finger prints.
  • 80. Indexing and rich query API for easy searching and sorting
        db.archives.find({ "country": "Egypt" });
        Flexible data model for similar, but different objects
        { type: "Artefact", medium: "Ceramic", country: "Egypt", year: "3000 BC" }
        { ISBN: "00e8da9b", type: "Book", country: "Egypt", title: "Ancient Egypt" }
  • 81. Shutterfly uses MongoDB to safeguard more than six billion images for millions of customers in the form of photos and videos, and turn everyday pictures into keepsakes.
        Problem: managing 20TB of data (six billion images for millions of customers) partitioning by function; home-grown key value store on top of their Oracle database offered sub-par performance; codebase for this hybrid store became hard to manage; high licensing, HW costs.
        Why MongoDB: JSON-based data structure; provided Shutterfly with an agile, high performance, scalable solution at a low cost; works seamlessly with Shutterfly’s services-based architecture.
        Impact: 500% cost reduction and 900% performance improvement compared to previous Oracle implementation; accelerated time-to-market for nearly a dozen projects on MongoDB; improved performance by reducing average latency for inserts from 400ms to 2ms.
        The “really killer reason” for using MongoDB is its rich JSON-based data structure, which offers Shutterfly an agile approach to develop software. With MongoDB, the Shutterfly team can quickly develop and deploy new applications, especially Web 2.0 and social features. -Kenny Gorman, Director of Data Services
  • 82. News site: comments and user generated content; personalization of content, layout. Multi-device rendering: generate layout on the fly for each device that connects; no need to cache static pages. Sharing: store large objects; simple modeling of metadata.
  • 83. Geo spatial indexing for location based searches • Flexible data model for similar, but different objects • GridFS for large object storage • Horizontal scalability for large data sets
        { camera: "Nikon d4", location: [ -122.418333, 37.775 ] }
        { camera: "Canon 5d mkII", people: [ "Jim", "Carol" ], taken_on: ISODate("2012-03-07T18:32:35.002Z") }
        { origin: "facebook.com/photos/xwdf23fsdf", license: "Creative Commons CC0", size: { dimensions: [ 124, 52 ], units: "pixels" } }
  • 84. Wordnik uses MongoDB as the foundation for its “live” dictionary that stores its entire text corpus – 3.5T of data in 20 billion records.
        Problem: analyze a staggering amount of data for a system built on a continuous stream of high-quality text pulled from online sources; adding too much data too quickly resulted in outages, with tables locked for tens of seconds during inserts; initially launched entirely on MySQL but quickly hit performance road blocks.
        Why MongoDB: migrated 5 billion records in a single day with zero downtime; MongoDB powers every website request: 20m API calls per day; ability to eliminate the memcached layer, creating a simplified system that required fewer resources and was less prone to error.
        Impact: reduced code by 75% compared to MySQL; fetch time cut from 400ms to 60ms; sustained insert speed of 8k words per second, with frequent bursts of up to 50k per second; significant cost savings and 15% reduction in servers.
        Life with MongoDB has been good for Wordnik. Our code is faster, more flexible and dramatically smaller. Since we don’t spend time worrying about the database, we can spend more time writing code for our application. -Tony Tam, Vice President of Engineering and Technical Co-founder
  • 85. Social graphs: scale out to large graphs; easy to search and process. Identity management: authentication, authorization and accounting.
  • 86. Native support for arrays makes it easy to store connections inside user profile • Sharding partitions user profiles across available servers • Documents enable disk locality of all profile data for a user. Diagram: social graph
  • 87. Review
  • 88. Variety, Velocity and Volume make it difficult • Documents are easier for many use cases
  • 89. Distributed by default • High availability • Cloud deployment
  • 90. Document oriented data model • Highly available deployments • Strong consistency model • Horizontally scalable architecture

Editor's notes

  1. Database requirements are changing because of (i) volume, (ii) type of data, (iii) agile development, (iv) new architectures, and (v) new apps.
  2. Can’t do an update without the shard key unless doing a multi-update.