SlideShare ist ein Scribd-Unternehmen logo
1 von 34
#antoinegirbal


Understanding Storage for
performance and data
safety
Antoine Girbal
Solutions Architect, 10gen
Why pop the hood?
• Understanding data safety

• Estimating RAM / disk requirements

• Optimizing performance
Storage Layout
Directory Layout
drwxr-xr-x     4 antoine wheel     136 Nov 19 10:12 journal
-rw------- 1   antoine wheel 16777216 Oct 25 14:58 test.0
-rw------- 1   antoine wheel 134217728 Mar 13 2012 test.1
-rw------- 1   antoine wheel 268435456 Mar 13 2012 test.2
-rw------- 1   antoine wheel 536870912 May 11 2012 test.3
-rw------- 1   antoine wheel 1073741824 May 11 2012 test.4
-rw------- 1   antoine wheel 2146435072 Nov 19 10:14 test.5
-rw------- 1   antoine wheel 16777216 Nov 19 10:13 test.ns
Directory Layout
• Each database has one or more data files, all in
 same folder (e.g. test.0, test.1, …)
• Aggressive preallocation (always 1 spare file)
• Those files get larger and larger, up to 2GB
• There is one namespace file per db which can
 hold 24000 entries per default. A namespace is
 a collection or an index.
• The journal folder contains the journal files
Tuning with options
• Use --directoryperdb to separate dbs into own folders
  which allows to use different volumes (isolation,
  performance)
• Use --nopreallocate to prevent preallocation

• Use --smallfiles to keep data files smaller

• If using many databases, use –nopreallocate and --
  smallfiles to reduce storage size
• If using thousands of collections & indexes, increase
  namespace capacity with --nssize
Internal Structure
Internal File Format
• Files on disk are broken into extents which
 contain the documents
• A collection has 1 to many extents
• Extent grow exponentially up to 2GB
• Namespace entries in the ns file point to the first
 extent for that collection
Internal File Format
 test.ns                     Namespaces


 test.0    test.1   test.2


                                   Extents

      Data Files
Extent Structure
 Extent            Extent
    length           length

    xNext            xNext


    xPrev            xPrev


 firstRecord       firstRecord


  lastRecord       lastRecord
Extents and Records
Extent
  length
              Data Record                Data Record
  xNext


  xPrev       length   Documen           length   Documen
                       {
                          t                       {
                                                     t
              rNext
                           _id: “foo”,
                                         rNext        _id: “bar”,
firstRecord
                           ...                        ...
                       }                          }
              rPrev                      rPrev
lastRecord
What about indices?
Indexes
• Indexes are BTree structures serialized to disk
• They are stored in the same files as data but
 using own extents
Index Extents                                      4   9



                                         1   3     5   6   8   A   B




Extent
  length
              Index                              Index
  xNext
              Record                             Record
  xPrev       length           Bucket            length        Bucket
                                parent                             parent
              rNext                              rNext
firstRecord                    numKeys                         numKeys
              rPrev        K                     rPrev
lastRecord



                       { Document }
the db stats
> db.stats()
{
    "db" : "test",
    "collections" : 22,
    "objects" : 17000383, ## number of documents
    "avgObjSize" : 44.33690276272011,
    "dataSize" : 753744328, ## size of data
    "storageSize" : 1159569408, ## size of all containing extents
    "numExtents" : 81,
    "indexes" : 85,
    "indexSize" : 624204896, ## separate index storage size
    "fileSize" : 4176478208, ## size of data files on disk
    "nsSizeMB" : 16,
    "ok" : 1
}
the collection stats
> db.large.stats()
{
    "ns" : "test.large",
    "count" : 5000000, ## number of documents
    "size" : 280000024, ## size of data
    "avgObjSize" : 56.0000048,
    "storageSize" : 409206784, ## size of all containing extents
    "numExtents" : 18,
    "nindexes" : 1,
    "lastExtentSize" : 74846208,
    "paddingFactor" : 1, ## amount of padding
    "systemFlags" : 0,
    "userFlags" : 0,
    "totalIndexSize" : 162228192, ## separate index storage size
    "indexSizes" : {
        "_id_" : 162228192
    },
    "ok" : 1
}
What’s memory
mapping?
Memory Mapped Files
• All data files are memory mapped to RAM by the OS

• Mongo just reads / writes to RAM in the filesystem cache

• OS takes care of the rest!

• Mongo calls fsync every 60 seconds to flush changes to disk

• Virtual process size = total files size + overhead (connections,
  heap)

• If journal is on, the virtual size will be roughly doubled
Virtual Address Space

        32-bit System                 64-bit System
232 = 4GB                      264 = 1.7 x 1010 GB (16EB)?

- 1GB kernel                   0xF0 – 0xFF Kernel

- .5GB binaries, stack, etc.   0x00 – 0x7F User

= 2.5GB for data               247 = 128TB for data
               BAD                        GOOD
Virtual Address Space
  0x7fffffffffff      Kernel

                      STACK
                        …
                       LIBS
                        …
                                Disk
                     test.ns
                     test.0

                     test.1
                        …
                       …
                      HEAP      {…
                                }
                     MONGO
                       D
               0x0
                      NULL
                               Document
 Process Virtual Memory
Memory map, love it or hate it
• Pros:
   –   No complex memory / disk code in MongoDB, huge win!
   –   The OS is very good at caching for any type of storage
   –   Pure Least Recently Used behavior
   –   Cache stays warm between Mongo restart
• Cons:
   – RAM is affected by disk fragmentation
   – RAM is affected by high read-ahead
   – LRU behavior does not prioritize things (like indices)
How much data is in RAM?
• Resident memory the best indicator of how
 much data in RAM
• Resident is: process overhead (connections,
 heap) + FS pages in RAM that were accessed
• Means that it resets to 0 upon restart even
 though data is still in RAM due to FS cache
• Use free command to check on FS cache size
• Can be affected by fragmentation and read-
 ahead
Journaling
The problem
• A single insert/update involves writing to many
 places (the record, indexes, ns details..)
• What if the electricity goes out? Corruption…
Solution – use a journal
• Data gets written to a journal before making it to
 the data files
• Operations written to a journal buffer in RAM that
 gets flushed every 100ms or 100MB
• Once journal written to disk, data safe unless
 hardware entirely fails
• Journal prevents corruption and allows
 durability
• Can be turned off, but don’t!
Journal Format
JHeader             • Section contains single group
JSectHeader [LSN      commit
3]                  • Applied all-or-nothing
      DurOp
      DurOp                        Set database context
                   Op_DbContext
                                   for subsequent
      DurOp        length          operations
JSectFooter        offset
                   fileNo
JSectHeader [LSN
                   data[length]
7]
                   length
      DurOp                        Write
                   offset
      DurOp        fileNo          Operation
                   data[length]
      DurOp        length
JSectFooter        offset
                   fileNo
…                  data[length]
Can I lose data on hard
crash?
• Maximum data loss is 100ms (journal flush). This
 can be reduced with –journalCommitInterval
• For durability (data is on disk when ack’ed) use the
 JOURNAL_SAFE write concern (“j” option).
• Note that replication can reduce the data loss
 further. Use the REPLICAS_SAFE write concern
 (“w” option).
• As write guarantees increase, latency increases.
 To maintain performance, use more connections!
What is cost of journal?
• On read-heavy systems, no impact
• Write performance is reduced by 5-30%
• If using separate drive for journal, as low as 3%
• For apps that are write-heavy (1000+ writes per
 server) there can be slowdown due to mix of
 journal and data flushes. Use a separate drive!
Fragmentation
Fragmentation
• Files can get fragmented over time if remove()
 and update() are issued.
• It gets worse if documents have varied sizes
• Fragmentation wastes disk space and RAM
• Also makes writes scattered and slower
• Fragmentation can be checked by comparing
 size to storageSize in the collection’s stats.
How it looks like
  EXTENT



   Doc     Doc        Doc      X    Doc         X


     Doc         X          Doc           Doc       X


                               …



                     BOTH ON DISK AND IN RAM!
How to combat
fragmentation?
• compact command (maintenance op)
• Normalize schema more (documents don’t
 grow)
• Pre-pad documents (documents don’t grow)
• Use separate collections over time, then use
 collection.drop() instead of
 collection.remove(query)
• --usePowerOf2sizes option makes disk buckets
 more reusable
Conclusion
• Understand disk layout and footprint
• See how much data is actually in RAM
• Memory mapping is cool
• Answer how much data is ok to lose
• Check on fragmentation and avoid it
#antoinegirbal




Questions?
Antoine Girbal
Solutions Architect, 10gen

Weitere ähnliche Inhalte

Was ist angesagt?

HDFS User Reference
HDFS User ReferenceHDFS User Reference
HDFS User ReferenceBiju Nair
 
Ceph Day KL - Bluestore
Ceph Day KL - Bluestore Ceph Day KL - Bluestore
Ceph Day KL - Bluestore Ceph Community
 
Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012
Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012
Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012Big Data Spain
 
Demystifying Storage
Demystifying  StorageDemystifying  Storage
Demystifying Storagebhavintu79
 
DaStor/Cassandra report for CDR solution
DaStor/Cassandra report for CDR solutionDaStor/Cassandra report for CDR solution
DaStor/Cassandra report for CDR solutionSchubert Zhang
 
Facebook's Approach to Big Data Storage Challenge
Facebook's Approach to Big Data Storage ChallengeFacebook's Approach to Big Data Storage Challenge
Facebook's Approach to Big Data Storage ChallengeDataWorks Summit
 
Storage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Storage Systems for big data - HDFS, HBase, and intro to KV Store - RedisStorage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Storage Systems for big data - HDFS, HBase, and intro to KV Store - RedisSameer Tiwari
 
Hadoop Interacting with HDFS
Hadoop Interacting with HDFSHadoop Interacting with HDFS
Hadoop Interacting with HDFSApache Apex
 
General commands for navisphere cli
General commands for navisphere cliGeneral commands for navisphere cli
General commands for navisphere climsaleh1234
 
Hadoop World 2011: HDFS Federation - Suresh Srinivas, Hortonworks
Hadoop World 2011: HDFS Federation - Suresh Srinivas, HortonworksHadoop World 2011: HDFS Federation - Suresh Srinivas, Hortonworks
Hadoop World 2011: HDFS Federation - Suresh Srinivas, HortonworksCloudera, Inc.
 
Mass storagestructure pre-final-formatting
Mass storagestructure pre-final-formattingMass storagestructure pre-final-formatting
Mass storagestructure pre-final-formattingmarangburu42
 
ASPLOS2011 workshop RESoLVE "Effect of Disk Prefetching of Guest OS "
ASPLOS2011 workshop RESoLVE "Effect of Disk Prefetching of Guest OS "ASPLOS2011 workshop RESoLVE "Effect of Disk Prefetching of Guest OS "
ASPLOS2011 workshop RESoLVE "Effect of Disk Prefetching of Guest OS "Kuniyasu Suzaki
 
AOS Lab 10: File system -- Inodes and beyond
AOS Lab 10: File system -- Inodes and beyondAOS Lab 10: File system -- Inodes and beyond
AOS Lab 10: File system -- Inodes and beyondZubair Nabi
 
HDFS introduction
HDFS introductionHDFS introduction
HDFS introductioninjae yeo
 
CASPUR Staging System II
CASPUR Staging System IICASPUR Staging System II
CASPUR Staging System IIAndrea PETRUCCI
 
12. Computer Systems Hardware 2
12. Computer Systems   Hardware 212. Computer Systems   Hardware 2
12. Computer Systems Hardware 2New Era University
 

Was ist angesagt? (20)

PyData Paris 2015 - Closing keynote Francesc Alted
PyData Paris 2015 - Closing keynote Francesc AltedPyData Paris 2015 - Closing keynote Francesc Alted
PyData Paris 2015 - Closing keynote Francesc Alted
 
HDFS User Reference
HDFS User ReferenceHDFS User Reference
HDFS User Reference
 
Ceph Day KL - Bluestore
Ceph Day KL - Bluestore Ceph Day KL - Bluestore
Ceph Day KL - Bluestore
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012
Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012
Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012
 
Demystifying Storage
Demystifying  StorageDemystifying  Storage
Demystifying Storage
 
DaStor/Cassandra report for CDR solution
DaStor/Cassandra report for CDR solutionDaStor/Cassandra report for CDR solution
DaStor/Cassandra report for CDR solution
 
Facebook's Approach to Big Data Storage Challenge
Facebook's Approach to Big Data Storage ChallengeFacebook's Approach to Big Data Storage Challenge
Facebook's Approach to Big Data Storage Challenge
 
Storage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Storage Systems for big data - HDFS, HBase, and intro to KV Store - RedisStorage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Storage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
 
Hadoop Interacting with HDFS
Hadoop Interacting with HDFSHadoop Interacting with HDFS
Hadoop Interacting with HDFS
 
General commands for navisphere cli
General commands for navisphere cliGeneral commands for navisphere cli
General commands for navisphere cli
 
Hadoop World 2011: HDFS Federation - Suresh Srinivas, Hortonworks
Hadoop World 2011: HDFS Federation - Suresh Srinivas, HortonworksHadoop World 2011: HDFS Federation - Suresh Srinivas, Hortonworks
Hadoop World 2011: HDFS Federation - Suresh Srinivas, Hortonworks
 
Mass storagestructure pre-final-formatting
Mass storagestructure pre-final-formattingMass storagestructure pre-final-formatting
Mass storagestructure pre-final-formatting
 
Anatomy of file write in hadoop
Anatomy of file write in hadoopAnatomy of file write in hadoop
Anatomy of file write in hadoop
 
ASPLOS2011 workshop RESoLVE "Effect of Disk Prefetching of Guest OS "
ASPLOS2011 workshop RESoLVE "Effect of Disk Prefetching of Guest OS "ASPLOS2011 workshop RESoLVE "Effect of Disk Prefetching of Guest OS "
ASPLOS2011 workshop RESoLVE "Effect of Disk Prefetching of Guest OS "
 
AOS Lab 10: File system -- Inodes and beyond
AOS Lab 10: File system -- Inodes and beyondAOS Lab 10: File system -- Inodes and beyond
AOS Lab 10: File system -- Inodes and beyond
 
HDFS introduction
HDFS introductionHDFS introduction
HDFS introduction
 
Fast File System
Fast File SystemFast File System
Fast File System
 
CASPUR Staging System II
CASPUR Staging System IICASPUR Staging System II
CASPUR Staging System II
 
12. Computer Systems Hardware 2
12. Computer Systems   Hardware 212. Computer Systems   Hardware 2
12. Computer Systems Hardware 2
 

Andere mochten auch

DWDM-RAM: DARPA-Sponsored Research for Data Intensive Service-on-Demand Advan...
DWDM-RAM: DARPA-Sponsored Research for Data Intensive Service-on-Demand Advan...DWDM-RAM: DARPA-Sponsored Research for Data Intensive Service-on-Demand Advan...
DWDM-RAM: DARPA-Sponsored Research for Data Intensive Service-on-Demand Advan...Tal Lavian Ph.D.
 
dataviz on d3.js + elasticsearch
dataviz on d3.js + elasticsearchdataviz on d3.js + elasticsearch
dataviz on d3.js + elasticsearchMathieu Elie
 
elasticsearch basics workshop
elasticsearch basics workshopelasticsearch basics workshop
elasticsearch basics workshopMathieu Elie
 
Graphical Analysis of PV Plant Data
Graphical Analysis of PV Plant DataGraphical Analysis of PV Plant Data
Graphical Analysis of PV Plant DataCupertinoElectric
 
Data vizualisation: d3.js + sinatra + elasticsearch
Data vizualisation: d3.js + sinatra + elasticsearchData vizualisation: d3.js + sinatra + elasticsearch
Data vizualisation: d3.js + sinatra + elasticsearchMathieu Elie
 
Applying Safety Data and Analysis to Performance-based Transportation Planning
Applying Safety Data and Analysis to Performance-based Transportation PlanningApplying Safety Data and Analysis to Performance-based Transportation Planning
Applying Safety Data and Analysis to Performance-based Transportation PlanningRPO America
 
Ruby eventmachine pres at rubybdx
Ruby eventmachine pres at rubybdxRuby eventmachine pres at rubybdx
Ruby eventmachine pres at rubybdxMathieu Elie
 
Measuring Safety Performance - An Analyst’s Perspective
Measuring Safety Performance - An Analyst’s PerspectiveMeasuring Safety Performance - An Analyst’s Perspective
Measuring Safety Performance - An Analyst’s Perspectivewalk_the_safety_talk
 

Andere mochten auch (10)

DWDM-RAM: DARPA-Sponsored Research for Data Intensive Service-on-Demand Advan...
DWDM-RAM: DARPA-Sponsored Research for Data Intensive Service-on-Demand Advan...DWDM-RAM: DARPA-Sponsored Research for Data Intensive Service-on-Demand Advan...
DWDM-RAM: DARPA-Sponsored Research for Data Intensive Service-on-Demand Advan...
 
Safety kpi examples
Safety kpi examplesSafety kpi examples
Safety kpi examples
 
dataviz on d3.js + elasticsearch
dataviz on d3.js + elasticsearchdataviz on d3.js + elasticsearch
dataviz on d3.js + elasticsearch
 
elasticsearch basics workshop
elasticsearch basics workshopelasticsearch basics workshop
elasticsearch basics workshop
 
Linked Data on Rails
Linked Data on RailsLinked Data on Rails
Linked Data on Rails
 
Graphical Analysis of PV Plant Data
Graphical Analysis of PV Plant DataGraphical Analysis of PV Plant Data
Graphical Analysis of PV Plant Data
 
Data vizualisation: d3.js + sinatra + elasticsearch
Data vizualisation: d3.js + sinatra + elasticsearchData vizualisation: d3.js + sinatra + elasticsearch
Data vizualisation: d3.js + sinatra + elasticsearch
 
Applying Safety Data and Analysis to Performance-based Transportation Planning
Applying Safety Data and Analysis to Performance-based Transportation PlanningApplying Safety Data and Analysis to Performance-based Transportation Planning
Applying Safety Data and Analysis to Performance-based Transportation Planning
 
Ruby eventmachine pres at rubybdx
Ruby eventmachine pres at rubybdxRuby eventmachine pres at rubybdx
Ruby eventmachine pres at rubybdx
 
Measuring Safety Performance - An Analyst’s Perspective
Measuring Safety Performance - An Analyst’s PerspectiveMeasuring Safety Performance - An Analyst’s Perspective
Measuring Safety Performance - An Analyst’s Perspective
 

Ähnlich wie Webinar: Understanding Storage for Performance and Data Safety

Storage talk
Storage talkStorage talk
Storage talkchristkv
 
MongoDB Journaling and the Storage Enginer
MongoDB Journaling and the Storage EnginerMongoDB Journaling and the Storage Enginer
MongoDB Journaling and the Storage EnginerMongoDB
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)MongoDB
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDBRick Copeland
 
MongoDB Best Practices in AWS
MongoDB Best Practices in AWS MongoDB Best Practices in AWS
MongoDB Best Practices in AWS Chris Harris
 
Deployment Preparedness
Deployment Preparedness Deployment Preparedness
Deployment Preparedness MongoDB
 
Google File System
Google File SystemGoogle File System
Google File SystemDreamJobs1
 
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]Kyle Hailey
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment StrategyMongoDB
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesDatabricks
 
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...npinto
 
Bigdata and Hadoop
 Bigdata and Hadoop Bigdata and Hadoop
Bigdata and HadoopGirish L
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment StrategiesMongoDB
 
Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanNarayana B
 
High Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudHigh Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudMongoDB
 
Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheNicolas Poggi
 
High Performance With Java
High Performance With JavaHigh Performance With Java
High Performance With Javamalduarte
 
Fast and Scalable Python
Fast and Scalable PythonFast and Scalable Python
Fast and Scalable PythonTravis Oliphant
 

Ähnlich wie Webinar: Understanding Storage for Performance and Data Safety (20)

Storage talk
Storage talkStorage talk
Storage talk
 
Five steps perform_2009 (1)
Five steps perform_2009 (1)Five steps perform_2009 (1)
Five steps perform_2009 (1)
 
5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance
 
MongoDB Journaling and the Storage Enginer
MongoDB Journaling and the Storage EnginerMongoDB Journaling and the Storage Enginer
MongoDB Journaling and the Storage Enginer
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDB
 
MongoDB Best Practices in AWS
MongoDB Best Practices in AWS MongoDB Best Practices in AWS
MongoDB Best Practices in AWS
 
Deployment Preparedness
Deployment Preparedness Deployment Preparedness
Deployment Preparedness
 
Google File System
Google File SystemGoogle File System
Google File System
 
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
 
Bigdata and Hadoop
 Bigdata and Hadoop Bigdata and Hadoop
Bigdata and Hadoop
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment Strategies
 
Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_Plan
 
High Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudHigh Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal Cloud
 
Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket Cache
 
High Performance With Java
High Performance With JavaHigh Performance With Java
High Performance With Java
 
Fast and Scalable Python
Fast and Scalable PythonFast and Scalable Python
Fast and Scalable Python
 

Mehr von MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Mehr von MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Kürzlich hochgeladen

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 

Kürzlich hochgeladen (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Webinar: Understanding Storage for Performance and Data Safety

  • 1. #antoinegirbal Understanding Storage for performance and data safety Antoine Girbal Solutions Architect, 10gen
  • 2. Why pop the hood? • Understanding data safety • Estimating RAM / disk requirements • Optimizing performance
  • 4. Directory Layout drwxr-xr-x 4 antoine wheel 136 Nov 19 10:12 journal -rw------- 1 antoine wheel 16777216 Oct 25 14:58 test.0 -rw------- 1 antoine wheel 134217728 Mar 13 2012 test.1 -rw------- 1 antoine wheel 268435456 Mar 13 2012 test.2 -rw------- 1 antoine wheel 536870912 May 11 2012 test.3 -rw------- 1 antoine wheel 1073741824 May 11 2012 test.4 -rw------- 1 antoine wheel 2146435072 Nov 19 10:14 test.5 -rw------- 1 antoine wheel 16777216 Nov 19 10:13 test.ns
  • 5. Directory Layout • Each database has one or more data files, all in same folder (e.g. test.0, test.1, …) • Aggressive preallocation (always 1 spare file) • Those files get larger and larger, up to 2GB • There is one namespace file per db which can hold 24000 entries per default. A namespace is a collection or an index. • The journal folder contains the journal files
  • 6. Tuning with options • Use --directoryperdb to separate dbs into own folders which allows to use different volumes (isolation, performance) • Use --nopreallocate to prevent preallocation • Use --smallfiles to keep data files smaller • If using many databases, use –nopreallocate and -- smallfiles to reduce storage size • If using thousands of collections & indexes, increase namespace capacity with --nssize
  • 8. Internal File Format • Files on disk are broken into extents which contain the documents • A collection has 1 to many extents • Extent grow exponentially up to 2GB • Namespace entries in the ns file point to the first extent for that collection
  • 9. Internal File Format test.ns Namespaces test.0 test.1 test.2 Extents Data Files
  • 10. Extent Structure Extent Extent length length xNext xNext xPrev xPrev firstRecord firstRecord lastRecord lastRecord
  • 11. Extents and Records Extent length Data Record Data Record xNext xPrev length Documen length Documen { t { t rNext _id: “foo”, rNext _id: “bar”, firstRecord ... ... } } rPrev rPrev lastRecord
  • 13. Indexes • Indexes are BTree structures serialized to disk • They are stored in the same files as data but using own extents
  • 14. Index Extents 4 9 1 3 5 6 8 A B Extent length Index Index xNext Record Record xPrev length Bucket length Bucket parent parent rNext rNext firstRecord numKeys numKeys rPrev K rPrev lastRecord { Document }
  • 15. the db stats > db.stats() { "db" : "test", "collections" : 22, "objects" : 17000383, ## number of documents "avgObjSize" : 44.33690276272011, "dataSize" : 753744328, ## size of data "storageSize" : 1159569408, ## size of all containing extents "numExtents" : 81, "indexes" : 85, "indexSize" : 624204896, ## separate index storage size "fileSize" : 4176478208, ## size of data files on disk "nsSizeMB" : 16, "ok" : 1 }
  • 16. the collection stats > db.large.stats() { "ns" : "test.large", "count" : 5000000, ## number of documents "size" : 280000024, ## size of data "avgObjSize" : 56.0000048, "storageSize" : 409206784, ## size of all containing extents "numExtents" : 18, "nindexes" : 1, "lastExtentSize" : 74846208, "paddingFactor" : 1, ## amount of padding "systemFlags" : 0, "userFlags" : 0, "totalIndexSize" : 162228192, ## separate index storage size "indexSizes" : { "_id_" : 162228192 }, "ok" : 1 }
  • 18. Memory Mapped Files • All data files are memory mapped to RAM by the OS • Mongo just reads / writes to RAM in the filesystem cache • OS takes care of the rest! • Mongo calls fsync every 60 seconds to flush changes to disk • Virtual process size = total files size + overhead (connections, heap) • If journal is on, the virtual size will be roughly doubled
  • 19. Virtual Address Space 32-bit System 64-bit System 232 = 4GB 264 = 1.7 x 1010 GB (16EB)? - 1GB kernel 0xF0 – 0xFF Kernel - .5GB binaries, stack, etc. 0x00 – 0x7F User = 2.5GB for data 247 = 128TB for data BAD GOOD
  • 20. Virtual Address Space 0x7fffffffffff Kernel STACK … LIBS … Disk test.ns test.0 test.1 … … HEAP {… } MONGO D 0x0 NULL Document Process Virtual Memory
  • 21. Memory map, love it or hate it • Pros: – No complex memory / disk code in MongoDB, huge win! – The OS is very good at caching for any type of storage – Pure Least Recently Used behavior – Cache stays warm between Mongo restart • Cons: – RAM is affected by disk fragmentation – RAM is affected by high read-ahead – LRU behavior does not prioritize things (like indices)
  • 22. How much data is in RAM? • Resident memory the best indicator of how much data in RAM • Resident is: process overhead (connections, heap) + FS pages in RAM that were accessed • Means that it resets to 0 upon restart even though data is still in RAM due to FS cache • Use free command to check on FS cache size • Can be affected by fragmentation and read- ahead
  • 24. The problem • A single insert/update involves writing to many places (the record, indexes, ns details..) • What if the electricity goes out? Corruption…
  • 25. Solution – use a journal • Data gets written to a journal before making it to the data files • Operations written to a journal buffer in RAM that gets flushed every 100ms or 100MB • Once journal written to disk, data safe unless hardware entirely fails • Journal prevents corruption and allows durability • Can be turned off, but don’t!
  • 26. Journal Format JHeader • Section contains single group JSectHeader [LSN commit 3] • Applied all-or-nothing DurOp DurOp Set database context Op_DbContext for subsequent DurOp length operations JSectFooter offset fileNo JSectHeader [LSN data[length] 7] length DurOp Write offset DurOp fileNo Operation data[length] DurOp length JSectFooter offset fileNo … data[length]
  • 27. Can I lose data on hard crash? • Maximum data loss is 100ms (journal flush). This can be reduced with –journalCommitInterval • For durability (data is on disk when ack’ed) use the JOURNAL_SAFE write concern (“j” option). • Note that replication can reduce the data loss further. Use the REPLICAS_SAFE write concern (“w” option). • As write guarantees increase, latency increases. To maintain performance, use more connections!
  • 28. What is cost of journal? • On read-heavy systems, no impact • Write performance is reduced by 5-30% • If using separate drive for journal, as low as 3% • For apps that are write-heavy (1000+ writes per server) there can be slowdown due to mix of journal and data flushes. Use a separate drive!
  • 30. Fragmentation • Files can get fragmented over time if remove() and update() are issued. • It gets worse if documents have varied sizes • Fragmentation wastes disk space and RAM • Also makes writes scattered and slower • Fragmentation can be checked by comparing size to storageSize in the collection’s stats.
  • 31. How it looks like EXTENT Doc Doc Doc X Doc X Doc X Doc Doc X … BOTH ON DISK AND IN RAM!
  • 32. How to combat fragmentation? • compact command (maintenance op) • Normalize schema more (documents don’t grow) • Pre-pad documents (documents don’t grow) • Use separate collections over time, then use collection.drop() instead of collection.remove(query) • --usePowerOf2sizes option makes disk buckets more reusable
  • 33. Conclusion • Understand disk layout and footprint • See how much data is actually in RAM • Memory mapping is cool • Answer how much data is ok to lose • Check on fragmentation and avoid it

Hinweis der Redaktion

  1. We’ll cover how data is laid out in more detail later, but at a very high level this is what’s contained in each data file.Extents are allocated within files as needed until space inside the file is fully consumed by extents, then another file is allocated.As more and more data is inserted into a collection, the size of the allocated extents grows exponentially up to 2GB.If you’re writing to many small collections, extents will be interleaved within a single database file, but as the collection grows, eventually a single 2GB file will be devoted to data for that single collection.
  2. A contiguous region of a file that is allocated to a specific namespace (namespace being either an index or a collection). Extents are allocated within files as needed until space inside the file is fully consumed by extents, then another file is allocated.The extents within a namespace are chained to each other with the xNext and xPrev record pointers.Important to note that a file can have extents of different namespaces interleaved. Extents will be sized progressively larger from 4k up to 2GB.Extent is comprised of records, and these are chained together with a doubly linked list referenced from the extent. So if you were to do a table scan such as db.foo.find() we would first lookup the first extent from the ns details, and from that access the firstRecord and traverse the list of records until the query was satisfied.
  3. Imagine your data laid out in these (possibly) sparse extents like a backbone, with linked lists of data records hanging off of each of themGo over table scan logic. $natural  (sort({$natural:-1})When executing a find() with no parameters, the database returns objects in forward natural order. To do a TABLE SCAN, you would start at the first extent indicated in the NamespaceDetails and then traverse the records within.
  4. Variable length keys, fixed size bucketConstraints: 1k per index key limit; 8k bucket sizeKeyNodes (sorted and point to DataRecord and a KeyNode with offset) KeyObjects (unsorted and variable length)Keys names are not stored in the index – i.e., if you index spec was on CityName and EmailAddress, we won’t store the keyName, but we would store that key value as a document with zero length key.As of 2.0 in the v1 indexes we remove the extra overhead of keeping the document structure and store just the keys themselvesB-tree uses standard split/merge.Unless we detect a monotonically increasing key value, in this case we’ll do a 90/10 split. So in the case that you are inserting timestamp data or an ObjectID, your buckets will stay more optimally filled. This also has the affect that if you are only interested in things that happened recently, the old stuff will fall out of memory and won’t need to be-paged in.
  5. 64-BIT:Note that you may think with 64 bits of addressable memory, you might think you have most of that for data files, but in fact you’re only allowed to address 48 bits of memory and the top bit is reserved for kernel memoryThis leaves about 128TB for your data (per process)How many plan to store > 128TB on a single server?Remember, this is just per-process, if you had a sharded system of 100 dbs, you could spread 128TB to each server
  6. Can use pmap to view this layout on linux.So here I have a single document located on disk and show how it’s mapped into memory.Let’s look at how we go about actually accessing this document once it’s loaded into memory
  7. Without journaling, the approach is quite straightforward, there is a one-to-one mapping of data files to memory and when either the OS or an explicit fsync happens, your data is now safe on disk.With journaling we do some tricks.Write ahead log, that is, we write the data to the journal before we update the data itself.Each file is mapped twice, once to a private view which is marked copy-on-write, and once to the shared view – shared in the context that the disk has access to this memory.Every time we do a write, we keep a list of the region of memory that was written to.Batches into group commits, compresses and appends in a group commit to disk by appending to a special journal fileOnce that data has been written to disk, we then do a remapping phase which copies the changes into the shared view, at which point those changes can then be synced to disk.Once that data is synced to disk then it’s safe (barring hardware failure). If there is a failure before the shared/storage view is written to disk, we simply need to apply all the changes in order to the data files since the last time it was synced and we get back to a consistent view of the data
  8. *** Journal is Compressed using snappy to reduce the write volumeLSN (Last Sequence Number) is written to disk as a way to provide a marker to which section of the journal was last flushed to