In this webinar, we will be covering general best practices for running MongoDB on AWS.
Topics will range from instance selection to storage selection and service distribution to ensure service availability. We will also look at any specific best practices related to using WiredTiger. We will then shift gears and explore recommended strategies for managing your MongoDB instance on AWS.
This session also includes a live Q&A portion during which you are encouraged to ask questions of our team.
5. 5
MongoDB Basics
• Open source
• Document database
• High performance
• Horizontally scalable
• Full featured
• Built to match agile development and deployment
6. 6
MongoDB Features
• Flexible document data model
• Rich ad-hoc queries
• Real-time aggregation
• Geospatial support (Within, Intersects and Near operators)
• Text search
• Pluggable Storage Engine Architecture
• Built-in support for
– Redundancy, failover, auto-partitioning
7. 7
7x-10x Performance, 50%-80% Less Storage
How: WiredTiger Storage Engine
• Same data model, same query language, same ops
• Write performance gains driven by document-level concurrency control
• Storage savings driven by native compression
• Non-disruptive upgrade
[Chart: performance of MongoDB 3.0 vs. MongoDB 2.6]
8. 8
MMAPv1 Storage Engine
• History
– MMAPv0 was the initial storage engine of MongoDB
– Delegates memory management to operating system
• New Capabilities
– Collection-level concurrency control
– Multiple performance enhancements
– Windows performance now equivalent to Linux
• Advantages
– Read-intensive applications
– Cache survives MongoDB restart, upgrades
– Drop-in upgrade
9. 9
Accessing MongoDB
Shell
Command-line shell for interacting directly with the database
> db.collection.insert({product: "MongoDB", type: "Document Database"})
>
> db.collection.findOne()
{
    "_id" : ObjectId("5106c1c2fc629bfe52792e86"),
    "product" : "MongoDB",
    "type" : "Document Database"
}
Drivers
Drivers for most popular programming languages and frameworks
• Java
• Python
• Perl
• Ruby
• Haskell
• JavaScript
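The same insert and lookup can be issued through a driver. The following is a minimal sketch in Python, assuming the PyMongo driver is installed and a mongod is reachable at the given URI. The database name `test`, the collection name `products`, and both helper names are hypothetical and chosen only for illustration.

```python
def build_product_document():
    """Return the document inserted in the shell example above."""
    return {"product": "MongoDB", "type": "Document Database"}


def insert_and_fetch(uri="mongodb://localhost:27017"):
    """Insert the sample document and read one matching document back.

    Assumes PyMongo is installed and a mongod is reachable at `uri`
    (both are assumptions made for this sketch).
    """
    # Imported here so build_product_document() works without the driver.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        collection = client["test"]["products"]  # hypothetical names
        collection.insert_one(build_product_document())
        return collection.find_one({"product": "MongoDB"})
    finally:
        client.close()
```

Calling `insert_and_fetch()` against a running server returns the stored document, including the `_id` that the driver generates on insert.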
17. 17
EC2 Instance Types
• General Purpose
• Compute-optimized
• GPU (GPU compute resources are not needed for MongoDB)
• Memory-optimized
• Storage-optimized
• Micro (bursty, no sustained CPU)
18. 18
EC2 Instance Types
• General Purpose
– M3, M4 – (Instance Store vs EBS)
• Compute-optimized
– C3, C4 – (Instance Store vs EBS)
• Memory-optimized
– R3
• Storage-optimized
– I2, D2
19. 19
Additional Considerations
• Memory-optimized instances for larger working sets
• More CPUs are suggested for WiredTiger-based instances
• Placement groups can be used for high-bandwidth needs
20. 20
Components and Sizing
• mongod: core database process
– High performance: memory, CPU, storage, network
• config: shard metadata
– Smaller: m4.medium or better
• mongos: shard query router
– Deploy on app server
26. 26
High Availability
• Use Replica Sets
– Deploy in odd numbers
– Maintain majority
• Withstand the loss of
– Any single zone?
– Any single region?
– Deploy in 3 places
• Scale
– Replica Sets for HA
– Shards for scale
– Combine for both
[Diagram: three-member replica set with MongoDB Primary (1), MongoDB Secondary (2), MongoDB Secondary (3)]
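The "odd numbers" and "maintain majority" advice can be checked with a little arithmetic. The sketch below is an illustration only: the function names are hypothetical, and the zone labels in the usage note are example values rather than a recommended layout.

```python
from collections import Counter


def majority(members):
    """Votes needed to elect a primary in a set with `members` voters."""
    return members // 2 + 1


def fault_tolerance(members):
    """How many members can be lost while a majority still remains."""
    return members - majority(members)


def survives_any_single_loss(placement):
    """True if losing any one location still leaves a voting majority.

    `placement` lists the zone (or region) that hosts each voting member.
    """
    total = len(placement)
    per_location = Counter(placement)
    return all(total - count >= majority(total)
               for count in per_location.values())
```

With these helpers, `fault_tolerance(3)` and `fault_tolerance(4)` both return 1, which is why an even member count adds cost without adding resilience. `survives_any_single_loss(["zone-a", "zone-b", "zone-c"])` is true, while `survives_any_single_loss(["zone-a", "zone-a", "zone-b"])` is false because losing `zone-a` leaves one of three voters.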
28. 28
Sensible Instance Defaults
• Best practices are meant to be a sensible starting point
• Strive for smooth and consistent performance
• Tune -> Scale Vertically -> Scale Horizontally
• Amazon Linux optimized for EC2
• EBS provides persistent storage
• EBS-optimized instances provide dedicated network capacity for storage traffic
• Provisioned IOPS provides consistent EBS performance
• Use separate PIOPS volumes for data, log, journal
29. 29
Instance Configuration Best Practices
• Install via yum for flexibility and simplicity – See mongodb.org for details
• Update system settings (Don’t forget about NTP!)
• Use EXT4 or XFS (WiredTiger runs best on XFS)
• Set read ahead (default is too high)
• Update ulimits (default is too low)
• Update TCP KeepAlive
https://docs.mongodb.org/manual/administration/production-notes/
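Two of the settings above, ulimits and TCP keepalive, can be inspected from a script before starting mongod. This is a rough sketch for Linux hosts: the function names are hypothetical, and the thresholds (64000 open files, 300 seconds keepalive) are assumptions used as placeholders, so check the production notes linked above for the values that apply to your version.

```python
import resource


def read_open_file_limit():
    """Soft limit on open file descriptors for the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def read_tcp_keepalive(path="/proc/sys/net/ipv4/tcp_keepalive_time"):
    """TCP keepalive time in seconds (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())


def check_settings(open_files, keepalive_seconds,
                   min_open_files=64000, max_keepalive_seconds=300):
    """Return a list of warnings; thresholds are assumed placeholders."""
    warnings = []
    unlimited = open_files == resource.RLIM_INFINITY
    if not unlimited and open_files < min_open_files:
        warnings.append(
            "open file limit %d is below %d" % (open_files, min_open_files))
    if keepalive_seconds > max_keepalive_seconds:
        warnings.append(
            "tcp keepalive %ds is above %ds"
            % (keepalive_seconds, max_keepalive_seconds))
    return warnings
```

On a host, `check_settings(read_open_file_limit(), read_tcp_keepalive())` returns an empty list when both values meet the assumed thresholds.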
30. 30
Data Safety
• What’s your backup plan?
• Have you tested restoring?
• Is your data highly available?
• How do you recover from disaster?
31. 31
Protecting Your Data
• Replica Sets
– Proper deployments provide HA and DR
• Manual backup/restore
– Scriptable, tunable
• Cloud Manager Backup
– Continuous, secure backup
32. 32
Manual Backup Considerations
• Consider journaling (write-ahead log), which is on by default
• Journaling allows for DB durability in case of a fault
• With journaling, a snapshot technology can be used with MMAPv1
• MMAPv1 does in-place updates, so fsync is required if you don’t use journaling
• WiredTiger does not require fsync, as it effectively does write-ahead natively
• Journaling with WiredTiger is still a good idea
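For the MMAPv1-without-journaling case, the fsync step can wrap the snapshot so that writes are flushed and blocked while it is taken. The sketch below assumes a PyMongo-style client whose `admin.command(...)` sends database commands, and assumes the server in use supports the `fsyncUnlock` command (older releases exposed the unlock only through the shell helper). The names `writes_locked`, `snapshot_with_lock`, and `take_snapshot` are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def writes_locked(client):
    """Flush pending writes and block new ones while the body runs."""
    client.admin.command("fsync", lock=True)
    try:
        yield
    finally:
        # Always release the lock, even if the snapshot step fails.
        client.admin.command("fsyncUnlock")


def snapshot_with_lock(client, take_snapshot):
    """Run `take_snapshot` (a caller-supplied callable) under the lock."""
    with writes_locked(client):
        return take_snapshot()
```

The `take_snapshot` callable is where an EBS or LVM snapshot would be triggered. It is left to the caller because the snapshot mechanism is outside what the slides specify.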
33. 33
MongoDB Cloud Manager
• Single-click provisioning, scaling & upgrades, admin tasks – including instance deployment on EC2
• Monitoring, with charts, dashboards and alerts on 100+ metrics
• Backup and restore, with point-in-time recovery, support for sharded clusters
The Best Way to Manage MongoDB In Your Data Center
Up to 95% Reduction in Operational Overhead
34. 34
Resources
• MongoDB on AWS best practices:
– http://docs.mongodb.org/ecosystem/platforms/amazon-ec2/
• MongoDB production notes
– http://docs.mongodb.org/manual/administration/production-notes/
• MongoDB docs
– http://docs.mongodb.org