Our application development is nearing completion. It's time to prepare our cluster for production, but are we sure the system is capable of handling the load? Have we achieved high availability? What preflight checks should we be running? Learn how Dev and Ops work together to achieve production readiness and plan for scale, availability, and monitoring.
Sizing
• Indexes need to be in RAM
• Working set needs to be in RAM
• I/O Bandwidth
- write load
- Index updates
- Working set migration
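The RAM rule above can be turned into a quick back-of-the-envelope check. The following is a minimal sketch, not a sizing tool: the function name, the example sizes, and the 20% headroom reserved for the OS and connections are all assumptions made for illustration.

```python
GIB = 1024 ** 3


def fits_in_ram(index_bytes, working_set_bytes, ram_bytes, headroom=0.2):
    """Return True if indexes plus working set fit in RAM.

    headroom is the fraction of RAM held back for the OS, connections,
    and other processes (0.2 is an assumed value, not a recommendation).
    """
    usable = ram_bytes * (1 - headroom)
    return index_bytes + working_set_bytes <= usable


if __name__ == "__main__":
    # Hypothetical node: 12 GiB of indexes, 40 GiB working set.
    for ram_gib in (64, 96):
        ok = fits_in_ram(12 * GIB, 40 * GIB, ram_gib * GIB)
        print(f"{ram_gib} GiB RAM: fits={ok}")
```

In practice the index and working-set figures would come from measurements on a representative data set rather than guesses.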
{ _id: ObjectId(),
tour: UUID,
user: UUID,
name: "Doug's Dogs",
desc: "The best hot-dog",
clues: [
"Hungry for a Coney Island?",
"Ask for Dr. Frankenfurter",
"Look for the hot dog stand"
],
"geometry": {
"type": "Point",
"coordinates": [125.6, 10.1]
}
}
Load Testing
• Test it like you use it, benchmarks don’t count
• Test to failure
• Instrument your code!
https://github.com/breinero/Firehose
https://github.com/ParsePlatform/flashback
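To make "instrument your code" and "test to failure" concrete, here is a small sketch of a latency recorder and a ramp loop that raises the load level until the 95th-percentile latency crosses a limit. It is not taken from Firehose or flashback; `LatencyRecorder`, `ramp_to_failure`, and the `run_step` callback are hypothetical names, and the p95 criterion is an assumed definition of "failure".

```python
import math
import time
from contextlib import contextmanager


class LatencyRecorder:
    """Collects per-operation latencies in seconds."""

    def __init__(self):
        self.samples = []

    @contextmanager
    def timed(self):
        # Wrap one application operation: `with recorder.timed(): do_op()`
        start = time.perf_counter()
        try:
            yield
        finally:
            self.samples.append(time.perf_counter() - start)

    def percentile(self, pct):
        """Nearest-rank percentile of the recorded samples."""
        if not self.samples:
            raise ValueError("no samples recorded")
        ordered = sorted(self.samples)
        rank = max(1, math.ceil(pct * len(ordered) / 100))
        return ordered[rank - 1]


def ramp_to_failure(run_step, levels, max_p95_seconds):
    """Run increasing load levels; return the first level that fails.

    run_step(level, recorder) is supplied by the caller and should replay
    the application's real operations at that load level.
    Returns None if no level exceeds the latency limit.
    """
    for level in levels:
        recorder = LatencyRecorder()
        run_step(level, recorder)
        if recorder.percentile(95) > max_p95_seconds:
            return level
    return None
```

The important part is that `run_step` issues the same mix of reads and writes the application does, which is the point of "test it like you use it".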
Security
• Firewall
• Bind IP
• Encrypt Networks
• Enable Access Control
• Don’t enable REST interface
• Auditing
Limit exposure and use the Principle of Least Privilege
Tuning
Best Practices
• Disable Transparent hugepages
• NTP to synchronize time
• Set ulimits
• Use XFS or Ext4
• Don’t use NFS
• Disable NUMA
• Have swap
Read Production Notes
Tunables
• Set I/O scheduler to NOOP
• Adjust readahead (MMAPv1)
• Avoid cgroups
• SELinux (?)
• RAID
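Some of these settings lend themselves to an automated preflight check. As one example, the sketch below reads the Linux sysfs file for transparent hugepages and reports whether they are disabled. The helper names are hypothetical, and the path is the usual location on mainstream Linux kernels; some distributions may expose it elsewhere.

```python
from pathlib import Path

# Usual sysfs location on Linux; treated here as an assumption.
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(contents):
    """Return the bracketed mode from text like 'always madvise [never]'."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {contents!r}")


def thp_disabled(path=THP_PATH):
    """True if transparent hugepages are set to 'never' on this host."""
    return active_thp_mode(path.read_text()) == "never"


if __name__ == "__main__":
    if THP_PATH.exists():
        print("THP disabled:", thp_disabled())
    else:
        print("THP setting not found at", THP_PATH)
```

Similar small checks could cover ulimits, filesystem type, and time synchronization, so the list becomes something Ops can run rather than just read.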
Circuit Breaker
Trigger Conditions
• Latency stats.getMean() >= max
• OpsPerSecond stats.getN() >= max
• ConcurrentOperations stats.getN()*stats.getMean() >= max
https://github.com/breinero/Firehose
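The three trigger conditions can be sketched as a single check over one sampling window. This is an illustration only, not the Firehose implementation: `WindowStats` is a hypothetical stand-in for the `stats` object on the slide, and the sketch assumes each window covers one second, so that N is operations per second and N times the mean latency estimates concurrent operations (Little's law).

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class WindowStats:
    """Latencies (seconds) of operations completed in a one-second window."""
    latencies: list

    def get_n(self):
        return len(self.latencies)

    def get_mean(self):
        return fmean(self.latencies) if self.latencies else 0.0


@dataclass
class Limits:
    max_latency: float          # seconds
    max_ops_per_second: float
    max_concurrent_ops: float


def tripped_conditions(stats, limits):
    """Return the names of the trigger conditions that are currently met."""
    n, mean = stats.get_n(), stats.get_mean()
    tripped = []
    if mean >= limits.max_latency:
        tripped.append("latency")
    if n >= limits.max_ops_per_second:
        tripped.append("ops_per_second")
    if n * mean >= limits.max_concurrent_ops:
        tripped.append("concurrent_operations")
    return tripped
```

A caller would open the breaker (shed or queue requests) whenever the returned list is non-empty, and close it again after a quiet window.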
Client Side
• Don’t use ensureIndex() in application
• Look out for connection bombs
--maxConnect
• DO use operation timeouts
• DON’T cause socket timeouts
Lower keepalives
• Avoid retry bombs
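One way to avoid retry bombs is to cap the number of attempts and spread retries out with jittered exponential backoff, so that many clients do not hammer a struggling server in lockstep. The sketch below assumes this approach; `call_with_retries` and its parameters are hypothetical, and the built-in `TimeoutError` stands in for whatever exception the driver raises when an operation timeout expires.

```python
import random
import time


def call_with_retries(operation, max_attempts=3, base_delay=0.1,
                      max_delay=2.0, sleep=time.sleep, rng=random.random):
    """Call operation(), retrying on TimeoutError a bounded number of times.

    Delays use exponential backoff with full jitter: each wait is a random
    fraction of min(max_delay, base_delay * 2**(attempt - 1)).
    The final failure is re-raised so the caller can degrade gracefully.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == max_attempts:
                raise
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(cap * rng())
```

The `sleep` and `rng` parameters are injectable so the behavior can be tested without real waiting. The operation itself should still carry a server-side or driver-side time limit, since retries only help if each attempt is bounded.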
Requirements & Specs
Make a DevOps Contract
• Database Access Requirements
• Database Access Fulfillment Specification
• Cluster Configuration
• Monitoring and Alerting Specification