DAT209 - Scaling MongoDB on Amazon Web Services
Michael Saffitz, CTO & Co-Founder, Apptentive
November 15, 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Nice to Meet You!
Mike Saffitz
CTO, Co-Founder, Apptentive
Follow at: @msaffitz • Connect at: mike@apptentive.com

Apptentive
The easiest way for anyone with an app to talk with their customers
Follow at: @apptentive • Connect at: info@apptentive.com
Apptentive & AWS
• Route53
• CloudFront
• IAM
• S3
• Web Servers – EC2: 6 x c1.medium
• api.apptentive.com, www.apptentive.com (Elastic Load Balancer)
• apptentive.com/blog – Elastic Beanstalk, RDS
• CloudWatch
• Elastic MapReduce
• VPN Server – EC2: m1.small
• Stats & Logging – EC2: 2x m1.medium, m1.small
• Sharded MongoDB Cluster – EC2: 9 Instances
• CI & Chef – EC2: m1.medium, m1.small
• Redis – EC2: m1.medium
• Virtual Private Cloud
Agenda
• Why Scale MongoDB on AWS?
• Planning
• Deploying
• Maintaining
Why Scale MongoDB on AWS?
• Supports Diverse Set of Scenarios
• Rapidly Scale On Demand
• Simple To Administer
• Easy
• Friendly Query Syntax
• Well Documented
• Flexible
• Broad Language Support
• Competitive TCO
• Cost Effective
• Fine Grain Control Over Price & Performance
Why Not Scale MongoDB on AWS?
• Your Data is Predominantly Relational in Nature
– Consider RDS

• Don’t Want to Incur the Administrative Costs
– Hosted Alternatives
– Consider DynamoDB
1. Planning
Planning Checklist
• Topologies
– MongoDB
– AWS

• Instance Selection
• Storage
MongoDB Topologies: Single Server

mongod
MongoDB Topologies: Single ReplicaSet w/ Arbiter
• Automatic Failover
• mongod (primary)
• mongod (secondary)
– Contains Full Copy of Data on the Primary
– Can be Used for Reads
• mongod (arbiter)
– Arbiter Only Participates in Voting to Elect a New Primary
– (Must Have Odd # of Voting Members)
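As a rough illustration of the primary / secondary / arbiter layout, the sketch below builds a replica set configuration document of the kind passed to `rs.initiate()` in the mongo shell. The helper name `buildReplicaSetConfig`, the set name `rs0`, and every hostname are hypothetical placeholders, not values from the talk.

```javascript
// Build a replica set configuration document: data-bearing members first,
// then an optional arbiter. All hostnames are hypothetical placeholders.
function buildReplicaSetConfig(setName, dataHosts, arbiterHost) {
  const members = dataHosts.map((host, i) => ({ _id: i, host: host }));
  if (arbiterHost) {
    // An arbiter votes in elections but holds no data.
    members.push({ _id: members.length, host: arbiterHost, arbiterOnly: true });
  }
  // Every member here votes, so an even count risks tied elections.
  if (members.length % 2 === 0) {
    throw new Error("replica set should have an odd number of voting members");
  }
  return { _id: setName, members: members };
}

const rsConfig = buildReplicaSetConfig(
  "rs0",
  ["db1.example.internal:27017", "db2.example.internal:27017"],
  "arbiter1.example.internal:27017"
);

// In the mongo shell, connected to the first data node, this document
// would be passed as: rs.initiate(rsConfig)
console.log(JSON.stringify(rsConfig, null, 2));
```

The odd-member check mirrors the slide's "Must Have Odd #" note: with two data nodes, the arbiter is what breaks the tie.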
MongoDB Topologies: Single ReplicaSet
• Automatic Failover
• mongod (primary), mongod (secondary), mongod (secondary)
• Scale Across Instance Types
• Data Replicated Within ReplicaSet
MongoDB Topologies: Sharded Cluster
• App Server + mongos, … , App Server + mongos
• Config Servers: 3 x mongod process (config)
• Data Partitioned Across Shards
• Each Shard: mongod (primary), mongod (secondary), mongod (secondary)
– Data Replicated Within Shard
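To make "Data Partitioned Across Shards" concrete, here is a deliberately simplified model of range-based routing: each chunk owns a half-open range of shard-key values and lives on one shard, and a router looks up the chunk that covers a key. This is a teaching sketch, not MongoDB's implementation; the chunk boundaries, the shard names, and the function `routeToShard` are all hypothetical.

```javascript
// Simplified model: each chunk covers shard-key values in [min, max)
// and is assigned to exactly one shard. Values below are made up.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 5000, shard: "shard1" },
  { min: 5000, max: Infinity, shard: "shard0" },
];

// Return the shard that owns the chunk covering the given key value.
function routeToShard(chunkTable, shardKeyValue) {
  const chunk = chunkTable.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
  if (!chunk) {
    throw new Error("no chunk covers key " + shardKeyValue);
  }
  return chunk.shard;
}

console.log(routeToShard(chunks, 42));   // key falls in the first chunk
console.log(routeToShard(chunks, 1000)); // lower bound is inclusive
```

In a real cluster the chunk metadata lives on the config servers and mongos performs the lookup; rebalancing moves chunks between shards, which is where the IO cost mentioned on the next slide comes from.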
MongoDB Topologies: Picking One
• Single Server? Not For Production
• Don’t Shard Prematurely
– ReplicaSets can take you surprisingly far

• … But Don’t Wait Too Long to Shard
– Collections over 256GB may have issues migrating to shards
– Rebalancing consumes IO and can be very slow

• Pick the Right Instance Size for Your Topology…
– We’re going to get to this in a moment
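One way to act on "Don't Wait Too Long to Shard" is to watch collection sizes against the 256GB figure quoted above. The sketch below classifies collections from `{ ns, size }` records, the two fields of `db.collection.stats()` output it relies on. The function name `shardingUrgency`, the 50% warning fraction, and the sample namespaces are assumptions made for illustration only.

```javascript
const GB = 1024 * 1024 * 1024;
const SHARDING_LIMIT_BYTES = 256 * GB; // the 256GB figure from the slide

// Classify each collection by how close its data size is to the limit.
// warnFraction is a hypothetical early-warning threshold, not a MongoDB value.
function shardingUrgency(collStats, warnFraction = 0.5) {
  return collStats.map((s) => {
    let status = "ok";
    if (s.size >= SHARDING_LIMIT_BYTES) {
      status = "over-limit";
    } else if (s.size >= warnFraction * SHARDING_LIMIT_BYTES) {
      status = "plan-sharding";
    }
    return { ns: s.ns, sizeGB: Math.round(s.size / GB), status: status };
  });
}

// Hypothetical sample input shaped like db.collection.stats() output.
const report = shardingUrgency([
  { ns: "app.events", size: 300 * GB },
  { ns: "app.messages", size: 140 * GB },
  { ns: "app.users", size: 2 * GB },
]);
console.log(report);
```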
AWS Topologies: AZs & Regions
• Obvious: Distribute Across Availability Zones in a Region
– No Single Point of Failure

• Distributing Across Regions
– Shard per Region versus Shards Across Regions
– Considerations: Replication Latency, Data Transfer Costs, Administration Costs, Speedup from Geo-Based Tag Aware Sharding
Selecting an Instance: Considerations

Compute

Memory

EBS
Optimized?

Cost
Selecting an Instance: Compute
• Most Likely to Not Be A Significant Factor
– Exceptions: Heavy use of Map/Reduce, Aggregation Framework
– MongoDB 2.4 added JavaScript concurrency via V8
– Important! Only run 64-bit; 32-bit is limited to ~2GB

• Real World Numbers on m1.large:
Selecting an Instance: Memory
• Estimate Necessary Working Set
– db.runCommand( { serverStatus: 1, workingSet: 1 } )
Is pagesInMemory * 4k approaching total RAM? Is overSeconds decreasing / small?
– db.stats()

• Pick the Instance that Matches
• Monitor on MMS
– Page Faults (abstract)
– Queues (better)
– Response Times (best)
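The two questions on this slide can be turned into a small check. The sketch below takes the `workingSet` sub-document returned by `serverStatus` (fields `pagesInMemory` and `overSeconds`) plus the instance's RAM, and reports whether the working set looks too large. The function name `assessWorkingSet` and both thresholds (80% of RAM, one hour) are illustrative assumptions, not recommendations from the talk.

```javascript
const PAGE_SIZE_BYTES = 4096; // the "4k" page size from the slide

// workingSet: { pagesInMemory, overSeconds } as reported by serverStatus.
// ramFraction and minOverSeconds are hypothetical thresholds.
function assessWorkingSet(
  workingSet,
  totalRamBytes,
  { ramFraction = 0.8, minOverSeconds = 3600 } = {}
) {
  const workingSetBytes = workingSet.pagesInMemory * PAGE_SIZE_BYTES;
  const ratio = workingSetBytes / totalRamBytes;
  return {
    workingSetBytes: workingSetBytes,
    ratio: ratio,
    // Pressure: pages fill most of RAM AND are being cycled out quickly.
    underMemoryPressure:
      ratio >= ramFraction && workingSet.overSeconds < minOverSeconds,
  };
}

// Hypothetical sample on an instance with 7.5 GiB of RAM.
const ramBytes = 7.5 * 1024 * 1024 * 1024;
const assessment = assessWorkingSet(
  { pagesInMemory: 1800000, overSeconds: 600 },
  ramBytes
);
console.log(assessment);
```

A small or shrinking `overSeconds` means pages are being evicted soon after they are loaded, which is why it is combined with the RAM ratio rather than used alone.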
Selecting an Instance: EBS Optimization
• Run EBS Optimized When Available
– Especially with Provisioned IOPS

• Volume Config Impacts IO Perf Far More than Instance Selection
Storage
• Instance Storage
– Non-Durable
– Fast But Inconsistent Performance
– Can’t Use Snapshots for Backups

• “Standard” EBS
– Slower
– More Variable Performance

• Provisioned IOPS EBS
– Consistent Performance
– Don’t Under-Provision: Watch Queue Length
Storage
• RAID 10? Just use LVM on RAID 0
– More: http://blog.mongohq.com/debunking-myth-of-raid-10-as-best-practice-on-aws/

• Use XFS or Ext4
• Mount with noatime, noexec, nodiratime
Selecting an Instance: Summary
1. Lead with Working Set Requirements
2. Validate Compute is Sufficient
3. Enable EBS Optimized if Available

4. Use Provisioned IOPS EBS
5. (Confirm Cost is Acceptable)
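The first and third steps of this summary can be sketched as a simple filter: keep the instance types that are EBS-optimizable and have enough memory for the working set plus headroom, then take the smallest. The catalog below is illustrative only (2013-era types with memory figures as commonly published at the time); treat every entry, the 25% headroom, and the function name `pickInstance` as assumptions to verify against current AWS documentation.

```javascript
// Illustrative catalog; values are assumptions to check against AWS docs.
const CATALOG = [
  { type: "m1.large", memoryGiB: 7.5, ebsOptimized: true },
  { type: "m1.xlarge", memoryGiB: 15, ebsOptimized: true },
  { type: "m2.2xlarge", memoryGiB: 34.2, ebsOptimized: true },
  { type: "m2.4xlarge", memoryGiB: 68.4, ebsOptimized: true },
];

// Lead with working set: smallest EBS-optimized type whose memory covers
// the working set times a headroom factor (hypothetical default: 1.25).
function pickInstance(catalog, workingSetGiB, headroom = 1.25) {
  const needed = workingSetGiB * headroom;
  const candidates = catalog
    .filter((i) => i.ebsOptimized && i.memoryGiB >= needed)
    .sort((a, b) => a.memoryGiB - b.memoryGiB);
  return candidates.length > 0 ? candidates[0] : null;
}

const choice = pickInstance(CATALOG, 10);
console.log(choice ? choice.type : "no single instance fits; consider sharding");
```

Compute validation and cost (steps 2 and 5) are left out on purpose; they depend on workload measurements and pricing that this sketch does not model.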
2. Deploying
It’s Easy. Let me show you.
Scaling Deployment
• DevOps: Go for ‘bilities:
– Reliability, Predictability, Repeatability, and Auditability

• The Result is Easy Replaceability and Scalability
– Build your infrastructure so it can be treated like an appliance
– The impact of your decisions during planning will be significantly mitigated
DevOps Tools
• AWS Marketplace AMIs
– Preconfigured with MongoDB best practices
– Do-it-yourself scaling to ReplicaSets / Shards
– Helpful, but not a DevOps Solution

• AWS CloudFormation
– Templates for Resource Setup & Initial Configuration

• Chef, Puppet, Ansible, SaltStack, & More
– AWS OpsWorks, but limited by chef-solo
Security
• Run in a VPC
– Complications: Cross Region, Multiple Source Ingress

• Use KeyFiles & Roles
– KeyFiles: Internal authentication for cluster members
– Roles allow for user-level fine grain access control

• Advanced:
– Kerberos support in MongoDB 2.4
– SSL Support in Custom Builds & MongoDB Enterprise
3. Maintaining
Monitoring: MongoDB Monitoring Service
• Very Good, Free Holistic Monitoring
– Important: ReplLag, Page Faults, Lock %
– Informative: OpCounters, Connections, Queue Lengths

• Includes Basic Alerting of Host Failures and Metric Thresholds
• Query Profiler Details Slow Queries
– db.setProfilingLevel(1)
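Once profiling level 1 is on, slow operations are recorded in the `system.profile` collection with fields such as `op`, `ns`, and `millis`. The sketch below ranks already-fetched profile documents by duration; it is a stand-in for running `db.system.profile.find().sort({ millis: -1 }).limit(3)` in the mongo shell. The function name `slowestOps` and the sample documents are hypothetical.

```javascript
// Rank profiler documents by duration, slowest first, and format a summary.
function slowestOps(profileDocs, limit = 3) {
  return profileDocs
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.millis - a.millis)
    .slice(0, limit)
    .map((d) => `${d.op} ${d.ns} ${d.millis}ms`);
}

// Hypothetical sample documents shaped like system.profile entries.
const sampleProfile = [
  { op: "query", ns: "app.messages", millis: 210 },
  { op: "update", ns: "app.users", millis: 1250 },
  { op: "query", ns: "app.events", millis: 480 },
  { op: "insert", ns: "app.events", millis: 105 },
];

console.log(slowestOps(sampleProfile, 2));
```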
Monitoring: Amazon CloudWatch
• Detailed Resource Level Monitoring
– Important: Queue Length, Read/Write Latencies

• Versatile alerting based on Amazon Simple Notification Service (SNS)
Backups
• Delayed Secondary
– Questionable as a primary backup strategy

• Dump/Restore
– Impractical for larger deployments

• MongoDB Service
– Managed, Secure, Point in Time. Unclear suitability for larger deployments
– Expensive

• Snapshots
– Fast, Easy, Scalable. Pay Attention to Consistency (RAID, Shards)
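The consistency caveat for snapshots of striped (RAID) volumes is usually handled by pausing writes while every volume's snapshot is started. The sketch below shows that ordering with `db.fsyncLock()` and `db.fsyncUnlock()`, which exist in the mongo shell; the wrapper `snapshotWithLock`, the `createSnapshot` callback, the fake `db` object, and the volume IDs are hypothetical stand-ins so the example runs without a server. This is not how Mongolly is implemented, and sharded clusters need additional coordination (for example, stopping the balancer) that is not shown.

```javascript
// Lock writes, start a snapshot of every volume, then always unlock.
// createSnapshot is a hypothetical callback that initiates one EBS snapshot.
function snapshotWithLock(db, volumeIds, createSnapshot) {
  db.fsyncLock(); // flush pending writes to disk and block new ones
  try {
    return volumeIds.map((id) => createSnapshot(id));
  } finally {
    db.fsyncUnlock(); // release the lock even if a snapshot call throws
  }
}

// Fake db and snapshot function that just record the call order.
const calls = [];
const fakeDb = {
  fsyncLock: () => calls.push("fsyncLock"),
  fsyncUnlock: () => calls.push("fsyncUnlock"),
};
const fakeSnapshot = (id) => {
  calls.push("snapshot:" + id);
  return "snap-for-" + id;
};

const snapshotIds = snapshotWithLock(fakeDb, ["vol-a", "vol-b"], fakeSnapshot);
console.log(calls.join(" -> "));
```

Because an EBS snapshot captures the volume as of the moment it is requested, the lock only needs to be held until all snapshots have been initiated, not until they complete.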
Easy Snapshot-Based Backups With Mongolly
• Automatic topology detection, snapshotting, and snapshot management for EBS-backed MongoDB Databases
• Easy as: $ mongolly backup
• https://github.com/msaffitz/mongolly
Conclusions
• MongoDB + AWS =
• Options For All Deployment / Workload Sizes
– I/O typically the focal point for optimization

• Investing in a DevOps Strategy + Solution Makes It Near Effortless