MongoDB at MMF
From a DevOps Perspective

      Jan 24, 2013
Introduction

    MapMyFitness was founded in 2007

    Offices in Denver, CO & Austin, TX
    (w/ associates in SF, Boston, New York, LA, and Chicago)

    Over 13 million registered users

    ~80 million geo-data routes
    (runs, rides, walks, hikes, etc.)

    Core sites, mobile apps, API, white-label
    (MapMyRun, MapMyRide, MapMyFitness)
MMF Platform Overview

• Python (Django) & PHP (legacy API)

• Although MySQL is the core backing DB for Django, the majority of
  MMF data lives in various MongoDB datastores.

• Routes datastore has ~120 million objects, currently 7TB+ of data
  (3-member replica set backed by an EMC SAN, 48GB RAM each)

• Django sessions converted to using MongoDB
  (functional scaling example, 600M sessions stored)

• Live Tracking system utilizes elastic replica set membership to
  handle load scaling for events

• Granular API access/error logging via JSON to MongoDB
Route & Elevation data example
   (Lost on the way to MongoSeattle)
Implementation Patterns

• Standard Datastore - 3-member replica set
    (small to medium implementations)

• Big Data implementation – sharded cluster (TB+)

• Buffering Layer - high memory
    (load all data and index files into RAM)

• Write Heavy - utilize sharding to optimize for writes

• Read Heavy - 3+n replica set configuration for rapid read scaling
    (up to 12 nodes)
Implementation Patterns

• In the cloud, tune the instance type to the mongo implementation

• On iron, plan carefully and dedicate servers completely to mongo to
    avoid memory map contention

• For DR, spin up a delayed, hidden replica node (preferably in a
    different datacenter)

• Aggregation framework can be used in myriad ways, including
    bridging the gap to SQL data warehousing via ETL.

• Automate install patterns for rapid development, prototyping, and
    infrastructure scaling.
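The delayed, hidden DR node mentioned above comes down to a few fields in the replica set member document. The following is a minimal sketch, not the configuration MMF actually used: the helper name, the host name, and the one-hour delay are assumptions for illustration. `slaveDelay` is the field name in the 2.x releases this deck covers; newer MongoDB releases call it `secondaryDelaySecs`.

```python
def delayed_hidden_member(member_id, host, delay_seconds=3600):
    """Build a replica set member document for a delayed, hidden DR node.

    The helper name and the default delay are illustrative assumptions.
    """
    return {
        "_id": member_id,
        "host": host,
        "priority": 0,                 # a delayed member must never become primary
        "hidden": True,                # keep application reads away from stale data
        "slaveDelay": delay_seconds,   # 2.x field name; later renamed secondaryDelaySecs
    }


# Hypothetical host in a second datacenter.
dr_member = delayed_hidden_member(3, "dr-db01.example.net:27017")
```

The resulting document would be appended to the `members` array of the replica set configuration before a reconfig.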
Operational Automation
( example of automated mongodb install via puppet )
Replica Set Expansion


•   MongoDB is “replication made elegant”
•   Ridiculously simple to add additional members
•   Be sure to run the initial sync from a secondary
    rs.add({ "host" : "livetrack_db09", "initialSync" : { "state" : 2 } })
•   Both rs.add() and rs.remove() can be scripted and connected to
    monitoring systems for autoscaling
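When membership changes are driven by a monitoring system instead of the shell, the script effectively edits the replica set configuration document and submits it with the `replSetReconfig` command. The sketch below shows only the document manipulation, under stated assumptions: the function names and host names are hypothetical, and sending the result through a driver is left out so the example stays self-contained.

```python
import copy


def add_member(config, host):
    """Return a copy of a replica set config with `host` appended as a new member."""
    new = copy.deepcopy(config)
    next_id = max((m["_id"] for m in new["members"]), default=-1) + 1
    new["members"].append({"_id": next_id, "host": host})
    new["version"] += 1  # a reconfig must carry a higher version number
    return new


def remove_member(config, host):
    """Return a copy of a replica set config with `host` removed."""
    new = copy.deepcopy(config)
    new["members"] = [m for m in new["members"] if m["host"] != host]
    if len(new["members"]) == len(config["members"]):
        raise ValueError("host is not a member: " + host)
    new["version"] += 1
    return new


# Hypothetical starting configuration for an event-day scale-up.
current = {
    "_id": "livetrack",
    "version": 3,
    "members": [
        {"_id": 0, "host": "livetrack_db01:27017"},
        {"_id": 1, "host": "livetrack_db02:27017"},
    ],
}
scaled_up = add_member(current, "livetrack_db09:27017")
```

Both helpers leave the input untouched, which makes it easy for an automation job to log the before and after documents.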
Monitoring and Introspection

• MMS, 10gen's cloud-based monitoring service (best available)

• Supported by Zabbix, Nagios, Munin, Server Density, etc.

• mongostat, mongotop, REST interface, database profiler

• Monitoring system triggers can initiate node additions,
  removals, service restarts, etc.

• In addition to service-level monitoring, use more advanced
  tests to check for and alert on query latency spikes
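One way to read "alert on query latency spikes" is to compare each new measurement against a rolling baseline. The following is a rough sketch of that idea, not the check described in the talk: the function name, the three-sigma rule, and the 50 ms floor are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def is_latency_spike(history_ms, latest_ms, sigmas=3.0, floor_ms=50.0):
    """Flag `latest_ms` when it exceeds the recent baseline by `sigmas` deviations.

    `floor_ms` keeps a very quiet baseline from alerting on harmless jitter.
    """
    if len(history_ms) < 2:
        return False  # not enough data to form a baseline
    threshold = max(mean(history_ms) + sigmas * pstdev(history_ms), floor_ms)
    return latest_ms > threshold


# Timings (in ms) of a known test query, collected by the monitoring agent.
recent = [10, 12, 11, 9, 10]
alert = is_latency_spike(recent, 200)
```

A monitoring trigger could run a known query on a schedule, feed the timings into a check like this, and fire an action when it returns true.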
10gen's MMS
(the one-stop shop for mongodb metrics)
Mongo in Zabbix
( Mikoomi Plugins: http://code.google.com/p/mikoomi )
mongostat
( Very useful for real-time troubleshooting )
Operational Automation
( example of automated mongodb restart action )
Security Considerations

• MongoDB provides authentication support and basic permissions

• Auth is turned off by default to allow for optimal performance

• Always run databases in a trusted network environment

• Lock down host-based firewalls to limit access to required clients

• Automate iptables with Puppet or Chef; in EC2 use security groups
Network Security Automation

## Puppet Pattern for Mongodb network security


class iptables::public {

      iptables::add_rule { '001 MongoDB established':
          rule => '-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT'
      }

      iptables::add_rule { '002 MongoDB':
          rule => '-A RH-Firewall-1-INPUT -i eth1 -p tcp -m tcp --dport 27017 -j ACCEPT'
      }

      iptables::add_rule { '003 MongoDB MMF Phase II Network':
          rule => '-A RH-Firewall-1-INPUT -i eth0 -s 172.16.16.0/20 -p tcp -m tcp --dport 27017 -j ACCEPT'
      }

      iptables::add_rule { '004 MongoDB MMF Cloud Network':
          rule => '-A RH-Firewall-1-INPUT -i eth0 -s 10.178.52.0/24 -p tcp -m tcp --dport 27017 -j ACCEPT'
      }

  }
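The per-network rules in the Puppet class above differ only in the source CIDR, so they lend themselves to generation from a list of trusted networks. The sketch below is one possible way to do that and is not part of the original manifest: the function name is hypothetical, and the chain, interface, and port defaults are simply copied from the rules shown.

```python
import ipaddress


def mongo_accept_rule(cidr, iface="eth0", port=27017, chain="RH-Firewall-1-INPUT"):
    """Render one iptables ACCEPT rule for MongoDB traffic from `cidr`.

    ipaddress.ip_network raises ValueError for a malformed network or one
    with host bits set, so a typo fails before it reaches the firewall.
    """
    network = ipaddress.ip_network(cidr)
    return (
        f"-A {chain} -i {iface} -s {network} "
        f"-p tcp -m tcp --dport {port} -j ACCEPT"
    )


trusted_networks = ["172.16.16.0/20", "10.178.52.0/24"]
rules = [mongo_accept_rule(cidr) for cidr in trusted_networks]
```

The rendered strings could then be fed to whatever templating the configuration management tool provides.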
Security Considerations

• Use the rule of least privilege to allow access to environments

• Data sensitivity should determine the extent of security measures

• For non-sensitive data, good network security can be sufficient

• In open environments, be sure experience matches access level

• Lack of granular perms allows for full admin access, use discretion
Maintenance

• Far less maintenance required than traditional RDBMS systems

• Regularly perform query profile analysis and index auditing

• Rebuild databases to reclaim space lost due to fragmentation

• Automate checks of log files for known red flags

• Regularly review data throughput rate, storage growth rate, and
  overall business growth graphs to inform capacity planning.

• For HA testing, periodically step down the primary to force failover
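The automated log check from the list above can be as simple as a set of regular expressions run over new log lines. This is a minimal sketch: the function name and the three patterns are illustrative assumptions, not a list taken from the talk, and should be replaced with the messages that matter in your own mongod logs.

```python
import re

# Illustrative patterns only; tune them to the messages your own logs emit.
RED_FLAGS = {
    "assertion": re.compile(r"assert", re.IGNORECASE),
    "resync": re.compile(r"resync", re.IGNORECASE),
    "open_files": re.compile(r"too many open files", re.IGNORECASE),
}


def scan_log(lines):
    """Return (line_number, label, text) for every line matching a red-flag pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in RED_FLAGS.items():
            if pattern.search(line):
                hits.append((number, label, line.rstrip("\n")))
    return hits


# Made-up sample lines standing in for a real log file.
sample = [
    "connection accepted\n",
    "Assertion failure in foo\n",
    "replSet initial resync starting\n",
]
findings = scan_log(sample)
```

In practice the input would be an open log file, and any non-empty result would raise an alert in the monitoring system.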
Indexing Patterns or “Know Your App”

•   Proper indexing is critical to performance at scale
    (monitor slow queries to catch non-performant requests)
•   MongoDB is ultimately flexible, being schemaless
    (mongo gives you enough rope to hang yourself, choose wisely)
•   Avoid un-indexed queries at all costs
    (it's the quickest way to crater your app... consider --notablescan)
•   Onus on DevOps to match application to indexes
    (know your query profile, never assume)
•   Shoot for 'covered queries' wherever possible
    (answer can be obtained from indexes only)
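A query is covered when every field it filters on and every field it returns is part of a single index, and that includes `_id`, which is returned unless the projection excludes it. The sketch below encodes only that basic rule as an audit helper. It is a simplification that ignores cases such as multikey indexes, and the function and field names are hypothetical.

```python
def is_covered(index_keys, filter_fields, projection):
    """Rough check of whether a query could be answered from one index alone.

    `projection` is a find()-style dict mapping field names to 1 or 0.
    """
    indexed = set(index_keys)
    returned = {field for field, flag in projection.items() if flag and field != "_id"}
    if not returned:
        return False  # no inclusion projection means whole documents are fetched
    if projection.get("_id", 1):
        returned.add("_id")  # _id comes back by default unless explicitly excluded
    return set(filter_fields) <= indexed and returned <= indexed


index = ["user_id", "created"]
covered = is_covered(index, ["user_id"], {"created": 1, "_id": 0})
not_covered = is_covered(index, ["user_id"], {"created": 1})  # _id not in index
```

Forgetting to exclude `_id` is the usual reason an otherwise well-indexed query still touches documents.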
Capped Collections

• Use standard capped collections for retaining a fixed amount
  of data.  Uses a FIFO strategy for pruning.
  (based on data size, not number of rows)

• TTL Collections (2.2) age out data based on a retention time
  configuration.
  (great for data retention requirements of all types)

  Gotcha!
  Explicitly create the capped collection before any data is put
  into the system to avoid auto-creation of collection
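The gotcha above suggests a deploy-time step that creates the capped collection before the application writes anything. Here is a minimal sketch of such a step, assuming `db` is a PyMongo `Database` object (it only needs `list_collection_names()` and `create_collection()`); the function name, collection name, and size are hypothetical.

```python
def ensure_capped(db, name, size_bytes):
    """Create `name` as a capped collection unless a collection of that name exists.

    Returns True when the collection was created, False when it was already there.
    An existing collection is not checked for being capped; that is left out
    of this sketch.
    """
    if name in db.list_collection_names():
        return False
    # The cap is expressed in bytes, matching the "data size, not rows" note above.
    db.create_collection(name, capped=True, size=size_bytes)
    return True
```

With a real connection this might be called as `ensure_capped(client["logs"], "api_log", 512 * 1024 * 1024)` from a provisioning script, before the application is started.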
Lessons Learned

• Upgraded to Mongo 2.2 with a capped collection that had been created in 1.8.4.  This severely
     impacted replication (root cause: no "_id" index; fix: add "_id" index)

• Never start mongo when a mount point is missing or incorrectly configured. Mongo may
     decide to take matters into its own hands and resync itself with the replica set.  Make
     sure your devops and your hosting provider admins are aware of this

• Some drivers that use connection pooling can freak the freaky freak when the primary
     member changes (older pymongo).  Kicking the application can fix it; also: upgrade drivers

• High locked % is a big red flag, and can be caused by a large number of simultaneous DML
      actions (high insert rate, high update rate). Consider this in the design phase.

• Be wary of automation that can change the state of a node during maintenance mode.
      Disable automation agents for reduced risk during critical administrative operations
      (filesystem maint, etc.)
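The mount point lesson can be enforced with a small pre-start guard in the init or automation script. The sketch below is one possible shape for it; the function name and the example paths are assumptions, not the tooling used at MMF.

```python
import os


def safe_to_start(dbpath, mount_point):
    """Return True only if the data volume is mounted and the dbpath exists on it.

    If the mount is missing, mongod would see an empty directory on the root
    filesystem and could begin a full resync from the replica set.
    """
    return os.path.ismount(mount_point) and os.path.isdir(dbpath)


# Hypothetical layout: data volume mounted at /data, dbpath beneath it.
if not safe_to_start("/data/mongodb", "/data"):
    print("refusing to start mongod: data volume is not mounted")
```

An init script would exit non-zero at this point instead of printing, so the service manager does not launch the daemon.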
Thank you
chris@mapmyfitness.com