Using MongoDB for IGN’s Social Platform
SF Bay Area MongoDB User Group
Tuesday, Feb 15th, 2011
About Me
  • Manish Pandit
  • @lobster1234
  • http://about.me/mpandit
About IGN’s Social Platform
  • An API to connect the gamer community with editors, games, and other gamers, and to help lay the foundation for premium content discovery as well as user-generated content (UGC)
  • In beta since Sept 2010
  • 5M+ activities
  • 20K unique visitors (UVs) a day, ~100K page views (PVs) a day
Architecture
  • REST-based API, built in Java
  • Entities are People, MediaItems, Activities, Comments, Notifications, Status
  • Interfaces across IGN.com as well as other social networks
  • Caching tier based on memcached
  • MySQL and MongoDB as persistence
  • PHP/Zend front end
MongoDB Usage
  • Activity Streams: ActivityStrea.ms standard
  • Activity Caching (more on this later!)
  • Activity Commenting
  • Points: also extend to badges
  • Blocklists, ban lists
  • Notifications: system notifications
  • Analytics: activity snapshot for a user
Alternatives
  • MySQL
      • The obvious alternative, already used for storing person data, game data, and relationships
      • Did not work for activities
          • Massive joins to filter newsfeeds, i.e. activities from friends
          • Fairly normalized schema for activities
          • Too many schema changes as requirements changed and new types of activities came into the picture; ALTER TABLE started to take hours
          • Optimization led to a large number of indexes, slowing down writes
Alternatives
  • Voldemort
      • Used for the initial release, Sept 2010
      • Fast and simple implementation of Amazon Dynamo
      • Did not work out for long
          • We needed the ability to query the data
          • Needed more than key-value pairs
          • No in-place updates out of the box; we had to write custom code to handle concurrent update conflicts (read-repair)
          • Not a lot of developer velocity when compared to MongoDB
Other alternatives
  • Cassandra
      • Learning curve, lack of querying
      • Did not want to bite off more than we could chew
  • CouchDB
      • Map-reduce queries, views
      • REST-based API is good, but performance suffers from a chatty HTTP interface for a database
Configuration
  • Server
      • 1 master, 2 slaves (load balanced through NetScaler)
      • 2 extra slaves which are not queried (replicate!!)
      • Version 1.6.1
  • Client
      • Java Driver (2.1)
      • Ruby Driver (1.2)
      • Mappers: Morphia for Java
      • Connections per host: 200, #hosts = 4
  • Oplog size: 1GB, about 2.5 hours
  • Syncdelay: 60s (default)
  • Hardware: 2 core, 6 GB virtualized machine
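The oplog and connection figures above can be turned into rough capacity numbers. This is a back-of-the-envelope sketch only: it assumes 1 GB means 1024 MB, that the 1 GB oplog is filled in exactly the quoted 2.5 hours, and that "#hosts = 4" multiplies the per-host connection limit. None of these readings is confirmed by the slide.

```python
# Back-of-the-envelope numbers derived from the configuration slide.
# Assumptions (not stated on the slide): 1 GB = 1024 MB, the oplog window
# is exactly 2.5 hours, and the per-host connection limit applies to 4 hosts.
OPLOG_SIZE_MB = 1024
OPLOG_WINDOW_HOURS = 2.5
CONNECTIONS_PER_HOST = 200
HOSTS = 4


def oplog_rate_mb_per_hour(size_mb: float, window_hours: float) -> float:
    """Average oplog volume written per hour if the window is accurate."""
    return size_mb / window_hours


def max_connections(per_host: int, hosts: int) -> int:
    """Upper bound on connections across all hosts."""
    return per_host * hosts


rate = oplog_rate_mb_per_hour(OPLOG_SIZE_MB, OPLOG_WINDOW_HOURS)
total = max_connections(CONNECTIONS_PER_HOST, HOSTS)
print(f"oplog churn: ~{rate:.0f} MB/hour")
print(f"connection ceiling: {total}")
```

The practical reading of the oplog window: a slave that stays offline longer than roughly 2.5 hours can no longer catch up from the oplog and needs a full resync.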
Maintenance
  • Data defragmentation
      • Slaves: by running it on a different port
      • Master: by having a downtime
  • Collection trimming
      • The scripts block during remove
      • Bulk removes kill the slaves, spiking CPU to 100%
Monitoring
  • Nagios
      • TCP port monitoring
      • Disk space monitoring
      • CPU monitoring
  • Munin
      • Mongo connections
      • Memory usage
      • Ops/second
      • Write lock %
      • Collection sizes (in terms of # of documents)
Backup or prepping for O Shit!
  • NetApp filer-based snapshots
      • Make sure to do {fsync:1} and {lock:1} on one slave
  • Hourly dumps via cronjob
      • Using mongodump
  • Incremental backup via the oplog
      • Replay the oplog instead of relying on a snapshot
  • Delayed slaves
      • Not recommended as it almost guarantees data loss proportional to the delay, which is inversely proportional to the time-to-react
Tools to be familiar with
  • mongostat: look at queue lengths, memory, connections and operation mix
  • db.serverStatus(): server status with sync, page faults, locks, index misses
  • atop
  • iostat
  • db.stats(): overall info at the database level
  • db.<coll_name>.stats(): overall info at the collection level
  • db.printReplicationInfo(): info about the oplog size and time
  • db.printSlaveReplicationInfo(): info about the master, the last sync timestamp, and how far behind the master the slave is
Challenges with ActivityStreams
  • Lots of data!
      • Large amount of data coming out as a result
  • Reverse sorting
      • The data has to be sorted in reverse natural order ($natural : -1), and we do not use capped collections
  • Aggregation of similar activities
      • Impacts pagination
  • Fetching self activities (profile), and newsfeed (self + others)
  • Filtering based on the activity type
      • People want to see Game Updates or Blog Updates from their friends
  • Hydration of activities for dynamic data
      • The thumbnail and level of the actor may change
  • Comments
      • When an activity is rendered, the initial comments and the comment count have to be pulled ($slice)
      • TODO: Rant about missing $size operator
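A minimal in-memory sketch of how several of these requirements combine in one newsfeed read. A plain Python list stands in for the collection, with list order playing the role of natural (insertion) order; the field names (actor, type, comments) and the newsfeed function are hypothetical, not the platform's schema or API. Aggregation of similar activities and hydration are left out.

```python
from typing import Dict, List, Optional, Set


def newsfeed(
    activities: List[Dict],
    user: str,
    friends: Set[str],
    activity_type: Optional[str] = None,
    page: int = 0,
    page_size: int = 20,
    initial_comments: int = 2,
) -> List[Dict]:
    """Return one page of the newsfeed (self + friends), newest first."""
    actors = friends | {user}
    matching = [
        a for a in activities
        if a["actor"] in actors
        and (activity_type is None or a["type"] == activity_type)
    ]
    # List order stands in for natural order, so reversing it mimics
    # a sort on {$natural: -1}.
    matching.reverse()
    start = page * page_size
    return [
        {
            **a,
            # Only the first few comments, in the spirit of a $slice projection.
            "comments": a["comments"][:initial_comments],
            # The total count is computed in application code in this sketch.
            "comment_count": len(a["comments"]),
        }
        for a in matching[start:start + page_size]
    ]


activities = [
    {"actor": "alice", "type": "GAME_UPDATE", "comments": ["c1", "c2", "c3"]},
    {"actor": "bob", "type": "BLOG_UPDATE", "comments": []},
    {"actor": "carol", "type": "GAME_UPDATE", "comments": ["c4"]},
    {"actor": "bob", "type": "GAME_UPDATE", "comments": []},
]
feed = newsfeed(activities, user="alice", friends={"bob"}, activity_type="GAME_UPDATE")
```

Passing friends=set() gives the profile view (self activities only); the same function covers both reads.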
ActivityStreams
  • Each activity has an ACTOR
  • Each actor has a TYPE
  • Each actor performs an action; that action is called a VERB
  • Each VERB can act upon many objects, called ACTIVITYOBJECTS
  • Some VERBs may involve a target, called ACTIVITYTARGET
  • Every entity (Actor, ActivityObject, ActivityTarget) has links to define it
  • Examples:
      • A writes ‘Hello!’ on B’s wall
          Actor => A, ActivityObject => ‘Hello!’ of type WALL_POST, ActivityTarget => B, VERB => POST
      • A follows a game B
          Actor => A, ActivityObject => B of type MEDIA_ITEM, ActivityTarget => null, VERB => FOLLOW
  • ...and it gets complicated as we go down the rabbit hole!
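The two examples above can be written down as data. This is an illustrative sketch, not the platform's actual schema: the Entity and Activity classes, their field names, and the PERSON actor type are assumptions made for the example; WALL_POST, MEDIA_ITEM, POST and FOLLOW come from the slide.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    """An actor, activity object, or activity target."""
    id: str
    type: str
    links: List[str] = field(default_factory=list)  # links that define the entity


@dataclass
class Activity:
    actor: Entity
    verb: str
    objects: List[Entity]            # a verb can act upon many objects
    target: Optional[Entity] = None  # only some verbs involve a target

    def to_document(self) -> dict:
        """Nested dict form, the kind of shape a document store would hold."""
        return asdict(self)


# A writes 'Hello!' on B's wall ("PERSON" is a hypothetical actor type).
wall_post = Activity(
    actor=Entity("A", "PERSON"),
    verb="POST",
    objects=[Entity("Hello!", "WALL_POST")],
    target=Entity("B", "PERSON"),
)

# A follows a game B: no target involved.
follow = Activity(
    actor=Entity("A", "PERSON"),
    verb="FOLLOW",
    objects=[Entity("B", "MEDIA_ITEM")],
)

print(follow.to_document())
```

Keeping the whole activity as one nested document is what avoids the joins and schema migrations described for MySQL: a new activity type is just a new verb or object type value.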
Caching using MongoDB
  • Caching the entire streams
      • A bad idea (or bad implementation?)
      • The expired objects sat in the db, bloating the database
      • The removal did not free up space, so we ran out
  • Use Mongo as a cache-key-index
      • Cache the streams in Memcached
      • For invalidation, keep the index of the memcached keys in MongoDB
      • Works!
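A minimal sketch of the cache-key-index idea, with plain dictionaries standing in for both memcached and the MongoDB index collection. The class, the key naming, and the choice to index keys by user id are all assumptions for illustration; the slide does not say how the index is keyed.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set


class StreamCache:
    """Streams live in the cache; the index only remembers which keys exist."""

    def __init__(self) -> None:
        # Stand-in for memcached: cache key -> cached stream.
        self.cache: Dict[str, List[dict]] = {}
        # Stand-in for the MongoDB index: user id -> cache keys to drop
        # when that user's activities change.
        self.key_index: Dict[str, Set[str]] = defaultdict(set)

    def put(self, cache_key: str, stream: List[dict], user_ids: Iterable[str]) -> None:
        self.cache[cache_key] = stream
        for user_id in user_ids:
            self.key_index[user_id].add(cache_key)

    def get(self, cache_key: str) -> Optional[List[dict]]:
        return self.cache.get(cache_key)

    def invalidate_user(self, user_id: str) -> int:
        """Drop every cached stream that depends on this user."""
        keys = self.key_index.pop(user_id, set())
        for key in keys:
            # Deleting a key that is already gone is harmless.
            self.cache.pop(key, None)
        return len(keys)


streams = StreamCache()
# alice's newsfeed contains activities from alice and bob.
streams.put("newsfeed:alice", [{"verb": "POST"}], user_ids=["alice", "bob"])
streams.put("profile:carol", [{"verb": "FOLLOW"}], user_ids=["carol"])

# bob does something new: every stream that includes bob is stale.
dropped = streams.invalidate_user("bob")
```

The index holds only small key lists rather than whole streams, so it avoids the bloat described for caching entire streams in the database.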
What we’ve learned
  • Keep an eye on
      • Page faults
      • Index misses
      • Queue lengths
      • Database sizes on disk due to reuse vs. release
  • Use .explain()
      • Watch for nscanned and indexBounds
  • Use limit() when using find
  • While updating, try to load that object in memory so that it’s in the working set (findAndModify)
  • Try to keep the fields being selected at a minimum
  • Replicate and denormalize instead of using write concerns
Near-term plans
  • Move to replica sets
  • Move relationship graphs to MongoDB
  • Shard the relationships based on the userId
  • Run multiple mongo processes, splitting out collections among multiple databases
Wishlist
  • Respect indexes in $or queries
  • A $size operator for arrays
  • $inc when doing $addToSet
  • Defragmentation when removing data
  • Concurrency: too many write lock conditions
  • A decent start/stop script
  • Load balancing in the driver (round robin) for reads
We are hiring
  • Software Engineers to help us with exciting initiatives at IGN
  • Technologies we use
      • RoR, Java (no J2EE!), Spring, PHP/Zend
      • jQuery, HTML5, CSS3, Sencha Touch, PhoneGap
      • MongoDB, memcached, Solr
  • http://corp.ign.com
Questions
References
  • IGN’s Social Platform
      • http://my.ign.com
      • http://people.ign.com/ign-labs
  • Mongo Munin plugins
      • https://github.com/erh/mongo-munin
      • https://github.com/lobster1234/munin-mongo-collections
  • Morphia
      • http://code.google.com/p/morphia/
