This presentation introduces MongoDB, an open source, high performance database, and explains how to run a pilot project with it. It covers finding a non-critical initial project, gaining experience with MongoDB, benchmarking performance, and presenting the business case for broader use. It also outlines steps for moving a successful pilot into production, including MongoDB's auto-sharding, replication, and commercial support options.
How to Get Started with Your MongoDB Pilot Project
1. Open source, high performance database
How to get started with your MongoDB Pilot Project
Jared Rosoff (@forjared)
Spring 2012
2. Agenda
1. Why use MongoDB?
2. Finding a first project
3. Getting good at MongoDB
4. Making the business case
5. Into Production
3. AGILE DEVELOPMENT
• Iterative & continuous
• New and emerging Apps
VOLUME AND TYPE OF DATA
• Trillions of records
• 10’s of millions of queries per second
• Volume of data
• Semi-structured and unstructured data
NEW ARCHITECTURES
• Systems scaling horizontally, not vertically
• Commodity servers
• Cloud Computing
4. DEVELOPER PRODUCTIVITY DECREASES
• Needed to add new software layers of ORM, Caching, Sharding, and Message Queue
• Polymorphic, semi-structured and unstructured data not well supported
COST OF DATABASE INCREASES
• Increased database licensing cost
• Vertical, not horizontal, scaling
• High cost of SAN
[Timeline figure. Milestones: LAUNCH, +30 DAYS, +90 DAYS, +6 MONTHS, +1 YEAR. Steps: PROJECT START, DENORMALIZE DATA MODEL, STOP USING JOINS, CUSTOM CACHING LAYER, CUSTOM SHARDING. Labels: INCREASES COMPLEXITY, LOWERING PRODUCTIVITY; COSTS.]
5. • Document-oriented Storage
– Based on JSON Documents
– Schema-less
• Scalable Architecture
– Auto-sharding
– Replication & high availability
• Open source, written in C++
• Key Features Include:
– Full featured indexes
– Query language
– Map/Reduce & aggregation
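The Map/Reduce feature listed above can be illustrated without a running server. The sketch below is a minimal illustration, not MongoDB's implementation: it runs a map function and a reduce function over an in-memory array of documents to count tags. The `runMapReduce` helper and the sample posts are hypothetical. In MongoDB the map function reads the document from `this` and calls a server-provided `emit`; this sketch passes both in as arguments instead.

```javascript
// Hypothetical sample documents, shaped like blog posts with a tags array.
const posts = [
  { author: "roger", tags: ["Tezuka", "Manga"] },
  { author: "fred", tags: ["Manga"] },
  { author: "bill", tags: [] },
];

// Map step: emit one (tag, 1) pair per tag on a document.
function mapTags(doc, emit) {
  for (const tag of doc.tags) {
    emit(tag, 1);
  }
}

// Reduce step: combine all values emitted for one key into a single count.
function sumCounts(key, values) {
  return values.reduce((total, v) => total + v, 0);
}

// Hypothetical local driver: groups emitted values by key, then reduces each group.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  for (const doc of docs) {
    map(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduce(key, values);
  }
  return result;
}

const tagCounts = runMapReduce(posts, mapTags, sumCounts);
console.log(tagCounts); // { Tezuka: 1, Manga: 2 }
```

The same split applies on a real deployment: the map function decides what to count per document, and the reduce function must be able to combine any list of values emitted for one key.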
6. #2 on Indeed’s Fastest Growing Jobs
Jaspersoft BigData Index: “Demand for MongoDB, the document-oriented NoSQL database, saw the biggest spike with over 200% growth in 2011.”
451 Research: “MongoDB increasing its dominance”
Google Searches
7. Agenda
1. Why use MongoDB?
2. Finding a first project
3. Getting good at MongoDB
4. Making the business case
5. Into Production
8. • Content Management
• Operational Intelligence
• Product Data Management
• User Data Management
• High Volume Data Feeds
9. Characteristic | Challenges | MongoDB Solution
High throughput | Lots of reads; lots of writes | Sharding + Replication
Data variability | Variable fields in objects; object fields change over time; hard to model in relational | Document Data Model
High availability | Automatic failover; multi-data center deployments | Replica Sets + Tagging
Low latency | Fast response time; working set larger than RAM | Memory Mapped Storage
Large volumes of data | Spread data over lots of disks; tolerance of partial failures | Sharding + Replication
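One way to picture how sharding spreads data over many disks: a router maps each document's shard key to a chunk (a range of key values), and each chunk lives on one shard. The sketch below is a simplified illustration of range-based routing, not MongoDB's actual routing code. The `chunks` table, the `routeToShard` function, the shard names, and the key ranges are all hypothetical.

```javascript
// Hypothetical chunk table: each chunk covers keys in [min, max) and lives on one shard.
const chunks = [
  { min: "", max: "h", shard: "shard0" },
  { min: "h", max: "p", shard: "shard1" },
  { min: "p", max: "\uffff", shard: "shard2" },
];

// Find the chunk whose range contains the shard key and return its shard.
function routeToShard(shardKey) {
  const chunk = chunks.find((c) => shardKey >= c.min && shardKey < c.max);
  if (!chunk) throw new Error(`no chunk covers key: ${shardKey}`);
  return chunk.shard;
}

for (const author of ["bill", "jared", "roger"]) {
  console.log(author, "->", routeToShard(author));
}
```

Because each key falls in exactly one range, a query that includes the shard key can be sent to a single shard, while a query without it has to be sent to all of them.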
10. • Look for non-customer facing use cases
– Log aggregation
– Counters & statistics
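Counters and statistics map naturally onto MongoDB's atomic update operators such as `$inc`. The sketch below builds the filter and update documents for a per-page daily hit counter and, so that it runs without a server, applies them to an in-memory `Map` with a small stand-in function. The key format, the `hits` field, and the `hitCounterUpdate` and `applyIncUpsert` helpers are hypothetical. With a real driver, the same two documents would typically be passed to an update call with the upsert option enabled.

```javascript
// Build the filter and update documents for one page view.
function hitCounterUpdate(page, day) {
  return {
    filter: { _id: `${page}:${day}` },
    update: { $inc: { hits: 1 } },
  };
}

// Stand-in for an upserting update: supports only $inc on top-level fields.
function applyIncUpsert(store, { filter, update }) {
  const doc = store.get(filter._id) ?? { _id: filter._id };
  for (const [field, amount] of Object.entries(update.$inc)) {
    doc[field] = (doc[field] ?? 0) + amount;
  }
  store.set(filter._id, doc);
  return doc;
}

const store = new Map();
applyIncUpsert(store, hitCounterUpdate("/home", "2012-05-01"));
applyIncUpsert(store, hitCounterUpdate("/home", "2012-05-01"));
applyIncUpsert(store, hitCounterUpdate("/about", "2012-05-01"));
console.log(store.get("/home:2012-05-01").hits); // 2
```

Keeping one document per page per day means each page view is a single-document increment, which is the kind of operation a pilot can benchmark easily.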
11. {
  _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "roger",
  date : "Sat Jul 24 2010 19:47:11",
  text : "Spirited Away",
  tags : [ "Tezuka", "Manga" ],
  comments : [
    { author : "Fred",
      date : "Sat Jul 24 2010 20:51:03",
      text : "Best Movie Ever" },
    { author : "Bill",
      date : "Sat Jul 24 2010 21:13:23",
      text : "No Way !!" }
  ]
}
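Because the comments are embedded in the post, one read returns everything needed to render the page. The sketch below writes the slide's document as a plain JavaScript object and checks it against two simple equality filters. The `matches` function is a hypothetical stand-in for a query engine that handles only top-level equality, where an array field matches if it contains the value. The `_id` is kept as a string here, whereas in MongoDB it would be an ObjectId.

```javascript
// The post from the slide, written as a plain JavaScript object.
const post = {
  _id: "4c4ba5c0672c685e5e8aabf3",
  author: "roger",
  date: "Sat Jul 24 2010 19:47:11",
  text: "Spirited Away",
  tags: ["Tezuka", "Manga"],
  comments: [
    { author: "Fred", date: "Sat Jul 24 2010 20:51:03", text: "Best Movie Ever" },
    { author: "Bill", date: "Sat Jul 24 2010 21:13:23", text: "No Way !!" },
  ],
};

// Hypothetical matcher for simple equality filters such as { tags: "Manga" }.
// An array field matches when it contains the wanted value.
function matches(doc, filter) {
  return Object.entries(filter).every(([field, wanted]) => {
    const actual = doc[field];
    return Array.isArray(actual) ? actual.includes(wanted) : actual === wanted;
  });
}

// The embedded comments come back with the post, so no second lookup is needed.
const commentAuthors = post.comments.map((c) => c.author);

console.log(matches(post, { tags: "Manga" }));  // true
console.log(matches(post, { author: "fred" })); // false
console.log(commentAuthors);                    // [ 'Fred', 'Bill' ]
```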
12. Blog Platform
Blogger
– Publish a Blog Post
– Moderate Comments
Reader
– Read a Blog Post
– Submit a comment
• Can I express them as atomic operations?
• Do they make sense with my data model?
• Do I need strong consistency?
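To make the first question concrete, the sketch below expresses two of the blog operations as single-document updates: submitting a comment as a `$push` and moderating (removing) comments as a `$pull`. `$push` and `$pull` are MongoDB update operators; everything else is an assumption for illustration, including the `submitComment`, `removeCommentsBy`, and `applyUpdate` names. `applyUpdate` is a small in-memory stand-in so the example runs without a server, and it understands only these two operators.

```javascript
// Update documents for two of the blog operations (field names follow the sample post).
function submitComment(author, text, date) {
  return { $push: { comments: { author, date, text } } };
}

function removeCommentsBy(author) {
  return { $pull: { comments: { author } } };
}

// Hypothetical in-memory stand-in that understands only $push and $pull.
function applyUpdate(doc, update) {
  for (const [field, value] of Object.entries(update.$push ?? {})) {
    doc[field] = [...(doc[field] ?? []), value];
  }
  for (const [field, cond] of Object.entries(update.$pull ?? {})) {
    // Drop array elements whose fields all equal the condition's fields.
    doc[field] = (doc[field] ?? []).filter(
      (item) => !Object.entries(cond).every(([k, v]) => item[k] === v)
    );
  }
  return doc;
}

const blogPost = { author: "roger", text: "Spirited Away", comments: [] };
applyUpdate(blogPost, submitComment("Fred", "Best Movie Ever", "Sat Jul 24 2010 20:51:03"));
applyUpdate(blogPost, submitComment("Bill", "No Way !!", "Sat Jul 24 2010 21:13:23"));
applyUpdate(blogPost, removeCommentsBy("Bill"));
console.log(blogPost.comments.map((c) => c.author)); // [ 'Fred' ]
```

If every operation in the pilot can be written as one update against one document like this, the use case is a good fit for the data model.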
13. • Can I quantify my requirements?
• Can I benchmark my solution?
• Do I have anything to compare it to?
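A pilot is easier to evaluate when requirements are stated as numbers, for example a 95th-percentile latency target. The sketch below is one possible harness, assuming Node.js: it times repeated calls of an async operation and reports latency percentiles. The `benchmark` and `percentile` functions and the `fakeWrite` placeholder are hypothetical; in a real comparison the placeholder would be replaced by the same insert or query against each candidate system.

```javascript
// Nearest-rank percentile over a list of latencies in milliseconds.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Time `iterations` sequential calls of an async operation.
async function benchmark(operation, iterations) {
  const latencies = [];
  for (let i = 0; i < iterations; i++) {
    const start = process.hrtime.bigint();
    await operation(i);
    latencies.push(Number(process.hrtime.bigint() - start) / 1e6);
  }
  return {
    count: latencies.length,
    p50: percentile(latencies, 50),
    p95: percentile(latencies, 95),
    max: Math.max(...latencies),
  };
}

// Placeholder operation; swap in a real insert or query for each system under test.
const fakeWrite = () => new Promise((resolve) => setImmediate(resolve));

benchmark(fakeWrite, 100).then((stats) => console.log(stats));
```

Running the same harness against the existing system gives the baseline the last bullet asks for.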
14. Agenda
1. Why use MongoDB?
2. Finding a first project
3. Getting good at MongoDB
4. Making the business case
5. Into Production
15. May 2012
May 1-2 - Data Innovations Analyst Briefing - Atlanta, GA
May 1 - Cloud Foundry Open - London, UK
May 3 - Big Panel on Big Data - Atlanta, GA
May 3 - MongoSF Workshops - San Francisco, CA
May 4 - MongoSF - San Francisco, CA
May 7-10 - DISA Federal Event - Tampa, FL
May 8 - Insight Partners Technology Forum - New York, NY
May 9 - Progressive NoSQL - London, UK
May 9 - Emerging Business Tech - Boston, MA
May 10 - Webinar: MongoDB's New Aggregation Framework
May 14 - MongoDB Oslo (Free Evening Meetup) - Oslo, Norway
May 15-16 - flatMap Oslo - Oslo, Norway
May 15 - Carahsoft Webinar: Building your first MongoDB Application
May 15 - VLAB NoSQL Panel - Palo Alto, CA
May 15 - MongoDB Pittsburgh (Free Evening Meetup) - Pittsburgh, PA
May 16 - Grails Meetup - London, UK
May 16 - Open Analytics Meetup - New York, NY
May 16 - London Java User Group - London, UK
May 17 - Webinar: MongoDB for Content Management
May 18 - Walkabout NYC - New York, NY
May 18-19 - PHP Day - London, UK
May 19-20 - JSConf.ar - Buenos Aires, Argentina
May 22 - MongoNYC Workshops - New York, NY
May 23 - MongoNYC - New York, NY
May 24 - Glue Conference - Denver, CO *Max Keynoting
May 24 - Webinar: Building Web Services with MongoDB, Node.JS, and Openshift
May 24-25 - GOTO Conference - Amsterdam, NL
May 25-26 - FLOSS Conf - London, UK
May 29 - NoSQL Matters - London, UK
May 31 - Seedhack - London, UK
June 2012
June 1-2 - Euruko 2012 - Amsterdam, NL (Pending talk acceptance)
June 3-6 - International PHP Conference - Berlin, DE
June 4-5 - Berlin Buzzwords - Berlin, DE
June 4 - Berlin MUG - Berlin, DE
June 4 - Django Con EU (Community Member Attending) - Italy
June 6-8 - NDC - Oslo, Norway
June 6 - Prague MUG - Prague, CZE
June 7-8 - Dutch PHP Conference - Amsterdam, NL
June 7 - PyCon Asia Pacific - Singapore
June 8-9 - PyGotham - New York, NY *Eliot Keynoting
June 8-10 - South East Linux Fest - Charlotte, NC
June 9-10 - PHP Conference - Moscow, RUS
June 12 - Dataversity Webinar - Topic TBD
June 13 - MongoDB Paris Workshops - Paris, FR
June 13 - Rightscale Conference - New York, NY
June 13-14 - Hadoop Summit - San Jose, CA
June 14 - MongoDB Paris - Paris, FR
June 14-15 - WindyCityDB - Chicago, IL
June 18-20 - QCon - New York, NY
June 19 - MongoDB UK Workshops - London, UK
June 20 - MongoDB UK - London, UK
June 20-21 - Gigaom Structure - San Francisco, CA
June 21 - Webinar: MongoDB + Hadoop: Taming the Elephant in the Room
June 23 - GoRuCo - New York, NY (Crowdtap Speaking)
June 23 - TestFest - Amsterdam, NL
June 25 - MongoDC Workshops - Washington, DC
June 26 - MongoDC - Washington, DC
June 26 - MongoDB at Big Data - Houston, TX
June 26 - Red Hat Developer Day - Boston, MA
June 26-29 - Open Source Bridge - Portland, OR
June 27 - Jazoon - Zurich, CH
June 27 - SVforum Software Architecture & Platform SIG - Mountain View, CA
June 29-30 - Lone Star PHP Conference - Dallas, TX
July 2012
July 1 - SPA Conference (London)
July 2 - PyCon Italia (Italy)
July 3 - MongoDB Essentials Training (London)
July 10 - Dataversity Webinar (Topic TBD)
July 11 - MongoDB Essentials Training (China)
July 11 - Online Conference
July 12 - Carahsoft Webinar
July 13 - MongoDB Sao Paulo (Brazil)
July 14 - Gotham.js (NYC)
July 16 - MongoDB Essentials Training (Japan)
July 16 - OSCON (Portland, OR)
July 17 - MongoDB Essentials Training (Palo Alto)
July 19 - C# Webinar
July 24 - NYC MUG
July 24 - SF MUG
July 25 - 578 Broadway Startup Tour (NYC)
July 25 - MongoDB Essentials Training (Sydney, AUS)
July 25 - MongoDB San Diego (CA)
July 30 - MongoDB Essentials Training (Melbourne, AUS)
July 31 - MongoDB Essentials Training (NYC)
TBA Last Week of Month - MongoDB Israel
17. New York | Wednesdays | 4pm-6:30pm | 578 Broadway
San Francisco | Every other Thursday | 5pm-7pm | Epicenter Café, 764 Harrison St
Palo Alto | Thursdays | 4pm-6pm | 555 University Ave
Atlanta | 2nd Tuesday of the month | 4pm-6pm | 1736 Defoor Pl NW
29. Commercial Support
SUBSCRIPTIONS
developer and production support, commercial
license and MongoDB Subscriber Edition
CONSULTING
expertise on a project basis
TRAINING
for developers and administrators
“MediaMath is growing fast and our data volume throughput requirements are going up very quickly. MongoDB and 10gen have been extremely helpful partners for us in scaling our data infrastructure.”
Vince Li
30. MongoDB Monitoring Service
• SaaS solution providing instrumentation and visibility into MongoDB systems
• Included in the 10gen commercial subscriptions
• Deployed to most customers
• Free version released
• 6,500+ customers using service
“After adding MMS to our cluster, 10gen’s engineers detected an anomaly in our production deployment and proactively reached out to us to fix the problem before it became a production incident.”
Ray Howell, Vice President of Architecture
31. Agenda
1. Why use MongoDB?
2. Finding a first project
3. Getting good at MongoDB
4. Making the business case
5. Into Production