Video and slides synchronized, mp3 and slide download available at http://bit.ly/17ohqr5.
Matt Asay outlines the reasons for NoSQL's existence and persistence, and will identify the trends that point to a bright future for post-relational databases. Filmed at qconlondon.com.
Matt Asay is the VP of Corporate Strategy at 10gen (the MongoDB company). Before 10gen, Matt was Vice President of Business Development at Nodeable (acquired by Appcelerator), a provider of real-time data stream processing for big data applications, and also at Strobe (acquired by Facebook). Twitter: @mjasay
What's New in Teams Calling, Meetings and Devices March 2024
The Past, Present, and Future of NoSQL
1. NoSQL: The New Normal
Matt Asay (@mjasay)
VP, Corporate Strategy, 10gen
2. Watch the video with slide
synchronization on InfoQ.com!
http://www.infoq.com/presentations
/nosql-history-future
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
3. Presented at QCon London
www.qconlondon.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
4. 10gen Overview
180+ employees 500+ customers
Offices in New York, Palo Alto, Washington
Over $81 million in funding DC, London, Dublin, Barcelona and Sydney
2
7. Back to the Future?
What has been will be again,
what has been done will be done again;
there is nothing new under the sun.
(Ecclesiastes 1:9, NIV)
5
10. Back to the Future: PreSQL
• IBM’s IMS (1969) – Developed
as part of the Apollo Project
• IDS (Integrated Data Store),
navigational database, 1973
• High performance but:
– Forced developers to worry
about both query design and
schema design upfront
– Made it hard to change anything
mid-stream
8
11. Enter SQL
• Designed to overcome “PreSQL” deficiencies
– Decoupled query design from schema design
– Allowed developers to focus on schema design
– Could be confident that you could query the data as
you wanted later
• 30 years of dominance later…
9
17. It Makes Development Hard
Code XML Config DB Schema
Object Relational Relational
Application
Mapping Database
15
18. And Makes Things Hard to Change
New
Table New
Column
New
Table
Name Pet Phone Email
New
Column
3 months later…
16
19. RDBMS Scale = Bigger Computers
“Clients can also opt to run zEC12 without a raised
datacenter floor -- a first for high-end IBM mainframes.”
IBM Press Release 28 Aug, 2012
17
20. This Was a Problem for Google
2010 Search Index Size:
250,000+ MBP’s == 4.1 miles
100,000,000 GB
New data added per day
100,000+ GB
Databases they could use
0
18
Source: http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html
22. And for Facebook
2010: 13,000,000 queries per second
TPC #1 DB: 504161 tps
TPC Top Results
20
23. And for Facebook
2010: 13,000,000 queries per second
TPC #1 DB: 504161 tps
TPC Top Results
Top 10 combined: 1,370,368 tps
21
24. Big Data Changes Everything…
for Everyone
Variety of Data Agile Development
• Unstructured data • Iterative
• Semi-structured • Short development
data cycles
• Polymorphic data • New workloads
Volume/Velocity of Data New Architectures
• Petabytes of data • Horizontal scaling
• Trillions of records • Commodity servers
• Millions of queries per • Cloud computing
second
22
26. Living in the Post-transactional Future
Order-processing systems largely “done” (RDBMS);
primary focus on better search and recommendations
or adapting prices on the fly (NoSQL)
Vast majority of its engineering is focused on
recommending better movies (NoSQL), not
processing monthly bills (RDBMS)
Easy part is processing the credit card (RDBMS).
Hard part is making it location aware, so it knows
where you are and what you’re buying (NoSQL)
24
27. Shift in How We Develop Applications
“Systems of Engagement are built by front-line
developers using modern languages who are
driven by time to market, the need for rapid
deployment and iteration….They value solutions
that make it easy for them to deploy their
application code with as little friction as
possible.” (Forrester 2013)
25
35. Agility
RDBMS MongoDB
{
_id : ObjectId("4c4ba5e5e8aabf3"),
employee_name: "Dunham, Justin",
department : "Marketing",
title : "Product Manager, Web",
report_up: "Neray, Graham",
pay_band: “C",
benefits : [
{ type : "Health",
plan : "PPO Plus" },
{ type : "Dental",
plan : "Standard" }
]
}
33
36. Example
Serves targeted content to users using MongoDB-
powered identity system
Problem Why MongoDB Results
• 20M+ unique visitors • Easy-to-manage • Rapid rollout of new
per month dynamic data model features
enables limitless
• Rigid relational schema growth, interactive • Customized, social
unable to evolve with conversations
content
changing data types throughout site
and new features • Support for ad hoc
• Tracks user data to
queries
• Slow development increase engagement,
cycles • Highly extensible revenue
34
37. Scalability
Auto-Sharding
• Increase capacity as you go
• Commodity and cloud architectures
• Improved operational simplicity and cost visibility
35
38. Case Study
Manages a wide range of content and services
for its web properties using MongoDB
Problem Why MongoDB Results
• Trouble dealing with a • Move from 6 billion rows • Significant performance
huge variety of content in RDBMS to simplicity gains despite big
• MySQL unable to keep of 1 document increase in volume and
up with performance variety of data
• Automated failover and
and scalability
requirements ability to add nodes • Greater agility, faster
without downtime development iteration
• Problems compounded
by integrating • “Blazingly fast” query • Saved £2m in licenses
information from T- performance: “blown and hardware
Mobile joint venture away by [MongoDB’s]
performance”
36
39. Better Total Cost of Ownership (TCO)
Developer/Ops Savings
• Ease of Use
• Agile development
• Less maintenance
Hardware Savings
• Commodity servers
• Internal storage (no SAN)
• Scale out, not up
Software/Support Savings DB Alternative
• No upfront license
• Cost visibility for usage growth
37
40. Case Study
Stores one of world’s largest record repositories and
searchable catalogues in MongoDB
Problem Why MongoDB Results
• One of world’s largest • Fast, easy scalability • Will scale to 100s of TB
record repositories by 2013, PB by 2020
• Full query language
• Move to SOA required • Searchable catalogue
new approach to data • Complex metadata
of varied data types
store storage
• Decreased SW and
• RDBMS could not • Delivers high scalability,
support costs
support centralized data fast performance, and
mgt and federation of easy maintenance, while
information services keeping support costs low
38
42. Case Study
Uses MongoDB to safeguard over 6 billion images
served to millions of customers
Problem Why MongoDB Results
• 6B images, 20TB of • JSON-based data • 5x cost reduction
data model
• 9x performance
• Brittle code base on top • Agile, high improvement
of Oracle database – performance, scalable
hard to scale, add • Faster time-to-market
features • Alignment with
• Dev cycles in weeks vs.
Shutterfly’s services-
• High SW and HW costs tens of months
based architecture
40