Matt Kalan, Senior Solutions Architect, MongoDB
Matt will explain how modern application demands have changed what we require of the database. Handling agile development, big data, cloud, APIs, continuous availability, and unlimited scale while lowering costs calls for new capabilities. Do you need to tolerate the impedance mismatch between an object model and the relational model, or is there another way? We will walk through the application development process, down to the code level, comparing an RDBMS with MongoDB.
16. What If?
Instead of...                                   You had...
Pre-defined schema                              Dynamic schema determined by your object
Flat data model                                 Object data model
One schema                                      Multiple schemas possible
Each object spread across flat tables           Each object stored together
Scaling up for better performance               Easy to partition & scale horizontally
SAN required and app handling failover          DB & driver handle auto-failover
Manual DB operations                            Built-in automated DB operations
Large up-front license and add-ons              Freemium model
  (replication, partitioning, caching)
17. Dynamic, Object Data Model Stored Together
MongoDB:
{ customer_id : 1,
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones : [
    { number : "1-212-777-1212",
      dnc : true,
      type : "home" },
    { number : "1-212-777-1213",
      type : "cell" } ]
}
Relational:
Customer ID First Name Last Name City
0 John Doe New York
1 Mark Smith San Francisco
2 Jay Black Newark
3 Meagan White London
4 Edward Daniels Boston
Phone Number Type DNC Customer ID
1-212-555-1212 home T 0
1-212-555-1213 home T 0
1-212-555-1214 cell F 0
1-212-777-1212 home T 1
1-212-777-1213 cell (null) 1
1-212-888-1212 home F 2
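The contrast above can be sketched in plain Java, with no driver types: the nested Map mirrors the MongoDB document stored together, while the flat rows mirror the phones table joined back to the contact by customer_id. The class and method names here are illustrative, not from the talk.

```java
import java.util.*;

// Plain-Java sketch contrasting the two models above: one nested Map
// (the MongoDB document) vs. flat rows joined by a foreign key.
class ShapeComparison {
    // The customer as a single object, stored together.
    public static Map<String, Object> asDocument() {
        Map<String, Object> home = new HashMap<>();
        home.put("number", "1-212-777-1212");
        home.put("dnc", true);
        home.put("type", "home");
        Map<String, Object> cell = new HashMap<>();
        cell.put("number", "1-212-777-1213");
        cell.put("type", "cell");
        Map<String, Object> customer = new HashMap<>();
        customer.put("customer_id", 1);
        customer.put("first_name", "Mark");
        customer.put("last_name", "Smith");
        customer.put("city", "San Francisco");
        customer.put("phones", Arrays.asList(home, cell));
        return customer;
    }

    // The same phones as they would appear in the flat phones table.
    @SuppressWarnings("unchecked")
    public static List<Map<String, Object>> asPhoneRows() {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (Object p : (List<?>) asDocument().get("phones")) {
            Map<String, Object> row = new HashMap<>((Map<String, Object>) p);
            row.put("customer_id", 1); // foreign key back to the contact row
            rows.add(row);
        }
        return rows;
    }
}
```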
18. HA & Scaling Out Built-in
[Diagram: Application → Driver → mongos query router → three shards, each a replica set of one
primary and two secondaries, holding customer ranges 1-1000, 1001-1700, 1701-2500, ...]
High availability: replica sets
Horizontal scalability: sharding
mongos is a query router, so data can be auto-balanced in the background.
19. MongoDB Enterprise Server, Ops Manager, Compass & Connector for BI
MongoDB Enterprise Server
- Commercial license (no AGPL copyleft restrictions)
- Platform certifications
- LDAP & Kerberos, auditing, FIPS 140-2, encryption at rest
MongoDB Ops Manager (automation & productionizing)
- Monitoring & alerting
- Query optimization
- Backup & recovery
- Automation & configuration
- REST API
MongoDB Compass
- Schema visualization
- Data exploration
- Ad-hoc queries
MongoDB Connector for BI
- Visualization
- Analysis
- Reporting
Support & services
- 24x7 support (1-hour SLA)
- Emergency patches
- Customer success program
- On-demand online training
- Warranty, limitation of liability, indemnification
20. Aligns with Microservices Design Patterns
[Diagram: an API layer (microservices, SQL reads, Spark) serves the user app, BI users, and data
scientists. Customer Info services 1..M, MongoDB BI Connectors 1..N, and Spark Connectors 1..Y all
go through mongos query routers to CustInfo shards 1..X, replicated across DC1, DC2, and DC3.]
MongoDB Ops Manager
- Monitors
- Backups/restores
- Automates management
- REST API for container orchestration integration
23. Implementation Phase Example (with Real Code!)
Let's compare and contrast RDBMS/SQL to MongoDB development using Java over the course of a few
weeks.
Some ground rules:
1. Observe the rules of Software Engineering 101: assume separation of the application from a data
   access layer (DAL)
2. The DAL must be able to
   a. Expose simple, functional, data-only interfaces to the application
      - No ORM, frameworks, compile-time bindings, or special tools
   b. Exploit high-performance features of the persistor
3. Focus on core data-handling code and avoid distractions that require the same amount of work in
   both technologies
   a. No exception or error handling
   b. Leave out DB connection and other setup resources
4. Day counts are a proxy for progress, not the actual time to complete the indicated task
24. The Task: Saving and Fetching Contact Data
Start with this simple, flat shape in the Data Access Layer:
  Map m = new HashMap();
  m.put("name", "matt");
  m.put("id", "K1");
And assume we save it in this way:
  save(Map m)
And assume we fetch one by primary key in this way:
  Map m = fetch(String id)
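Before either persistor exists, the save/fetch contract above can be exercised with a minimal in-memory stand-in. This class is not from the talk; it is a sketch of the DAL interface backed by a HashMap, with "id" playing the role of the primary key.

```java
import java.util.*;

// Minimal in-memory stand-in for the DAL contract: save(Map) / fetch(String).
// Useful for unit-testing callers before SQL or MongoDB is wired in.
class InMemoryDAL {
    private final Map<String, Map<String, Object>> store = new HashMap<>();

    public void save(Map<String, Object> m) {
        // Copy so later mutations by the caller don't leak into the "DB".
        store.put((String) m.get("id"), new HashMap<>(m));
    }

    public Map<String, Object> fetch(String id) {
        Map<String, Object> m = store.get(id);
        return m == null ? null : new HashMap<>(m);
    }
}
```

Swapping this out for the SQL or MongoDB implementations below should require no changes in the application, which is exactly the point of rule 1.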
25. Day 1: Initial Efforts for Both Technologies
SQL
DDL: create table contact ( ... )
init()
{
  contactInsertStmt = connection.prepareStatement
    ("insert into contact ( id, name ) values ( ?,? )");
  fetchStmt = connection.prepareStatement
    ("select id, name from contact where id = ?");
}
save(Map m)
{
  contactInsertStmt.setString(1, m.get("id"));
  contactInsertStmt.setString(2, m.get("name"));
  contactInsertStmt.execute();
}
Map fetch(String id)
{
  Map m = null;
  fetchStmt.setString(1, id);
  rs = fetchStmt.executeQuery();
  if(rs.next()) {
    m = new HashMap();
    m.put("id", rs.getString(1));
    m.put("name", rs.getString(2));
  }
  return m;
}
MongoDB
DDL: none
save(Map m)
{
  collection.insertOne(new Document(m));
}
Map fetch(String id)
{
  Map m = null;
  Document doc = new Document();
  doc.put("id", id);
  c = collection.find(doc).iterator();
  if(c.hasNext()) {
    m = (Map) c.next();
  }
  return m;
}
26. Day 2: Add simple fields
m.put("name", "matt");
m.put("id", "K1");
m.put("title", "Mr.");
m.put("hireDate", new Date(2011, 11, 1));
• Capturing title and hireDate is part of adding a new business feature
• It was pretty easy to add two fields to the structure
• …but now we have to change our persistence code
27. SQL Day 2 (Changes in Bold)
DDL: alter table contact add title varchar(8);
     alter table contact add hireDate date;
init()
{
  contactInsertStmt = connection.prepareStatement
    ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
  fetchStmt = connection.prepareStatement
    ("select id, name, title, hiredate from contact where id = ?");
}
save(Map m)
{
  contactInsertStmt.setString(1, m.get("id"));
  contactInsertStmt.setString(2, m.get("name"));
  contactInsertStmt.setString(3, m.get("title"));
  contactInsertStmt.setDate(4, m.get("hireDate"));
  contactInsertStmt.execute();
}
Map fetch(String id)
{
  Map m = null;
  fetchStmt.setString(1, id);
  rs = fetchStmt.executeQuery();
  if(rs.next()) {
    m = new HashMap();
    m.put("id", rs.getString(1));
    m.put("name", rs.getString(2));
    m.put("title", rs.getString(3));
    m.put("hireDate", rs.getDate(4));
  }
  return m;
}
Consequences:
1. Code release schedule linked to database upgrade (new code cannot run on old schema)
2. Issues with case sensitivity starting to creep in (many RDBMSs are case insensitive for column
   names, but code is case sensitive)
3. Changes require careful mods in 4 places
4. Beginning of technical debt
28. MongoDB Day 2
save(Map m)
{
  collection.insertOne(new Document(m));
}
Map fetch(String id)
{
  Map m = null;
  Document doc = new Document();
  doc.put("id", id);
  c = collection.find(doc).iterator();
  if(c.hasNext()) {
    m = (Map) c.next();
  }
  return m;
}
Advantages:
1. Zero time and money spent on overhead code
2. Code and database not physically linked
3. New material with more fields can be added into existing collections; backfill is optional
4. Names of fields in the database precisely match key names in the code layer, matched directly
   by name rather than indirectly via positional offset
5. No technical debt is created
✔ NO CHANGE
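Advantage 3 can be sketched in plain Java, with a List standing in for a MongoDB collection: documents saved before and after Day 2 coexist, and readers treat the new field as optional rather than requiring a backfill. The names and default value here are illustrative.

```java
import java.util.*;

// Sketch of mixed document shapes in one collection: an old document
// (pre-Day 2, no title) next to a new one, with a reader that tolerates both.
class MixedShapes {
    public static List<Map<String, Object>> collection() {
        Map<String, Object> oldDoc = new HashMap<>();   // saved on Day 1
        oldDoc.put("id", "K1");
        oldDoc.put("name", "matt");
        Map<String, Object> newDoc = new HashMap<>();   // saved on Day 2
        newDoc.put("id", "K2");
        newDoc.put("name", "mark");
        newDoc.put("title", "Mr.");
        return Arrays.asList(oldDoc, newDoc);
    }

    // Reader copes with either shape: a missing title falls back to a default.
    public static String titleOf(Map<String, Object> doc) {
        return (String) doc.getOrDefault("title", "(none)");
    }
}
```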
30. Day 3: With RDBMS
DDL: create table phones ( ... )
init()
{
  contactInsertStmt = connection.prepareStatement
    ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
  c2stmt = connection.prepareStatement
    ("insert into phones (id, type, number) values (?, ?, ?)");
  fetchStmt = connection.prepareStatement
    ("select id, name, title, hiredate, type, number from contact, phones
      where phones.id = contact.id and contact.id = ?");
}
save(Map m)
{
  startTrans();
  contactInsertStmt.setString(1, m.get("id"));
  contactInsertStmt.setString(2, m.get("name"));
  contactInsertStmt.setString(3, m.get("title"));
  contactInsertStmt.setDate(4, m.get("hireDate"));
  for(Map onePhone : (List<Map>) m.get("phones")) {
    c2stmt.setString(1, m.get("id"));
    c2stmt.setString(2, onePhone.get("type"));
    c2stmt.setString(3, onePhone.get("number"));
    c2stmt.execute();
  }
  contactInsertStmt.execute();
  endTrans();
}
Map fetch(String id)
{
  Map m = null;
  fetchStmt.setString(1, id);
  rs = fetchStmt.executeQuery();
  int i = 0;
  List list = new ArrayList();
  while (rs.next()) {
    if(i == 0) {
      m = new HashMap();
      m.put("id", rs.getString(1));
      m.put("name", rs.getString(2));
      m.put("title", rs.getString(3));
      m.put("hireDate", rs.getDate(4));
      m.put("phones", list);
    }
    Map onePhone = new HashMap();
    onePhone.put("type", rs.getString(5));
    onePhone.put("number", rs.getString(6));
    list.add(onePhone);
    i++;
  }
  return m;
}
This takes time and money
31. Day 3: With MongoDB
save(Map m)
{
  collection.insertOne(new Document(m));
}
Map fetch(String id)
{
  Map m = null;
  Document doc = new Document();
  doc.put("id", id);
  c = collection.find(doc).iterator();
  if(c.hasNext()) {
    m = (Map) c.next();
  }
  return m;
}
Advantages:
1. Almost zero time and money spent on overhead code
2. No need to fear fields that are "naturally occurring" lists containing data specific to the
   parent structure, which do not benefit from normalization and referential integrity
✔ NO CHANGE
32. By Day 14, Our Structure Looks Like This:
m.put("name", "name");
m.put("id", "K1");
//...
n4.put("startupApps", new String[] { "app1", "app2", "app3" });
n4.put("geo", "US-EAST");
list2.add(n4);
n5.put("startupApps", new String[] { "app6" });
n5.put("geo", "EMEA");
n5.put("useLocalNumberFormats", false);
list2.add(n5);
m.put("preferences", list2);
n6.put("optOut", true);
n6.put("assertDate", someDate);
seclist.add(n6);
m.put("attestations", seclist);
m.put("security", anotherMapOfData);
• It was still pretty easy to add this data to the structure
• Want to guess what the SQL persistence code looks like?
• How about the MongoDB persistence code?
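The Day-14 shape from the slide can be built end to end in plain Java, using a fresh map for each list entry (reusing one map for both preference entries would alias them). The date placeholder and helper class name are illustrative; the values follow the slide, and the point is that the persistor still sees just one Map.

```java
import java.util.*;

// The Day-14 structure built explicitly: nested lists of maps, arrays of
// strings, all hanging off one top-level Map the DAL can save unchanged.
class Day14Shape {
    public static Map<String, Object> build() {
        Map<String, Object> m = new HashMap<>();
        m.put("name", "name");
        m.put("id", "K1");

        List<Map<String, Object>> prefs = new ArrayList<>();
        Map<String, Object> us = new HashMap<>();
        us.put("startupApps", new String[] { "app1", "app2", "app3" });
        us.put("geo", "US-EAST");
        prefs.add(us);
        Map<String, Object> emea = new HashMap<>();        // fresh map, not a reuse
        emea.put("startupApps", new String[] { "app6" });
        emea.put("geo", "EMEA");
        emea.put("useLocalNumberFormats", false);
        prefs.add(emea);
        m.put("preferences", prefs);

        List<Map<String, Object>> attestations = new ArrayList<>();
        Map<String, Object> optOut = new HashMap<>();
        optOut.put("optOut", true);
        optOut.put("assertDate", new Date(0));             // placeholder for someDate
        attestations.add(optOut);
        m.put("attestations", attestations);
        return m;
    }
}
```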
34. MongoDB Day 14 – and every other day
save(Map m)
{
  collection.insertOne(new Document(m));
}
Map fetch(String id)
{
  Map m = null;
  Document doc = new Document();
  doc.put("id", id);
  c = collection.find(doc).iterator();
  if(c.hasNext()) {
    m = (Map) c.next();
  }
  return m;
}
Advantages:
1. Zero time and money spent on overhead code
2. Persistence is so easy, flexible, and backward compatible that the persistor does not
   upward-influence the shapes we want to persist, i.e. the tail does not wag the dog
✔ NO CHANGE
35. Also Powerful Functionality
Expressive Queries
• Find anyone with phone # "1-212..."
• Check if the person with number "555..." is on the "do not call" list
Geospatial
• Find the best offer for the customer at the geo coordinates of 42nd St. and 6th Ave.
Text Search
• Find all tweets that mention the firm within the last 2 days
Aggregation
• Count and sort number of customers grouped by city
Native Binary JSON Support
• Add an additional phone number to Mark Smith's document without rewriting it
• Select just the mobile phone number in the list
• Sort on the modified date
Left Outer Join ($lookup)
• Query for all San Francisco residences, look up their transactions, and sum the amount by person
Graph Queries ($graphLookup)
• Query for all people within 3 degrees of separation from Mark

{ customer_id : 1,
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones : [
    { number : "1-212-777-1212",
      dnc : true,
      type : "home" },
    { number : "1-212-777-1213",
      type : "cell" } ]
}
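Two of the operations above can be sketched as the filter and update documents the driver would send. In real code these would be org.bson.Document instances passed to find() and updateOne(); here they are plain nested Maps so the shapes stand alone, and the class and method names are illustrative.

```java
import java.util.*;

// Query and update documents mirroring the BSON for two examples above:
// a regex match inside the phones array, and a $push that appends a phone
// without rewriting the whole document.
class QuerySketches {
    // "Find anyone with phone # 1-212...": { "phones.number": { $regex: "^1-212" } }
    public static Map<String, Object> phonePrefixFilter() {
        Map<String, Object> regex = new HashMap<>();
        regex.put("$regex", "^1-212");
        Map<String, Object> filter = new HashMap<>();
        filter.put("phones.number", regex);
        return filter;
    }

    // "Add an additional phone number": { $push: { phones: { number, type } } }
    public static Map<String, Object> addPhoneUpdate(String number, String type) {
        Map<String, Object> phone = new HashMap<>();
        phone.put("number", number);
        phone.put("type", type);
        Map<String, Object> push = new HashMap<>();
        push.put("phones", phone);
        Map<String, Object> update = new HashMap<>();
        update.put("$push", push);
        return update;
    }
}
```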
37. Top 15 Global Bank: MongoDB is the DB Standard
Global bank with 48M customers in 50 countries terminates Oracle ULA & makes MongoDB database
of choice.
Problem
- Slow development cycles due to the RDBMS' rigid data model hindering ability to meet business
  demands
- High TCO for hardware, licenses, development, and support (>$50M Oracle ULA)
- Poor overall performance of customer-facing and internal applications
Solution
- Building dozens of apps on MongoDB, both net new and migrations from Oracle (e.g., a significant
  portion of retail banking, including customer-facing and back-office apps, fraud detection, card
  activation, and equity research content management)
- Flexible data model to develop apps quickly and accommodate diverse data
- Ability to scale infrastructure and costs elastically
Results
- Able to cancel Oracle ULA; evaluating which apps can be migrated to MongoDB; for new apps,
  MongoDB is the default choice
- Apps built in weeks instead of months or years, e.g., ebanking app prototyped in 2 weeks and in
  production in 4 weeks
- 70% TCO reduction
38. IoT App Running on MongoDB Atlas
Biotechnology giant uses MongoDB Atlas to allow their customers to track experiments from any
mobile device.
Problem
- Thermo Fisher is developing Thermo Fisher Cloud, one of the largest cloud platforms for the
  scientific community on AWS
- For scientific IoT applications, internal developers need a database that can easily handle a
  wide variety of fast-changing data
- Each experiment produces millions of "rows" of data, which led to suboptimal performance with
  the incumbent database
- Thermo Fisher customers need to be able to slice and dice their data in many different ways
Solution
- MS Instrument Connect allows Thermo Fisher customers to see live experiment results from any
  mobile device or browser
- MongoDB's expressive query language and rich secondary indexes provide the flexibility to
  support both ad-hoc and predefined queries for customers' scientific experiments
- Deployed MongoDB using MongoDB Atlas, a hosted DB service running on Amazon EC2
Results
- Thermo Fisher customers can now obtain real-time insights from mass spectrometry experiments
  from any mobile device or browser; not possible before
- Improved developer productivity, with 40x less code in testing with MongoDB compared to
  incumbent databases
- Improved performance by 6x
- Easy migration process & zero downtime; testing to production in under 2 months
39. ThermoFisher: Inserting data, MongoDB vs. MySQL
• Inserting 1,615 chemical compound records into two parent-child tables.
• To optimize the MySQL query, we turned off foreign keys during insert and
used a string builder to create a bulk insert SQL statement. This improved
insert performance by a factor of 360.
• Compare to MongoDB.
Database               Milliseconds             Lines of code
MySQL not optimized    147,600 (2.5 minutes)    21
MySQL optimized        410                      40
MongoDB                68                       1
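The MySQL optimization described above can be sketched in plain Java: a StringBuilder batches all rows into a single multi-row INSERT instead of issuing one statement per record. The table and column names are illustrative, real code would also escape values or use batched prepared statements, and the MongoDB equivalent is a single collection.insertMany(docs) call.

```java
import java.util.*;

// Sketch of the string-builder bulk INSERT: all rows in one SQL statement.
class BulkInsertSql {
    // Each entry in compounds is a {id, name} pair.
    public static String build(List<String[]> compounds) {
        StringBuilder sql = new StringBuilder(
            "insert into compound (id, name) values ");
        for (int i = 0; i < compounds.size(); i++) {
            String[] row = compounds.get(i);
            if (i > 0) sql.append(", ");
            sql.append("('").append(row[0]).append("','")
               .append(row[1]).append("')");
        }
        return sql.toString();
    }
}
```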
40. For More Information
Resource                        Location
Atlas (MongoDB as a Service)    mongodb.com/cloud/atlas
Case Studies                    mongodb.com/customers
Presentations                   mongodb.com/presentations
Thermo Fisher's Talk            "How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times
                                from Days to Minutes with MongoDB Atlas on AWS"
Free Online Training            education.mongodb.com
Webinars and Events             mongodb.com/events
Documentation                   docs.mongodb.com
MongoDB Downloads               mongodb.com/download
Speaker notes:
The first section is the plan for successful ideas – ideas that failed obviously don't continue.
Ask: do they want any restrictions during prototyping? What is the benefit of data modeling – of a
restrictive schema – before things are at least fairly stable?
What if could eliminate these, shrink them, or break dependencies? Duration and effort would come down
DB activities:
- Prototyping: defining schema (even when prototyping), object-to-relational mapping, changing
  schema, testing 100K inserts (can just use VMs horizontally scaled)
- Biz case: often SAN, as scale-up infrastructure for the DB is a higher portion of costs
  (exponential chart going to $1mm Exadata), % of DBA
- Design: data modeling, SPROCs, OR modeling, schema migration, scaling (show pie chart with half
  being for DB)
- Implementation: persistence code; unit tests fail because they are out of sync with the schema
  (someone else doing that)
- Testing: delays because the shared DB has to be in sync
When you remove these dependencies, there is more flexibility in the critical path and you can add
resources to shorten duration.
Often these are 2 separate people too
Now 3 things have to be in sync and you lose control over performance with the ORM
Now, let’s imagine there’s a new feature, or even just a small change. Let’s say now I need to track the age of the people in my application.
I have to go to my schema, add some tables maybe, add some rows. And some of these operations may require my application to go offline for a while.
Need to use SPROCs because the DB is so slow, and it still doesn't help enough.
This is also because of spreading data across tables – to minimize distributed joins, scaling up
is often preferred with RDBMSs.
Point out there are other NoSQLs that give you some of it but not meant for all use cases (e.g. no secondary indexes)
Built for agility in every direction: data variety, volume, velocity with low TCO and easy management
Can get started with community or Atlas free or paid tier and immediately be ready to productionize and make enterprise-grade
Company: Thermo Fisher
Industry: Science, Biotechnology
Use Case: Real Time Analytics
Products & Services: MongoDB Atlas