Dataweek-Talk-2014
1.
2. BUSINESS PROBLEM
Financial Apps has a lot of great data on users. The data can change and be enhanced on the fly.
For many companies this data sits there adding no real value.
When data is actionable it can have greater value.
3. BUSINESS PROBLEM
We need a way to make this data actionable in real-time without waiting for developers.
• Drive Decisions, Workflows and Content
• Change the user experience based on what we know, Now
• about the user
• about the markets
• about the world
• Monitoring and Alerts
• Spending and Budgeting
• Cash Flow
• Fraud
• Deliver content and data
• Offers and Deals
• Advice
• Aggregated Data Sets
• Data Transformation (HTML / PDF)
4. Home Depot Transaction Analysis
May 1st, 2014 – September 2nd, 2014
Description                                       The Numbers
Percentage of transactions for Home Depot         3,723
Average single transaction amount                 $72.47
Highest single transaction                        $10,450
Percent of users with at least one transaction    13%
Average number of visits per month                1.5
Average spend per month                           $108.70
Having the ability to derive and act on data,
when news breaks, is critical.
BUSINESS PROBLEM
There is knowledge in your data yet to be discovered.
5. THE SYSTEM
If we had a system that could do the following, we could accomplish our goals.
• Dynamic Data Management
• Add new user data, offer feeds and advice in real time.
• No build of the software required to add or modify data.
• Flexibility to work with and aggregate any available data.
• Solution: MongoDB
• Flexible And Scalable Computing
• Leverage Linux PaaS technologies.
• Grow computing/users at a reasonable cost.
• Solution: Iron.io
• Rule Management API
• Add, edit and execute rules on demand via an API.
• Write rules against any collection of data in the platform.
• Join collections of data to create complex rules and data sets.
• Leverage MongoDB and Iron.IO to their fullest.
• Solution: Go Programming Language
6. WHY MONGODB – DYNAMIC DATA MANAGEMENT
MongoDB’s schemaless database provides great flexibility.
Data is stored in “Collections” as
individual documents.
Relationships can be created by using
references. This is in step with how
relational database systems store data.
http://docs.mongodb.org/manual/core/data-modeling-introduction/
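As a rough illustration of references, the sketch below models two collections as plain Go slices and maps and resolves a reference by ID. The `User` and `Account` types, their fields and the `accountsFor` helper are hypothetical names chosen for this example; a real program would query MongoDB through a driver such as mgo instead of scanning in-memory data.

```go
package main

import "fmt"

// User and Account are hypothetical document shapes, not the talk's schema.
type User struct {
	ID   string
	Name string
}

// Account points at its owner by ID instead of embedding the user document.
type Account struct {
	ID     string
	UserID string // reference to User.ID
	Name   string
}

// accountsFor resolves the reference by scanning the accounts "collection".
func accountsFor(userID string, accounts []Account) []Account {
	var out []Account
	for _, a := range accounts {
		if a.UserID == userID {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	users := map[string]User{"u1": {ID: "u1", Name: "Example User"}}
	accounts := []Account{
		{ID: "a1", UserID: "u1", Name: "Bank Visa Platinum1"},
		{ID: "a2", UserID: "u2", Name: "Checking"},
	}
	for _, a := range accountsFor("u1", accounts) {
		fmt.Printf("%s owns %s\n", users[a.UserID].Name, a.Name)
	}
}
```

The lookup by `UserID` plays the same role a foreign key plays in a relational system.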
7. WHY MONGODB – DYNAMIC DATA MANAGEMENT
Embedding data allows all the data for an entity to be organized in a single document.
http://docs.mongodb.org/manual/core/data-modeling-introduction/
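A minimal sketch of embedding, assuming a transaction document shaped like the one shown in the demo slides. The Go type names and the `categoryTotal` helper are assumptions made for illustration, and `encoding/json` stands in for the BSON encoding a MongoDB driver would use.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CategorySplit is embedded inside each transaction document.
type CategorySplit struct {
	CategoryMasterID int     `json:"category_master_id"`
	Type             int     `json:"type"`
	Amount           float64 `json:"amount"`
}

// Transaction carries its category splits with it, so one read returns everything.
type Transaction struct {
	UserID       string          `json:"user_id"`
	MerchantName string          `json:"merchant_name"`
	Amount       float64         `json:"amount"`
	Categories   []CategorySplit `json:"categories"`
}

// categoryTotal sums the embedded splits without touching another collection.
func categoryTotal(txn Transaction) float64 {
	total := 0.0
	for _, c := range txn.Categories {
		total += c.Amount
	}
	return total
}

func main() {
	txn := Transaction{
		UserID:       "9f6b481b-e9fd-473b-5a62-14d3f54e892d",
		MerchantName: "Sam's Club",
		Amount:       150.50,
		Categories:   []CategorySplit{{CategoryMasterID: 22200, Type: 2, Amount: 150.50}},
	}
	doc, err := json.MarshalIndent(txn, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(doc))
	fmt.Println("category total:", categoryTotal(txn))
}
```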
8. WHY MONGODB – DYNAMIC DATA MANAGEMENT
We can leverage the aggregation pipeline for writing rules.
http://docs.mongodb.org/manual/core/aggregation-pipeline/
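To make the idea of a rule as a pipeline concrete, here is a hedged sketch that builds one as plain Go data and prints it as JSON. The `stage` type and the `expensesByCategory` function are hypothetical; the field names come from the transaction document in the demo, and `$match`, `$unwind`, `$group` and `$sum` are standard aggregation operators. With the mgo driver a value like this would be handed to `Collection.Pipe`; the talk does not show the engine's actual rules.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stage is one step of an aggregation pipeline expressed as a generic document.
type stage map[string]interface{}

// expensesByCategory builds a rule that totals a user's debits per category.
func expensesByCategory(userID string) []stage {
	return []stage{
		// Keep only this user's debit transactions.
		{"$match": stage{"user_id": userID, "type": "debit"}},
		// Emit one document per embedded category split.
		{"$unwind": "$categories"},
		// Sum the split amounts for each category.
		{"$group": stage{
			"_id":   "$categories.category_master_id",
			"total": stage{"$sum": "$categories.amount"},
		}},
	}
}

func main() {
	pipeline := expensesByCategory("9f6b481b-e9fd-473b-5a62-14d3f54e892d")
	out, err := json.MarshalIndent(pipeline, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the pipeline is only data, a rule of this kind can be stored, edited and submitted through an API without rebuilding the software.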
9. WHY IRON.IO – FLEXIBLE AND SCALABLE COMPUTING
Iron.io queues and runs worker tasks on its high-performance computing platform. We get
scalability out of the box and can realize all the computing we need, when we need it.
Build single processes and use the computing you
need, when you need it.
The System Is Driven By Data And Processes That Each Perform A
Single Task.
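The queue-and-worker idea can be sketched locally with goroutines and a channel. This is only an analogy under stated assumptions: the `Task` type and `runTasks` function are hypothetical, and local goroutines stand in for hosted workers; nothing here uses the Iron.io API.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical unit of work that performs a single job.
type Task struct {
	Name  string
	Input int
}

// runTasks queues every task and lets a fixed number of workers drain the queue.
func runTasks(tasks []Task, workers int, do func(Task) int) map[string]int {
	if workers < 1 {
		workers = 1 // always keep at least one worker so the queue drains
	}
	queue := make(chan Task)
	results := make(map[string]int)
	var mu sync.Mutex
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for tk := range queue {
				r := do(tk)
				mu.Lock()
				results[tk.Name] = r
				mu.Unlock()
			}
		}()
	}
	for _, tk := range tasks {
		queue <- tk
	}
	close(queue) // no more work: workers exit once the queue is empty
	wg.Wait()
	return results
}

func main() {
	tasks := []Task{{Name: "data-feed", Input: 3}, {Name: "refresh-accounts", Input: 7}}
	res := runTasks(tasks, 2, func(tk Task) int { return tk.Input * 10 })
	fmt.Println("data-feed:", res["data-feed"])
	fmt.Println("refresh-accounts:", res["refresh-accounts"])
}
```

Raising the `workers` argument is the local equivalent of asking the platform for more computing when demand grows.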
10. WHY GO – DO MORE WITH LESS
Go balances being a low-level systems language with offering the features that modern
languages have today. It allows you to be incredibly productive, performant and fully in control.
• Comes with a robust standard library
• Concurrency and garbage collection
• Works on a multitude of platforms
• Code is statically compiled so deployment is trivial
• Comes with a large set of online documentation
• Tools to lint, vet, test, profile and benchmark your code
• mgo (Mango) driver for MongoDB by Gustavo Niemeyer
11. DEMO – USER BUDGET
Generate a budget for any given user, based on their transactions,
a budget model and a set of categories.
{
"user_id" : "9f6b481b-e9fd-473b-5a62-14d3f54e892d",
"account_id" : "5409fcbb6685720018000003",
"account_name" : "Bank Visa Platinum1",
"amount" : 150.50,
"type" : "debit",
"merchant_name" : "Sam's Club",
"categories" : [
{
"category_master_id" : 22200,
"type" : 2,
"amount" : 150.50
}
]
}
Transaction Data
12. DEMO – USER BUDGET
Generate a budget for any given user, based on their transactions,
a budget model and a set of categories.
{
"name" : "budget-model-pw",
"data" : [
{
"category_id" : 20900,
"category" : "Entertainment",
"percentage" : 0.03
},
{
"category_id" : 20002,
"category" : "Phone",
"percentage" : 0.02
}
]
}
Budget Model Data
13. DEMO – USER BUDGET
Generate a budget for any given user, based on their transactions,
a budget model and a set of categories.
{
"category_master_id" : 20900,
"parent_id" : 0,
"name" : "Entertainment",
"type" : 2,
"is_locked" : 0,
"modified_date" : ISODate("2014-08-27T15:13:12.657Z"),
"created_date" : ISODate("2014-08-27T15:13:12.657Z")
}
Category Data
14. DEMO – USER BUDGET
Budget Workflow
• transactions: Find expenses from transactions and sum by category. Save to temp_db.
• transactions: Find income from transactions and sum. Save to temp_db.
• temp_db: Join the income to each category expense. Calculate the percent of spend. Save to temp_db.
• category_master: Load all the expense categories. Save to temp_db.
• temp_db: Join the category name to the documents. Save to temp_db.
• relevance_models: Load the “Financial Apps” Budget Model. Save to temp_db.
• temp_db: Join the budget percentage per category. Then calculate if the percent of spend is over or under the budget limit.
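The budget workflow above can be sketched end to end in plain Go, with in-memory slices and maps standing in for the collections. Several things here are assumptions made for illustration: the `Txn` and `BudgetLine` types, the `buildBudget` function, treating "credit" transactions as income, and computing the percent of spend as category expense divided by total income. The engine described in the talk runs these steps as MongoDB aggregation rules instead.

```go
package main

import (
	"fmt"
	"sort"
)

// Txn is a simplified transaction: "debit" is an expense, "credit" is income (assumption).
type Txn struct {
	Type       string
	CategoryID int
	Amount     float64
}

// BudgetLine is the joined result for one expense category.
type BudgetLine struct {
	CategoryID int
	Category   string
	Spent      float64
	PctOfSpend float64 // spent divided by total income (assumption)
	BudgetPct  float64
	OverBudget bool
}

// buildBudget sums expenses and income, then joins category names and budget limits.
func buildBudget(txns []Txn, names map[int]string, model map[int]float64) []BudgetLine {
	spent := map[int]float64{}
	income := 0.0
	for _, x := range txns {
		switch x.Type {
		case "debit":
			spent[x.CategoryID] += x.Amount
		case "credit":
			income += x.Amount
		}
	}
	var lines []BudgetLine
	for id, amt := range spent {
		line := BudgetLine{CategoryID: id, Category: names[id], Spent: amt, BudgetPct: model[id]}
		if income > 0 { // with no income the percentage is left at zero
			line.PctOfSpend = amt / income
		}
		line.OverBudget = line.PctOfSpend > line.BudgetPct
		lines = append(lines, line)
	}
	// Sort by category ID so the output order is stable.
	sort.Slice(lines, func(i, j int) bool { return lines[i].CategoryID < lines[j].CategoryID })
	return lines
}

func main() {
	txns := []Txn{
		{Type: "credit", Amount: 1000},
		{Type: "debit", CategoryID: 20900, Amount: 50},
		{Type: "debit", CategoryID: 20002, Amount: 10},
	}
	names := map[int]string{20900: "Entertainment", 20002: "Phone"}
	model := map[int]float64{20900: 0.03, 20002: 0.02}
	for _, l := range buildBudget(txns, names, model) {
		fmt.Printf("%-14s spent=%.2f pct=%.3f budget=%.2f over=%t\n",
			l.Category, l.Spent, l.PctOfSpend, l.BudgetPct, l.OverBudget)
	}
}
```

In this made-up data set, Entertainment uses 5% of income against a 3% limit and is flagged as over budget, while Phone at 1% stays under its 2% limit.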
15. DEMO – USER BUDGET
What you have seen is the result of the data flexibility and aggregation capabilities of MongoDB, the power of the Go Programming language and the computing power of Iron.io.
• Go Programming Language
• Systems programming language
• Compiles to binary code for target OS/Architectures
• Cross compile on many operating systems
• Access to scalable cloud computing environments
• mgo driver for Go provides excellent MongoDB support
• MongoDB
• Scalability and redundancy out of the box
• Great MongoDB hosting providers
• Schemaless database that provides great flexibility
• Aggregation pipeline to build rules and datasets
• Can search against text with good performance
• Iron.io - IronWorker
• High-Scale processing and scalability
• Flexible task scheduling and on demand via API
• Guaranteed reliability
• Security, Monitoring and Administration
• No maintenance or IT required
16. LEARN MORE
GOINGGO.NET / GOINGGO TRAINING
How can you start building your own engines using MongoDB and Go?
Getting Started With MongoDB and Go
blog.mongodb.org/post/80579086742/running-mongodb-queries-concurrently-with-go
How to use MongoDB to analyze data in a Go program.
goinggo.net/2013/07/analyze-data-with-mongodb-and-go.html
How to use MongoDB and Go to make your own data actionable.
goinggo.net/2014/06/actionable-data-monogdb-go.html
Go and MongoDB Workshops and Training
GoingGoTraining.net / GoInActionBook.com
Editor's Notes
This is not exclusive to FA.
Much of this data sits there adding no real value.
BI personnel traditionally can only write/run reports.
The data truly becomes relevant when tied together.
The data must be actionable in real-time to have its greatest value.
Time is the most scarce resource.
Five minutes ago is too late. Be relevant now.
You have that user right now, you might not have them again later.
Provide users relevant information based on everything you know.
Protect the user when you can, be proactive not reactive.
Deliver content that is relevant.
Have the engine do as much work as it can.
On September 9th, this story breaks.
Hackers breached computer systems, leaving millions of customers potentially exposed to credit and debit-card theft.
It didn’t take long to generate these stats, but who cares if we can’t act on them.
Schedule new data feeds, with different schema.
Don’t depend on developers or new builds of the system.
MongoDB provides the right data storage flexibility.
Don’t want to manage my own computing.
Want scalability day one, not have to build it out over time.
Iron.io has the platform to scale.
Rules need to exist outside of the system.
All data must be available to rules.
Joining data between collections and decisioning is key.
Go provides the systems language features.
Not being tied down to a schema provides flexibility.
Data is stored as Collections of Documents.
Documents can still “relate” to each other between Collections.
Relevance is about finding these relationships.
Keeping data together helps with performance.
Data is easier to reason about.
Documents can change over time.
Aggregation Pipeline is the key to creating relevance.
Fastest way to filter, project and group data.
Engine leverages this technology exclusively.
Build single-purpose tasks that can consume and publish data.
Run asynchronous tasks such as data feeds and refreshing accounts.
We can queue as many tasks as we need to.
Iron.io can size out the computing we need to fit demand.
Admin tools, APIs and computing out of the box.
Just about everything you need to write services, tasks and APIs.
Focus on doing more with less equals performance
Windows, Mac, Linux including 386, amd64 and arm
Code on Mac and deploy to Linux.
mgo driver may be the best MongoDB driver out there