Retail Reference Architecture
with MongoDB
Antoine Girbal
Principal Solutions Engineer, MongoDB Inc.
@antoinegirbal
Introduction
MongoDB Overview
MongoDB Strategic Advantages
Horizontally Scalable
-Sharding
Agile
Flexible
High Performance &
Strong Consistency
Application
Highly
Available
-Replica Sets
{ customer: "roger",
date: new Date(),
comment: "Spirited Away",
tags: ["Tezuka", "Manga"]}
Documents let you build your data to fit
your application
Relational MongoDB
{ customer_id : 1,
name : "Mark Smith",
city : "San Francisco",
orders: [ {
order_number : 13,
store_id : 10,
date: "2014-01-03",
products: [
{SKU: 24578234,
Qty: 3,
Unit_price: 350},
{SKU: 98762345,
Qty: 1,
Unit_price: 110}
]
},
{ <...> }
]
}
CustomerID  First Name  Last Name  City
0           John        Doe        New York
1           Mark        Smith      San Francisco
2           Jay         Black      Newark
3           Meagan      White      London
4           Edward      Danields   Boston
Order Number  Store ID  Product       Customer ID
10            100       Tablet        0
11            101       Smartphone    0
12            101       Dishwasher    0
13            200       Sofa          1
14            200       Coffee table  1
15            201       Suit          2
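The mapping from the two relational tables to one document per customer can be sketched in plain JavaScript. This is a minimal illustration, assuming the table rows are available as arrays; the helper name `embedOrders` and the camelCase row fields are hypothetical and not part of the original slides.

```javascript
// Rows copied from the two relational tables on the slide.
const customers = [
  { customerId: 0, firstName: "John", lastName: "Doe", city: "New York" },
  { customerId: 1, firstName: "Mark", lastName: "Smith", city: "San Francisco" },
  { customerId: 2, firstName: "Jay", lastName: "Black", city: "Newark" },
  { customerId: 3, firstName: "Meagan", lastName: "White", city: "London" },
  { customerId: 4, firstName: "Edward", lastName: "Danields", city: "Boston" },
];
const orders = [
  { orderNumber: 10, storeId: 100, product: "Tablet", customerId: 0 },
  { orderNumber: 11, storeId: 101, product: "Smartphone", customerId: 0 },
  { orderNumber: 12, storeId: 101, product: "Dishwasher", customerId: 0 },
  { orderNumber: 13, storeId: 200, product: "Sofa", customerId: 1 },
  { orderNumber: 14, storeId: 200, product: "Coffee table", customerId: 1 },
  { orderNumber: 15, storeId: 201, product: "Suit", customerId: 2 },
];

// Hypothetical helper: nest each customer's orders inside the customer document,
// so that a single read returns the customer together with their orders.
function embedOrders(customerRows, orderRows) {
  return customerRows.map((c) => ({
    customer_id: c.customerId,
    name: `${c.firstName} ${c.lastName}`,
    city: c.city,
    orders: orderRows
      .filter((o) => o.customerId === c.customerId)
      .map(({ orderNumber, storeId, product }) => ({
        order_number: orderNumber,
        store_id: storeId,
        product,
      })),
  }));
}

const customerDocs = embedOrders(customers, orders);
console.log(JSON.stringify(customerDocs[1], null, 2));
```

The join happens once, when the document is written, instead of on every read.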
Notions
RDBMS MongoDB
Database Database
Table Collection
Row Document
Column Field
Architecture Overview
Information
Management
Merchandising
Content
Inventory
Customer
Channel
Sales &
Fulfillment
Insight
Social
Architecture Overview
Customer
Channels
Amazon
Ebay
…
Stores
POS
Kiosk
…
Mobile
Smartphone
Tablet
Website
Contact
Center
API
Data and
Service
Integration
Social
Facebook
Twitter
…
Data
Warehouse
Analytics
Supply Chain
Management
System
Suppliers
3rd Party
In Network
Web
Servers
Application
Servers
Commerce Functional Components
Information
Layer
Look & Feel
Navigation
Customization
Personalization
Branding
Promotions
Chat
Ads
Customer's
Perspective
Research
Browse
Search
Select
Shopping Cart
Purchase
Checkout
Receive
Track
Use
Feedback
Maintain
Dialog
Assist
Market / Offer
Guide
Offer
Semantic
Search
Recommend
Rule-based
Decisions
Pricing
Coupons
Sell / Fulfill
Orders
Payments
Fraud
Detection
Fulfillment
Business Rules
Insight
Session
Capture
Activity
Monitoring
Customer Enterprise
Information
Management
Merchandising
Content
Inventory
Customer
Channel
Sales &
Fulfillment
Insight
Social
Merchandising
Merchandising
Merchandising
MongoDB
Product Variation
Product Hierarchy
Pricing
Promotions
Ratings & Reviews
Calendar
Semantic Search
Product Definition
Localization
• Single view of a product: Single scalable catalog service
used by all services and channels
• Read volume is high and sustained
• Write volume spikes during catalog updates, and individual
products must also be updatable in real time
• Advanced indexing and querying is a requirement: find
product by SKU, category, color, etc
• Geographical distribution and low latency achieved
through replication
• Scaling achieved through sharding
Merchandising - principles
Merchandising - requirements
Requirement | Example Challenge | MongoDB
Single view of product | Blended description and hierarchy of product to ensure availability on all channels | Flexible document-oriented storage
High sustained read volume with low latency | Constant querying from online users and sales associates, requiring immediate response | Fast indexed querying, replication allows local copy of catalog, sharding for scaling
Spiky and real-time write volume | Bulk update of full catalog without impacting production, real-time touch update | Fast in-place updating, real-time indexing, sharding for scaling
Advanced querying | Find product based on color, size, description | Ad-hoc querying on any field, advanced secondary and compound indexing
Merchandising - Product Page
Product
images
General
Information
List of
Variations
External
Information
Localized
Description
> db.definitions.findOne()
{ productId: "301671", // main product id
department: "Shoes",
category: "Shoes/Women/Pumps",
brand: "Guess",
thumbnail: "http://cdn…/pump.jpg",
image: "http://cdn…/pump1.jpg", // larger version of thumbnail
title: "Evening Platform Pumps",
description: "These evening platform pumps put the perfect finishing touches on your most glamorous night-on-the-town outfit",
shortDescription: "Evening Platform Pumps",
style: "Designer",
type: "Platform",
rating: 4.5, // user rating
lastUpdated: new Date("2014/04/01"), // last update time
… }
Merchandising - Product Definition
• Get item from Product Id
db.definitions.findOne( { productId: "301671" } )
• Get items from Product Ids
db.definitions.find( { productId: { $in: [ "301671", "301672" ] } } )
• Get items by department
db.definitions.find( { department: "Shoes" } )
• Get items by category prefix
db.definitions.find( { category: /^Shoes\/Women/ } )
• Indices
productId, department, category, lastUpdated
Merchandising - Product Definition
> db.variations.findOne()
{
_id: "730223104376", // the sku
productId: "301671", // references product id
thumbnail: "http://cdn…/pump-red.jpg",
image: "http://cdn…/pump-red.jpg", // larger version of thumbnail
size: 6.0,
color: "Red",
width: "B",
heelHeight: 5.0,
lastUpdated: new Date("2014/04/01"), // last update time
…
}
Merchandising - Product Variation
• Get Variation from SKU
db.variations.find( { _id: "730223104376" } )
• Get all variations for a product, sorted by SKU
db.variations.find( { productId: "301671" } ).sort( { _id: 1 } )
• Indices
productId, lastUpdated
Merchandising - Product Variation
Price: {
_id: "sku730223104376_store123",
currency: "USD",
price: 89.95,
lastUpdated: new Date("2014/04/01"), // last update time
…
}
_id: concatenation of item and store.
Store: can be a store group or store id.
Item: can be an item id or sku
Indices: lastUpdated
Merchandising – Pricing
• Get all prices for a given item
db.prices.find( { _id: /^p301671_/ } )
• Get all prices for a given sku (price could be at item level)
db.prices.find( { _id: { $in: [ /^sku730223104376_/, /^p301671_/ ] } } )
• Get minimum and maximum prices for a sku
db.prices.aggregate( [ { $match: { _id: /^sku730223104376_/ } },
{ $group: { _id: 1, min: { $min: "$price" }, max: { $max: "$price" } } } ] )
• Get price for a sku and store id (returns up to 4 prices)
db.prices.find( { _id: { $in: [ "sku730223104376_store1234",
"sku730223104376_sgroup0",
"p301671_store1234",
"p301671_sgroup0" ] } }, { price: 1 } )
Merchandising - Pricing
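The last query returns up to four price documents, so the application still has to choose one. A minimal sketch of that step follows, assuming the most specific price wins (sku before item, store before store group). That precedence and the helper names `priceIdCandidates` and `pickPrice` are assumptions for illustration; the slides do not state them.

```javascript
// Hypothetical helper: list the candidate price _ids for one sku at one store,
// ordered from most specific to least specific (the order is an assumption).
function priceIdCandidates({ sku, productId, storeId, storeGroupId }) {
  return [
    `sku${sku}_store${storeId}`,
    `sku${sku}_sgroup${storeGroupId}`,
    `p${productId}_store${storeId}`,
    `p${productId}_sgroup${storeGroupId}`,
  ];
}

// Given the documents returned by the $in query, keep the most specific one.
function pickPrice(candidates, priceDocs) {
  const byId = new Map(priceDocs.map((doc) => [doc._id, doc]));
  for (const id of candidates) {
    if (byId.has(id)) return byId.get(id);
  }
  return null; // no price defined at any level
}

const candidates = priceIdCandidates({
  sku: "730223104376",
  productId: "301671",
  storeId: "1234",
  storeGroupId: "0",
});

// Sample result set, as if returned by db.prices.find({ _id: { $in: candidates } }).
const found = [
  { _id: "p301671_sgroup0", price: 99.95 },
  { _id: "sku730223104376_sgroup0", price: 89.95 },
];
console.log(pickPrice(candidates, found).price); // 89.95
```

Because the store and the item are concatenated into `_id`, all four lookups use the default `_id` index and need no extra index.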
• The hierarchy of items typically follows:
• Company
– Division:
• Department: Women's shoe store
– Class: Pumps
»Item: Guess classic pump
• Variation: size 6 black
Merchandising – Product Hierarchy
Merchandising – Browse and Search products
Browse by
category
Special
Lists
Filter by
attributes
Lists hundreds
of item
summaries
Ideally a single query is issued to the database
to obtain all items and metadata to display
The previous page presents many challenges:
• Response is needed within milliseconds for hundreds of
items
• Faceted search on many attributes of an item:
department, brand, category, etc
• Attributes to match may be at the variation level: color,
size, etc, in which case the variation should be shown
• One item may have thousands of variations. Only one
item should be displayed even if many variations match
• Efficient sorting on several attributes: price, popularity
• Pagination feature which requires deterministic ordering
Merchandising – Browse and Search products
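The pagination requirement can be illustrated with a small in-memory sketch. It assumes items are sorted by price with `_id` as a tiebreaker, which makes the order total, and that the next page starts after the last item already shown. The helper names are hypothetical; against the database the same ordering would be expressed as `.sort({ price: 1, _id: 1 })`.

```javascript
// Sort key: price ascending, then _id as a tiebreaker so the order is deterministic.
function compareByPriceThenId(a, b) {
  if (a.price !== b.price) return a.price - b.price;
  return a._id < b._id ? -1 : a._id > b._id ? 1 : 0;
}

// Hypothetical helper: return the page that follows the item `last`
// (or the first page when `last` is null).
function nextPage(items, pageSize, last) {
  const sorted = [...items].sort(compareByPriceThenId);
  const remaining = last
    ? sorted.filter((item) => compareByPriceThenId(item, last) > 0)
    : sorted;
  return remaining.slice(0, pageSize);
}

const items = [
  { _id: "p3", price: 50 },
  { _id: "p1", price: 50 },
  { _id: "p2", price: 20 },
  { _id: "p4", price: 80 },
];
const page1 = nextPage(items, 2, null);
const page2 = nextPage(items, 2, page1[page1.length - 1]);
console.log(page1.map((i) => i._id)); // [ 'p2', 'p1' ]
console.log(page2.map((i) => i._id)); // [ 'p3', 'p4' ]
```

Without the tiebreaker, the two items priced at 50 could swap places between requests and an item could appear on two pages or on none.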
Merchandising – Browse and Search products
Hundreds
of sizes
One Item
Dozens of
colors
A single item may have thousands of variations
Merchandising – Browse and Search products
Images of the matching
variations are displayed
Hierarchy
Sort
parameter
Faceted
Search
Merchandising – Traditional Architecture
Relational DB
System of Records
Full Text Search
Engine
Indexing
#1 obtain
search
results IDs
Application
Cache
#2 obtain
objects by
ID
Pre-joined
into objects
The traditional architecture presents issues:
• 3 different systems to maintain: RDBMS, Search
engine, Caching layer
• A search returns a list of IDs which then are looked up in
the cache as a batch or one by one. It significantly
increases latency of response
• RDBMS schema is complex and static
• The search index needs to be refreshed at intervals
• Setup does not allow efficient pagination
Merchandising – Traditional Architecture
MongoDB Data Store
Merchandising - Architecture
Product
Summaries
Product
Definitions
Pricing
Promotions
Product
Variations
Ratings &
Reviews
#1 Obtain
results
The product index relies on the following parameters:
• The department (required): the main component of category, e.g. "Shoes"
• An indexed attribute (optional)
– Category path, e.g. "Shoes/Women/Pumps"
– Price range (based on online prices)
– List of Item Attributes, e.g. Brand = Guess
– List of Variation Attributes, e.g. Color = red
• A non-indexed attribute (optional)
– List of Item Secondary Attributes, e.g. Style = Designer
– List of Variation Secondary Attributes, e.g. heel height = 5.0
• As well as Sorting, e.g. Price Low to High
Merchandising – Product Summaries
> db.summaries.findOne()
{ "_id": "p39",
"title": "Evening Platform Pumps 39",
"department": "Shoes", "category": "Shoes/Women/Pumps",
"thumbnail": "http://cdn…/pump-small-39.jpg", "image": "http://cdn…/pump-39.jpg",
"price": 145.99,
"rating": 0.95,
"attrs": [ { "brand" : "Guess"}, … ],
"sattrs": [ { "style" : "Designer"} , { "type" : "Platform"}, …],
"vars": [
{ "sku": "sku2441",
"thumbnail": "http://cdn…/pump-small-39.jpg.Blue",
"image": "http://cdn…/pump-39.jpg.Blue",
"attrs": [ { "size": 6.0 }, { "color": "Blue" }, …],
"sattrs": [ { "width" : "B"} , { "heelHeight" : 5.0 }, …],
}, … Many more skus …
] }
Indices: vars.sku, department + attrs + category, department + vars.attrs + category,
department + category, department + price, department + rating
Merchandising – Product Summaries
• Get summary from item id
db.summaries.find( { _id: "p301671" } )
• Get summary's specific variation from SKU
db.summaries.find( { "vars.sku": "730223104376" }, { "vars.$": 1 } )
• Get summary by department, sorted by rating
db.summaries.find( { department: "Shoes" } ).sort( { rating: 1 } )
• Get summary with mix of parameters
db.summaries.find( { department : "Shoes" ,
"vars.attrs" : { "color" : "Gray"} ,
"category" : /^Shoes\/Women/ ,
"price" : { "$gte" : 65.99 , "$lte" : 180.99 } } )
Merchandising - Product Summaries
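A browse page typically receives its parameters from the UI, so the filter has to be assembled at runtime. The sketch below shows one way to do that, assuming a required department plus an optional category prefix, price range, item attribute and variation attribute. The helper name `buildSummaryFilter` and its parameter names are hypothetical.

```javascript
// Hypothetical helper: translate search parameters into a filter
// for the summaries collection.
function buildSummaryFilter({ department, categoryPrefix, priceMin, priceMax, itemAttr, variationAttr }) {
  if (!department) throw new Error("department is required");
  const filter = { department };
  if (categoryPrefix) {
    // Escape regex metacharacters, then anchor the expression at the start of the path.
    const escaped = categoryPrefix.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    filter.category = new RegExp("^" + escaped);
  }
  if (priceMin !== undefined || priceMax !== undefined) {
    filter.price = {};
    if (priceMin !== undefined) filter.price.$gte = priceMin;
    if (priceMax !== undefined) filter.price.$lte = priceMax;
  }
  // Attributes are single-key objects such as { brand: "Guess" }, matching
  // the elements of the "attrs" and "vars.attrs" arrays in a summary document.
  if (itemAttr) filter.attrs = itemAttr;
  if (variationAttr) filter["vars.attrs"] = variationAttr;
  return filter;
}

const filter = buildSummaryFilter({
  department: "Shoes",
  categoryPrefix: "Shoes/Women",
  priceMin: 65.99,
  priceMax: 180.99,
  variationAttr: { color: "Gray" },
});
console.log(filter);
```

The resulting object has the same shape as the "mix of parameters" query and could be passed to `find` as-is.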
Merchandising – Query stats
Department  Category  Price  Primary attribute  Time: average (ms)  90th (ms)  95th (ms)
1           0         0      0                  2                   3          3
1           1         0      0                  1                   2          2
1           0         1      0                  1                   2          3
1           1         1      0                  1                   2          2
1           0         0      1                  0                   1          2
1           1         0      1                  0                   1          1
1           0         1      1                  1                   2          2
1           1         1      1                  0                   1          1
1           0         0      2                  1                   3          3
1           1         0      2                  0                   2          2
1           0         1      2                  10                  20         35
1           1         1      2                  0                   1          1
Content
Content
Content
MongoDB
Metadata
Asset Repository
Digital Right Mgt
Access Control
Processing /
Encoding
Inventory
Inventory
Inventory
MongoDB
External Inventory
Internal Inventory
Regional Inventory
Purchase Orders
Fulfillment
Promotions
Demonstration Document Model
Definitions
• id: p0
Variations
• id: sku0
• pId: p0
Summary
• id: p0
• vars: [sku0,
sku1, …]
Stores
• id: s1
• Loc: [22, 33]
Inventory
• store: s1
• pId: p0
• vars:
[{sku: sku0, q: 3},
{sku: sku2, q: 2}]
Product
db.stores.findOne()
{ "_id" : ObjectId("53549fd3e4b0aaf5d6d07f35"),
"className" : "catalog.Store",
"storeId" : "store0",
"name" : "Bessemer store",
"address" : {
"addr1" : "1st Main St",
"city" : "Bessemer",
"state" : "AL",
"zip" : "12345",
"country" : "US"
},
"location" : [
-86.95444,
33.40178
]
… }
Inventory - Stores
• Get a store by storeId
db.stores.find( { storeId: "store0" } )
• Get nearby stores sorted by distance
db.runCommand( { "geoNear" : "stores" , "near" : [ -82.800672 ,
40.090844 ] , "maxDistance" : 10.0 , "spherical" : true } )
Inventory - Stores
> db.inventory.findOne()
{ "_id": "5354869f300487d20b2b011d",
"storeId": "store0",
"location": [
-86.95444,
33.40178
],
"productId": "p0",
"vars": [
{ "sku": "sku1", "q": 14 },
{ "sku": "sku3", "q": 7 },
{ "sku": "sku7", "q": 32 },
{ "sku": "sku14", "q": 65 },
...
]
}
Inventory - Quantities
• Get all items in a store
db.inventory.find({ storeId: "store100" })
• Get quantity for an item at a store
db.inventory.find({ storeId: "store100", productId: "p200" })
• Get quantity for a sku at a store
db.inventory.find(
{ storeId: "store100", productId: "p200", "vars.sku": "sku11736" },
{ "vars.$": 1 })
• Increment / decrement inventory for an item at a store
db.inventory.update(
{ storeId: "store100", productId: "p200", "vars.sku": "sku11736" },
{ $inc: { "vars.$.q": 20 } })
• Indices: productId, storeId + productId, location (geo) + productId
Inventory - Stores
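The positional update above can be pictured with an in-memory sketch: the `$` in `"vars.$.q"` targets the first array element matched by the query, and `$inc` adds the delta to it. The function name `incrementSkuQuantity` is hypothetical and the sample document is made up for illustration.

```javascript
// In-memory sketch of { $inc: { "vars.$.q": delta } } applied to one inventory document.
function incrementSkuQuantity(inventoryDoc, sku, delta) {
  const index = inventoryDoc.vars.findIndex((v) => v.sku === sku);
  if (index === -1) return false; // the query would not match, so nothing is updated
  inventoryDoc.vars[index].q += delta;
  return true;
}

const doc = {
  storeId: "store100",
  productId: "p200",
  vars: [
    { sku: "sku11735", q: 4 },
    { sku: "sku11736", q: 7 },
  ],
};
incrementSkuQuantity(doc, "sku11736", 20); // restock
incrementSkuQuantity(doc, "sku11736", -1); // one unit sold
console.log(doc.vars[1].q); // 26
```

One possible refinement, not shown in the slides, is to put the stock check into the query itself (for example an `$elemMatch` on the sku with `q: { $gte: 1 }`), so that a decrement only applies while stock remains.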
• Aggregate total quantity for an item
db.inventory.aggregate([
{ $match: { productId: "p200" }},
{ $unwind: "$vars" },
{ $group: { _id: "result", count: { $sum: "$vars.q" } } }])
{ "_id" : "result", "count" : 101752 }
• Aggregate total quantity for a store
db.inventory.aggregate([
{ $match: { storeId: "store100" }},
{ $unwind: "$vars" },
{ $group: { _id: "result", count: { $sum: "$vars.q" } } }])
{ "_id" : "result", "count" : 29347 }
Inventory - Stores
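The three pipeline stages can be mirrored in plain JavaScript to show what each one contributes. This sketch uses made-up sample data and a hypothetical function name; it also contrasts counting unwound entries (`$sum: 1`) with adding up the `q` field (`$sum: "$vars.q"`), since only the latter yields a total quantity.

```javascript
// In-memory equivalent of $match -> $unwind -> $group over inventory documents.
function totalsForProduct(inventoryDocs, productId) {
  const matched = inventoryDocs.filter((d) => d.productId === productId); // $match
  const unwound = matched.flatMap((d) => d.vars); // $unwind: "$vars"
  return {
    variationEntries: unwound.length, // what { $sum: 1 } counts
    totalQuantity: unwound.reduce((sum, v) => sum + v.q, 0), // what { $sum: "$vars.q" } adds up
  };
}

const inventory = [
  { storeId: "store1", productId: "p200", vars: [{ sku: "sku1", q: 3 }, { sku: "sku2", q: 2 }] },
  { storeId: "store2", productId: "p200", vars: [{ sku: "sku1", q: 5 }] },
  { storeId: "store2", productId: "p300", vars: [{ sku: "sku9", q: 8 }] },
];
console.log(totalsForProduct(inventory, "p200"));
// { variationEntries: 3, totalQuantity: 10 }
```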
• Get inventory for an item near a point
db.runCommand(
{ "geoNear" : "inventory" , "near" : [ -82.800672 , 40.090844] ,
"maxDistance" : 10.0 , "spherical" : true, limit: 10,
query: { productId: "p200", "vars.sku": "sku11736" }})
• Get closest store with available sku
db.runCommand(
{ "geoNear" : "inventory" , "near" : [ -82.800672 , 40.090844] ,
"maxDistance" : 10.0 , "spherical" : true, limit: 10,
query: { productId: "p200",
vars: { $elemMatch: { "sku": "sku11736", q: { $gt: 0 } } } } })
Inventory - Stores
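The "closest store with available sku" lookup combines a distance sort with a per-element stock condition. The sketch below reproduces that logic in memory with a haversine distance; the store data, the kilometer unit and the helper names are assumptions for illustration, and the database performs the equivalent work through the geo index.

```javascript
const EARTH_RADIUS_KM = 6371;

// Great-circle distance between two [longitude, latitude] points, in kilometers.
function haversineKm([lng1, lat1], [lng2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Hypothetical helper: nearest inventory document that has the sku in stock.
function closestStoreWithSku(inventoryDocs, point, productId, sku) {
  const candidates = inventoryDocs
    .filter((d) => d.productId === productId)
    // Both conditions must hold on the same array element, as with $elemMatch.
    .filter((d) => d.vars.some((v) => v.sku === sku && v.q > 0))
    .map((d) => ({ storeId: d.storeId, distanceKm: haversineKm(point, d.location) }))
    .sort((a, b) => a.distanceKm - b.distanceKm);
  return candidates.length > 0 ? candidates[0] : null;
}

const nearbyInventory = [
  { storeId: "storeA", productId: "p200", location: [-82.8, 40.09], vars: [{ sku: "sku11736", q: 0 }] },
  { storeId: "storeB", productId: "p200", location: [-82.9, 40.1], vars: [{ sku: "sku11736", q: 3 }] },
  { storeId: "storeC", productId: "p200", location: [-86.95444, 33.40178], vars: [{ sku: "sku11736", q: 5 }] },
];
const closest = closestStoreWithSku(nearbyInventory, [-82.800672, 40.090844], "p200", "sku11736");
console.log(closest.storeId); // storeB
```

The nearest store (storeA) is skipped because its quantity is 0, which is the reason for matching sku and quantity on the same element rather than as two independent conditions.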
Customer
Customer
Customer
MongoDB
Profile
Market Segment
Demographics
Wish List
Preference
Inbox
Sales / Support
Chat
Content
Subscription
Channels
Channels
Channels
MongoDB
Location
Store
Assortment
Point of Sale
Channel Definition
Planogram
Sales & Fulfillment
Sales & Fulfillment
Sales &
Fulfillment
MongoDB
Sales Transaction
Shipping
Tracking
Return & Exchange
Business Rule
Audit
Shopping Cart
Insight
Insight
Insight
MongoDB
Advertising metrics
Clickstream
Recommendations
Session Capture
Activity Logging
Geo Tracking
Product Analytics
Customer Insight
Application Logs
• Many user activities can be of interest:
– Search
– Product view, like or wish
– Shopping cart add / remove
– Sharing on social network
– Ad impression, Clickstream
• Those will be used to compute:
– Product Map (relationships, etc)
– User Preferences
– Recommendations
– Trends
Activity Logging – Data of interest
Activity logging - Architecture
MongoDB
HVDF
API
Activity Logging
User History
External
Analytics:
Hadoop,
Spark,
Storm,
…
User Preferences
Recommendations
Trends
Product Map
Apps
Internal
Analytics:
Aggregation,
MR
All user activity
is recorded
MongoDB –
Hadoop
Connector
Personalization
Activity Logging
• You need to store and manage an incoming stream of data
samples (views, impressions, orders, …)
– High arrival rate of data from many sources
– Variable schema of arriving data
– You need to control retention period of data
• You need to compute derivative data sets based on these
samples
– Aggregations and statistics based on data
– Roll-up data into pre-computed reports and summaries
• You need low latency access to up-to-date data (user history)
– Flexible indexing of raw and derived data sets
– Rich querying based on time + meta-data fields in samples
Activity Logging – Problem statement
Activity logging - Requirements
Requirement | MongoDB
Ingestion of hundreds of thousands of writes / sec | Fast C++ process, multi-threads, multi-locks. Horizontal scaling via sharding. Sequential IO via time partitioning.
Flexible schema | Dynamic schema, each document is independent. Data is stored in the same format and size as it is inserted.
Fast querying on varied fields, sorting | Secondary B-tree indexes can look up and sort the data in milliseconds.
Easy clean up of old data | Deletes are typically as expensive as inserts. Time partitioning provides free deletes by dropping whole partitions.
Activity Logging using HVDF
HVDF (High Volume Data Feed):
• Open source reference implementation of high
volume writing with MongoDB
• REST API server written in Java with most
popular libraries
• Public project, issues can be logged
• Can be run as-is, or customized as needed
Feed
High volume data feed architecture
Channel
Sample Sample Sample Sample
Source
Source
Processor
Inline
Processing
Batch
Processing
Stream
Processing
The Channel is the
sequence of data
samples that a sensor
sends into the
platform.
Sources send
samples into
the Channel
Processors generate
derivative Channels from
other Channel data
HVDF -- High Volume Data Feed engine
HVDF – Reference implementation
REST
Service API
Processor
Plugins
Inline
Batch
Stream
Channel Data Storage
Raw
Channel
Data
Aggregated
Rollup T1
Aggregated
Rollup T2
Query Processor Streaming spout
Custom Stream
Processing Logic
Incoming Sample Stream
POST /feed/channel/data
GET /feed/channeldata?time=XXX&range=YYY
Real-time Queries
{ _id: ObjectId(),
geoCode: 1, // used to localize write operations
sessionId: "2373BB…",
device: { id: "1234",
type: "mobile/iphone",
userAgent: "Chrome/34.0.1847.131"
},
type: "VIEW|CART_ADD|CART_REMOVE|ORDER|…", // type of activity
itemId: "301671",
sku: "730223104376",
order: { id: "12520185",
… },
location: [ -86.95444, 33.40178 ],
tags: [ "smartphone", "iphone", … ], // associated tags
timeStamp: new Date("2014/04/01 …")
}
User Activity - Model
Dynamic schema for sample data
Sample 1
{
deviceId: XXXX,
time: new Date(…),
type: "VIEW",
…
}
Channel
Sample 2
{
deviceId: XXXX,
time: new Date(…),
type: "CART_ADD",
cartId: 123, …
}
Sample 3
{
deviceId: XXXX,
time: new Date(…),
type: "FB_LIKE"
}
Each sample
can have
variable fields
Channels are sharded
Shard
Shard
Shard
Shard
Shard
Shard Key:
customer_id
Sample
{
customer_id: XXXX,
time: new Date(…),
type: "VIEW",
}
Channel
You choose how
to partition
samples
Samples can
have dynamic
schema
Scale
horizontally by
adding shards
Each shard is
highly available
Channels are time partitioned
Channel
Sample Sample Sample Sample Sample Sample Sample Sample
- 2 days - 1 Day Today
Partitioning
keeps indexes
manageable
This is where all
of the writes
happen
Older partitions
are read only for
best possible
concurrency
Queries are routed
only to needed
partitions
Partition 1 Partition 2 Partition N
Each partition is
a separate
collection
Efficient and
space reclaiming
purging of old
data
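Time partitioning can be sketched as three small functions: one that maps a timestamp to a partition, one that lists the partitions a time-range query must visit, and one that finds partitions past the retention period. The naming scheme (one collection per channel per UTC day) and all function names are hypothetical, not taken from HVDF.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical naming scheme: one collection per channel per UTC day.
function partitionName(channel, date) {
  return `${channel}_${date.toISOString().slice(0, 10).replace(/-/g, "")}`;
}

// Partitions a query must visit for the time range [from, to].
function partitionsForRange(channel, from, to) {
  const names = [];
  const start = Math.floor(from.getTime() / DAY_MS) * DAY_MS; // UTC midnight of `from`
  for (let t = start; t <= to.getTime(); t += DAY_MS) {
    names.push(partitionName(channel, new Date(t)));
  }
  return names;
}

// Partitions older than the retention period. Dropping a whole collection
// reclaims its space without deleting documents one by one.
function expiredPartitions(existingNames, channel, now, retentionDays) {
  const cutoff = partitionName(channel, new Date(now.getTime() - retentionDays * DAY_MS));
  return existingNames.filter((name) => name < cutoff);
}

const now = new Date("2014-04-03T10:00:00Z");
const existing = ["activity_20140331", "activity_20140401", "activity_20140402", "activity_20140403"];

console.log(partitionName("activity", now)); // activity_20140403 (all writes go here)
console.log(partitionsForRange("activity", new Date("2014-04-01T22:00:00Z"), now));
// [ 'activity_20140401', 'activity_20140402', 'activity_20140403' ]
console.log(expiredPartitions(existing, "activity", now, 2)); // [ 'activity_20140331' ]
```

Because the date is zero-padded in the name, plain string comparison orders partitions chronologically.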
Dynamic queries on Channels
Channel
Sample Sample Sample Sample
App
App
App
Indexes
Queries Pipelines Map-Reduce
Create custom
indexes on
Channels
Use full MongoDB
query language to
access samples
Use MongoDB
aggregation
pipelines to access
samples
Use MongoDB
inline map-reduce
to access samples
Full access to
field, text, and geo
indexing
North America - West
North America - East
Europe
Geographically distributed system
Channel
Sample Sample Sample Sample
Source
Source
Source
Source
Source
Source
Sample
Sample
Sample
Sample
Geo shards per
location
Clients write to
local nodes
Single view of
channel available
globally
Insight
Insight – Useful Data
• Useful data for better shopping:
– User history (e.g. recently seen products)
– User statistics (e.g. total purchases, visits)
– User interests (e.g. likes videogames and SciFi)
– User social network
– Cross-selling: people who bought this item had
tendency to buy those other items (e.g. iPhone, then
bought iPhone case)
– Up-selling: people who looked at this item eventually
bought those items (alternative product that may be
better)
Example of real-time aggregation with Agg Framework
User Activity – Computing User Stats
Let's simplify each activity recorded as the following:
{ userId: 123, type: "order", itemId: 2, time }
{ userId: 123, type: "order", itemId: 3, time }
{ userId: 234, type: "order", itemId: 7, time }
To calculate items bought by a user for a period of time, let's use
MongoDB's Map Reduce:
- Match activities of type "order" for the past 2 weeks
- map: emit the document by userId
- reduce: push all itemId in a list
- Output looks like { _id: userId, items: [2, 3, 8] }
User Activity –
Items frequently bought together
Then run a second map-reduce job that, for each of the previous results:
- map: emits every combination of 2 items, starting with lowest
itemId
- reduce: sum up the total.
- output looks like { _id: { a: 2, b: 3 } , count: 36 }
User Activity –
Items frequently bought together
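The two map-reduce passes can be tried out in memory before running them against a collection. The sketch below follows the steps described above, with two simplifications that are assumptions: the time-window filter is omitted, and an item bought several times by the same user is counted once. Function names and sample activities are made up.

```javascript
// Pass 1: group the itemIds of "order" activities by user
// (map emits by userId, reduce pushes the itemIds into a list).
function itemsByUser(activities) {
  const byUser = new Map();
  for (const a of activities) {
    if (a.type !== "order") continue;
    if (!byUser.has(a.userId)) byUser.set(a.userId, new Set());
    byUser.get(a.userId).add(a.itemId);
  }
  return [...byUser].map(([userId, items]) => ({ _id: userId, items: [...items] }));
}

// Pass 2: emit every pair of items with the lowest itemId first, then sum the counts.
function pairCounts(userBaskets) {
  const counts = new Map();
  for (const { items } of userBaskets) {
    const sorted = [...items].sort((x, y) => x - y);
    for (let i = 0; i < sorted.length; i++) {
      for (let j = i + 1; j < sorted.length; j++) {
        const key = `${sorted[i]}:${sorted[j]}`;
        counts.set(key, (counts.get(key) || 0) + 1);
      }
    }
  }
  return [...counts].map(([key, count]) => {
    const [a, b] = key.split(":").map(Number);
    return { _id: { a, b }, count };
  });
}

const activities = [
  { userId: 123, type: "order", itemId: 2 },
  { userId: 123, type: "order", itemId: 3 },
  { userId: 234, type: "order", itemId: 7 },
  { userId: 234, type: "order", itemId: 2 },
  { userId: 234, type: "order", itemId: 3 },
  { userId: 345, type: "view", itemId: 2 }, // ignored: not an order
];
const baskets = itemsByUser(activities);
const pairs = pairCounts(baskets);
console.log(JSON.stringify(pairs));
// [{"_id":{"a":2,"b":3},"count":2},{"_id":{"a":2,"b":7},"count":1},{"_id":{"a":3,"b":7},"count":1}]
```

Ordering each pair by lowest itemId ensures that (2, 3) and (3, 2) land on the same key and are counted together.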
The output collection can then be queried per item id, sorted by count, and cut off at a threshold.
This requires indexes on { "_id.a": 1, count: 1 } and { "_id.b": 1, count: 1 }.
You then obtain an affiliation collection with docs like:
{ itemId: 2, affil: [ { id: 3, weight: 36}, { id: 8, weight: 23} ] }
User Activity –
Items frequently bought together
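Turning pair counts into affiliation documents is a small regrouping step, sketched here in memory. The helper name `buildAffiliations`, the threshold value and the sample counts for the pair (3, 8) are assumptions; the other counts reuse the numbers from the example document above.

```javascript
// Hypothetical helper: build one affiliation document per item, keeping only
// pairs at or above a threshold and sorting each list by weight, highest first.
function buildAffiliations(pairCounts, minCount) {
  const byItem = new Map();
  const add = (itemId, otherId, weight) => {
    if (!byItem.has(itemId)) byItem.set(itemId, []);
    byItem.get(itemId).push({ id: otherId, weight });
  };
  for (const { _id: { a, b }, count } of pairCounts) {
    if (count < minCount) continue; // cut off at the threshold
    add(a, b, count); // a pair contributes to both of its items
    add(b, a, count);
  }
  return [...byItem].map(([itemId, affil]) => ({
    itemId,
    affil: affil.sort((x, y) => y.weight - x.weight),
  }));
}

const pairDocs = [
  { _id: { a: 2, b: 8 }, count: 23 },
  { _id: { a: 2, b: 3 }, count: 36 },
  { _id: { a: 3, b: 8 }, count: 4 },
];
const affiliations = buildAffiliations(pairDocs, 10);
console.log(JSON.stringify(affiliations[0]));
// {"itemId":2,"affil":[{"id":3,"weight":36},{"id":8,"weight":23}]}
```

Each pair is written under both items, which is why the pair collection needs to be searchable by either `_id.a` or `_id.b`.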
Example of Hadoop integration
User Activity – Hadoop integration
Social
Social
Social
MongoDB
Social Channels
User Network
Activity
Chat
Social Profiles
Community Mgt
Rewards /
Gamification
Conclusion
Appendix
West DC
Primary
Primary
Primary
Shard
“West”
Shard
“Center”
Shard
“East”
Center DC East DC
Single View of Product Cluster Topology
• Primary node replicates data to all secondaries in the shard as fast as possible
• Center Shard contains all the data for stores in Center region
• Local writes enable very high throughput of updates
• Each region is able to see the data of all stores from its “local” DC
• Two nodes in each DC for painless maintenance with zero downtime
• Even if a DC goes out, the database remains fully available thanks to automated failover
• Data set can grow and shards can be added without any rewrite of the application code
Thank You!
Antoine Girbal
Senior Solutions Engineer, MongoDB Inc.
@antoinegirbal
Retail Reference Architecture Part 3: Scalable Insight Component Providing User History, Recommendations and Personalization

Weitere ähnliche Inhalte

Andere mochten auch

Retail Industry Enterprise Architecture Review
Retail Industry Enterprise Architecture ReviewRetail Industry Enterprise Architecture Review
Retail Industry Enterprise Architecture ReviewLakshmana Kattula
 
MongoDB at eBay
MongoDB at eBayMongoDB at eBay
MongoDB at eBayMongoDB
 
[@IndeedEng] Imhotep Workshop
[@IndeedEng] Imhotep Workshop[@IndeedEng] Imhotep Workshop
[@IndeedEng] Imhotep Workshopindeedeng
 
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBayStoring eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBayMongoDB
 
Shopzilla On Concurrency
Shopzilla On ConcurrencyShopzilla On Concurrency
Shopzilla On ConcurrencyRodney Barlow
 
Bizrate Insights iMedia Conference Presentation
Bizrate Insights iMedia Conference PresentationBizrate Insights iMedia Conference Presentation
Bizrate Insights iMedia Conference PresentationConnexity
 
LA Salesforce.com User Group: Shopzilla and Informatica Cloud
LA Salesforce.com User Group: Shopzilla and Informatica CloudLA Salesforce.com User Group: Shopzilla and Informatica Cloud
LA Salesforce.com User Group: Shopzilla and Informatica CloudDarren Cunningham
 
Better Living Through Messaging - Leveraging the HornetQ Message Broker at Sh...
Better Living Through Messaging - Leveraging the HornetQ Message Broker at Sh...Better Living Through Messaging - Leveraging the HornetQ Message Broker at Sh...
Better Living Through Messaging - Leveraging the HornetQ Message Broker at Sh...Joshua Long
 
Real-time Recommendations for Retail: Architecture, Algorithms, and Design
Real-time Recommendations for Retail: Architecture, Algorithms, and DesignReal-time Recommendations for Retail: Architecture, Algorithms, and Design
Real-time Recommendations for Retail: Architecture, Algorithms, and DesignJuliet Hougland
 
Retail operation in Reliance Trends and its impact on customer satisfaction
Retail operation in Reliance Trends and its impact on customer satisfactionRetail operation in Reliance Trends and its impact on customer satisfaction
Retail operation in Reliance Trends and its impact on customer satisfactionSubhajit Sar
 
Shopzilla - Performance By Design
Shopzilla - Performance By DesignShopzilla - Performance By Design
Shopzilla - Performance By DesignTim Morrow
 
5 Conversion Rate Hacks That Yield Massive 3-5x Conversion Rate Improvements ...
5 Conversion Rate Hacks That Yield Massive 3-5x Conversion Rate Improvements ...5 Conversion Rate Hacks That Yield Massive 3-5x Conversion Rate Improvements ...
5 Conversion Rate Hacks That Yield Massive 3-5x Conversion Rate Improvements ...Internet Marketing Software - WordStream
 
Big Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business SchoolBig Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business SchoolLaurent Kinet
 
5 Strategies For Profitable Retail
5 Strategies For Profitable Retail5 Strategies For Profitable Retail
5 Strategies For Profitable RetailCharanpreet Singh
 
Big data retail_industry_by VivekChutke
Big data retail_industry_by VivekChutkeBig data retail_industry_by VivekChutke
Big data retail_industry_by VivekChutkevchutke
 
Big Data in Retail: too big to ignore
Big Data in Retail: too big to ignoreBig Data in Retail: too big to ignore
Big Data in Retail: too big to ignorevalantic NL
 
Continuous Performance Testing and Monitoring in Agile Development
Continuous Performance Testing and Monitoring in Agile DevelopmentContinuous Performance Testing and Monitoring in Agile Development
Continuous Performance Testing and Monitoring in Agile DevelopmentDynatrace
 
The Big Data Journey at Connexity - Big Data Day LA 2015
The Big Data Journey at Connexity - Big Data Day LA 2015The Big Data Journey at Connexity - Big Data Day LA 2015
The Big Data Journey at Connexity - Big Data Day LA 2015Will Gage
 
Competetive Differentiation - The past, present and future of newspapers
Competetive Differentiation - The past, present and future of newspapersCompetetive Differentiation - The past, present and future of newspapers
Competetive Differentiation - The past, present and future of newspapersHans-Erik Hamid Lydecker (MSc)
 

Andere mochten auch (20)

Retail Industry Enterprise Architecture Review
Retail Industry Enterprise Architecture ReviewRetail Industry Enterprise Architecture Review
Retail Industry Enterprise Architecture Review
 
MongoDB at eBay
MongoDB at eBayMongoDB at eBay
MongoDB at eBay
 
[@IndeedEng] Imhotep Workshop
[@IndeedEng] Imhotep Workshop[@IndeedEng] Imhotep Workshop
[@IndeedEng] Imhotep Workshop
 
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBayStoring eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
 
Shopzilla On Concurrency
Shopzilla On ConcurrencyShopzilla On Concurrency
Shopzilla On Concurrency
 
Bizrate Insights iMedia Conference Presentation
Bizrate Insights iMedia Conference PresentationBizrate Insights iMedia Conference Presentation
Bizrate Insights iMedia Conference Presentation
 
LA Salesforce.com User Group: Shopzilla and Informatica Cloud
LA Salesforce.com User Group: Shopzilla and Informatica CloudLA Salesforce.com User Group: Shopzilla and Informatica Cloud
LA Salesforce.com User Group: Shopzilla and Informatica Cloud
 
Better Living Through Messaging - Leveraging the HornetQ Message Broker at Sh...
Better Living Through Messaging - Leveraging the HornetQ Message Broker at Sh...Better Living Through Messaging - Leveraging the HornetQ Message Broker at Sh...
Better Living Through Messaging - Leveraging the HornetQ Message Broker at Sh...
 
Real-time Recommendations for Retail: Architecture, Algorithms, and Design
Real-time Recommendations for Retail: Architecture, Algorithms, and DesignReal-time Recommendations for Retail: Architecture, Algorithms, and Design
Real-time Recommendations for Retail: Architecture, Algorithms, and Design
 
Retail operation in Reliance Trends and its impact on customer satisfaction
Retail operation in Reliance Trends and its impact on customer satisfactionRetail operation in Reliance Trends and its impact on customer satisfaction
Retail operation in Reliance Trends and its impact on customer satisfaction
 
Shopzilla - Performance By Design
Shopzilla - Performance By DesignShopzilla - Performance By Design
Shopzilla - Performance By Design
 
5 Conversion Rate Hacks That Yield Massive 3-5x Conversion Rate Improvements ...
5 Conversion Rate Hacks That Yield Massive 3-5x Conversion Rate Improvements ...5 Conversion Rate Hacks That Yield Massive 3-5x Conversion Rate Improvements ...
5 Conversion Rate Hacks That Yield Massive 3-5x Conversion Rate Improvements ...
 
Big Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business SchoolBig Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business School
 
5 Strategies For Profitable Retail
5 Strategies For Profitable Retail5 Strategies For Profitable Retail
5 Strategies For Profitable Retail
 
Big data retail_industry_by VivekChutke
Big data retail_industry_by VivekChutkeBig data retail_industry_by VivekChutke
Big data retail_industry_by VivekChutke
 
The Big Data Revolution in Retail
The Big Data Revolution in RetailThe Big Data Revolution in Retail
The Big Data Revolution in Retail
 
Big Data in Retail: too big to ignore
Big Data in Retail: too big to ignoreBig Data in Retail: too big to ignore
Big Data in Retail: too big to ignore
 
Continuous Performance Testing and Monitoring in Agile Development
Continuous Performance Testing and Monitoring in Agile DevelopmentContinuous Performance Testing and Monitoring in Agile Development
Continuous Performance Testing and Monitoring in Agile Development
 
The Big Data Journey at Connexity - Big Data Day LA 2015
The Big Data Journey at Connexity - Big Data Day LA 2015The Big Data Journey at Connexity - Big Data Day LA 2015
The Big Data Journey at Connexity - Big Data Day LA 2015
 
Competetive Differentiation - The past, present and future of newspapers
Competetive Differentiation - The past, present and future of newspapersCompetetive Differentiation - The past, present and future of newspapers
Competetive Differentiation - The past, present and future of newspapers
 

Retail Reference Architecture Part 3: Scalable Insight Component Providing User History, Recommendations and Personalization

  • 1. Retail Reference Architecture with MongoDB Antoine Girbal Principal Solutions Engineer, MongoDB Inc. @antoinegirbal
  • 4. MongoDB Strategic Advantages
      • Horizontally Scalable: Sharding
      • Highly Available: Replica Sets
      • Agile
      • Flexible
      • High Performance & Strong Consistency

      Application document:
      { customer: "roger",
        date: new Date(),
        comment: "Spirited Away",
        tags: ["Tezuka", "Manga"] }
  • 5. Documents let you build your data to fit your application

      Relational:

      CustomerID  First Name  Last Name  City
      0           John        Doe        New York
      1           Mark        Smith      San Francisco
      2           Jay         Black      Newark
      3           Meagan      White      London
      4           Edward      Danields   Boston

      Order Number  Store ID  Product       Customer ID
      10            100       Tablet        0
      11            101       Smartphone    0
      12            101       Dishwasher    0
      13            200       Sofa          1
      14            200       Coffee table  1
      15            201       Suit          2

      MongoDB:

      { customer_id: 1,
        name: "Mark Smith",
        city: "San Francisco",
        orders: [
          { order_number: 13,
            store_id: 10,
            date: "2014-01-03",
            products: [
              { SKU: 24578234, Qty: 3, Unit_price: 350 },
              { SKU: 98762345, Qty: 1, Unit_price: 110 }
            ]
          },
          { <...> }
        ]
      }
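The relational-to-document mapping on this slide can be sketched in plain JavaScript. This is only an illustration, not code from the deck: the helper name `embedOrders` and the camelCase row fields (`customerId`, `orderNumber`, and so on) are assumptions, and the sample rows are taken from the tables above.

```javascript
// Hypothetical helper (not from the slides): folds flat relational rows
// into one customer document with an embedded orders array.
function embedOrders(customer, orderRows) {
  const orders = orderRows
    .filter((row) => row.customerId === customer.customerId)
    .map((row) => ({
      order_number: row.orderNumber,
      store_id: row.storeId,
      product: row.product,
    }));
  return {
    customer_id: customer.customerId,
    name: `${customer.firstName} ${customer.lastName}`,
    city: customer.city,
    orders,
  };
}

// Sample rows copied from the relational tables on the slide.
const customers = [
  { customerId: 1, firstName: "Mark", lastName: "Smith", city: "San Francisco" },
];
const orderRows = [
  { orderNumber: 12, storeId: 101, product: "Dishwasher", customerId: 0 },
  { orderNumber: 13, storeId: 200, product: "Sofa", customerId: 1 },
  { orderNumber: 14, storeId: 200, product: "Coffee table", customerId: 1 },
];

// One document now carries the customer and all of that customer's orders.
const markDoc = embedOrders(customers[0], orderRows);
```

Reading `markDoc` then needs a single lookup instead of a join across two tables, which is the point the slide makes about fitting the data to the application.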
  • 6. Notions

      RDBMS     MongoDB
      Database  Database
      Table     Collection
      Row       Document
      Column    Field
  • 8. Architecture Overview
      • Customer
      • Channels: Amazon, Ebay, …
      • Stores: POS, Kiosk, …
      • Mobile: Smartphone, Tablet
      • Website
      • Contact Center
      • Social: Facebook, Twitter, …
      • Web Servers, Application Servers
      • API
      • Data and Service Integration
      • Information Management: Merchandising, Content, Inventory, Customer, Channel, Sales & Fulfillment, Insight, Social
      • Data Warehouse
      • Analytics
      • Supply Chain Management System
      • Suppliers: 3rd Party, In Network
  • 9. Commerce Functional Components
      • Information Layer: Look & Feel, Navigation, Customization, Personalization, Branding, Promotions, Chat, Ads
      • Customer's Perspective: Research, Browse, Search, Select, Shopping Cart, Purchase, Checkout, Receive, Track, Use, Feedback, Maintain, Dialog, Assist
      • Market / Offer: Guide, Offer, Semantic Search, Recommend, Rule-based Decisions, Pricing, Coupons
      • Sell / Fulfill: Orders, Payments, Fraud Detection, Fulfillment
      • Business Rules
      • Insight: Session Capture, Activity Monitoring
      • Customer
      • Enterprise Information Management: Merchandising, Content, Inventory, Customer, Channel, Sales & Fulfillment, Insight, Social
  • 11. Merchandising (MongoDB): Product Variation, Product Hierarchy, Pricing, Promotions, Ratings & Reviews, Calendar, Semantic Search, Product Definition, Localization
  • 12. Merchandising - principles
      • Single view of a product: a single scalable catalog service used by all services and channels
      • Read volume is high and sustained
      • Write volume spikes during catalog updates, but a single product can also be updated in real time
      • Advanced indexing and querying are required: find a product by SKU, category, color, etc.
      • Geographical distribution and low latency are achieved through replication
      • Scaling is achieved through sharding
  • 13. Merchandising - requirements
      • Requirement: Single view of product
        Example challenge: Blended description and hierarchy of product to ensure availability on all channels
        MongoDB: Flexible document-oriented storage
      • Requirement: High sustained read volume with low latency
        Example challenge: Constant querying from online users and sales associates, requiring immediate response
        MongoDB: Fast indexed querying, replication allows a local copy of the catalog, sharding for scaling
      • Requirement: Spiky and real-time write volume
        Example challenge: Bulk update of the full catalog without impacting production, real-time touch update
        MongoDB: Fast in-place updating, real-time indexing, sharding for scaling
      • Requirement: Advanced querying
        Example challenge: Find product based on color, size, description
        MongoDB: Ad-hoc querying on any field, advanced secondary and compound indexing
  • 14. Merchandising - Product Page: Product images, General Information, List of Variations, External Information, Localized Description
  • 15. Merchandising - Product Definition

      > db.definitions.findOne()
      {
        productId: "301671",                 // main product id
        department: "Shoes",
        category: "Shoes/Women/Pumps",
        brand: "Guess",
        thumbnail: "http://cdn…/pump.jpg",
        image: "http://cdn…/pump1.jpg",      // larger version of thumbnail
        title: "Evening Platform Pumps",
        description: "Those evening platform pumps put the perfect finishing touches on your most glamorous night-on-the-town outfit",
        shortDescription: "Evening Platform Pumps",
        style: "Designer",
        type: "Platform",
        rating: 4.5,                         // user rating
        lastUpdated: Date("2014/04/01"),     // last update time
        …
      }
  • 16. 16 • Get item from Product Id db.definitions.findOne( { productId: "301671" } ) • Get items from Product Ids db.definitions.find( { productId: { $in: ["301671", "301672" ] } } ) • Get items by department db.definitions.find({ department: "Shoes" }) • Get items by category prefix db.definitions.find( { category: /^Shoes\/Women/ } ) • Indices productId, department, category, lastUpdated Merchandising - Product Definition
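The category-prefix query above depends on an anchored regular expression. A minimal sketch of building one safely from a user-supplied path follows; `categoryPrefixRegex` is a hypothetical helper name, not something from the slides, and the escaping step is an assumption about what a production application would want.

```javascript
// Hypothetical helper (not from the original deck): turns a category path
// into an anchored prefix regex, escaping any regex metacharacters first.
function categoryPrefixRegex(path) {
  const escaped = path.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped);
}

const shoesWomen = categoryPrefixRegex("Shoes/Women");
console.log(shoesWomen.test("Shoes/Women/Pumps"));       // true
console.log(shoesWomen.test("Accessories/Shoes/Women")); // false

// In the shell the result could be passed directly, for example:
//   db.definitions.find({ category: shoesWomen })
```

Anchoring the expression with `^` is what allows the query to use the index on `category` as a prefix scan.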
  • 17. 17 > db.variations.findOne() { _id: "730223104376", // the sku productId: "301671", // references product id thumbnail: "http://cdn…/pump-red.jpg", image: "http://cdn…/pump-red.jpg", // larger version of thumbnail size: 6.0, color: "Red", width: "B", heelHeight: 5.0, lastUpdated: Date("2014/04/01"), // last update time … } Merchandising - Product Variation
  • 18. 18 • Get Variation from SKU db.variations.find( { _id: "730223104376" } ) • Get all variations for a product, sorted by SKU db.variations.find( { productId: "301671" } ).sort( { _id: 1 } ) • Indices productId, lastUpdated Merchandising - Product Variation
  • 19. 20 Price: { _id: "sku730223104376_store123", currency: "USD", price: 89.95, lastUpdated: Date("2014/04/01"), // last update time … } _id: concatenation of item and store. Store: can be a store group or store id. Item: can be an item id or sku Indices: lastUpdated Merchandising – Pricing
  • 20. 21 • Get all prices for a given item db.prices.find( { _id: /^p301671_/ } ) • Get all prices for a given sku (price could be at item level) db.prices.find( { _id: { $in: [ /^sku730223104376_/, /^p301671_/ ] } } ) • Get minimum and maximum prices for a sku db.prices.aggregate( [ { $match: { … } }, { $group: { _id: 1, min: { $min: "$price" }, max: { $max: "$price" } } } ] ) • Get price for a sku and store id (returns up to 4 prices) db.prices.find( { _id: { $in: [ "sku730223104376_store1234", "sku730223104376_sgroup0", "p301671_store1234", "p301671_sgroup0" ] } }, { price: 1 } ) Merchandising - Pricing
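The last query above looks up as many as four candidate `_id` values built from item and store. A rough sketch of how an application might generate those keys and choose among the returned prices is shown here. Both function names are hypothetical, and the precedence order (SKU before item, store before store group) is an assumption; the slides do not state which price wins.

```javascript
// Hypothetical helper: builds the four candidate price keys in the same
// order as the slide's query (sku/store, sku/group, item/store, item/group).
function priceKeyCandidates(sku, productId, storeId, storeGroup) {
  const items = ["sku" + sku, "p" + productId];
  const stores = ["store" + storeId, "sgroup" + storeGroup];
  const keys = [];
  for (const item of items) {
    for (const store of stores) {
      keys.push(item + "_" + store);
    }
  }
  return keys;
}

// Assumed precedence: the first candidate that has a price document wins.
function pickMostSpecific(candidates, priceDocs) {
  const byId = new Map(priceDocs.map((doc) => [doc._id, doc]));
  for (const key of candidates) {
    if (byId.has(key)) return byId.get(key);
  }
  return null;
}

const candidateKeys = priceKeyCandidates("730223104376", "301671", "1234", "0");
console.log(candidateKeys);
// In the shell: db.prices.find({ _id: { $in: candidateKeys } }, { price: 1 })
```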
  • 21. 22 • The hierarchy of items typically follows: Company > Division > Department (e.g. Women's shoe store) > Class (e.g. Pumps) > Item (e.g. Guess classic pump) > Variation (e.g. size 6 black) Merchandising – Product Hierarchy
  • 22. 24 Merchandising – Browse and Search products Browse by category Special Lists Filter by attributes Lists hundreds of item summaries Ideally a single query is issued to the database to obtain all items and metadata to display
  • 23. 25 The previous page presents many challenges: • Response is needed within milliseconds for hundreds of items • Faceted search on many attributes of an item: department, brand, category, etc • Attributes to match may be at the variation level: color, size, etc, in which case the variation should be shown • One item may have thousands of variations. Only one item should be displayed even if many variations match • Efficient sorting on several attributes: price, popularity • Pagination feature which requires deterministic ordering Merchandising – Browse and Search products
  • 24. 26 Merchandising – Browse and Search products Hundreds of sizes One Item Dozens of colors A single item may have thousands of variations
  • 25. 27 Merchandising – Browse and Search products Images of the matching variations are displayed Hierarchy Sort parameter Faceted Search
  • 26. 28 Merchandising – Traditional Architecture Relational DB System of Records Full Text Search Engine Indexing #1 obtain search results IDs Application Cache #2 obtain objects by ID Pre-joined into objects
  • 27. 29 The traditional architecture presents issues: • 3 different systems to maintain: RDBMS, Search engine, Caching layer • A search returns a list of IDs which then are looked up in the cache as a batch or one by one. It significantly increases latency of response • RDBMS schema is complex and static • The search index needs to be refreshed at intervals • Setup does not allow efficient pagination Merchandising – Traditional Architecture
  • 28. 30 MongoDB Data Store Merchandising - Architecture Product Summaries Product Definitions Pricing Promotions Product Variations Ratings & Reviews #1 Obtain results
  • 29. 31 The product index relies on the following parameters: • The department (required): the main component of category, e.g. "Shoes" • An indexed attribute (optional) – Category path, e.g. "Shoes/Women/Pumps" – Price range (based on online prices) – List of Item Attributes, e.g. Brand = Guess – List of Variation Attributes, e.g. Color = red • A non-indexed attribute (optional) – List of Item Secondary Attributes, e.g. Style = Designer – List of Variation Secondary Attributes, e.g. heel height = 5.0 • As well as Sorting, e.g. Price Low to High Merchandising – Product Summaries
  • 30. 32 > db.summaries.findOne() { "_id": "p39", "title": "Evening Platform Pumps 39", "department": "Shoes", "category": "Shoes/Women/Pumps", "thumbnail": "http://cdn…/pump-small-39.jpg", "image": "http://cdn…/pump-39.jpg", "price": 145.99, "rating": 0.95, "attrs": [ { "brand" : "Guess"}, … ], "sattrs": [ { "style" : "Designer"} , { "type" : "Platform"}, …], "vars": [ { "sku": "sku2441", "thumbnail": "http://cdn…/pump-small-39.jpg.Blue", "image": "http://cdn…/pump-39.jpg.Blue", "attrs": [ { "size": 6.0 }, { "color": "Blue" }, …], "sattrs": [ { "width" : "B"} , { "heelHeight" : 5.0 }, …] }, … Many more skus … ] } Indices: vars.sku, department + attrs + category, department + vars.attrs + category, department + category, department + price, department + rating Merchandising – Product Summaries
  • 31. 33 • Get summary from item id db.summaries.find({ _id: "p301671" }) • Get summary's specific variation from SKU db.summaries.find( { "vars.sku": "730223104376" }, { "vars.$": 1 } ) • Get summary by department, sorted by rating db.summaries.find( { department: "Shoes" } ).sort( { rating: 1 } ) • Get summary with mix of parameters db.summaries.find( { department : "Shoes" , "vars.attrs" : { "color" : "Gray"} , "category" : /^Shoes\/Women/ , "price" : { "$gte" : 65.99 , "$lte" : 180.99 } } ) Merchandising - Product Summaries
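The "mix of parameters" query is typically assembled from whatever facets the shopper selected. The sketch below illustrates one way to do that. `buildSummaryQuery` and its parameter names are hypothetical, and the `_id` tie-breaker in the sort is an assumption added to give the deterministic ordering that pagination requires.

```javascript
// Escapes regex metacharacters so a category path can be used as a prefix.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical builder: department is required, everything else is optional.
function buildSummaryQuery(params) {
  if (!params.department) throw new Error("department is required");
  const query = { department: params.department };
  if (params.categoryPrefix) {
    query.category = new RegExp("^" + escapeRegex(params.categoryPrefix));
  }
  if (params.minPrice !== undefined || params.maxPrice !== undefined) {
    query.price = {};
    if (params.minPrice !== undefined) query.price.$gte = params.minPrice;
    if (params.maxPrice !== undefined) query.price.$lte = params.maxPrice;
  }
  if (params.variationAttr) {
    query["vars.attrs"] = params.variationAttr; // e.g. { color: "Gray" }
  }
  // Assumption: _id is appended as a tie-breaker so pages do not overlap.
  const sortField = params.sortBy || "price";
  const sort = { [sortField]: params.descending ? -1 : 1, _id: 1 };
  return { query, sort };
}

const exampleSearch = buildSummaryQuery({
  department: "Shoes",
  categoryPrefix: "Shoes/Women",
  minPrice: 65.99,
  maxPrice: 180.99,
  variationAttr: { color: "Gray" },
});
console.log(exampleSearch);
// In the shell:
//   db.summaries.find(exampleSearch.query).sort(exampleSearch.sort).limit(50)
```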
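  • 32. 34 Merchandising – Query stats
      Department | Category | Price | Primary attribute | Time: Average (ms) | Time: 90th (ms) | Time: 95th (ms)
      1 | 0 | 0 | 0 | 2 | 3 | 3
      1 | 1 | 0 | 0 | 1 | 2 | 2
      1 | 0 | 1 | 0 | 1 | 2 | 3
      1 | 1 | 1 | 0 | 1 | 2 | 2
      1 | 0 | 0 | 1 | 0 | 1 | 2
      1 | 1 | 0 | 1 | 0 | 1 | 1
      1 | 0 | 1 | 1 | 1 | 2 | 2
      1 | 1 | 1 | 1 | 0 | 1 | 1
      1 | 0 | 0 | 2 | 1 | 3 | 3
      1 | 1 | 0 | 2 | 0 | 2 | 2
      1 | 0 | 1 | 2 | 10 | 20 | 35
      1 | 1 | 1 | 2 | 0 | 1 | 1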
  • 34. 36 Content Content MongoDB Metadata Asset Repository Digital Right Mgt Access Control Processing / Encoding
  • 36. 38 Inventory Inventory MongoDB External Inventory Internal Inventory Regional Inventory Purchase Orders Fulfillment Promotions
  • 37. 39 Demonstration Document Model Definitions • id: p0 Variations • id: sku0 • pId: p0 Summary • id: p0 • vars: [sku0, sku1, …] Stores • id: s1 • Loc: [22, 33] Inventory • store: s1 • pId: p0 • vars: [{sku: sku0, q: 3}, {sku: sku2, q: 2}] Product
  • 38. 40 db.stores.findOne() { "_id" : ObjectId("53549fd3e4b0aaf5d6d07f35"), "className" : "catalog.Store", "storeId" : "store0", "name" : "Bessemer store", "address" : { "addr1" : "1st Main St", "city" : "Bessemer", "state" : "AL", "zip" : "12345", "country" : "US" }, "location" : [ -86.95444, 33.40178 ] … } Inventory - Stores
  • 39. 41 • Get a store by storeId db.stores.find({ storeId: "store0" }) • Get nearby stores sorted by distance db.runCommand({ "geoNear" : "stores" , "near" : [ -82.800672 , 40.090844] , "maxDistance" : 10.0 , "spherical" : true }) Inventory - Stores
  • 40. 42 > db.inventory.findOne() { "_id": "5354869f300487d20b2b011d", "storeId": "store0", "location": [ -86.95444, 33.40178 ], "productId": "p0", "vars": [ { "sku": "sku1", "q": 14 }, { "sku": "sku3", "q": 7 }, { "sku": "sku7", "q": 32 }, { "sku": "sku14", "q": 65 }, ... ] } Inventory - Quantities
  • 41. 43 • Get all items in a store db.inventory.find({ storeId: "store100" }) • Get quantity for an item at a store db.inventory.find({ storeId: "store100", productId: "p200" }) • Get quantity for a sku at a store db.inventory.find( { storeId: "store100", productId: "p200", "vars.sku": "sku11736" }, { "vars.$": 1 }) • Increment / decrement inventory for an item at a store db.inventory.update( { storeId: "store100", productId: "p200", "vars.sku": "sku11736" }, { $inc: { "vars.$.q": 20 } }) • Indices: productId, storeId + productId, location (geo) + productId Inventory - Stores
  • 42. 44 • Aggregate total quantity for an item db.inventory.aggregate([ { $match: { productId: "p200" }}, { $unwind: "$vars" }, { $group: { _id: "result", count: { $sum: "$vars.q" } } }]) { "_id" : "result", "count" : 101752 } • Aggregate total quantity for a store db.inventory.aggregate([ { $match: { storeId: "store100" }}, { $unwind: "$vars" }, { $group: { _id: "result", count: { $sum: "$vars.q" } } }]) { "_id" : "result", "count" : 29347 } Inventory - Stores
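To make the match, unwind and group steps concrete, here is a small in-memory sketch of a total-quantity computation over documents shaped like the inventory model. The sample data and the `totalQuantity` name are invented for illustration; it sums the `q` field of every variation, which is an assumption about what "total quantity" means here.

```javascript
// Invented sample data in the shape of the inventory documents.
const sampleInventory = [
  { storeId: "store100", productId: "p200", vars: [{ sku: "sku1", q: 14 }, { sku: "sku3", q: 7 }] },
  { storeId: "store101", productId: "p200", vars: [{ sku: "sku1", q: 5 }] },
  { storeId: "store100", productId: "p201", vars: [{ sku: "sku9", q: 2 }] },
];

function totalQuantity(docs, match) {
  return docs
    // like $match: keep documents whose fields equal the given values
    .filter((doc) => Object.entries(match).every(([key, value]) => doc[key] === value))
    // like $unwind: one entry per variation
    .flatMap((doc) => doc.vars)
    // like $group with $sum on the quantity field
    .reduce((sum, variation) => sum + variation.q, 0);
}

console.log(totalQuantity(sampleInventory, { productId: "p200" }));  // 26
console.log(totalQuantity(sampleInventory, { storeId: "store100" })); // 23
```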
  • 43. 45 • Get inventory for an item near a point db.runCommand( { "geoNear" : "inventory" , "near" : [ -82.800672 , 40.090844] , "maxDistance" : 10.0 , "spherical" : true, limit: 10, query: { productId: "p200", "vars.sku": "sku11736" } }) • Get closest store with available sku db.runCommand( { "geoNear" : "inventory" , "near" : [ -82.800672 , 40.090844] , "maxDistance" : 10.0 , "spherical" : true, limit: 10, query: { productId: "p200", vars: { $elemMatch: { "sku": "sku11736", q: { $gt: 0 } } } } }) Inventory - Stores
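The "closest store with available sku" lookup combines a distance sort with an element match on quantity. The following in-memory sketch shows the same logic outside the database. The store data and function names are invented, and the haversine distance is a stand-in for the server's spherical geo calculation rather than a description of its internals.

```javascript
// Great-circle distance in km between two [longitude, latitude] pairs.
function haversineKm([lng1, lat1], [lng2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const earthRadiusKm = 6371;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * earthRadiusKm * Math.asin(Math.sqrt(a));
}

// Mirrors the $elemMatch condition: same variation must match sku and q > 0.
function closestStoreWithSku(docs, near, productId, sku) {
  const matches = docs
    .filter((doc) => doc.productId === productId &&
      doc.vars.some((v) => v.sku === sku && v.q > 0))
    .map((doc) => ({ storeId: doc.storeId, km: haversineKm(near, doc.location) }))
    .sort((a, b) => a.km - b.km);
  return matches.length > 0 ? matches[0] : null;
}

// Invented sample data: storeA is nearest but has no stock of the sku.
const nearbyInventory = [
  { storeId: "storeA", productId: "p200", location: [-82.80, 40.09], vars: [{ sku: "sku11736", q: 0 }] },
  { storeId: "storeB", productId: "p200", location: [-82.90, 40.10], vars: [{ sku: "sku11736", q: 3 }] },
  { storeId: "storeC", productId: "p200", location: [-86.95444, 33.40178], vars: [{ sku: "sku11736", q: 10 }] },
];

const closest = closestStoreWithSku(nearbyInventory, [-82.800672, 40.090844], "p200", "sku11736");
console.log(closest.storeId); // storeB
```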
  • 49. 51 Sales & Fulfillment Sales & Fulfillment MongoDB Sales Transaction Shipping Tracking Return & Exchange Business Rule Audit Shopping Cart
  • 51. 53 Insight Insight MongoDB Advertising metrics Clickstream Recommendations Session Capture Activity Logging Geo Tracking Product Analytics Customer Insight Application Logs
  • 52. 54 • Many user activities can be of interest: – Search – Product view, like or wish – Shopping cart add / remove – Sharing on social network – Ad impression, Clickstream • Those will be used to compute: – Product Map (relationships, etc) – User Preferences – Recommendations – Trends Activity Logging – Data of interest
  • 53. 55 Activity logging - Architecture MongoDB HVDF API Activity Logging User History External Analytics: Hadoop, Spark, Storm, … User Preferences Recommendations Trends Product Map Apps Internal Analytics: Aggregation, MR All user activity is recorded MongoDB – Hadoop Connector Personalization
  • 55. 57 • You need to store and manage an incoming stream of data samples (views, impressions, orders, …) – High arrival rate of data from many sources – Variable schema of arriving data – You need to control retention period of data • You need to compute derivative data sets based on these samples – Aggregations and statistics based on data – Roll-up data into pre-computed reports and summaries • You need low latency access to up-to-date data (user history) – Flexible indexing of raw and derived data sets – Rich querying based on time + meta-data fields in samples Activity Logging – Problem statement
  • 56. 58 Activity logging - Requirements
      Requirement | MongoDB
      Ingestion of 100ks of writes / sec | Fast C++ process, multi-threads, multi-locks. Horizontal scaling via sharding. Sequential IO via time partitioning.
      Flexible schema | Dynamic schema, each document is independent. Data is stored in the same format and size as it is inserted.
      Fast querying on varied fields, sorting | Secondary Btree indexes can look up and sort the data in milliseconds.
      Easy clean up of old data | Deletes are typically as expensive as inserts. Time partitioning gives free deletes.
  • 57. 59 Activity Logging using HVDF HVDF (High Volume Data Feed): • Open source reference implementation of high volume writing with MongoDB • Rest API server written in Java with most popular libraries • Public project, issues can be logged • Can be run as-is, or customized as needed
  • 58. 60 Feed High volume data feed architecture Channel Sample Sample Sample Sample Source Source Processor Inline Processing Batch Processing Stream Processing The Channel is the sequence of data samples that a sensor sends into the platform. Sources send samples into the Channel Processors generate derivative Channels from other Channel data
  • 59. 61 HVDF -- High Volume Data Feed engine HVDF – Reference implementation REST Service API Processor Plugins Inline Batch Stream Channel Data Storage Raw Channel Data Aggregated Rollup T1 Aggregated Rollup T2 Query Processor Streaming spout Custom Stream Processing Logic Incoming Sample Stream POST /feed/channel/data GET /feed/channel/data?time=XXX&range=YYY Real-time Queries
  • 60. 62 { _id: ObjectId(), geoCode: 1, // used to localize write operations sessionId: "2373BB…", device: { id: "1234", type: "mobile/iphone", userAgent: "Chrome/34.0.1847.131" }, type: "VIEW|CART_ADD|CART_REMOVE|ORDER|…", // type of activity itemId: "301671", sku: "730223104376", order: { id: "12520185", … }, location: [ -86.95444, 33.40178 ], tags: [ "smartphone", "iphone", … ], // associated tags timeStamp: Date("2014/04/01 …") } User Activity - Model
  • 61. 63 Dynamic schema for sample data Sample 1 { deviceId: XXXX, time: Date(…), type: "VIEW", … } Channel Sample 2 { deviceId: XXXX, time: Date(…), type: "CART_ADD", cartId: 123, … } Sample 3 { deviceId: XXXX, time: Date(…), type: "FB_LIKE" } Each sample can have variable fields
  • 62. 64 Channels are sharded Shard Shard Shard Shard Shard Shard Key: Customer_id Sample { customer_id: XXXX, time: Date(…), type: "VIEW" } Channel You choose how to partition samples Samples can have dynamic schema Scale horizontally by adding shards Each shard is highly available
  • 63. 65 Channels are time partitioned Channel Sample Sample Sample Sample Sample Sample Sample Sample - 2 days - 1 Day Today Partitioning keeps indexes manageable This is where all of the writes happen Older partitions are read only for best possible concurrency Queries are routed only to needed partitions Partition 1 Partition 2 Partition N Each partition is a separate collection Efficient and space reclaiming purging of old data
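Since each partition is a separate collection, routing and purging reduce to computing collection names from timestamps. Here is a minimal sketch assuming daily partitions in UTC and a `channel_YYYYMMDD` naming scheme. Both the naming scheme and the function names are hypothetical and are not taken from HVDF.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed naming scheme: <channel>_<YYYYMMDD>, with days measured in UTC.
function partitionName(channel, date) {
  const day = date.toISOString().slice(0, 10).replace(/-/g, "");
  return channel + "_" + day;
}

// Partitions a time-range query has to touch; all others can be skipped.
function partitionsForRange(channel, from, to) {
  const names = [];
  let t = Math.floor(from.getTime() / DAY_MS) * DAY_MS; // UTC midnight
  for (; t <= to.getTime(); t += DAY_MS) {
    names.push(partitionName(channel, new Date(t)));
  }
  return names;
}

// Partitions older than the retention window; each can be dropped whole.
function expiredPartitions(existing, channel, now, retentionDays) {
  const cutoff = partitionName(channel, new Date(now.getTime() - retentionDays * DAY_MS));
  return existing.filter((name) => name.startsWith(channel + "_") && name < cutoff);
}

const today = new Date("2014-04-01T10:00:00Z");
console.log(partitionName("activity", today)); // activity_20140401
```

Dropping a whole collection is what makes the purge cheap and space-reclaiming, compared with deleting individual documents.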
  • 64. 66 Dynamic queries on Channels Channel Sample Sample Sample Sample App App App Indexes Queries Pipelines Map-Reduce Create custom indexes on Channels Use full MongoDB query language to access samples Use MongoDB aggregation pipelines to access samples Use MongoDB inline map-reduce to access samples Full access to field, text, and geo indexing
  • 65. 67 North America - West North America - East Europe Geographically distributed system Channel Sample Sample Sample Sample Source Source Source Source Source Source Sample Sample Sample Sample Geo shards per location Clients write local nodes Single view of channel available globally
  • 67. 69 Insight – Useful Data • Useful data for better shopping: – User history (e.g. recently seen products) – User statistics (e.g. total purchases, visits) – User interests (e.g. likes videogames and SciFi) – User social network – Cross-selling: people who bought this item had tendency to buy those other items (e.g. iPhone, then bought iPhone case) – Up-selling: people who looked at this item eventually bought those items (alternative product that may be better)
  • 68. 70 Example of real-time aggregation with Agg Framework User Activity – Computing User Stats
  • 70. 72 Let's simplify each activity recorded as the following: { userId: 123, type: order, itemId: 2, time } { userId: 123, type: order, itemId: 3, time } { userId: 234, type: order, itemId: 7, time } To calculate items bought by a user for a period of time, let's use MongoDB's Map Reduce: - Match activities of type "order" for the past 2 weeks - map: emit the document by userId - reduce: push all itemId in a list - Output looks like { _id: userId, items: [2, 3, 8] } User Activity – Items frequently bought together
  • 71. 73 Then run a 2nd mapreduce job that for each of the previous results: - map: emits every combination of 2 items, starting with lowest itemId - reduce: sum up the total. - output looks like { _id: { a: 2, b: 3 } , count: 36 } User Activity – Items frequently bought together
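The two map-reduce passes can be sketched in memory to show the data flow. This is an illustration with invented sample activities, not the deck's actual job. It de-duplicates a user's items with a `Set`, which is an assumption (the slide simply pushes item ids into a list), and it encodes each pair as a `"a:b"` string key with the lower item id first.

```javascript
// Invented sample activities in the simplified shape used on the slide.
const sampleActivities = [
  { userId: 123, type: "order", itemId: 2 },
  { userId: 123, type: "order", itemId: 3 },
  { userId: 234, type: "order", itemId: 7 },
  { userId: 234, type: "order", itemId: 2 },
  { userId: 234, type: "order", itemId: 3 },
  { userId: 345, type: "view", itemId: 2 }, // ignored: not an order
];

// Pass 1: group ordered item ids by user.
function itemsByUser(activities) {
  const byUser = new Map();
  for (const activity of activities) {
    if (activity.type !== "order") continue;
    if (!byUser.has(activity.userId)) byUser.set(activity.userId, new Set());
    byUser.get(activity.userId).add(activity.itemId);
  }
  return byUser;
}

// Pass 2: emit every pair of items per user (lowest id first) and count them.
function pairCounts(byUser) {
  const counts = new Map();
  for (const items of byUser.values()) {
    const sorted = [...items].sort((x, y) => x - y);
    for (let i = 0; i < sorted.length; i++) {
      for (let j = i + 1; j < sorted.length; j++) {
        const key = sorted[i] + ":" + sorted[j];
        counts.set(key, (counts.get(key) || 0) + 1);
      }
    }
  }
  return counts;
}

const frequentPairs = pairCounts(itemsByUser(sampleActivities));
console.log(frequentPairs.get("2:3")); // 2
```

Ordering each pair with the lowest id first means that (2, 3) and (3, 2) land on the same key, so each co-purchase is counted once.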
  • 72. 74 The output collection can then be queried per item id, sorted by count, and cut off at a threshold. This requires indexes on { "_id.a", count } and { "_id.b", count }. You then obtain an affiliation collection with docs like: { itemId: 2, affil: [ { id: 3, weight: 36}, { id: 8, weight: 23} ] } User Activity – Items frequently bought together
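As an illustration of that last step, the sketch below derives one affiliation document from pair-count documents shaped like the second job's output. The sample counts, the `affiliationFor` name and the threshold value are invented; only the document shapes follow the slide.

```javascript
// Invented pair counts in the shape { _id: { a, b }, count }.
const samplePairs = [
  { _id: { a: 2, b: 3 }, count: 36 },
  { _id: { a: 2, b: 8 }, count: 23 },
  { _id: { a: 3, b: 8 }, count: 4 },
];

// Looks the item up on either side of the pair, applies the cutoff,
// and sorts the related items by descending weight.
function affiliationFor(pairs, itemId, threshold) {
  const affil = pairs
    .filter((p) => (p._id.a === itemId || p._id.b === itemId) && p.count >= threshold)
    .map((p) => ({ id: p._id.a === itemId ? p._id.b : p._id.a, weight: p.count }))
    .sort((x, y) => y.weight - x.weight);
  return { itemId, affil };
}

console.log(JSON.stringify(affiliationFor(samplePairs, 2, 10)));
// {"itemId":2,"affil":[{"id":3,"weight":36},{"id":8,"weight":23}]}
```

Because an item can appear as either `a` or `b`, the lookup has to check both sides, which is why both indexes are needed.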
  • 73. 75 Example of Hadoop integration User Activity – Hadoop integration
  • 75. 77 Social Social MongoDB Social Channels User Network Activity Chat Social Profiles Community Mgt Rewards / Gamification
  • 79. 84 West DC Primary Primary Primary Shard “West” Shard “Center” Shard “East” Center DC East DC Primary node replicates data to all secondaries in the shard as fast as possible Single View of Product Cluster Topology
  • 80. 85 West DC Primary Primary Primary Shard “West” Shard “Center” Shard “East” Center DC East DC Center Shard contains all the data for stores in Center region Single View of Product Cluster Topology
  • 81. 86 West DC Primary Primary Primary Shard “West” Shard “Center” Shard “East” Center DC East DC Center Shard contains all the data for stores in Center region Local writes enable very high throughput of updates Single View of Product Cluster Topology
  • 82. 87 West DC Primary Primary Primary Shard “West” Shard “Center” Shard “East” Center DC East DC Each region is able to see the data of all stores from its “local” DC. Single View of Product Cluster Topology
  • 83. 88 West DC Primary Primary Primary Shard “West” Shard “Center” Shard “East” Center DC East DC Two nodes in each DC for painless maintenance with zero downtime Single View of Product Cluster Topology
  • 84. 89 West DC Primary Primary Primary Shard “West” Shard “Center” Shard “East” Center DC East DC Even if a DC goes out, the database remains fully available thanks to automated failover Single View of Product Cluster Topology
  • 85. 90 West DC Primary Primary Primary Shard “West” Shard “Center” Shard “East” Center DC East DC Data set can grow, shards can add up, without any rewrite of the application code Single View of Product Cluster Topology
  • 86. Thank You! Antoine Girbal Senior Solutions Engineer, MongoDB Inc. @antoinegirbal

Editor's Notes

  1. Fix stream box. Add validator box.
  2. Would be useful to have diagram that mixes shards and time partitions