2. What to Expect from the Session
• Brief History of Data Processing (Why NoSQL?)
• Overview of DynamoDB
• Scaling characteristics, hot keys, and design considerations
• NoSQL Data Modeling
• Normalized versus Unstructured schema
• Common NoSQL Design Patterns
• Time Series, Write Sharding, MVCC, etc.
• DynamoDB in the Serverless Ecosystem
4. Data volume since 2011
DataVolume
Historical Current
90 percent of stored data
generated in last 2 years
1 terabyte of data in 2011
equals 6.5 petabytes today
Linear correlation between
data pressure and technical
innovation
No reason these trends will
not continue over time
6. Why NoSQL?
Optimized for storage Optimized for compute
Normalized/relational Denormalized/hierarchical
Ad hoc queries Instantiated views
Scale vertically Scale horizontally
Good for OLAP Built for OLTP at scale
SQL NoSQL
7. Amazon DynamoDB
Document or Key-Value Scales to Any WorkloadFully Managed NoSQL
Access Control Event Driven ProgrammingFast and Consistent
9. 00 55 A954 FFAA00 FF
Partition Keys
Id = 1
Name = Jim
Hash (1) = 7B
Id = 2
Name = Andy
Dept = Eng
Hash (2) = 48
Id = 3
Name = Kim
Dept = Ops
Hash (3) = CD
Key Space
Partition Key uniquely identifies an item
Partition Key is used for building an unordered hash index
Allows table to be partitioned for scale
10. Partition 3
Partition:Sort Key uses two attributes together to uniquely identify an Item
Within unordered hash index, data is arranged by the sort key
No limit on the number of items (∞) per partition key
Except if you have local secondary indexes
Partition:Sort Key
00:0 FF:∞
Hash (2) = 48
Customer# = 2
Order# = 10
Item = Pen
Customer# = 2
Order# = 11
Item = Shoes
Customer# = 1
Order# = 10
Item = Toy
Customer# = 1
Order# = 11
Item = Boots
Hash (1) = 7B
Customer# = 3
Order# = 10
Item = Book
Customer# = 3
Order# = 11
Item = Paper
Hash (3) = CD
55 A9:∞54:∞ AAPartition 1 Partition 2
11. Partitions are three-way replicated
Id = 2
Name = Andy
Dept = Engg
Id = 3
Name = Kim
Dept = Ops
Id = 1
Name = Jim
Id = 2
Name = Andy
Dept = Engg
Id = 3
Name = Kim
Dept = Ops
Id = 1
Name = Jim
Id = 2
Name = Andy
Dept = Engg
Id = 3
Name = Kim
Dept = Ops
Id = 1
Name = Jim
Replica 1
Replica 2
Replica 3
Partition 1 Partition 2 Partition N
12. Eventually Consistent Reads
Partition A
1000 RCUs
Partition A
1000 RCUs
Partition A
1000 RCUs
Host A Host B Host C
Rand(1,3)
Facility 1 Facility 2 Facility 3
13. Strongly Consistent Reads
Partition A
1000 RCUs
Partition A
1000 RCUs
Partition A
1000 RCUs
Primary
Host A Host B Host C
Locate Primary
Facility 1 Facility 2 Facility 3
15. Local secondary index (LSI)
Alternate sort key attribute
Index is local to a partition key
A1
(partition)
A3
(sort)
A2
(item key)
A1
(partition)
A2
(sort)
A3 A4 A5
LSIs
A1
(partition)
A4
(sort)
A2
(item key)
A3
(projected)
Table
KEYS_ONLY
INCLUDE A3
A1
(partition)
A5
(sort)
A2
(item key)
A3
(projected)
A4
(projected)
ALL
10 GB max per partition key, i.e.
LSIs limit the # of range keys!
16. Global secondary index (GSI)
Alternate partition and/or sort key
Index is across all partition keys
Use composite sort keys for compound indexes
A1
(partition)
A2 A3 A4 A5
A5
(partition)
A4
(sort)
A1
(item key)
A3
(projected)
INCLUDE A3
A4
(partition)
A5
(sort)
A1
(item key)
A2
(projected)
A3
(projected)
ALL
A2
(partition)
A1
(itemkey)
KEYS_ONLY
GSIs
Table
RCUs/WCUs provisioned
separately for GSIs
Online indexing
17. How do GSI updates work?
Table
Primary
table
Primary
table
Primary
table
Primary
table
Global
Secondary
Index
Client
2. Asynchronous
update (in progress)
If GSIs don’t have enough write capacity, table writes will be throttled!
18. LSI or GSI?
LSI can be modeled as a GSI
If data size in an item collection > 10 GB, use GSI
If eventual consistency is okay for your scenario,
use GSI!
20. Throughput
Provisioned at the table level
Write capacity units (WCUs) are measured in 1 KB per second
Read capacity units (RCUs) are measured in 4 KB per second
- RCUs measure strictly consistent reads
- Eventually consistent reads cost 1/2 of consistent reads
Read and write throughput limits are independent
WCURCU
21. Partitioning Math
In the future, these details might change…
Number of Partitions
By Capacity (Total RCU / 3000) + (Total WCU / 1000)
By Size Total Size / 10 GB
Total Partitions CEILING(MAX (Capacity, Size))
22. Partitioning Example
Table size = 8 GB, RCUs = 5000, WCUs = 500
RCUs per partition = 5000/3 = 1666.67
WCUs per partition = 500/3 = 166.67
Data/partition = 10/3 = 3.33 GB
RCUs and WCUs are uniformly
spread across partitions
Number of Partitions
By Capacity (5000 / 3000) + (500 / 1000) = 2.17
By Size 8 / 10 = 0.8
Total Partitions CEILING(MAX (2.17, 0.8)) = 3
23. What causes throttling?
If sustained throughput goes beyond provisioned throughput per partition
Non-uniform workloads
Hot keys/hot partitions
Very large items
Mixing hot data with cold data
Use a table per time period
From the example before:
Table created with 5000 RCUs, 500 WCUs
RCUs per partition = 1666.67
WCUs per partition = 166.67
If sustained throughput > (1666 RCUs or 166 WCUs) per key or partition, DynamoDB
may throttle requests
- Solution: Increase provisioned throughput
25. Getting the most out of DynamoDB throughput
“To get the most out of DynamoDB throughput, create tables where the
partition key element has a large number of distinct values, and values
are requested fairly uniformly, as randomly as possible.”
—DynamoDB Developer Guide
Space: access is evenly spread over the key-space
Time: requests arrive evenly spaced in time
30. 1:1 relationships or key-values
Use a table or GSI with an alternate partition key
Use GetItem or BatchGetItem API
Example: Given an SSN or license number, get attributes
Users Table
Partiton key Attributes
SSN = 123-45-6789 Email = johndoe@nowhere.com, License = TDL25478134
SSN = 987-65-4321 Email = maryfowler@somewhere.com, License = TDL78309234
Users-License-GSI
Partition key Attributes
License = TDL78309234 Email = maryfowler@somewhere.com, SSN = 987-65-4321
License = TDL25478134 Email = johndoe@nowhere.com, SSN = 123-45-6789
31. 1:N relationships or parent-children
Use a table or GSI with partition and sort key
Use Query API
Example:
• Given a device, find all readings between epoch X, Y
Device-measurements
Partition Key Sort key Attributes
DeviceId = 1 epoch = 5513A97C Temperature = 30, pressure = 90
DeviceId = 1 epoch = 5513A9DB Temperature = 30, pressure = 90
32. N:M relationships
Use a table and GSI with partition and sort key elements
switched
Use Query API
Example: Given a user, find all games. Or given a game,
find all users.
User-Games-Table
Partition Key Sort key
UserId = bob GameId = Game1
UserId = fred GameId = Game2
UserId = bob GameId = Game3
Game-Users-GSI
Partition Key Sort key
GameId = Game1 UserId = bob
GameId = Game2 UserId = fred
GameId = Game3 UserId = bob
34. Hierarchical data structures as items
Use composite sort key to define a hierarchy
Highly selective result sets with sort queries
Index anything, scales to any size
Primary Key
Attributes
ProductID type
Items
1 Book
title author genre publisher datePublished ISBN
Ringworld Larry Niven Science Fiction Ballantine Oct-70 0-345-02046-4
2
Album
title artist genre label studio released producer
Dark Side of the Moon Pink Floyd Progressive Rock Harvest Abbey Road 3/1/73 Pink Floyd
Track-0
title length music vocals
Speak to Me 1:30 Mason Instrumental
Track-1
title length music vocals
Breathe 2:43 Waters, Gilmour, Wright Gilmour
Track-2
title length music vocals
On the Run 3:30 Gilmour, Waters Instrumental
3
Movie
title genre writer producer
Idiocracy Scifi Comedy Mike Judge 20th Century Fox
Actor-0
name character image
Luke Wilson Joe Bowers img2.jpg
Actor-1
name character image
Maya Rudolph Rita img3.jpg
Actor-2
name character image
Dax Shepard Frito Pendejo img1.jpg
35. … or as documents (JSON)
JSON data types (M, L, BOOL, NULL)
Document SDKs available
Indexing only by using DynamoDB Streams or AWS Lambda
400 KB maximum item size (limits hierarchical data structure)
Primary Key
Attributes
ProductID
Items
1
id title author genre publisher datePublished ISBN
bookID Ringworld Larry Niven Science Fiction Ballantine Oct-70 0-345-02046-4
2
id title artist genre Attributes
albumID
Dark Side of the
Moon
Pink Floyd Progressive Rock
{ label:"Harvest", studio: "Abbey Road", published: "3/1/73", producer: "Pink
Floyd", tracks: [{title: "Speak to Me", length: "1:30", music: "Mason", vocals:
"Instrumental"},{title: ”Breathe", length: ”2:43", music: ”Waters, Gilmour,
Wright", vocals: ”Gilmour"},{title: ”On the Run", length: “3:30", music: ”Gilmour,
Waters", vocals: "Instrumental"}]}
3
id title genre writer Attributes
movieID Idiocracy Scifi Comedy Mike Judge
{ producer: "20th Century Fox", actors: [{ name: "Luke Wilson", dob: "9/21/71",
character: "Joe Bowers", image: "img2.jpg"},{ name: "Maya Rudolph", dob:
"7/27/72", character: "Rita", image: "img1.jpg"},{ name: "Dax Shepard", dob:
"1/2/75", character: "Frito Pendejo", image: "img3.jpg"}]
44. Trade off read cost for write scalability
Consider throughput per partition key
Shard write-heavy partition keys
Your write workload is not horizontally
scalable
46. Active Tickets Table
Customer
(Partition)
Ticket
(Sort)
GSIKey
(0-N)
…. Timestamp
Time Based Workflows
Expired Tickets GSI
GSIKey
(Partition)
Timestamp
(Sort)
Daily Archive Table
GSIKey
(Partition)
Timestamp
(Sort)
Attribute1 …. Attribute N
RCUs = 10000
WCUs = 10000
RCUs = 100
WCUs = 1
Current table
HotdataColddata
Scatter query GSI for any item on the table matching the sort condition
47. TTL
job
Time-To-Live (TTL)
Amazon DynamoDB
Table
CustomerActiveO
rder
OrderId: 1
CustomerId: 1
MyTTL:
1492641900
DynamoDB Streams
Stream
Amazon
Kinesis
Amazon
Redshift
An epoch timestamp
marking when an item can
be deleted by a
background process,
without consuming any
provisioned capacity
Time-To-Live
Removes data that is no longer relevant
48. Use Lambda to execute workflows on
expired items
Use a write sharded GSI to selectively query items across
the table space
Create a Lambda “stored procedure” to execute workflows
Creating dense indexes
50. Partition 1
2000 RCUs
Partition K
2000 RCUs
Partition M
2000 RCUs
Partition 50
2000 RCU
Scaling bottlenecks
Product A Product B
Shoppers
ProductCatalog Table
SELECT Id, Description, ...
FROM ProductCatalog
WHERE Id="POPULAR_PRODUCT"
51.
52. Partition 1 Partition 2
ProductCatalog Table
User
DynamoDB
User
Cache
popular items
SELECT Id, Description, ...
FROM ProductCatalog
WHERE Id="POPULAR_PRODUCT"
55. Secondary index
Opponent Date GameId Status Host
Alice 2014-10-02 d9bl3 DONE David
Carol 2014-10-08 o2pnb IN_PROGRESS Bob
Bob 2014-09-30 72f49 PENDING Alice
Bob 2014-10-03 b932s PENDING Carol
Bob 2014-10-03 ef9ca IN_PROGRESS David
BobPartition key Sort key
Multivalue sorts and filters
56. Secondary Index
Approach 1: Query filter
Bob
Opponent Date GameId Status Host
Alice 2014-10-02 d9bl3 DONE David
Carol 2014-10-08 o2pnb IN_PROGRESS Bob
Bob 2014-09-30 72f49 PENDING Alice
Bob 2014-10-03 b932s PENDING Carol
Bob 2014-10-03 ef9ca IN_PROGRESS David
SELECT * FROM Game
WHERE Opponent='Bob'
ORDER BY Date DESC
FILTER ON Status='PENDING'
(filtered out)
58. Secondary Index
Approach 2: Composite key
Opponent StatusDate GameId Host
Alice DONE_2014-10-02 d9bl3 David
Carol IN_PROGRESS_2014-10-08 o2pnb Bob
Bob IN_PROGRESS_2014-10-03 ef9ca David
Bob PENDING_2014-09-30 72f49 Alice
Bob PENDING_2014-10-03 b932s Carol
Partition key Sort key
59. Opponent StatusDate GameId Host
Alice DONE_2014-10-02 d9bl3 David
Carol IN_PROGRESS_2014-10-08 o2pnb Bob
Bob IN_PROGRESS_2014-10-03 ef9ca David
Bob PENDING_2014-09-30 72f49 Alice
Bob PENDING_2014-10-03 b932s Carol
Secondary index
Approach 2: Composite key
Bob
SELECT * FROM Game
WHERE Opponent='Bob'
AND StatusDate BEGINS_WITH 'PENDING'
60. Concatenate attributes to form useful
secondary index keys
Take advantage of sparse indexes
Use composite keys to model data
You want to optimize a query as much
as possible
Status + Date
63. How OLTP Apps Use Data
Mostly Hierarchical
Structures
Entity Driven Workflows
Data Spread Across Tables
Requires Complex Queries
Primary driver for ACID
64. Reading item partitions
Get all items for a product
assembly
SELECT *
FROM Assemblies
WHERE ID=1
Order
Item
Picker
ID PartID Vendor Bin
1
1 ACME 27Z19
2 ACME 39M97
3 ACME 75B25
(Many more assemblies)
Assemblies table
Partition key Sort key
65. ID PartID Vendor Bin
1
1 ACME 27Z19
2 ACME 39M97
3 ACME 75B25
4 UAC 53G56
5 UAC 64B17
6 UAC 48J19
Updates are problematic
(Many more assemblies)
Assemblies table
Old items
SELECT *
FROM Assemblies
WHERE ID=1
Order Item Picker
New items
66. ID PartID Vendor Bin
1
1 ACME 27Z19
2 ACME 39M97
3 ACME 75B25
4 UAC 53G56
5 UAC 64B17
6 UAC 48J19
Creating partition “locks”
(Many more assemblies)
Assemblies table
• Use metadata item for versioning
SET #ItemList.i = list append(:newList, #ItemList.i ), #Lock =
1
• Obtain locks with conditional writes
--condition-expression “Lock = 0”
• Remove old items or add version info to Sort Key
Query composite keys for specific versions
• Readers can determine how to handle “fat” reads
• Add additional metadata to manage transactional
workflow as needed
Current State, Timeout, etc.
ID PartID ItemList Lock
1 0 {i:[[1,2,3],[4,5,6]]} 0
67. Use item partitions to manage transactional
workflows
Manage versioning across items with metadata
Tag attributes to maintain multiple versions
Code your app to recognize when updates are in progress
Implement app layer error handling and recovery logic
Transactional writes across items is
required
68. • Partition table on nodeId, add edges
to define adjacency list
• Define a default edge for every node
type to describe the node itself
• Use partitioned GSI’s to query large
nodes (dates, places, etc)
• Use Streams/Lambda/EMR for
graph query projections
• Neighbor Entity State
• Subtree Aggregations
• Breadth First Search
• Node Ranking
Materialized Graph Modeling
GSI Primary Key Attributes
GSIkey Data Target Type Node Projection
0-N
Jason Bourne 1
Person
1 …
James Bond 4 4 …
20170418 2
Birthdate
1 …
4 …
Date 2 …
Finland 3
Birthplace
1 …
4 …
Place 3 …
GSI Primary Key Attributes
GSIkey Type Target Data Node Projection
0-N
Person
1 Jason Bourne 1 …
4 James Bond 4 …
Birthdate 2 20170418
1 …
4 …
Birthplace 3 Finland
1 …
4 …
Date 2 20170418 2 …
Place 3 Finland 3 …
Table Primary Key Attributes
Node Type Target Data GSIkey Projection
1
Person 1 Jason Bourne
HASH(Person.Data)
Edge/spanning
tree rollups
Birthdate 2 20170418
Birthplace 3 Finland
2 Date 2 20170418
HASH(Data) (Summary Stats)
3 Place 3 Finland
4
Person 4 James Bond
HASH(Person.Data
Edge/spanning
tree rollups
Birthdate 2 20170418
Birthplace 3 Finland
consistent, single-digit millisecond latency at any scale
Make more splashy, add Icons from detail page
Did you know EMR will let you put a read ratio of 1.5?
Did you know EMR will let you put a read ratio of 1.5?
Think of this as a parallel table asynchronously populated by DynamoDB
Eventually consistent
1 Table update = 0, 1 or 2 GSI updates
Picture from: http://www.amazon.com/Black-Aluminum-Control-Amplifier-Wheel/dp/B005HU1ZHA
Scan and Query
Cumulative size of processed items – ceiling (Ʃ(item sizes)/4KB)
Batch GetItem
ceiling [(Ʃ(item1 size)/4KB) + …ceiling (Ʃ(itemN size)/4KB)]
Consumed throughput is measured per operation
Provisioned throughput is divided between all partitions
Expand on what are WCU and RCU?
Clean up fonts and equations
Uneven access across key space, result in throttling
Often some items in your table are accessed more frequently than others. For example, this graph illustrates how many requests per second were made for each item in your table. For collections like a product catalog, some items are substantially more popular than others. The same probably goes for tweets sent from a celebrity, and that sort of thing.
Displaying those “hot items” on everyone’s twitter feed is problematic since they cause an uneven request distribution.
(ask audience) Can anyone think of some things that we can do to “cool off” those hot items?
One thing we can do is cache those reads in the application.
Since a tweet doesn’t change once you post it, you can cache it in memory or in something like Amazon ElastiCache.
This is a technique used in high throughput applications with a traditional database as well.