DynamoDB In-depth & Developer Drill Down

Peter-Mark Verwoerd
Solutions Architect
Ace Hotel, New York. May 22nd, 2014

Overview
• Local Secondary Indexes
• Global Secondary Indexes
• Design Patterns
– User Data
• Demo
• Break
• Design Patterns (continued…)
– Game State
– Save Games
– Global Leaderboard
– High throughput voting
• Data design patterns

Local Secondary Indexes
• Alternate Range Key for your table
• More flexible Query patterns
• Local to the Hash Key
• With local secondary indexes (LSI), index and table data are co-located (same partition)

Use case for Local Secondary Indexes
• Find the recent DynamoDB forum posts
• Table sorted by range key only
Forum Subject LastReplyTime Views Replies Answered
S3 How to set permissions? 2013-04-02 100 20 1
DynamoDB Creating secondary indexes? 2013-02-12 100 20 0
DynamoDB I get an error 2012-11-05 98 3 1
DynamoDB Setting row permissions 2012-06-17 100 8 0
DynamoDB Signature not working 2012-03-28 12 1 1
DynamoDB Transaction support 2013-04-01 5 10 0

Use case for Local Secondary Indexes
• Create a local secondary index on LastReplyTime
Forum LastReplyTime Subject Views Replies Answered
S3 2013-04-02 How to set permissions? 100 20 1
DynamoDB 2012-03-28 Signature not working 12 1 1
DynamoDB 2012-06-17 Setting row permissions 100 8 0
DynamoDB 2012-11-05 I get an error 98 3 1
DynamoDB 2013-02-12 Creating secondary indexes? 100 20 0
DynamoDB 2013-04-01 Transaction support 5 10 0
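
A minimal sketch of how the "recent posts" lookup above could be expressed as a Query against the index. The table name `ForumPost` and the index name `LastReplyTime-index` are assumptions made for illustration; the function only builds the request parameters, which would then be passed to the `dynamodb` client used in the code examples later in this deck.

```javascript
// Builds Query parameters for the most recent posts in one forum,
// read through a local secondary index on LastReplyTime.
// 'ForumPost' and 'LastReplyTime-index' are assumed names.
function buildRecentPostsQuery(forum, sinceDate, limit) {
  return {
    TableName: 'ForumPost',
    IndexName: 'LastReplyTime-index',
    KeyConditionExpression: '#forum = :forum AND #time >= :since',
    ExpressionAttributeNames: {
      '#forum': 'Forum',
      '#time': 'LastReplyTime'
    },
    ExpressionAttributeValues: {
      ':forum': { S: forum },
      ':since': { S: sinceDate }
    },
    ScanIndexForward: false, // descending range key order: newest first
    Limit: limit
  };
}

const recentPostsParams = buildRecentPostsQuery('DynamoDB', '2013-01-01', 10);
console.log(JSON.stringify(recentPostsParams, null, 2));
// dynamodb.query(recentPostsParams, callback);
```

Because the hash key is still Forum, the Query stays within a single hash key value; only the sort order changes.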

Write example (behind the scenes)
• Updating the LastReplyTime for a Post
– from “2013-03-17” to “2013-04-02”
DynamoDB
ForumPost
Partition 1
UpdateItem
ReplyTime
Index
Table
ForumPost
Partition 2

• Update the attribute(s) in the item in the table
• Update the attribute(s) in the index if necessary

– Update “How..” date from 2 to 5
Table Index
Forum Q’n Date
S3 Ask… 1
DDB Ask… 5
DDB Help.. 1
DDB How… 2
DDB Using… 3
Forum Date Q’n
S3 1 Ask...
DDB 1 Help…
DDB 2 How…
DDB 3 Using…
DDB 5 Ask…

• Update the attribute(s) in the index
Table Index
Forum Q’n Date
S3 Ask… 1
DDB Ask… 5
DDB Help.. 1
DDB How… 2
DDB Using… 3
Forum Date Q’n
S3 1 Ask...
DDB 1 Help…
DDB 2 How…
DDB 3 Using…
DDB 5 Ask…
DDB 5 How…

Local Secondary Index Projections
Table: User (hash), File (range), Date, Type, Size, S3Key
• Date-index (KEYS_ONLY): User (hash), Date (range), File (key)
• Type-index (INCLUDE Date): User (hash), Type (range), File (key), Date (projected)
• Size-index (ALL): User (hash), Size (range), File (key), Date (projected), Type (projected), S3Key (projected)

Projections
• Pick which attributes are “copied” into the index
• Pros:
– Improves Query performance when querying projected attributes
• Cons:
– Increases write cost when:
• Projected attributes are frequently updated
• Projected attributes are > 1KB

Provisioned throughput cost (reads)
• If querying only for projected attributes:
– Query costs the same as a Query on a table
• If querying for non-projected attributes
– Query costs the same as a Query on a table
– Plus, the cost of retrieving each item from the table independently
• (similar to Query + BatchGetItem)
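
The read-cost rule above can be sketched as a rough estimate. This assumes strongly consistent reads at 4 KB per read capacity unit and uses illustrative item sizes; the function names are hypothetical and the result is an approximation, not a billing formula.

```javascript
const RCU_BYTES = 4 * 1024; // one read capacity unit covers up to 4 KB

// Cost of the index Query itself: total bytes read, rounded up to 4 KB units.
function indexQueryCost(itemCount, indexItemBytes) {
  return Math.ceil((itemCount * indexItemBytes) / RCU_BYTES);
}

// Extra cost when non-projected attributes are requested: each matching item
// is retrieved from the table independently, so each one rounds up on its own.
function fetchCost(itemCount, tableItemBytes) {
  return itemCount * Math.ceil(tableItemBytes / RCU_BYTES);
}

function estimateQueryCost(itemCount, indexItemBytes, tableItemBytes, needsFetch) {
  const base = indexQueryCost(itemCount, indexItemBytes);
  return needsFetch ? base + fetchCost(itemCount, tableItemBytes) : base;
}

// 10 matching items, 40-byte index entries, 600-byte table items (assumed sizes).
console.log(estimateQueryCost(10, 40, 600, false)); // 1
console.log(estimateQueryCost(10, 40, 600, true));  // 11
```

The gap between the two results is why projecting the attributes a query needs can pay for itself on read-heavy indexes.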

Queries that Fetch
• Index: Project KEYS_ONLY
• Query: (“DDB”, “Date >= 3”, “ALL_ATTRIBUTES”)
Table Index
Forum Q’n Date Answered
S3 Ask… 1 1
DDB Ask… 5 0
DDB Help.. 1 1
DDB How… 2 1
DDB Using… 3 0
Forum Date Q’n
S3 1 Ask...
DDB 1 Help…
DDB 2 How…
DDB 3 Using…
DDB 5 Ask…

Queries that Fetch
• Index: Project KEYS_ONLY
• Query: (“DDB”, “Date >= 3”, “ALL_ATTRIBUTES”)
Table Index
Forum Q’n Date Answered
S3 Ask… 1 1
DDB Ask… 5 0
DDB Help.. 1 1
DDB How… 2 1
DDB Using… 3 0
Forum Date Q’n
S3 1 Ask...
DDB 1 Help…
DDB 2 How…
DDB 3 Using…
DDB 5 Ask…
1. Query Index
2. Fetch items

Queries that Fetch
DynamoDB
ForumPost
Partition 1
1. Query
ReplyTime
Index
Table
ForumPost
Partition 2
2. DynamoDB queries the index
3. DynamoDB fetches each item from the table

Sparse indexes
• “Unanswered” entries are very interesting
Forum Subject LastReplyTime Views Replies Answered
S3 How to set permissions? 2013-04-02 100 20 1
DynamoDB Creating secondary indexes? 2013-02-12 100 20 0
DynamoDB I get an error 2013-04-01 98 3 1
DynamoDB Setting row permissions 2013-04-01 100 8 0
DynamoDB Signature not working 2013-04-01 12 1 0
DynamoDB Using the SDK 2013-04-01 5 10 1

Sparse indexes
• The “Unanswered” index contains only unanswered replies
Forum Unanswered Subject LastReplyTime Views Replies
DynamoDB 1 Setting row permissions 2013-04-01 100 8
DynamoDB 1 Signature not working 2013-04-01 12 1
DynamoDB 1 Creating secondary indexes? 2013-02-12 100 20

Sparse indexes
• Tip: To get a useful sort order, populate Unanswered with the LastReplyTime value
Forum Unanswered Subject LastReplyTime Views Replies
DynamoDB 2013-02-12 Creating secondary indexes? 2013-02-12 100 20
DynamoDB 2013-04-01 Setting row permissions 2013-04-01 100 8
DynamoDB 2013-04-01 Signature not working 2013-04-01 12 1
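
A small sketch of how an application might keep such an index sparse: the `Unanswered` attribute is written only while a post has no answer, and it carries the last reply time so the index sorts usefully. The `buildPostItem` helper and the shape of its `post` argument are assumptions for illustration.

```javascript
// Builds the item for a forum post. Unanswered is present only while the post
// has no answer, so answered posts never appear in an index keyed on Unanswered.
function buildPostItem(post) {
  const item = {
    Forum: { S: post.forum },
    Subject: { S: post.subject },
    LastReplyTime: { S: post.lastReplyTime }
  };
  if (!post.answered) {
    // Reuse the reply time as the value to get a useful sort order.
    item.Unanswered = { S: post.lastReplyTime };
  }
  return item;
}

const openPost = buildPostItem({
  forum: 'DynamoDB',
  subject: 'Setting row permissions',
  lastReplyTime: '2013-04-01',
  answered: false
});
const closedPost = buildPostItem({
  forum: 'DynamoDB',
  subject: 'I get an error',
  lastReplyTime: '2013-04-01',
  answered: true
});
console.log('Unanswered' in openPost, 'Unanswered' in closedPost); // true false
```

When a post is later answered, removing the `Unanswered` attribute from the item takes it out of the index.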

Global Secondary Indexes
• Alternate Hash and/or Range Key for your table
• Even more flexible Query patterns

Global Secondary Index Projections
Urgent
(hash)
Id
(key)
Table
GSIs
INCLUDE
To
(hash)
Date
(range)
Id
(key)
Message
(projected)
From
(projected)
ALL
To
(hash)
From
(range)
Id
(key)
Id
(hash)
Date From To Message Urgent
From
(hash)
To
(range)
Id
(key) KEYS_ONLY
From
(hash)
Date
(range)
Id
(key)
To

GSI Query Pattern
• Query covered by GSI
– Query GSI & get the attributes
• Query not covered by GSI
– Query GSI to get the table key(s)
– BatchGetItem/GetItem from table
– 2 or more round trips to DynamoDB
Tip: If you need very low latency then project all required attributes into GSI
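
For the "not covered" case, the second round trip can be sketched as follows. The helper takes the items returned by the GSI Query, extracts the table key attributes, and builds BatchGetItem requests in chunks of 100 keys (the per-request limit). The table name `Messages` and the key attribute `Id` in the usage lines are assumptions.

```javascript
// BatchGetItem accepts at most 100 keys per request.
const MAX_BATCH_KEYS = 100;

// gsiItems: items returned by the GSI Query, in attribute-value form.
// keyAttributes: names of the table's primary key attributes.
function buildBatchGetRequests(tableName, gsiItems, keyAttributes) {
  const keys = gsiItems.map(function (item) {
    const key = {};
    keyAttributes.forEach(function (name) {
      key[name] = item[name];
    });
    return key;
  });
  const requests = [];
  for (let i = 0; i < keys.length; i += MAX_BATCH_KEYS) {
    const requestItems = {};
    requestItems[tableName] = { Keys: keys.slice(i, i + MAX_BATCH_KEYS) };
    requests.push({ RequestItems: requestItems });
  }
  return requests;
}

const gsiResult = [
  { To: { S: 'Bob' }, Date: { S: '2013-10-01' }, Id: { S: 'm1' } },
  { To: { S: 'Bob' }, Date: { S: '2013-10-02' }, Id: { S: 'm2' } }
];
const batchRequests = buildBatchGetRequests('Messages', gsiResult, ['Id']);
console.log(JSON.stringify(batchRequests));
// Each element would be passed to dynamodb.batchGetItem(request, callback).
```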

How do GSI updates work
Table
Primary
table
Primary
table
Primary
table
Primary
table
Global
Secondary
Index
Client
2. Asynchronous update (in progress)

1 Table update = 0, 1 or 2 GSI updates
Table operation and number of GSI updates:
• Item not in index before or after update: 0
• Update introduces a new indexed attribute: 1
• Update deletes the indexed attribute: 1
• Update changes the value of an indexed attribute from A to B: 2
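
The 0, 1 or 2 rule can be written down as a small function, which helps when estimating how much write capacity a GSI needs. This is a sketch: `undefined` stands for "attribute absent", and the unchanged-value branch is an assumption because the slide does not cover that case.

```javascript
// Returns how many GSI writes one table update causes, given the indexed
// attribute's value before and after the update (undefined = absent).
function gsiUpdateCount(oldIndexedValue, newIndexedValue) {
  const wasIndexed = oldIndexedValue !== undefined;
  const isIndexed = newIndexedValue !== undefined;
  if (!wasIndexed && !isIndexed) {
    return 0; // item is not in the index before or after
  }
  if (wasIndexed && isIndexed) {
    // Assumption: an unchanged key value with no other change to the
    // index entry needs no index write.
    return oldIndexedValue === newIndexedValue ? 0 : 2; // A to B: delete + put
  }
  return 1; // attribute introduced or deleted
}

console.log(gsiUpdateCount(undefined, undefined)); // 0
console.log(gsiUpdateCount(undefined, 'A'));       // 1
console.log(gsiUpdateCount('A', undefined));       // 1
console.log(gsiUpdateCount('A', 'B'));             // 2
```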

Local Secondary Index vs. Global Secondary Index
1. Key
– LSI: hash key and a range key
– GSI: hash or hash-and-range
2. Key attributes
– LSI: hash key is the same attribute as that of the table; range key can be any scalar table attribute
– GSI: the index hash key and range key (if present) can be any scalar table attributes
3. Size
– LSI: for each hash key, the total size of all indexed items must be 10 GB or less
– GSI: no size restrictions for global secondary indexes
4. Query scope
– LSI: query over a single partition, as specified by the hash key value in the query
– GSI: query over the entire table, across all partitions
5. Consistency
– LSI: eventual consistency or strong consistency
– GSI: eventual consistency only
6. Throughput
– LSI: read and write capacity units consumed from the table
– GSI: every global secondary index has its own provisioned read and write capacity units
7. Non-projected attributes
– LSI: query will automatically fetch non-projected attributes from the table
– GSI: query can only request projected attributes; it will not fetch any attributes from the table

LSI or GSI?
• LSI can be modeled as a GSI
• If data size in an item collection > 10 GB, use GSI
• If GSI will work for your scenario, use GSI! Keep in mind:
– 2 round trips (unless you project the attributes you need)
– Eventual consistency

Best Practices
• Provision enough throughput for GSI
– one update to the table may result in two writes to an index
• If GSIs do not have enough write capacity, table writes will eventually be
throttled down to what the "slowest" index can consume

User Data
Fine-grained access control

User Data
Users
Your App
Amazon
DynamoDB

User Data
Users
(Cost, Ops, Latency)
Your App
Amazon
DynamoDB

User Data
Users
Amazon
DynamoDB

User Data
Users
(Access control?)
Amazon
DynamoDB

Web Identity Federation
Users
AWS IAM
Web identity federation
Amazon
DynamoDB

Web Identity Federation
Users
AWS IAM
Web identity federation
(Fine-grained access control)
Amazon
DynamoDB

Fine-Grained Access Control
• Limit access to particular hash key values
• Limit access to specific attributes
• Use policy substitution variables to write the policy once

Images Table
User Image Date Link
Bob aed4c 2013-10-01 s3://…
Bob 5f2e2 2013-09-05 s3://…
Bob f93bae 2013-10-08 s3://…
Alice ca61a 2013-09-12 s3://…
“Allow all authenticated Facebook
users to Query the Images table,
but only on items where their
Facebook ID is the hash key”
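
As a sketch, the quoted rule could be written as an IAM policy like the one below, shown as a JavaScript object so it can be serialized with `JSON.stringify`. The region and account ID in the ARN are placeholders, and the policy is an illustration of the `dynamodb:LeadingKeys` condition with a Facebook substitution variable rather than a policy taken from the deck.

```javascript
// Policy for the role assumed through web identity federation.
// The region and account ID below are placeholders.
const imagesTablePolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Action: ['dynamodb:Query'],
      Resource: ['arn:aws:dynamodb:us-east-1:123456789012:table/Images'],
      Condition: {
        // Every hash key value in the request must equal the caller's Facebook ID.
        'ForAllValues:StringEquals': {
          'dynamodb:LeadingKeys': ['${graph.facebook.com:id}']
        }
      }
    }
  ]
};

console.log(JSON.stringify(imagesTablePolicy, null, 2));
```

The substitution variable is resolved per caller, which is what lets one policy serve every user.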

Images Table
Bob aed4c 2013-10-01 s3://…
Bob 5f2e2 2013-09-05 s3://…
Bob f93bae 2013-10-08 s3://…
Alice ca61a 2013-09-12 s3://…
Bob
Bob “logs in” using
web identity federation
AWS
IAM

Images Table
Bob aed4c 2013-10-01 s3://…
Bob 5f2e2 2013-09-05 s3://…
Bob f93bae 2013-10-08 s3://…
Alice ca61a 2013-09-12 s3://…
Bob
Bob can Query for Images
where User=“Bob”

Images Table
Bob aed4c 2013-10-01 s3://…
Bob 5f2e2 2013-09-05 s3://…
Bob f93bae 2013-10-08 s3://…
Alice ca61a 2013-09-12 s3://…
Bob
Bob cannot Query for Images
where User=“Alice”

Two-tier Architecture Tradeoffs
• Pros:
– Lower latency
– Lower cost
– Lower operational complexity
• Cons:
– Less visibility into application behavior
– More difficult to make changes to persistence layer
– Requires “scoping” items to a given user

Tagging App Query Patterns
• Image Table:
– How many votes does this image(URL) have?
– Does an item already exist for this URL?
• Tag Table:
– How many images are tagged with this tag?
• ImageTag table:
– All images with a given tag
– All tags for a given image
– How many votes does this tag have?

Image Table
Id DateAdded VoteCount
"http://tag-pics.s3.amazonaws.com/aws-icons/cloudsearch.png" "2014-05-06T05:50:06.371Z" 0
"http://tag-pics.s3.amazonaws.com/aws-icons/dynamodb.png" "2014-05-06T05:03:16.582Z" 3
Attribute Type Value
Id (Hash Key) String "http://tag-pics.s3.amazonaws.com/aws-icons/cloudsearch.png"
DateAdded String "2014-05-06T05:50:06.371Z"
VoteCount Number 0

Tag Table
Tag ImageCount
"new" 2
"database" 1
"nosql" 1
"cloudsearch" 1
"dynamodb" 1
Tag (Hash Key) String "database"
ImageCount Number 1

ImageTag Table
Tag ImageId LastUpdateTime VoteCount
"new" "http://tag-pics.s3.amazonaws.com/aws-icons/cloudsearch.png" "2014-05-06T05:50:06.371Z" 0
"new" "http://tag-pics.s3.amazonaws.com/aws-icons/dynamodb.png" "2014-05-06T05:50:36.452Z" 3
"database" "http://tag-pics.s3.amazonaws.com/aws-icons/dynamodb.png" "2014-05-06T05:03:51.964Z" 3
"nosql" "http://tag-pics.s3.amazonaws.com/aws-icons/dynamodb.png" "2014-05-06T05:03:45.489Z" 3
"cloudsearch" "http://sivar-pics.s3.amazonaws.com/aws-icons/cloudsearch.png" "2014-05-06T05:50:19.364Z" 0
"dynamodb" "http://sivar-pics.s3.amazonaws.com/aws-icons/dynamodb.png" "2014-05-06T05:03:35.655Z" 3
Tag (Hash Key) String "database"
ImageId (Range Key) String "http://tag-pics.s3.amazonaws.com/aws-icons/dynamodb.png"
LastUpdateTime String "2014-05-06T05:03:51.964Z"
VoteCount Number 3

ImageTag Table Indexes
Local Secondary Index
Index Name Hash Key Range Key Projected Attributes Index Size (Bytes)* Item Count*
VoteCount-index Tag (String) VoteCount (Number) All 369 3
Global Secondary Index
Index Name Hash Key Range Key Projected Attributes Status Read Capacity Units Write Capacity Units Last Decrease Time Last Increase Time Index Size (Bytes)* Item Count*
ImageId-index ImageId (String) Tag (String) Tag, ImageId Active 1 1 222 3

Conditional Update
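// Creates the Image item only if no item with this Id exists yet.
// callback(err, data) is supplied by the caller.
dynamodb.putItem({
    'TableName': 'Image',
    'Item': {
        'Id': {'S': imageId},
        'VoteCount': {'N': "0"},
        'DateAdded': {'S': dateStr}
    },
    // Condition: reject the write if an item with this Id is already present.
    'Expected': {
        'Id': {
            'Exists': false
        }
    }
}, callback);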

UpdateItem Increment
// Upserts a new Tag into the Tag table, incrementing its ImageCount.
function insertTag(tag, incrementBy, callback) {
    console.log("Insert / increment Tag. Tag: " + tag);
    dynamodb.updateItem({
        'TableName': 'Tag',
        'Key': {
            'Tag': {'S': tag}
        },
        'AttributeUpdates': {
            'ImageCount': {
                // ADD creates the attribute if it is missing, then increments it atomically.
                'Action': 'ADD',
                // Number values are sent as strings.
                'Value': {'N': String(incrementBy)}
            }
        }
    }, callback);
}

Summary: Image Tagging Demo
• Modeling applications on DynamoDB is similar to modeling with other databases
• Need to plan your schema and indexes around how you are going to query your
data

Basic Game State
Conditional Writes

Tic Tac Toe
Alice Bob
Your App
DynamoDB

Tic Tac Toe Table
Game Table
Id Players O State IsTie Winner Data
abecd [ Alice, Bob ] Alice DONE 1 …
fbdcc [ Alice, Bob ] Alice DONE Alice …
dbace [ Alice, Bob ] Alice STARTED …

Tic Tac Toe Table
{
"Data" : [ [ "X", null, "O" ],
[ null, "O", null],
[ "O", null, "X" ]
]
}

State Transitions with Conditional Writes
Alice Bob
DynamoDB

DynamoDB
UpdateItem:
Top-Right = O
Turn = Bob
Alice Bob

DynamoDB
UpdateItem:
Top-Left = X
Turn = Alice
Alice Bob

Alice Bob (1)
DynamoDB
Bob (2) Bob (3)

Bob (1)
Bob (2)
DynamoDB
Bob (3)
State : STARTED,
Turn : Bob,
Top-Right : O

Bob (1)
Bob (2)
DynamoDB
Bob (3)
Update:
Turn : Alice
Top-Left : X
Update:
Turn : Alice
Low-Right : X
Update:
Turn : Alice
Mid : X
State : STARTED,
Turn : Bob,
Top-Right : O

Bob (1)
Bob (2)
DynamoDB
Bob (3)
Update:
Turn : Alice
Top-Left : X
Update:
Turn : Alice
Low-Right : X
Update:
Turn : Alice
Mid : X
State : STARTED,
Turn : Alice,
Top-Right : O,
Top-Left : X,
Mid: X,
Low-Right: X

Conditional Writes
• Apply an update only if values are as expected
• Otherwise reject the write

Conditional Writes
{
Id : abecd,
Players : [ Alice, Bob ],
State : STARTED,
Turn : Bob,
Top-Right: O
}
UpdateItem Id=abecd
Game Item Updates: {
Turn : Alice,
Top-Left: X
}
Expected: {
Turn : Bob,
Top-Left : null,
State : STARTED
}
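
The conditional update above can be sketched as an UpdateItem request in the same parameter style as the earlier code examples (`AttributeUpdates` and `Expected`). The table name `Game` and the helper `buildMoveUpdate` are assumptions; the function only builds the parameters.

```javascript
// Builds an UpdateItem request that places a mark and passes the turn,
// but only if it is this player's turn, the square is empty,
// and the game is still in progress.
function buildMoveUpdate(gameId, player, nextPlayer, square, mark) {
  const updates = {
    Turn: { Action: 'PUT', Value: { S: nextPlayer } }
  };
  updates[square] = { Action: 'PUT', Value: { S: mark } };

  const expected = {
    Turn: { Value: { S: player } },
    State: { Value: { S: 'STARTED' } }
  };
  expected[square] = { Exists: false }; // square must not be taken yet

  return {
    TableName: 'Game',
    Key: { Id: { S: gameId } },
    AttributeUpdates: updates,
    Expected: expected
  };
}

const moveParams = buildMoveUpdate('abecd', 'Bob', 'Alice', 'Top-Left', 'X');
console.log(JSON.stringify(moveParams, null, 2));
// dynamodb.updateItem(moveParams, callback);
```

If any expectation fails, DynamoDB rejects the write with a conditional check failure and the item is left unchanged.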

Bob (1)
Bob (2)
DynamoDB
Bob (3)
Update:
Turn : Alice
Top-Left : X
Expect:
Turn : Bob
Top-Left : null
State : STARTED,
Turn : Bob,
Top-Right : O
Update:
Turn : Alice
Low-Right : X
Expect:
Turn : Bob
Low-Right : null
Update:
Turn : Alice
Mid : X
Expect:
Turn : Bob
Mid : null

Bob (1)
Bob (2)
DynamoDB
Bob (3)
Update:
Turn : Alice
Top-Left : X
Expect:
Turn : Bob
Top-Left : null
State : STARTED,
Turn : Alice,
Top-Right : O,
Top-Left : X
Update:
Turn : Alice
Low-Right : X
Expect:
Turn : Bob
Low-Right : null
Update:
Turn : Alice
Mid : X
Expect:
Turn : Bob
Mid : null
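
To see why only one of Bob's three concurrent updates is applied, here is a small in-memory simulation of a conditional write. It is not DynamoDB code: `conditionalUpdate` is a hypothetical stand-in that checks the expected values and applies the update only when they all match.

```javascript
// Applies updates to item only if every expected attribute matches.
// A missing attribute is treated as null. Returns true when applied.
function conditionalUpdate(item, updates, expected) {
  for (const name of Object.keys(expected)) {
    const actual = item[name] === undefined ? null : item[name];
    if (actual !== expected[name]) {
      return false; // condition failed: reject the write
    }
  }
  Object.assign(item, updates);
  return true;
}

const game = { State: 'STARTED', Turn: 'Bob', 'Top-Right': 'O' };

// Bob submits three moves at once from three clients.
const attempts = [
  { updates: { Turn: 'Alice', 'Top-Left': 'X' }, expected: { Turn: 'Bob', 'Top-Left': null } },
  { updates: { Turn: 'Alice', 'Low-Right': 'X' }, expected: { Turn: 'Bob', 'Low-Right': null } },
  { updates: { Turn: 'Alice', Mid: 'X' }, expected: { Turn: 'Bob', Mid: null } }
];

const results = attempts.map(function (attempt) {
  return conditionalUpdate(game, attempt.updates, attempt.expected);
});

console.log(results); // [ true, false, false ]
console.log(game);    // Turn is 'Alice' and only Top-Left was added
```

The first write flips Turn to Alice, so the other two fail their `Turn : Bob` expectation regardless of the order in which they arrive.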

Primary Key Schemas
Primary Key
Hash Key Schema

Primary Key Schemas
Id Turn Players Turn State IsTie Winner Data
abecd 0 [ Alice, Bob ] Alice STARTED …
abecd 1 [ Alice, Bob ] Bob STARTED …
abecd 4 [ Alice, Bob ] Alice DONE Alice …
dbace 0 [ Alice, Bob ] Bob STARTED
dbace 1 [ Alice, Bob ] Alice STARTED …
Primary Key
Hash and Range Key Schema

Primary Key Schemas
Primary Key

Primary Key Schemas
• Hash-only
– Key/value lookups only
• Hash and Range
– Given a hash key value, query for items by range key
– Items are sorted by range key within each hash key

Primary Key Schemas
Primary Key
Query WHERE Id=abecd, ORDER BY Turn DESC, LIMIT 2
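
The SQL-like line above maps onto a DynamoDB Query roughly as follows. This is a sketch that only builds request parameters; the table name `Game` is an assumption.

```javascript
// Builds a Query for the latest turns of one game:
// WHERE Id = gameId, ORDER BY Turn DESC, LIMIT limit.
function buildLatestTurnsQuery(gameId, limit) {
  return {
    TableName: 'Game',
    KeyConditionExpression: '#id = :id',
    ExpressionAttributeNames: { '#id': 'Id' },
    ExpressionAttributeValues: { ':id': { S: gameId } },
    ScanIndexForward: false, // descending by range key (Turn)
    Limit: limit
  };
}

const latestTurnsParams = buildLatestTurnsQuery('abecd', 2);
console.log(JSON.stringify(latestTurnsParams, null, 2));
// dynamodb.query(latestTurnsParams, callback);
```

Ordering is always by the range key, so "ORDER BY Turn" works only because Turn is the range key of this schema.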

Global Leaderboard
Scatter-gather

Game-Wide Leaderboard
• Find the top 10 scores game-wide
HighScore User
1000 Alice
850 Dave
580 Erin
470 Bob
30 Chuck

HighScore User
1000 Alice
850 Dave
580 Erin
470 Bob
30 Chuck
Table Schemas must begin
with a Hash Key

Cannot be Queried
the way we want
User HighScore
Chuck 20
Alice 1000
Bob 470
Dave 850
Erin 580

• Use a constant Hash key?
Constant HighScore-User
1 0001000-Alice
1 0000850-Dave
1 0000580-Erin
1 0000470-Bob
1 0000030-Chuck
Zero-pad strings for sort
stability
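
A short sketch of the zero-padding trick: string range keys sort lexicographically, so scores need a fixed width to sort numerically. The helper name `encodeHighScoreKey` and the width of 7 digits (matching the keys shown above) are illustrative assumptions; scores wider than the chosen width or negative scores are not handled.

```javascript
// Encodes a score and user as a fixed-width, sortable string key.
function encodeHighScoreKey(score, user, width) {
  return String(score).padStart(width, '0') + '-' + user;
}

const padded = [
  encodeHighScoreKey(30, 'Chuck', 7),
  encodeHighScoreKey(1000, 'Alice', 7),
  encodeHighScoreKey(850, 'Dave', 7)
];
const unpadded = ['30-Chuck', '1000-Alice', '850-Dave'];

// Sort descending, as a Query with reverse range key order would.
console.log(padded.slice().sort().reverse());
// [ '0001000-Alice', '0000850-Dave', '0000030-Chuck' ]
console.log(unpadded.slice().sort().reverse());
// [ '850-Dave', '30-Chuck', '1000-Alice' ]  (wrong order without padding)
```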

• Use a constant Hash key?
Constant HighScore-User
1 0001000-Alice
1 0000850-Dave
1 0000580-Erin
1 0000470-Bob
1 0000030-Chuck
Extremely non-uniform
workload

Scatter-Gather Leading Range Key
HighScores Table
Shard HighScore-User
1 0001000-Alice
1 0000850-Dave
1 0000580-Erin
3 0000900-Dan
3 0000850-Wendy
3 0000080-Trent
2 0000980-Eve
2 0000600-Frank
2 0000581-Trent
4 0000500-Merlin
4 0000350-Carole
4 0000280-Paul
5 0000999-Oscar
5 0000700-Craig
5 0000030-Chuck

HighScores Table
1 0001000-Alice
1 0000850-Dave
1 0000580-Erin
3 0000900-Dan
3 0000850-Wendy
3 0000080-Trent
2 0000980-Eve
2 0000600-Frank
2 0000581-Trent
4 0000500-Merlin
4 0000350-Carole
4 0000280-Paul
5 0000999-Oscar
5 0000700-Craig
5 0000030-Chuck
1. Periodically Query each Shard DESC, LIMIT N

HighScores Table
1 0001000-Alice
1 0000850-Dave
1 0000580-Erin
3 0000900-Dan
3 0000850-Wendy
3 0000080-Trent
2 0000980-Eve
2 0000600-Frank
2 0000581-Trent
4 0000500-Merlin
4 0000350-Carole
4 0000280-Paul
5 0000999-Oscar
5 0000700-Craig
5 0000030-Chuck
2. Keep only the top N,
Store somewhere
HighScore User
1000 Alice
999 Oscar
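
The gather step can be sketched in a few lines. The input below stands in for the per-shard Query results (top 2 of each shard from the table above); `gatherTopScores` is a hypothetical helper that merges them and keeps the global top N.

```javascript
// Results of querying each shard in descending order with LIMIT 2.
const shardResults = {
  1: ['0001000-Alice', '0000850-Dave'],
  2: ['0000980-Eve', '0000600-Frank'],
  3: ['0000900-Dan', '0000850-Wendy'],
  4: ['0000500-Merlin', '0000350-Carole'],
  5: ['0000999-Oscar', '0000700-Craig']
};

// Merges the per-shard results and returns the overall top n entries.
function gatherTopScores(perShard, n) {
  const all = [];
  Object.keys(perShard).forEach(function (shard) {
    perShard[shard].forEach(function (key) {
      all.push(key);
    });
  });
  all.sort().reverse(); // zero-padded keys sort correctly as strings
  return all.slice(0, n).map(function (key) {
    const dash = key.indexOf('-');
    return { HighScore: Number(key.slice(0, dash)), User: key.slice(dash + 1) };
  });
}

const topScores = gatherTopScores(shardResults, 2);
console.log(topScores);
// [ { HighScore: 1000, User: 'Alice' }, { HighScore: 999, User: 'Oscar' } ]
```

Each shard must be queried with a limit of at least N, since in the worst case all of the top N scores sit in one shard.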

HighScores Table
1 0001000-Alice
1 0000850-Dave
1 0000580-Erin
3 0000900-Dan
3 0000850-Wendy
3 0000080-Trent
2 0000980-Eve
2 0000600-Frank
2 0000581-Trent
4 0000500-Merlin
4 0000350-Carole
4 0000280-Paul
5 0000999-Oscar
5 0000700-Craig
5 0000030-Chuck
Store the Shard id by User for high score updates
User Shard
Alice 1
Oscar 5
Carole 4

High-Throughput Voting
Write sharding

Voting
Voter
Votes Table
Candidate A
Votes: 20
Candidate B
Votes: 30

Voting
Voter
UpdateItem
ADD 1 to “Candidate A”
(aka Atomic Increment)
Votes Table
Candidate A
Votes: 21
Candidate B
Votes: 30

Scaling on DynamoDB
You
Votes Table
Need to scale
for the election

Scaling on DynamoDB
You
Provision 1200 Write Capacity Units
Votes Table

Scaling on DynamoDB
You
Partition 1 Partition 2
600 Write Capacity Units (each)
Votes Table

Scaling on DynamoDB
You
Partition 1 Partition 2
(no sharing)
Votes Table

Scaling on DynamoDB
You
Provision 200,000 Write Capacity Units
Votes Table
Partition 1
(600 WCU)
Partition K
(600 WCU)
Partition M
(600 WCU)
Partition N
(600 WCU)

Scaling bottlenecks
Votes Table
Partition 1
(600 WCU)
Candidate A
Partition K
(600 WCU)
Partition M
(600 WCU)
Partition N
(600 WCU)
Candidate B
Voters

Best Practice: Uniform Workloads
“To achieve the full amount of request
throughput you have provisioned for a table,
keep your workload spread evenly across the
hash key values.”
– DynamoDB Developer Guide

Scaling on DynamoDB
Voter
Votes Table
Candidate A_2
Candidate B_1
Candidate B_4
Candidate B_3
Candidate B_2
Candidate B_5
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_4
Candidate A_3
Candidate A_7 Candidate B_8
Candidate A_5
Candidate A_6 Candidate A_8

Scaling on DynamoDB
Voter
UpdateItem: “CandidateA_” + rand(0, 10)
ADD 1 to Votes
Votes Table
Candidate A_2
Candidate B_1
Candidate B_4
Candidate B_3
Candidate B_2
Candidate B_5
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_4
Candidate A_3
Candidate A_5

Scaling on DynamoDB
Votes Table
Candidate A_2
Candidate B_1
Candidate B_4
Candidate B_3
Candidate B_2
Candidate B_5
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_4
Candidate A_3
Candidate A_5
Periodic
Process
Candidate A
Total: 2.5M
1. Sum
2. Store Voter
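
Both halves of the write-sharding pattern, sketched under assumptions: the hash key attribute is called `Candidate`, the table is `Votes`, shards are numbered 1 to `shardCount`, and the random source is passed in so the example is deterministic. The request uses the same `AttributeUpdates` style as the earlier UpdateItem example.

```javascript
// Picks one of shardCount keys for a candidate, e.g. "Candidate A_3".
function shardedVoteKey(candidate, shardCount, random) {
  const shard = Math.floor(random() * shardCount) + 1;
  return candidate + '_' + shard;
}

// Builds the atomic increment for one vote on a randomly chosen shard.
function buildVoteUpdate(candidate, shardCount, random) {
  return {
    TableName: 'Votes',
    Key: { Candidate: { S: shardedVoteKey(candidate, shardCount, random) } },
    AttributeUpdates: {
      Votes: { Action: 'ADD', Value: { N: '1' } }
    }
  };
}

// Periodic process: sums the counts of all shards of one candidate.
function sumShards(items, candidate) {
  const prefix = candidate + '_';
  return items
    .filter(function (item) { return item.Candidate.indexOf(prefix) === 0; })
    .reduce(function (total, item) { return total + item.Votes; }, 0);
}

const voteParams = buildVoteUpdate('Candidate A', 8, Math.random);
console.log(voteParams.Key.Candidate.S);

const shardItems = [
  { Candidate: 'Candidate A_1', Votes: 10 },
  { Candidate: 'Candidate A_2', Votes: 15 },
  { Candidate: 'Candidate B_1', Votes: 7 }
];
console.log(sumShards(shardItems, 'Candidate A')); // 25
```

More shards spread writes across more hash key values, at the price of more items to read when computing the total.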

Reference Architecture
…for a classic 3-tier
application

Amazon
RDS
Amazon
CloudSearch
Reference Architecture
Amazon
DynamoDB
Amazon
ElastiCache
Amazon
EMR
Amazon
S3
Amazon
Redshift
AWS Data Pipeline
Amazon
Glacier

Use Case: A Video Streaming App – Upload
Amazon
DynamoDB
Amazon
RDS
Amazon
CloudSearch
Amazon
S3

A Video Streaming App – Discovery
X
Amazon
Glacier
Amazon
ElastiCache
CloudFront
Amazon
DynamoDB
Amazon
RDS
Amazon
CloudSearch
Amazon
S3

Use Case: A Video Streaming App – Recs
Amazon
S3
Amazon
Glacier
Amazon
DynamoDB
Amazon
EMR

How do I choose the right data store?
Data Structure &
Query Pattern
Service
characteristics
Cost

Data Structure & Query Pattern
Structured – Complex Query
• SQL
– Amazon RDS
(MySQL, Oracle, SQL Server, Postgres)
• Data Warehouse
– Amazon Redshift
• Search
– Amazon
CloudSearch
Unstructured – Custom Query
• Hadoop
– Amazon Elastic MapReduce (EMR)
Structured – Simple Query
• NoSQL
– Amazon DynamoDB
• Cache
– Amazon ElastiCache
(Memcached, Redis)
Unstructured – No Query
• Cloud Storage
– Amazon S3
– Amazon Glacier

Data Characteristics: Hot, Warm, Cold
Hot Warm Cold
Volume MB–GB GB–TB PB
Item size B–KB KB–MB KB–TB
Latency ms ms, sec min, hrs
Durability Low–High High Very High
Request rate Very High High Low
Cost/GB $$-$ $-¢¢ ¢

[Figure: Amazon ElastiCache, DynamoDB, RDS, Redshift, EMR, S3 and Glacier positioned by Structure (low to high) against Request rate (high to low), Cost/GB (high to low), Latency (low to high) and Data Volume (low to high)]

What data store should I use?
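Services, in column order from hot data through warm data to cold data: ElastiCache, Amazon DynamoDB, Amazon RDS, CloudSearch, Amazon Redshift, Amazon EMR (Hive), Amazon S3, Amazon Glacier
• Average latency: ElastiCache ms; DynamoDB ms; RDS ms, sec; CloudSearch ms, sec; Redshift sec, min; EMR (Hive) sec, min, hrs; S3 ms, sec, min (~ size); Glacier hrs
• Data volume: ElastiCache GB; DynamoDB GB–TBs (no limit); RDS GB–TB (3 TB max); CloudSearch GB–TB; Redshift TB–PB (1.6 PB max); EMR (Hive) GB–PB (~ nodes); S3 GB–PB (no limit); Glacier GB–PB (no limit)
• Item size: ElastiCache B–KB; DynamoDB KB (64 KB max); RDS KB (~ row size); CloudSearch KB (1 MB max); Redshift KB (64 K max); EMR (Hive) KB–MB; S3 KB–GB (5 TB max); Glacier GB (40 TB max)
• Request rate: ElastiCache Very High; DynamoDB Very High; RDS High; CloudSearch High; Redshift Low; EMR (Hive) Low; S3 Low–Very High (no limit); Glacier Very Low (no limit)
• Storage cost ($/GB/month): ElastiCache $$; DynamoDB ¢¢; RDS ¢¢; CloudSearch $; Redshift ¢; EMR (Hive) ¢; S3 ¢; Glacier ¢
• Durability: ElastiCache Low–Moderate; DynamoDB Very High; RDS High; CloudSearch High; Redshift High; EMR (Hive) High; S3 Very High; Glacier Very High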

Use the right tool for the job!
Data Tier
Amazon
CloudSearch
Amazon RDS
Amazon
ElastiCache
Amazon DynamoDB
Amazon
Elastic MapReduce
Amazon S3
Amazon
Glacier
Amazon Redshift AWS Data Pipeline

Amazon DynamoDB
When to use
• Fast and predictable performance
• Seamless/massive scale
• Autosharding
• Consistent/low latency
• No size or throughput limits
• Very high durability
• Key-value or simple queries
When not to use
• Need multi-item/row or cross-table transactions
• Need complex queries, joins
• Need real-time analytics on historic data
• Storing cold data

Questions?
Peter-Mark Verwoerd
verwoerd@amazon.com
@petermark
Derek Chiles
derekch@amazon.com
@derekchiles

Social Gaming
Local secondary indexes

Social Gaming
• Host games
• Invite friends to play
• Find friends’ games to play
• See history of games

Social Gaming
HostedGame
Table
Hash: UserId
Range: GameId
Attributes: OpponentId, Date, (rest of game state)
UserId GameId Date OpponentId …
Carol e23f5a 2013-10-08 Charlie …
Alice d4e2dc 2013-10-01 Bob …
Alice e9cba3 2013-09-27 Bob …
Alice f6a3bd 2013-10-08

Social Gaming: find recent games
Alice f6a3bd 2013-10-08
Query UserId=Alice

Query cost
• Provisioned Throughput: Work / sec allowed on your table
• Capacity Units: Amount of provisioned throughput consumed by an operation

Query cost
Alice f6a3bd 2013-10-08
(1 item = 600 bytes)
(397 more games for Alice)

Query cost
Alice f6a3bd 2013-10-08
(1 item = 600 bytes)
(397 more games for Alice)
400 items evaluated × 600 bytes per item / 1024 bytes per KB / 4 KB per Read Capacity Unit ≈ 58.6, which rounds up to 59 (about 60) Read Capacity Units

Local Secondary Indexes
• An alternate range key on a table
HostedGame Table LocalSecondaryIndex on Date
UserId GameId Date
Carol e23f5a 2013-10-08
Alice d4e2dc 2013-10-01
Alice e9cba3 2013-09-27
Alice f6a3bd 2013-10-01
UserId Date GameId
Carol 2013-10-08 e23f5a
Alice 2013-09-27 e9cba3
Alice 2013-10-01 d4e2dc
Alice 2013-10-01 f6a3bd

Query cost on Local Secondary Indexes
UserId Date GameId …
Carol 2013-10-08 e23f5a …
Alice (397 older games)
Alice 2013-09-27 e9cba3 …
Alice 2013-10-01 d4e2dc …
Alice 2013-10-01 f6a3bd …
Query for the 10 most recent games

Query cost on Local Secondary Indexes
UserId Date GameId …
Carol 2013-10-08 e23f5a …
Alice (397 older games)
Alice 2013-09-27 e9cba3 …
Alice 2013-10-01 d4e2dc …
Alice 2013-10-01 f6a3bd …
Query for the 10 most recent games
10 items evaluated × 600 bytes per item / 1024 bytes per KB / 4 KB per Read Capacity Unit ≈ 1.5, which rounds up to 2 Read Capacity Units
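
The arithmetic on these two Query cost slides can be checked with a few lines. This sketch follows the slide's formula (items evaluated × bytes per item, at 4 KB per read capacity unit) and rounds up; the function name is illustrative.

```javascript
// Read capacity units consumed by a Query that evaluates itemsEvaluated
// items of bytesPerItem bytes each, at 4 KB per read capacity unit.
function queryReadCapacityUnits(itemsEvaluated, bytesPerItem) {
  const kilobytes = (itemsEvaluated * bytesPerItem) / 1024;
  return Math.ceil(kilobytes / 4);
}

// Query on the table: all 400 of Alice's games are evaluated.
console.log(queryReadCapacityUnits(400, 600)); // 59 (the slide rounds this to 60)

// Query on the Date index: only the 10 most recent games are evaluated.
console.log(queryReadCapacityUnits(10, 600)); // 2
```

The saving comes entirely from evaluating fewer items: the index lets the Query stop after the 10 most recent games instead of reading all 400.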

Example Local Secondary Indexes
• Find 10 recent matches between Alice and Bob

Example Local Secondary Indexes
• Find 10 recent matches between Alice and Bob
– Hash: UserId
– Range: OpponentId + Date
Query WHERE UserId=Alice AND OpponentAndDate STARTS_WITH “Bob-”
LIMIT 10 DESC
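
A sketch of the composite range key approach: the application writes an `OpponentAndDate` attribute alongside each game and queries it with a prefix condition. The index name `OpponentAndDate-index` and the helper names are assumptions. The "-" separator assumes user IDs never contain a hyphen, since otherwise the prefix "Bob-" could also match a user whose ID starts with "Bob-".

```javascript
// Value stored in the composite attribute, e.g. "Bob-2013-10-01".
function opponentAndDate(opponentId, date) {
  return opponentId + '-' + date;
}

// Builds a Query for the most recent games between userId and opponentId.
function buildRecentMatchesQuery(userId, opponentId, limit) {
  return {
    TableName: 'HostedGame',
    IndexName: 'OpponentAndDate-index',
    KeyConditionExpression: '#user = :user AND begins_with(#oppDate, :prefix)',
    ExpressionAttributeNames: {
      '#user': 'UserId',
      '#oppDate': 'OpponentAndDate'
    },
    ExpressionAttributeValues: {
      ':user': { S: userId },
      ':prefix': { S: opponentId + '-' }
    },
    ScanIndexForward: false, // newest date first within the prefix
    Limit: limit
  };
}

console.log(opponentAndDate('Bob', '2013-10-01')); // Bob-2013-10-01
const matchesParams = buildRecentMatchesQuery('Alice', 'Bob', 10);
console.log(JSON.stringify(matchesParams, null, 2));
// dynamodb.query(matchesParams, callback);
```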

More example Local Secondary Indexes
• Find a host’s matches without an opponent

More example Local Secondary Indexes
• Find a host’s matches without an opponent
– Hash: UserId
– Range: UnmatchedDate
(sparse index)
Query WHERE UserId=Alice LIMIT 10 DESC

Local Secondary Index Projections
• Choose what attributes are copied into the index
– ALL, specific attributes (INCLUDE), or keys only (KEYS_ONLY)
• Substantially cheaper to Query only projected attributes
• Project the attributes that your use case requires
• Can make writes cheaper too

Write cost for Local Secondary Index
• 1 additional write:
– Inserting a new item
– Setting the index range key to / from null
– Updating a projected attribute
• 0 additional writes:
– Updating a non-projected attribute
• 2 additional writes:
– Updating the index range key

Read cost for Query of non-projected attributes
• Regular Query cost
+
• Single-item Get cost for each evaluated item

Example Local Secondary Index Projections
• Query Alice’s 10 most recent Games
Alice f6a3bd 2013-10-08

Example Local Secondary Index Projections
• Query Alice’s 10 most recent Games
– Opponent, Winner, (UserId, GameId, Date)
– Projected item size from 600 bytes to 40 bytes
• Write cost:
– 1 Write Capacity Unit for insert, opponent joining, and completion
– 0 Write Capacity Units for other state transitions

Social Gaming: Friends
• Query who you are friends with
• Ask to be friends with someone
• Acknowledge (or decline) friend request

Social Gaming: Friends
Friends
Table
Hash: UserId
Range: FriendId
Attributes: Status, Date, etc
UserId FriendId Status Date …
Alice Bob FRIENDS 2013-08-20 …
Bob Alice FRIENDS 2013-08-20 …
Bob Chuck INCOMING 2013-10-08 …
Chuck Bob SENT 2013-10-08 …

Becoming Friends: Multi-item Atomic Writes
UserId FriendId Status
Alice Bob FRIENDS
Bob Alice FRIENDS
Bob Chuck INCOMING
Chuck Bob SENT
Bob
A friend request!

Alice Bob FRIENDS
Bob Alice FRIENDS
Bob Chuck INCOMING
Chuck Bob SENT
Bob
1. Update Bob/Chuck record
2. Update Chuck/Bob record

Alice Bob FRIENDS
Bob Alice FRIENDS
Bob Chuck FRIENDS
Chuck Bob SENT
Bob
UpdateItem
Status=FRIENDS

Alice Bob FRIENDS
Bob Alice FRIENDS
Bob Chuck FRIENDS
Chuck Bob FRIENDS
Bob
UpdateItem
Status=FRIENDS

Becoming Friends: When things go wrong
Alice Bob FRIENDS
Bob Alice FRIENDS
Bob Chuck INCOMING
Chuck Bob SENT
Bob
A friend request!

Becoming Friends: When things go wrong
Alice Bob FRIENDS
Bob Alice FRIENDS
Bob Chuck FRIENDS
Chuck Bob SENT
Bob
UpdateItem
Status=FRIENDS

Multi-item transaction in DynamoDB
• Scan for “stuck” transactions
• Use the Client Transactions Library on the AWS SDK for Java
• Roll your own scheme

Replayable state machines
INCOMING ACCEPTING FRIENDS
SENDING SENT FRIENDS
Bob/
Chuck
Chuck/
Bob

Client Transactions Library
Transaction
Client
Transactions Friends Table
Table
Transaction
Images
Table
Bob

Client Transactions Usage
• Low contention only
• Don’t mix Tx Client writes with normal writes
• No Query support
• Expensive, slower
• But, easy to use

Specialized Transactions
Id Status V1 V2
UserId FriendId Status V
Alice Bob FRIENDS 3
Bob Alice FRIENDS 3
Bob Chuck INCOMING 2
Chuck Bob SENT 2
Bob
1. Read items
2. Write to Tx table
3. Apply writes
4. Delete from Tx table
Transactions Table
A friend request!

Id Status V1 V2
Alice Bob FRIENDS 3
Bob Alice FRIENDS 3
Chuck Bob SENT 2
Bob BatchGetItem
1. Read items
3. Apply writes
Transactions Table

Id Status V1 V2
Bob-Chuck Bob: FRIENDS
Chuck: FRIENDS
2 2
Alice Bob FRIENDS 3
Bob Alice FRIENDS 3
Chuck Bob SENT 2
Bob PutItem,
Expect not exists
1. Read items
3. Apply writes
Transactions Table

Id Status V1 V2
Chuck: FRIENDS
2 2
Alice Bob FRIENDS 3
Bob Alice FRIENDS 3
Bob Chuck FRIENDS 3
Chuck Bob FRIENDS 3
Bob UpdateItem,
Expect V=Vprev
1. Read items
3. Apply writes
Transactions Table

Id Status V1 V2
Alice Bob FRIENDS 3
Bob Alice FRIENDS 3
Bob Chuck FRIENDS 3
Chuck Bob FRIENDS 3
Bob DeleteItem,
Expect V1=V1prev,
V2=V2prev,
1. Read items
3. Apply writes
Transactions Table

Id Status V1 V2
Chuck: FRIENDS
2 2
Alice Bob FRIENDS 3
Bob Alice FRIENDS 3
Bob Chuck FRIENDS 3
Chuck Bob SENT 2
Bob UpdateItem,
Expect V=Vprev
1. Read items
3. Apply writes
Transactions Table

Id Status V1 V2
Chuck: FRIENDS
2 2
Alice Bob FRIENDS 3
Bob Alice FRIENDS 3
Bob Chuck FRIENDS 3
Chuck Bob SENT 2
Sweeper Scan
1. Scan for stuck Tx
2. Apply writes
Transactions Table

Id Status V1 V2
Chuck: FRIENDS
2 2
Alice Bob FRIENDS 3
Bob Alice FRIENDS 3
Bob Chuck FRIENDS 3
Chuck Bob FRIENDS 3
UpdateItem,
Expect V=Vprev
Transactions Table
Sweeper
2. Apply writes

Id Status V1 V2
Alice Bob FRIENDS 3
Bob Alice FRIENDS 3
Bob Chuck FRIENDS 3
Chuck Bob FRIENDS 3
DeleteItem,
Expect V1=V1prev,
V2=V2prev,
Transactions Table
Sweeper
2. Apply writes

Transaction advice
• Lock items before modifying
– Including items that don’t exist yet
• Don’t stomp on future writes (use versions)
• Sweep for stuck transactions
• Avoid deadlock
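
The "use versions" advice can be sketched as an UpdateItem that succeeds only if the item still has the version that was read, and bumps the version in the same write. It follows the "Expect V=Vprev" step shown on the earlier slides and uses the same `AttributeUpdates` / `Expected` parameter style as the deck's other examples; the table name `Friends` comes from the slides, while the helper name is an assumption.

```javascript
// Builds an UpdateItem that sets a new Status and increments the version V,
// conditional on V still having the value that was read earlier.
function buildVersionedStatusUpdate(userId, friendId, newStatus, previousVersion) {
  return {
    TableName: 'Friends',
    Key: {
      UserId: { S: userId },
      FriendId: { S: friendId }
    },
    AttributeUpdates: {
      Status: { Action: 'PUT', Value: { S: newStatus } },
      V: { Action: 'PUT', Value: { N: String(previousVersion + 1) } }
    },
    // Rejects the write if another writer has changed the item since the read.
    Expected: {
      V: { Value: { N: String(previousVersion) } }
    }
  };
}

const acceptParams = buildVersionedStatusUpdate('Bob', 'Chuck', 'FRIENDS', 2);
console.log(JSON.stringify(acceptParams, null, 2));
// dynamodb.updateItem(acceptParams, callback);
```

Because the condition names the old version, replaying the same step after a crash fails harmlessly instead of overwriting a newer write.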
