Efficient data access is key to a high-performing application. Amazon Web Services provides several database options to support modern data-driven apps, along with software frameworks that make developing against them easy. We look at the design of a modern serverless web app using Amazon DynamoDB, the DynamoDB Mapper, AWS Lambda, Amazon API Gateway, and the SDKs, and tackle the move from relational to NoSQL data models.
Speaker: Clayton Brown, Solutions Architect, Amazon Web Services
2. Principles of Microservices
• Modelled Around Business Domain
• Culture of Automation
• Hide Implementation Details
• Highly Observable
• Decentralise All The Things
• Isolate Failure
• Deploy Independently
4. Did You Even Consider NoSQL?
SQL                       NoSQL
Optimised for Storage     Optimised for Compute
Normalised/relational     Denormalised/hierarchical
Ad hoc queries            Instantiated views
Scale vertically          Scale horizontally
Good for OLAP             Built for OLTP at scale
5. Why NoSQL? Data Volume Since 2010
[Chart: Data Volume, Historical vs Current]
• 90% of stored data generated in last 2 years
• 1 terabyte of data in 2010 equals 6.5 petabytes today
• Linear correlation between data pressure and technical innovation
• No reason these trends will not continue over time
6. To Infinity And Beyond
The Zero One or Infinity (ZOI) rule is a rule of thumb in software design originated by early computing pioneer Willem van der Poel.
• Arbitrary limits on the number of instances of a particular entity should not be allowed.
• Specifically, an entity should either be forbidden entirely, one should be allowed, or any number (presumably, to the limit of available storage) of them should be allowed.
7. Monolithic Persistence Example
Database, Database … DatabaseN
Constrained: Memory, Compute, Storage, Network IO, Disk IO
Consolidation: Versions, Data, Dependencies, Capabilities
Database/SVC001 service1/password1
Database/SVC002 service2/password2
Database/SVC003 service3/password3
….
Database/SVC…N serviceN/passwordN
API, client, mobile, voice
13. Common Use Cases for Amazon DynamoDB
Large Scale Websites
• Session state
• User data used for personalization
• Access control
Application Monitoring
• Storing application logs
• Storing event data (JSON)
Internet of Things
• Sensor data and log ingestion
Microservices
• DDD Isolation / encapsulation / scaling
Ad-Tech
• Capturing browser cookie state
Mobile Applications
• Storing application data and session state
Gaming Applications
• Storing user preferences and application state
• Storing players’ game state
Consumer “Voting” Applications
• Reality TV contests, Super Bowl commercials
15. Getting the Most out of Amazon DynamoDB Throughput
“To get the most out of DynamoDB throughput, create tables where the hash key element has a large number of distinct values, and values are requested fairly uniformly, as randomly as possible.”
Source: DynamoDB Developer Guide
Time: Requests arrive evenly spaced in time
Space: Access is evenly spread over the key-space
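The "space" half of this guidance can be illustrated with a small simulation. This is only a sketch: the MD5-based `partition_for` function is a stand-in chosen for the example, not DynamoDB's actual hash function, and the key names and partition count are made up.

```python
import hashlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    """Map a hash key to a partition index (MD5 is a stand-in, not DynamoDB's real hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def request_share(keys, num_partitions: int):
    """Return the fraction of requests that lands on each partition."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    total = len(keys)
    return [counts.get(p, 0) / total for p in range(num_partitions)]


# Many distinct keys, each requested once: load spreads across partitions.
uniform_requests = [f"user-{i}" for i in range(10_000)]
# 90% of requests target a single "hot" key: one partition takes most of the load.
skewed_requests = ["user-0"] * 9_000 + [f"user-{i}" for i in range(1_000)]

uniform_share = request_share(uniform_requests, 4)
skewed_share = request_share(skewed_requests, 4)
print("uniform:", [round(s, 2) for s in uniform_share])
print("skewed: ", [round(s, 2) for s in skewed_share])
```

With many distinct keys, each simulated partition receives roughly the same share of requests; with one dominant key, a single partition absorbs at least 90% of them regardless of how much throughput the table has in total.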
16. Amazon DynamoDB Partitioning Primer
Throughput Unlimited*
• Provision any amount of throughput to a table
Size Unlimited*
• Add any number of items to a table
• Max item is 400 KB
• LSIs limit item collections (all items sharing a hash key) to 10 GB
Sharp Edges?
• Dilution: RCUs and WCUs are uniformly spread across partitions
• Hot Keys: Scaling is achieved through partitioning
Number of Partitions
By capacity: (Total RCU / 3000) + (Total WCU / 1000)
By size: Total Size / 10 GB
Total partitions: CEILING(MAX(Capacity, Size))
Example: Table size = 8 GB, RCUs = 5000, WCUs = 500
By capacity: (5000 / 3000) + (500 / 1000) = 1.67 + 0.5 = 2.17
By size: 8 / 10 = 0.8
Total partitions: CEILING(MAX(2.17, 0.8)) = 3
RCUs per partition = 5000 / 3 = 1666
WCUs per partition = 500 / 3 = 166
Data per partition = 8 / 3 = 2.67 GB
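The partition arithmetic above can be sketched as a small helper. The constants (3000 RCUs, 1000 WCUs and 10 GB per partition) are taken from the slide; the function names and the floor of one partition are assumptions made for this example, and the service's real behaviour may differ.

```python
import math

# Per-partition limits as quoted on the slide.
RCUS_PER_PARTITION = 3000
WCUS_PER_PARTITION = 1000
GB_PER_PARTITION = 10


def estimate_partitions(table_size_gb: float, rcus: int, wcus: int) -> int:
    """Apply CEILING(MAX(by capacity, by size)) from the slide."""
    by_capacity = rcus / RCUS_PER_PARTITION + wcus / WCUS_PER_PARTITION
    by_size = table_size_gb / GB_PER_PARTITION
    # Assumption: a table always has at least one partition.
    return max(1, math.ceil(max(by_capacity, by_size)))


def per_partition(table_size_gb: float, rcus: int, wcus: int) -> dict:
    """Show how provisioned throughput and data are diluted across partitions."""
    n = estimate_partitions(table_size_gb, rcus, wcus)
    return {
        "partitions": n,
        "rcus": rcus / n,
        "wcus": wcus / n,
        "size_gb": table_size_gb / n,
    }


example = per_partition(table_size_gb=8, rcus=5000, wcus=500)
print(example)
```

For the slide's example (8 GB, 5000 RCUs, 500 WCUs) this yields three partitions, so a single hot key can draw on only about a third of the table's provisioned throughput.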
[Chart: heat / pressure per partition over time]
* YMMV
18. Types, Indexes and Queries
Users Table: store info about users
• Hash schema for straight key-value lookups (PrimaryKey)
Friends Table: query for a user’s friends
• Hash and Range schema for Query / Filter (Key + RangeKey)
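The two schemas can be sketched as table definitions. The helper below only builds dictionaries in the shape of the `create_table` request parameters (`TableName`, `KeySchema`, `AttributeDefinitions`, `ProvisionedThroughput`) and makes no network calls. The attribute names `UserId` and `FriendId` and the throughput figures are hypothetical, since the slide does not name them.

```python
def table_definition(name, hash_key, range_key=None, rcus=5, wcus=5):
    """Build create_table-style parameters for a hash or hash-and-range table."""
    key_schema = [{"AttributeName": hash_key, "KeyType": "HASH"}]
    attributes = [{"AttributeName": hash_key, "AttributeType": "S"}]
    if range_key is not None:
        key_schema.append({"AttributeName": range_key, "KeyType": "RANGE"})
        attributes.append({"AttributeName": range_key, "AttributeType": "S"})
    return {
        "TableName": name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attributes,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": rcus,
            "WriteCapacityUnits": wcus,
        },
    }


# Hash-only schema: one item per user, fetched by key (hypothetical attribute name).
users = table_definition("Users", hash_key="UserId")
# Hash-and-range schema: many friends per user, queryable by the hash key.
friends = table_definition("Friends", hash_key="UserId", range_key="FriendId")

# With boto3 installed and credentials configured, these could be passed as:
#   boto3.client("dynamodb").create_table(**users)
print(users["KeySchema"])
print(friends["KeySchema"])
```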
19. Query for a User's Images
Amazon DynamoDB indexes provide partition and sort
SELECT * FROM Images
WHERE User='Bob'          (hash / partition key)
ORDER BY Date DESC        (sorted range key)
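As a rough translation of this SQL, the sketch below builds parameters in the shape of a low-level DynamoDB `Query` request, without calling the service. It assumes an `Images` table with `User` as the hash key and `Date` as the range key; the `#u` placeholder is used because `User` appears in DynamoDB's reserved-word list. The function name is made up for this example.

```python
def images_query(user: str, newest_first: bool = True) -> dict:
    """Build Query parameters: equality on the hash key, ordering by the range key."""
    return {
        "TableName": "Images",
        # WHERE User = :user (only the hash key needs an equality condition).
        "KeyConditionExpression": "#u = :user",
        "ExpressionAttributeNames": {"#u": "User"},
        "ExpressionAttributeValues": {":user": {"S": user}},
        # ORDER BY Date DESC: results come back in range-key order,
        # ascending by default, so descending means ScanIndexForward=False.
        "ScanIndexForward": not newest_first,
    }


params = images_query("Bob")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").query(**params)
print(params)
```

No separate sort step is needed because items that share a hash key are stored in range-key order.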
20. Query for Inflight Processing Stages
Amazon DynamoDB indexes provide partition and sort
What about queries for two equalities and a sort?
SELECT * FROM Images
WHERE User='Bob'          (hash / partition key)
AND Status='PENDING'      (?)
ORDER BY Date DESC        (sorted range key)
21. Images
Query
Bob
User Date ImageId Status Tags
uuid-Alice 2014-10-02 d9bl3 DONE David
uuid-Carol 2014-10-08 o2pnb IN_PROGRESS Bob
uuid-Bob 2014-09-30 72f49 PENDING Alice
uuid-Bob 2014-10-03 b932s PENDING Carol
uuid-Bob 2014-10-03 ef9ca IN_PROGRESS David
SELECT * FROM Images
WHERE User='Bob'
ORDER BY Date DESC
FILTER ON Status='PENDING'
(Partition Key)
x
22. Images
Query, Sort
Bob
User Date ImageId Status Tags
uuid-Alice 2014-10-02 d9bl3 DONE David
uuid-Carol 2014-10-08 o2pnb IN_PROGRESS Bob
uuid-Bob 2014-09-30 72f49 PENDING Alice
uuid-Bob 2014-10-03 b932s PENDING Carol
uuid-Bob 2014-10-03 ef9ca IN_PROGRESS David
SELECT * FROM Images
WHERE User='Bob'
ORDER BY Date DESC
FILTER ON Status='PENDING'
(Sorted RangeKey)
x
(Partition Key)
23. Secondary index
Query, Sort & Filter
Bob
User Date ImageId Status Tags
uuid-Alice 2014-10-02 d9bl3 DONE David
uuid-Carol 2014-10-08 o2pnb IN_PROGRESS Bob
uuid-Bob 2014-09-30 72f49 PENDING Alice
uuid-Bob 2014-10-03 b932s PENDING Carol
uuid-Bob 2014-10-03 ef9ca IN_PROGRESS David
SELECT * FROM Images
WHERE User='Bob'
ORDER BY Date DESC
FILTER ON Status='PENDING'
x
(Partition Key) (Sorted RangeKey)
Filters
5x
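The query, sort and filter steps on the last three slides can be mimicked in memory. This is a plain-Python sketch over the slide's sample rows, not a DynamoDB call, and the `query` helper is hypothetical. It illustrates that a filter is applied after the partition's items have been read, so filtered-out items still count as reads.

```python
# Sample rows copied from the Images table on the slides.
IMAGES = [
    {"User": "uuid-Alice", "Date": "2014-10-02", "ImageId": "d9bl3", "Status": "DONE", "Tags": "David"},
    {"User": "uuid-Carol", "Date": "2014-10-08", "ImageId": "o2pnb", "Status": "IN_PROGRESS", "Tags": "Bob"},
    {"User": "uuid-Bob", "Date": "2014-09-30", "ImageId": "72f49", "Status": "PENDING", "Tags": "Alice"},
    {"User": "uuid-Bob", "Date": "2014-10-03", "ImageId": "b932s", "Status": "PENDING", "Tags": "Carol"},
    {"User": "uuid-Bob", "Date": "2014-10-03", "ImageId": "ef9ca", "Status": "IN_PROGRESS", "Tags": "David"},
]


def query(items, user, status=None):
    """Return (results, items_read) for a partition-key query with an optional filter."""
    # Step 1 and 2: select the user's partition and order it by Date, newest first.
    matched = sorted(
        (item for item in items if item["User"] == user),
        key=lambda item: item["Date"],
        reverse=True,
    )
    items_read = len(matched)
    # Step 3: the filter only trims what is returned; the items were already read.
    if status is not None:
        matched = [item for item in matched if item["Status"] == status]
    return matched, items_read


results, items_read = query(IMAGES, "uuid-Bob", status="PENDING")
print([item["ImageId"] for item in results], "returned;", items_read, "read")
```

Here two items are returned but three are read, which is why filtering on `Status` scales poorly when most of a user's items do not match.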
24. Local index
Local Secondary Indexes (5)
Bob
User Date ImageId Status Tags
uuid-Alice 2014-10-02 d9bl3 DONE David
uuid-Carol 2014-10-08 o2pnb IN_PROGRESS Bob
uuid-Bob 2014-09-30 72f49 PENDING Alice
uuid-Bob 2014-10-03 b932s PENDING Carol
uuid-Bob 2014-10-03 ef9ca IN_PROGRESS David
SELECT Status FROM Images
WHERE ImageId='o2pnb'
(Hash / Local Secondary Index)
Filters
5x
25. Global Secondary Indexes (5)
ImageTags Table
Alice
Images tagged Alice
Tag Image User …
Alice aed4c id-Alice ..
Alice f93bae id-Bob ..
Alice aed4c id-Bob ..
Bob f93bae id-Carol ..
ByUser Global Secondary Index
User Image Tag …
id-Alice aed4c Alice ..
id-Bob aed4c Alice ..
id-Bob f93bae Alice ..
id-Carol f93bae Bob ..
Alternate Hash and Range Keys
Global Secondary Index on User, Image
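Conceptually, the ByUser index is the same set of items re-keyed by an alternate hash and range key. The sketch below models that in plain Python using the ImageTags rows from the slide; `build_index` is a hypothetical helper for illustration, whereas in DynamoDB the service maintains the index itself.

```python
from collections import defaultdict

# Rows of the ImageTags table as shown on the slide (hash key Tag, range key Image).
IMAGE_TAGS = [
    {"Tag": "Alice", "Image": "aed4c", "User": "id-Alice"},
    {"Tag": "Alice", "Image": "f93bae", "User": "id-Bob"},
    {"Tag": "Alice", "Image": "aed4c", "User": "id-Bob"},
    {"Tag": "Bob", "Image": "f93bae", "User": "id-Carol"},
]


def build_index(items, hash_key, range_key):
    """Group items by an alternate hash key and sort each group by a range key."""
    index = defaultdict(list)
    for item in items:
        index[item[hash_key]].append(item)
    for rows in index.values():
        rows.sort(key=lambda row: row[range_key])
    return dict(index)


# "Images tagged Alice" reads the table's own key; "tags by user" reads the index.
by_tag = build_index(IMAGE_TAGS, "Tag", "Image")
by_user = build_index(IMAGE_TAGS, "User", "Image")
print([(row["Image"], row["Tag"]) for row in by_user["id-Bob"]])
```

The same four items answer two different access patterns, at the cost of storing and updating a second copy keyed by User.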
35. Fine-Grained Access Control
Images Table
User Image Date Link
Bob aed4c 2013-10-01 s3://…
Bob 5f2e2 2013-09-05 s3://…
Bob f93bae 2013-10-08 s3://…
Alice ca61a 2013-09-12 s3://…
Bob
AWS IAM
Bob “logs in” using web identity federation
36. Fine-Grained Access Control
Images Table
User Image Date Link
Bob aed4c 2013-10-01 s3://…
Bob 5f2e2 2013-09-05 s3://…
Bob f93bae 2013-10-08 s3://…
Alice ca61a 2013-09-12 s3://…
Bob
Bob can Query for Images where User=“Bob”
37. Fine-Grained Access Control
Images Table
User Image Date Link
Bob aed4c 2013-10-01 s3://…
Bob 5f2e2 2013-09-05 s3://…
Bob f93bae 2013-10-08 s3://…
Alice ca61a 2013-09-12 s3://…
Bob
Bob cannot Query for Images where User=“Alice”
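One way to express this restriction is an IAM policy that uses the `dynamodb:LeadingKeys` condition key, so that requests are only allowed when the hash key equals the caller's federated identity. The sketch below just builds such a policy document as a Python dictionary. The account ID, region and table ARN are placeholders, and the `${www.amazon.com:user_id}` variable assumes Login with Amazon as the identity provider; the slides do not say which provider is used.

```python
import json


def images_policy(table_arn: str, identity_variable: str) -> dict:
    """Build a policy allowing reads only on items whose hash key is the caller's ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:Query", "dynamodb:GetItem"],
                "Resource": table_arn,
                "Condition": {
                    # Every hash key value touched by the request must match.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": [identity_variable]
                    }
                },
            }
        ],
    }


policy = images_policy(
    # Placeholder ARN: region and account ID are made up for this example.
    "arn:aws:dynamodb:us-east-1:123456789012:table/Images",
    # Assumed provider variable; other identity providers use different variables.
    "${www.amazon.com:user_id}",
)
print(json.dumps(policy, indent=2))
```

With a policy like this attached to the role Bob assumes, a Query with User=“Bob” is permitted while the same Query with User=“Alice” is denied.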
39. Amazon DynamoDB Streams
Stream of updates to a table
• Asynchronous
• Exactly once
• Strictly ordered (per item)
• Highly durable (scales with table)
• 24-hour lifetime
• Subsecond latency
Updated/New
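A consumer of the stream, for example a Lambda function, receives batches of change records. The handler below is a minimal sketch that assumes the usual record shape (`eventName` plus a `dynamodb` section with `Keys`, `NewImage` and `SequenceNumber`); the sample event is invented for illustration and uses the Images table from the earlier slides.

```python
def handler(event, context=None):
    """Summarise each change record in a DynamoDB Streams batch."""
    changes = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        changes.append(
            {
                "event": record["eventName"],  # INSERT, MODIFY or REMOVE
                "keys": data["Keys"],
                "sequence": data.get("SequenceNumber"),
                # NewImage is absent for REMOVE events or keys-only streams.
                "new_image": data.get("NewImage"),
            }
        )
    return changes


# Invented sample batch: one insert followed by one update to the same item.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"User": {"S": "Bob"}, "Date": {"S": "2014-10-03"}},
                "NewImage": {"User": {"S": "Bob"}, "Status": {"S": "PENDING"}},
                "SequenceNumber": "100",
            },
        },
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "Keys": {"User": {"S": "Bob"}, "Date": {"S": "2014-10-03"}},
                "NewImage": {"User": {"S": "Bob"}, "Status": {"S": "DONE"}},
                "SequenceNumber": "101",
            },
        },
    ]
}

changes = handler(sample_event)
print([(change["event"], change["sequence"]) for change in changes])
```

Because ordering is per item, the two records for the same key arrive in the order the writes happened, which lets a consumer rebuild the item's latest state.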