Webinar | Introduction to Amazon DynamoDB

Introducing
DynamoDB

20th March, 2012
Dr. Matt Wood - matthew@amazon.com

Compute
Storage
Databases
Tools & Support

Databases

Relational databases
“NoSQL” databases

Any database on Amazon EC2
MySQL, DB2, Oracle, PostgreSQL...

Relational Database Service
Managed MySQL and Oracle databases

Rapid provisioning
High availability
Scalable storage
Scalable compute

High performance databases

Increase throughput
Read replicas, push-button scaling, ElastiCache

Increase availability
Multi-AZ

Reduce latency
ElastiCache

Rich query semantics
Joins, transactions, query optimisation

Problem
Complexity. Performance decreases at scale.

Performance vs. scale
Predictable, consistent
Degraded performance with scale

Data sharding
Data caching
Provisioning
Cluster management
Fault management
= more problems!

Undifferentiated
heavy lifting

Fully managed
NoSQL database
service

Offload admin and
operational burden

AGENDA
Getting to know DynamoDB

Guided tour of service highlights
Provisioned throughput
Data model
DynamoDB in practice
Analytics with Elastic MapReduce

HIGHLIGHTS

Low latency
Large scale
Seamless scaling
Predictable performance
Flexible
Durable storage
Zero admin

SSD backed

Low latency
Single-digit millisecond
< 5 ms reads, < 10 ms writes

Massive scale
No table size limits. Unlimited storage.

Seamless scale
Live repartitioning. Zero admin.

Flexible data model
Key/attribute store for evolving models

Predictable performance

Durable and available
Consistent, disk-only writes

Zero administration

What is provisioned throughput?

Reserve required IOPS
Per table. Set at creation. Scale via API.

Scale at any time
No downtime

Per 1 KB item:

$0.01 per hour for every 10 writes/second

$0.01 per hour for every 50 strongly consistent reads/second

Per 1 KB item:

$0.28 per million writes

$0.056 per million strongly consistent reads
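
The two price lists above describe the same rates. As a quick check, the sketch below converts the hourly figures into per-million-operation figures using only the numbers on the slides; the function name is illustrative.

```python
SECONDS_PER_HOUR = 3600


def price_per_million(hourly_price, ops_per_second):
    """Convert an hourly price for a block of ops/second into a price per million operations."""
    ops_per_hour = ops_per_second * SECONDS_PER_HOUR
    return hourly_price / ops_per_hour * 1_000_000


# $0.01 per hour buys 10 writes/second or 50 strongly consistent reads/second.
write_price = price_per_million(0.01, 10)
read_price = price_per_million(0.01, 50)

print(f"writes: ${write_price:.3f} per million")
print(f"reads:  ${read_price:.3f} per million")
```

Rounded, these match the $0.28 and $0.056 quoted per million operations.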

Pay for storage
$1.00 per GB per month of indexed storage

Data model
Flexible. Schema-less.

Simple key/value pairs
title => “Introduction to DynamoDB”
date => “20120320”

Associative array, or hash
[ title => “Introduction to DynamoDB”,
date => “20120320” ]

Attributes
Each key/value pair is an attribute.

[ title => “Introduction to DynamoDB”,
date => “20120320” ]

[ title => “Disaster Recovery with AWS”,
date => “20120320”,
format => “webinar”,
presenter => “Jeff Barr” ]

Items
Each associative array of attributes is an item.

Table
A table is a collection of items.
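
To make the schema-less model concrete, here is a minimal sketch that represents the two example items as plain Python dictionaries. It is an in-memory illustration only, not the DynamoDB API; the variable name `webinars` is an assumption.

```python
# A table is a collection of items; each item is a set of attributes.
# Items in the same table need not share the same attributes (schema-less).
webinars = [
    {"title": "Introduction to DynamoDB", "date": "20120320"},
    {
        "title": "Disaster Recovery with AWS",
        "date": "20120320",
        "format": "webinar",
        "presenter": "Jeff Barr",
    },
]

for item in webinars:
    # .get() returns None for an attribute the item does not carry.
    print(item["title"], sorted(item), item.get("presenter"))
```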

Item

“ImageID” = “1”
“Date” = “20100915”
“Title” = “flower”
“Tags” = “flower”, “jasmine”, “white”

ImageID   Date       Title     Tags
1         20100915   flower    flower, jasmine, white
2         20100916   ferrari   car, italian
3         20100917   coffee    drink, delicious

“ImageID” = “1”    Primary or hash key
“Date” = “20100915”    Composite or range key
“Tags” = “flower”, “jasmine”, “white”    Sets of strings or numbers

Best practice
Well-balanced, fine-grained hash keys.
Customer, order, item, etc. rather than store_id.
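
The sketch below illustrates why fine-grained hash keys spread load more evenly. It assumes a simple hash-modulo scheme over a hypothetical number of partitions; how DynamoDB actually partitions data is not described in these slides, so treat the scheme and the `PARTITIONS` value as assumptions.

```python
import hashlib
from collections import Counter

PARTITIONS = 8  # hypothetical partition count, for illustration only


def partition_for(hash_key):
    """Map a hash key to a partition using a stable hash (an assumed scheme)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % PARTITIONS


def load_per_partition(hash_keys):
    """Count how many requests land on each partition."""
    return Counter(partition_for(key) for key in hash_keys)


# 1,000 requests keyed by a coarse attribute: only three distinct values.
coarse = load_per_partition(f"store-{i % 3}" for i in range(1000))
# 1,000 requests keyed by a fine-grained attribute: one value per order.
fine = load_per_partition(f"order-{i}" for i in range(1000))

print("store_id: partitions used", len(coarse), "busiest", max(coarse.values()))
print("order_id: partitions used", len(fine), "busiest", max(fine.values()))
```

With only three distinct keys, at most three partitions ever receive traffic, however many exist.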

Simple API
Only 12 operations.

Consistency
Writes are always consistent.
Reads are consistent or eventually consistent.

Durability
Writes occur to disk, not memory.
Writes are acknowledged once they have been
made in two physical data centres.

Availability
Region specific (not AZ)
Continuously replicated across multiple AZs

Let’s take a look!
Building a simple DynamoDB powered web application

NP-Complete.me
Book reviews for programmers

Threaded discussions
Page view counts
Tagging

Book
Threads (several per book)
Replies (several per thread)

Book table
Book metadata, page views, book tags

Hash key
asin => 0980576830

title => “Host Your Website on the Cloud”
pages => “364”
list-price => “£31.49”

views => 145

tags => [“php”, “aws”]

Thread table
Conversation thread

Hash key asin => 0980576830
Range key subject => “Very informative”

content => “This is a first class book...”

name => “Matt Wood”

Reply table
Conversation replies

Hash key id => 0980576830:very-informative
Range key datetime => “20120320”

reply => “I agree!”

name => “Werner Vogels”

DynamoDB tables

Books (asin)
Threads (asin, subject)
Replies (id, datetime)

Logical model

Book (asin)
Thread (asin, subject)
Reply (id, datetime)
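
As a rough illustration of how the three tables link together through their keys, the sketch below models them as in-memory dictionaries keyed by (hash key, range key). The `query` and `reply_id` helpers are hypothetical, and the slug format for the reply id is inferred from the example value on the slides.

```python
# Illustrative in-memory stand-ins for the three tables; not the DynamoDB API.
books = {"0980576830": {"title": "Host Your Website on the Cloud"}}

# Threads: hash key asin, range key subject.
threads = {
    ("0980576830", "Very informative"): {"name": "Matt Wood"},
}

# Replies: hash key id ("asin:thread-slug"), range key datetime.
replies = {
    ("0980576830:very-informative", "20120320"): {
        "reply": "I agree!",
        "name": "Werner Vogels",
    },
}


def query(table, hash_key):
    """Return (range_key, item) pairs that share a hash key, ordered by range key."""
    matches = [(rk, item) for (hk, rk), item in table.items() if hk == hash_key]
    return sorted(matches, key=lambda pair: pair[0])


def reply_id(asin, subject):
    """Build the reply hash key from a thread's keys (slug format is an assumption)."""
    return f"{asin}:{subject.lower().replace(' ', '-')}"


for subject, thread in query(threads, "0980576830"):
    print(subject, "by", thread["name"])
    for when, reply in query(replies, reply_id("0980576830", subject)):
        print("  ", when, reply["name"], reply["reply"])
```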

Conditional writes

DynamoDB holds asin => 1934356, pages => 384.
Client #1 and Client #2 both read the item (pages => 384).
Client #1 writes pages => 502; DynamoDB now holds pages => 502.
Client #2 then attempts a write that expects pages => 384.
Failed condition: the item has changed, so the write is rejected.
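
The sequence above can be sketched as a compare-then-write on an in-memory dictionary. This is a single-threaded simulation rather than the DynamoDB API; `conditional_put`, `ConditionFailed`, and the value Client #2 tries to write (400) are hypothetical.

```python
class ConditionFailed(Exception):
    """Raised when the stored value no longer matches what the writer expected."""


def conditional_put(table, key, attribute, new_value, expected):
    """Write new_value only if the attribute still holds the expected value."""
    item = table[key]
    if item[attribute] != expected:
        raise ConditionFailed(f"{attribute} is {item[attribute]}, expected {expected}")
    item[attribute] = new_value


books = {"1934356": {"pages": 384}}

# Both clients read the item and see pages = 384.
seen_by_client_1 = books["1934356"]["pages"]
seen_by_client_2 = books["1934356"]["pages"]

# Client #1 writes first; its expectation still holds.
conditional_put(books, "1934356", "pages", 502, expected=seen_by_client_1)

# Client #2's expectation is now stale, so its write is rejected.
try:
    conditional_put(books, "1934356", "pages", 400, expected=seen_by_client_2)
    client_2_succeeded = True
except ConditionFailed:
    client_2_succeeded = False

print(books["1934356"]["pages"], client_2_succeeded)
```

In the real service the check and the write would need to happen as one step on the server; the sketch only shows the outcome each client observes.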

Atomic increment/decrement

asin => 0980576830
views => 145

tables['books'].items['0980576830'].attributes.add(:views => 1)

asin => 0980576830
views => 146
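
To show why an atomic add matters for a page view counter, here is a toy in-memory table whose `add` performs the read-modify-write under a single lock. The `Table` class is an assumption made for illustration and does not mirror any SDK.

```python
import threading


class Table:
    """Toy in-memory table whose add() is applied atomically."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()

    def put(self, key, item):
        with self._lock:
            self._items[key] = dict(item)

    def add(self, key, attribute, amount):
        # The read-modify-write happens under one lock, so no update is lost.
        with self._lock:
            self._items[key][attribute] += amount
            return self._items[key][attribute]

    def get(self, key):
        with self._lock:
            return dict(self._items[key])


books = Table()
books.put("0980576830", {"views": 145})

# 50 concurrent page views, each adding 1 without reading the value first.
workers = [
    threading.Thread(target=books.add, args=("0980576830", "views", 1))
    for _ in range(50)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(books.get("0980576830")["views"])
```

Because callers send "add 1" rather than "set to 146", concurrent page views cannot overwrite each other's counts.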

Tagging: many to many

Book
(asin, tags = [“php”, “aws”])

Query by key, retrieve tag collection
Add tags conditionally

No secondary indexes, so books cannot be retrieved by tag from this table

Tagging: many to many

Book
(asin, tags = [“php”, “aws”])

Tag
(tag, asin = [“1449393683”, “0596515812”])

Query by book, retrieve tag collection
Query by tag, retrieve book collection
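
A minimal sketch of the two-table approach: every tag is recorded once under the book and once under the tag, so either side can be fetched by key. The dictionaries stand in for the Book and Tag tables, `add_tag` is a hypothetical helper, and which tags each ASIN carries is made up for illustration.

```python
from collections import defaultdict

book_tags = defaultdict(set)  # Book table: asin -> set of tags
tag_books = defaultdict(set)  # Tag table: tag -> set of asins


def add_tag(asin, tag):
    """Record the tag in both tables so either side can be looked up by key."""
    book_tags[asin].add(tag)
    tag_books[tag].add(asin)


add_tag("1449393683", "php")
add_tag("1449393683", "aws")
add_tag("0596515812", "php")

print(sorted(book_tags["1449393683"]))  # query by book, retrieve tag collection
print(sorted(tag_books["php"]))         # query by tag, retrieve book collection
```

In this sketch the two updates are separate steps, so keeping both tables in sync is left to the application.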

Autoscaling via SNS

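// Assumes the AWS SDK for PHP 1.x is loaded and credentials are configured.
$ddb = new AmazonDynamoDB();

// Read the table's current provisioned throughput.
$Res = $ddb->describe_table(array('TableName' => 'books'));

$Read = (int) $Res->body->Table->ProvisionedThroughput->ReadCapacityUnits;
$Write = (int) $Res->body->Table->ProvisionedThroughput->WriteCapacityUnits;

// Double both read and write capacity.
$Read *= 2;
$Write *= 2;

$PT = array('ReadCapacityUnits' => (string) $Read,
'WriteCapacityUnits' => (string) $Write);

// Apply the new throughput; the table stays available while it scales.
$Res = $ddb->update_table(array('TableName' => 'books',
'ProvisionedThroughput' => $PT));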

Considerations
Limited index and query model

Throughput is provisioned in units of 1 KB operations

Maximum item size of 64 KB

Backup and restore via Elastic MapReduce
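
To see how the 1 KB unit and the 64 KB limit affect provisioning, the sketch below estimates the capacity needed for a given item size and request rate. It assumes each operation is charged by item size rounded up to the next 1 KB; `capacity_units` is a hypothetical helper, not part of any SDK.

```python
import math

KB = 1024
MAX_ITEM_BYTES = 64 * KB  # maximum item size from the slide


def capacity_units(item_bytes, ops_per_second):
    """Estimate capacity units, assuming item size rounds up to the next 1 KB."""
    if item_bytes > MAX_ITEM_BYTES:
        raise ValueError("item exceeds the 64 KB limit")
    units_per_op = max(1, math.ceil(item_bytes / KB))
    return units_per_op * ops_per_second


small = capacity_units(512, 10)          # sub-1 KB items still use one unit each
larger = capacity_units(3 * KB + 1, 10)  # just over 3 KB rounds up to 4 units each

print(small, larger)
```

Under that assumption, keeping items small keeps the provisioned throughput, and therefore the hourly cost, down.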

Elastic MapReduce
Built for data. Designed for humans.

Collection: DynamoDB, Amazon S3
Computation: Elastic MapReduce, Amazon EC2
Collaboration

Data (DynamoDB) + Code
Elastic MapReduce
Name node
Elastic cluster (HDFS)
Output (S3)

Export to S3
CREATE EXTERNAL TABLE orders_s3_new_export (
  order_id string, customer_id string, order_date int, total double )
PARTITIONED BY (year string, month string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://export_bucket';

INSERT OVERWRITE TABLE orders_s3_new_export
PARTITION (year='2012', month='01')
SELECT * from orders_ddb_2012_01;

Live data in DynamoDB

SELECT customer_id, sum(total) spend, count(*) order_count
FROM orders_ddb_2012_01
WHERE order_date >= unix_timestamp('2012-01-01', 'yyyy-MM-dd')
AND order_date < unix_timestamp('2012-01-08', 'yyyy-MM-dd')
GROUP BY customer_id
ORDER BY spend desc
LIMIT 5;
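
For readers less familiar with Hive, the following sketch reproduces the logic of the query above on an in-memory list: filter by date range, group by customer, sum spend, count orders, and keep the top five. The sample rows are made up, the helper names are hypothetical, and dates are parsed as UTC here, whereas Hive's `unix_timestamp` uses the cluster's time zone.

```python
from collections import defaultdict
from datetime import datetime, timezone


def unix_ts(date_string):
    """Parse 'yyyy-MM-dd' as a UTC Unix timestamp."""
    parsed = datetime.strptime(date_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def top_spenders(orders, start, end, limit=5):
    """Sum spend and count orders per customer for start <= order_date < end."""
    lower, upper = unix_ts(start), unix_ts(end)
    totals = defaultdict(lambda: [0.0, 0])
    for order in orders:
        if lower <= order["order_date"] < upper:
            totals[order["customer_id"]][0] += order["total"]
            totals[order["customer_id"]][1] += 1
    ranked = sorted(totals.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(customer, spend, count) for customer, (spend, count) in ranked[:limit]]


# Made-up sample rows with the same columns as the orders table.
orders = [
    {"order_id": "o1", "customer_id": "c1", "order_date": unix_ts("2012-01-02"), "total": 20.0},
    {"order_id": "o2", "customer_id": "c2", "order_date": unix_ts("2012-01-03"), "total": 50.0},
    {"order_id": "o3", "customer_id": "c1", "order_date": unix_ts("2012-01-05"), "total": 15.0},
    {"order_id": "o4", "customer_id": "c1", "order_date": unix_ts("2012-01-09"), "total": 99.0},
]

result = top_spenders(orders, "2012-01-01", "2012-01-08")
print(result)
```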

AGENDA
Getting to know DynamoDB

Guided tour of service highlights
Data model
DynamoDB in practice
Analytics with Elastic MapReduce

Slides available shortly.

DynamoDB free tier
5 writes/second
10 consistent reads/second
100 MB storage

Developer Guide
aws.amazon.com/documentation/dynamodb

Drop us a line!
aws.amazon.com/contact-us

SimpleDB
Zero maintenance, NoSQL datastore

Flexible queries
10 GB / 1 billion attributes per table
No native data sharding
