SlideShare ist ein Scribd-Unternehmen logo
1 von 40
Migrating from Cassandra to
Amazon DynamoDB
Edin Zulich
NoSQL Solutions Architect
May 2018
Agenda
• Cassandra/DynamoDB overview
• Why customers migrate from Cassandra
• Benefits of migrating to DynamoDB
• Migration process
• FAQ
• Summary
• References
Cassandra/DynamoDB Overview
• Both NoSQL, inspired by Dynamo
• Both used for similar use cases and requirements
• Scalable NoSQL store
• Performance
• Time-series data
• Global reach
• Both API-driven
• Different Capacity and cost model
• Instance based vs. request based
• Manage your cluster vs. consume a fully-managed service
Why Customers Migrate from Cassandra
“ The database maintenance overhead slowed down the overall progress.
Additional time spent scaling, upgrading and maintaining a database cluster
removed time away from adding features or improving service code.
”
“ …when the company achieved a moderate traffic scale, the database started
becoming unstable and presented us with occasional mood swings (here I refer
to long running heavy compactions, infrastructure failures, buggy software
patching etc.) leading to outages.
Anirban Roy
GumGum
https://techblog.gumgum.com/articles/moving-to-amazon-dynamodb-from-hosted-cassandra
Zack Owens
Cloud Architect, Nike
https://medium.com/nikeengineering/becoming-a-nimble-giant-how-dynamo-db-serves-nike-at-scale-4cc375dbb18e
Why Customers Migrate from Cassandra
Operational challenges grow with scale
• Resulting in increased complexity, resources and cost
• This is true for all distributed/clustered databases
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Why DynamoDB? Here’s an example:
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon DynamoDB
Fully managed nonrelational database for any scale
Scale and performance
Fast, consistent performance
Virtually unlimited throughput
Virtually unlimited storage
Secure
Encryption at rest and in transit
VPC Endpoints
Fine-grained access control
PCI, HIPAA, FIPS140-2 eligible
Fully managed
Maintenance-free
Serverless
Auto scaling
Backup and restore
Global tables
DynamoDB over the last year
VPC
Endpoints
April 2017
Auto
Scaling
June 2017
DynamoDB
Accelerator (DAX)
April 2017
Time to
Live (TTL)
February 2017
Global Tables
Backup and
recovery
Encryption at rest
February 2018December 2017 December 2017, March 2018
+ Adaptive Capacity
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
DynamoDB Global Tables
F u l l y m a n a g e d , m u l t i - m a s t e r, m u l t i - re g i o n d a ta b a s e
Build high performance, globally distributed applications
Low latency reads & writes to locally available tables
Disaster proof with multi-region redundancy
Easy to setup and no application re-writes required
Benefits of migrating to DynamoDB
• Stability, performance at scale
• Single-digit millisecond latency for reads and writes at any scale
• DynamoDB elastic provisioning
• Zero maintenance and operations overhead
• 30% to 70% cost savings over Cassandra
Migration Process
A Phased Approach to Migration
Planning
Data
Analysis
Data
Modeling
Testing Migration
App
DevOps
It’s more of an application migration than a data migration
Samsung Cloud Service
• Backup and restore and key value
store for mobile app
• 300 million users
• Entire migration process took ~12
mo.
• Almost 1 PB in DynamoDB, 130 million
daily API requests
• Consistent performance and 70% cost
savings (TCO)
“DynamoDB provided consistent high
performance at a drastically lower cost
than Cassandra.”
—Seongkyu Kim
Samsung
Example:
Planning
• Define requirements and goals of the migration
• Application specific
• E.g.: Is downtime allowed for cutover? If so, how long?
• Opportunity for “spring cleaning”
• Document per-table requirements and challenges
• Define and document backup and restore strategies
Planning
Data
Analysis
Data
Modeling
App
DevOps
MigrationTesting
Data Analysis
Source DB
Source Data Analysis
Key data attributes:
• Number of items to be migrated
• Distribution of the item sizes
• Multiplicity of values to be used as
partition or sort keys
• Data lifecycle
Application access patterns
• Writes and updates
• LightWeight transactions
• Queries
Planning
Data
Analysis
Data
Modeling
App
DevOps
MigrationTesting
Data Modeling and
Capacity Planning
Data Modeling
• Work from access patterns
• Define instantiated views for your access patterns
• Define a partitioning (scaling) scheme
• Likely the same as in Cassandra
• Define data structure
• What are records (items in DynamoDB) going to look like?
• May not be the same as in Cassandra
• Streams and Triggers
Planning
Data
Analysis
Data
Modeling
App
DevOps
MigrationTesting
DynamoDB fundamentals: Table
Table
Items
Attributes
Partition
Key
Sort
Key
Mandatory
Key-value access pattern
Determines data distribution
Optional
Model 1:N relationships
Enables rich query capabilities
All items for key
==, <, >, >=, <=
“begins with”
“between”
“contains”
“in”
sorted results
counts
top/bottom N values
DynamoDB Sort key  Cassandra clustering key
DynamoDB fundamentals: data types
Type DynamoDB Type
String String
Integer, Float Number
Timestamp Number or String
Blob Binary
Boolean Bool
Null Null
List List
Set
Set of String,
Number, or Binary
Map Map
DynamoDB: provisioned throughput capacity
Per table/GSI
Read Capacity Unit (RCU)
1 RCU returns 4KB of data for strongly
consistent reads, or double the data
for eventually consistent reads
Capacity is per second, rounded up to
the next whole number
Write Capacity Unit (WCU)
1 WCU writes 1KB of data, and each
item consumes 1 WCU minimum
Modeling hierarchical data: item hierarchies…
• Use composite sort key to define a hierarchy
• Highly selective result sets
Primary Key
Attributes
ProductID type
Items
1 bookID
title author genre publisher datePublished ISBN
Some Book John Smith Science Fiction Ballantine Oct-70 0-345-02046-4
2 albumID
title artist genre label studio released producer
Some Album Some Band Progressive Rock Harvest Abbey Road 3/1/73 Somebody
2 albumID:trackID
title length music vocals
Track 1 1:30 Mason Instrumental
2 albumID:trackID
title length music vocals
Track 2 2:43 Mason Mason
2 albumID:trackID
title length music vocals
Track 3 3:30 Smith Johnson
3 movieID
title genre writer producer
Some Movie Scifi Comedy Joe Smith 20th Century Fox
3 movieID:actorID
name character image
Some Actor Joe img2.jpg
3 movieID:actorID
name character image
Some Actress Rita img3.jpg
3 movieID:actorID
name character image
Some Actor Frito img1.jpg
… or documents (JSON)
• JSON data types (M, L, BOOL, NULL)
• Document SDKs available
• 400 KB maximum item size (limits hierarchical data structure)
Primary Key
Attributes
ProductID
Items
1
id title author genre publisher datePublished ISBN
bookID Some Book Some Guy Science Fiction Ballantine Oct-70 0-345-02046-4
2
id title artist genre Attributes
albumID Some Album Some Band Progressive Rock
{ label:"Harvest", studio: "Abbey Road", published: "3/1/73", producer: "Pink
Floyd", tracks: [{title: "Speak to Me", length: "1:30", music: "Mason", vocals:
"Instrumental"},{title: ”Breathe", length: ”2:43", music: ”Waters, Gilmour,
Wright", vocals: ”Gilmour"},{title: ”On the Run", length: “3:30", music:
”Gilmour, Waters", vocals: "Instrumental"}]}
3
id title genre writer Attributes
movieID Some Movie Scifi Comedy Joe Smith
{ producer: "20th Century Fox", actors: [{ name: "Luke Wilson", dob:
"9/21/71", character: "Joe Bowers", image: "img2.jpg"},{ name: "Maya
Rudolph", dob: "7/27/72", character: "Rita", image: "img1.jpg"},{ name: "Dax
Shepard", dob: "1/2/75", character: "Frito Pendejo", image: "img3.jpg"}]
Data Modeling and
Capacity Planning
Capacity Planning
• Reads, writes, and storage for tables and GSI’s
• Cost considerations
• Streams
• Migration vs. post-migration capacity
• Is there an initial data import phase?
• Provision capacity for import during migration, then for normal operation
Planning
Data
Analysis
Data
Modeling
App
DevOps
MigrationTesting
Application Development
and Operations
Development and Testing
• Data access layer for DynamoDB
• Dynamic application configuration
• Write/read to/from both source and target database
• Support gradual switchover
• Test rollback, backup/restore
Planning
Data
Analysis
Data
Modeling
App
DevOps
MigrationTesting
Example: GumGum
Source: https://techblog.gumgum.com/articles/moving-to-amazon-dynamodb-from-hosted-cassandra
DynamoDB API
Admin CRUD
Create Table Put/Get Item
Update Table Batch Put/Get Item
Delete Table Update Item
Describe Table Delete Item
Query
Scan
DynamoDB Streams
ListStreams
DescribeStream
GetShardIterator
GetRecords
Streams APITable and Item API
DynamoDB API Notes
• Conditional writes/updates
• ConditionalCheckFailedException
• ConsistentRead parameter
• On a per request basis when using GetItem, Scan, Query
• Filtering (FilterExpression)
• Sort order (ScanIndexForward)
• Limit
• Pagination (LastEvaluatedKey, ExclusiveStartKey)
• ThrottlingException and
ProvisionedThroughputExceededException
• ReturnConsumedCapacity
Rich Expressions
• Projection expression
• Query/Get/Scan: ProductReviews.FiveStar[0]
• Filter expression
• Query/Scan: #V > :num (#V is a place holder for keyword VIEWS)
• Conditional expression
• Put/Update/DeleteItem: attribute_not_exists (#pr.FiveStar)
• Update expression
• UpdateItem: set Replies = Replies + :num
DynamoDB Error Handling
• HTTP 400
• A problem with request
• Common: ProvisionedThroughputExceededException
• Use exponential backoff to retry
• ConditionalCheckFailedException
• HTTP 500
• A problem on the service side
• OK to retry immediately or after a short delay
• Retries and exponential backoff
• Enabled and handled by SDK by default
• Understand the backoff strategies – e.g. in Java SDK:
PredefinedBackoffStrategies.java
• Consider disabling the default and implementing your own
• Log and monitor failed requests
Application Development
and Operations
Operations
• Security/access control via AWS IAM
• Deployment via AWS CloudFormation
• Monitoring via Amazon CloudWatch, AWS CloudTrail, AWS Config
• Or use third-party offerings
• Custom application metrics
Planning
Data
Analysis
Data
Modeling
App
DevOps
MigrationTesting
Data Migration
• Approach depends on application
• Examples
• Data lifecycle-based: data stored for a set amount of time
• Phase 1: write to both, read from Cassandra
• Phase 2: after a full cycle start reading from DynamoDB
• Incremental
• E.g. user-based
• Migrate users over time, each user as quickly as possible
• Batch export/import + live updates
• Use EMR to import data into DynamoDB
• Provision enough write capacity
• Keep read capacity down
• Tune EMR cluster instance type and size, DynamoDB write ratio
Planning
Data
Analysis
Data
Modeling
App
DevOps
MigrationTesting
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Example: Samsung
• Online migration
• Full migration is not possible (several 100s of TB sized tables) : Per user migration
• Some users are in Cassandra while others are in DynamoDB : Storage path DB
• To minimize impact for each user, migrate as soon as possible : Accelerate migration
Cassandra
cluster
DynamoDB
App Servers
user storage path
User A <path_to_cassandra>
User B <path_to_dynamodb>
Mobile
clients Storage
path DB
User
A
User
B
Normal data flow
for migrated users
Normal data flow
for not migrated users
Migration data flow
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Benefits of migrating to DynamoDB:
• Successfully launched Samsung Cloud service supporting massive scale workloads for Samsung
Galaxy Smart Phones
• 40% cost saving in NoSQL infrastructure cost
• No capacity planning for Peta-byte Scale Storage Capacity with on-demand capacity
• Consistent performance at 10s of Millions of operations
• Zero administration for Hundreds of Tables with DynamoDB Auto Scaling
• No failures during 2 years Operation
• No data corruption or loss for Billions of Items
• Enterprise Level Security and Compliance using VPC Endpoints for DynamoDB
Benefits of migrating to DynamoDB:
• Stability, performance at scale
• ~2-3ms read and ~4-6ms write latency
• DynamoDB elastic provisioning via auto scaling
• Zero maintenance and operations overhead
• 65-70% TCO savings over Cassandra
”
“
Anirban Roy
GumGum
https://techblog.gumgum.com/articles/moving-to-amazon-dynamodb-from-hosted-cassandra
Cassandra to DynamoDB FAQ
• Q: We are using Cassandra to store time-series data, and DynamoDB is a key-value store, so it’s
not efficient for time-series data.
DynamoDB is as efficient at storing time-series data as Cassandra is. DynamoDB
supports composite primary keys (partition and sort key), and when the sort key is a
timestamp, DynamoDB stores data grouped and sorted by time, allowing for efficient
access by time and time ranges.
• Q: Does DynamoDB have tombstones like Cassandra?
No, DynamoDB does not use tombstones so there aren’t performance issues arising
from them.
• Q: Does DynamoDB do compactions like Cassandra?
No, DynamoDB does not do compactions, and so there is no risk of performance
degradation due to those.
• Q: Can DynamoDB latency be affected by JVM garbage collection?
No. With respect to all three questions above, the key thing to know is that, as a fully-
managed service, DynamoDB manages the resources for you to deliver consistent
Cassandra to DynamoDB FAQ
• Q: We had a hard time with data consistency with Cassandra. Won’t we have the same problem
with DynamoDB?
No. DynamoDB manages consistency for you and provides straightforward options for
reads, as well as conditional update API that can be used to implement transactional
behavior.
• Q: We have rows in Cassandra with many columns, and they are large in size. Doesn’t DynamoDB
have a limit on row size?
Yes, DynamoDB limits each row (“item”) size to 400KB. Wide rows from Cassandra can
be modeled as item hierarchies (multi-row collections) in DynamoDB, or JSON
documents, depending on access patterns. Also, consider using compression for large
fields/data that is not queried, and/or storing large objects in S3.
• DynamoDB does not have a date/time data type. What should we use instead?
Date/time values should be stored as numbers (in epoch format) if they will be used
with DynamoDB TTL. They can also be stored as strings (e.g. in ISO 8601 to allow for
sorting).
From Cassandra to DynamoDB: Summary
• Trading one set of idiosyncrasies for another…
• From thinking about instances to thinking about read/write capacity
• From Cassandra column families to DynamoDB tables and item hierarchies
• DynamoDB item size is limited to 400KB
• Store date/time data as Numbers or Strings in DynamoDB
• Benefits of DynamoDB
• No database maintenance, no clusters to manage
• Effortless scaling
• Stability and consistent performance at any scale
• In all AWS regions around the world
• Agility
• Cost savings
From Cassandra to DynamoDB: Summary
• Keys to Success
• Understand the source data and access patterns
• Understand capacity and cost and how scaling affects it
• Test thoroughly and often
• Plan on an iterative migration process
• Learn from documented cases
References
• DynamoDB Best Practices
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/best-practices.html
• Samsung Cassandra to DynamoDB Migration
• https://www.youtube.com/watch?v=Z-2UIrI9feQ#t=01m44s
• Moving to Amazon DynamoDB from Cassandra: A Leap Towards 60% Cost
Saving per Year
• http://techblog.gumgum.com/articles/moving-to-amazon-dynamodb-from-hosted-cassandra
• Becoming a Nimble Giant: How DynamoDB Serves Nike at Scale
• https://medium.com/nikeengineering/becoming-a-nimble-giant-how-dynamo-db-serves-nike-at-scale-4cc375dbb18e
• Why Druva Moved Away from Cassandra
• https://www.druva.com/blog/why-druva-moved-away-from-cassandra/
• https://aws.amazon.com/solutions/case-studies/druva/
• Why Tellybug Moved from Cassandra to Amazon DynamoDB
• https://attentionshard.wordpress.com/2013/09/30/why-tellybug-moved-from-cassandra-to-amazon-dynamodb/
Thank you

Weitere ähnliche Inhalte

Was ist angesagt?

AWS 비용 효율화를 고려한 Reserved Instance + Savings Plan 옵션 - 박윤 어카운트 매니저 :: AWS Game...
AWS 비용 효율화를 고려한 Reserved Instance + Savings Plan 옵션 - 박윤 어카운트 매니저 :: AWS Game...AWS 비용 효율화를 고려한 Reserved Instance + Savings Plan 옵션 - 박윤 어카운트 매니저 :: AWS Game...
AWS 비용 효율화를 고려한 Reserved Instance + Savings Plan 옵션 - 박윤 어카운트 매니저 :: AWS Game...Amazon Web Services Korea
 
HSBC and AWS Day - Microservices and Serverless
HSBC and AWS Day - Microservices and ServerlessHSBC and AWS Day - Microservices and Serverless
HSBC and AWS Day - Microservices and ServerlessAmazon Web Services
 
Serverless data and analytics on AWS for operations
Serverless data and analytics on AWS for operations Serverless data and analytics on AWS for operations
Serverless data and analytics on AWS for operations CloudHesive
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightAmazon Web Services
 
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...Amazon Web Services
 
Deep Dive on Amazon QuickSight - January 2017 AWS Online Tech Talks
Deep Dive on Amazon QuickSight - January 2017 AWS Online Tech TalksDeep Dive on Amazon QuickSight - January 2017 AWS Online Tech Talks
Deep Dive on Amazon QuickSight - January 2017 AWS Online Tech TalksAmazon Web Services
 
AWS Partner Data Analytics on AWS_Handout.pdf
AWS Partner Data Analytics on AWS_Handout.pdfAWS Partner Data Analytics on AWS_Handout.pdf
AWS Partner Data Analytics on AWS_Handout.pdfSrinjoySaha12
 
AWS Well-Architected Framework: Operational Excellence Pillar
AWS Well-Architected Framework: Operational Excellence PillarAWS Well-Architected Framework: Operational Excellence Pillar
AWS Well-Architected Framework: Operational Excellence PillarJonathan LaCour
 
How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...
How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...
How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...Amazon Web Services
 
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...Amazon Web Services Korea
 
Amazon RDS: Deep Dive - SRV310 - Chicago AWS Summit
Amazon RDS: Deep Dive - SRV310 - Chicago AWS SummitAmazon RDS: Deep Dive - SRV310 - Chicago AWS Summit
Amazon RDS: Deep Dive - SRV310 - Chicago AWS SummitAmazon Web Services
 
Big Data Use Cases and Solutions in the AWS Cloud
Big Data Use Cases and Solutions in the AWS CloudBig Data Use Cases and Solutions in the AWS Cloud
Big Data Use Cases and Solutions in the AWS CloudAmazon Web Services
 
Amazon RDS 살펴보기 (김용우) - AWS 웨비나 시리즈
Amazon RDS 살펴보기 (김용우) - AWS 웨비나 시리즈 Amazon RDS 살펴보기 (김용우) - AWS 웨비나 시리즈
Amazon RDS 살펴보기 (김용우) - AWS 웨비나 시리즈 Amazon Web Services Korea
 
Moving your Desktops to the Cloud with Amazon WorkSpaces
Moving your Desktops to the Cloud with Amazon WorkSpacesMoving your Desktops to the Cloud with Amazon WorkSpaces
Moving your Desktops to the Cloud with Amazon WorkSpacesAmazon Web Services
 
Getting Started with AWS Compute Services
Getting Started with AWS Compute ServicesGetting Started with AWS Compute Services
Getting Started with AWS Compute ServicesAmazon Web Services
 
So you want to be Well-Architected?
So you want to be Well-Architected?So you want to be Well-Architected?
So you want to be Well-Architected?Amazon Web Services
 
Disaster Recovery using Azure Services
Disaster Recovery using Azure ServicesDisaster Recovery using Azure Services
Disaster Recovery using Azure ServicesAnoop Nair
 

Was ist angesagt? (20)

AWS 비용 효율화를 고려한 Reserved Instance + Savings Plan 옵션 - 박윤 어카운트 매니저 :: AWS Game...
AWS 비용 효율화를 고려한 Reserved Instance + Savings Plan 옵션 - 박윤 어카운트 매니저 :: AWS Game...AWS 비용 효율화를 고려한 Reserved Instance + Savings Plan 옵션 - 박윤 어카운트 매니저 :: AWS Game...
AWS 비용 효율화를 고려한 Reserved Instance + Savings Plan 옵션 - 박윤 어카운트 매니저 :: AWS Game...
 
HSBC and AWS Day - Microservices and Serverless
HSBC and AWS Day - Microservices and ServerlessHSBC and AWS Day - Microservices and Serverless
HSBC and AWS Day - Microservices and Serverless
 
Serverless data and analytics on AWS for operations
Serverless data and analytics on AWS for operations Serverless data and analytics on AWS for operations
Serverless data and analytics on AWS for operations
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSight
 
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
 
Deep Dive on Amazon QuickSight - January 2017 AWS Online Tech Talks
Deep Dive on Amazon QuickSight - January 2017 AWS Online Tech TalksDeep Dive on Amazon QuickSight - January 2017 AWS Online Tech Talks
Deep Dive on Amazon QuickSight - January 2017 AWS Online Tech Talks
 
AWS Partner Data Analytics on AWS_Handout.pdf
AWS Partner Data Analytics on AWS_Handout.pdfAWS Partner Data Analytics on AWS_Handout.pdf
AWS Partner Data Analytics on AWS_Handout.pdf
 
AWS Well-Architected Framework: Operational Excellence Pillar
AWS Well-Architected Framework: Operational Excellence PillarAWS Well-Architected Framework: Operational Excellence Pillar
AWS Well-Architected Framework: Operational Excellence Pillar
 
How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...
How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...
How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...
 
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...
 
Amazon Aurora
Amazon AuroraAmazon Aurora
Amazon Aurora
 
Amazon RDS: Deep Dive - SRV310 - Chicago AWS Summit
Amazon RDS: Deep Dive - SRV310 - Chicago AWS SummitAmazon RDS: Deep Dive - SRV310 - Chicago AWS Summit
Amazon RDS: Deep Dive - SRV310 - Chicago AWS Summit
 
Big Data Use Cases and Solutions in the AWS Cloud
Big Data Use Cases and Solutions in the AWS CloudBig Data Use Cases and Solutions in the AWS Cloud
Big Data Use Cases and Solutions in the AWS Cloud
 
Amazon RDS 살펴보기 (김용우) - AWS 웨비나 시리즈
Amazon RDS 살펴보기 (김용우) - AWS 웨비나 시리즈 Amazon RDS 살펴보기 (김용우) - AWS 웨비나 시리즈
Amazon RDS 살펴보기 (김용우) - AWS 웨비나 시리즈
 
Moving your Desktops to the Cloud with Amazon WorkSpaces
Moving your Desktops to the Cloud with Amazon WorkSpacesMoving your Desktops to the Cloud with Amazon WorkSpaces
Moving your Desktops to the Cloud with Amazon WorkSpaces
 
Getting Started with AWS Compute Services
Getting Started with AWS Compute ServicesGetting Started with AWS Compute Services
Getting Started with AWS Compute Services
 
Amazon QuickSight
Amazon QuickSightAmazon QuickSight
Amazon QuickSight
 
Cost Optimization on AWS
Cost Optimization on AWSCost Optimization on AWS
Cost Optimization on AWS
 
So you want to be Well-Architected?
So you want to be Well-Architected?So you want to be Well-Architected?
So you want to be Well-Architected?
 
Disaster Recovery using Azure Services
Disaster Recovery using Azure ServicesDisaster Recovery using Azure Services
Disaster Recovery using Azure Services
 

Ähnlich wie How to Migrate from Cassandra to Amazon DynamoDB - AWS Online Tech Talks

AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...Amazon Web Services
 
AWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDBAWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDBAmazon Web Services
 
Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...Tech Triveni
 
DAT320_Moving a Galaxy into Cloud
DAT320_Moving a Galaxy into CloudDAT320_Moving a Galaxy into Cloud
DAT320_Moving a Galaxy into CloudAmazon Web Services
 
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL ServicesAmazon Web Services
 
SRV415 NEW LAUNCH! DynamoDB just got faster: Deep Dive on DAX and more
SRV415 NEW LAUNCH!  DynamoDB just got faster: Deep Dive on DAX and moreSRV415 NEW LAUNCH!  DynamoDB just got faster: Deep Dive on DAX and more
SRV415 NEW LAUNCH! DynamoDB just got faster: Deep Dive on DAX and moreAmazon Web Services
 
Solving your Backup Needs - Ben Cefalo mdbe18
Solving your Backup Needs - Ben Cefalo mdbe18Solving your Backup Needs - Ben Cefalo mdbe18
Solving your Backup Needs - Ben Cefalo mdbe18MongoDB
 
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...Amazon Web Services
 
How & When to Use NoSQL at Websummit Dublin
How & When to Use NoSQL at Websummit DublinHow & When to Use NoSQL at Websummit Dublin
How & When to Use NoSQL at Websummit DublinAmazon Web Services
 
AWS Summit Singapore - Managing a Database Migration Project | Best Practices
AWS Summit Singapore - Managing a Database Migration Project | Best PracticesAWS Summit Singapore - Managing a Database Migration Project | Best Practices
AWS Summit Singapore - Managing a Database Migration Project | Best PracticesAmazon Web Services
 
(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...
(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...
(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...Amazon Web Services
 
Dynamodb Presentation
Dynamodb PresentationDynamodb Presentation
Dynamodb Presentationadvaitdeo
 
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개Amazon Web Services Korea
 
AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924Amazon Web Services
 
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...Amazon Web Services
 
La big datacamp-2014-aws-dynamodb-overview-michael_limcaco
La big datacamp-2014-aws-dynamodb-overview-michael_limcacoLa big datacamp-2014-aws-dynamodb-overview-michael_limcaco
La big datacamp-2014-aws-dynamodb-overview-michael_limcacoData Con LA
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSAmazon Web Services
 
IPC Data Analysis and Extraction
IPC Data Analysis and ExtractionIPC Data Analysis and Extraction
IPC Data Analysis and Extractionpzybrick
 
L’architettura di Classe Enterprise di Nuova Generazione
L’architettura di Classe Enterprise di Nuova GenerazioneL’architettura di Classe Enterprise di Nuova Generazione
L’architettura di Classe Enterprise di Nuova GenerazioneMongoDB
 

Ähnlich wie How to Migrate from Cassandra to Amazon DynamoDB - AWS Online Tech Talks (20)

AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
 
AWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDBAWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDB
 
How and when to use NoSQL
How and when to use NoSQLHow and when to use NoSQL
How and when to use NoSQL
 
Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...
 
DAT320_Moving a Galaxy into Cloud
DAT320_Moving a Galaxy into CloudDAT320_Moving a Galaxy into Cloud
DAT320_Moving a Galaxy into Cloud
 
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
 
SRV415 NEW LAUNCH! DynamoDB just got faster: Deep Dive on DAX and more
SRV415 NEW LAUNCH!  DynamoDB just got faster: Deep Dive on DAX and moreSRV415 NEW LAUNCH!  DynamoDB just got faster: Deep Dive on DAX and more
SRV415 NEW LAUNCH! DynamoDB just got faster: Deep Dive on DAX and more
 
Solving your Backup Needs - Ben Cefalo mdbe18
Solving your Backup Needs - Ben Cefalo mdbe18Solving your Backup Needs - Ben Cefalo mdbe18
Solving your Backup Needs - Ben Cefalo mdbe18
 
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
 
How & When to Use NoSQL at Websummit Dublin
How & When to Use NoSQL at Websummit DublinHow & When to Use NoSQL at Websummit Dublin
How & When to Use NoSQL at Websummit Dublin
 
AWS Summit Singapore - Managing a Database Migration Project | Best Practices
AWS Summit Singapore - Managing a Database Migration Project | Best PracticesAWS Summit Singapore - Managing a Database Migration Project | Best Practices
AWS Summit Singapore - Managing a Database Migration Project | Best Practices
 
(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...
(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...
(HLS402) Getting into Your Genes: The Definitive Guide to Using Amazon EMR, A...
 
Dynamodb Presentation
Dynamodb PresentationDynamodb Presentation
Dynamodb Presentation
 
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개
2017 AWS DB Day | Amazon DynamoDB 서비스, 개요 및 신규 기능 소개
 
AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924
 
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
 
La big datacamp-2014-aws-dynamodb-overview-michael_limcaco
La big datacamp-2014-aws-dynamodb-overview-michael_limcacoLa big datacamp-2014-aws-dynamodb-overview-michael_limcaco
La big datacamp-2014-aws-dynamodb-overview-michael_limcaco
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWS
 
IPC Data Analysis and Extraction
IPC Data Analysis and ExtractionIPC Data Analysis and Extraction
IPC Data Analysis and Extraction
 
L’architettura di Classe Enterprise di Nuova Generazione
L’architettura di Classe Enterprise di Nuova GenerazioneL’architettura di Classe Enterprise di Nuova Generazione
L’architettura di Classe Enterprise di Nuova Generazione
 

Mehr von Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

How to Migrate from Cassandra to Amazon DynamoDB - AWS Online Tech Talks

  • 1. Migrating from Cassandra to Amazon DynamoDB Edin Zulich NoSQL Solutions Architect May 2018
  • 2. Agenda • Cassandra/DynamoDB overview • Why customers migrate from Cassandra • Benefits of migrating to DynamoDB • Migration process • FAQ • Summary • References
  • 3. Cassandra/DynamoDB Overview • Both NoSQL, inspired by Dynamo • Both used for similar use cases and requirements • Scalable NoSQL store • Performance • Time-series data • Global reach • Both API-driven • Different Capacity and cost model • Instance based vs. request based • Manage your cluster vs. consume a fully-managed service
  • 4. Why Customers Migrate from Cassandra “ The database maintenance overhead slowed down the overall progress. Additional time spent scaling, upgrading and maintaining a database cluster removed time away from adding features or improving service code. ” “ …when the company achieved a moderate traffic scale, the database started becoming unstable and presented us with occasional mood swings (here I refer to long running heavy compactions, infrastructure failures, buggy software patching etc.) leading to outages. Anirban Roy GumGum https://techblog.gumgum.com/articles/moving-to-amazon-dynamodb-from-hosted-cassandra Zack Owens Cloud Architect, Nike https://medium.com/nikeengineering/becoming-a-nimble-giant-how-dynamo-db-serves-nike-at-scale-4cc375dbb18e
  • 5. Why Customers Migrate from Cassandra Operational challenges grow with scale • Resulting in increased complexity, resources and cost • This is true for all distributed/clustered databases
  • 6. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Why DynamoDB? Here’s an example:
  • 7. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 8. Amazon DynamoDB Fully managed nonrelational database for any scale Scale and performance Fast, consistent performance Virtually unlimited throughput Virtually unlimited storage Secure Encryption at rest and in transit VPC Endpoints Fine-grained access control PCI, HIPAA, FIPS140-2 eligible Fully managed Maintenance-free Serverless Auto scaling Backup and restore Global tables
  • 9. DynamoDB over the last year VPC Endpoints April 2017 Auto Scaling June 2017 DynamoDB Accelerator (DAX) April 2017 Time to Live (TTL) February 2017 Global Tables Backup and recovery Encryption at rest February 2018December 2017 December 2017, March 2018 + Adaptive Capacity
  • 10. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. DynamoDB Global Tables F u l l y m a n a g e d , m u l t i - m a s t e r, m u l t i - re g i o n d a ta b a s e Build high performance, globally distributed applications Low latency reads & writes to locally available tables Disaster proof with multi-region redundancy Easy to setup and no application re-writes required
  • 11. Benefits of migrating to DynamoDB • Stability, performance at scale • Single-digit millisecond latency for reads and writes at any scale • DynamoDB elastic provisioning • Zero maintenance and operations overhead • 30% to 70% cost savings over Cassandra
  • 13. A Phased Approach to Migration Planning Data Analysis Data Modeling Testing Migration App DevOps It’s more of an application migration than a data migration
  • 14. Samsung Cloud Service • Backup and restore and key value store for mobile app • 300 million users • Entire migration process took ~12 mo. • Almost 1 PB in DynamoDB, 130 million daily API requests • Consistent performance and 70% cost savings (TCO) “DynamoDB provided consistent high performance at a drastically lower cost than Cassandra.” —Seongkyu Kim Samsung Example:
  • 15. Planning • Define requirements and goals of the migration • Application specific • E.g.: Is downtime allowed for cutover? If so, how long? • Opportunity for “spring cleaning” • Document per-table requirements and challenges • Define and document backup and restore strategies Planning Data Analysis Data Modeling App DevOps MigrationTesting
  • 16. Data Analysis Source DB Source Data Analysis Key data attributes: • Number of items to be migrated • Distribution of the item sizes • Multiplicity of values to be used as partition or sort keys • Data lifecycle Application access patterns • Writes and updates • LightWeight transactions • Queries Planning Data Analysis Data Modeling App DevOps MigrationTesting
  • 17. Data Modeling and Capacity Planning Data Modeling • Work from access patterns • Define instantiated views for your access patterns • Define a partitioning (scaling) scheme • Likely the same as in Cassandra • Define data structure • What are records (items in DynamoDB) going to look like? • May not be the same as in Cassandra • Streams and Triggers Planning Data Analysis Data Modeling App DevOps MigrationTesting
  • 18. DynamoDB fundamentals: Table Table Items Attributes Partition Key Sort Key Mandatory Key-value access pattern Determines data distribution Optional Model 1:N relationships Enables rich query capabilities All items for key ==, <, >, >=, <= “begins with” “between” “contains” “in” sorted results counts top/bottom N values DynamoDB Sort key  Cassandra clustering key
  • 19. DynamoDB fundamentals: data types Type DynamoDB Type String String Integer, Float Number Timestamp Number or String Blob Binary Boolean Bool Null Null List List Set Set of String, Number, or Binary Map Map
  • 20. DynamoDB: provisioned throughput capacity Per table/GSI Read Capacity Unit (RCU) 1 RCU returns 4KB of data for strongly consistent reads, or double the data for eventually consistent reads Capacity is per second, rounded up to the next whole number Write Capacity Unit (WCU) 1 WCU writes 1KB of data, and each item consumes 1 WCU minimum
  • 21. Modeling hierarchical data: item hierarchies… • Use composite sort key to define a hierarchy • Highly selective result sets Primary Key Attributes ProductID type Items 1 bookID title author genre publisher datePublished ISBN Some Book John Smith Science Fiction Ballantine Oct-70 0-345-02046-4 2 albumID title artist genre label studio released producer Some Album Some Band Progressive Rock Harvest Abbey Road 3/1/73 Somebody 2 albumID:trackID title length music vocals Track 1 1:30 Mason Instrumental 2 albumID:trackID title length music vocals Track 2 2:43 Mason Mason 2 albumID:trackID title length music vocals Track 3 3:30 Smith Johnson 3 movieID title genre writer producer Some Movie Scifi Comedy Joe Smith 20th Century Fox 3 movieID:actorID name character image Some Actor Joe img2.jpg 3 movieID:actorID name character image Some Actress Rita img3.jpg 3 movieID:actorID name character image Some Actor Frito img1.jpg
  • 22. … or documents (JSON) • JSON data types (M, L, BOOL, NULL) • Document SDKs available • 400 KB maximum item size (limits hierarchical data structure) Primary Key Attributes ProductID Items 1 id title author genre publisher datePublished ISBN bookID Some Book Some Guy Science Fiction Ballantine Oct-70 0-345-02046-4 2 id title artist genre Attributes albumID Some Album Some Band Progressive Rock { label:"Harvest", studio: "Abbey Road", published: "3/1/73", producer: "Pink Floyd", tracks: [{title: "Speak to Me", length: "1:30", music: "Mason", vocals: "Instrumental"},{title: ”Breathe", length: ”2:43", music: ”Waters, Gilmour, Wright", vocals: ”Gilmour"},{title: ”On the Run", length: “3:30", music: ”Gilmour, Waters", vocals: "Instrumental"}]} 3 id title genre writer Attributes movieID Some Movie Scifi Comedy Joe Smith { producer: "20th Century Fox", actors: [{ name: "Luke Wilson", dob: "9/21/71", character: "Joe Bowers", image: "img2.jpg"},{ name: "Maya Rudolph", dob: "7/27/72", character: "Rita", image: "img1.jpg"},{ name: "Dax Shepard", dob: "1/2/75", character: "Frito Pendejo", image: "img3.jpg"}]
  • 23. Data Modeling and Capacity Planning Capacity Planning • Reads, writes, and storage for tables and GSI’s • Cost considerations • Streams • Migration vs. post-migration capacity • Is there an initial data import phase? • Provision capacity for import during migration, then for normal operation Planning Data Analysis Data Modeling App DevOps MigrationTesting
  • 24. Application Development and Operations Development and Testing • Data access layer for DynamoDB • Dynamic application configuration • Write/read to/from both source and target database • Support gradual switchover • Test rollback, backup/restore Planning Data Analysis Data Modeling App DevOps MigrationTesting
  • 26. DynamoDB API Admin CRUD Create Table Put/Get Item Update Table Batch Put/Get Item Delete Table Update Item Describe Table Delete Item Query Scan DynamoDB Streams ListStreams DescribeStream GetShardIterator GetRecords Streams APITable and Item API
  • 27. DynamoDB API Notes • Conditional writes/updates • ConditionalCheckFailedException • ConsistentRead parameter • On a per request basis when using GetItem, Scan, Query • Filtering (FilterExpression) • Sort order (ScanIndexForward) • Limit • Pagination (LastEvaluatedKey, ExclusiveStartKey) • ThrottlingException and ProvisionedThroughputExceededException • ReturnConsumedCapacity
  • 28. Rich Expressions • Projection expression • Query/Get/Scan: ProductReviews.FiveStar[0] • Filter expression • Query/Scan: #V > :num (#V is a place holder for keyword VIEWS) • Conditional expression • Put/Update/DeleteItem: attribute_not_exists (#pr.FiveStar) • Update expression • UpdateItem: set Replies = Replies + :num
  • 29. DynamoDB Error Handling • HTTP 400 • A problem with request • Common: ProvisionedThroughputExceededException • Use exponential backoff to retry • ConditionalCheckFailedException • HTTP 500 • A problem on the service side • OK to retry immediately or after a short delay • Retries and exponential backoff • Enabled and handled by SDK by default • Understand the backoff strategies – e.g. in Java SDK: PredefinedBackoffStrategies.java • Consider disabling the default and implementing your own • Log and monitor failed requests
  • 30. Application Development and Operations Operations • Security/access control via AWS IAM • Deployment via AWS CloudFormation • Monitoring via Amazon CloudWatch, AWS CloudTrail, AWS Config • Or use third-party offerings • Custom application metrics Planning Data Analysis Data Modeling App DevOps MigrationTesting
  • 31. Data Migration • Approach depends on application • Examples • Data lifecycle-based: data stored for a set amount of time • Phase 1: write to both, read from Cassandra • Phase 2: after a full cycle start reading from DynamoDB • Incremental • E.g. user-based • Migrate users over time, each user as quickly as possible • Batch export/import + live updates • Use EMR to import data into DynamoDB • Provision enough write capacity • Keep read capacity down • Tune EMR cluster instance type and size, DynamoDB write ratio Planning Data Analysis Data Modeling App DevOps MigrationTesting
  • 32. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Example: Samsung • Online migration • Full migration is not possible (several 100s of TB sized tables) : Per user migration • Some users are in Cassandra while others are in DynamoDB : Storage path DB • To minimize impact for each user, migrate as soon as possible : Accelerate migration Cassandra cluster DynamoDB App Servers user storage path User A <path_to_cassandra> User B <path_to_dynamodb> Mobile clients Storage path DB User A User B Normal data flow for migrated users Normal data flow for not migrated users Migration data flow
  • 33. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Benefits of migrating to DynamoDB: • Successfully launched Samsung Cloud service supporting massive scale workloads for Samsung Galaxy Smart Phones • 40% cost saving in NoSQL infrastructure cost • No capacity planning for Peta-byte Scale Storage Capacity with on-demand capacity • Consistent performance at 10s of Millions of operations • Zero administration for Hundreds of Tables with DynamoDB Auto Scaling • No failures during 2 years Operation • No data corruption or loss for Billions of Items • Enterprise Level Security and Compliance using VPC Endpoints for DynamoDB
  • 34. Benefits of migrating to DynamoDB: • Stability, performance at scale • ~2-3ms read and ~4-6ms write latency • DynamoDB elastic provisioning via auto scaling • Zero maintenance and operations overhead • 65-70% TCO savings over Cassandra ” “ Anirban Roy GumGum https://techblog.gumgum.com/articles/moving-to-amazon-dynamodb-from-hosted-cassandra
  • 35. Cassandra to DynamoDB FAQ • Q: We are using Cassandra to store time-series data, and DynamoDB is a key-value store, so it’s not efficient for time-series data. DynamoDB is as efficient at storing time-series data as Cassandra is. DynamoDB supports composite primary keys (partition and sort key), and when the sort key is a timestamp, DynamoDB stores data grouped and sorted by time, allowing for efficient access by time and time ranges. • Q: Does DynamoDB have tombstones like Cassandra? No, DynamoDB does not use tombstones so there aren’t performance issues arising from them. • Q: Does DynamoDB do compactions like Cassandra? No, DynamoDB does not do compactions, and so there is no risk of performance degradation due to those. • Q: Can DynamoDB latency be affected by JVM garbage collection? No. With respect to all three questions above, the key thing to know is that, as a fully- managed service, DynamoDB manages the resources for you to deliver consistent
  • 36. Cassandra to DynamoDB FAQ • Q: We had a hard time with data consistency with Cassandra. Won’t we have the same problem with DynamoDB? No. DynamoDB manages consistency for you and provides straightforward options for reads, as well as conditional update API that can be used to implement transactional behavior. • Q: We have rows in Cassandra with many columns, and they are large in size. Doesn’t DynamoDB have a limit on row size? Yes, DynamoDB limits each row (“item”) size to 400KB. Wide rows from Cassandra can be modeled as item hierarchies (multi-row collections) in DynamoDB, or JSON documents, depending on access patterns. Also, consider using compression for large fields/data that is not queried, and/or storing large objects in S3. • DynamoDB does not have a date/time data type. What should we use instead? Date/time values should be stored as numbers (in epoch format) if they will be used with DynamoDB TTL. They can also be stored as strings (e.g. in ISO 8601 to allow for sorting).
  • 37. From Cassandra to DynamoDB: Summary • Trading one set of idiosyncrasies for another… • From thinking about instances to thinking about read/write capacity • From Cassandra column families to DynamoDB tables and item hierarchies • DynamoDB item size is limited to 400KB • Store date/time data as Numbers or Strings in DynamoDB • Benefits of DynamoDB • No database maintenance, no clusters to manage • Effortless scaling • Stability and consistent performance at any scale • In all AWS regions around the world • Agility • Cost savings
  • 38. From Cassandra to DynamoDB: Summary • Keys to Success • Understand the source data and access patterns • Understand capacity and cost and how scaling affects it • Test thoroughly and often • Plan on an iterative migration process • Learn from documented cases
  • 39. References • DynamoDB Best Practices https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/best-practices.html • Samsung Cassandra to DynamoDB Migration • https://www.youtube.com/watch?v=Z-2UIrI9feQ#t=01m44s • Moving to Amazon DynamoDB from Cassandra: A Leap Towards 60% Cost Saving per Year • http://techblog.gumgum.com/articles/moving-to-amazon-dynamodb-from-hosted-cassandra • Becoming a Nimble Giant: How DynamoDB Serves Nike at Scale • https://medium.com/nikeengineering/becoming-a-nimble-giant-how-dynamo-db-serves-nike-at-scale-4cc375dbb18e • Why Druva Moved Away from Cassandra • https://www.druva.com/blog/why-druva-moved-away-from-cassandra/ • https://aws.amazon.com/solutions/case-studies/druva/ • Why Tellybug Moved from Cassandra to Amazon DynamoDB • https://attentionshard.wordpress.com/2013/09/30/why-tellybug-moved-from-cassandra-to-amazon-dynamodb/