© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Data on AWS: How To Choose The
Right Database and Data Storage
Simon Lee,
Business Development Manager
simonhl@amazon.com
Data is a strategic asset
for every organization
The world’s most valuable
resource is
*Copyright: The Economist, 2017, David Parkins
The move toward data-centric companies
[Chart: Five largest companies by market cap in 2001, 2006, 2011, 2016, and 2018; values range from $228B to $1.091T]
What is a data-centric company?
What do we sell?
How do we make money?
Thinking about data as an asset, not a cost
Stop throwing data away
Make it available to more users
Arm users with more data processing technologies
There is more data than people think
Data grows >10x every 5 years
Data platforms need to live for 15 years
Data platforms need to scale 1,000x
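The three figures on this slide are linked by simple compounding: growth of 10x every 5 years, sustained over a 15-year platform lifetime, gives 10 x 10 x 10 = 1,000x. A minimal sketch of that arithmetic follows; the function name is illustrative and not from the deck, and because the slide says ">10x" the result is best read as a lower bound.

```python
def scale_factor(growth_per_period: int, period_years: int, lifetime_years: int) -> int:
    """Compound growth over a platform lifetime, counted in whole periods."""
    periods = lifetime_years // period_years
    return growth_per_period ** periods


# 10x every 5 years, over a 15-year lifetime
print(scale_factor(10, 5, 15))  # 1000
```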
There are more ways to analyze data than ever before
Years ago these didn't exist:
Hadoop: 11 years ago
Elasticsearch: 8 years ago
Presto: 5 years ago
Spark: 4 years ago
There are more people working with data than ever before
Democratization of data | Governance & control
How do I provide democratized access to data to enable informed decisions while at the same time enforcing data governance and preventing mismanagement of the data?
How do we build new
types of applications that
can leverage this data?
Modern apps create new requirements
Users: 1 million+
Data volume: TB–PB–EB
Locality: Global
Performance: Milliseconds–microseconds
Request rate: Millions
Access: Web, mobile, IoT, devices
Scale: Up-down, Out-in
Economics: Pay for what you use
Developer access: No assembly required
Social media | Ride hailing | Media streaming | Dating
As application requirements change,
data processing engines need to evolve as well
On Prime Day, DynamoDB requests from
Alexa, the Amazon.com sites, and the
Amazon fulfillment centers totaled 3.34
trillion, peaking at 12.9 million per second
Databases need to be able to provide reliable performance with
highly variable demands and deliver consistent, single-digit
millisecond response time at any scale.
Common data categories and use cases
Relational: Referential integrity, ACID transactions, schema-on-write. Use cases: lift and shift, ERP, CRM, finance
Key-value: High throughput, low-latency reads and writes, endless scale. Use cases: real-time bidding, shopping cart, social, product catalog, customer preferences
Document: Store documents and quickly query on any attribute. Use cases: content management, personalization, mobile
In-memory: Query by key with microsecond latency. Use cases: leaderboards, real-time analytics, caching
Graph: Quickly and easily create and navigate relationships between data. Use cases: fraud detection, social networking, recommendation engine
Time-series: Collect, store, and process data sequenced by time. Use cases: IoT applications, event tracking
Ledger: Complete, immutable, and verifiable history of all changes to application data. Use cases: systems of record, supply chain, health care, registrations, financial
AWS purpose-built databases
Relational: Amazon Aurora, Amazon RDS (community and commercial engines)
Key-value: DynamoDB
Document: DocumentDB
In-memory: ElastiCache
Graph: Neptune
Time-series: Timestream
Ledger: QLDB
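The category-to-service pairing on this slide can be written down as a small lookup table. The sketch below only encodes what the deck lists; the function name and the error handling are illustrative assumptions, and a real selection would also weigh access patterns, scale, and cost.

```python
from typing import Dict, List

# Mapping taken from the "AWS purpose-built databases" slide.
PURPOSE_BUILT: Dict[str, List[str]] = {
    "relational": ["Amazon Aurora", "Amazon RDS"],
    "key-value": ["Amazon DynamoDB"],
    "document": ["Amazon DocumentDB"],
    "in-memory": ["Amazon ElastiCache"],
    "graph": ["Amazon Neptune"],
    "time-series": ["Amazon Timestream"],
    "ledger": ["Amazon QLDB"],
}


def candidate_services(category: str) -> List[str]:
    """Return the services the deck lists for a data category."""
    key = category.strip().lower()
    if key not in PURPOSE_BUILT:
        raise ValueError(f"unknown data category: {category!r}")
    return PURPOSE_BUILT[key]


print(candidate_services("Graph"))  # ['Amazon Neptune']
```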
Amazon Aurora
MySQL and PostgreSQL-compatible relational database built for the cloud
Performance and availability of commercial-grade databases at 1/10th the cost
Performance and scalability: 5x throughput of standard MySQL and 3x of standard PostgreSQL; scale-out up to 15 read replicas
Availability and durability: Fault-tolerant, self-healing storage; six copies of data across three Availability Zones; continuous backup to Amazon S3
Highly secure: Network isolation, encryption at rest/transit
Fully managed: Managed by RDS: no hardware provisioning, software patching, setup, configuration, or backups
Amazon Relational Database Service (RDS)
Managed relational database service with a choice of six popular database engines
Easy to administer: No need for infrastructure provisioning, installing, and maintaining DB software
Available and durable: Automatic Multi-AZ data replication; automated backup, snapshots, failover
Highly scalable: Scale database compute and storage with a few clicks with no app downtime
Fast and secure: SSD storage and guaranteed provisioned I/O; data encryption at rest and in transit
Amazon DynamoDB
Fast and flexible key-value database service for any scale
Comprehensive security: Encrypts all data by default and fully integrates with AWS Identity and Access Management for robust security
Performance at scale: Consistent, single-digit millisecond response times at any scale; build applications with virtually unlimited throughput
Global database for global users and apps: Build global applications with fast access to local data by easily replicating tables across multiple AWS Regions
Serverless: No server provisioning, software patching, or upgrades; scales up or down automatically; continuously backs up your data
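To make the key-value model concrete, the sketch below builds the parameter dictionary that DynamoDB's low-level PutItem request expects, without calling AWS. The table name, attribute names, and both helper functions are hypothetical; only the attribute-value encoding (`S`, `N`, `BOOL`, with numbers sent as strings) reflects the service's wire format.

```python
from typing import Any, Dict


def to_attribute_value(value: Any) -> Dict[str, Any]:
    """Encode a Python value in DynamoDB's low-level attribute-value format."""
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def put_item_params(table: str, item: Dict[str, Any]) -> Dict[str, Any]:
    """Build keyword arguments for a PutItem request."""
    return {
        "TableName": table,
        "Item": {name: to_attribute_value(v) for name, v in item.items()},
    }


# Hypothetical shopping-cart item keyed by customer and product.
params = put_item_params(
    "ShoppingCart",
    {"customer_id": "c-42", "sku": "book-123", "quantity": 2, "gift": False},
)
print(params["Item"]["quantity"])  # {'N': '2'}
```

With boto3 installed and credentials configured, these parameters could be passed on as `boto3.client("dynamodb").put_item(**params)`, assuming a table with that name and key schema exists.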
Amazon DocumentDB
Fast, scalable, highly available, fully managed MongoDB-compatible database service
Fully managed: Managed by AWS: no hardware provisioning, software patching, setup, configuration, or backups
Fast: Millions of requests per second, millisecond latency
MongoDB-compatible: Compatible with MongoDB Community Edition 3.6. Use the same drivers and tools
Reliable: Six replicas of your data across three AZs with full backup and restore
Amazon ElastiCache
Redis- and Memcached-compatible in-memory data store and cache
Secure and reliable: Network isolation, encryption at rest/transit, HIPAA, PCI, FedRAMP, Multi-AZ, and automatic failover
Redis & Memcached compatible: Fully compatible with open source Redis and Memcached
Easily scalable: Scale writes and reads with sharding and replicas
Extreme performance: In-memory data store and cache for microsecond response times
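One common way an application uses an in-memory cache is lazy loading (often called cache-aside): read from the cache first and fall back to the slower source only on a miss. The deck does not prescribe this pattern, so the sketch below is an assumption for illustration; a plain dictionary stands in for Redis or Memcached, and `slow_lookup` is a hypothetical database read.

```python
import time
from typing import Callable, Dict, Tuple


class LazyLoadingCache:
    """Check the cache first; on a miss, load from the source and cache the result."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float = 60.0):
        self._loader = loader
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)
        self.misses = 0

    def get(self, key: str) -> str:
        entry = self._store.get(key)
        now = time.monotonic()
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit
        self.misses += 1
        value = self._loader(key)  # slow path, e.g. a database query
        self._store[key] = (now + self._ttl, value)
        return value


def slow_lookup(key: str) -> str:
    # Hypothetical stand-in for a database read.
    return f"profile-for-{key}"


cache = LazyLoadingCache(slow_lookup)
cache.get("user-1")
cache.get("user-1")
print(cache.misses)  # 1
```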
Amazon Neptune
Fully managed graph database
Easy: Build powerful queries easily with Gremlin and SPARQL
Fast: Query billions of relationships with millisecond latency
Open: Supports Apache TinkerPop & W3C RDF graph models
Reliable: Six replicas of your data across three AZs with full backup and restore
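Navigating relationships is the core operation of a graph workload such as a recommendation engine. The sketch below shows a two-hop traversal in plain Python over a hypothetical "follows" graph; it is not Gremlin or SPARQL, and the names and edges are invented for illustration.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical "follows" edges; a graph database would store these natively.
FOLLOWS: Dict[str, Set[str]] = {
    "ana": {"ben", "cho"},
    "ben": {"cho", "dev"},
    "cho": {"dev", "eli"},
    "dev": set(),
    "eli": set(),
}


def recommend(user: str, graph: Dict[str, Set[str]]) -> List[str]:
    """Suggest accounts followed by the people `user` follows."""
    direct = graph.get(user, set())
    counts: Counter = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            if candidate != user and candidate not in direct:
                counts[candidate] += 1
    # Most shared connections first; ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked]


print(recommend("ana", FOLLOWS))  # ['dev', 'eli']
```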
Amazon Timestream (sign up for the preview)
Fast, scalable, fully managed time-series database
1,000x faster and 1/10th the cost of relational databases: Collect data at the rate of millions of inserts per second (10M/second)
Trillions of daily events: Adaptive query processing engine maintains steady, predictable performance
Time-series analytics: Built-in functions for interpolation, smoothing, and approximation
Serverless: Automated setup, configuration, server provisioning, software patching
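Interpolation, one of the time-series functions named on this slide, estimates a value between two recorded samples. The sketch below shows linear interpolation in plain Python; it is not Timestream's own function or syntax, it assumes strictly increasing timestamps, and the sensor readings are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (timestamp in seconds, value)


def interpolate(series: List[Point], t: float) -> float:
    """Linearly interpolate the value at time t from time-ordered points."""
    if not series:
        raise ValueError("series is empty")
    if t <= series[0][0]:
        return series[0][1]  # t is at or before the first sample
    for (t0, v0), (t1, v1) in zip(series, series[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return series[-1][1]  # t is after the last sample


# Hypothetical sensor readings with a gap between t=10 and t=30.
readings = [(0.0, 20.0), (10.0, 22.0), (30.0, 26.0)]
print(interpolate(readings, 20.0))  # 24.0
```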
Amazon Quantum Ledger Database (QLDB)
Fully managed ledger database
Track and verify history of all changes made to your application’s data
Immutable: Maintains a sequenced record of all changes to your data, which cannot be deleted or modified; you have the ability to query and analyze the full history
Cryptographically verifiable: Uses cryptography to generate a secure output file of your data's history
Easy to use: Lets you use familiar database capabilities like SQL APIs for querying the data
Highly scalable: Executes 2–3X as many transactions as ledgers in common blockchain frameworks
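The idea of an immutable, cryptographically verifiable history can be illustrated with a hash chain, in which each entry's hash covers the hash of the entry before it. The sketch below is a generic illustration under that assumption and does not describe how QLDB is implemented; the `Journal` class and the sample records are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List


def _digest(prev_hash: str, data: Dict[str, Any]) -> str:
    payload = prev_hash + json.dumps(data, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class Journal:
    """Append-only log where each entry's hash covers the previous hash."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    def append(self, data: Dict[str, Any]) -> str:
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry = {"data": data, "hash": _digest(prev, data)}
        self.entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        """Recompute every hash in order; any edit to history breaks the chain."""
        prev = self.GENESIS
        for entry in self.entries:
            if entry["hash"] != _digest(prev, entry["data"]):
                return False
            prev = entry["hash"]
        return True


journal = Journal()
journal.append({"vin": "V1", "owner": "alice"})
journal.append({"vin": "V1", "owner": "bob"})
print(journal.verify())  # True

journal.entries[0]["data"]["owner"] = "mallory"  # tamper with history
print(journal.verify())  # False
```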
AWS Database Migration Service (AWS DMS)
Migrating databases to AWS
Migrate between on-premises and AWS
Migrate between databases
Automated schema conversion
Data replication for
zero-downtime migration
100,000+
databases migrated
Customers are moving to AWS Databases
Verizon is migrating over 1,000 business-critical applications and database backend systems to AWS, several of
which also include the migration of production databases to Amazon Aurora.
Wappa migrated from their Oracle database to Amazon Aurora and improved their reporting time per
user by 75 percent.
Trimble migrated its Oracle databases to Amazon RDS and projects it will pay about one-quarter of what it paid when managing its private infrastructure.
Intuit migrated from Microsoft SQL Server to Amazon Redshift to reduce data-processing timelines and get
insights to decision makers faster and more frequently.
Equinox Fitness migrated its Teradata on-premises data warehouse to Amazon Redshift. They went from static
reports to a modern data lake that delivers dynamic reports.
By December 2018, Amazon.com had migrated 88% of its Oracle DBs (and 97% of critical system DBs) to Amazon Aurora and Amazon DynamoDB. It also migrated its 50 PB Oracle data warehouse to AWS (Amazon S3, Amazon Redshift, and Amazon EMR).
Samsung Electronics migrated their Cassandra clusters to Amazon DynamoDB for their Samsung Cloud
workload with 70% cost savings.
Equinox Fitness Clubs is a company with integrated luxury and lifestyle
offerings centered on movement, nutrition and regeneration. Equinox
built connected experiences using applications that connect to Apple
Health and built data collection into their exercise equipment.
More than 200 locations within every major city across the world
Many lines of business across 98 clubs & 200+ studios in total
Plus central supporting functions
Digital Products, CRM, Marketing, Creative, Development/Building, Finance, Member's Services, Maintenance, Personal training, Pilates, Spa, Group Fitness, Membership/Sales, Retail, Food Services
Digital products
End user applications
Connections to Apple Health
Connected equipment
Pursuit (gamified cycling experience)
Cardio
Digital assessment
Location tracking
Connected tech
Data lake architecture
[Diagram labels]
Equinox apps: PT App, Pursuit, Engage
Third-party apps: Exact Target, Adobe Social, MOSO, Fitness Agg.
Data & analytics apps: Informatica, Maximilian, Amazon EMR, Amazon Redshift
The assembled pipeline
[Diagram labels: Adobe Analytics, Amazon EMR, Athena, S3, Glue Data Catalog, Redshift Spectrum, S3]
Results
Re-platformed and productionalized 2 apps in 4 months
Finished re-platform in under a year
Dependability: very few operational issues
Faster time-to-benefit via automated regression
Huge cost savings over Teradata
Reduced time-to-benefit and increased end-user productivity
We need to
rethink what we
mean by data and
analytics
This is data
Skip the trip.
one-hour delivery
Exclusively for Amazon Prime Members
Data can be used to
connect more deeply
with your customer base
Reporting,
analysis, modeling,
and planning are
not going away
Why data lakes?
Data Lakes provide:
Relational and non-relational data
Scale-out to EBs
Diverse set of analytics and machine learning tools
Work on data without any data movement
Designed for low cost storage and analytics
[Diagram labels]
Sources: OLTP, ERP, CRM, LOB, Devices, Web, Sensors, Social
Stores: Data Warehouse, Data Lake, Catalog
Consumers: Business Intelligence, Machine Learning, DW Queries, Big data processing, Interactive, Real-time
Typical steps of building a data lake
1. Set up storage
2. Move data
3. Cleanse, prep, and catalog data
4. Configure and enforce security and compliance policies
5. Make data available for analytics
Amazon S3 is the base
Data Lake Storage
Secure, highly scalable, durable object storage
with millisecond latency for data access
Store any type of data, from web sites, mobile apps, corporate applications, and IoT sensors, at any scale
Store data in the format you want: unstructured (logs, dump files), semi-structured (JSON, XML), structured (CSV, Parquet)
Storage lifecycle integration: Amazon S3-Standard, Amazon S3-Infrequent Access, Amazon Glacier
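Storage lifecycle integration means objects can move from S3 Standard to Infrequent Access and then to Glacier as they age. The sketch below builds a rule in the dictionary shape that S3 lifecycle configuration uses, without calling AWS. The helper name, the `raw/` prefix, and the 30-day and 90-day thresholds are hypothetical choices for illustration.

```python
from typing import Any, Dict


def lifecycle_rule(prefix: str, ia_after_days: int, glacier_after_days: int) -> Dict[str, Any]:
    """Build one rule in the shape S3 lifecycle configuration expects."""
    if glacier_after_days <= ia_after_days:
        raise ValueError("Glacier transition must come after Infrequent Access")
    return {
        "ID": f"tier-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
    }


# Hypothetical policy: raw data cools after 30 days and is archived after 90.
config = {"Rules": [lifecycle_rule("raw/", 30, 90)]}
print(config["Rules"][0]["ID"])  # tier-raw
```

With boto3 available, a configuration like this could be applied through `put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=config)` on an S3 client; the bucket name is left out here because the deck does not name one.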
AWS Lake Formation
Build, secure, and manage a data lake in days
Build a data lake in days, not months: Build and deploy a fully managed data lake with a few clicks
Enforce security policies across multiple services: Centrally define security, governance, and auditing policies in one place and enforce those policies for all users and all applications
Combine different analytics approaches: Empower analyst and data scientist productivity, giving them self-service discovery and safe access to all data from a single catalog
How it works
Data Lakes and analytics on AWS
[Diagram labels: sources (OLTP, ERP, CRM, LOB, Devices, Web, Sensors, Social), Kinesis, S3, IAM, KMS, Data Catalog, and analytics tools (Athena, Amazon Redshift, AI Services, Amazon EMR, Amazon QuickSight)]
Build Data Lakes quickly
• Identify, crawl, and catalog sources
• Ingest and clean data
• Transform into optimal formats
Simplify security management
• Enforce encryption
• Define access policies
• Implement audit logging
Enable self-service and combined analytics
• Analysts discover all data available for analysis from a single data catalog
• Use multiple analytics tools over the same data
Storing is not enough, data needs to be discoverable
“Dark data are the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).”
Gartner IT Glossary, 2018
https://www.gartner.com/it-glossary/dark-data
[Diagram labels: CRM, ERP, Data warehouse, Mainframe data, Web, Social, Log files, Machine data, Semi-structured, Unstructured]
Use AWS Glue to cleanse, prep, and catalog
AWS Glue Data Catalog - a single view
across your data lake
Automatically discovers data and stores schema
Makes data searchable, and available for ETL
Contains table definitions and custom metadata
Use AWS Glue ETL jobs to cleanse,
transform, and store processed data
Serverless Apache Spark environment
Use Glue ETL libraries or bring your own code
Write code in Python or Scala
Call any AWS API using the AWS boto3 SDK
[Diagram labels: Amazon S3 (Raw data), Amazon S3 (Staging data), Amazon S3 (Processed data), each with Crawlers feeding the AWS Glue Data Catalog]
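The cleanse step that turns raw data into processed data can be sketched in plain Python. AWS Glue ETL jobs run on Apache Spark, so the code below is only a stand-in that shows the kind of logic involved (trimming fields, dropping incomplete or malformed rows, removing duplicates); the column names and sample rows are hypothetical.

```python
import csv
import io
from typing import Dict, List

RAW_CSV = """member_id,club,visits
 101 ,NYC,12
102,,7
101,NYC,12
103,LA,not-a-number
"""


def cleanse(raw_text: str) -> List[Dict[str, object]]:
    """Trim fields, drop incomplete or malformed rows, and de-duplicate."""
    seen = set()
    clean: List[Dict[str, object]] = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        row = {k: (v or "").strip() for k, v in row.items()}
        if not all(row.values()) or not row["visits"].isdigit():
            continue  # missing value or non-numeric count
        key = (row["member_id"], row["club"])
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        clean.append(
            {"member_id": row["member_id"], "club": row["club"], "visits": int(row["visits"])}
        )
    return clean


print(cleanse(RAW_CSV))  # [{'member_id': '101', 'club': 'NYC', 'visits': 12}]
```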
Fortnite | 125+ million players
Challenge
Need to create constant feedback loop for designers
Gain up-to-the-minute understanding of gamer satisfaction to guarantee gamers are engaged, thus resulting in the most popular game played in the world
Epic Games uses Data Lakes and analytics
Entire analytics platform running on AWS
S3 leveraged as a Data Lake
All telemetry data is collected with Kinesis
Real-time analytics done through Spark on EMR, DynamoDB to
create scoreboards and real-time queries
Use Amazon EMR for large batch data processing
Game designers use data to inform their decisions
[Architecture diagram labels]
Sources: Game clients, Game servers, Launcher, Game services
Ingestion: Kinesis
Near-real-time pipelines: Spark on EMR, DynamoDB, Grafana, Scoreboards API, Limited Raw Data (real-time ad-hoc SQL), User ETL (metric definition)
Batch pipelines: S3 (Data Lake), ETL using EMR, Tableau/BI, Ad-hoc SQL
Other inputs: APIs, Databases, S3, Other sources
Data has power
AWS databases and analytics – There’s a lot more!
Broad and deep portfolio, built for builders
Analytics: Amazon Redshift (data warehousing), Amazon EMR (Hadoop + Spark), Athena (interactive analytics), Kinesis Analytics (real-time), Amazon Elasticsearch Service (operational analytics)
Databases: RDS (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server), Aurora (MySQL, PostgreSQL), DynamoDB (key-value, document), ElastiCache (Redis, Memcached), Neptune (graph), Timestream (time series), QLDB (ledger database), RDS on VMware
Business Intelligence & Machine Learning: Amazon QuickSight, Amazon SageMaker, Amazon Comprehend, Amazon Rekognition, Amazon Lex, Amazon Transcribe, AWS DeepLens
Data Lake: S3/Amazon Glacier, AWS Glue (ETL & Data Catalog), Lake Formation (data lakes)
Blockchain: Managed Blockchain, Blockchain Templates
Data Movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Data Pipeline | Direct Connect
AWS Marketplace: 730+ Database solutions, 600+ Analytics solutions, 25+ Blockchain solutions, 20+ Data lake solutions, 250+ solutions, 30+ solutions
Most startup database & analytics cloud customers
Most enterprise database & analytics cloud customers
Thank you!
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Databases - Choosing the right Database on AWS

  • 1.
  • 2. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Data on AWS: How To Choose The Right Database and Data Storage Simon Lee, Business Development Manager simonhl@amazon.com
  • 3. Data is a strategic asset for every organization. The world’s most valuable resource is *Copyright: The Economist, 2017, David Parkins
  • 4. The move toward data-centric companies. Five largest companies by market cap*: 2001 2006 2011 2016 2018 $1.091T $406B $446B $406B $582B $976B $365B $383B $556B $383B $877B $272B $327B $277B $452B $839B $261B $293B $237B $364B $523B $260B $273B $228B $228B
  • 5. What is a data-centric company? What do we sell? How do we make money?
  • 6. Thinking about data as an asset, not a cost: stop throwing data away; make it available to more users; arm users with more data processing technologies.
  • 7. There is more data than people think. Data grows >10x every 5 years. Data platforms need to live for 15 years and scale 1,000x.
  • 8. There are more ways to analyze data than ever before. Years ago these didn’t exist: Hadoop (11), Elasticsearch (8), Presto (5), Spark (4).
  • 9. There are more people working with data than ever before. Democratization of data versus governance & control: how do I provide democratized access to data to enable informed decisions while at the same time enforcing data governance and preventing mismanagement of the data?
  • 10. How do we build new types of applications that can leverage this data?
  • 11. Modern apps (social media, ride hailing, media streaming, dating) create new requirements. Users: 1 million+. Data volume: TB–PB–EB. Locality: global. Performance: milliseconds–microseconds. Request rate: millions. Access: web, mobile, IoT, devices. Scale: up-down, out-in. Economics: pay for what you use. Developer access: no assembly required.
  • 12. As application requirements change (social media, ride hailing, media streaming, dating), data processing engines need to evolve as well. On Prime Day, DynamoDB requests from Alexa, the Amazon.com sites, and the Amazon fulfillment centers totaled 3.34 trillion, peaking at 12.9 million per second. Databases need to provide reliable performance under highly variable demand and deliver consistent, single-digit millisecond response times at any scale.
  • 13. Common data categories and use cases. Relational: referential integrity, ACID transactions, schema-on-write (lift and shift, ERP, CRM, finance). Key-value: high throughput, low-latency reads and writes, endless scale (real-time bidding, shopping cart, social, product catalog, customer preferences). Document: store documents and quickly access them by querying on any attribute (content management, personalization, mobile). In-memory: query by key with microsecond latency (leaderboards, real-time analytics, caching). Graph: quickly and easily create and navigate relationships between data (fraud detection, social networking, recommendation engine). Time-series: collect, store, and process data sequenced by time (IoT applications, event tracking). Ledger: complete, immutable, and verifiable history of all changes to application data (systems of record, supply chain, health care, registrations, financial).
  • 14. AWS purpose-built databases. Relational: Amazon RDS (Commercial, Community), Aurora. Key-value: DynamoDB. Document: DocumentDB. In-memory: ElastiCache. Graph: Neptune. Time-series: Timestream. Ledger: QLDB.
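The category-to-service pairing on slides 13 and 14 can be sketched as a small lookup. This is only an illustration: the dictionary restates the pairing from the slides, and the function name `suggest_services` is hypothetical, not part of any AWS SDK.

```python
from typing import Dict, List

# Data category -> AWS services, as paired on the slides.
PURPOSE_BUILT: Dict[str, List[str]] = {
    "relational": ["Amazon RDS", "Amazon Aurora"],
    "key-value": ["Amazon DynamoDB"],
    "document": ["Amazon DocumentDB"],
    "in-memory": ["Amazon ElastiCache"],
    "graph": ["Amazon Neptune"],
    "time-series": ["Amazon Timestream"],
    "ledger": ["Amazon QLDB"],
}


def suggest_services(category: str) -> List[str]:
    """Return candidate services for a data category (hypothetical helper)."""
    # Normalize spelling variants such as "Key value" or "time_series".
    key = category.strip().lower().replace(" ", "-").replace("_", "-")
    if key not in PURPOSE_BUILT:
        raise ValueError(f"unknown data category: {category!r}")
    return list(PURPOSE_BUILT[key])
```

For example, `suggest_services("Key value")` returns the key-value entry, and an unrecognized category raises `ValueError` rather than guessing.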
  • 16. Amazon Aurora: MySQL and PostgreSQL-compatible relational database built for the cloud. Performance and availability of commercial-grade databases at 1/10th the cost. Performance and scalability: 5x throughput of standard MySQL and 3x of standard PostgreSQL; scale-out up to 15 read replicas. Availability and durability: fault-tolerant, self-healing storage; six copies of data across three Availability Zones; continuous backup to Amazon S3. Highly secure: network isolation, encryption at rest/transit. Fully managed: managed by RDS, with no hardware provisioning, software patching, setup, configuration, or backups.
  • 17. Amazon Relational Database Service (RDS): managed relational database service with a choice of six popular database engines. Easy to administer: no need for infrastructure provisioning, installing, and maintaining DB software. Available and durable: automatic Multi-AZ data replication; automated backup, snapshots, failover. Highly scalable: scale database compute and storage with a few clicks with no app downtime. Fast and secure: SSD storage and guaranteed provisioned I/O; data encryption at rest and in transit.
  • 18. Amazon DynamoDB: fast and flexible key-value database service for any scale. Comprehensive security: encrypts all data by default and fully integrates with AWS Identity and Access Management for robust security. Performance at scale: consistent, single-digit millisecond response times at any scale; build applications with virtually unlimited throughput. Global database for global users and apps: build global applications with fast access to local data by easily replicating tables across multiple AWS Regions. Serverless: no server provisioning, software patching, or upgrades; scales up or down automatically; continuously backs up your data.
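To make the key-value model a little more concrete: DynamoDB's low-level API represents each attribute as a typed value such as `{"S": "text"}` or `{"N": "3"}`, with numbers sent as strings. The sketch below converts a plain Python record into that shape without calling AWS. The helper names `to_attribute_value` and `to_item` are hypothetical, and binary and set types are left out.

```python
from decimal import Decimal


def to_attribute_value(value):
    """Encode one Python value in DynamoDB's typed attribute format."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # must come before the int check
        return {"BOOL": value}
    if isinstance(value, (int, float, Decimal)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (list, tuple)):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {str(k): to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_item(record):
    """Encode a whole record (dict) as a DynamoDB item."""
    return {name: to_attribute_value(v) for name, v in record.items()}
```

A shopping-cart record such as `{"user_id": "u1", "qty": 3}` becomes `{"user_id": {"S": "u1"}, "qty": {"N": "3"}}`.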
  • 19. Amazon DocumentDB: fast, scalable, highly available, fully managed MongoDB-compatible database service. Fully managed: managed by AWS, with no hardware provisioning, software patching, setup, configuration, or backups. Fast: millions of requests per second, millisecond latency. MongoDB-compatible: compatible with MongoDB Community Edition 3.6; use the same drivers and tools. Reliable: six replicas of your data across three AZs with full backup and restore.
  • 20. Amazon ElastiCache: Redis and Memcached compatible, in-memory data store and cache. Secure and reliable: network isolation, encryption at rest/transit, HIPAA, PCI, FedRAMP, multi-AZ, and automatic failover. Redis & Memcached compatible: fully compatible with open source Redis and Memcached. Easily scalable: scale writes and reads with sharding and replicas. Extreme performance: in-memory data store and cache for microsecond response times.
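A common way to use an in-memory store as a cache is the cache-aside pattern: read from the cache first and fall back to the database on a miss. The sketch below uses a plain dictionary as a stand-in for Redis or Memcached. The class name, the `load_from_db` callback, and the TTL default are all assumptions for illustration.

```python
import time


class CacheAside:
    """Cache-aside reads with a per-entry time-to-live (illustrative only)."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db   # slow lookup, e.g. a database query
        self._ttl = ttl_seconds
        self._clock = clock         # injectable so expiry can be tested
        self._store = {}            # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]         # cache hit
        value = self._load(key)     # cache miss: go to the source of truth
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key after the underlying record changes."""
        self._store.pop(key, None)
```

The time-to-live bounds how stale a cached value can get; callers that update the database should also call `invalidate` for the affected key.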
  • 21. Amazon Neptune: fully managed graph database. Easy: build powerful queries easily with Gremlin and SPARQL. Fast: query billions of relationships with millisecond latency. Open: supports Apache TinkerPop & W3C RDF graph models. Reliable: six replicas of your data across three AZs with full backup and restore.
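The kind of relationship query a graph database answers, such as a friend-of-friend recommendation, can be sketched in plain Python over an adjacency map. This is not Gremlin or SPARQL and does not talk to Neptune. The function name and the sample data shape are hypothetical.

```python
from collections import Counter


def recommend(graph, user, limit=3):
    """Suggest people two hops away, ranked by number of mutual friends."""
    direct = graph.get(user, set())
    counts = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the user and anyone already connected to them.
            if candidate != user and candidate not in direct:
                counts[candidate] += 1
    # Most mutual friends first; ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:limit]]
```

In a graph database the same traversal is expressed as a query and runs against indexed edges rather than in application code.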
  • 22. Amazon Timestream (sign up for the preview): fast, scalable, fully managed time-series database. 1,000x faster and 1/10th the cost of relational databases: collect data at the rate of millions of inserts per second (10M/second). Trillions of daily events: adaptive query processing engine maintains steady, predictable performance. Time-series analytics: built-in functions for interpolation, smoothing, and approximation. Serverless: automated setup, configuration, server provisioning, software patching.
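Interpolation and smoothing, the two time-series operations named on the slide, can be illustrated in a few lines of Python. These are generic textbook versions (linear interpolation and a trailing moving average), not Timestream's built-in functions. The function names and the choice to clamp outside the sampled range are assumptions.

```python
def interpolate(points, t):
    """Linearly interpolate a value at time t.

    points: list of (timestamp, value) pairs sorted by timestamp.
    Times outside the sampled range are clamped to the nearest sample.
    """
    if not points:
        raise ValueError("need at least one point")
    if t <= points[0][0]:
        return points[0][1]
    if t >= points[-1][0]:
        return points[-1][1]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                return v1
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def moving_average(values, window):
    """Smooth a series with a trailing window (shorter at the start)."""
    if window <= 0:
        raise ValueError("window must be positive")
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / min(i + 1, window))
    return out
```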
  • 23. Amazon Quantum Ledger Database (QLDB): fully managed ledger database. Track and verify the history of all changes made to your application’s data. Immutable: maintains a sequenced record of all changes to your data, which cannot be deleted or modified; you can query and analyze the full history. Cryptographically verifiable: uses cryptography to generate a secure output file of your data’s history. Easy to use: lets you use familiar database capabilities like SQL APIs for querying the data. Highly scalable: executes 2–3X as many transactions as ledgers in common blockchain frameworks.
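The idea behind an immutable, cryptographically verifiable history can be sketched with a simple hash chain, where each entry's hash covers the previous entry's hash. This is a deliberately simplified illustration and not QLDB's actual journal format or API. The `Ledger` class and its methods are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash before the first entry


def _digest(prev_hash, payload):
    # Canonical JSON so the same payload always hashes the same way.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class Ledger:
    """Append-only list of changes chained together by SHA-256 hashes."""

    def __init__(self):
        self._entries = []  # list of (payload, hash)

    def append(self, payload):
        prev = self._entries[-1][1] if self._entries else GENESIS
        entry_hash = _digest(prev, payload)
        self._entries.append((dict(payload), entry_hash))
        return entry_hash

    def verify(self):
        """Recompute the chain; any edited entry breaks every later hash."""
        prev = GENESIS
        for payload, entry_hash in self._entries:
            if _digest(prev, payload) != entry_hash:
                return False
            prev = entry_hash
        return True
```

Because each hash depends on everything before it, changing an old record without detection would require recomputing every later hash, which a verifier holding the latest hash would notice.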
  • 24. AWS Database Migration Service (AWS DMS): migrating databases to AWS. Migrate between on-premises and AWS. Migrate between databases. Automated schema conversion. Data replication for zero-downtime migration. 100,000+ databases migrated.
  • 26. Customers are moving to AWS Databases. Verizon is migrating over 1,000 business-critical applications and database backend systems to AWS, several of which also include the migration of production databases to Amazon Aurora. Wappa migrated from their Oracle database to Amazon Aurora and improved their reporting time per user by 75 percent. Trimble migrated their Oracle databases to Amazon RDS and project they will pay about 1/4th of what they paid when managing their private infrastructure. Intuit migrated from Microsoft SQL Server to Amazon Redshift to reduce data-processing timelines and get insights to decision makers faster and more frequently. Equinox Fitness migrated its Teradata on-premises data warehouse to Amazon Redshift. They went from static reports to a modern data lake that delivers dynamic reports. By December 2018, Amazon.com had migrated 88% of their Oracle DBs (and 97% of critical system DBs) to Amazon Aurora and Amazon DynamoDB. They also migrated their 50 PB Oracle Data Warehouse to AWS (Amazon S3, Amazon Redshift, and Amazon EMR). Samsung Electronics migrated their Cassandra clusters to Amazon DynamoDB for their Samsung Cloud workload with 70% cost savings.
  • 27. Equinox Fitness Clubs is a company with integrated luxury and lifestyle offerings centered on movement, nutrition and regeneration. Equinox built connected experiences using applications that connect to Apple Health and built data collection in their exercise equipment. More than 200 locations within every major city across the world.
  • 28. Many lines of business across 98 clubs & 200+ studios in total, plus central supporting functions: Digital Products, CRM, Marketing, Creative, Development/Building, Finance, Member’s Services, Maintenance, Personal training, Pilates, Spa, Group Fitness, Membership/Sales, Retail, Food Services.
  • 29. Digital products: End user applications Connections to Apple Health Connected equipment Pursuit (gamified cycling experience) Cardio Digital assessment Location tracking Connected tech
  • 30. Data lake architecture: Data & analytics apps Equinox apps Third-party apps Informatica Maximilian Amazon EMR PT App Pursuit Engage Exact Target Adobe Social MOSO Fitness Agg. Amazon Redshift
  • 31. The assembled pipeline: Adobe Analytics, Amazon EMR, Athena, S3, Glue Data Catalog, Redshift Spectrum, S3.
  • 32. Results: reduced time-to-benefit and increased end-user productivity. Re-platformed and productionalized 2 apps in 4 months. Finished re-platform in under a year. Dependability: very few operational issues. Faster time-to-benefit via automated regression. Huge cost savings over Teradata.
  • 34. We need to rethink what we mean by data and analytics
  • 35. This is data
  • 36. This is data
  • 37. This is data: Skip the trip. One-hour delivery. Exclusively for Amazon Prime Members.
  • 38. Data can be used to connect more deeply with your customer base
  • 39. Reporting, analysis, modeling, and planning are not going away
  • 40. Why data lakes? Data lakes provide: relational and non-relational data; scale-out to EBs; a diverse set of analytics and machine learning tools; work on data without any data movement; a design for low-cost storage and analytics. Diagram labels: OLTP, ERP, CRM, LOB; Data Warehouse; Business Intelligence; Data Lake; Devices, Web, Sensors, Social; Catalog; Machine Learning; DW Queries; Big data processing; Interactive; Real-time.
  • 41. Typical steps of building a data lake: (1) set up storage; (2) move data; (3) cleanse, prep, and catalog data; (4) configure and enforce security and compliance policies; (5) make data available for analytics.
  • 42. Amazon S3 is the base: Data Lake Storage. Secure, highly scalable, durable object storage with millisecond latency for data access. Store any type of data (web sites, mobile apps, corporate applications, and IoT sensors) at any scale. Store data in the format you want: unstructured (logs, dump files) | semi-structured (JSON, XML) | structured (CSV, Parquet). Storage lifecycle integration: Amazon S3-Standard | Amazon S3-Infrequent Access | Amazon Glacier.
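The storage lifecycle tiering mentioned on slide 42 (Standard, then Infrequent Access, then Glacier) is usually expressed as an S3 lifecycle rule. The sketch below only builds the rule as a Python dictionary in the shape S3 lifecycle configurations use; it does not call AWS. The helper name, the `raw/` prefix, and the 30-day and 90-day thresholds are assumptions chosen for illustration.

```python
def lifecycle_rule(prefix, ia_after_days=30, glacier_after_days=90):
    """Build one lifecycle rule that tiers objects under a prefix."""
    if not 0 < ia_after_days < glacier_after_days:
        raise ValueError("expected 0 < ia_after_days < glacier_after_days")
    return {
        "ID": f"tier-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            # Move to Infrequent Access first, then to Glacier.
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
    }


# A full configuration is a list of rules under the "Rules" key.
lifecycle_configuration = {"Rules": [lifecycle_rule("raw/")]}
```

A dictionary of this shape is what one would pass as the lifecycle configuration when applying it to a bucket with an SDK; the thresholds should be tuned to how often the data is actually read.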
  • 43. AWS Lake Formation: build, secure, and manage a data lake in days. Build a data lake in days, not months: build and deploy a fully managed data lake with a few clicks. Enforce security policies across multiple services: centrally define security, governance, and auditing policies in one place and enforce those policies for all users and all applications. Combine different analytics approaches: empower analyst and data scientist productivity, giving them self-service discovery and safe access to all data from a single catalog.
  • 44. How it works: data lakes and analytics on AWS. Build data lakes quickly: identify, crawl, and catalog sources; ingest and clean data; transform into optimal formats. Simplify security management: enforce encryption; define access policies; implement audit logging. Enable self-service and combined analytics: analysts discover all data available for analysis from a single data catalog; use multiple analytics tools over the same data. Diagram labels: S3, IAM, KMS; OLTP, ERP, CRM, LOB, Devices, Web, Sensors, Social; Kinesis; Athena, Amazon Redshift, AI Services, Amazon EMR, Amazon QuickSight; Data Catalog.
  • 45. Storing is not enough; data needs to be discoverable. “Dark data are the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).” Gartner IT Glossary, 2018 https://www.gartner.com/it-glossary/dark-data. Diagram labels: CRM, ERP, Data warehouse, Mainframe data, Web, Social, Log files, Machine data, Semi-structured, Unstructured.
  • 46. Use AWS Glue to cleanse, prep, and catalog. AWS Glue Data Catalog, a single view across your data lake: automatically discovers data and stores schema; makes data searchable and available for ETL; contains table definitions and custom metadata. Use AWS Glue ETL jobs to cleanse, transform, and store processed data: serverless Apache Spark environment; use Glue ETL libraries or bring your own code; write code in Python or Scala; call any AWS API using the AWS boto3 SDK. Diagram labels: Amazon S3 (Raw data), Amazon S3 (Staging data), Amazon S3 (Processed data), AWS Glue Data Catalog, Crawlers.
  • 47. Fortnite | 125+ million players. Challenge: need to create a constant feedback loop for designers. Gain up-to-the-minute understanding of gamer satisfaction to guarantee gamers are engaged, thus resulting in the most popular game played in the world.
  • 48. Epic Games uses data lakes and analytics. Entire analytics platform running on AWS. S3 leveraged as a data lake. All telemetry data is collected with Kinesis. Real-time analytics done through Spark on EMR, with DynamoDB to create scoreboards and real-time queries. Use Amazon EMR for large batch data processing. Game designers use data to inform their decisions. Diagram labels: Game clients, Game servers, Launcher, Game services, Near real-time pipelines, Batch pipelines, Grafana, Scoreboards API, Limited Raw Data (real time ad-hoc SQL), User ETL (metric definition), Spark on EMR, DynamoDB, ETL using EMR, Tableau/BI, Ad-hoc SQL, S3 (Data Lake), Kinesis, APIs, Databases, S3, Other sources.
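The near real-time path on slide 48, telemetry events rolled up into scoreboards, can be sketched as a simple aggregation. This is a toy in-process version of the idea, not Epic Games' pipeline. The event fields (`type`, `player`) and the `"elimination"` event name are hypothetical.

```python
from collections import defaultdict


def build_scoreboard(events, top_n=3):
    """Count scoring events per player and return the top entries."""
    totals = defaultdict(int)
    for event in events:
        # Ignore telemetry that does not affect the score.
        if event.get("type") == "elimination":
            totals[event["player"]] += 1
    # Highest score first; ties broken by player name for stable output.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]
```

In a streaming setup the same counting would run continuously over incoming events, with the running totals written to a low-latency store that the scoreboard API reads from.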
  • 49. Data has power
  • 50. AWS databases and analytics – There’s a lot more! Broad and deep portfolio, built for builders. Analytics: Amazon Redshift (data warehousing), Amazon EMR (Hadoop + Spark), Athena (interactive analytics), Kinesis Analytics (real-time), Amazon Elasticsearch Service (operational analytics). Databases: RDS (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server), Aurora (MySQL, PostgreSQL), DynamoDB (key value, document), ElastiCache (Redis, Memcached), Neptune (graph), Timestream (time series), QLDB (ledger database), RDS on VMware. Business Intelligence & Machine Learning: Amazon QuickSight, Amazon SageMaker, Amazon Comprehend, Amazon Rekognition, Amazon Lex, Amazon Transcribe, AWS DeepLens. Data Lake: S3/Amazon Glacier, AWS Glue (ETL & Data Catalog), Lake Formation (data lakes). Data Movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Data Pipeline | Direct Connect. Blockchain: Managed Blockchain, Blockchain Templates. AWS Marketplace: 250+ solutions, 730+ Database solutions, 600+ Analytics solutions, 25+ Blockchain solutions, 20+ Data lake solutions, 30+ solutions.
  • 51. Most startup database & analytics cloud customers
  • 52. Most enterprise database & analytics cloud customers
  • 53. Thank you!