1. Introducing Amazon Redshift
http://aws.amazon.com/resources/databaseservices/webinars
David Pearson
Business Development Manager
2. What is AWS?
Deployment & Administration
Application Services
Compute Storage Database
Networking
AWS Global Infrastructure
3. AWS Database Services
Scalable High Performance Application Storage in the Cloud
• Amazon Redshift: Fast, Powerful, Fully Managed, Petabyte-Scale Data Warehouse Service
• Amazon DynamoDB: Fast, Predictable, Highly-Scalable NoSQL Data Store
• Amazon RDS: Managed Relational Database Service for MySQL, Oracle and SQL Server
• Amazon ElastiCache: In-Memory Caching Service
4. Why Data Warehousing?
Easy to provision and scale up massively
No upfront costs, pay as you go
Really fast performance at a really low price
Open and flexible with support for popular tools
5. Amazon Redshift
A fast, fully managed, petabyte-scale data warehouse service
6. Objectives
Design and build a petabyte-scale data warehouse service: Amazon Redshift
• A Lot Faster
• A Lot Cheaper
• A Whole Lot Simpler
7. Redshift Dramatically Reduces I/O
• Direct-attached storage
• Large data block sizes
• Columnar storage
• Data compression
• Zone maps
Example table, shown on the slide in row storage and in column storage:
Id   Age  State
123  20   CA
345  25   WA
678  40   FL
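The difference between row and column storage, and the way zone maps let a scan skip blocks, can be sketched with the three sample rows from the slide. This is only an illustration: the function names and the one-value block size are hypothetical, and a real Redshift block holds far more data than this.

```python
# Sample rows from the slide: (Id, Age, State)
rows = [(123, 20, "CA"), (345, 25, "WA"), (678, 40, "FL")]

# Row storage keeps the fields of each record next to each other.
row_layout = [value for row in rows for value in row]

# Column storage keeps all values of one column next to each other.
ids, ages, states = (list(column) for column in zip(*rows))
column_layout = ids + ages + states


def build_zone_map(values, block_size):
    """Split a column into blocks and record (min, max, block) per block."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    return [(min(block), max(block), block) for block in blocks]


def scan_greater_than(zone_map, threshold):
    """Return matching values and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for _low, high, block in zone_map:
        if high <= threshold:
            continue  # no value in this block can match, so skip it
        blocks_read += 1
        matches.extend(value for value in block if value > threshold)
    return matches, blocks_read


# Hypothetical block size of one value, chosen only to make skipping visible.
age_zone_map = build_zone_map(ages, block_size=1)
matches, blocks_read = scan_greater_than(age_zone_map, 30)
print(row_layout)
print(column_layout)
print(matches, blocks_read)
```

A query such as "Age greater than 30" only has to touch the Age column in the column layout, and the zone map lets it skip every block whose maximum is 30 or less.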
8. Redshift Runs on Optimized Hardware
HS1.8XL: 128GB RAM, 16 Cores, 24 Spindles, 16TB Storage, 2GB/sec scan rate
HS1.XL: 16GB RAM, 2 Cores, 3 Spindles, 2TB Storage
• Optimized for I/O intensive workloads
• High disk density
• Runs in HPC - fast network
• HS1.8XL available on Amazon EC2
9. Redshift Runs on Optimized Hardware
HS1.8XL: 128GB RAM, 16 Cores, 24 Spindles, 16TB Storage, 2GB/sec scan rate
HS1.XL: 16GB RAM, 2 Cores, 3 Spindles, 2TB Storage
Start Small: 1 x XL = 2TB
Grow Big: 100 x 8XL = 1.6PB
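The two capacity figures follow directly from the per-node storage listed on the slide. A minimal sketch of the arithmetic, assuming decimal units (1 PB = 1000 TB) and a hypothetical helper name:

```python
TB_PER_PB = 1000  # assumption: decimal units, which matches the slide's 1.6PB

# Per-node storage in TB, taken from the hardware specs on the slide.
NODE_STORAGE_TB = {"HS1.XL": 2, "HS1.8XL": 16}


def cluster_capacity_tb(node_type, node_count):
    """Raw cluster storage in TB for a given node type and node count."""
    return NODE_STORAGE_TB[node_type] * node_count


small_tb = cluster_capacity_tb("HS1.XL", 1)
big_tb = cluster_capacity_tb("HS1.8XL", 100)
print(f"Start small: {small_tb} TB")
print(f"Grow big: {big_tb / TB_PER_PB} PB")
```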
11. Chart: data volume, showing the gap between data generated and data available for analysis
Sources:
Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
12. Redshift is Priced to Analyze All Your Data
$0.85 per hour for on-demand (2TB)
$999 per TB per year (3-yr reservation)
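To compare the two prices on the same basis, the on-demand rate can be converted to a cost per TB per year. The sketch below assumes the $0.85 node holds 2TB (as stated on the slide) and runs around the clock for a 365-day year; those usage assumptions are not from the slide.

```python
ON_DEMAND_PER_HOUR = 0.85      # USD per hour, from the slide
NODE_STORAGE_TB = 2            # TB per on-demand node, from the slide
RESERVED_PER_TB_YEAR = 999     # USD per TB per year, 3-year reservation
HOURS_PER_YEAR = 24 * 365      # assumption: node runs 24x7 for a 365-day year

on_demand_per_year = round(ON_DEMAND_PER_HOUR * HOURS_PER_YEAR, 2)
on_demand_per_tb_year = round(on_demand_per_year / NODE_STORAGE_TB, 2)
ratio = on_demand_per_tb_year / RESERVED_PER_TB_YEAR

print(f"On-demand node per year: ${on_demand_per_year:,.2f}")
print(f"On-demand per TB per year: ${on_demand_per_tb_year:,.2f}")
print(f"On-demand is about {ratio:.1f}x the reserved price per TB")
```

Under these assumptions the reserved price works out to roughly a quarter of the always-on on-demand price per TB.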
15. Redshift Simplifies Provisioning
• Create a cluster in minutes
• Automatically patch your OS and data warehouse software
• Scale up to 1.6PB with a few clicks and no downtime
27. Reporting Warehouse
Data flow: RDBMS (OLTP, ERP) → Redshift → Reporting and BI
• Accelerated operational reporting
• Support for short-time use cases
• Data compression, index redundancy
28. On-Premises Integration
Data flow: RDBMS (OLTP, ERP) → Data Integration Partners* → Redshift → Reporting and BI
* as of 3/14/2013
29. Live Archive for (Structured) Big Data
Data flow: OLTP Web Apps → DynamoDB → Redshift → Reporting and BI
• Direct integration with COPY command
• High velocity data ages into Redshift
• Low cost, high scale option for new apps
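The direct integration refers to Redshift's COPY command, which can read from a DynamoDB table through a 'dynamodb://' source and a READRATIO setting that caps the share of the table's provisioned read throughput the load may use. The sketch below only builds the SQL text and does not connect to anything. The helper name, both table names, and the credentials placeholder are hypothetical.

```python
import re

# Conservative pattern for table names used in this sketch (an assumption,
# not the full naming rules of either service).
_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_.\-]*$")


def build_dynamodb_copy(target_table, dynamodb_table, read_ratio=50):
    """Build a Redshift COPY statement that loads from a DynamoDB table."""
    for name in (target_table, dynamodb_table):
        if not _NAME.match(name):
            raise ValueError(f"unexpected table name: {name!r}")
    if not isinstance(read_ratio, int) or read_ratio <= 0:
        raise ValueError("read_ratio must be a positive integer")
    return (
        f"COPY {target_table} "
        f"FROM 'dynamodb://{dynamodb_table}' "
        "CREDENTIALS '<aws-credentials>' "  # placeholder, fill in before running
        f"READRATIO {read_ratio};"
    )


# Hypothetical table names, for illustration only.
statement = build_dynamodb_copy("web_events", "WebEvents", read_ratio=25)
print(statement)
```

Keeping READRATIO low leaves most of the DynamoDB table's read capacity for the live application while older data is copied into Redshift.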
30. Cloud ETL for Big Data
Data flow: S3 and Elastic MapReduce → Redshift → Reporting and BI
• Maintain online SQL access to historical logs
• Transformation and enrichment with EMR
• Longer history ensures better insight
31. Redshift
Fast
• “up to 50 times faster than our current OLAP solution”
• “exponential gains in performance”
Low Cost
• less than $1 / hour to get started
• less than $1K / TB to run Redshift for a year
Easy To Get Started
• Please visit: http://aws.amazon.com/redshift/