This presentation from the AWS Lab at Cloud Expo Europe 2014 explores large-scale data analysis on AWS. The cost of data generation is falling. Storing, analyzing and sharing data with the tools that AWS offers provides a low-cost, easy-to-use way to create value from your data assets.
8. DATA VOLUME
Generated data
Available for analysis
Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
30. AMAZON REDSHIFT LETS YOU START SMALL AND GROW BIG
Extra Large Node (HS1.XL)
Single Node (2 TB)
Cluster 2-32 Nodes (4 TB – 64 TB)
Eight Extra Large Node (HS1.8XL)
Cluster 2-100 Nodes (32 TB – 1.6 PB)
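The sizing figures on this slide follow from simple per-node arithmetic, which the sketch below works through. It is only an illustration: the 16 TB per HS1.8XL node is derived from the slide's own numbers (32 TB across 2 nodes), the function names are hypothetical, and the result is raw capacity that ignores compression, replication and any later changes to node types.

```python
import math

# Per-node storage in TB. 2 TB for HS1.XL is stated on the slide;
# 16 TB for HS1.8XL is derived from "2 nodes = 32 TB" (an assumption).
NODE_TB = {"HS1.XL": 2, "HS1.8XL": 16}

# Allowed node counts per cluster, as listed on the slide.
NODE_RANGE = {"HS1.XL": (1, 32), "HS1.8XL": (2, 100)}


def cluster_capacity_tb(node_type, nodes):
    """Raw capacity in TB for a cluster of `nodes` nodes of `node_type`."""
    lo, hi = NODE_RANGE[node_type]
    if not lo <= nodes <= hi:
        raise ValueError(f"{node_type} supports {lo}-{hi} nodes, got {nodes}")
    return NODE_TB[node_type] * nodes


def smallest_cluster(data_tb):
    """Pick the first (node_type, nodes) combination that holds `data_tb`.

    Tries the smaller node type first, mirroring "start small and grow big".
    """
    for node_type in ("HS1.XL", "HS1.8XL"):
        lo, hi = NODE_RANGE[node_type]
        nodes = max(lo, math.ceil(data_tb / NODE_TB[node_type]))
        if nodes <= hi:
            return node_type, nodes
    raise ValueError(f"{data_tb} TB exceeds the largest cluster on the slide")


if __name__ == "__main__":
    for tb in (1, 50, 100, 1600):
        print(tb, "TB ->", smallest_cluster(tb))
```

Starting with the smaller node type and only moving up when the node limit is reached is one possible policy; a real sizing decision would also weigh query performance and price.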
77. Real-time response to content in semi-structured data streams
Relatively simple computations on data (aggregates, filters, sliding window, etc.)
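To make "aggregates, filters, sliding window" concrete, here is a minimal sketch of a filter feeding a time-based sliding-window aggregate. It is plain Python rather than any AWS streaming API, and the record fields (`ts`, `status`, `latency_ms`) and the class and function names are hypothetical, chosen only for illustration.

```python
from collections import deque


class SlidingWindow:
    """Running count, sum and mean of values seen in the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0

    def add(self, timestamp, value):
        """Add one event, then drop everything older than the window."""
        self.events.append((timestamp, value))
        self.total += value
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_value = self.events.popleft()
            self.total -= old_value

    def stats(self):
        """Return (count, sum, mean) for the events currently in the window."""
        count = len(self.events)
        mean = self.total / count if count else 0.0
        return count, self.total, mean


def server_errors(records):
    """Filter step: keep only records whose HTTP status is 5xx."""
    for record in records:
        if record["status"] >= 500:
            yield record["ts"], record["latency_ms"]


if __name__ == "__main__":
    stream = [
        {"ts": 0, "status": 500, "latency_ms": 10},
        {"ts": 10, "status": 200, "latency_ms": 5},
        {"ts": 30, "status": 503, "latency_ms": 20},
        {"ts": 61, "status": 500, "latency_ms": 30},
    ]
    window = SlidingWindow(window_seconds=60)
    for ts, latency in server_errors(stream):
        window.add(ts, latency)
        print(ts, window.stats())
```

The sketch assumes events arrive in timestamp order; out-of-order streams would need buffering or watermarks, which are left out here.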
78. Hourly server logs: how your systems went wrong an hour ago
Real-time metrics: what just went wrong now
Weekly / Monthly Bill: what you spent this past billing cycle
Real-time spending alerts/caps: guaranteeing you can’t overspend
Daily customer report from your website: tells you what deal or ad to try next time
Real-time analysis: what to offer the current customer now
Daily fraud reports: tells you if there was fraud yesterday
Real-time detection: blocks fraudulent use now
Daily business reports: tells me how customers used AWS services yesterday
Fast ETL into Amazon Redshift: how are customers using services now