6. Generated data
Available for analysis
Data volume
Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
14. Bankinter uses HPC on AWS for Monte Carlo Simulation
“Bankinter uses AWS as an integral part of our credit-risk simulation application; we need to perform at least 5,000,000 simulations to get realistic results.”
Credit
Data
Average simulation time went from 23 hours to 20 minutes
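As a rough illustration of the kind of Monte Carlo credit-risk simulation this slide refers to, the sketch below draws independent defaults for a small loan portfolio and reads a loss quantile off the simulated distribution. Everything in it is an assumption made for illustration, not Bankinter's actual model: the function names, the independent-default assumption, the fixed loss-given-default (`lgd`), and the sample portfolio are all hypothetical.

```python
import random


def simulate_portfolio_losses(exposures, default_probs, lgd=0.6, n_sims=10_000, seed=0):
    """Simulate total portfolio loss n_sims times.

    Assumes (for illustration only) that loans default independently and
    that a fixed fraction `lgd` of each exposure is lost on default.
    """
    rng = random.Random(seed)
    losses = []
    for _ in range(n_sims):
        loss = 0.0
        for exposure, prob in zip(exposures, default_probs):
            # The loan defaults in this scenario with probability `prob`.
            if rng.random() < prob:
                loss += exposure * lgd
        losses.append(loss)
    return losses


def loss_quantile(losses, q):
    """Return the simulated loss at quantile q (for example, 0.99)."""
    ordered = sorted(losses)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


if __name__ == "__main__":
    # Hypothetical three-loan portfolio.
    sims = simulate_portfolio_losses(
        exposures=[1_000_000, 500_000, 250_000],
        default_probs=[0.02, 0.05, 0.10],
        n_sims=50_000,
    )
    print("99% loss quantile:", loss_quantile(sims, 0.99))
```

Each scenario is independent of the others, which is why a workload like this spreads well across many machines: the simulation count can be split into batches and the resulting loss lists merged afterwards.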
15. Challenge:
Learn about customers based on what they do, rather than what they say (i.e., data exhaust); virtually unlimited data.
Solution:
An always-on cluster continually processes new financial data and stores the results in S3. Collaborative filtering is used to provide recommendations, and ad-hoc queries are performed using Hive.
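The collaborative filtering mentioned above can be sketched in a few lines. This is a minimal in-memory, user-based variant using cosine similarity, written under stated assumptions: the slide does not say which algorithm or data layout was used, and the names `cosine_similarity`, `recommend`, and the sample `interactions` data are hypothetical. A production job on a cluster would distribute the same idea rather than loop in one process.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, interactions, top_n=3):
    """Recommend items the user has not seen, weighted by similar users.

    `interactions` maps user -> {item: weight}, for example click counts.
    """
    seen = interactions[user]
    scores = {}
    for other, items in interactions.items():
        if other == user:
            continue
        sim = cosine_similarity(seen, items)
        if sim <= 0:
            continue
        for item, weight in items.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sim * weight
    # Highest score first; ties broken by item name for stable output.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


if __name__ == "__main__":
    # Hypothetical behavioural data: which items each user interacted with.
    interactions = {
        "u1": {"A": 1, "B": 1},
        "u2": {"A": 1, "B": 1, "C": 1},
        "u3": {"D": 1},
    }
    print(recommend("u1", interactions))
```

The input here is behaviour (what users did), not stated preferences, which matches the slide's point about learning from data exhaust.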
17. S&P Capital IQ
Microsoft SQL Server
Amazon S3:
• Companies You May Be Interested In
Amazon S3:
• Clicks
• Key Developments
• Company Profiles
Amazon Elastic MapReduce:
• Compute User Selectivity
• Compute Key Developments
• Join & Score
18.
19. Challenge:
Volatile weather is deadly to crops like grapes and tomatoes.
Solution:
Built a predictive model based on freely available data: 60 years of crop data, 14 TB of soil data, and one million government Doppler radar points. Fifty Hadoop clusters process new data as it comes into S3 each day, continuously updating the model.
150B Soil Observations
3M Daily Weather Measurements
850K Precision Rainfall Grids Tracked
25. More than 25 Million Streaming Members
50 Billion Events Per Day
30 Million plays every day
2 billion hours of video in 3 months
4 million ratings per day
3 million searches
Device location, time, day, week, etc.
Social data
39. Identified early mobile usage
Invested heavily in mobile development
Finding signal in the noise of logs
9,432,061 unique mobile devices used the Yelp mobile app.
44. Challenge: To run a virtual screen with a higher-accuracy algorithm and 21 million compounds
45.
46. Metric: Count
Compute Hours of Work: 109,927 hours
Compute Days of Work: 4,580 days
Compute Years of Work: 12.55 years
Ligand Count: ~21 million ligands
Using Cycle Computing and Amazon Web Services
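The day and year figures on this slide follow directly from the compute-hours figure. The short sketch below reproduces the conversion; the function name `compute_time_summary` is hypothetical, and a 24-hour day and 365-day year are assumed.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365


def compute_time_summary(compute_hours):
    """Convert total compute hours into equivalent days and years."""
    days = compute_hours / HOURS_PER_DAY
    years = days / DAYS_PER_YEAR
    return days, years


if __name__ == "__main__":
    days, years = compute_time_summary(109_927)
    # 109,927 hours is about 4,580 days, or about 12.55 years of compute.
    print(f"{days:,.0f} days, {years:.2f} years")
```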
56. Thank you! aws.amazon.com/big-data
May 14th, Kowloonbay International Trade & Exhibition Centre (KITEC), Hong Kong
One-day free training
Walkthrough of services
http://aws.amazon.com/apac/awsday/hk/