Learn more about the tools, techniques and technologies for working productively with data at any scale. This session will introduce the family of data analytics tools on AWS which you can use to collect, compute and collaborate around data, from gigabytes to petabytes. We'll discuss Amazon Elastic MapReduce, Hadoop, structured and unstructured data, and the EC2 instance types which enable high performance analytics.
8. Data volume
Generated data
Available for analysis
Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
9. Elastic and highly scalable
+ No upfront capital expense
+ Only pay for what you use
+ Available on-demand
= Remove constraints
68. Analysis of Data Can Transform Society
Enhance scientific understanding, drive innovation, and accelerate medical cures.
Create new business models and improve organizational processes.
Increase public safety and improve energy efficiency with smart grids.
69. Intel’s Vision to Democratize Big Data
Unlock Value in Silicon
Support Open Platforms
Deliver Software Value
70. Intel at the Intersection of Big Data
HPC: Enabling exascale computing on massive data sets
Cloud: Helping enterprises build open interoperable clouds
Open Source: Contributing code and fostering ecosystem
72. Scale-Out Big Data
Compute Platform Optimization
Cost-effective performance
• Intel® Advanced Vector Extensions Technology
• Intel® Turbo Boost Technology 2.0
• Intel® Advanced Encryption Standard New Instructions Technology
73. Intel® Advanced Vector Extensions Technology
• Newest in a long line of processor instruction innovations
• Increases floating point operations per clock, up to 2X¹ performance
¹ Performance comparison using Linpack benchmark. See backup for configuration details.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
For more legal information on performance forecasts go to http://www.intel.com/performance
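The "up to 2X" figure on this slide is consistent with register width: AVX vector registers are 256 bits wide, against 128 bits for SSE, so a single instruction can operate on twice as many floating-point values. The sketch below only works through that arithmetic. The function name `simd_lanes` is illustrative and not part of any Intel API, and real speedups depend on the workload.

```python
def simd_lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements that fit in one vector register."""
    if register_bits % element_bits != 0:
        raise ValueError("element width must divide register width")
    return register_bits // element_bits


SSE_BITS = 128  # SSE register width in bits
AVX_BITS = 256  # AVX register width in bits

for name, bits in (("float32", 32), ("float64", 64)):
    sse = simd_lanes(SSE_BITS, bits)
    avx = simd_lanes(AVX_BITS, bits)
    print(f"{name}: SSE {sse} lanes, AVX {avx} lanes, ratio {avx / sse:.0f}x")
```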
74. Intel® Turbo Boost Technology 2.0
More Performance
Higher turbo speeds maximize performance for single and multi-threaded applications
75. Intel® Advanced Encryption Standard New Instructions
• Processor assistance for performing AES encryption
  7 new instructions
• Makes enabled encryption software faster and stronger
76. The Power of Intel® Platform Solutions: Richer user experiences
TeraSort for 1 TB sort
Previous Intel® Xeon® Processor: 4 HRS
Intel® Xeon® Processor E5 2600: 50% Reduction
Solid-State Drive: 80% Reduction
10G Ethernet: 50% Reduction
Intel® Apache Hadoop: 40% Reduction
10 MIN
82. Data mobility
Generated and stored in AWS
Inbound data transfer is free
Multipart upload to S3
Physical media
AWS Direct Connect
Regional replication of AMIs and snapshots
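Multipart upload splits a large object into parts that can be sent independently and retried one at a time. The sketch below only plans the part boundaries. It relies on two documented S3 limits (parts of at least 5 MiB except the last, and at most 10,000 parts per upload) and does not check the 5 GiB maximum part size. The function name `plan_parts` and the 8 MiB default are assumptions made for illustration, and the actual upload calls, which an AWS SDK would normally handle, are left out.

```python
import math

MIN_PART = 5 * 1024 * 1024  # S3 minimum part size (every part except the last)
MAX_PARTS = 10_000          # S3 maximum number of parts per upload


def plan_parts(object_size: int, part_size: int = 8 * 1024 * 1024):
    """Return (part_number, offset, length) tuples that cover the object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if part_size < MIN_PART:
        raise ValueError("part_size is below the S3 minimum of 5 MiB")
    # Grow the part size if the object would otherwise need too many parts.
    part_size = max(part_size, math.ceil(object_size / MAX_PARTS))
    parts = []
    offset = 0
    number = 1  # S3 part numbers start at 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


plan = plan_parts(20 * 1024 * 1024 + 123)
for number, offset, length in plan:
    print(number, offset, length)
```

Each tuple maps to one byte range of the source file, so a failed part can be re-sent without restarting the whole transfer.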
83. “How do I integrate my data for maximum impact?”
84. S3, HBase on EMR, RDS, DynamoDB, EMR, Redshift, On-premises
90. AWS Data Pipeline
Data-intensive orchestration and automation
Reliable and scheduled
Easy to use, drag and drop
Execution and retry logic
Map data dependencies
Create and manage temporary compute resources
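The ideas of mapped data dependencies and retry logic can be illustrated generically. The sketch below is not the AWS Data Pipeline API: it is a plain-Python illustration, assuming Python 3.9+ for `graphlib`, in which tasks run in dependency order and a failing task is retried a fixed number of times. The task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies, max_attempts=3):
    """Run tasks in dependency order, retrying each failed task.

    tasks: dict mapping task name -> zero-argument callable
    dependencies: dict mapping task name -> set of task names it depends on
    Returns a dict mapping task name -> number of attempts used.
    """
    attempts_used = {}
    for name in TopologicalSorter(dependencies).static_order():
        for attempt in range(1, max_attempts + 1):
            try:
                tasks[name]()
                attempts_used[name] = attempt
                break
            except Exception:
                if attempt == max_attempts:
                    raise  # give up after the final attempt
    return attempts_used


log = []
flaky_calls = {"count": 0}


def copy_logs():
    log.append("copy_logs")


def transform():
    # Hypothetical flaky step: fails once, then succeeds.
    flaky_calls["count"] += 1
    if flaky_calls["count"] < 2:
        raise RuntimeError("transient failure")
    log.append("transform")


def load_report():
    log.append("load_report")


tasks = {"copy_logs": copy_logs, "transform": transform, "load_report": load_report}
dependencies = {"transform": {"copy_logs"}, "load_report": {"transform"}}

result = run_pipeline(tasks, dependencies)
print(log)
print(result)
```

A managed service adds scheduling and the provisioning of temporary compute resources on top of this ordering and retry behavior.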