Delivering Apache Hadoop for the Modern 
Data Architecture 
Page 1 © Hortonworks Inc. 2014 
HP & Hortonworks. We do Hadoop Together
Your speakers… 
Raghu Thiagarajan 
Director, Partner Product Management, 
Hortonworks 
Chris Daly 
Chief Outbound Engineer, CSS and Big Data Systems, 
HP
Why Hadoop: Traditional Data Architecture Pressured 
2.8 ZB in 2012 
85% from New Data Types 
15x Machine Data by 2020 
40 ZB by 2020 
Data source: IDC 
SOURCES: OLTP, ERP, CRM; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data
What: Business Applications of Hadoop 
Data types: Sensor, Server Logs, Text, Social, Geographic, Machine, Clickstream, Structured, Unstructured
Financial Services: New Account Risk Screens; Trading Risk; Insurance Underwriting
Telecom: Call Detail Records (CDR); Infrastructure Investment; Real-time Bandwidth Allocation
Retail: 360° View of the Customer; Localized, Personalized Promotions; Website Optimization
What: Business Applications of Hadoop 
Data types: Sensor, Server Logs, Text, Social, Geographic, Machine, Clickstream, Structured, Unstructured
Manufacturing: Supply Chain and Logistics; Preventive Maintenance; Crowd-sourced Quality Assurance
Healthcare: Use Genomic Data in Medical Trials; Monitor Patient Vitals in Real-Time
Pharmaceuticals: Recruit & Retain Patients for Drug Trials; Improve Prescription Adherence
Oil & Gas: Unify Exploration & Production Data; Monitor Rig Safety in Real-Time
Government: ETL Offload in Response to Budgetary Pressures; Sentiment Analysis for Gov’t Programs
How: Modern Data Architecture with Hadoop 
Diagram labels:
SOURCES: OLTP, ERP, CRM; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data
ENTERPRISE HADOOP: Governance & Integration; Security; Operations; Data Access; Data Management
DATA SYSTEMS: Repositories (RDBMS, EDW, MPP); ROOMS
APPLICATIONS: Statistical Analysis; BI / Reporting, Ad Hoc Analysis; Interactive Web & Mobile Apps; Enterprise Applications
DEV & DATA TOOLS: Build & Test
OPERATIONS TOOLS: Provision, Manage & Monitor
YARN Transforms Hadoop’s Architecture 
Enables deep insight across a large, broad, diverse set of data at efficient scale
Multi-Use Data Platform
Store all data in one place, process in many ways
Batch, Interactive, Iterative, Streaming
Store any/all raw data sources and processed data over extended periods of time.
YARN: Data Operating System
Designing Hadoop Cluster 
§ Cluster Storage Capacity 
§ Server Specification 
§ Cluster Size 
§ Factoring Performance 
Key Considerations 
§ Any piece of hardware can and will fail 
§ More nodes means less impact on failure 
§ Resiliency and fault tolerance improve with scale 
§ Build resiliency through scale 
§ Still use modern hardware 
§ Software handles hardware failures
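The claim that more nodes mean less impact per failure can be made concrete with simple arithmetic. The following is a minimal sketch, assuming data and work are spread evenly across worker nodes (an idealization the slide does not state); the function name is hypothetical.

```python
def capacity_lost_fraction(total_nodes: int, failed_nodes: int = 1) -> float:
    """Fraction of cluster capacity lost when `failed_nodes` fail.

    Assumes every node contributes an equal share of storage and compute,
    which is an idealization for illustration only.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if not 0 <= failed_nodes <= total_nodes:
        raise ValueError("failed_nodes must be between 0 and total_nodes")
    return failed_nodes / total_nodes


if __name__ == "__main__":
    # Compare the impact of a single node failure at different cluster sizes.
    for n in (10, 40, 100):
        print(f"{n} nodes: one failure removes {capacity_lost_fraction(n):.1%} of capacity")
```

Under this assumption, a single failure in a 10-node cluster removes a tenth of capacity, while the same failure in a 100-node cluster removes a hundredth.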
Storage Capacity 
§ Key Input 
§ Initial Data Size 
§ 3 year YOY growth 
§ Compression ratio 
§ Intermediate and materialized views 
§ Replication Factor 
§ Note 
§ Hard to accurately predict the size of intermediate & materialized views at the start of a project 
§ Be conservative with compression ratio. Mileage varies by data type 
§ Hadoop needs temp space to store intermediate files 
Diagram labels: Hadoop Cluster: Raw Data, Master Data, Work In Process Data, Materialized Views
Storage Capacity 
Total Storage Required = (Initial Size + YOY Growth + Intermediate Data Size) x Replication Count x 1.2 / Compression Ratio
Good Rule of Thumb 
§ Replication Count = 3 
§ Compression Ratio = 4-5 
§ Intermediate Data Size = 50%-100% of Raw Data Size 
Note 
§ The 1.2 factor is included in the sizing estimator to account for the temp space requirement of Hadoop
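The sizing formula on this slide can be sketched as a small function. This is an illustrative sketch, not the Hortonworks sizing estimator itself: the function and parameter names are hypothetical, the defaults are the slide's rule-of-thumb values, and "YOY Growth" is assumed here to mean the total volume added over the planning horizon.

```python
def total_storage_required(
    initial_size_tb: float,
    yoy_growth_tb: float,
    intermediate_size_tb: float,
    replication_count: int = 3,        # rule of thumb from the slide
    compression_ratio: float = 4.0,    # slide suggests 4-5; be conservative
    temp_space_factor: float = 1.2,    # temp space allowance from the slide
) -> float:
    """Estimate total cluster storage (TB) using the slide's formula."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    data_tb = initial_size_tb + yoy_growth_tb + intermediate_size_tb
    return data_tb * replication_count * temp_space_factor / compression_ratio


if __name__ == "__main__":
    # Hypothetical inputs: 100 TB initial, 50 TB growth over the horizon,
    # intermediate data taken as 50% of the raw total (the low end of the rule).
    raw_tb = 100 + 50
    estimate = total_storage_required(100, 50, 0.5 * raw_tb)
    print(f"Estimated total storage: {estimate:.1f} TB")
```

With these hypothetical inputs the estimate comes to roughly 202.5 TB. Because compression varies by data type, it is worth rerunning the estimate with a lower ratio to see how sensitive the result is.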
Server Specification 
§ Master Nodes – NameNode, Resource Manager, HBase Master 
§ Dual Intel Xeon E5-26xx series processors 
§ 128GB or 256GB RAM per chassis 
§ 4+ – 1TB NL-SAS/SATA Drives RAID10+ Spares 
§ Worker Nodes – DataNode, Node Manager and Region Server 
§ Dual Intel Xeon E5-26xx series processors 
§ 128GB RAM or 256GB RAM 
§ 12 – 1-4 TB NL-SAS/SATA Drives 
§ Gateway Nodes / Edge Nodes 
§ Mirror of Master Nodes configuration 
Cluster Size 
Number of Data Nodes = Total Storage Required / Storage Per Server 
Number of Master Nodes 
§ Name Node, Zookeeper 
§ Resource Manager, Zookeeper 
§ Failover Name Node, HBase Master, Hive Server, Zookeeper 
  § In a half-rack cluster, this would be combined with Resource Manager 
§ Management Node (Ambari, Ganglia, Nagios) 
  § In a half-rack cluster, this would be combined with the Name Node 
Note 
§ Large clusters may need more than 4 master nodes 
§ Start at 2/4 and grow based on usage
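The data node count on this slide is the total storage required divided by the storage per server. A minimal sketch follows, with the result rounded up because a partial server cannot be deployed. The function name and the example figures (200 TB required, 22 TB usable per server) are hypothetical.

```python
import math


def data_nodes_required(total_storage_tb: float, storage_per_server_tb: float) -> int:
    """Number of data nodes needed to hold `total_storage_tb`, rounded up."""
    if storage_per_server_tb <= 0:
        raise ValueError("storage_per_server_tb must be positive")
    if total_storage_tb < 0:
        raise ValueError("total_storage_tb must not be negative")
    return math.ceil(total_storage_tb / storage_per_server_tb)


if __name__ == "__main__":
    # Hypothetical example: 200 TB of required storage, 22 TB usable per server.
    print(data_nodes_required(200, 22))
```

Master nodes are counted separately, following the list above.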
Factoring Performance 
§ Data Nodes 
§ 1 TB drives for performance clusters 
§ 4 TB drives for archive clusters 
§ Meeting SLA Requirements 
§ Hadoop workloads are varied 
§ Difficult to assess cluster size based on SLAs without actual testing 
§ Good News: Hadoop performs linearly with scale 
§ Enables one to design experiments using a fraction of data 
§ Best Practice Guidance 
§ Create a test configuration with a rack of servers 
§ Load a slice of data 
§ Run tests with real-life queries to measure performance & fine tune the system 
§ Scale cluster size based on observed performance 
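The best-practice steps above can be turned into a rough extrapolation once a test run has been measured. This sketch relies entirely on the slide's claim that Hadoop performs linearly with scale, which real workloads may only approximate. The function name and all numbers are hypothetical.

```python
import math


def nodes_for_sla(
    test_nodes: int,
    test_data_tb: float,
    test_runtime_hours: float,
    full_data_tb: float,
    target_runtime_hours: float,
) -> int:
    """Extrapolate cluster size from a test run, assuming linear scaling.

    Work grows in proportion to data volume, and runtime shrinks in
    proportion to node count. Both are assumptions to validate by testing.
    """
    if min(test_nodes, test_data_tb, test_runtime_hours, target_runtime_hours) <= 0:
        raise ValueError("test measurements and target runtime must be positive")
    data_scale = full_data_tb / test_data_tb
    time_scale = test_runtime_hours / target_runtime_hours
    return math.ceil(test_nodes * data_scale * time_scale)


if __name__ == "__main__":
    # Hypothetical test: 10 nodes processed a 5 TB slice in 2 hours.
    # Target: process the full 50 TB within 4 hours.
    print(nodes_for_sla(10, 5, 2, 50, 4))
```

The result is a starting point for the next test iteration, not a final answer. Re-measure on the larger cluster before committing to an SLA.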
HDP and HP are deeply integrated in the data center 
Diagram labels:
DEV & DATA TOOLS; OPERATIONAL TOOLS; INFRASTRUCTURE
SOURCES: EXISTING Systems; Clickstream; Web & Social; Geolocation; Sensor & Machine; Server Logs; Unstructured
DATA SYSTEM: RDBMS, EDW, MPP, HANA
APPLICATIONS: BusinessObjects BI
YARN
HDP 2.1: Governance & Integration; Security; Operations; Data Access; Data Management
Deep Partnerships 
Hortonworks and HP are engaged in deep engineered relationships with the leaders in the data center, such as Microsoft, Teradata, Red Hat, & SAP 
Broad Partnerships 
Over 600 partners work with Hortonworks to certify their applications to work with Hadoop so they can extend big data to their users
Delivering Apache 
Hadoop for the Modern 
Data Architecture 
HP + Hortonworks Validated Design 
Christopher Daly 
© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
The HP Approach to Apache Hadoop 
Why a Reference Architecture? 
• Provides a starting point or baseline 
• Maximum flexibility 
• Customizable to fit YOUR needs 
• Adopt the parts you want 
• Replace the parts you don’t 
Solution components 
Pre-deployment considerations / system selection 
• Operating system 
• Computation 
• Memory 
• Storage 
• Network 
High-availability considerations 
• Hadoop NameNode HA 
• ResourceManager HA 
• OS availability and reliability 
• Network reliability 
• Power supply 
Server selection 
Management nodes – The HP ProLiant DL360p Gen8 
The Management node and head nodes, as tested in the Reference Architecture, contain the following base configuration: 
• 2 x Eight-Core Intel E5-2650 v2 Processors 
• Smart Array P420i Controller with 512MB FBWC 
• 3.6 TB – 4 x 900GB SFF SAS 10K RPM disks 
• 128 GB DDR3 Memory – 8 x 16GB 2Rx4 PC3-14900R-13 
• 10GbE 2P NIC 561FLR-T card 
Server selection 
Worker nodes – ProLiant DL380p Gen8 
The ProLiant DL380p Gen8 (2U) as configured for the Reference Architecture as a worker node has the following configuration: 
• Dual 10-Core Intel Xeon E5-2670 v2 Processors with Hyper-Threading 
• Twelve 2TB 3.5” 7.2K LFF SATA MDL (22 TB for Data) 
• 128 GB DDR3 Memory (8 x HP 16GB), 4 channels per socket 
• 1 x 10GbE 2 Port NIC FlexibleLOM (Bonded) 
• 1 x Smart Array P420i Controller with 512MB FBWC 
Switch selection 
Top of Rack (ToR) switches 
The 5900AF-48XGT-4QSFP+ 10GbE is an ideal ToR switch with forty-eight 10GbE ports and four 40GbE uplinks providing resiliency, high availability and scalability support. In addition, this model comes with support for CAT6 cables (copper wires) and software-defined networking (SDN). 
Aggregation switches 
The FlexFabric 5930-32QSFP+ 40GbE switch is an ideal aggregation switch as it is well suited to handle very large volumes of inter-rack traffic such as can occur during shuffle and sort operations, or large scale block replication to recreate a failed node. 
HP Insight CMU – push-button scale-out management 
Provision, monitor, and control 
Thousands of nodes instantly 
Push-button roll out 
Provisioning via cloning for seamless scaling 
Rest easy 
Battle-tested at TOP500 sites for over a decade 
HP Insight CMU – GUI Monitoring at a Cluster level 
Historical analysis and job recording 
• Designed for Big Data customers 
• Multi-petal aggregated, 3D RT, and time series views of cluster metrics 
• “Click & zoom” analysis at both solution and component levels 
• Proactively identify and isolate performance issues 
Single Rack Reference Architecture 
Multi-Rack Reference Architecture 
Capacity and sizing 
Here is a general guideline on data inventory: 
• Sources of data 
• Frequency of data 
• Raw storage 
• Processed HDFS storage 
• Replication factor 
• Default compression turned on 
• Space for intermediate files 
System configuration guidance 
Machine Type | Workload Pattern/Cluster Type | Storage | Processor (# of Cores) | Memory (GB)
Slaves | Balanced workload | Four to six 1-2 TB disks | Dual 6/8/10 cores | 48-96
Slaves | Compute intensive workload | Four to six 1-2 TB disks | Dual 8/10/12 cores | 48-128
Slaves | IO intensive workload | Twelve 1-2 TB disks | Dual 8/10/12 cores | 48-96
Slaves | HBase clusters | Twelve 1-2 TB disks | Dual 8/10/12 cores | 48-128
Masters | All workload patterns/HBase clusters | Four to six 1-2 TB disks | Dual 6/8/10 cores | Depends on number of file system objects to be created by NameNode.
Network (all machine types): Dual 10 GB links for all nodes in a 20 node rack and min 2x10 / 2 x 40 GB interconnect links per rack going to a pair of central switches
For More Information 
Get the Reference Architecture at 
http://h20195.www2.hp.com/V2/GetDocument.aspx?docname=4AA5-4975ENW 
Hortonworks www.hortonworks.com 
HP Solutions for Apache Hadoop hp.com/go/Hadoop 
HP ProLiant servers hp.com/go/proliant 
HP Insight Cluster Management Utility (CMU) hp.com/go/cmu 
HP Networking hp.com/go/networking 
Or Contact Me: Christopher.Daly@hp.com 
Next Steps... 
More about HP & Hortonworks 
http://hortonworks.com/partner/HP 
Download the Hortonworks Sandbox 
Learn Hadoop 
Build Your Analytic App 
Try Hadoop 2 
Contact us: events@hortonworks.com 
THANK YOU 
Page 31 © Hortonworks Inc. 2014

Más contenido relacionado

Was ist angesagt?

Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceHortonworks
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataHortonworks
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHortonworks
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramHortonworks
 
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Hortonworks
 
Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Hortonworks
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...Hortonworks
 
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Hortonworks
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopHortonworks
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsHortonworks
 
Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Hortonworks
 
State of the Union with Shaun Connolly
State of the Union with Shaun ConnollyState of the Union with Shaun Connolly
State of the Union with Shaun ConnollyHortonworks
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Hortonworks
 
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Hortonworks
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks
 
Combine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNCombine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNHortonworks
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez Hortonworks
 
Enterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble StorageEnterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble StorageHortonworks
 

Was ist angesagt? (20)

Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 
Falcon Meetup
Falcon Meetup Falcon Meetup
Falcon Meetup
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
 
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
 
Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
 
Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'
 
State of the Union with Shaun Connolly
State of the Union with Shaun ConnollyState of the Union with Shaun Connolly
State of the Union with Shaun Connolly
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
 
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - Webinar
 
Combine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNCombine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARN
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
Enterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble StorageEnterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble Storage
 

Andere mochten auch

Big Data Analytics - It is here and now!
Big Data Analytics - It is here and now!Big Data Analytics - It is here and now!
Big Data Analytics - It is here and now!Farhan Khan
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Hortonworks
 
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
3 CTOs Discuss the Shift to Next-Gen Analytic EcosystemsHortonworks
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks
 
Hortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinarHortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinarHortonworks
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoptionHortonworks
 
Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks
 
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHow to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHortonworks
 
Adoption de Hadoop : des Possibilités Illimitées - Hortonworks and Talend
Adoption de Hadoop : des Possibilités Illimitées - Hortonworks and TalendAdoption de Hadoop : des Possibilités Illimitées - Hortonworks and Talend
Adoption de Hadoop : des Possibilités Illimitées - Hortonworks and TalendHortonworks
 
Hadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHortonworks
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarHortonworks
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Hortonworks
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopHortonworks
 
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big DataMicrosoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big DataHortonworks
 
Hortonworks and HP Vertica Webinar
Hortonworks and HP Vertica WebinarHortonworks and HP Vertica Webinar
Hortonworks and HP Vertica WebinarHortonworks
 
Dataguise hortonworks insurance_feb25
Dataguise hortonworks insurance_feb25Dataguise hortonworks insurance_feb25
Dataguise hortonworks insurance_feb25Hortonworks
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksHortonworks
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks
 
YARN webinar series: Using Scalding to write applications to Hadoop and YARN
YARN webinar series: Using Scalding to write applications to Hadoop and YARNYARN webinar series: Using Scalding to write applications to Hadoop and YARN
YARN webinar series: Using Scalding to write applications to Hadoop and YARNHortonworks
 

Andere mochten auch (20)

Big Data Analytics - It is here and now!
Big Data Analytics - It is here and now!Big Data Analytics - It is here and now!
Big Data Analytics - It is here and now!
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
 
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
 
Hortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinarHortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinar
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
 
Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar
 
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
Hp Converged Systems and Hortonworks - Webinar Slides

  • 1. Delivering Apache Hadoop for the Modern Data Architecture Page 1 © Hortonworks Inc. 2014 HP & Hortonworks. We do Hadoop Together
  • 2. Your speakers: Raghu Thiagarajan, Director, Partner Product Management, Hortonworks; Chris Daly, Chief Outbound Engineer, CSS and Big Data Systems, HP
  • 3. Why Hadoop: Traditional Data Architecture Pressured. 2.8 ZB in 2012; 85% from new data types; 15x machine data by 2020; 40 ZB by 2020 (data source: IDC). Sources: OLTP, ERP, CRM; documents, emails; web logs, click streams; social networks; machine generated; sensor data; geolocation data
  • 4. What: Business Applications of Hadoop. Matrix of use cases (rows) against data types (columns: Sensor, Server Logs, Text, Social, Geographic, Machine, Clickstream; Structured, Unstructured). Financial Services: New Account Risk Screens; Trading Risk; Insurance Underwriting. Telecom: Call Detail Records (CDR); Infrastructure Investment; Real-time Bandwidth Allocation. Retail: 360° View of the Customer; Localized, Personalized Promotions; Website Optimization
  • 5. What: Business Applications of Hadoop (continued). Matrix of use cases (rows) against data types (columns: Sensor, Server Logs, Text, Social, Geographic, Machine, Clickstream; Structured, Unstructured). Manufacturing: Supply Chain and Logistics; Preventive Maintenance; Crowd-sourced Quality Assurance. Healthcare: Use Genomic Data in Medical Trials; Monitor Patient Vitals in Real-Time. Pharmaceuticals: Recruit & Retain Patients for Drug Trials; Improve Prescription Adherence. Oil & Gas: Unify Exploration & Production Data; Monitor Rig Safety in Real-Time. Government: ETL Offload in Response to Budgetary Pressures; Sentiment Analysis for Gov’t Programs
  • 6. How: Modern Data Architecture with Hadoop. APPLICATIONS: Statistical Analysis; BI / Reporting, Ad Hoc Analysis; Interactive Web & Mobile Apps; Enterprise Applications. DEV & DATA TOOLS: Build & Test. OPERATIONS TOOLS: Provision, Manage & Monitor. DATA SYSTEMS: Repositories; ROOMS; RDBMS; EDW; MPP; ENTERPRISE HADOOP (Governance & Integration, Security, Operations, Data Access, Data Management). SOURCES: OLTP, ERP, CRM; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data
  • 7. YARN Transforms Hadoop’s Architecture. Enables deep insight across a large, broad, diverse set of data at efficient scale. Multi-Use Data Platform: store all data in one place, process in many ways (Batch, Interactive, Iterative, Streaming). YARN: Data Operating System. Nodes 1 to n: store any/all raw data sources and processed data over extended periods of time.
  • 8. Designing Hadoop Cluster: Cluster Storage Capacity; Server Specification; Cluster Size; Factoring Performance. Key Considerations: Any piece of hardware can and will fail; More nodes mean less impact on failure; Resiliency and fault tolerance improve with scale; Build resiliency through scale; Still use modern hardware; Software handles hardware failures
  • 9. Storage Capacity. Key Input: Initial Data Size; 3 year YOY growth; Compression ratio; Intermediate and materialized views; Replication Factor. Note: Hard to accurately predict the size of intermediate & materialized views at the start of a project; Be conservative with compression ratio, as mileage varies by data type; Hadoop needs temp space to store intermediate files. Diagram labels: Hadoop Cluster; Raw Data; Master Data; Work In Process Data; Materialized Views
  • 10. Storage Capacity. Total Storage Required = ((Initial Size + YOY Growth + Intermediate Data Size) × Replication Count × 1.2) / Compression Ratio. Good Rule of Thumb: Replication Count = 3; Compression Ratio = 4-5; Intermediate Data Size = 50%-100% of Raw Data Size. Note: the 1.2 factor is included in the sizing estimator to account for the temp space requirement of Hadoop.
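
As a rough illustration of the slide 10 rule of thumb, the sizing estimate can be sketched in Python. The function name `total_storage_tb`, its parameter names, and the sample inputs are assumptions made for this sketch and are not part of the original deck; YOY growth is treated here as an absolute amount of data (in TB) added over the planning period.

```python
def total_storage_tb(initial_tb, yoy_growth_tb, intermediate_tb,
                     replication=3, compression_ratio=4.0, temp_factor=1.2):
    """Estimate raw cluster capacity in TB from the slide 10 rule of thumb.

    Defaults follow the slide: replication count 3, compression ratio at the
    conservative end of 4-5, and a 1.2 factor for Hadoop temp space.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    # Data to be stored before replication and compression are applied.
    logical_tb = initial_tb + yoy_growth_tb + intermediate_tb
    return logical_tb * replication * temp_factor / compression_ratio


if __name__ == "__main__":
    # Hypothetical inputs: 100 TB initial data, 150 TB of growth over three
    # years, and intermediate data at 50% of the 100 TB raw size.
    print(round(total_storage_tb(100, 150, 50), 1))
```

With these hypothetical inputs the estimate comes to about 270 TB of raw disk; a lower (more conservative) compression ratio raises the figure proportionally.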
  • 11. Server Specification. Master Nodes (NameNode, Resource Manager, HBase Master): Dual Intel Xeon E5-26xx series processors; 128GB or 256GB RAM per chassis; 4+ x 1 TB NL-SAS/SATA drives, RAID10 + spares. Worker Nodes (DataNode, Node Manager and Region Server): Dual Intel Xeon E5-26xx series processors; 128GB or 256GB RAM; 12 x 1-4 TB NL-SAS/SATA drives. Gateway Nodes / Edge Nodes: Mirror of Master Nodes configuration
  • 12. Cluster Size. Number of Data Nodes = Total Storage Required / Storage Per Server. Number of Master Nodes: Name Node, Zookeeper; Resource Manager, Zookeeper; Failover Name Node, HBase Master, Hive Server, Zookeeper (in a half-rack cluster, this would be combined with Resource Manager); Management Node (Ambari, Ganglia, Nagios) (in a half-rack cluster, this would be combined with the Name Node). Note: Large clusters may need more than 4 master nodes; Start at 2/4 and grow based on usage
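
The slide 12 ratio (total storage required divided by storage per server) can be sketched as below. The helper name `data_nodes_needed` and the sample figures are hypothetical; rounding up to a whole server is an assumption of this sketch, since the slide only gives the ratio.

```python
import math


def data_nodes_needed(total_storage_tb, storage_per_server_tb):
    """Return the number of data nodes needed to hold total_storage_tb."""
    if storage_per_server_tb <= 0:
        raise ValueError("storage_per_server_tb must be positive")
    # A fraction of a server cannot be deployed, so round up.
    return math.ceil(total_storage_tb / storage_per_server_tb)


if __name__ == "__main__":
    # Hypothetical: 270 TB required, 24 TB of disk per worker node.
    print(data_nodes_needed(270, 24))
```

Master nodes are counted separately, following the slide's guidance to start at 2/4 and grow based on usage.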
  • 13. Factoring Performance. Data Nodes: 1 TB drives for performance clusters; 4 TB drives for archive clusters. Meeting SLA Requirements: Hadoop workloads are varied; Difficult to assess cluster size based on SLAs without actual testing; Good News: Hadoop performs linearly with scale, which enables one to design experiments using a fraction of data. Best Practice Guidance: Create a test configuration with a rack of servers; Load a slice of data; Run tests with real-life queries to measure performance & fine tune the system; Scale cluster size based on observed performance
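
One way to apply the slide 13 guidance is to extrapolate from a test run on a slice of data. The sketch below assumes throughput scales linearly with node count, as the slide states; the function name `nodes_for_sla`, its parameters, and the sample numbers are hypothetical, and real workloads should be re-measured as the cluster grows.

```python
import math


def nodes_for_sla(test_nodes, test_data_tb, test_hours,
                  target_data_tb, sla_hours):
    """Extrapolate the node count needed to process target_data_tb within
    sla_hours, given a test run of test_data_tb on test_nodes that took
    test_hours. Assumes linear scaling with the number of nodes."""
    inputs = (test_nodes, test_data_tb, test_hours, target_data_tb, sla_hours)
    if min(inputs) <= 0:
        raise ValueError("all inputs must be positive")
    # Node-hours spent per TB in the test, scaled to the target volume and
    # divided by the time allowed by the SLA.
    node_hours_needed = test_nodes * test_hours * target_data_tb
    return math.ceil(node_hours_needed / (test_data_tb * sla_hours))


if __name__ == "__main__":
    # Hypothetical test: one rack of 18 nodes processed a 10 TB slice in
    # 2 hours; the target is 100 TB within a 4 hour window.
    print(nodes_for_sla(18, 10, 2, 100, 4))
```

The result is a starting point for the "scale cluster size based on observed performance" step, not a guarantee that an SLA will be met.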
  • 14. HDP and HP are deeply integrated in the data center. Diagram labels: DEV & DATA TOOLS; OPERATIONAL TOOLS; INFRASTRUCTURE; SOURCES (EXISTING Systems, Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured); DATA SYSTEM (RDBMS, EDW, MPP, HANA); APPLICATIONS (BusinessObjects BI); YARN; HDP 2.1 (Governance & Integration, Security, Operations, Data Access, Data Management). Deep Partnerships: Hortonworks and HP are engaged in deeply engineered relationships with the leaders in the data center, such as Microsoft, Teradata, Red Hat, and SAP. Broad Partnerships: Over 600 partners work with Hortonworks to certify their applications to work with Hadoop so they can extend big data to their users
  • 15. Delivering Apache Hadoop for the Modern Data Architecture HP + Hortonworks Validated Design Christopher Daly © Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
  • 16. The HP Approach to Apache Hadoop. Why a Reference Architecture? Provides a starting point or baseline; Maximum flexibility; Customizable to fit YOUR needs; Adopt the parts you want; Replace the parts you don’t
  • 17. Solution components
  • 18. Pre-deployment considerations / system selection: Operating system; Computation; Memory; Storage; Network
  • 19. High-availability considerations: Hadoop NameNode HA; ResourceManager HA; OS availability and reliability; Network reliability; Power supply
  • 20. Server selection. Management nodes: the HP ProLiant DL360p Gen8. The Management node and head nodes, as tested in the Reference Architecture, contain the following base configuration: 2 x Eight-Core Intel E5-2650 v2 Processors; Smart Array P420i Controller with 512MB FBWC; 3.6 TB (4 x 900GB SFF SAS 10K RPM disks); 128 GB DDR3 Memory (8 x 16GB 2Rx4 PC3-14900R-13); 10GbE 2P NIC 561FLR-T card
  • 21. Server selection. Worker nodes: ProLiant DL380p Gen8. The ProLiant DL380p Gen8 (2U), as configured for the Reference Architecture as a worker node, has the following configuration: Dual 10-Core Intel Xeon E5-2670 v2 Processors with Hyper-Threading; Twelve 2TB 3.5” 7.2K LFF SATA MDL (22 TB for Data); 128 GB DDR3 Memory (8 x HP 16GB), 4 channels per socket; 1 x 10GbE 2 Port NIC FlexibleLOM (Bonded); 1 x Smart Array P420i Controller with 512MB FBWC
  • 22. Switch selection. Top of Rack (ToR) switches: The 5900AF-48XGT-4QSFP+ 10GbE switch is an ideal ToR switch, with forty-eight 10GbE ports and four 40GbE uplinks providing resiliency, high availability and scalability support. In addition, this model supports CAT6 (copper) cabling and software-defined networking (SDN). Aggregation switches: The FlexFabric 5930-32QSFP+ 40GbE switch is an ideal aggregation switch, as it is well suited to handle very large volumes of inter-rack traffic, such as can occur during shuffle and sort operations or large-scale block replication to recreate a failed node
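
The port counts quoted for the ToR switch (forty-eight 10GbE ports, four 40GbE uplinks) imply an uplink oversubscription ratio, which matters for the inter-rack shuffle and replication traffic mentioned above. The sketch below is illustrative only: the function name `oversubscription_ratio` is hypothetical, and it assumes every downlink port is populated and all four uplinks carry traffic, which the slide does not state.

```python
def oversubscription_ratio(down_ports, down_gbps, up_ports, up_gbps):
    """Ratio of total server-facing bandwidth to total uplink bandwidth."""
    downlink_gbps = down_ports * down_gbps
    uplink_gbps = up_ports * up_gbps
    if uplink_gbps <= 0:
        raise ValueError("uplink bandwidth must be positive")
    return downlink_gbps / uplink_gbps


if __name__ == "__main__":
    # 48 x 10GbE server-facing ports against 4 x 40GbE uplinks.
    print(oversubscription_ratio(48, 10, 4, 40))
```

Under these assumptions the server-facing bandwidth is 480 Gb/s against 160 Gb/s of uplink, a ratio of 3 to 1; racks with fewer populated ports would see a lower ratio.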
  • 23. HP Insight CMU: pushbutton scale-out management. Provision, monitor, and control: thousands of nodes instantly. Push-button roll out: provisioning via cloning for seamless scaling. Rest easy: battle-tested at top 500 sites for over a decade
  • 24. HP Insight CMU: GUI. Monitoring at a cluster level; Historical analysis and job recording. Designed for Big Data customers; Multi-petal aggregated, 3D RT, and time series views of cluster metrics; “Click & zoom” analysis at both solution and component levels; Proactively identify and isolate performance issues
  • 25. Single Rack Reference Architecture
  • 26. Multi-Rack Reference Architecture
  • 27. Capacity and sizing. Here is a general guideline on data inventory: Sources of data; Frequency of data; Raw storage; Processed HDFS storage; Replication factor; Default compression turned on; Space for intermediate files
  • 28. System configuration guidance. Columns: Machine Type; Workload Pattern/Cluster Type; Storage; Processor (# of Cores); Memory (GB); Network. Slaves, balanced workload: four to six 1-2 TB disks; dual 6/8/10 cores; 48-96 GB. Slaves, compute intensive workload: four to six 1-2 TB disks; dual 8/10/12 cores; 48-128 GB. Slaves, IO intensive workload: twelve 1-2 TB disks; dual 8/10/12 cores; 48-96 GB. Slaves, HBase clusters: twelve 1-2 TB disks; dual 8/10/12 cores; 48-128 GB. Masters, all workload patterns/HBase clusters: four to six 1-2 TB disks; dual 6/8/10 cores; memory depends on the number of file system objects to be created by the NameNode. Network: dual 10 GB links for all nodes in a 20 node rack and min 2 x 10 / 2 x 40 GB interconnect links per rack going to a pair of central switches
  • 29. For More Information. Get the Reference Architecture at http://h20195.www2.hp.com/V2/GetDocument.aspx?docname=4AA5-4975ENW ; Hortonworks: www.hortonworks.com ; HP Solutions for Apache Hadoop: hp.com/go/Hadoop ; HP ProLiant servers: hp.com/go/proliant ; HP Insight Cluster Management Utility (CMU): hp.com/go/cmu ; HP Networking: hp.com/go/networking ; Or contact me: Christopher.Daly@hp.com
  • 30. Next Steps... More about HP & Hortonworks: http://hortonworks.com/partner/HP ; Download the Hortonworks Sandbox; Learn Hadoop; Build Your Analytic App; Try Hadoop 2. Contact us: events@hortonworks.com
  • 31. THANK YOU