Page 1 © Hortonworks Inc. 2014
Delivering Apache Hadoop for the Modern
Data Architecture
Cisco & Hortonworks. We do Hadoop
Together
Page 2 © Hortonworks Inc. 2014
Our speakers…
Ajay Singh
Director Technical Channels, Hortonworks
Sean McKeown
Solutions Architect, Data Center, Cisco
Page 3 © Hortonworks Inc. 2014
Why Hadoop: Traditional Data Architecture Pressured
2.8 ZB in 2012
85% from New Data Types
15x Machine Data by 2020
40 ZB by 2020
Data source: IDC
SOURCES
OLTP, ERP,
CRM
Documents,
Emails
Web Logs,
Click
Streams
Social
Networks
Machine
Generated
Sensor
Data
Geolocation
Data
Page 4 © Hortonworks Inc. 2014
What: Business Applications of Hadoop
Data types: Sensor, Server Logs, Text, Social, Geographic, Machine, Clickstream, Structured, Unstructured
Financial Services
§  New Account Risk Screens
§  Trading Risk
§  Insurance Underwriting
Telecom
§  Call Detail Records (CDR)
§  Infrastructure Investment
§  Real-time Bandwidth Allocation
Retail
§  360° View of the Customer
§  Localized, Personalized Promotions
§  Website Optimization
Page 5 © Hortonworks Inc. 2014
What: Business Applications of Hadoop
Data types: Sensor, Server Logs, Text, Social, Geographic, Machine, Clickstream, Structured, Unstructured
Manufacturing
§  Supply Chain and Logistics
§  Preventive Maintenance
§  Crowd-sourced Quality Assurance
Healthcare
§  Use Genomic Data in Medical Trials
§  Monitor Patient Vitals in Real-Time
Pharmaceuticals
§  Recruit & Retain Patients for Drug Trials
§  Improve Prescription Adherence
Oil & Gas
§  Unify Exploration & Production Data
§  Monitor Rig Safety in Real-Time
Government
§  ETL Offload in Response to Budgetary Pressures
§  Sentiment Analysis for Gov’t Programs
Page 6 © Hortonworks Inc. 2014
OPERATIONS TOOLS
Provision,
Manage &
Monitor
DEV & DATA TOOLS
Build & Test
DATA SYSTEMS
APPLICATIONS
Repositories
ROOMS
Statistical
Analysis
BI / Reporting,
Ad Hoc Analysis
Interactive Web
& Mobile Apps
Enterprise
Applications
RDBMS EDW MPP
How: Modern Data Architecture with Hadoop
Governance & Integration
Security
Operations
Data Access
Data Management
ENTERPRISE HADOOP
SOURCES
OLTP, ERP,
CRM
Documents,
Emails
Web Logs,
Click Streams
Social
Networks
Machine
Generated
Sensor
Data
Geolocation
Data
Page 7 © Hortonworks Inc. 2014
YARN Transforms Hadoop’s Architecture
	
  	
  
Enables deep insight across a large, broad, diverse set of data at efficient scale
Multi-Use Data Platform
Store all data in one place, process in many ways
Batch, Interactive, Iterative, Streaming
Store any/all raw data sources and processed data over extended periods of time.
YARN: Data Operating System
Page 8 © Hortonworks Inc. 2014
Designing Hadoop Cluster
§ Cluster Storage Capacity
§ Server Specification
§ Cluster Size
§ Factoring Performance
Key Considerations
§  Any piece of hardware can and will
fail
§  More nodes means less impact on
failure
§  Resiliency and fault tolerance
improve with scale
§  Build resiliency through scale
§  Still use modern hardware
§  Software handles hardware failures
Page 9 © Hortonworks Inc. 2014
Storage Capacity
§  Key Input
§  Initial Data Size
§  3 year YOY growth
§  Compression ratio
§  Intermediate and materialized views
§  Replication Factor
§  Note
§  Hard to accurately predict the size of intermediate & materialized views at the start of a
project
§  Be conservative with compression ratio. Mileage varies by data type
§  Hadoop needs temp space to store intermediate files
Hadoop Cluster
Raw Data
Work In Process Data
Master Data
Materialized Views
Page 10 © Hortonworks Inc. 2014
Storage Capacity
Total Storage Required = (Initial Size + YOY Growth + Intermediate Data Size) × Replication Count × 1.2 / Compression Ratio
Good Rule of Thumb
§  Replication Count = 3
§  Compression Ratio = 4-5
§  Intermediate Data Size = 50%-100% of Raw Data Size
Note
§  The 1.2 factor is included in the sizing estimator to account for the temp space requirement of Hadoop
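The sizing estimator on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a Hortonworks tool: the function name, parameter names, and the sample inputs (100 TB initial, 150 TB growth, 50 TB intermediate) are assumptions chosen for the example. The defaults follow the rule-of-thumb values on the slide.

```python
def total_storage_required(initial_size_tb, yoy_growth_tb, intermediate_size_tb,
                           replication_count=3, compression_ratio=4.0,
                           temp_space_factor=1.2):
    """Estimate raw cluster storage in TB using the slide's sizing formula."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    # Data volume before replication and compression are applied.
    logical_tb = initial_size_tb + yoy_growth_tb + intermediate_size_tb
    # Replicate, add 20% for Hadoop temp space, then divide by compression.
    return logical_tb * replication_count * temp_space_factor / compression_ratio


# Hypothetical inputs: 100 TB initial, 150 TB cumulative growth over 3 years,
# and intermediate data at 50% of the initial raw data size.
estimate_tb = total_storage_required(100, 150, 50)
print(f"Estimated storage: {estimate_tb:.1f} TB")
```

Because compression ratios vary by data type, it may be worth running the estimate with the lower end of the range (4) rather than the upper end (5).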
Page 11 © Hortonworks Inc. 2014
Server Specification
§  Master Nodes – NameNode, Resource Manager, HBase Master
§  Dual Intel Xeon E5-26xx series processors
§  128GB or 256GB RAM per chassis
§  4+ – 1TB NL-SAS/SATA Drives RAID10+ Spares
§  Worker Nodes – DataNode, Node Manager and Region Server
§  Dual Intel Xeon E5-26xx series processors
§  128GB RAM or 256GB RAM
§  12 – 1-4 TB NL-SAS/SATA Drives
§  Gateway Nodes / Edge Nodes
§  Mirror of Master Nodes configuration
Page 12 © Hortonworks Inc. 2014
Cluster Size
Number of Data Nodes = Total Storage Required / Storage Per Server
Number of Master Nodes
§  Name Node, Zookeeper
§  Resource Manager, Zookeeper
§  Failover Name Node, HBase Master, Hive Server, Zookeeper
§  In a half-rack cluster, this would be combined with Resource Manager
§  Management Node (Ambari, Ganglia, Nagios)
§  In a half-rack cluster, this would be combined with the Name Node
Note
§  Large clusters may need more than 4 master nodes
§  Start at 2/4 and grow based on usage
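The data-node count on this slide is total storage required divided by storage per server. A small sketch of that division, rounding up to whole servers, is shown here. The function name and the sample figures (270 TB required, workers with 12 drives of 4 TB each) are assumptions for illustration only.

```python
import math


def data_nodes_required(total_storage_tb, storage_per_server_tb):
    """Return the number of data nodes, rounded up to a whole server."""
    if storage_per_server_tb <= 0:
        raise ValueError("storage_per_server_tb must be positive")
    return math.ceil(total_storage_tb / storage_per_server_tb)


# Hypothetical worker node: 12 drives x 4 TB = 48 TB of raw disk per server.
storage_per_server_tb = 12 * 4
nodes = data_nodes_required(270, storage_per_server_tb)
print(f"Data nodes needed: {nodes}")
```

Master, gateway, and management nodes are counted separately and are not part of this figure.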
Page 13 © Hortonworks Inc. 2014
Factoring Performance
§ Data Nodes
§ 1 TB drives for performance clusters
§ 4 TB drives for archive clusters
§ Meeting SLA Requirements
§  Hadoop workloads are varied
§  Difficult to assess cluster size based on SLAs without actual testing
§  Good News: Hadoop performs linearly with scale
§  Enables one to design experiments using a fraction of data
§  Best Practice Guidance
§  Create a test configuration with a rack of servers
§  Load a slice of data
§  Run tests with real-life queries to measure performance & fine tune the system
§  Scale cluster size based on observed performance
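The best-practice steps above can be turned into a rough extrapolation: measure a slice of data on a test rack, then scale the node count under the slide's assumption that Hadoop performs linearly with scale. The function name and all sample numbers below are hypothetical, and the linear model is only a starting point that should be re-checked against observed performance as the cluster grows.

```python
import math


def nodes_for_sla(test_nodes, test_data_tb, test_runtime_h,
                  full_data_tb, sla_runtime_h):
    """Estimate worker nodes needed to process full_data_tb within sla_runtime_h.

    Assumes throughput grows linearly with the number of nodes.
    """
    inputs = (test_nodes, test_data_tb, test_runtime_h, full_data_tb, sla_runtime_h)
    if min(inputs) <= 0:
        raise ValueError("all inputs must be positive")
    # Throughput observed per node on the test rack (TB per node-hour).
    per_node_tb_per_hour = test_data_tb / (test_nodes * test_runtime_h)
    # Nodes needed so the full data set finishes inside the SLA window.
    return math.ceil(full_data_tb / (per_node_tb_per_hour * sla_runtime_h))


# Hypothetical test: a 16-server rack processes a 10 TB slice in 2 hours.
# Target: 100 TB of data processed within a 4-hour SLA.
required = nodes_for_sla(16, 10, 2, 100, 4)
print(f"Estimated worker nodes: {required}")
```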
Page 14 © Hortonworks Inc. 2014
HDP and Cisco are deeply integrated in the data center
OPERATIONAL TOOLS
DEV & DATA TOOLS
INFRASTRUCTURE
SOURCES
EXISTING Systems
Clickstream
Web & Social
Geolocation
Sensor & Machine
Server Logs
Unstructured
DATA SYSTEM
RDBMS, EDW, MPP
HANA
APPLICATIONS
BusinessObjects BI
Deep Partnerships
Hortonworks and Cisco engage in deeply engineered relationships with the leaders in the data center, such as Microsoft, Teradata, Red Hat, & SAP
Broad Partnerships
Over 600 partners work with Hortonworks to certify their applications to work with Hadoop so they can extend big data to their users
HDP 2.1
Governance & Integration
Security
Operations
Data Access
Data Management
YARN
Page 15 © Hortonworks Inc. 2014
Cisco + Hortonworks Validated Design
Sean McKeown
Solutions Architect, Data Center, Cisco
Page 16 © Hortonworks Inc. 2014
Cisco + Hortonworks Validated Design
Page 17 © Hortonworks Inc. 2014
Cisco UCS Common Platform Architecture (CPA)
Building Blocks for Big Data
UCS 6200 Series Fabric Interconnects
Nexus 2232 Fabric Extenders
UCS Manager
UCS 240 M3 Servers
LAN, SAN, Management
Page 18 © Hortonworks Inc. 2014
UCS + Hortonworks Reference Configurations
… unformatted storage per rack for a total of 7.68 petabytes (PB) when scaled to …
… per rack, for a total of 7.68 PB and 31.25 TB of flash memory per domain.
… entailed in designing and building your own custom solution. The solution …
Table 1. Cisco CPA v2 for Big Data Includes Four Optimized Configurations

Performance Optimized (UCS-SL-CPA2-P)
Connectivity
• 2 Cisco UCS 6248UP 48-Port Fabric Interconnects
• 2 Cisco Nexus® 2232PP 10GE Fabric Extenders
Management
• Cisco UCS Manager
Servers: 8 Cisco UCS C240 M3 Rack Servers, each with:
• 2 Intel Xeon processors E5-2680 v2
• 256 GB of memory
• LSI MegaRaid 9271CV 8i card
• 24 900-GB 10K SFF SAS drives (168 TB total)

Performance and Capacity Balanced (UCS-SL-CPA2-PC)
Connectivity
• 2 Cisco UCS 6296UP 96-Port Fabric Interconnects
• 2 Cisco Nexus 2232PP 10GE Fabric Extenders
Management
• Cisco UCS Manager
Servers: 16 Cisco UCS C240 M3 Rack Servers, each with:
• 2 Intel Xeon processors E5-2660 v2
• 256 GB of memory
• LSI MegaRaid 9271CV 8i card
• 24 1-TB 7.2K SFF SAS drives (384 TB total)

Capacity Optimized (UCS-SL-CPA2-C)
Connectivity
• 2 Cisco UCS 6296UP 96-Port Fabric Interconnects
• 2 Cisco Nexus 2232PP 10GE Fabric Extenders
Management
• Cisco UCS Manager
Servers: 16 Cisco UCS C240 M3 Rack Servers, each with:
• 2 Intel Xeon processors E5-2640 v2
• 128 GB of memory
• LSI MegaRaid 9271CV 8i card
• 12 4-TB 7.2K LFF SAS drives (768 TB total)

Capacity Optimized with Flash Memory (UCS-SL-CPA2-CF)
Connectivity
• 2 Cisco UCS 6296UP 96-Port Fabric Interconnects
• 2 Cisco Nexus 2232PP 10GE Fabric Extenders
Management
• Cisco UCS Manager
Servers: 16 Cisco UCS C240 M3 Rack Servers, each with:
• 2 Intel Xeon processors E5-2660 v2
• 128 GB of memory
• Cisco UCS Nytro MegaRAID 200-GB Controller
• 12 4-TB 7.2K LFF SAS drives (768 TB total)
Page 19 © Hortonworks Inc. 2014
Installing Servers Today
LAN
SAN
• RAID settings
• Disk scrub actions
• Number of vHBAs
• HBA WWN assignments
• FC Boot Parameters
• HBA firmware
• FC Fabric assignments for HBAs
• QoS settings
• Border port assignment per vNIC
• NIC Transmit/Receive Rate Limiting
• VLAN assignments for NICs
• VLAN tagging config for NICs
• Number of vNICs
• PXE settings
• NIC firmware
• Advanced feature settings
• Remote KVM IP settings
• Call Home behaviour
• Remote KVM firmware
• Server UUID
• Serial over LAN settings
• Boot order
• IPMI settings
• BIOS scrub actions
• BIOS firmware
• BIOS Settings
Page 20 © Hortonworks Inc. 2014
UCS Service Profiles
LAN
SAN
Service Profile
Page 21 © Hortonworks Inc. 2014
Abstracting the Logical Architecture
What you see (physical): Server, Adapter, 10GE A, FEX A, Eth 1/1 on Switch 6200-A, joined by a Physical Cable that carries a Virtual Cable (VN-Tag)
What you get (logical): Service Profile (Server) with vNIC 1 to vEth 1 and vHBA 1 to vFC 1 on 6200-A, joined by virtual Cables
✔ Dynamic, Rapid Provisioning
✔ State abstraction
✔ Location Independence
✔ Blade or Rack
Page 22 © Hortonworks Inc. 2014
Cisco UCS: Physical Architecture
Fabric Switch: 6200 Fabric A, 6200 Fabric B (Cluster)
Uplink Ports: SAN A, SAN B, ETH 1, ETH 2
OOB Mgmt: MGMT
Server Ports
Fabric Extenders: FEX A, FEX B
Chassis 1
Compute Blades (Half / Full width): B200
Virtualized Adapters: VIC
Rack Mount C240
Page 23 © Hortonworks Inc. 2014
Simple Scalability
Single Rack
16 servers
Single Domain
Up to 10 racks, 160 servers, 7 PB
Multiple Domains
L2/L3 Switching
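The single-domain figures can be checked with simple arithmetic. The sketch below assumes the capacity-optimized configuration from Table 1 (16 servers per rack, 12 drives of 4 TB each) and decimal units (1 PB = 1000 TB); the constant and function names are illustrative, not Cisco terminology.

```python
SERVERS_PER_RACK = 16       # single rack, per the slide
DRIVES_PER_SERVER = 12      # capacity-optimized configuration (assumed)
DRIVE_TB = 4                # 4-TB LFF SAS drives (assumed)
MAX_RACKS_PER_DOMAIN = 10   # single UCS domain, per the slide


def domain_capacity(racks):
    """Return (server count, unformatted capacity in TB) for a number of racks."""
    if not 1 <= racks <= MAX_RACKS_PER_DOMAIN:
        raise ValueError("a single domain holds 1 to 10 racks")
    servers = racks * SERVERS_PER_RACK
    raw_tb = servers * DRIVES_PER_SERVER * DRIVE_TB
    return servers, raw_tb


rack_servers, rack_tb = domain_capacity(1)
domain_servers, domain_tb = domain_capacity(MAX_RACKS_PER_DOMAIN)
print(f"One rack: {rack_servers} servers, {rack_tb} TB")
print(f"Full domain: {domain_servers} servers, {domain_tb / 1000} PB")
```

These are unformatted capacities. Usable HDFS capacity would be lower once replication and temp space are taken into account.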
Page 24 © Hortonworks Inc. 2014
Proven performance and linear scalability
Page 25 © Hortonworks Inc. 2014
Simplified Management Throughout Cluster Lifecycle
Provisioning
Monitoring
Maintenance
Growth
UCSM provides:
•  Speed
•  Ease of experimentation
•  Consistency
•  Simplicity
•  Visibility
Page 26 © Hortonworks Inc. 2014
Complete Network Flexibility
Example:
•  vNIC0 for management
•  vNIC1 for internal
•  vNIC2 for external
•  No OS bonding needed
with Fabric Failover
Configure as many vNICs and VLANs as you need with the click of a mouse
[Diagram: Data Node 1 and Data Node 2 each have vNIC 0, vNIC 1, and vNIC 2, connected through 6200 A and 6200 B to L2/L3 switching for data ingress/egress]
Page 27 © Hortonworks Inc. 2014
Creating QoS Policies and Enabling JumboFrames
Best Effort policy for management VLAN
Platinum policy for cluster VLAN
Page 28 © Hortonworks Inc. 2014
HBase + Hadoop Map Reduce (Terasort)
Switch Buffer Usage with Network QoS Policy to prioritize HBase Read Operations
[Chart: buffer used over the timeline, Hadoop TeraSort vs. HBase]
Read Latency: Comparison of Non-QoS vs. QoS Policy
[Chart: READ average latency (us) over time, without QoS vs. with QoS]
~60% Read Improvement
Page 29 © Hortonworks Inc. 2014
UCS Rack-Mount
Servers
UCS Blade
Servers
UCS Common Platform
Architecture with Hortonworks
SAN/NAS Arrays
Enterprise Applications
Single Platform for Traditional and Big Data Applications
Page 30 © Hortonworks Inc. 2014
THANK YOU
ajaysingh@hortonworks.com
semckeow@cisco.com

Oracle big data appliance and solutions
 
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
 
Hortonworks.bdb
Hortonworks.bdbHortonworks.bdb
Hortonworks.bdb
 
Horses for Courses: Database Roundtable
Horses for Courses: Database RoundtableHorses for Courses: Database Roundtable
Horses for Courses: Database Roundtable
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
 
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
 
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseUsing the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
 
Is your cloud ready for Big Data? Strata NY 2013
Is your cloud ready for Big Data? Strata NY 2013Is your cloud ready for Big Data? Strata NY 2013
Is your cloud ready for Big Data? Strata NY 2013
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference Architectures
 
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
 
Big Data Infrastructure
Big Data InfrastructureBig Data Infrastructure
Big Data Infrastructure
 
Empower Data-Driven Organizations
Empower Data-Driven OrganizationsEmpower Data-Driven Organizations
Empower Data-Driven Organizations
 
VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right
 
Data core overview - haluk-final
Data core overview - haluk-finalData core overview - haluk-final
Data core overview - haluk-final
 
Postgres for Digital Transformation: NoSQL Features, Replication, FDW & More
Postgres for Digital Transformation:NoSQL Features, Replication, FDW & MorePostgres for Digital Transformation:NoSQL Features, Replication, FDW & More
Postgres for Digital Transformation: NoSQL Features, Replication, FDW & More
 
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data PlatformDeploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
 
IBM - Introduction to Cloudant
IBM - Introduction to CloudantIBM - Introduction to Cloudant
IBM - Introduction to Cloudant
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
 
1. beyond mission critical virtualizing big data and hadoop
1. beyond mission critical   virtualizing big data and hadoop1. beyond mission critical   virtualizing big data and hadoop
1. beyond mission critical virtualizing big data and hadoop
 
Data & Analytics - Session 1 - Big Data Analytics
Data & Analytics - Session 1 -  Big Data AnalyticsData & Analytics - Session 1 -  Big Data Analytics
Data & Analytics - Session 1 - Big Data Analytics
 

Mehr von Hortonworks

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyHortonworks
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakHortonworks
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsHortonworks
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysHortonworks
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's NewHortonworks
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerHortonworks
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsHortonworks
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeHortonworks
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidHortonworks
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleHortonworks
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATAHortonworks
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Hortonworks
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseHortonworks
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseHortonworks
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationHortonworks
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementHortonworks
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHortonworks
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCHortonworks
 

Mehr von Hortonworks (20)

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with Cloudbreak
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 

Último

SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTSIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTxtailishbaloch
 
Graphene Quantum Dots-Based Composites for Biomedical Applications
Graphene Quantum Dots-Based Composites for  Biomedical ApplicationsGraphene Quantum Dots-Based Composites for  Biomedical Applications
Graphene Quantum Dots-Based Composites for Biomedical Applicationsnooralam814309
 
Explore the UiPath Community and ways you can benefit on your journey to auto...
Explore the UiPath Community and ways you can benefit on your journey to auto...Explore the UiPath Community and ways you can benefit on your journey to auto...
Explore the UiPath Community and ways you can benefit on your journey to auto...DianaGray10
 
LF Energy Webinar - Unveiling OpenEEMeter 4.0
LF Energy Webinar - Unveiling OpenEEMeter 4.0LF Energy Webinar - Unveiling OpenEEMeter 4.0
LF Energy Webinar - Unveiling OpenEEMeter 4.0DanBrown980551
 
Introduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationIntroduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationKnoldus Inc.
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024Brian Pichman
 
Flow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First FrameFlow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First FrameKapil Thakar
 
3 Pitfalls Everyone Should Avoid with Cloud Data
3 Pitfalls Everyone Should Avoid with Cloud Data3 Pitfalls Everyone Should Avoid with Cloud Data
3 Pitfalls Everyone Should Avoid with Cloud DataEric D. Schabell
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightSafe Software
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxNeo4j
 
UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3DianaGray10
 
.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptxHansamali Gamage
 
From the origin to the future of Open Source model and business
From the origin to the future of  Open Source model and businessFrom the origin to the future of  Open Source model and business
From the origin to the future of Open Source model and businessFrancesco Corti
 
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc
 
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox
 
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - TechWebinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - TechProduct School
 
Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Muhammad Tiham Siddiqui
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch TuesdayIvanti
 
The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)IES VE
 

Último (20)

SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTSIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
 
Graphene Quantum Dots-Based Composites for Biomedical Applications
Graphene Quantum Dots-Based Composites for  Biomedical ApplicationsGraphene Quantum Dots-Based Composites for  Biomedical Applications
Graphene Quantum Dots-Based Composites for Biomedical Applications
 
Explore the UiPath Community and ways you can benefit on your journey to auto...
Explore the UiPath Community and ways you can benefit on your journey to auto...Explore the UiPath Community and ways you can benefit on your journey to auto...
Explore the UiPath Community and ways you can benefit on your journey to auto...
 
LF Energy Webinar - Unveiling OpenEEMeter 4.0
LF Energy Webinar - Unveiling OpenEEMeter 4.0LF Energy Webinar - Unveiling OpenEEMeter 4.0
LF Energy Webinar - Unveiling OpenEEMeter 4.0
 
Introduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationIntroduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its application
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024
 
Flow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First FrameFlow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First Frame
 
3 Pitfalls Everyone Should Avoid with Cloud Data
3 Pitfalls Everyone Should Avoid with Cloud Data3 Pitfalls Everyone Should Avoid with Cloud Data
3 Pitfalls Everyone Should Avoid with Cloud Data
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
 
UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3
 
.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx
 
From the origin to the future of Open Source model and business
From the origin to the future of  Open Source model and businessFrom the origin to the future of  Open Source model and business
From the origin to the future of Open Source model and business
 
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
 
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
 
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - TechWebinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
 
Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch Tuesday
 
SheDev 2024
SheDev 2024SheDev 2024
SheDev 2024
 
The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)
 

Delivering Apache Hadoop for the Modern Data Architecture

  • 1. Page 1 © Hortonworks Inc. 2014 Delivering Apache Hadoop for the Modern Data Architecture Cisco & Hortonworks. We do Hadoop Together
  • 2. Our speakers: Ajay Singh, Director Technical Channels, Hortonworks; Sean McKeown, Solutions Architect, Data Center, Cisco
  • 3. Why Hadoop: Traditional Data Architecture Pressured. 2.8 ZB in 2012; 85% from New Data Types; 15x Machine Data by 2020; 40 ZB by 2020 (data source: IDC). SOURCES: OLTP, ERP, CRM; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data
  • 4. What: Business Applications of Hadoop. Data type columns: Sensor, Server Logs, Text, Social, Geographic, Machine, Clickstream, Structured, Unstructured. Financial Services: New Account Risk Screens ✔ ✔; Trading Risk ✔; Insurance Underwriting ✔ ✔ ✔. Telecom: Call Detail Records (CDR) ✔ ✔; Infrastructure Investment ✔ ✔; Real-time Bandwidth Allocation ✔ ✔ ✔. Retail: 360° View of the Customer ✔ ✔; Localized, Personalized Promotions ✔; Website Optimization ✔
  • 5. What: Business Applications of Hadoop. Data type columns: Sensor, Server Logs, Text, Social, Geographic, Machine, Clickstream, Structured, Unstructured. Manufacturing: Supply Chain and Logistics ✔; Preventive Maintenance ✔; Crowd-sourced Quality Assurance ✔. Healthcare: Use Genomic Data in Medical Trials ✔ ✔; Monitor Patient Vitals in Real-Time. Pharmaceuticals: Recruit & Retain Patients for Drug Trials ✔ ✔; Improve Prescription Adherence ✔ ✔ ✔. Oil & Gas: Unify Exploration & Production Data ✔ ✔ ✔; Monitor Rig Safety in Real-Time ✔ ✔. Government: ETL Offload in Response to Budgetary Pressures ✔; Sentiment Analysis for Gov’t Programs ✔
  • 6. How: Modern Data Architecture with Hadoop. OPERATIONS TOOLS: Provision, Manage & Monitor. DEV & DATA TOOLS: Build & Test. DATA SYSTEMS, APPLICATIONS. Repositories ROOMS. Statistical Analysis; BI / Reporting, Ad Hoc Analysis; Interactive Web & Mobile Apps; Enterprise Applications. RDBMS, EDW, MPP. ENTERPRISE HADOOP: Governance & Integration, Security, Operations, Data Access, Data Management. SOURCES: OLTP, ERP, CRM; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data
  • 7. YARN Transforms Hadoop’s Architecture. Enables deep insight across a large, broad, diverse set of data at efficient scale. Multi-Use Data Platform: store all data in one place, process in many ways (Batch, Interactive, Iterative, Streaming). Store any/all raw data sources and processed data over extended periods of time. YARN: Data Operating System
  • 8. Designing Hadoop Cluster: Cluster Storage Capacity; Server Specification; Cluster Size; Factoring Performance. Key Considerations: Any piece of hardware can and will fail. More nodes means less impact on failure. Resiliency and fault tolerance improve with scale. Build resiliency through scale. Still use modern hardware. Software handles hardware failures.
  • 9. Storage Capacity. Key Input: Initial Data Size; 3 year YOY growth; Compression ratio; Intermediate and materialized views; Replication Factor. Note: Hard to accurately predict the size of intermediate & materialized views at the start of a project. Be conservative with compression ratio; mileage varies by data type. Hadoop needs temp space to store intermediate files. Hadoop Cluster: Raw Data; Work In Process Data; Master Data; Materialized Views
  • 10. Storage Capacity. Total Storage Required = ((Initial Size + YOY Growth + Intermediate Data Size) × Replication Count × 1.2) / Compression Ratio. Good Rule of Thumb: Replication Count = 3; Compression Ratio = 4-5; Intermediate Data Size = 50%-100% of Raw Data Size. Note: the 1.2 factor is included in the sizing estimator to account for the temp space requirement of Hadoop.
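The sizing formula on slide 10 can be sketched in a few lines of Python. This is only an illustration, not a Hortonworks tool: the function name, the parameter names, and the choice of terabytes as the unit are assumptions, and "YOY Growth" is read here as the total growth accumulated over the planning horizon. The defaults follow the slide's rule of thumb (replication count 3, compression ratio 4, temp-space factor 1.2).

```python
def total_storage_required(initial_size_tb, yoy_growth_tb, intermediate_size_tb,
                           replication_count=3, compression_ratio=4.0,
                           temp_space_factor=1.2):
    """Estimate raw cluster storage in TB using the slide's sizing formula.

    All names and units are illustrative assumptions, not part of the deck.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    # Logical data volume before replication and compression.
    logical_tb = initial_size_tb + yoy_growth_tb + intermediate_size_tb
    # Replicate, add the temp-space allowance, then divide by the compression ratio.
    return logical_tb * replication_count * temp_space_factor / compression_ratio


if __name__ == "__main__":
    # Hypothetical inputs: 100 TB initial, 150 TB growth, 50 TB intermediate data.
    estimate = total_storage_required(100, 150, 50)
    print(f"Estimated raw storage: {estimate:.1f} TB")
```

With these hypothetical inputs the estimate is about 270 TB. Lowering the compression ratio, as the deck advises when being conservative, raises the result proportionally.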
  • 11. Server Specification. Master Nodes (NameNode, Resource Manager, HBase Master): Dual Intel Xeon E5-26xx series processors; 128GB or 256GB RAM per chassis; 4+ × 1TB NL-SAS/SATA drives, RAID10 + spares. Worker Nodes (DataNode, Node Manager and Region Server): Dual Intel Xeon E5-26xx series processors; 128GB or 256GB RAM; 12 × 1-4 TB NL-SAS/SATA drives. Gateway Nodes / Edge Nodes: mirror of Master Nodes configuration.
  • 12. Cluster Size. Number of Data Nodes = Total Storage Required / Storage Per Server. Number of Master Nodes: Name Node, Zookeeper; Resource Manager, Zookeeper; Failover Name Node, HBase Master, Hive Server, Zookeeper (in a half-rack cluster, this would be combined with Resource Manager); Management Node (Ambari, Ganglia, Nagios) (in a half-rack cluster, this would be combined with the Name Node). Note: Large clusters may need more than 4 master nodes. Start at 2/4 and grow based on usage.
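The data-node count on slide 12 is a single division, sketched below in Python. The function name and the decision to round up to a whole server are assumptions made for illustration; the slide only gives the ratio of total storage to storage per server.

```python
import math


def data_nodes_required(total_storage_tb, storage_per_server_tb):
    """Number of data nodes = total storage required / storage per server.

    Rounding up to a whole server is an assumption; the slide gives only the ratio.
    """
    if storage_per_server_tb <= 0:
        raise ValueError("storage_per_server_tb must be positive")
    return math.ceil(total_storage_tb / storage_per_server_tb)


if __name__ == "__main__":
    # Hypothetical example: 270 TB required, worker nodes with 12 x 4 TB drives.
    print(data_nodes_required(270, 12 * 4))
```

For the hypothetical 270 TB requirement and 48 TB per server, the ratio is 5.625, so the sketch returns 6 data nodes. Master nodes are counted separately, as the slide describes.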
  • 13. Factoring Performance. Data Nodes: 1 TB drives for performance clusters; 4 TB drives for archive clusters. Meeting SLA Requirements: Hadoop workloads are varied. Difficult to assess cluster size based on SLAs without actual testing. Good News: Hadoop performs linearly with scale, which enables one to design experiments using a fraction of data. Best Practice Guidance: Create a test configuration with a rack of servers. Load a slice of data. Run tests with real-life queries to measure performance & fine tune the system. Scale cluster size based on observed performance.
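The last step on slide 13, scaling the cluster from observed performance, could be approximated as follows. This sketch takes the slide's linear-scaling claim at face value and assumes runtime is proportional to data volume divided by node count, which real workloads may not follow. The function and parameter names are hypothetical.

```python
import math


def nodes_for_sla(test_nodes, test_data_fraction, test_runtime_s, sla_runtime_s):
    """Extrapolate a node count from a test run, assuming linear scaling.

    Assumes runtime is proportional to (data volume / node count); treat the
    result as a starting point to re-test, not a guarantee.
    """
    if not 0 < test_data_fraction <= 1:
        raise ValueError("test_data_fraction must be in (0, 1]")
    if sla_runtime_s <= 0:
        raise ValueError("sla_runtime_s must be positive")
    # Projected runtime for the full data set on the test cluster.
    full_runtime_s = test_runtime_s / test_data_fraction
    # Scale the node count by how far that runtime misses the SLA.
    return math.ceil(test_nodes * full_runtime_s / sla_runtime_s)


if __name__ == "__main__":
    # Hypothetical test: 16 nodes, 25% of the data, 600 s observed, 1800 s SLA.
    print(nodes_for_sla(16, 0.25, 600, 1800))
```

In this hypothetical case the full data set would take 2400 s on 16 nodes, so about 22 nodes would be needed to meet an 1800 s target.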
  • 14. HDP and Cisco are deeply integrated in the data center. SOURCES: EXISTING Systems; Clickstream; Web & Social; Geolocation; Sensor & Machine; Server Logs; Unstructured. DATA SYSTEM: RDBMS; EDW; MPP; HANA. APPLICATIONS: BusinessObjects BI. OPERATIONAL TOOLS; DEV & DATA TOOLS; INFRASTRUCTURE. Deep Partnerships: Hortonworks and Cisco engage in deep engineered relationships with the leaders in the data center, such as Microsoft, Teradata, Red Hat, & SAP. Broad Partnerships: Over 600 partners work with Hortonworks to certify their applications to work with Hadoop so they can extend big data to their users. HDP 2.1: Governance & Integration, Security, Operations, Data Access, Data Management, YARN
  • 15. Cisco + Hortonworks Validated Design
      - Speaker: Sean McKeown, Solutions Architect, Data Center, Cisco
  • 16. Cisco + Hortonworks Validated Design
  • 17. Cisco UCS Common Platform Architecture (CPA): Building Blocks for Big Data
      - UCS 6200 Series Fabric Interconnects
      - Nexus 2232 Fabric Extenders
      - UCS Manager
      - UCS C240 M3 Servers
      - LAN, SAN, Management
  • 18. UCS + Hortonworks Reference Configurations
      - Table 1. Cisco CPA v2 for Big Data Includes Four Optimized Configurations
      - Performance Optimized (UCS-SL-CPA2-P)
          - Connectivity: 2 Cisco UCS 6248UP 48-Port Fabric Interconnects; 2 Cisco Nexus 2232PP 10GE Fabric Extenders
          - Management: Cisco UCS Manager
          - Servers: 8 Cisco UCS C240 M3 Rack Servers, each with 2 Intel Xeon E5-2680 v2 processors, 256 GB of memory, LSI MegaRaid 9271CV 8i card, 24 900-GB 10K SFF SAS drives (168 TB total)
      - Performance and Capacity Balanced (UCS-SL-CPA2-PC)
          - Connectivity: 2 Cisco UCS 6296UP 96-Port Fabric Interconnects; 2 Cisco Nexus 2232PP 10GE Fabric Extenders
          - Management: Cisco UCS Manager
          - Servers: 16 Cisco UCS C240 M3 Rack Servers, each with 2 Intel Xeon E5-2660 v2 processors, 256 GB of memory, LSI MegaRaid 9271CV 8i card, 24 1-TB 7.2K SFF SAS drives (384 TB total)
      - Capacity Optimized (UCS-SL-CPA2-C)
          - Connectivity: 2 Cisco UCS 6296UP 96-Port Fabric Interconnects; 2 Cisco Nexus 2232PP 10GE Fabric Extenders
          - Management: Cisco UCS Manager
          - Servers: 16 Cisco UCS C240 M3 Rack Servers, each with 2 Intel Xeon E5-2640 v2 processors, 128 GB of memory, LSI MegaRaid 9271CV 8i card, 12 4-TB 7.2K LFF SAS drives (768 TB total)
      - Capacity Optimized with Flash Memory (UCS-SL-CPA2-CF)
          - Connectivity: 2 Cisco UCS 6296UP 96-Port Fabric Interconnects; 2 Cisco Nexus 2232PP 10GE Fabric Extenders
          - Management: Cisco UCS Manager
          - Servers: 16 Cisco UCS C240 M3 Rack Servers, each with 2 Intel Xeon E5-2660 v2 processors, 128 GB of memory, Cisco UCS Nytro MegaRAID 200-GB Controller, 12 4-TB 7.2K LFF SAS drives (768 TB total)
      - Clipped text visible on the slide: "... unformatted storage per rack for a total of 7.68 petabytes (PB) when scaled to ..."; "... per rack, for a total of 7.68 PB and 31.25 TB of flash memory per domain."; "... entailed in designing and building your own custom solution. The solution ..."
  • 19. Installing Servers Today
      - Diagram labels: LAN, SAN
      - Settings configured per server: RAID settings; disk scrub actions; number of vHBAs; HBA WWN assignments; FC boot parameters; HBA firmware; FC fabric assignments for HBAs; QoS settings; border port assignment per vNIC; NIC transmit/receive rate limiting; VLAN assignments for NICs; VLAN tagging config for NICs; number of vNICs; PXE settings; NIC firmware; advanced feature settings; remote KVM IP settings; Call Home behaviour; remote KVM firmware; server UUID; Serial over LAN settings; boot order; IPMI settings; BIOS scrub actions; BIOS firmware; BIOS settings
  • 20. UCS Service Profiles
      - Diagram labels: LAN, SAN, Service Profile
  • 21. Abstracting the Logical Architecture
      - Diagram panels: "What you see" and "What you get"
      - Physical cable path: server adapter, 10GE A, FEX A, switch port Eth 1/1 on 6200-A (chassis)
      - Virtual cable (VN-Tag) path: server vNIC 1 to vEth 1, and vHBA 1 to vFC 1, on 6200-A
      - Service profile (server): vNIC 1 to vEth 1, vHBA 1 to vFC 1, with cables abstracted
      - Benefits
          - ✓ Dynamic, rapid provisioning
          - ✓ State abstraction
          - ✓ Location independence
          - ✓ Blade or rack
  • 22. Cisco UCS: Physical Architecture
      - Diagram labels: fabric switch cluster (6200 Fabric A, 6200 Fabric B); uplink ports (SAN A, SAN B, ETH 1, ETH 2); OOB management (MGMT); server ports; fabric extenders (FEX A, FEX B); Chassis 1 with compute blades, half/full width (B200); rack mount (C240); virtualized adapters (VIC)
  • 23. Simple Scalability
      - Single Rack: 16 servers
      - Single Domain: up to 10 racks, 160 servers, 7 PBytes
      - Multiple Domains: L2/L3 switching
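
The rack and domain figures on this slide follow from simple multiplication. As a hedged worked example, the sketch below reproduces them using the capacity-optimized configuration from slide 18 (16 servers per rack, 12 drives of 4 TB each). The constant and function names are illustrative only, and capacities are raw (unformatted) decimal terabytes.

```python
SERVERS_PER_RACK = 16       # from the slide: single rack, 16 servers
MAX_RACKS_PER_DOMAIN = 10   # from the slide: single domain, up to 10 racks
DRIVES_PER_SERVER = 12      # capacity-optimized configuration (slide 18)
DRIVE_TB = 4                # 4 TB LFF drives in that configuration


def domain_capacity(racks: int) -> tuple[int, int]:
    """Return (server count, raw storage in TB) for the given number of racks."""
    if not 1 <= racks <= MAX_RACKS_PER_DOMAIN:
        raise ValueError("a single domain holds 1 to 10 racks")
    servers = racks * SERVERS_PER_RACK
    raw_tb = servers * DRIVES_PER_SERVER * DRIVE_TB
    return servers, raw_tb


servers, raw_tb = domain_capacity(MAX_RACKS_PER_DOMAIN)
print(servers, raw_tb, raw_tb / 1000)  # 160 servers, 7680 TB, 7.68 PB
```

One rack gives 768 TB raw and ten racks give 7.68 PB, which matches the "7 PBytes" shown for a single domain.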
  • 24. Proven performance and linear scalability
  • 25. Simplified Management Throughout Cluster Lifecycle
      - Lifecycle stages: Provisioning, Monitoring, Maintenance, Growth
      - UCSM provides: speed, ease of experimentation, consistency, simplicity, visibility
  • 26. Complete Network Flexibility
      - Configure as many vNICs and VLANs as you need with the click of a mouse
      - Example
          - vNIC0 for management
          - vNIC1 for internal
          - vNIC2 for external
      - No OS bonding needed with Fabric Failover
      - Diagram labels: Data Node 1 and Data Node 2, each with vNIC 0, vNIC 1, and vNIC 2; 6200 A and 6200 B; L2/L3 switching; data ingress/egress
  • 27. Creating QoS Policies and Enabling Jumbo Frames
      - Best Effort policy for management VLAN
      - Platinum policy for cluster VLAN
  • 28. Switch Buffer Usage With Network QoS Policy to Prioritize HBase Read Operations
      - Workload: HBase + Hadoop MapReduce (TeraSort)
      - Chart: switch buffer used over the timeline (series: Hadoop TeraSort, HBase)
      - Chart: Read Latency Comparison of Non-QoS vs. QoS Policy (READ average latency in µs over time, axis from 0 to 40,000 µs)
      - Result: ~60% read improvement
  • 29. Single Platform for Traditional and Big Data Applications
      - Diagram labels: UCS Rack-Mount Servers; UCS Blade Servers; UCS Common Platform Architecture with Hortonworks; SAN/NAS Arrays; Enterprise Applications
  • 30. Thank You
      - ajaysingh@hortonworks.com
      - semckeow@cisco.com