Since 2006, Hadoop and its ecosystem components have evolved into a platform that Yahoo trusts for running its businesses globally. Hadoop’s scalability, efficiency, built-in reliability, and cost effectiveness have made it the enterprise-wide platform on which Yahoo’s web-scale cloud operations run. In this talk, we take a broad look at some of the top software, hardware, and services considerations that have gone into making the platform indispensable to nearly 1,000 active developers on a daily basis, including the challenges of scale, security, and multi-tenancy we have dealt with over the last several years of operating one of the largest Hadoop footprints in the world. We cover the current technology stack that Yahoo has built or assembled, infrastructure elements such as configurations, deployment models, and network, and what it takes to offer hosted Hadoop services to a large customer base at Yahoo. Throughout the talk, we highlight relevant use cases from Yahoo’s Mobile, Search, Advertising, Personalization, Media, and Communications businesses that may make these considerations more pertinent to your situation.
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10 Considerations
1. Architecting a Scalable Hadoop Platform: Top 10 Considerations for Success
PRESENTED BY Sumeet Singh ⎪ April 15, 2015
Hadoop Summit 2015, Brussels
2. Introduction
Sumeet Singh
Sr. Director, Product Management
Cloud and Big Data Platforms
Platforms and Personalization Products
701 First Avenue, Sunnyvale, CA 94089 USA
@sumeetksingh
§ Manages the Hadoop products team at Yahoo
§ Responsible for product management, strategy, and customer engagements
§ Managed Cloud Engineering product teams and headed strategy functions for the Cloud Platform Group at Yahoo
§ MBA from UCLA and MS from Rensselaer Polytechnic Institute (RPI)
Disclaimer: The considerations presented here are my personal opinion, driven purely from my experiences working with cloud technologies and services.
3. Hadoop as Secure Shared Hosted Multi-tenant Platform
[Architecture diagram: data from devices (TV, PC, phone, tablet) and sources (pushed data, pulled data, web crawl, social, email, 3rd-party content) flows through the Data Highway into the Hadoop Grid for BI, reporting, and ad hoc analytics, and out to No-SQL serving stores that serve data, content, and ads.]
4. Platform Evolution (2006 – 2015)
[Chart: growth in servers (to ~45,000) and raw HDFS storage (to ~700 PB) by year, 2006 to 2015]
Milestones along the way:
§ Yahoo! commits to scaling Hadoop for production use; open sourced with Apache
§ Research workloads in Search and Advertising
§ Production (modeling) with machine learning & WebMap
§ Revenue systems with security, multi-tenancy, and SLAs
§ Hortonworks spinoff for enterprise hardening
§ Next-gen Hadoop (H 0.23, YARN)
§ New services (HBase, Storm, Spark, Hive)
§ Increased user base with partitioned namespaces
§ Apache H2.6 (scalable ML, latency, utilization, productivity)
Current footprint:
Platform | Servers | Use Cases
Hadoop | 43,000 | 300
HBase | 3,000 | 70
Storm | 2,000 | 50
5. Top 10 Considerations for Scaling Hadoop-based Platform
1. On-Premise or Public Cloud
2. Total Cost of Ownership (TCO)
3. Hardware Configuration
4. Network
5. Software Stack
6. Security and Account Management
7. Data Lifecycle Management and BCP
8. Metering, Audit and Governance
9. Integration with External Systems
10. Debunking Architectural Myths
6. On-Premise or Public Cloud – Deployment Models
On-Premise
§ Private (dedicated) clusters: large, demanding use cases; new technology not yet platformized; data movement and regulation issues
§ Hosted multi-tenant (private cloud) clusters: source of truth for all of the org's data; app delivery agility; operational efficiency and cost savings through economies of scale
§ Purpose-built big data clusters: for performance and tighter integration with the tech stack; value-added services such as monitoring, alerts, tuning, and common tools
Public Cloud
§ Hosted compute clusters: for cases where more cost effective than on-premise; time to market/results matter; exploration and learning
7. On-Premise or Public Cloud – Selection Criteria
Criteria: On-Premise | Public Cloud
§ Cost: fixed, does not vary with utilization; favors scale and 24x7 centralized ops | variable with usage; favors run-and-done, decentralized ops
§ Data: aggregated from disparate or distributed sources | typically generated and stored in the cloud
§ SLA: job queues, capacity scheduler, BCP, catch-up; controlled latency and throughput | no guarantees (beyond uptime) without provisioning additional resources
§ Tech Stack: control over deployed technology; requires platform team/vendor support | little to no control over the tech stack; no need for platform R&D headcount
§ Security: shared environment; control over data/movement, PII, ACLs, pluggable security | data typically not shared among users in the cloud
§ Multi-tenancy: matters, complex to develop and operate | does not matter; clusters are dynamic/virtual and dedicated
8. On-Premise or Public Cloud – Evaluation
[Evaluation chart comparing on-premise and public cloud across the six criteria: cost, data, SLA, tech stack, security, and multi-tenancy]
9. On-Premise or Public Cloud – A Lot About Utilization
[Chart: cost ($) vs. utilization/consumption (compute and storage) for on-premise Hadoop as a Service, on-demand public cloud service, and terms-based public cloud service. On-premise has a high starting cost and scales up gradually; below the crossover point in utilization, the public cloud service is favored, and above it, on-premise Hadoop as a Service is favored.]
Current and expected (or target) utilization can provide further insights into your operations and cost competitiveness.
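The crossover-point arithmetic implied by the chart above can be sketched with a simple linear cost model. All dollar figures and the linearity assumption here are my own illustrative placeholders, not Yahoo's actual rates:

```python
# Hypothetical monthly cost models (illustrative numbers only).
ON_PREM_FIXED = 100_000.0   # fixed monthly cost of on-premise Hadoop as a Service
ON_PREM_PER_UNIT = 0.5      # small incremental cost per consumed unit (power, ops)
CLOUD_PER_UNIT = 2.0        # on-demand public cloud cost per consumed unit

def on_prem_cost(units):
    # High starting cost, shallow slope as utilization grows.
    return ON_PREM_FIXED + ON_PREM_PER_UNIT * units

def cloud_cost(units):
    # No fixed cost, but a steeper per-unit slope.
    return CLOUD_PER_UNIT * units

def crossover_units():
    # Utilization at which both options cost the same: below it the public
    # cloud is cheaper, above it on-premise wins.
    return ON_PREM_FIXED / (CLOUD_PER_UNIT - ON_PREM_PER_UNIT)
```

The point of the sketch is the shape of the comparison, not the numbers: the higher your sustained utilization, the more the fixed-cost on-premise model is favored.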
10. Total Cost Of Ownership (TCO) – Components
ILLUSTRATIVE
[Pie chart: the seven TCO components below, with shares of 60%, 12%, 10%, 7%, 6%, 3%, and 2% of an illustrative $2.1 M monthly TCO]
1. Cluster Hardware: data nodes, name nodes, job trackers, gateways, load proxies, monitoring, aggregator, and web servers
2. R&D HC: headcount for platform software development, quality, and release engineering
3. Active Use and Operations (recurring): recurring datacenter ops cost (power, space, labor support, and facility maintenance)
4. Network Hardware: aggregated network component costs, including switches, wiring, terminal servers, power strips, etc.
5. Acquisition/Install (one-time): labor, POs, transportation, space, support, upgrades, decommissions, shipping/receiving, etc.
6. Operations Engineering: headcount for service engineering and data operations teams responsible for day-to-day ops and support
7. Network Bandwidth: data transferred into and out of clusters for all colos, including cross-colo transfers
11. Total Cost Of Ownership (TCO) – Unit Costs (Hadoop)
§ Compute (Memory): container memory where apps perform computation and access HDFS if needed. Unit: $ / GB-Hour (H 2.0+). Total capacity: GBs of memory available for an hour. Unit cost = Monthly Memory Cost / Avail. Memory Capacity
§ Compute (CPU): container CPU cores used by apps to perform computation / data processing. Unit: $ / vCore-Hour (H 2.6+). Total capacity: vCores of CPU available for an hour. Unit cost = Monthly CPU Cost / Avail. CPU vCores
§ Storage (Disk): HDFS (usable) space needed by an app with the default replication factor of three. Unit: $ / GB of data stored. Total capacity: usable storage space (less replication and overheads). Unit cost = Monthly Storage Cost / Avail. Usable Storage
§ Bandwidth: network bandwidth needed to move data into/out of the clusters by the app. Unit: $ / GB for inter-region data transfers. Total capacity: inter-region (peak) link capacity. Unit cost = Monthly BW Cost / Monthly GB In + Out
§ Namespace: files and directories used by the apps (to understand/limit the load on the NameNode)
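The unit-rate formulas above reduce to simple divisions. A minimal sketch, where the dollar figures and capacities are hypothetical inputs rather than Yahoo's actual costs:

```python
HOURS_PER_MONTH = 24 * 30  # simplifying assumption for a billing month

def memory_rate(monthly_memory_cost, cluster_memory_gb):
    """$ per GB-Hour: monthly memory cost over GB-hours available in the month."""
    return monthly_memory_cost / (cluster_memory_gb * HOURS_PER_MONTH)

def cpu_rate(monthly_cpu_cost, cluster_vcores):
    """$ per vCore-Hour: monthly CPU cost over vCore-hours available in the month."""
    return monthly_cpu_cost / (cluster_vcores * HOURS_PER_MONTH)

def storage_rate(monthly_storage_cost, usable_storage_gb):
    """$ per GB stored per month; capacity is already net of replication/overheads."""
    return monthly_storage_cost / usable_storage_gb
```

For example, a cluster with 1,000,000 GB of memory and a $720,000 monthly memory cost yields a rate of $0.001 per GB-Hour.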
12. Total Cost Of Ownership (TCO) – Consumption Costs
Compute (Memory)
§ Map GB-Hours = GB(M1) x T(M1) + GB(M2) x T(M2) + …
§ Reduce GB-Hours = GB(R1) x T(R1) + GB(R2) x T(R2) + …
§ Monthly job and task cost = (M + R) GB-Hours x $0.002 / GB-Hour / Month = $ for the job / month
§ Monthly roll-ups: (M + R) GB-Hours for all jobs can be summed up for the month for a user, app, BU, or the entire platform
Compute (CPU)
§ Map vCore-Hours = vCores(M1) x T(M1) + vCores(M2) x T(M2) + …
§ Reduce vCore-Hours = vCores(R1) x T(R1) + vCores(R2) x T(R2) + …
§ Monthly job and task cost = (M + R) vCore-Hours x $0.002 / vCore-Hour / Month = $ for the job / month
§ Monthly roll-ups: (M + R) vCore-Hours for all jobs can be summed up for the month for a user, app, BU, or the entire platform
Storage (Disk)
§ Charged per project (app) quota in GB (peak monthly used), per user quota in GB (peak monthly used), and per data accessed, with each user accountable for their portion of use, e.g. GB Read (U1) / (GB Read (U1) + GB Read (U2) + …)
§ Monthly roll-ups through the relationship among user, file ownership, app, and their BU
Bandwidth
§ Measured at the cluster level and divided among select apps and users of data based on average volume in/out
§ Monthly roll-ups through the relationship among user, app, and their BU
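The per-job memory charge above can be computed directly from container sizes and runtimes. The $0.002 / GB-Hour rate comes from the slide; the task sizes and durations in the usage note are made-up inputs:

```python
RATE_PER_GB_HOUR = 0.002  # $ per GB-Hour per month, from the slide

def task_gb_hours(tasks):
    """tasks: iterable of (container_gb, runtime_hours) pairs for map or reduce tasks.
    Implements GB-Hours = GB(1) x T(1) + GB(2) x T(2) + ..."""
    return sum(gb * hours for gb, hours in tasks)

def job_memory_cost(map_tasks, reduce_tasks):
    # Cost = (Map GB-Hours + Reduce GB-Hours) x $0.002 / GB-Hour
    m = task_gb_hours(map_tasks)
    r = task_gb_hours(reduce_tasks)
    return (m + r) * RATE_PER_GB_HOUR
```

For example, two 2 GB map containers running one hour each (4 GB-Hours) plus one 4 GB reduce container running half an hour (2 GB-Hours) cost 6 x $0.002 = $0.012 for the job. Summing job costs per user, app, or BU gives the monthly roll-ups.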
13. Hardware Configuration – Physical Resources
[Diagram: clusters span racks (Rack 1 … Rack N) within datacenters; each server contributes four physical resources: CPU, memory, storage (disk), and bandwidth. Server configurations are denoted as C-nn cores / 64, 128, 256 G of memory / 4000, 6000, etc. of disk.]
14. Hardware Configuration – Eventual Heterogeneity
Data node configurations accumulated over the years:
Memory | CPU | Storage
24 G | 8 cores | SATA 0.5 TB
48 G | 12 cores | SATA 1.0 TB
64 G | Harpertown | SATA 2.0 TB
128 G | Sandy Bridge | SATA 3.0 TB
192 G | Ivy Bridge | SATA 4.0 TB
256 G | Haswell | SATA 6.0 TB
384 G | |
§ Heterogeneous configurations: 10s of data node configs (collected over the years) without dictating scheduling decisions; let the framework balance out the configs
§ Heterogeneous storage: HDFS supports heterogeneous storage (HDD, SSD, RAM, RAID, etc.) (HDFS-2832, HDFS-5682)
§ Heterogeneous scheduling: operate multiple types of purpose-built hardware in the same cluster (e.g. GPUs) (YARN-796)
15. Network – Common Backplane
[Diagram: Hadoop (NameNode, RM, DataNode, NodeManager), HBase (HBase Master, DataNodes, RegionServers), and Storm (Nimbus, Supervisor) clusters share a common network backplane, together with ZooKeeper pools, HTTP/HDFS/GDM load proxies, Oozie Server, HS2/HCat, administration/management/monitoring, and applications and data (data feeds, data stores).]
16. Network – Bottleneck Awareness
[Diagram: Hadoop, HBase, and Storm clusters sharing the network, with four bottleneck callouts]
1. Large dataset joins or data sharing between Hadoop clusters (Data Set 1 and Data Set 2) happen over the network
2. Large extractions may saturate the network
3. Fast bulk updates to the HBase cluster (low-latency data store) may saturate the network
4. Large data copies to the Storm cluster (real-time / stream processing) may not be possible
17. Network – 1G BAS (Rack Locality Not A Major Issue)
[Diagram: 1G BAS network topology. Rack switches (RSW) connect hosts at 1 Gbps with 2:1 oversubscription and uplink at 10 Gbps to paired aggregation switches (BAS1-1/BAS1-2 … BAS8-1/BAS8-2), each with an L3 backplane and 8 x 10 Gbps links into the fabric layer (FAB 1 through FAB 8); 48 racks, 15,360 hosts. The aggregation layer is a SPOF!]
24. Data Lifecycle Management and BCP
Data lifecycle: Source → Acquisition → Replication (feeds) → Retention (policy-based expiration) → Archival (tape backup) → DataOut
§ Datastore: defines a data source/target (e.g. HDFS)
§ Dataset: defines the data flow of a feed
§ Workflow: defines a unit of work carried out by the acquisition, replication, and retention servers for moving an instance of a feed
25. Data Lifecycle Management and BCP
[Diagram: Grid Data Management (GDM) acquires feeds into Cluster 1 (Colo 1) and replicates them to Cluster 2 (Colo 2), each with HDFS and a MetaStore.]
§ Feed datasets are registered as partitioned external tables; Growl extracts the schema for backfill
§ On feed acquisition and replication, HCatClient.addPartitions(…) registers new partitions and marks LOAD_DONE, firing an add_partition event notification on each cluster
§ Partitions are dropped with HCatClient.dropPartitions(…) after retention expiration, with a drop_partition notification
§ Lifecycle stages: acquisition, feed replication, retention, archival, and dataout
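Policy-based retention of the kind described above boils down to finding partitions older than a cutoff. A minimal sketch; the 30-day policy and date-partitioned layout are my own assumptions, and the real flow would drop each expired partition via HCatClient.dropPartitions(…) rather than this stand-in:

```python
from datetime import date, timedelta

RETENTION_DAYS = 30  # hypothetical policy; real policies are defined per feed

def expired_partitions(partition_dates, today, retention_days=RETENTION_DAYS):
    """Return the partition dates that have outlived the retention policy.

    In the real system each returned partition would be dropped with
    HCatClient.dropPartitions(...), emitting a drop_partition notification.
    """
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in partition_dates if d < cutoff)
```

Running the retention server daily with this predicate keeps each feed's HDFS footprint and MetaStore partition count bounded.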
26. Metering, Audit, and Governance
Starling, the log warehouse, aggregates logs from sources across the platform:
§ Hadoop clusters (Cluster 1 … Cluster n): FS, job, and task logs
§ HBase clusters (Cluster 1 … Cluster n): CF, region, action, and query stats
§ Hive MetaStores (MS 1 … MS n): DB, table, partition, and column access stats
§ GDM facilities (F 1 … F n): data definition, flow, feed, and source
27. Metering, Audit, and Governance
Data discovery and access: data classifications (public, non-sensitive, financial ($), restricted) map to escalating governance requirements (no additional requirement, LMS integration, Stock Admin integration, approval flow)
28. Integration with External Systems
Hadoop customers sit at the center of integrations with external systems across the company: BI, reporting, and transactional DBs; Data Highway (DH); cloud messaging; serving systems; monitoring, tools, and portals; and infrastructure in transition.
29. Debunking Myths
Each of the following is a myth:
✗ Hadoop isn't enterprise ready
✗ Hadoop isn't stable, clusters go down
✗ You lose data on HDFS
✗ Data cannot be shared across the org
✗ NameNodes do not scale
✗ Software upgrades are rare
✗ Hadoop use cases are limited
✗ I need expensive servers to get more
✗ Hadoop is so dead
✗ I need Apache this vs. that