In a prior tech blog post (http://nflx.it/XoySYR), we discussed the architecture of our petabyte-scale data warehouse in the cloud. Salient features of our architecture include the use of Amazon's Simple Storage Service (S3) as our "source of truth", leveraging the elasticity of the cloud to run multiple dynamically resizable Hadoop clusters to support various workloads, and our horizontally scalable Hadoop Platform as a Service called Genie.
We are pleased to announce that Genie is now open source (http://nflx.it/15rd6pJ), and available to the public from the Netflix OSS GitHub site (https://github.com/Netflix/genie).
8. Data Platform as a Service
Cloud Data Warehouse
Hadoop (EMR) Clusters
Hadoop Platform as a Service
Job Execution
Resource Configuration & Management
Metadata Service (Franklin)
9. Large Ecosystem of Clients & Tools
10. Why Genie?
Simple API for job submission and management
Accessible from the data center and the cloud
Abstraction of physical details of back-end Hadoop clusters
11. What Genie is Not
A workflow scheduler, such as Oozie
A task scheduler, such as the fair share or capacity schedulers
An end-to-end resource management tool
12. Genie: Job Execution
API to run Hadoop, Hive and Pig jobs
Auto-magic submission of jobs to the right Hadoop cluster
Abstracting away cluster details from clients
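A minimal sketch of what a submission to this job-execution API might look like. The field names and endpoint semantics below are illustrative assumptions for this sketch, not Genie's actual request format; consult the Genie repository for the real API.

```python
# Hypothetical sketch of a Genie-style job submission payload.
# Field names ("jobType", "schedule", "configuration") are assumptions.

def build_job_request(job_type, script, schedule, configuration):
    """Build a submission payload for a Hadoop, Hive, or Pig job."""
    assert job_type in ("hadoop", "hive", "pig")
    return {
        "jobType": job_type,             # which CLI Genie should launch
        "cmdArgs": ["-f", script],       # script to execute
        "schedule": schedule,            # e.g. "sla" or "adhoc"
        "configuration": configuration,  # e.g. "prod" or "test"
    }

request = build_job_request("pig", "daily_etl.pig", "sla", "prod")
# A client would POST this as JSON to the Genie execution service, then
# poll the returned job ID for status and logs -- the cluster that
# actually runs the job is chosen by Genie, not by the client.
```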
13. Genie: Resource Configuration
API for management of cluster metadata
Status: up, out of service, or terminated
Site-specific Hadoop, Hive and Pig configurations
Cluster naming/tagging for job submissions
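The cluster metadata described above can be modeled roughly as follows; the class and field names are illustrative assumptions, not Genie's actual data model.

```python
# Hypothetical in-memory model of Genie's cluster-metadata registry.
from dataclasses import dataclass, field

@dataclass
class ClusterConfig:
    name: str
    status: str  # "UP", "OUT_OF_SERVICE", or "TERMINATED"
    schedules: list = field(default_factory=list)       # e.g. ["sla"]
    configurations: list = field(default_factory=list)  # e.g. ["prod"]
    site_xmls: list = field(default_factory=list)       # *-site.xml locations

registry = {}

def register_cluster(cfg):
    """Admins register a cluster so Genie can route jobs to it."""
    registry[cfg.name] = cfg

register_cluster(ClusterConfig(
    name="prod-sla",
    status="UP",
    schedules=["sla"],
    configurations=["prod"],
))
```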
14. Genie Architecture
[Diagram: the Genie service registers itself with the Eureka service. Clients – a Ribbon-based Java client and a Python API – discover the service via their Eureka clients. End-users invoke Genie to submit jobs; admins launch clusters and register them with Genie, which then launches jobs on the registered clusters.]
Netflix OSS (http://netflix.github.com)
[Diagram: the Genie services are built on Netflix OSS components – Karyon, Archaius, Eureka Client, Ribbon and Servo – alongside the Hadoop, Hive and Pig clients.]
17. Genie Job Details
Job ID
Script to execute
Standard output and error
Pig logs
Job conf directory
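Because a submitted job may run for hours, a client typically polls the job ID until it reaches a terminal state. A minimal sketch, assuming a hypothetical status call and status names (the real Genie status values may differ):

```python
import time

def wait_for_job(get_status, job_id, poll_secs=30):
    """Poll a job until it reaches a terminal state.

    `get_status` stands in for a Genie status call (hypothetical);
    the terminal status names here are illustrative assumptions.
    """
    while True:
        status = get_status(job_id)
        if status in ("SUCCEEDED", "FAILED", "KILLED"):
            return status
        time.sleep(poll_secs)

# Usage: wait_for_job(genie_client.get_job_status, "genie-job-12345"),
# then fetch stdout/stderr, Pig logs, and the job conf directory.
```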
18. Genie – Use Cases Enabled at Netflix
Running nightly short-lived “bonus” clusters to augment ETL processing
Re-routing traffic between clusters
“Red/black” pushes for clusters
Attaching stand-alone gateways to clusters
Running 100% of all SLA jobs, and a high percentage of ad-hoc jobs
19. Nightly Short-lived Bonus Clusters
[Diagram: Execution Service and Configuration Service]
Prod SLA Cluster: Schedule=sla; Configurations=prod
23. Rerouting Traffic Between Clusters
Ad-hoc Cluster: Schedule=adhoc; Configurations=prod, test
Prod SLA Cluster: Schedule=sla; Configurations=prod
[Diagram: the Execution Service and Configuration Service route a job requesting {Schedule=sla, Configuration=prod}]
24. Rerouting Traffic Between Clusters
Ad-hoc Cluster: Schedule=adhoc, sla; Configurations=prod, test
Prod SLA Cluster: Schedule=sla; Configurations=prod; Status=OUT_OF_SERVICE
[Diagram: the Execution Service and Configuration Service route a job requesting {Schedule=sla, Configuration=prod}]
25. Rerouting Traffic Between Clusters
Ad-hoc Cluster: Schedule=adhoc; Configurations=prod, test
Prod SLA Cluster: Schedule=sla; Configurations=prod; Status=UP
[Diagram: the Execution Service and Configuration Service route a job requesting {Schedule=sla, Configuration=prod}]
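The rerouting sequence above comes down to Genie's "match-making": pick a cluster whose metadata matches the job's requested schedule and configuration, considering only clusters that are UP. A sketch under an assumed data model (field names are illustrative):

```python
# Sketch of Genie-style resource matching for the rerouting scenario:
# the prod SLA cluster is marked OUT_OF_SERVICE, and the ad-hoc cluster
# has been tagged with the "sla" schedule to absorb its traffic.
clusters = [
    {"name": "prod-sla", "status": "OUT_OF_SERVICE",
     "schedules": ["sla"], "configurations": ["prod"]},
    {"name": "adhoc", "status": "UP",
     "schedules": ["adhoc", "sla"], "configurations": ["prod", "test"]},
]

def match_cluster(schedule, configuration):
    """Return the first UP cluster matching the job's request, if any."""
    for c in clusters:
        if (c["status"] == "UP"
                and schedule in c["schedules"]
                and configuration in c["configurations"]):
            return c["name"]
    return None
```

With this state, a job requesting {Schedule=sla, Configuration=prod} lands on the ad-hoc cluster without any client-side change; once the prod cluster is marked UP again, new jobs flow back to it.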
26. “Red/Black” Pushes for Clusters
Prod SLA Cluster: Schedule=sla; Configurations=prod; Status=UP
[Diagram: the Execution Service and Configuration Service route a job requesting {Schedule=sla, Configuration=prod}]
27. “Red/Black” Pushes for Clusters
Prod SLA Cluster (old): Schedule=sla; Configurations=prod; Status=OUT_OF_SERVICE
Prod SLA Cluster (new): Schedule=sla; Configurations=prod; Status=UP
[Diagram: the Execution Service and Configuration Service route a job requesting {Schedule=sla, Configuration=prod}]
28. “Red/Black” Pushes for Clusters
Prod SLA Cluster (old): Schedule=sla; Configurations=prod; Status=TERMINATED
Prod SLA Cluster (new): Schedule=sla; Configurations=prod; Status=UP
[Diagram: the Execution Service and Configuration Service route a job requesting {Schedule=sla, Configuration=prod}]
29. Genie Usage at Netflix
Usage statistics brought to you by “Sherlock”
Pig job to gather Hadoop job statistics
Tableau-based visualization
32. Genie is now part of Netflix OSS!
http://techblog.netflix.com/2013/06/genie-is-out-of-bottle.html
Clone it on GitHub at:
https://github.com/Netflix/genie
Still “version 0” – work in progress!
All contributions and feedback welcome!
Come talk to us and check out live demos at the Netflix booth
Reference: http://techblog.netflix.com/2013/01/hadoop-platform-as-service-in-cloud.html. Use cases: reporting, analytics, insights, algorithms (e.g. recommendations). But big deal – so does everyone in the room.
What is scale? It means different things to different people
A few petabytes of data – billions of log events captured each day, with retention of a few months. Many clusters – 1000s of nodes. Again, big deal – there are many others in the room who do Hadoop at this scale (petabyte is the new terabyte).
Our Hadoop processing is 100% in the (public) cloud. In our case, the public cloud is AWS. This is what differentiates our infrastructure from the rest. Hadoop in the cloud is different from Hadoop in the datacenter – in this talk, we will discuss our cloud-based Hadoop platform.
S3 is the source of truth: decoupling of storage from the computational infrastructure. S3 benefits: highly durable and available (11 9's), bucket versioning, and highly elastic – we grew our data warehouse organically from a few hundred terabytes to petabytes without having to provision any storage resources in advance. HDFS? Only for transient data and intermediate results of multi-stage jobs. S3 cons: performance and eventual consistency.
Another benefit of S3: multiple clusters can read/process the same data. (Semi-)persistent SLA and ad-hoc clusters of ~800-1300 nodes; multiple ad-hoc clusters to A/B test new releases/features; nightly "bonus" clusters to supplement the SLA cluster. Operating assumption: clusters may go down at any time.
Traditional gateways/CLIs for ad-hoc querying. Genie: REST API for job execution/monitoring, and a repository/abstraction for clusters and metastores. Franklin (MDS): uses HCatalog/HiveServer to talk to the Hive metastore.
Next, we will focus on Genie for the rest of the talk. The other tools will be covered in the other Netflix talk.
EMR: Hadoop IaaS, and an API to run jobs on transient clusters – our clusters are semi-persistent, and job submissions don't result in new clusters. Oozie: a workflow tool, which only supports the Hadoop ecosystem – we have hybrid jobs (Teradata+Hadoop) orchestrated by UC4, so we just needed a job submission API; also, no support for Hive when we started. Templeton: no multi-cluster or multi-user support; not quite ready for prime time.
* Genie is a resource “match-maker”
The unit of execution is a Hadoop/Hive/Pig job. Users provide scripts, dependencies and other metadata. Genie does no scheduling per se – only "meta-scheduling", or resource matching.
Status defines whether a cluster is accepting jobs. Configurations are *-site.xml files and properties. Cluster name, schedule, etc.
Two classes of users: admins and end-users. Admins spin up clusters and set cluster metadata; users use the clusters once they have been registered. Genie is built on top of Netflix OSS.
Genie figures out the resources to run jobs on – back-end resources are abstracted away. Execution is asynchronous, since jobs may be long-running.
Every job runs as a separate process using the Hadoop/Hive/Pig CLI, which avoids "jar hell" since it needs the Hadoop jars. Jobs run in their own sandbox (working directory), providing isolation between jobs, and between Genie and the jobs. Standard output/error of jobs is easily available. Able to support multiple versions of Hadoop/Hive/Pig, and to connect to multiple clusters.
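The per-job process isolation described in this note can be sketched as follows; the directory layout and launch helper are illustrative assumptions, not Genie's actual implementation.

```python
# Sketch: run each job as a separate CLI process in its own working
# directory (sandbox), capturing stdout/stderr to files there.
import os
import subprocess

def launch_job(job_id, cmd, base_dir="/tmp/genie-jobs"):
    """Launch `cmd` (e.g. ["pig", "-f", "script.pig"]) in a per-job sandbox."""
    sandbox = os.path.join(base_dir, job_id)  # isolated working directory
    os.makedirs(sandbox, exist_ok=True)
    out = open(os.path.join(sandbox, "stdout.log"), "wb")
    err = open(os.path.join(sandbox, "stderr.log"), "wb")
    # Running in the sandbox keeps jobs from interfering with each other
    # or with the Genie service process itself.
    return subprocess.Popen(cmd, cwd=sandbox, stdout=out, stderr=err)

proc = launch_job("job-123", ["echo", "hello"])
proc.wait()
```

Because each job is a child process with its own CLI, different Hadoop/Hive/Pig versions can coexist simply by pointing different sandboxes at different installations.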
The configuration service helps us do crazy (cool) things. We will describe each of these in greater detail.
New bonus clusters are launched each night – but clients are oblivious to the actual host names/IPs. One way to do this: higher-SLA jobs first ask for a cluster by name.
If it doesn't exist, revert back to the existing cluster. Why not just expand the prod cluster? Better isolation; mixing and matching instance types is not ideal for Hadoop (the prod cluster uses m1.xlarges for slave nodes); and shrinking has proven to be a problem. We want to do a hard shutdown when those instances are needed on awsprod.
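The name-first fallback in these notes can be sketched as a small lookup; the cluster names and registry shape are illustrative assumptions.

```python
# Sketch: SLA jobs first ask for the nightly "bonus" cluster by name;
# if it is not registered (or not UP), fall back to the standing
# prod SLA cluster. Names are hypothetical.
def pick_cluster(registry, preferred="bonus-etl", fallback="prod-sla"):
    cluster = registry.get(preferred)
    if cluster and cluster["status"] == "UP":
        return preferred
    return fallback

# During the night the bonus cluster exists and absorbs the extra ETL
# load; after it is terminated, jobs transparently revert to prod.
```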
We had to bounce the prod job tracker to enable priorities for "long-pole" jobs. We wanted to do it with minimal impact to SLA jobs.
Must wait for all existing jobs to finish for minimal impact. Hadoop jobs are long-running – we don't want to kill a 5-hour job nearing its finish.
The prod cluster is back up after maintenance. Jobs that were scheduled on the query cluster will continue to run there until they finish. This is done from time to time – although not too often, we do red-black pushes…
This is the initial state – we need to spin up a new cluster, e.g. to push a new feature.
* Spin up new cluster, mark it as UP, mark old cluster as OOS
OUT_OF_SERVICE to TERMINATED
Mention that we will be writing a techblog post about this soon, with more details. Two query clusters – A/B testing the new fair share scheduler.
Set up desired instance counts across multiple AZs. Do "red-black" pushes using "sequential ASGs". Loss of individual nodes will cause jobs running on those nodes to be lost.
An auto-scaling policy is set up to expand if the number of running jobs exceeds ~80% of capacity.
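The scaling trigger in this note amounts to a simple threshold check; the function and the 80% default are illustrative, standing in for whatever metric the actual auto-scaling policy evaluates.

```python
# Sketch of the auto-scaling trigger: expand when running jobs exceed
# ~80% of current capacity. Threshold and inputs are illustrative.
def should_expand(running_jobs, max_running_jobs, threshold=0.8):
    """Return True when utilization crosses the expansion threshold."""
    return running_jobs / max_running_jobs > threshold
```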
Still biased towards running in the cloud and at Netflix, but will generalize/improve it based on community feedback
* Come listen to how we enable “Data Platform as a Service” – it is truly Lipstick on a Pig.