Oozie High Availability (HA)
Robert Kanter
High Availability
• A system that avoids unplanned downtime when
partial failures occur
• Typically achieved by adding redundancy and removing
single points of failure
• Our Goals
• Don’t change the API or usage patterns
• Users shouldn’t even have to know it’s HA
The HA Solution
Architectural Overview
The HA Solution: Database
• Oozie stores all state in a database
• (submitted jobs, workflow definitions, etc.)
• Instead of a failover model, we want to run many
Oozie servers against the same database
• Active-Active HA
• Also provides horizontal scalability
• ZooKeeper for coordination
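The active-active idea above can be sketched as a toy, single-process model. This is not Oozie code: a dict stands in for the shared database, threads stand in for Oozie servers, and a hypothetical `LockService` built on `threading.Lock` stands in for ZooKeeper-backed locks. Every name here is invented for illustration.

```python
import threading


class LockService:
    """Hypothetical stand-in for a ZooKeeper-backed lock service."""

    def __init__(self):
        self._guard = threading.Lock()
        self._held = set()

    def try_acquire(self, job_id):
        # Non-blocking: only one caller at a time can hold a job's lock.
        with self._guard:
            if job_id in self._held:
                return False
            self._held.add(job_id)
            return True

    def release(self, job_id):
        with self._guard:
            self._held.discard(job_id)


def run_server(name, database, locks, processed_by):
    # Every server scans the same shared state; the per-job lock keeps
    # two servers from working on the same job at the same time.
    for job_id in sorted(database):
        if not locks.try_acquire(job_id):
            continue  # another server currently holds this job
        try:
            if database[job_id] == "PREP":
                database[job_id] = "SUCCEEDED"
                processed_by[job_id] = name
        finally:
            locks.release(job_id)


def simulate(server_names, job_ids):
    database = {job_id: "PREP" for job_id in job_ids}
    processed_by = {}
    locks = LockService()
    threads = [
        threading.Thread(target=run_server,
                         args=(name, database, locks, processed_by))
        for name in server_names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return database, processed_by
```

Calling `simulate(["hostA", "hostB", "hostC"], ["job-1", "job-2"])` leaves every job processed exactly once, by whichever server took the lock first. Adding a server name adds capacity without changing the callers, which is the horizontal-scalability point.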
The HA Solution: Access
• Users and client programs need a single address to
connect (Web UI, REST/Java API, JobTracker callbacks,
etc)
• Load Balancer, Virtual IP, or DNS round-robin can be
used to provide a single entry point to the Oozie
servers
• Technically also needs to be HA
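A minimal sketch of the single-entry-point idea, assuming plain round-robin with a manual up/down flag per server. `RoundRobinBalancer` is a hypothetical class for illustration only; a real deployment would use an actual load balancer, virtual IP, or DNS.

```python
import itertools


class RoundRobinBalancer:
    """Toy round-robin entry point; not a real load balancer."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._cycle = itertools.cycle(self._servers)
        self._down = set()

    def mark_down(self, server):
        self._down.add(server)

    def mark_up(self, server):
        self._down.discard(server)

    def pick(self):
        # Try each server at most once per call, skipping any marked down.
        for _ in range(len(self._servers)):
            server = next(self._cycle)
            if server not in self._down:
                return server
        raise RuntimeError("no Oozie server available")
```

Clients only ever see the balancer, so a server marked down is skipped without any change on the client side. The balancer object itself is still a single point of failure here, which is why the entry point also needs to be HA.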
The HA Solution: Log Streaming
• Oozie’s log files are not in the database
• Each Oozie Server only has access to its own logs
• Jobs are not assigned to a specific Oozie server
• What if Oozie Server A wants to get logs for a job
processed by Oozie Server B?
• Oozie Server A can ask Oozie Server B for its logs
• Caveat: If an Oozie Server goes down, any logs from it will
be unavailable until it is brought back up
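The log-streaming behavior above, including the caveat, can be modeled roughly as follows. This is a sketch under stated assumptions, not the ZKXLogStreamingService implementation: `collect_job_logs` is a hypothetical helper, each server's log is a list of strings, and a server that is down is represented by `None`.

```python
def collect_job_logs(job_id, servers):
    """Gather a job's log lines from every reachable server.

    `servers` maps a server name to its local log lines, or to None
    when that server is down (a hypothetical model of the caveat).
    """
    lines, unreachable = [], []
    for name, local_log in servers.items():
        if local_log is None:
            unreachable.append(name)
            continue
        lines.extend(line for line in local_log if f"JOB[{job_id}]" in line)
    # Each line starts with a "YYYY-MM-DD HH:MM:SS,mmm" timestamp, so a
    # plain string sort puts the merged lines in chronological order.
    lines.sort()
    return lines, unreachable
```

The second return value lists the servers whose logs could not be read, so a caller can tell a complete log from one with gaps.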
How to Enable HA
Configuration and Security
• Set up a load balancer, a ZooKeeper ensemble, an HA database,
and multiple identically configured Oozie servers
• Enable Oozie HA services:
<property>
  <name>oozie.services.ext</name>
  <value>
    org.apache.oozie.service.ZKLocksService,
    org.apache.oozie.service.ZKXLogStreamingService,
    org.apache.oozie.service.ZKJobsConcurrencyService
  </value>
</property>
• Point Oozie to the ZooKeeper ensemble:
<property>
  <name>oozie.zookeeper.connection.string</name>
  <value>ZK_HOST1:2181,ZK_HOST2:2181</value>
</property>
• Point the environment variable used for callbacks to the load
balancer:
export OOZIE_BASE_URL="http://loadbalancer:11000/oozie"
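Since every Oozie server must carry the same settings, a quick sanity check of the site file can help. The sketch below assumes the properties sit under a `<configuration>` root element, as in Hadoop-style site files. `read_properties` and `check_ha_settings` are hypothetical helpers, not part of Oozie.

```python
import xml.etree.ElementTree as ET

# The three HA services named on the slides.
HA_SERVICES = {
    "org.apache.oozie.service.ZKLocksService",
    "org.apache.oozie.service.ZKXLogStreamingService",
    "org.apache.oozie.service.ZKJobsConcurrencyService",
}


def read_properties(site_xml):
    """Return {name: value} for every <property> in the given XML text."""
    root = ET.fromstring(site_xml)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value
    return props


def check_ha_settings(site_xml):
    """Return a list of problems; an empty list means the check passed."""
    props = read_properties(site_xml)
    configured = {
        item.strip()
        for item in props.get("oozie.services.ext", "").split(",")
        if item.strip()
    }
    problems = [f"missing service: {name}"
                for name in sorted(HA_SERVICES - configured)]
    if not props.get("oozie.zookeeper.connection.string"):
        problems.append("oozie.zookeeper.connection.string is not set")
    return problems
```

Running `check_ha_settings(open("oozie-site.xml").read())` on each host (the file path is an assumption) gives a cheap way to spot a server that was configured differently from the rest.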
How to Enable HA: Security
• Extra step to configure Kerberos with Load Balancer:
<property>
  <name>oozie.authentication.kerberos.principal</name>
  <value>HTTP/loadbalancer@REALM</value>
</property>
• Note: this currently prevents clients from talking
directly to any Oozie server
• Enable a Kerberos-secured connection to ZooKeeper, plus ACLs:
<property>
  <name>oozie.zookeeper.secure</name>
  <value>true</value>
</property>
• ACLs prevent malicious users or programs from
interfering with Oozie’s znodes
Using Oozie with HA
• New Oozie CLI/REST API command to list all servers
$ oozie admin -oozie http://loadbalancer:11000/oozie -servers
hostA : http://hostA:11000/oozie
hostB : http://hostB:11000/oozie
hostC : http://hostC:11000/oozie
• Log messages now include which server wrote them
2013-09-29 16:46:20,182 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostA] USER[root] GROUP[-] TOKEN[] APP[demo-wf] JOB[0000000-130925230553293-oozie-oozi-W] ACTION[0000000-130925230553293-oozie-oozi-W@streaming-node] [***0000000-130925230553293-oozie-oozi-W@streaming-node***]Action status=RUNNING
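Because the server name is now a field in each message, it is easy to pull out when reading merged logs. A minimal sketch, assuming the `NAME[value]` layout shown in the sample line; the field list is taken from that sample only, and `parse_fields` is a hypothetical helper.

```python
import re

# Field names as they appear in the sample log line; other Oozie
# versions may emit a different set (assumption).
FIELD = re.compile(r"\b(SERVER|USER|GROUP|TOKEN|APP|JOB|ACTION)\[([^\]]*)\]")


def parse_fields(line):
    """Extract the NAME[value] fields from one Oozie log line."""
    return {name: value for name, value in FIELD.findall(line)}


SAMPLE = (
    "2013-09-29 16:46:20,182 WARN "
    "org.apache.oozie.command.wf.ActionStartXCommand: "
    "SERVER[hostA] USER[root] GROUP[-] TOKEN[] APP[demo-wf] "
    "JOB[0000000-130925230553293-oozie-oozi-W] "
    "ACTION[0000000-130925230553293-oozie-oozi-W@streaming-node] "
    "[***0000000-130925230553293-oozie-oozi-W@streaming-node***]"
    "Action status=RUNNING"
)

fields = parse_fields(SAMPLE)
print(fields["SERVER"], fields["JOB"])
```

Grouping lines by `fields["SERVER"]` shows which server handled each step of a job.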
To Do
What’s left
To Do
• HA support for SLAs and HCatalog integration
• Sharelib Purging with HA
• Log Streaming HA
• With Kerberos, Oozie servers can’t talk to each other,
which breaks log streaming and sharelibupdate
• Other miscellaneous improvements
Oozie meetup - HA
