SlideShare a Scribd company logo
1 of 13
Download to read offline
Long Haul Hadoop
Steve Loughran




© 2009 Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice
Recurrent theme

 Issue           Title
 HADOOP-3421     Requirements for a Resource Manager
 HADOOP-4559     REST API for job/task statistics
 MAPREDUCE-454   Service interface for Job submission
 MAPREDUCE-445   Web service interface to the job tracker
 HADOOP-5123     Ant tasks for job submission
In-datacentre: binary

  Hadoop RPC, AVRO, Apache Thrift...
  Space-efficient text and data
  Performance through efficient marshalling,
 writable re-use
  Brittle
         – fails with odd messages
  Insecure – caller is trusted!
Requirements

  Works through proxies and firewalls
  Robust against cluster upgrades
  Deploys  tasks and sequences of tasks
  Monitor jobs, logs
  Pause,  resume, kill jobs
  Only trusted users to submit, manage jobs
WS-*
WS-*

  Integrates with SOA story
  Integration with WS-* Management layer
  BPEL-ready
  Auto-generation of WSDL from server code
  Auto-generation of client code from WSDL
  Stable Apache stacks (Axis2, CXF)‫‏‬
  Security through HTTPS and WS-Security
WS-* Problems

 Generated WSDL is a maintenance mess
 Generated clients and server code brittle
 No stable WS-* Management layer (yet)
 WS-Security
 Little/no web browser integration
 Binary data tricky
REST
Pure REST

  PUT  and DELETE
  Cleaner, purer
  Restlet is best client (LGPL)‫‏‬
  Needs HTML5 browser for in-browser PUT


Model jobs as URLs you PUT;
State through attributes you manipulate
HTTP POST

  Easy  to use from web browser
  Include URLs in bugreps
  Security through HTTPS only
  HttpClient for client API
  Binary files through form/multipart


Have a job manager you POST to;
URLs for each created job
JobTracker or Workflow?

1. HADOOP-5303: Oozie, Hadoop Workflow
2. Cascading
3. Pig
4. BPEL
5. Constraint-based languages


The REST model could be independent of the
 back-end
Status as of Aug 6 2009

 1. Can deploy SmartFrog components and hence
   workflows or constrained models
 2. Portlet engine runs in datacentre
 3. No REST API
 4. Need access to logs, rest of JobTracker
 5. Issue: how best to configure the work?
Long Haul Hadoop

More Related Content

What's hot

Introduction to REST - API
Introduction to REST - APIIntroduction to REST - API
Introduction to REST - APIChetan Gadodia
 
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...Single Sign-On for APEX applications based on Kerberos (Important: latest ver...
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...Niels de Bruijn
 
Offline Web with Oracle JET
Offline Web with Oracle JETOffline Web with Oracle JET
Offline Web with Oracle JETandrejusb
 
Oracle JET CRUD and ADF BC REST
Oracle JET CRUD and ADF BC RESTOracle JET CRUD and ADF BC REST
Oracle JET CRUD and ADF BC RESTandrejusb
 
(ATS6-PLAT07) Managing AEP in an enterprise environment
(ATS6-PLAT07) Managing AEP in an enterprise environment(ATS6-PLAT07) Managing AEP in an enterprise environment
(ATS6-PLAT07) Managing AEP in an enterprise environmentBIOVIA
 
Optimize DR and Cloning with Logical Hostnames in Oracle E-Business Suite (OA...
Optimize DR and Cloning with Logical Hostnames in Oracle E-Business Suite (OA...Optimize DR and Cloning with Logical Hostnames in Oracle E-Business Suite (OA...
Optimize DR and Cloning with Logical Hostnames in Oracle E-Business Suite (OA...Andrejs Prokopjevs
 
Development Journey of the Shopware Administration
Development Journey of the Shopware AdministrationDevelopment Journey of the Shopware Administration
Development Journey of the Shopware Administrationklarstil
 
Connecting Apache Kafka With Mule ESB
Connecting Apache Kafka With Mule ESBConnecting Apache Kafka With Mule ESB
Connecting Apache Kafka With Mule ESBJitendra Bafna
 
2.2 Reliable Message Bus based on RocketMQ
2.2 Reliable Message Bus based on RocketMQ2.2 Reliable Message Bus based on RocketMQ
2.2 Reliable Message Bus based on RocketMQ振东 刘
 
eBPF - Observability In Deep
eBPF - Observability In DeepeBPF - Observability In Deep
eBPF - Observability In DeepMydbops
 
Principal Propagation with SAP Cloud Platform
Principal Propagation with SAP Cloud PlatformPrincipal Propagation with SAP Cloud Platform
Principal Propagation with SAP Cloud PlatformGary Jackson MBCS
 
De-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServerDe-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServerJosh Elser
 
Mule and web services
Mule and web servicesMule and web services
Mule and web servicesManav Prasad
 
Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016Josh Elser
 
Using Azure cloud and Firebird to develop applications easily
Using Azure cloud and Firebird to develop applications easilyUsing Azure cloud and Firebird to develop applications easily
Using Azure cloud and Firebird to develop applications easilyMind The Firebird
 

What's hot (20)

Introduction to REST - API
Introduction to REST - APIIntroduction to REST - API
Introduction to REST - API
 
Koha ILMS
Koha ILMSKoha ILMS
Koha ILMS
 
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...Single Sign-On for APEX applications based on Kerberos (Important: latest ver...
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...
 
Offline Web with Oracle JET
Offline Web with Oracle JETOffline Web with Oracle JET
Offline Web with Oracle JET
 
Oracle JET CRUD and ADF BC REST
Oracle JET CRUD and ADF BC RESTOracle JET CRUD and ADF BC REST
Oracle JET CRUD and ADF BC REST
 
(ATS6-PLAT07) Managing AEP in an enterprise environment
(ATS6-PLAT07) Managing AEP in an enterprise environment(ATS6-PLAT07) Managing AEP in an enterprise environment
(ATS6-PLAT07) Managing AEP in an enterprise environment
 
Optimize DR and Cloning with Logical Hostnames in Oracle E-Business Suite (OA...
Optimize DR and Cloning with Logical Hostnames in Oracle E-Business Suite (OA...Optimize DR and Cloning with Logical Hostnames in Oracle E-Business Suite (OA...
Optimize DR and Cloning with Logical Hostnames in Oracle E-Business Suite (OA...
 
Development Journey of the Shopware Administration
Development Journey of the Shopware AdministrationDevelopment Journey of the Shopware Administration
Development Journey of the Shopware Administration
 
Connecting Apache Kafka With Mule ESB
Connecting Apache Kafka With Mule ESBConnecting Apache Kafka With Mule ESB
Connecting Apache Kafka With Mule ESB
 
Velocity - Edge UG
Velocity - Edge UGVelocity - Edge UG
Velocity - Edge UG
 
2.2 Reliable Message Bus based on RocketMQ
2.2 Reliable Message Bus based on RocketMQ2.2 Reliable Message Bus based on RocketMQ
2.2 Reliable Message Bus based on RocketMQ
 
eBPF - Observability In Deep
eBPF - Observability In DeepeBPF - Observability In Deep
eBPF - Observability In Deep
 
Principal Propagation with SAP Cloud Platform
Principal Propagation with SAP Cloud PlatformPrincipal Propagation with SAP Cloud Platform
Principal Propagation with SAP Cloud Platform
 
De-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServerDe-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServer
 
Mule and web services
Mule and web servicesMule and web services
Mule and web services
 
Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016
 
Rest API
Rest APIRest API
Rest API
 
REST API V2
REST API V2REST API V2
REST API V2
 
Using Azure cloud and Firebird to develop applications easily
Using Azure cloud and Firebird to develop applications easilyUsing Azure cloud and Firebird to develop applications easily
Using Azure cloud and Firebird to develop applications easily
 
Apache web service
Apache web serviceApache web service
Apache web service
 

Viewers also liked

Farming hadoop in_the_cloud
Farming hadoop in_the_cloudFarming hadoop in_the_cloud
Farming hadoop in_the_cloudSteve Loughran
 
HBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestHBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestCloudera, Inc.
 
HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0HBaseCon
 
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big DataHBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big DataCloudera, Inc.
 
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBaseHBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBaseCloudera, Inc.
 
HBaseCon 2013: Apache HBase Replication
HBaseCon 2013: Apache HBase ReplicationHBaseCon 2013: Apache HBase Replication
HBaseCon 2013: Apache HBase ReplicationCloudera, Inc.
 
HBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBaseHBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBaseCloudera, Inc.
 
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreHBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreCloudera, Inc.
 
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseJosh Elser
 

Viewers also liked (9)

Farming hadoop in_the_cloud
Farming hadoop in_the_cloudFarming hadoop in_the_cloud
Farming hadoop in_the_cloud
 
HBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestHBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at Pinterest
 
HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0
 
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big DataHBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
 
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBaseHBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase
HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase
 
HBaseCon 2013: Apache HBase Replication
HBaseCon 2013: Apache HBase ReplicationHBaseCon 2013: Apache HBase Replication
HBaseCon 2013: Apache HBase Replication
 
HBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBaseHBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBase
 
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreHBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
 
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
 

Similar to Long Haul Hadoop

apresentacao_apache2..
apresentacao_apache2..apresentacao_apache2..
apresentacao_apache2..webhostingguy
 
apresentacao_apache2..
apresentacao_apache2..apresentacao_apache2..
apresentacao_apache2..webhostingguy
 
Web API or WCF - An Architectural Comparison
Web API or WCF - An Architectural ComparisonWeb API or WCF - An Architectural Comparison
Web API or WCF - An Architectural ComparisonAdnan Masood
 
RESTFul Tools For Lazy Experts - CFSummit 2016
RESTFul Tools For Lazy Experts - CFSummit 2016RESTFul Tools For Lazy Experts - CFSummit 2016
RESTFul Tools For Lazy Experts - CFSummit 2016Ortus Solutions, Corp
 
Iron.io Technical Overview
Iron.io Technical OverviewIron.io Technical Overview
Iron.io Technical OverviewChad Arimura
 
HLoader – Automated Incremental Hadoop Data Loader Service and Framework
HLoader – Automated Incremental Hadoop Data Loader Service and FrameworkHLoader – Automated Incremental Hadoop Data Loader Service and Framework
HLoader – Automated Incremental Hadoop Data Loader Service and FrameworkDániel Stein
 
New Roles In The Cloud
New Roles In The CloudNew Roles In The Cloud
New Roles In The CloudSteve Loughran
 
Ch 22: Web Hosting and Internet Servers
Ch 22: Web Hosting and Internet ServersCh 22: Web Hosting and Internet Servers
Ch 22: Web Hosting and Internet Serverswebhostingguy
 
Seattle StrongLoop Node.js Workshop
Seattle StrongLoop Node.js WorkshopSeattle StrongLoop Node.js Workshop
Seattle StrongLoop Node.js WorkshopJimmy Guerrero
 
The new (is it really ) api stack
The new (is it really ) api stackThe new (is it really ) api stack
The new (is it really ) api stackRed Hat
 
Oracle REST Data Services Best Practices/ Overview
Oracle REST Data Services Best Practices/ OverviewOracle REST Data Services Best Practices/ Overview
Oracle REST Data Services Best Practices/ OverviewKris Rice
 
IP based standards for IoT
IP based standards for IoTIP based standards for IoT
IP based standards for IoTMichael Koster
 
.NET Core Today and Tomorrow
.NET Core Today and Tomorrow.NET Core Today and Tomorrow
.NET Core Today and TomorrowJon Galloway
 
Advanced Web Development in PHP - Understanding REST API
Advanced Web Development in PHP - Understanding REST APIAdvanced Web Development in PHP - Understanding REST API
Advanced Web Development in PHP - Understanding REST APIRasan Samarasinghe
 
SQL and Machine Learning on Hadoop
SQL and Machine Learning on HadoopSQL and Machine Learning on Hadoop
SQL and Machine Learning on HadoopMukund Babbar
 
Pivotal HAWQ 소개
Pivotal HAWQ 소개Pivotal HAWQ 소개
Pivotal HAWQ 소개Seungdon Choi
 

Similar to Long Haul Hadoop (20)

apresentacao_apache2..
apresentacao_apache2..apresentacao_apache2..
apresentacao_apache2..
 
apresentacao_apache2..
apresentacao_apache2..apresentacao_apache2..
apresentacao_apache2..
 
Web API or WCF - An Architectural Comparison
Web API or WCF - An Architectural ComparisonWeb API or WCF - An Architectural Comparison
Web API or WCF - An Architectural Comparison
 
Rest ful tools for lazy experts
Rest ful tools for lazy expertsRest ful tools for lazy experts
Rest ful tools for lazy experts
 
RESTFul Tools For Lazy Experts - CFSummit 2016
RESTFul Tools For Lazy Experts - CFSummit 2016RESTFul Tools For Lazy Experts - CFSummit 2016
RESTFul Tools For Lazy Experts - CFSummit 2016
 
Iron.io Technical Overview
Iron.io Technical OverviewIron.io Technical Overview
Iron.io Technical Overview
 
Getting Started with API Management
Getting Started with API ManagementGetting Started with API Management
Getting Started with API Management
 
HLoader – Automated Incremental Hadoop Data Loader Service and Framework
HLoader – Automated Incremental Hadoop Data Loader Service and FrameworkHLoader – Automated Incremental Hadoop Data Loader Service and Framework
HLoader – Automated Incremental Hadoop Data Loader Service and Framework
 
New Roles In The Cloud
New Roles In The CloudNew Roles In The Cloud
New Roles In The Cloud
 
Ch 22: Web Hosting and Internet Servers
Ch 22: Web Hosting and Internet ServersCh 22: Web Hosting and Internet Servers
Ch 22: Web Hosting and Internet Servers
 
Seattle StrongLoop Node.js Workshop
Seattle StrongLoop Node.js WorkshopSeattle StrongLoop Node.js Workshop
Seattle StrongLoop Node.js Workshop
 
The new (is it really ) api stack
The new (is it really ) api stackThe new (is it really ) api stack
The new (is it really ) api stack
 
Oracle REST Data Services Best Practices/ Overview
Oracle REST Data Services Best Practices/ OverviewOracle REST Data Services Best Practices/ Overview
Oracle REST Data Services Best Practices/ Overview
 
IP based standards for IoT
IP based standards for IoTIP based standards for IoT
IP based standards for IoT
 
.NET Core Today and Tomorrow
.NET Core Today and Tomorrow.NET Core Today and Tomorrow
.NET Core Today and Tomorrow
 
Running PHP In The Cloud
Running PHP In The CloudRunning PHP In The Cloud
Running PHP In The Cloud
 
Advanced Web Development in PHP - Understanding REST API
Advanced Web Development in PHP - Understanding REST APIAdvanced Web Development in PHP - Understanding REST API
Advanced Web Development in PHP - Understanding REST API
 
SQL and Machine Learning on Hadoop
SQL and Machine Learning on HadoopSQL and Machine Learning on Hadoop
SQL and Machine Learning on Hadoop
 
Couch db
Couch dbCouch db
Couch db
 
Pivotal HAWQ 소개
Pivotal HAWQ 소개Pivotal HAWQ 소개
Pivotal HAWQ 소개
 

More from Steve Loughran

The age of rename() is over
The age of rename() is overThe age of rename() is over
The age of rename() is overSteve Loughran
 
What does Rename Do: (detailed version)
What does Rename Do: (detailed version)What does Rename Do: (detailed version)
What does Rename Do: (detailed version)Steve Loughran
 
Put is the new rename: San Jose Summit Edition
Put is the new rename: San Jose Summit EditionPut is the new rename: San Jose Summit Edition
Put is the new rename: San Jose Summit EditionSteve Loughran
 
@Dissidentbot: dissent will be automated!
@Dissidentbot: dissent will be automated!@Dissidentbot: dissent will be automated!
@Dissidentbot: dissent will be automated!Steve Loughran
 
PUT is the new rename()
PUT is the new rename()PUT is the new rename()
PUT is the new rename()Steve Loughran
 
Extreme Programming Deployed
Extreme Programming DeployedExtreme Programming Deployed
Extreme Programming DeployedSteve Loughran
 
What does rename() do?
What does rename() do?What does rename() do?
What does rename() do?Steve Loughran
 
Dancing Elephants: Working with Object Storage in Apache Spark and Hive
Dancing Elephants: Working with Object Storage in Apache Spark and HiveDancing Elephants: Working with Object Storage in Apache Spark and Hive
Dancing Elephants: Working with Object Storage in Apache Spark and HiveSteve Loughran
 
Apache Spark and Object Stores —for London Spark User Group
Apache Spark and Object Stores —for London Spark User GroupApache Spark and Object Stores —for London Spark User Group
Apache Spark and Object Stores —for London Spark User GroupSteve Loughran
 
Spark Summit East 2017: Apache spark and object stores
Spark Summit East 2017: Apache spark and object storesSpark Summit East 2017: Apache spark and object stores
Spark Summit East 2017: Apache spark and object storesSteve Loughran
 
Hadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresHadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresSteve Loughran
 
Apache Spark and Object Stores
Apache Spark and Object StoresApache Spark and Object Stores
Apache Spark and Object StoresSteve Loughran
 
Household INFOSEC in a Post-Sony Era
Household INFOSEC in a Post-Sony EraHousehold INFOSEC in a Post-Sony Era
Household INFOSEC in a Post-Sony EraSteve Loughran
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionSteve Loughran
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateSteve Loughran
 
Slider: Applications on YARN
Slider: Applications on YARNSlider: Applications on YARN
Slider: Applications on YARNSteve Loughran
 

More from Steve Loughran (20)

Hadoop Vectored IO
Hadoop Vectored IOHadoop Vectored IO
Hadoop Vectored IO
 
The age of rename() is over
The age of rename() is overThe age of rename() is over
The age of rename() is over
 
What does Rename Do: (detailed version)
What does Rename Do: (detailed version)What does Rename Do: (detailed version)
What does Rename Do: (detailed version)
 
Put is the new rename: San Jose Summit Edition
Put is the new rename: San Jose Summit EditionPut is the new rename: San Jose Summit Edition
Put is the new rename: San Jose Summit Edition
 
@Dissidentbot: dissent will be automated!
@Dissidentbot: dissent will be automated!@Dissidentbot: dissent will be automated!
@Dissidentbot: dissent will be automated!
 
PUT is the new rename()
PUT is the new rename()PUT is the new rename()
PUT is the new rename()
 
Extreme Programming Deployed
Extreme Programming DeployedExtreme Programming Deployed
Extreme Programming Deployed
 
Testing
TestingTesting
Testing
 
I hate mocking
I hate mockingI hate mocking
I hate mocking
 
What does rename() do?
What does rename() do?What does rename() do?
What does rename() do?
 
Dancing Elephants: Working with Object Storage in Apache Spark and Hive
Dancing Elephants: Working with Object Storage in Apache Spark and HiveDancing Elephants: Working with Object Storage in Apache Spark and Hive
Dancing Elephants: Working with Object Storage in Apache Spark and Hive
 
Apache Spark and Object Stores —for London Spark User Group
Apache Spark and Object Stores —for London Spark User GroupApache Spark and Object Stores —for London Spark User Group
Apache Spark and Object Stores —for London Spark User Group
 
Spark Summit East 2017: Apache spark and object stores
Spark Summit East 2017: Apache spark and object storesSpark Summit East 2017: Apache spark and object stores
Spark Summit East 2017: Apache spark and object stores
 
Hadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresHadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object Stores
 
Apache Spark and Object Stores
Apache Spark and Object StoresApache Spark and Object Stores
Apache Spark and Object Stores
 
Household INFOSEC in a Post-Sony Era
Household INFOSEC in a Post-Sony EraHousehold INFOSEC in a Post-Sony Era
Household INFOSEC in a Post-Sony Era
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the Gate
 
Slider: Applications on YARN
Slider: Applications on YARNSlider: Applications on YARN
Slider: Applications on YARN
 
YARN Services
YARN ServicesYARN Services
YARN Services
 

Recently uploaded

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 

Recently uploaded (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Long Haul Hadoop

  • 1. Long Haul Hadoop Steve Loughran © 2009 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice
  • 2. Recurrent theme Issue Title HADOOP-3421 Requirements for a Resource Manager HADOOP-4559 REST API for job/task statistics MAPREDUCE-454 Service interface for Job submission MAPREDUCE-445 Web service interface to the job tracker HADOOP-5123 Ant tasks for job submission
  • 3. In-datacentre: binary   Hadoop RPC, AVRO, Apache Thrift...   Space-efficient text and data   Performance through efficient marshalling, writable re-use   Brittle – fails with odd messages   Insecure – caller is trusted!
  • 4. Requirements   Works through proxies and firewalls   Robust against cluster upgrades   Deploys tasks and sequences of tasks   Monitor jobs, logs   Pause, resume, kill jobs   Only trusted users to submit, manage jobs
  • 6. WS-*   Integrates with SOA story   Integration with WS-* Management layer   BPEL-ready   Auto-generation of WSDL from server code   Auto-generation of client code from WSDL   Stable Apache stacks (Axis2, CXF)‫‏‬   Security through HTTPS and WS-Security
  • 7. WS-* Problems  Generated WSDL is a maintenance mess  Generated clients and server code brittle  No stable WS-* Management layer (yet)  WS-Security  Little/no web browser integration  Binary data tricky
  • 9. Pure REST   PUT and DELETE   Cleaner, purer   Restlet is best client (LGPL)‫‏‬   Needs HTML5 browser for in-browser PUT Model jobs as URLs you PUT; State through attributes you manipulate
  • 10. HTTP POST   Easy to use from web browser   Include URLs in bugreps   Security through HTTPS only   HttpClient for client API   Binary files through form/multipart Have a job manager you POST to; URLs for each created job
  • 11. JobTracker or Workflow? 1. HADOOP-5303: Oozie, Hadoop Workflow 2. Cascading 3. Pig 4. BPEL 5. Constraint-based languages The REST model could be independent of the back-end
  • 12. Status as of Aug 6 2009 1. Can deploy SmartFrog components and hence workflows or constrained models 2. Portlet engine runs in datacentre 3. No REST API 4. Need access to logs, rest of JobTracker 5. Issue: how best to configure the work?