SlideShare ist ein Scribd-Unternehmen logo
1 von 30
BranchReduce
Distributed Branch-and-Bound on YARN
June 14, 2012
About Me




           Copyright 2012 Cloudera Inc. All rights reserved   2
Hadoop Distributed Processing Frameworks




             Copyright 2012 Cloudera Inc. All rights reserved
Lots of Other Parallel Processing Platforms




               Copyright 2012 Cloudera Inc. All rights reserved
Hadoop 2.0: Resource Scheduling with YARN




              Copyright 2012 Cloudera Inc. All rights reserved
The Data Deluge and the Cambrian Explosion




              Copyright 2012 Cloudera Inc. All rights reserved
Parallel Distributed Processing For Everyone




               Copyright 2012 Cloudera Inc. All rights reserved
Building a New Processing Framework on YARN




           Copyright 2012 Cloudera Inc. All rights reserved
A Terrifyingly Accurate Paraphrasing of JWZ



Some people, when confronted with a tedious
problem, say, “I know, I’ll write a framework.”
Now they have two tedious problems.




               Copyright 2012 Cloudera Inc. All rights reserved
On Designing Frameworks




             Copyright 2012 Cloudera Inc. All rights reserved
The Example YARN App: Distributed Shell




              Copyright 2012 Cloudera Inc. All rights reserved
Do We Need a New Programming Language for
      Developing YARN Applications?




           Copyright 2012 Cloudera Inc. All rights reserved
Do We Need a New Programming Language for
      Developing YARN Applications?




           Copyright 2012 Cloudera Inc. All rights reserved
Leverage Existing Frameworks

 • Popular RPC libraries
   with support for
   multiple languages
    • C++, Java, Python


 • We need to make it
   easy to deploy existing
   applications on YARN



                 Copyright 2012 Cloudera Inc. All rights reserved
Kitten: Playing with YARN




              Copyright 2012 Cloudera Inc. All rights reserved
Design Pattern: The Unified Application Master

                                              • Contains business logic
                                                and YARN logic
                                              • Primary reason:
                                                Communication
                                                     • Also: dynamic resource
                                                       allocation
                                              • Develop our
                                                master/worker
                                                applications locally and
                                                then deploy them on
                                                YARN
              Copyright 2012 Cloudera Inc. All rights reserved
YARN Lifecycle Management as a Service

 • Specifically, extensions
   of Guava’s Service
   interface
    • YarnClientService
    • AppMasterService
 • Contains all of the logic
   for creating applications
   and keeping an eye on
   them


                 Copyright 2012 Cloudera Inc. All rights reserved
Moving the Configuration Logic Out of Java




              Copyright 2012 Cloudera Inc. All rights reserved
Lua as a Configuration Language

 • Small and Simple
    • Looks like a
      configuration file
    • Functions are there
      when/if you need them
 • Inheritance
    • Don’t Repeat Yourself
 • Forgiving of undefined
   values
 • Java/C++ Integration
                 Copyright 2012 Cloudera Inc. All rights reserved
First Kitten Utility: The cat Function




                Copyright 2012 Cloudera Inc. All rights reserved
Second Kitten Utility: The yarn Function




               Copyright 2012 Cloudera Inc. All rights reserved
BranchReduce




Copyright 2012 Cloudera Inc. All rights reserved
Branch-and-Bound




            Copyright 2012 Cloudera Inc. All rights reserved
The Challenge of Parallel Branch and Bound:
Unbalanced Search Space
                                              • Some branches are
                                                pruned quickly

                                              • Can be difficult to
                                                determine the best
                                                splits a priori

                                              • Easy to revert to a de
                                                facto single-threaded
                                                search
              Copyright 2012 Cloudera Inc. All rights reserved
The Solution: Work Stealing




              Copyright 2012 Cloudera Inc. All rights reserved
You Write Three Classes

• A Task class that implements Writable

• A GlobalState class that implements Writable and has a
  mergeWith(GlobalState other) method

• A Processor class that defines:
   • execute(T task, BranchReduceContext<T, GlobalState> ctxt);
   • With optional initialize and cleanup methods


• Configuration is done via BranchReduceJob
                   Copyright 2012 Cloudera Inc. All rights reserved
Example: The Knapsack Problem




    Copyright 2012 Cloudera Inc. All rights reserved
0-1 Integer Programming Problems


 • NP-Hard Resource
   Allocation Problem

 • Portfolio Optimization

 • Asset Securitization



                Copyright 2012 Cloudera Inc. All rights reserved
Problem Formulation: (Simplified) LP Format




              Copyright 2012 Cloudera Inc. All rights reserved
Questions?
@josh_wills

Weitere ähnliche Inhalte

Was ist angesagt?

Oracle Cloud Reference Architecture
Oracle Cloud Reference ArchitectureOracle Cloud Reference Architecture
Oracle Cloud Reference ArchitectureBob Rhubart
 
Powered by Oracle! Te ayudamos a distribuir tu aplicación en todo el mundo
Powered by Oracle! Te ayudamos a distribuir tu aplicación en todo el mundoPowered by Oracle! Te ayudamos a distribuir tu aplicación en todo el mundo
Powered by Oracle! Te ayudamos a distribuir tu aplicación en todo el mundoGeneXus
 
ITCamp 2012 - Adrian Stoian - Migrating from CFG MGR 2007 to CFG MGR 2012
ITCamp 2012 - Adrian Stoian - Migrating from CFG MGR 2007 to CFG MGR 2012ITCamp 2012 - Adrian Stoian - Migrating from CFG MGR 2007 to CFG MGR 2012
ITCamp 2012 - Adrian Stoian - Migrating from CFG MGR 2007 to CFG MGR 2012ITCamp
 
Cloud Computing - Making IT Simple
 Cloud Computing - Making IT Simple Cloud Computing - Making IT Simple
Cloud Computing - Making IT SimpleBob Rhubart
 
Custom Development with Novell Teaming
Custom Development with Novell TeamingCustom Development with Novell Teaming
Custom Development with Novell TeamingNovell
 
Oracle cloud story short
Oracle cloud story   shortOracle cloud story   short
Oracle cloud story shortYuri Grinshteyn
 
Rightscale Webinar: Building Blocks for Private and Hybrid Clouds
Rightscale Webinar: Building Blocks for Private and Hybrid CloudsRightscale Webinar: Building Blocks for Private and Hybrid Clouds
Rightscale Webinar: Building Blocks for Private and Hybrid CloudsRightScale
 
Engineered Systems: Oracle’s Vision for the Future
Engineered Systems: Oracle’s Vision for the FutureEngineered Systems: Oracle’s Vision for the Future
Engineered Systems: Oracle’s Vision for the FutureBob Rhubart
 
EMC #1 Open XML Database (OEM)
EMC #1 Open XML Database (OEM)EMC #1 Open XML Database (OEM)
EMC #1 Open XML Database (OEM)Mountaha
 
Novell Success Stories: Collaboration in Travel and Hospitality
Novell Success Stories: Collaboration in Travel and HospitalityNovell Success Stories: Collaboration in Travel and Hospitality
Novell Success Stories: Collaboration in Travel and HospitalityNovell
 
MySQL HA Solutions
MySQL HA SolutionsMySQL HA Solutions
MySQL HA SolutionsMat Keep
 
Cloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web AppsCloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web AppsMark Slingsby
 
Protection against Lost or Stolen Data with Novell ZENworks Endpoint Security...
Protection against Lost or Stolen Data with Novell ZENworks Endpoint Security...Protection against Lost or Stolen Data with Novell ZENworks Endpoint Security...
Protection against Lost or Stolen Data with Novell ZENworks Endpoint Security...Novell
 
Oracle Fusion Middleware - pragmatic approach to build up your applications -...
Oracle Fusion Middleware - pragmatic approach to build up your applications -...Oracle Fusion Middleware - pragmatic approach to build up your applications -...
Oracle Fusion Middleware - pragmatic approach to build up your applications -...ORACLE USER GROUP ESTONIA
 
Advanced DNS/DHCP for Novell eDirectory Environments
Advanced DNS/DHCP for Novell eDirectory EnvironmentsAdvanced DNS/DHCP for Novell eDirectory Environments
Advanced DNS/DHCP for Novell eDirectory EnvironmentsNovell
 
Data Grids for Extreme Performance, Scalability and Availability JavaOne 2011...
Data Grids for Extreme Performance, Scalability and Availability JavaOne 2011...Data Grids for Extreme Performance, Scalability and Availability JavaOne 2011...
Data Grids for Extreme Performance, Scalability and Availability JavaOne 2011...C2B2 Consulting
 
Business Integration for the 21st Century
Business Integration for the 21st Century Business Integration for the 21st Century
Business Integration for the 21st Century Bob Rhubart
 
All of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL DeveloperAll of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL DeveloperJeff Smith
 

Was ist angesagt? (18)

Oracle Cloud Reference Architecture
Oracle Cloud Reference ArchitectureOracle Cloud Reference Architecture
Oracle Cloud Reference Architecture
 
Powered by Oracle! Te ayudamos a distribuir tu aplicación en todo el mundo
Powered by Oracle! Te ayudamos a distribuir tu aplicación en todo el mundoPowered by Oracle! Te ayudamos a distribuir tu aplicación en todo el mundo
Powered by Oracle! Te ayudamos a distribuir tu aplicación en todo el mundo
 
ITCamp 2012 - Adrian Stoian - Migrating from CFG MGR 2007 to CFG MGR 2012
ITCamp 2012 - Adrian Stoian - Migrating from CFG MGR 2007 to CFG MGR 2012ITCamp 2012 - Adrian Stoian - Migrating from CFG MGR 2007 to CFG MGR 2012
ITCamp 2012 - Adrian Stoian - Migrating from CFG MGR 2007 to CFG MGR 2012
 
Cloud Computing - Making IT Simple
 Cloud Computing - Making IT Simple Cloud Computing - Making IT Simple
Cloud Computing - Making IT Simple
 
Custom Development with Novell Teaming
Custom Development with Novell TeamingCustom Development with Novell Teaming
Custom Development with Novell Teaming
 
Oracle cloud story short
Oracle cloud story   shortOracle cloud story   short
Oracle cloud story short
 
Rightscale Webinar: Building Blocks for Private and Hybrid Clouds
Rightscale Webinar: Building Blocks for Private and Hybrid CloudsRightscale Webinar: Building Blocks for Private and Hybrid Clouds
Rightscale Webinar: Building Blocks for Private and Hybrid Clouds
 
Engineered Systems: Oracle’s Vision for the Future
Engineered Systems: Oracle’s Vision for the FutureEngineered Systems: Oracle’s Vision for the Future
Engineered Systems: Oracle’s Vision for the Future
 
EMC #1 Open XML Database (OEM)
EMC #1 Open XML Database (OEM)EMC #1 Open XML Database (OEM)
EMC #1 Open XML Database (OEM)
 
Novell Success Stories: Collaboration in Travel and Hospitality
Novell Success Stories: Collaboration in Travel and HospitalityNovell Success Stories: Collaboration in Travel and Hospitality
Novell Success Stories: Collaboration in Travel and Hospitality
 
MySQL HA Solutions
MySQL HA SolutionsMySQL HA Solutions
MySQL HA Solutions
 
Cloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web AppsCloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web Apps
 
Protection against Lost or Stolen Data with Novell ZENworks Endpoint Security...
Protection against Lost or Stolen Data with Novell ZENworks Endpoint Security...Protection against Lost or Stolen Data with Novell ZENworks Endpoint Security...
Protection against Lost or Stolen Data with Novell ZENworks Endpoint Security...
 
Oracle Fusion Middleware - pragmatic approach to build up your applications -...
Oracle Fusion Middleware - pragmatic approach to build up your applications -...Oracle Fusion Middleware - pragmatic approach to build up your applications -...
Oracle Fusion Middleware - pragmatic approach to build up your applications -...
 
Advanced DNS/DHCP for Novell eDirectory Environments
Advanced DNS/DHCP for Novell eDirectory EnvironmentsAdvanced DNS/DHCP for Novell eDirectory Environments
Advanced DNS/DHCP for Novell eDirectory Environments
 
Data Grids for Extreme Performance, Scalability and Availability JavaOne 2011...
Data Grids for Extreme Performance, Scalability and Availability JavaOne 2011...Data Grids for Extreme Performance, Scalability and Availability JavaOne 2011...
Data Grids for Extreme Performance, Scalability and Availability JavaOne 2011...
 
Business Integration for the 21st Century
Business Integration for the 21st Century Business Integration for the 21st Century
Business Integration for the 21st Century
 
All of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL DeveloperAll of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL Developer
 

Ähnlich wie Hadoop Summit 2012 | BranchReduce: Distributed Branch-and-Bound on YARN

Machine Learning and Hadoop: Present and Future
Machine Learning and Hadoop: Present and FutureMachine Learning and Hadoop: Present and Future
Machine Learning and Hadoop: Present and FutureData Science London
 
VA Smalltalk Update
VA Smalltalk UpdateVA Smalltalk Update
VA Smalltalk UpdateESUG
 
Hadoop and Machine Learning
Hadoop and Machine LearningHadoop and Machine Learning
Hadoop and Machine Learningjoshwills
 
Machine Learning and Hadoop: Present and future
Machine Learning and Hadoop: Present and futureMachine Learning and Hadoop: Present and future
Machine Learning and Hadoop: Present and futureCloudera, Inc.
 
Building Blocks for Private and Hybrid Clouds
Building Blocks for Private and Hybrid CloudsBuilding Blocks for Private and Hybrid Clouds
Building Blocks for Private and Hybrid CloudsRightScale
 
DB2 z/OS &amp; Java - What\'s New?
DB2 z/OS &amp; Java - What\'s New?DB2 z/OS &amp; Java - What\'s New?
DB2 z/OS &amp; Java - What\'s New?Laura Hood
 
VMware - Snapshot sessions - Deploy and manage tomorrow's applications today
VMware - Snapshot sessions  - Deploy and manage tomorrow's applications todayVMware - Snapshot sessions  - Deploy and manage tomorrow's applications today
VMware - Snapshot sessions - Deploy and manage tomorrow's applications todayAnnSteyaert_vmware
 
Database as a Service, Collaborate 2016
Database as a Service, Collaborate 2016Database as a Service, Collaborate 2016
Database as a Service, Collaborate 2016Kellyn Pot'Vin-Gorman
 
The power of hadoop in cloud computing
The power of hadoop in cloud computingThe power of hadoop in cloud computing
The power of hadoop in cloud computingJoey Echeverria
 
Open stack powered_cloud_solution_interop
Open stack powered_cloud_solution_interopOpen stack powered_cloud_solution_interop
Open stack powered_cloud_solution_interopKamesh Pemmaraju
 
Santo Leto - MySQL Connect 2012 - Getting Started with Mysql Cluster
Santo Leto - MySQL Connect 2012 - Getting Started with Mysql ClusterSanto Leto - MySQL Connect 2012 - Getting Started with Mysql Cluster
Santo Leto - MySQL Connect 2012 - Getting Started with Mysql ClusterSanto Leto
 
CARA User Interface for Oracle WebCenter
CARA User Interface for Oracle WebCenterCARA User Interface for Oracle WebCenter
CARA User Interface for Oracle WebCentercara4oraclewebcenter
 
WebLogic 12c Developer Deep Dive at Oracle Develop India 2012
WebLogic 12c Developer Deep Dive at Oracle Develop India 2012WebLogic 12c Developer Deep Dive at Oracle Develop India 2012
WebLogic 12c Developer Deep Dive at Oracle Develop India 2012Arun Gupta
 
1 architecture & design
1   architecture & design1   architecture & design
1 architecture & designMark Swarbrick
 
Fusion app integration_con8685_pdf_8685_0001
Fusion app integration_con8685_pdf_8685_0001Fusion app integration_con8685_pdf_8685_0001
Fusion app integration_con8685_pdf_8685_0001jucaab
 

Ähnlich wie Hadoop Summit 2012 | BranchReduce: Distributed Branch-and-Bound on YARN (20)

Machine Learning and Hadoop: Present and Future
Machine Learning and Hadoop: Present and FutureMachine Learning and Hadoop: Present and Future
Machine Learning and Hadoop: Present and Future
 
VA Smalltalk Update
VA Smalltalk UpdateVA Smalltalk Update
VA Smalltalk Update
 
Hadoop and Machine Learning
Hadoop and Machine LearningHadoop and Machine Learning
Hadoop and Machine Learning
 
Machine Learning and Hadoop: Present and future
Machine Learning and Hadoop: Present and futureMachine Learning and Hadoop: Present and future
Machine Learning and Hadoop: Present and future
 
Building Blocks for Private and Hybrid Clouds
Building Blocks for Private and Hybrid CloudsBuilding Blocks for Private and Hybrid Clouds
Building Blocks for Private and Hybrid Clouds
 
DB2 z/OS &amp; Java - What\'s New?
DB2 z/OS &amp; Java - What\'s New?DB2 z/OS &amp; Java - What\'s New?
DB2 z/OS &amp; Java - What\'s New?
 
VMware - Snapshot sessions - Deploy and manage tomorrow's applications today
VMware - Snapshot sessions  - Deploy and manage tomorrow's applications todayVMware - Snapshot sessions  - Deploy and manage tomorrow's applications today
VMware - Snapshot sessions - Deploy and manage tomorrow's applications today
 
Database as a Service, Collaborate 2016
Database as a Service, Collaborate 2016Database as a Service, Collaborate 2016
Database as a Service, Collaborate 2016
 
The power of hadoop in cloud computing
The power of hadoop in cloud computingThe power of hadoop in cloud computing
The power of hadoop in cloud computing
 
Open stack powered_cloud_solution_interop
Open stack powered_cloud_solution_interopOpen stack powered_cloud_solution_interop
Open stack powered_cloud_solution_interop
 
Santo Leto - MySQL Connect 2012 - Getting Started with Mysql Cluster
Santo Leto - MySQL Connect 2012 - Getting Started with Mysql ClusterSanto Leto - MySQL Connect 2012 - Getting Started with Mysql Cluster
Santo Leto - MySQL Connect 2012 - Getting Started with Mysql Cluster
 
The great 8 of ODA
The great 8 of ODAThe great 8 of ODA
The great 8 of ODA
 
YARN
YARNYARN
YARN
 
CARA User Interface for Oracle WebCenter
CARA User Interface for Oracle WebCenterCARA User Interface for Oracle WebCenter
CARA User Interface for Oracle WebCenter
 
WebLogic 12c Developer Deep Dive at Oracle Develop India 2012
WebLogic 12c Developer Deep Dive at Oracle Develop India 2012WebLogic 12c Developer Deep Dive at Oracle Develop India 2012
WebLogic 12c Developer Deep Dive at Oracle Develop India 2012
 
Inside hadoop-dev
Inside hadoop-devInside hadoop-dev
Inside hadoop-dev
 
1 architecture & design
1   architecture & design1   architecture & design
1 architecture & design
 
Features
FeaturesFeatures
Features
 
OOW-TBE-12c-CON7307-Sharable
OOW-TBE-12c-CON7307-SharableOOW-TBE-12c-CON7307-Sharable
OOW-TBE-12c-CON7307-Sharable
 
Fusion app integration_con8685_pdf_8685_0001
Fusion app integration_con8685_pdf_8685_0001Fusion app integration_con8685_pdf_8685_0001
Fusion app integration_con8685_pdf_8685_0001
 

Mehr von Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

Mehr von Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Kürzlich hochgeladen

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 

Kürzlich hochgeladen (20)

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 

Hadoop Summit 2012 | BranchReduce: Distributed Branch-and-Bound on YARN

  • 2. About Me Copyright 2012 Cloudera Inc. All rights reserved 2
  • 3. Hadoop Distributed Processing Frameworks Copyright 2012 Cloudera Inc. All rights reserved
  • 4. Lots of Other Parallel Processing Platforms Copyright 2012 Cloudera Inc. All rights reserved
  • 5. Hadoop 2.0: Resource Scheduling with YARN Copyright 2012 Cloudera Inc. All rights reserved
  • 6. The Data Deluge and the Cambrian Explosion Copyright 2012 Cloudera Inc. All rights reserved
  • 7. Parallel Distributed Processing For Everyone Copyright 2012 Cloudera Inc. All rights reserved
  • 8. Building a New Processing Framework on YARN Copyright 2012 Cloudera Inc. All rights reserved
  • 9. A Terrifyingly Accurate Paraphrasing of JWZ Some people, when confronted with a tedious problem, say, “I know, I’ll write a framework.” Now they have two tedious problems. Copyright 2012 Cloudera Inc. All rights reserved
  • 10. On Designing Frameworks Copyright 2012 Cloudera Inc. All rights reserved
  • 11. The Example YARN App: Distributed Shell Copyright 2012 Cloudera Inc. All rights reserved
  • 12. Do We Need a New Programming Language for Developing YARN Applications? Copyright 2012 Cloudera Inc. All rights reserved
  • 13. Do We Need a New Programming Language for Developing YARN Applications? Copyright 2012 Cloudera Inc. All rights reserved
  • 14. Leverage Existing Frameworks • Popular RPC libraries with support for multiple languages • C++, Java, Python • We need to make it easy to deploy existing applications on YARN Copyright 2012 Cloudera Inc. All rights reserved
  • 15. Kitten: Playing with YARN Copyright 2012 Cloudera Inc. All rights reserved
  • 16. Design Pattern: The Unified Application Master • Contains business logic and YARN logic • Primary reason: Communication • Also: dynamic resource allocation • Develop our master/worker applications locally and then deploy them on YARN Copyright 2012 Cloudera Inc. All rights reserved
  • 17. YARN Lifecycle Management as a Service • Specifically, extensions of Guava’s Service interface • YarnClientService • AppMasterService • Contains all of the logic for creating applications and keeping an eye on them Copyright 2012 Cloudera Inc. All rights reserved
  • 18. Moving the Configuration Logic Out of Java Copyright 2012 Cloudera Inc. All rights reserved
  • 19. Lua as a Configuration Language • Small and Simple • Looks like a configuration file • Functions are there when/if you need them • Inheritance • Don’t Repeat Yourself • Forgiving of undefined values • Java/C++ Integration Copyright 2012 Cloudera Inc. All rights reserved
  • 20. First Kitten Utility: The cat Function Copyright 2012 Cloudera Inc. All rights reserved
  • 21. Second Kitten Utility: The yarn Function Copyright 2012 Cloudera Inc. All rights reserved
  • 22. BranchReduce Copyright 2012 Cloudera Inc. All rights reserved
  • 23. Branch-and-Bound Copyright 2012 Cloudera Inc. All rights reserved
  • 24. The Challenge of Parallel Branch and Bound: Unbalanced Search Space • Some branches are pruned quickly • Can be difficult to determine the best splits a priori • Easy to revert to a de facto single-threaded search Copyright 2012 Cloudera Inc. All rights reserved
  • 25. The Solution: Work Stealing Copyright 2012 Cloudera Inc. All rights reserved
  • 26. You Write Three Classes • A Task class that implements Writable • A GlobalState class that implements Writable and has a mergeWith(GlobalState other) method • A Processor class that defines: • execute(T task, BranchReduceContext<T, GlobalState> ctxt); • With optional initialize and cleanup methods • Configuration is done via BranchReduceJob Copyright 2012 Cloudera Inc. All rights reserved
  • 27. Example: The Knapsack Problem Copyright 2012 Cloudera Inc. All rights reserved
  • 28. 0-1 Integer Programming Problems • NP-Hard Resource Allocation Problem • Portfolio Optimization • Asset Securitization Copyright 2012 Cloudera Inc. All rights reserved
  • 29. Problem Formulation: (Simplified) LP Format Copyright 2012 Cloudera Inc. All rights reserved

Hinweis der Redaktion

  1. 1600+ Lines of Java Code796 Client, 837 Application MasterRelatively Simple Lifecycle PatternsStart Up: RPCs and Resource DefinitionsPeriodic Heartbeat ChecksShut Down: Resource CleanupNo advanced featuresCheckpointingApplication Master failuresDynamic resource allocation