Hadoop Nextgen/MRv2/YARN

Sharad Agarwal
sharad@apache.org
  
About me
•  Apache Foundation
    –  Hadoop Committer and PMC member
    –  Hadoop MR contributor ~ 4 years
    –  Author of Hadoop Nextgen core
•  Head of Technology Platforms @InMobi
    –  Formerly Architect @Yahoo!
  
Hadoop Map-Reduce Today
•  JobTracker
    –  Manages cluster resources and job scheduling
•  TaskTracker
    –  Per-node agent
    –  Manages tasks
  
Current Limitations
•  Scalability
    –  Maximum cluster size: 4,000 nodes
    –  Maximum concurrent tasks: 40,000
    –  Coarse synchronization in JobTracker
•  Single point of failure
    –  Failure kills all queued and running jobs
    –  Jobs need to be re-submitted by users
•  Restart is very tricky due to complex state
•  Hard partition of resources into map and reduce slots
  
Current Limitations
•  Lacks support for alternate paradigms
    –  Iterative applications implemented using Map-Reduce are 10x slower
    –  Example: K-Means, PageRank
•  Lack of wire-compatible protocols
    –  Client and cluster must be of same version
    –  Applications and workflows cannot migrate to different clusters
  
Next Generation Map-Reduce: Requirements
•  Reliability
•  Availability
•  Scalability - Clusters of 6,000 machines
    –  Each machine with 16 cores, 48G RAM, 24TB disks
    –  100,000 concurrent tasks
    –  10,000 concurrent jobs
•  Wire Compatibility
•  Agility & Evolution: ability for customers to control upgrades to the grid software stack
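
As a rough check on these targets, the arithmetic below works out the aggregate capacity implied by the per-machine figures and the resulting task density. It is a back-of-the-envelope sketch: the variable names are illustrative, and decimal units (1 TB = 1,000 GB, 1 PB = 1,000 TB) are an assumption.

```python
# Back-of-the-envelope totals for the target cluster described above.
MACHINES = 6_000
CORES_PER_MACHINE = 16
RAM_GB_PER_MACHINE = 48
DISK_TB_PER_MACHINE = 24
CONCURRENT_TASKS = 100_000

total_cores = MACHINES * CORES_PER_MACHINE                # 96,000 cores
total_ram_tb = MACHINES * RAM_GB_PER_MACHINE / 1_000      # 288 TB of RAM (decimal units assumed)
total_disk_pb = MACHINES * DISK_TB_PER_MACHINE / 1_000    # 144 PB of disk (decimal units assumed)
tasks_per_core = CONCURRENT_TASKS / total_cores           # a little over 1 task per core
tasks_per_machine = CONCURRENT_TASKS / MACHINES           # between 16 and 17 tasks per machine

print(f"cores={total_cores}, ram={total_ram_tb} TB, disk={total_disk_pb} PB")
print(f"tasks/core={tasks_per_core:.2f}, tasks/machine={tasks_per_machine:.1f}")
```

In other words, the 100,000-task target corresponds to roughly one concurrent task per core.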
  
Next Generation Map-Reduce: Architecture
•  Split up the two major functions of JobTracker
    –  Cluster resource management
    –  Application life-cycle management
•  Map-Reduce becomes user-land library
  
Architecture
[Figure: architecture diagram showing Clients, the Resource Manager, and Node Managers hosting App Mstr and Container processes. Arrow legend: MapReduce Status, Job Submission, Node Status, Resource Request.]

Architecture
•  Resource Manager
    –  Global resource scheduler
    –  Hierarchical queues
•  Node Manager
    –  Per-machine agent
    –  Manages the life-cycle of containers
    –  Container resource monitoring
•  Application Master
    –  Per-application
    –  Manages application scheduling and task execution
    –  E.g. Map-Reduce Application Master
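
The division of responsibilities above can be sketched as a toy model. This is not the YARN API: the class and method names (`ResourceManager.allocate`, `NodeManager.launch`, `ApplicationMaster.run`), the first-fit placement, and the memory-only resource model are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Per-machine agent: tracks free memory and the containers it hosts."""
    name: str
    free_mb: int
    containers: list = field(default_factory=list)

    def launch(self, app_id: str, mb: int) -> str:
        self.free_mb -= mb
        container_id = f"{self.name}/c{len(self.containers)}"
        self.containers.append((container_id, app_id, mb))
        return container_id


class ResourceManager:
    """Global scheduler: knows the nodes and hands out containers; runs no tasks."""

    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, app_id: str, mb: int):
        # First-fit placement; a real scheduler would also consult queues.
        for node in self.nodes:
            if node.free_mb >= mb:
                return node.launch(app_id, mb)
        return None  # no capacity right now


class ApplicationMaster:
    """Per-application: asks the RM for containers and tracks its own tasks."""

    def __init__(self, app_id: str, rm: ResourceManager):
        self.app_id = app_id
        self.rm = rm

    def run(self, task_sizes_mb):
        # Map each task index to the container it received (None if unplaced).
        return {i: self.rm.allocate(self.app_id, mb)
                for i, mb in enumerate(task_sizes_mb)}


rm = ResourceManager([NodeManager("node1", 2048), NodeManager("node2", 1024)])
placements = ApplicationMaster("app-1", rm).run([1024, 1024, 1024, 1024])
print(placements)
```

The point of the split is visible in the shape of the code: the resource manager only places containers, while everything specific to one application's tasks lives in its own master object.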
  
Improvements vis-à-vis current Map-Reduce
•  Scalability
    –  Application life-cycle management is very expensive
    –  Partition resource management and application life-cycle management
    –  Application management is distributed
    –  Hardware trends - Currently run clusters of 4,000 machines
         •  6,000 2012 machines > 12,000 2009 machines
         •  <8 cores, 16G, 4TB> vs. <16+ cores, 48/96G, 24TB>
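
The claim that 6,000 machines of the 2012 class outweigh 12,000 machines of the 2009 class can be checked against the per-machine figures above. The sketch assumes the first tuple describes the 2009 machine and the second the 2012 machine, and it uses the lower bounds (16 cores, 48G) for the latter.

```python
# Per fleet: (machines, cores per machine, RAM in GB per machine, disk in TB per machine).
fleets = {
    "2009": (12_000, 8, 16, 4),
    "2012": (6_000, 16, 48, 24),
}

# Aggregate capacity of each fleet.
totals = {
    year: {"cores": n * cores, "ram_gb": n * ram, "disk_tb": n * disk}
    for year, (n, cores, ram, disk) in fleets.items()
}

for year, t in totals.items():
    print(year, t)
```

Under these assumptions the two fleets have the same core count, while the 2012 fleet has 1.5x the memory and 3x the disk on half as many nodes.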
  
Improvements vis-à-vis current Map-Reduce
•  Availability
    –  Application Master
         •  Optional failover via application-specific checkpoint
         •  Map-Reduce applications pick up where they left off
    –  Resource Manager
         •  No single point of failure - failover via ZooKeeper
         •  Application Masters are restarted automatically
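
Application-specific checkpointing can be pictured as follows. This is a minimal sketch, not the Map-Reduce Application Master's actual recovery code: the `run_job` function, the JSON checkpoint file, and the `fail_after` parameter that simulates a crash are all hypothetical.

```python
import json
import os
import tempfile


def run_job(tasks, checkpoint_path, fail_after=None):
    """Run tasks in order, recording each completed task in a checkpoint file.

    On restart, tasks already listed in the checkpoint are skipped.
    Returns the list of tasks executed during this attempt.
    """
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)

    executed = []
    for task in tasks:
        if task in done:
            continue  # finished in an earlier attempt
        if fail_after is not None and len(executed) == fail_after:
            raise RuntimeError("simulated Application Master failure")
        executed.append(task)
        done.append(task)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)  # persist progress after every task
    return executed


tasks = ["map-0", "map-1", "map-2", "reduce-0"]
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")

try:
    run_job(tasks, path, fail_after=2)   # first attempt dies after two tasks
except RuntimeError:
    pass

second_attempt = run_job(tasks, path)    # restarted attempt resumes from the checkpoint
print(second_attempt)
```

The restarted attempt runs only the tasks that were not recorded as complete, which is the "pick up where they left off" behaviour in miniature.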
  
Improvements vis-à-vis current Map-Reduce
•  Wire Compatibility
    –  Protocols are wire-compatible
    –  Old clients can talk to new servers
    –  Rolling upgrades
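
One common way to get this property is to make every field added in a later protocol version optional, with a default on the receiving side. The sketch below illustrates the idea with JSON messages; the message shape, the field names, and the `decode_submit_request` helper are hypothetical and do not reflect Hadoop's actual RPC format.

```python
import json

# Fields the (hypothetical) newer server understands, with defaults for
# anything an older client does not send.
DEFAULTS = {"queue": "default", "priority": 0}


def decode_submit_request(raw: str) -> dict:
    """Decode a job-submission message from a client of any version."""
    msg = json.loads(raw)
    request = {"job_name": msg["job_name"]}      # required in every version
    for name, default in DEFAULTS.items():
        request[name] = msg.get(name, default)   # optional, added later
    # Unknown fields from even newer clients are ignored rather than rejected.
    return request


old_client = json.dumps({"job_name": "wordcount"})
new_client = json.dumps({"job_name": "wordcount", "queue": "adhoc", "priority": 5})

print(decode_submit_request(old_client))
print(decode_submit_request(new_client))
```

Because the server never requires the newer fields, servers can be upgraded one at a time while older clients keep working, which is what makes a rolling upgrade possible.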
  
Improvements vis-à-vis current Map-Reduce
•  Agility / Evolution
    –  Map-Reduce now becomes a user-land library
    –  Multiple versions of Map-Reduce can run in the same cluster (à la Apache Pig)
         •  Faster deployment cycles for improvements
    –  Customers upgrade Map-Reduce versions on their schedule
  
Improvements vis-à-vis current Map-Reduce
•  Utilization
    –  Generic resource model
         •  Memory
         •  CPU
         •  Disk b/w
         •  Network b/w
    –  Remove fixed partition of map and reduce slots
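
The cost of a fixed map/reduce partition shows up when the workload is lopsided. The sketch below compares the two models on a single node; the slot counts, the memory-only resource model, and the function names are illustrative assumptions, not Hadoop configuration.

```python
def run_with_slots(map_slots, reduce_slots, map_tasks, reduce_tasks):
    """Fixed partition: a map task may only use a map slot, and likewise for reduce."""
    return min(map_slots, map_tasks) + min(reduce_slots, reduce_tasks)


def run_with_containers(total_mb, task_mb, map_tasks, reduce_tasks):
    """Generic model: any task can take any free memory on the node."""
    capacity = total_mb // task_mb
    return min(capacity, map_tasks + reduce_tasks)


# A node sized for 8 tasks of 1 GB each, split 6 map / 2 reduce in the slot model.
# The job is in its map phase: 8 map tasks are waiting and no reduce tasks yet.
slots_running = run_with_slots(map_slots=6, reduce_slots=2, map_tasks=8, reduce_tasks=0)
containers_running = run_with_containers(total_mb=8192, task_mb=1024, map_tasks=8, reduce_tasks=0)

print(slots_running, containers_running)
```

With these numbers the slot model leaves the two reduce slots idle during the map phase, while the generic model can use the whole node.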
  
Improvements vis-à-vis current Map-Reduce
•  Support for programming paradigms other than Map-Reduce
    –  MPI
    –  Master-Worker
    –  Machine Learning
    –  Iterative processing
    –  Enabled by allowing use of paradigm-specific Application Master
    –  Run all on the same Hadoop cluster
  
Summary
•  The next generation of Map-Reduce takes Hadoop to the next level
    –  Scale-out even further
    –  High availability
    –  Cluster utilization
    –  Support for paradigms other than Map-Reduce
  
Status
•  Apache Hadoop 0.23 release is out
    –  HDFS Federation
    –  MRv2
•  Currently undergoing tests on small scale ~ 500 nodes
•  Alpha
    –  2000 nodes
    –  Q1 2012
•  Beta/Production
    –  Variety of applications and loads
    –  4000+ nodes
    –  Q2 2012
  
Questions?

Follow me on @twitter: sharad_ag
  

Weitere ähnliche Inhalte

Was ist angesagt?

Writing Yarn Applications Hadoop Summit 2012
Writing Yarn Applications Hadoop Summit 2012Writing Yarn Applications Hadoop Summit 2012
Writing Yarn Applications Hadoop Summit 2012Hortonworks
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopHortonworks
 
YARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformYARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformBikas Saha
 
Hadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and FutureHadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and FutureVinod Kumar Vavilapalli
 
Anti patterns in hadoop cluster deployment
Anti patterns in hadoop cluster deploymentAnti patterns in hadoop cluster deployment
Anti patterns in hadoop cluster deploymentNaganarasimha Garla
 
Investing the Effects of Overcommitting YARN resources
Investing the Effects of Overcommitting YARN resourcesInvesting the Effects of Overcommitting YARN resources
Investing the Effects of Overcommitting YARN resourcesDataWorks Summit/Hadoop Summit
 
Extending Spark Streaming to Support Complex Event Processing
Extending Spark Streaming to Support Complex Event ProcessingExtending Spark Streaming to Support Complex Event Processing
Extending Spark Streaming to Support Complex Event ProcessingOh Chan Kwon
 
Back to School - St. Louis Hadoop Meetup September 2016
Back to School - St. Louis Hadoop Meetup September 2016Back to School - St. Louis Hadoop Meetup September 2016
Back to School - St. Louis Hadoop Meetup September 2016Adam Doyle
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsHortonworks
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesDataWorks Summit
 
Cmg06 utilization is useless
Cmg06 utilization is uselessCmg06 utilization is useless
Cmg06 utilization is uselessAdrian Cockcroft
 
Application Timeline Server Past, Present and Future
Application Timeline Server  Past, Present and FutureApplication Timeline Server  Past, Present and Future
Application Timeline Server Past, Present and FutureNaganarasimha Garla
 
Towards SLA-based Scheduling on YARN Clusters
Towards SLA-based Scheduling on YARN ClustersTowards SLA-based Scheduling on YARN Clusters
Towards SLA-based Scheduling on YARN ClustersDataWorks Summit
 
Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...
Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...
Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...DataStax Academy
 
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...Cloudera, Inc.
 
Migrating to Riak at Shareaholic
Migrating to Riak at ShareaholicMigrating to Riak at Shareaholic
Migrating to Riak at ShareaholicShareaholic
 

Was ist angesagt? (20)

Writing Yarn Applications Hadoop Summit 2012
Writing Yarn Applications Hadoop Summit 2012Writing Yarn Applications Hadoop Summit 2012
Writing Yarn Applications Hadoop Summit 2012
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with Hadoop
 
YARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformYARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute Platform
 
Hadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and FutureHadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and Future
 
Anti patterns in hadoop cluster deployment
Anti patterns in hadoop cluster deploymentAnti patterns in hadoop cluster deployment
Anti patterns in hadoop cluster deployment
 
Investing the Effects of Overcommitting YARN resources
Investing the Effects of Overcommitting YARN resourcesInvesting the Effects of Overcommitting YARN resources
Investing the Effects of Overcommitting YARN resources
 
Extending Spark Streaming to Support Complex Event Processing
Extending Spark Streaming to Support Complex Event ProcessingExtending Spark Streaming to Support Complex Event Processing
Extending Spark Streaming to Support Complex Event Processing
 
Back to School - St. Louis Hadoop Meetup September 2016
Back to School - St. Louis Hadoop Meetup September 2016Back to School - St. Louis Hadoop Meetup September 2016
Back to School - St. Louis Hadoop Meetup September 2016
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data Applications
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practices
 
Hadoop YARN overview
Hadoop YARN overviewHadoop YARN overview
Hadoop YARN overview
 
Yarnthug2014
Yarnthug2014Yarnthug2014
Yarnthug2014
 
Cmg06 utilization is useless
Cmg06 utilization is uselessCmg06 utilization is useless
Cmg06 utilization is useless
 
Application Timeline Server Past, Present and Future
Application Timeline Server  Past, Present and FutureApplication Timeline Server  Past, Present and Future
Application Timeline Server Past, Present and Future
 
Philly DB MapR Overview
Philly DB MapR OverviewPhilly DB MapR Overview
Philly DB MapR Overview
 
Towards SLA-based Scheduling on YARN Clusters
Towards SLA-based Scheduling on YARN ClustersTowards SLA-based Scheduling on YARN Clusters
Towards SLA-based Scheduling on YARN Clusters
 
Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...
Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...
Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...
 
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
 
Migrating to Riak at Shareaholic
Migrating to Riak at ShareaholicMigrating to Riak at Shareaholic
Migrating to Riak at Shareaholic
 

Andere mochten auch

Marcel Kornacker, Software Enginner at Cloudera - "Data modeling for data sci...
Marcel Kornacker, Software Enginner at Cloudera - "Data modeling for data sci...Marcel Kornacker, Software Enginner at Cloudera - "Data modeling for data sci...
Marcel Kornacker, Software Enginner at Cloudera - "Data modeling for data sci...Dataconomy Media
 
Coursera neuromarketing 2015
Coursera neuromarketing 2015Coursera neuromarketing 2015
Coursera neuromarketing 2015Ines Solari
 
Ppt citysurv as_a_service_en_slideshare
Ppt citysurv as_a_service_en_slidesharePpt citysurv as_a_service_en_slideshare
Ppt citysurv as_a_service_en_slideshareAxis Communications
 
Hombres iguales por naturaleza
Hombres iguales por naturalezaHombres iguales por naturaleza
Hombres iguales por naturalezalucas zvala
 
історичний розвиток органічного світу
історичний розвиток органічного світуісторичний розвиток органічного світу
історичний розвиток органічного світуltasenko
 
Understanding Analytics With Twitter
Understanding Analytics With TwitterUnderstanding Analytics With Twitter
Understanding Analytics With TwitterChidi Okereke
 
Compliance plus presentation slide share
Compliance plus presentation slide shareCompliance plus presentation slide share
Compliance plus presentation slide shareAndy Brooks
 
Sortem repite experiencia en Funergal
Sortem repite experiencia en FunergalSortem repite experiencia en Funergal
Sortem repite experiencia en FunergalSortem
 
Предиктивная аналитика
Предиктивная аналитикаПредиктивная аналитика
Предиктивная аналитикаMOBILE DIMENSION LLC
 
AI and Big Data For National Intelligence
AI and Big Data For National IntelligenceAI and Big Data For National Intelligence
AI and Big Data For National IntelligenceSonal Goyal
 
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft..."Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...Dataconomy Media
 
[Elite Camp 2016] Stacey MacNaught - Nobody Pays the Bills in "Social Shares"...
[Elite Camp 2016] Stacey MacNaught - Nobody Pays the Bills in "Social Shares"...[Elite Camp 2016] Stacey MacNaught - Nobody Pays the Bills in "Social Shares"...
[Elite Camp 2016] Stacey MacNaught - Nobody Pays the Bills in "Social Shares"...CXL
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big DataRommel Garcia
 
Choppers and cycloconverters
Choppers and cycloconvertersChoppers and cycloconverters
Choppers and cycloconvertersSHIMI S L
 
InfoSecurity Magazine - Data Loss Prevention
InfoSecurity Magazine - Data Loss PreventionInfoSecurity Magazine - Data Loss Prevention
InfoSecurity Magazine - Data Loss PreventionSimon Perry
 

Andere mochten auch (17)

Marcel Kornacker, Software Enginner at Cloudera - "Data modeling for data sci...
Marcel Kornacker, Software Enginner at Cloudera - "Data modeling for data sci...Marcel Kornacker, Software Enginner at Cloudera - "Data modeling for data sci...
Marcel Kornacker, Software Enginner at Cloudera - "Data modeling for data sci...
 
Coursera neuromarketing 2015
Coursera neuromarketing 2015Coursera neuromarketing 2015
Coursera neuromarketing 2015
 
Ppt citysurv as_a_service_en_slideshare
Ppt citysurv as_a_service_en_slidesharePpt citysurv as_a_service_en_slideshare
Ppt citysurv as_a_service_en_slideshare
 
Hombres iguales por naturaleza
Hombres iguales por naturalezaHombres iguales por naturaleza
Hombres iguales por naturaleza
 
Asanid hajar
Asanid hajarAsanid hajar
Asanid hajar
 
історичний розвиток органічного світу
історичний розвиток органічного світуісторичний розвиток органічного світу
історичний розвиток органічного світу
 
Understanding Analytics With Twitter
Understanding Analytics With TwitterUnderstanding Analytics With Twitter
Understanding Analytics With Twitter
 
Compliance plus presentation slide share
Compliance plus presentation slide shareCompliance plus presentation slide share
Compliance plus presentation slide share
 
Sortem repite experiencia en Funergal
Sortem repite experiencia en FunergalSortem repite experiencia en Funergal
Sortem repite experiencia en Funergal
 
Streaming Outlier Analysis for Fun and Scalability
Streaming Outlier Analysis for Fun and Scalability Streaming Outlier Analysis for Fun and Scalability
Streaming Outlier Analysis for Fun and Scalability
 
Предиктивная аналитика
Предиктивная аналитикаПредиктивная аналитика
Предиктивная аналитика
 
AI and Big Data For National Intelligence
AI and Big Data For National IntelligenceAI and Big Data For National Intelligence
AI and Big Data For National Intelligence
 
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft..."Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
 
[Elite Camp 2016] Stacey MacNaught - Nobody Pays the Bills in "Social Shares"...
[Elite Camp 2016] Stacey MacNaught - Nobody Pays the Bills in "Social Shares"...[Elite Camp 2016] Stacey MacNaught - Nobody Pays the Bills in "Social Shares"...
[Elite Camp 2016] Stacey MacNaught - Nobody Pays the Bills in "Social Shares"...
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big Data
 
Choppers and cycloconverters
Choppers and cycloconvertersChoppers and cycloconverters
Choppers and cycloconverters
 
InfoSecurity Magazine - Data Loss Prevention
InfoSecurity Magazine - Data Loss PreventionInfoSecurity Magazine - Data Loss Prevention
InfoSecurity Magazine - Data Loss Prevention
 

Ähnlich wie Hadoop bangalore-meetup-dec-2011-hadoop nextgen

Hadoop World 2011, Apache Hadoop MapReduce Next Gen
Hadoop World 2011, Apache Hadoop MapReduce Next GenHadoop World 2011, Apache Hadoop MapReduce Next Gen
Hadoop World 2011, Apache Hadoop MapReduce Next GenHortonworks
 
YARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache HadoopYARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache HadoopHortonworks
 
YARN Hadoop Summit Bangalore 2011
YARN Hadoop Summit Bangalore 2011YARN Hadoop Summit Bangalore 2011
YARN Hadoop Summit Bangalore 2011Sharad Agarwal
 
Next Generation of Hadoop MapReduce
Next Generation of Hadoop MapReduceNext Generation of Hadoop MapReduce
Next Generation of Hadoop MapReducehuguk
 
Bloomreach - BloomStore Compute Cloud Infrastructure
Bloomreach - BloomStore Compute Cloud Infrastructure Bloomreach - BloomStore Compute Cloud Infrastructure
Bloomreach - BloomStore Compute Cloud Infrastructure bloomreacheng
 
A sdn based application aware and network provisioning
A sdn based application aware and network provisioningA sdn based application aware and network provisioning
A sdn based application aware and network provisioningStanley Wang
 
Running Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache HadoopRunning Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache Hadoophitesh1892
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnhdhappy001
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopHortonworks
 
An Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop YarnAn Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop YarnMike Frampton
 
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史Insight Technology, Inc.
 
Big Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdfBig Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdfWasyihunSema2
 
Apache Hadoop YARN - Hortonworks Meetup Presentation
Apache Hadoop YARN - Hortonworks Meetup PresentationApache Hadoop YARN - Hortonworks Meetup Presentation
Apache Hadoop YARN - Hortonworks Meetup PresentationHortonworks
 
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...Chris Fregly
 
YARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User GroupYARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User GroupRommel Garcia
 
Apache Hadoop YARN State of the Union
Apache Hadoop YARN State of the UnionApache Hadoop YARN State of the Union
Apache Hadoop YARN State of the UnionWeiwei Yang
 

Ähnlich wie Hadoop bangalore-meetup-dec-2011-hadoop nextgen (20)

Hadoop World 2011, Apache Hadoop MapReduce Next Gen
Hadoop World 2011, Apache Hadoop MapReduce Next GenHadoop World 2011, Apache Hadoop MapReduce Next Gen
Hadoop World 2011, Apache Hadoop MapReduce Next Gen
 
YARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache HadoopYARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache Hadoop
 
YARN Hadoop Summit Bangalore 2011
YARN Hadoop Summit Bangalore 2011YARN Hadoop Summit Bangalore 2011
YARN Hadoop Summit Bangalore 2011
 
Next Generation of Hadoop MapReduce
Next Generation of Hadoop MapReduceNext Generation of Hadoop MapReduce
Next Generation of Hadoop MapReduce
 
Bloomreach - BloomStore Compute Cloud Infrastructure
Bloomreach - BloomStore Compute Cloud Infrastructure Bloomreach - BloomStore Compute Cloud Infrastructure
Bloomreach - BloomStore Compute Cloud Infrastructure
 
A sdn based application aware and network provisioning
A sdn based application aware and network provisioningA sdn based application aware and network provisioning
A sdn based application aware and network provisioning
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
Running Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache HadoopRunning Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache Hadoop
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
 
An Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop YarnAn Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop Yarn
 
MHUG - YARN
MHUG - YARNMHUG - YARN
MHUG - YARN
 
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
 
Big Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdfBig Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdf
 
Glint with Apache Spark
Glint with Apache SparkGlint with Apache Spark
Glint with Apache Spark
 
Apache Hadoop YARN - Hortonworks Meetup Presentation
Apache Hadoop YARN - Hortonworks Meetup PresentationApache Hadoop YARN - Hortonworks Meetup Presentation
Apache Hadoop YARN - Hortonworks Meetup Presentation
 
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
 
YARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User GroupYARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User Group
 
Apache Hadoop YARN State of the Union
Apache Hadoop YARN State of the UnionApache Hadoop YARN State of the Union
Apache Hadoop YARN State of the Union
 

Mehr von InMobi

Responding to Coronavirus: How marketers can leverage digital responsibly
Responding to Coronavirus: How marketers can leverage digital responsiblyResponding to Coronavirus: How marketers can leverage digital responsibly
Responding to Coronavirus: How marketers can leverage digital responsiblyInMobi
 
2020: Celebrating the Era of the Connected Consumer
2020: Celebrating the Era of the Connected Consumer2020: Celebrating the Era of the Connected Consumer
2020: Celebrating the Era of the Connected ConsumerInMobi
 
Winning the Indian Festive Shopper in 2019
Winning the Indian Festive Shopper in 2019Winning the Indian Festive Shopper in 2019
Winning the Indian Festive Shopper in 2019InMobi
 
The Changing Face of the Indian Mobile User
The Changing Face of the Indian Mobile UserThe Changing Face of the Indian Mobile User
The Changing Face of the Indian Mobile UserInMobi
 
Unlocking the True Potential of Data on Mobile
Unlocking the True Potential of Data on MobileUnlocking the True Potential of Data on Mobile
Unlocking the True Potential of Data on MobileInMobi
 
InMobi State of Mobile Video Advertising Report 2018
InMobi State of Mobile Video Advertising Report 2018InMobi State of Mobile Video Advertising Report 2018
InMobi State of Mobile Video Advertising Report 2018InMobi
 
Neural Field aware Factorization Machine
Neural Field aware Factorization MachineNeural Field aware Factorization Machine
Neural Field aware Factorization MachineInMobi
 
The Essential Mediation Toolkit - Korean
The Essential Mediation Toolkit - KoreanThe Essential Mediation Toolkit - Korean
The Essential Mediation Toolkit - KoreanInMobi
 
Hadoop bangalore-meetup-dec-2011-hadoop nextgen

  • 1. Hadoop Nextgen/MRv2/YARN
    Sharad Agarwal
    sharad@apache.org
  • 2. About me
    •  Apache Foundation
       –  Hadoop Committer and PMC member
       –  Hadoop MR contributor ~ 4 years
       –  Author of Hadoop Nextgen core
    •  Head of Technology Platforms @InMobi
       –  Formerly Architect @Yahoo!
  • 3. Hadoop Map-Reduce Today
    •  JobTracker
       –  Manages cluster resources and job scheduling
    •  TaskTracker
       –  Per-node agent
       –  Manages tasks
  • 4. Current Limitations
    •  Scalability
       –  Maximum cluster size – 4,000 nodes
       –  Maximum concurrent tasks – 40,000
       –  Coarse synchronization in JobTracker
    •  Single point of failure
       –  Failure kills all queued and running jobs
       –  Jobs need to be re-submitted by users
    •  Restart is very tricky due to complex state
    •  Hard partition of resources into map and reduce slots
  • 5. Current Limitations
    •  Lacks support for alternate paradigms
       –  Iterative applications implemented using Map-Reduce are 10x slower
       –  Example: K-Means, PageRank
    •  Lack of wire-compatible protocols
       –  Client and cluster must be of same version
       –  Applications and workflows cannot migrate to different clusters
  • 6. Next Generation Map-Reduce Requirements
    •  Reliability
    •  Availability
    •  Scalability - Clusters of 6,000 machines
       –  Each machine with 16 cores, 48G RAM, 24TB disks
       –  100,000 concurrent tasks
       –  10,000 concurrent jobs
    •  Wire Compatibility
    •  Agility & Evolution – Ability for customers to control upgrades to the grid software stack
  • 7. Next Generation Map-Reduce Architecture
    •  Split up the two major functions of JobTracker
       –  Cluster resource management
       –  Application life-cycle management
    •  Map-Reduce becomes user-land library
  • 8. Architecture
    [Figure: architecture diagram. Components: Client, Resource Manager, Node Manager, App Mstr (Application Master), Container. Arrow legend: MapReduce Status, Job Submission, Node Status, Resource Request.]
  • 9. Architecture
    •  Resource Manager
       –  Global resource scheduler
       –  Hierarchical queues
    •  Node Manager
       –  Per-machine agent
       –  Manages the life-cycle of container
       –  Container resource monitoring
    •  Application Master
       –  Per-application
       –  Manages application scheduling and task execution
       –  E.g. Map-Reduce Application Master
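The division of labour between the three components can be sketched in a few lines of Python. This is a toy model, not the YARN API: the class and method names (`allocate`, `launch`, `run`), the first-fit placement, and the memory-only resource are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Per-machine agent: tracks free memory and the containers it launched."""
    name: str
    free_mb: int
    containers: list = field(default_factory=list)

    def launch(self, app_id: str, mb: int) -> str:
        # Reserve memory and record the new container on this node.
        self.free_mb -= mb
        container_id = f"{self.name}/c{len(self.containers)}"
        self.containers.append((container_id, app_id, mb))
        return container_id


class ResourceManager:
    """Global scheduler: hands out containers, knows nothing about tasks."""

    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, app_id: str, mb: int):
        # Hypothetical first-fit placement; queues are left out of this sketch.
        for node in self.nodes:
            if node.free_mb >= mb:
                return node.launch(app_id, mb)
        return None  # nothing free right now


class ApplicationMaster:
    """Per-application: decides which tasks to run and asks for containers."""

    def __init__(self, app_id, rm, tasks):
        self.app_id, self.rm, self.tasks = app_id, rm, tasks

    def run(self):
        # Map each task name to the container the Resource Manager granted.
        return {task: self.rm.allocate(self.app_id, mb) for task, mb in self.tasks}


nodes = [NodeManager("node1", 2048), NodeManager("node2", 2048)]
rm = ResourceManager(nodes)
am = ApplicationMaster("app-1", rm, [("map-0", 1024), ("map-1", 1024), ("reduce-0", 2048)])
placement = am.run()
print(placement)
```

The point of the sketch is that task-level knowledge lives only in `ApplicationMaster`; `ResourceManager` sees nothing but resource requests.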
  • 10. Improvements vis-à-vis current Map-Reduce
    •  Scalability
       –  Application life-cycle management is very expensive
       –  Partition resource management and application life-cycle management
       –  Application management is distributed
       –  Hardware trends - Currently run clusters of 4,000 machines
          •  6,000 2012 machines > 12,000 2009 machines
          •  <8 cores, 16G, 4TB> v/s <16+ cores, 48/96G, 24TB>
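The "6,000 2012 machines > 12,000 2009 machines" comparison can be checked with simple arithmetic. The sketch below uses the per-machine figures from the slide and takes the lower bounds for 2012 hardware (16 cores, 48G) as an assumption.

```python
# Per-machine specs from the slide; 2012 values use the lower bounds (16 cores, 48G).
machines_2009 = {"count": 12_000, "cores": 8, "ram_gb": 16, "disk_tb": 4}
machines_2012 = {"count": 6_000, "cores": 16, "ram_gb": 48, "disk_tb": 24}


def totals(spec):
    """Multiply each per-machine resource by the machine count."""
    return {k: spec["count"] * v for k, v in spec.items() if k != "count"}


old, new = totals(machines_2009), totals(machines_2012)
for key in old:
    print(f"{key}: 2009 cluster={old[key]:,}  2012 cluster={new[key]:,}")
```

With these numbers the core counts are equal (96,000 each), while the 2012 cluster has 1.5x the RAM (288,000 GB against 192,000 GB) and 3x the disk (144,000 TB against 48,000 TB).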
  • 11. Improvements vis-à-vis current Map-Reduce
    •  Availability
       –  Application Master
          •  Optional failover via application-specific checkpoint
          •  Map-Reduce applications pick up where they left off
       –  Resource Manager
          •  No single point of failure - failover via ZooKeeper
          •  Application Masters are restarted automatically
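"Pick up where they left off" can be illustrated with a toy checkpoint. This is a minimal sketch, assuming the checkpoint is just the set of completed task names stored in a JSON file; the function `run_job`, the file layout, and the simulated crash are hypothetical and do not reflect how the Map-Reduce Application Master actually stores its state.

```python
import json
import os
import tempfile


def run_job(tasks, checkpoint_path, fail_after=None):
    """Run tasks in order, persisting the completed set after each one."""
    done = set()
    if os.path.exists(checkpoint_path):
        # A restarted master reloads what the previous attempt finished.
        with open(checkpoint_path) as f:
            done = set(json.load(f))
    executed = []
    for task in tasks:
        if task in done:
            continue  # already completed before the restart
        if fail_after is not None and len(executed) == fail_after:
            raise RuntimeError("simulated Application Master crash")
        executed.append(task)
        done.add(task)
        with open(checkpoint_path, "w") as f:
            json.dump(sorted(done), f)
    return executed


tasks = ["map-0", "map-1", "map-2", "reduce-0"]
path = os.path.join(tempfile.mkdtemp(), "job.ckpt")

try:
    run_job(tasks, path, fail_after=2)  # first attempt dies after two tasks
except RuntimeError:
    pass

resumed = run_job(tasks, path)  # second attempt skips the finished tasks
print(resumed)
```

The second attempt runs only `map-2` and `reduce-0`, because the first two tasks are already recorded in the checkpoint.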
  • 12. Improvements vis-à-vis current Map-Reduce
    •  Wire Compatibility
       –  Protocols are wire-compatible
       –  Old clients can talk to new servers
       –  Rolling upgrades
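One common way to let old clients talk to new servers is to make newly added message fields optional on the server and to have clients ignore fields they do not know. The sketch below shows that idea with JSON; the encoding, the field names (`job_name`, `priority`, `queue`), and both functions are assumptions for illustration and are not the protocol the slides refer to.

```python
import json


def server_v2_handle(raw):
    """A newer server: 'priority' is a field older clients never send."""
    msg = json.loads(raw)
    priority = msg.get("priority", "NORMAL")  # default when the field is absent
    return json.dumps({
        "job_name": msg["job_name"],
        "state": "ACCEPTED",
        "priority": priority,  # field an old client does not know about
        "queue": "default",    # likewise
    })


def client_v1_parse(raw):
    """An older client: reads only the fields it knows, ignores the rest."""
    msg = json.loads(raw)
    return {"job_name": msg["job_name"], "state": msg["state"]}


old_request = json.dumps({"job_name": "wordcount"})
reply = client_v1_parse(server_v2_handle(old_request))
print(reply)
```

Because neither side fails on missing or unknown fields, servers can be upgraded before clients, which is what makes rolling upgrades possible.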
  • 13. Improvements vis-à-vis current Map-Reduce
    •  Agility / Evolution
       –  Map-Reduce now becomes a user-land library
       –  Multiple versions of Map-Reduce can run in the same cluster (à la Apache Pig)
          •  Faster deployment cycles for improvements
       –  Customers upgrade Map-Reduce versions on their schedule
  • 14. Improvements vis-à-vis current Map-Reduce
    •  Utilization
       –  Generic resource model
          •  Memory
          •  CPU
          •  Disk b/w
          •  Network b/w
       –  Remove fixed partition of map and reduce slots
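Why removing the fixed map/reduce partition helps utilization can be seen in a small comparison. This is a sketch under assumed numbers: a node with 4 map slots and 2 reduce slots, or equivalently 12 GB of memory, and tasks needing 2 GB (map) or 4 GB (reduce). The function names and sizes are hypothetical, and only memory is modelled.

```python
def schedule_with_slots(pending, map_slots, reduce_slots):
    """Fixed partition: a map task can only use a map slot, and vice versa."""
    free = {"map": map_slots, "reduce": reduce_slots}
    started = []
    for kind in pending:
        if free[kind] > 0:
            free[kind] -= 1
            started.append(kind)
    return started


def schedule_with_resources(pending, node_memory_gb, need_gb):
    """Generic model: any task fits as long as enough memory is left."""
    free = node_memory_gb
    started = []
    for kind in pending:
        if free >= need_gb[kind]:
            free -= need_gb[kind]
            started.append(kind)
    return started


pending = ["map"] * 6  # map-heavy phase: no reduce tasks are ready yet
with_slots = schedule_with_slots(pending, map_slots=4, reduce_slots=2)
with_resources = schedule_with_resources(pending, 12, {"map": 2, "reduce": 4})
print(len(with_slots), len(with_resources))
```

With fixed slots only 4 of the 6 map tasks start while the 2 reduce slots sit idle; with the generic model all 6 fit into the same 12 GB.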
  • 15. Improvements vis-à-vis current Map-Reduce
    •  Support for programming paradigms other than Map-Reduce
       –  MPI
       –  Master-Worker
       –  Machine Learning
       –  Iterative processing
       –  Enabled by allowing use of paradigm-specific Application Master
       –  Run all on the same Hadoop cluster
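The idea of a paradigm-specific Application Master can be sketched as a common interface with one implementation per paradigm. The interface below (`container_requests`) and both subclasses are hypothetical and only illustrate that the cluster sees uniform container requests regardless of the paradigm behind them.

```python
from abc import ABC, abstractmethod


class ApplicationMaster(ABC):
    """Hypothetical interface: each paradigm states which containers it needs."""

    @abstractmethod
    def container_requests(self):
        """Return a list of (role, count) pairs."""


class MapReduceMaster(ApplicationMaster):
    def __init__(self, maps, reduces):
        self.maps, self.reduces = maps, reduces

    def container_requests(self):
        return [("map", self.maps), ("reduce", self.reduces)]


class MasterWorkerMaster(ApplicationMaster):
    def __init__(self, workers):
        self.workers = workers

    def container_requests(self):
        return [("master", 1), ("worker", self.workers)]


def total_containers(apps):
    """The cluster only counts containers; it does not care about the paradigm."""
    return sum(count for app in apps for _, count in app.container_requests())


apps = [MapReduceMaster(maps=4, reduces=2), MasterWorkerMaster(workers=3)]
total = total_containers(apps)
print(total)
```

Here a Map-Reduce job (6 containers) and a master-worker job (4 containers) share one cluster, for 10 containers in total.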
  • 16. Summary
    •  The next generation of Map-Reduce takes Hadoop to the next level
       –  Scale-out even further
       –  High availability
       –  Cluster Utilization
       –  Support for paradigms other than Map-Reduce
  • 17. Status
    •  Apache Hadoop 0.23 release is out
       –  HDFS Federation
       –  MRv2
    •  Currently undergoing tests on small scale ~ 500 nodes
    •  Alpha
       –  2000 nodes
       –  Q1 2012
    •  Beta/Production
       –  Variety of applications and loads
       –  4000+ nodes
       –  Q2 2012
  • 18. Questions?
    Follow me on @twitter: sharad_ag