Jiaqi Tan, Soila Pertet, Xinghao Pan, Mike Kasick, Keith Bare, Eugene Marinelli, Rajeev Gandhi, Priya Narasimhan (Carnegie Mellon University)
Automated Problem Diagnosis
Priya Narasimhan © Oct 25, 2009, Carnegie Mellon University
Challenges in Problem Analysis
Exploration of Fingerpointing
Why?
Hadoop Failure Survey (Targeted Failures: 66%)
Hadoop Mailing List Survey
M45 Job Performance
BEFORE: Hadoop Web Console
AFTER: Goals, Non-Goals
Target Hadoop Clusters
Performance Problems Studied
Studied Hadoop Issue Tracker (JIRA) from Jan-Dec 2007
Fault                 Description
Resource contention
  CPU hog             External process uses 70% of CPU
  Packet-loss         5% or 50% of incoming packets dropped
  Disk hog            20GB file repeatedly written to
  Disk full           Disk full
Application bugs (Source: Hadoop JIRA)
  HADOOP-1036         Maps hang due to unhandled exception
  HADOOP-1152         Reduces fail while copying map output
  HADOOP-2080         Reduces fail due to incorrect checksum
  HADOOP-2051         Jobs hang due to unhandled exception
  HADOOP-1255         Infinite loop at NameNode
Hadoop: Instrumentation
[Diagram: MASTER NODE (JobTracker, NameNode) and SLAVE NODES (TaskTracker, DataNode; Map/Reduce tasks, HDFS blocks). Hadoop logs and OS data are collected on both.]
How About Those Metrics?
Intuition for Diagnosis
Log-Analysis Approach
Applying SALSA to Hadoop
[Figure: timestamped log events ([t] Launch Map task, [t] Copy Map outputs, [t] Map task done). Control-flow view: state orders, durations. Data-flow view: transfer of data to other nodes (Map outputs to Reduce tasks on other nodes; incoming Map outputs for this Reduce task).]
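The control-flow view on this slide (state orders and durations derived from timestamped log events) can be illustrated with a minimal sketch. It assumes the log lines have already been parsed into tuples; the event names `LaunchMap` and `MapDone`, the sample data, and the function `state_durations` are hypothetical and are not taken from the talk or from Hadoop itself.

```python
# Hypothetical, already-parsed log events: (timestamp_seconds, task_id, event).
# Real Hadoop logs would need a parsing step that is not shown here.
EVENTS = [
    (0.0, "map_0", "LaunchMap"),
    (1.0, "map_1", "LaunchMap"),
    (4.0, "map_0", "MapDone"),
    (9.5, "map_1", "MapDone"),
]


def state_durations(events, start="LaunchMap", end="MapDone"):
    """Return {task_id: duration} for tasks with both a start and an end event."""
    started = {}
    durations = {}
    for ts, task, event in sorted(events):  # process in timestamp order
        if event == start:
            started[task] = ts
        elif event == end and task in started:
            durations[task] = ts - started.pop(task)
    return durations


durations = state_durations(EVENTS)
```

Tasks that never log an end event are simply left out of the result, which is one way a hung task could be noticed in such a view.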
Distributed Control+Data Flow (http://www.pdl.cmu.edu/)
Intuition: Peer Similarity
[Figure: histograms (distributions) of durations of WriteBlock over a 30-second window for one faulty node and two normal nodes; y-axis: normalized counts (total 1.0).]
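The peer-similarity idea, where nodes doing similar work should show similar duration histograms, can be sketched as follows. This is only an illustration under stated assumptions: the L1 distance, the comparison against a per-bin median histogram, the threshold of 0.5, the bin edges, and the sample durations are all choices made here, not details given on the slide.

```python
from statistics import median


def normalized_histogram(durations, bin_edges):
    """Bin durations into [edge_i, edge_i+1) and normalize counts to total 1.0."""
    counts = [0] * (len(bin_edges) - 1)
    for d in durations:
        for i in range(len(counts)):
            if bin_edges[i] <= d < bin_edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def l1_distance(p, q):
    """Sum of absolute per-bin differences (an assumed, simple distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def flag_outliers(node_durations, bin_edges, threshold=0.5):
    """Return nodes whose histogram is far from the per-bin median of all nodes."""
    hists = {n: normalized_histogram(d, bin_edges) for n, d in node_durations.items()}
    nbins = len(bin_edges) - 1
    median_hist = [median(h[i] for h in hists.values()) for i in range(nbins)]
    return sorted(n for n, h in hists.items() if l1_distance(h, median_hist) > threshold)


# Hypothetical WriteBlock durations (seconds) seen in one 30-second window.
WINDOW = {
    "node_a": [0.5, 0.6, 0.7, 1.2],
    "node_b": [0.4, 0.5, 0.9, 1.5],
    "node_c": [3.0, 5.0, 6.0, 7.5],
}
suspects = flag_outliers(WINDOW, bin_edges=[0, 1, 2, 4, 8])
```

Using the median rather than the mean keeps a single faulty node from pulling the reference histogram toward itself, as long as most peers are healthy.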
What Else Do We Do?
Putting the Elephant Together
[Diagram: BliMEy: Blind Men and the Elephant Framework [CMU-CS-09-135]. Elements: JobTracker heartbeat timestamps, TaskTracker heartbeat timestamps, JobTracker Durations views, TaskTracker Durations views, black-box resource usage, job-centric data flows.]
Visualization
Visualization (timeseries): DiskHog on slave node visible through lower heartbeat rate for that node
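The observation on this slide, a disk hog showing up as a lower heartbeat rate on one node, could be checked with a sketch along these lines. The timestamps, the 60-second window, and the rule "below half the median peer rate" are assumptions made for illustration; the talk does not specify a detection rule.

```python
from statistics import median


def heartbeat_rate(timestamps, window_start, window_end):
    """Heartbeats per second observed in [window_start, window_end)."""
    count = sum(1 for t in timestamps if window_start <= t < window_end)
    return count / (window_end - window_start)


def slow_nodes(node_timestamps, window_start, window_end, fraction=0.5):
    """Return nodes whose rate is below `fraction` of the median peer rate."""
    rates = {
        node: heartbeat_rate(ts, window_start, window_end)
        for node, ts in node_timestamps.items()
    }
    cutoff = fraction * median(rates.values())
    return sorted(node for node, rate in rates.items() if rate < cutoff)


# Hypothetical heartbeat timestamps (seconds) for three slave nodes.
HEARTBEATS = {
    "node_a": list(range(0, 60, 3)),   # one heartbeat every 3 s
    "node_b": list(range(0, 60, 3)),
    "node_c": list(range(0, 60, 10)),  # one heartbeat every 10 s
}
slow = slow_nodes(HEARTBEATS, window_start=0, window_end=60)
```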
Visualization (heatmaps): CPU Hog on node 1 visible on Map-task durations
Visualization (swimlanes): long-tailed map delaying overall job completion time
MIROS (Jiaqi Tan © July 09, http://www.pdl.cmu.edu/)
Current Developments
Briefly: Online Fingerpointing
Hard Problems
Summary
Contact: priya@cs.cmu.edu

Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Kürzlich hochgeladen

#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Kürzlich hochgeladen (20)

#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Hw09 Fingerpointing Sourcing Performance Issues

  • 1. Jiaqi Tan, Soila Pertet, Xinghao Pan, Mike Kasick, Keith Bare, Eugene Marinelli, Rajeev Gandhi, Priya Narasimhan. Carnegie Mellon University
  • 12. Performance Problems Studied. Studied Hadoop Issue Tracker (JIRA) from Jan-Dec 2007.
      Fault: Description
      Resource contention
        • CPU hog: External process uses 70% of CPU
        • Packet-loss: 5% or 50% of incoming packets dropped
        • Disk hog: 20 GB file repeatedly written to
        • Disk full: Disk full
      Application bugs (Source: Hadoop JIRA)
        • HADOOP-1036: Maps hang due to unhandled exception
        • HADOOP-1152: Reduces fail while copying map output
        • HADOOP-2080: Reduces fail due to incorrect checksum
        • HADOOP-2051: Jobs hang due to unhandled exception
        • HADOOP-1255: Infinite loop at NameNode
  • 13. Hadoop: Instrumentation (diagram)
      Master node: JobTracker, NameNode
      Slave nodes: TaskTracker, DataNode, Map/Reduce tasks, HDFS blocks
      Data collected on master and slave nodes: Hadoop logs, OS data
  • 22. Putting the Elephant Together. BliMEy: Blind Men and the Elephant Framework [CMU-CS-09-135]
      Diagram labels: TaskTracker heartbeat timestamps; JobTracker heartbeat timestamps; black-box resource usage; JobTracker durations views; TaskTracker durations views; job-centric data flows
  • 24. Visualization (timeseries): DiskHog on slave node visible through lower heartbeat rate for that node
  • 25. Visualization (heatmaps): CPU hog on node 1 visible in Map-task durations
  • 26. Visualization (swimlanes): Long-tailed map delaying overall job completion time
  • 32. priya@cs.cmu.edu Oct 25, 2009 Carnegie Mellon University
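
Slide 24 notes that a disk hog on a slave node shows up as a lower heartbeat rate for that node. The slides do not say how the comparison is computed, so the following is only a minimal sketch of one way to do it, assuming heartbeat timestamps have already been extracted per node. The function names, the median-based peer comparison, the 0.5 threshold, and the sample data are all hypothetical and are not taken from the authors' framework.

```python
from statistics import median


def heartbeat_rates(timestamps_by_node, window_seconds):
    """Heartbeats per second for each node over a fixed observation window."""
    return {
        node: len(timestamps) / window_seconds
        for node, timestamps in timestamps_by_node.items()
    }


def flag_slow_nodes(rates, threshold=0.5):
    """Return nodes whose rate is below `threshold` times the median peer rate.

    The 0.5 default is an arbitrary illustrative choice, not a tuned value.
    """
    peer_median = median(rates.values())
    return sorted(
        node for node, rate in rates.items() if rate < threshold * peer_median
    )


# Hypothetical heartbeat timestamps (in seconds) for four slave nodes.
sample = {
    "slave1": [0, 3, 6, 9, 12, 15, 18, 21, 24, 27],
    "slave2": [1, 4, 7, 10, 13, 16, 19, 22, 25, 28],
    "slave3": [2, 11, 20, 29],  # far fewer heartbeats in the same window
    "slave4": [0, 3, 6, 9, 12, 15, 18, 21, 24, 27],
}

rates = heartbeat_rates(sample, window_seconds=30)
slow = flag_slow_nodes(rates)
print(slow)
```

Comparing each node against the median of its peers, rather than against a fixed rate, keeps the check independent of the cluster's configured heartbeat interval; it does assume that most nodes are healthy at any given time.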

Editor's notes

  1. Quick mention verbally of what Hadoop is: Distributed parallel processing runtime with a master-slave architecture. Focus on limping-but-alive: performance degradations not caught by heartbeats
  2. Describe x and y axes