SlideShare ist ein Scribd-Unternehmen logo
1 von 32
Slow, Stuck, or Runaway Apps?
Learn How to Quickly Fix Problems
While Operationalizing Big Data Apps
Shivnath Babu
CTO @ Unravel Data
shivnath@unraveldata.com
About me
Shivnath Babu
Co-founder/CTO,
Unravel Data Systems
Adjunct Professor,
Duke University
Menlo Park, CA 94025
• R&D on Hadoop, Spark, NoSQL, streaming,
& MPP to simplify ongoing app/system
management
• Led work at Duke on first self-tuning Hadoop
platform: Starfish
• Awards from NSF, IBM, HP
• PhD, Stanford University
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
Content
• Challenges in operationalizing big data apps
• How we can improve state-of-the-art
• Real-life examples
Image source: http://bits2byte.blogspot.com/
Alerts
Insights
TYPICAL BIG DATA ENVIRONMENT
BI/SQL
ETL Pipeline
Big Data Apps
ML/Modeling
Pipeline
Data Cubes
Real-time
Stream
Processing
AI / Deep
Learning
Graph
Analytics
Image source: http://bits2byte.blogspot.com/
Alerts
Insights
TYPICAL BIG DATA ENVIRONMENT
Doesn’t Scale
FailedOperationalizing Big Data Apps is
Challenging! Slow
Stuck
Unpredictable /
unreliable
performance
Missed SLA
Rogue /
Contention
Events
Activity
Data
MLlib
Machine
Learning
Analytics
Business
Intelligence
Data
Services
RDBMS
Why is my pipeline slow?
Why is my app failing?
Devs
How do I ensure apps
are meeting SLAs?
How to maximize
resource utilization?
How to plan for growth?
Ops
Big Data Stack
Stack is Complex Expert DataOps
Wanted!
What can go wrong?
• Failures
• My query failed after 6 hours!
• What does this Exception mean?
What can go wrong?
• Failures
• My query failed after 6 hours!
• What does this Exception mean?
• Bad performance
• My app is very slow
• Pipeline is not meeting 4hr SLA
• Unreliable performance
• My app is stuck
• Latency is 3x worse today
• Poor scalability
• Oh, but it worked on the dev cluster!
• Bad App(le)s
• Tom’s query brought the cluster down
• Application Problems
• Poor joins/transformations
• Ineffective caching
• Bloated data structures
• Data/Storage Problems
• Skewed data, load imbalance
• Small files, poor data partitioning
• Configuration Problems
• Suboptimal container sizes
• Scheduler weight/capacity settings
• Resource Problems
• Resource contention
• Service degradation (ex: NameNode)
And why?
Network
Settings
Scheduler
Settings
Machine
Degradation
File Formats
Wrong Results
Data
Layout
Bad Joins
Config
Settings
Code Bugs
So Many Problems, So Little Time
DataOps
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
How do DataOps address this
problem today?
Look at Logs?
Logs in distributed systems are spread out, incomplete,
& usually very difficult to understand
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
There has to be a better way
Full Stack Performance Intelligence
HW HW HW
Hadoop Spark Kafka Cassandra Elasticsearch
MPP
Applications: ETL, BI/SQL, Data Pipelines, Streaming, ML
Cloud
Big Data Stack
Logs,Profiles,Metrics,Events
Cloud
Full Stack Performance Intelligence from 30k ft
ApplyPredictive
Analytics
Intelligence
needed by
DataOps
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
Why Full Stack?
• Because problems can happen all over the stack
o Otherwise, we will be blindsided and give wrong insights
• Because it is now possible to:
o Get full-stack telemetry data (high volume, velocity, & variety)
o Reuse distributed systems to process and store this data
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
What is “Intelligence”?
• Not just graphs and time-series charts
• And not simply throwing some AI/ML and seeing what comes out
Intelligence = Automation to Augment DataOps
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
Let us Dig Deeper
• We surveyed 250+ DataOps professionals across many verticals to
understand where and how intelligence can benefit them
• Use cases from this survey fall into three categories (aka the Three P’s)
1. I have a Problem that I need to fix
2. I want to be Proactive in detecting and fixing problems
3. I need to Plan for future use
Intelligence = Automation to Augment DataOps
1. Automated Diagnosis
2. Automated Remediation
DataOps Need Intelligence Needed
I have a problem
1. Automated Prediction
2. Automated What-if Analysis
I need to plan
Underutilized clusters
Low throughput
Unused datasets
Poor data layout1. Automated Detection
2. Automated Diagnosis
3. Automated Prevention/Remediation
I want to be proactive
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
Real-life Example: Proactive Alerts for SLA
Management
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unpredictable Pipeline Performance is Common
Duration
Anomal
y
Duration trending
upwards
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
Duration trending
upwards
Duration
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
I/O
I/O is steady
Bad
Good
Underutilized clusters
Low throughput
But, This is Just One Type of Contention
• At Resource Manager Level
• App admission time
• Container allocation for Application Master
• Container allocation for tasks
• Container allocation for Executor
• At Application Level
• Workflow Scheduler, e.g., Oozie
• Query Engine, e.g., HiveServer2
• At Master Daemon Level
• NameNode
• Hive MetaStore
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
Key Takeaways
Resource contention at different levels affects app performance
• Different apps (Oozie workflows, MapReduce, Spark, Tez) are affected differently
• Manual diagnosis can be hard and time-consuming
It is possible to diagnose and remedy such problems automatically
• By analyzing full-stack telemetry data
• By combining: Automated Baselining, Anomaly Detection, & Correlation Analysis
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
Real-life Problem: Hive/Spark App Failure
• My SQL query failed. Why?
• A MapReduce job failed. Why?
• A Task failed. Why?
• JVM went Out-of-Memory. Why?
• Data skew. Where?
• Reduce-side. Got it!
• How to Fix it?
1. At Resource layer, e.g., larger containers
2. At Configuration layer, e.g., turn on dynamic adaptation to skew
3. At Data layer, e.g., separate skewed keys from others
4. At App layer, e.g., filter skew keys or change algorithm
5. Some combination of the above
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
Real-life Planning: Migrate Apps to Cloud
• How to create perf baselines for on-
prem Vs. cloud comparison?
• What type of instances to get for
same performance on cloud?
• How many permanent Vs. spun-on-
demand instances are needed?
• Which configuration settings will need
tuning for on-prem Vs. cloud?
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
Real-life Planning: Migrate Apps to Cloud
Needs what-if
analysis on
full-stack data
Missed SLAs
Poor performance
Failed applications
Underutilized clusters
Low throughput
Unused datasets
Poor data layout
To Summarize
• Operationalizing big data apps is very challenging for DataOps
• Full Stack Performance Intelligence will augment DataOps to:
1. Deliver quick and high ROI on the Big Data Stack
2. Do more in less time
3. Help them sleep better

Weitere ähnliche Inhalte

Was ist angesagt?

Capital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 msCapital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 msApache Apex
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingApache Apex
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformApache Apex
 
Apache Apex Meetup at Cask
Apache Apex Meetup at CaskApache Apex Meetup at Cask
Apache Apex Meetup at CaskApache Apex
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache ApexApache Apex
 
Intro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataIntro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataApache Apex
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Apex
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Apache Apex
 
Java High Level Stream API
Java High Level Stream APIJava High Level Stream API
Java High Level Stream APIApache Apex
 
Ingesting Data from Kafka to JDBC with Transformation and Enrichment
Ingesting Data from Kafka to JDBC with Transformation and EnrichmentIngesting Data from Kafka to JDBC with Transformation and Enrichment
Ingesting Data from Kafka to JDBC with Transformation and EnrichmentApache Apex
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache ApexApache Apex
 
Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Apache Apex
 
Ingestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache ApexIngestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache ApexApache Apex
 
Low Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache ApexLow Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache ApexApache Apex
 
From Batch to Streaming with Apache Apex Dataworks Summit 2017
From Batch to Streaming with Apache Apex Dataworks Summit 2017From Batch to Streaming with Apache Apex Dataworks Summit 2017
From Batch to Streaming with Apache Apex Dataworks Summit 2017Apache Apex
 
Software architecture for data applications
Software architecture for data applicationsSoftware architecture for data applications
Software architecture for data applicationsDing Li
 
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)Apache Apex
 
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data TransformationsKafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data TransformationsApache Apex
 
Spark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad FeinbergSpark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad FeinbergSpark Summit
 

Was ist angesagt? (20)

Capital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 msCapital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 ms
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
 
Apache Apex Meetup at Cask
Apache Apex Meetup at CaskApache Apex Meetup at Cask
Apache Apex Meetup at Cask
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
Intro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big DataIntro to Apache Apex @ Women in Big Data
Intro to Apache Apex @ Women in Big Data
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
 
Java High Level Stream API
Java High Level Stream APIJava High Level Stream API
Java High Level Stream API
 
Ingesting Data from Kafka to JDBC with Transformation and Enrichment
Ingesting Data from Kafka to JDBC with Transformation and EnrichmentIngesting Data from Kafka to JDBC with Transformation and Enrichment
Ingesting Data from Kafka to JDBC with Transformation and Enrichment
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)
 
Ingestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache ApexIngestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache Apex
 
Low Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache ApexLow Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache Apex
 
Breaking data
Breaking dataBreaking data
Breaking data
 
From Batch to Streaming with Apache Apex Dataworks Summit 2017
From Batch to Streaming with Apache Apex Dataworks Summit 2017From Batch to Streaming with Apache Apex Dataworks Summit 2017
From Batch to Streaming with Apache Apex Dataworks Summit 2017
 
Software architecture for data applications
Software architecture for data applicationsSoftware architecture for data applications
Software architecture for data applications
 
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)
Intro to YARN (Hadoop 2.0) & Apex as YARN App (Next Gen Big Data)
 
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data TransformationsKafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
Kafka to Hadoop Ingest with Parsing, Dedup and other Big Data Transformations
 
Spark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad FeinbergSpark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad Feinberg
 

Andere mochten auch

February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsYahoo Developer Network
 
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...Yahoo Developer Network
 
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...Yahoo Developer Network
 
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector Yahoo Developer Network
 
November 2014 HUG: Apache Tez - A Performance View into Large Scale Data-proc...
November 2014 HUG: Apache Tez - A Performance View into Large Scale Data-proc...November 2014 HUG: Apache Tez - A Performance View into Large Scale Data-proc...
November 2014 HUG: Apache Tez - A Performance View into Large Scale Data-proc...Yahoo Developer Network
 
November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn
November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn
November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn Yahoo Developer Network
 
Hadoop @ Yahoo! - Internet Scale Data Processing
Hadoop @ Yahoo! - Internet Scale Data ProcessingHadoop @ Yahoo! - Internet Scale Data Processing
Hadoop @ Yahoo! - Internet Scale Data ProcessingYahoo Developer Network
 
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...Yahoo Developer Network
 
October 2016 HUG: The Pillars of Effective Data Archiving and Tiering in Hadoop
October 2016 HUG: The Pillars of Effective Data Archiving and Tiering in HadoopOctober 2016 HUG: The Pillars of Effective Data Archiving and Tiering in Hadoop
October 2016 HUG: The Pillars of Effective Data Archiving and Tiering in HadoopYahoo Developer Network
 
January 2015 HUG: Apache Flink: Fast and reliable large-scale data processing
January 2015 HUG: Apache Flink:  Fast and reliable large-scale data processingJanuary 2015 HUG: Apache Flink:  Fast and reliable large-scale data processing
January 2015 HUG: Apache Flink: Fast and reliable large-scale data processingYahoo Developer Network
 
February 2016 HUG: Apache Kudu (incubating): New Apache Hadoop Storage for Fa...
February 2016 HUG: Apache Kudu (incubating): New Apache Hadoop Storage for Fa...February 2016 HUG: Apache Kudu (incubating): New Apache Hadoop Storage for Fa...
February 2016 HUG: Apache Kudu (incubating): New Apache Hadoop Storage for Fa...Yahoo Developer Network
 
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...Yahoo Developer Network
 
Community detection (Поиск сообществ в графах)
Community detection (Поиск сообществ в графах)Community detection (Поиск сообществ в графах)
Community detection (Поиск сообществ в графах)Kirill Rybachuk
 
January 2015 HUG: Using HBase Co-Processors to Build a Distributed, Transacti...
January 2015 HUG: Using HBase Co-Processors to Build a Distributed, Transacti...January 2015 HUG: Using HBase Co-Processors to Build a Distributed, Transacti...
January 2015 HUG: Using HBase Co-Processors to Build a Distributed, Transacti...Yahoo Developer Network
 
February 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with DockerFebruary 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with DockerYahoo Developer Network
 
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...Yahoo Developer Network
 
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark ClustersApril 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark ClustersYahoo Developer Network
 
August 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache OozieAugust 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache OozieYahoo Developer Network
 

Andere mochten auch (19)

February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
 
XXL Graph Algorithms__HadoopSummit2010
XXL Graph Algorithms__HadoopSummit2010XXL Graph Algorithms__HadoopSummit2010
XXL Graph Algorithms__HadoopSummit2010
 
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
 
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
 
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
 
November 2014 HUG: Apache Tez - A Performance View into Large Scale Data-proc...
November 2014 HUG: Apache Tez - A Performance View into Large Scale Data-proc...November 2014 HUG: Apache Tez - A Performance View into Large Scale Data-proc...
November 2014 HUG: Apache Tez - A Performance View into Large Scale Data-proc...
 
November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn
November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn
November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn
 
Hadoop @ Yahoo! - Internet Scale Data Processing
Hadoop @ Yahoo! - Internet Scale Data ProcessingHadoop @ Yahoo! - Internet Scale Data Processing
Hadoop @ Yahoo! - Internet Scale Data Processing
 
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
 
October 2016 HUG: The Pillars of Effective Data Archiving and Tiering in Hadoop
October 2016 HUG: The Pillars of Effective Data Archiving and Tiering in HadoopOctober 2016 HUG: The Pillars of Effective Data Archiving and Tiering in Hadoop
October 2016 HUG: The Pillars of Effective Data Archiving and Tiering in Hadoop
 
January 2015 HUG: Apache Flink: Fast and reliable large-scale data processing
January 2015 HUG: Apache Flink:  Fast and reliable large-scale data processingJanuary 2015 HUG: Apache Flink:  Fast and reliable large-scale data processing
January 2015 HUG: Apache Flink: Fast and reliable large-scale data processing
 
February 2016 HUG: Apache Kudu (incubating): New Apache Hadoop Storage for Fa...
February 2016 HUG: Apache Kudu (incubating): New Apache Hadoop Storage for Fa...February 2016 HUG: Apache Kudu (incubating): New Apache Hadoop Storage for Fa...
February 2016 HUG: Apache Kudu (incubating): New Apache Hadoop Storage for Fa...
 
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...
February 2016 HUG: Apache Apex (incubating): Stream Processing Architecture a...
 
Community detection (Поиск сообществ в графах)
Community detection (Поиск сообществ в графах)Community detection (Поиск сообществ в графах)
Community detection (Поиск сообществ в графах)
 
January 2015 HUG: Using HBase Co-Processors to Build a Distributed, Transacti...
January 2015 HUG: Using HBase Co-Processors to Build a Distributed, Transacti...January 2015 HUG: Using HBase Co-Processors to Build a Distributed, Transacti...
January 2015 HUG: Using HBase Co-Processors to Build a Distributed, Transacti...
 
February 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with DockerFebruary 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with Docker
 
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
 
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark ClustersApril 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
 
August 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache OozieAugust 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache Oozie
 

Ähnlich wie February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Problems while Operationalizing Big Data Apps

Meeting Performance Goals in multi-tenant Hadoop Clusters
Meeting Performance Goals in multi-tenant Hadoop ClustersMeeting Performance Goals in multi-tenant Hadoop Clusters
Meeting Performance Goals in multi-tenant Hadoop ClustersDataWorks Summit/Hadoop Summit
 
LanceShivnathHadoopSummit2015
LanceShivnathHadoopSummit2015LanceShivnathHadoopSummit2015
LanceShivnathHadoopSummit2015Lance Co Ting Keh
 
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...confluent
 
Spark Application Development Made Easy
Spark Application Development Made EasySpark Application Development Made Easy
Spark Application Development Made EasyDataWorks Summit
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applicationsAmit Kejriwal
 
Blackboard DevCon 2011 - Developing B2 for Performance and Scalability
Blackboard DevCon 2011 - Developing B2 for Performance and ScalabilityBlackboard DevCon 2011 - Developing B2 for Performance and Scalability
Blackboard DevCon 2011 - Developing B2 for Performance and ScalabilityNoriaki Tatsumi
 
Better Visibility into Spark Execution for Faster Application Development-(S...
 Better Visibility into Spark Execution for Faster Application Development-(S... Better Visibility into Spark Execution for Faster Application Development-(S...
Better Visibility into Spark Execution for Faster Application Development-(S...Spark Summit
 
SparkApplicationDevMadeEasy_Spark_Summit_2015
SparkApplicationDevMadeEasy_Spark_Summit_2015SparkApplicationDevMadeEasy_Spark_Summit_2015
SparkApplicationDevMadeEasy_Spark_Summit_2015Lance Co Ting Keh
 
Relational data modeling trends for transactional applications
Relational data modeling trends for transactional applicationsRelational data modeling trends for transactional applications
Relational data modeling trends for transactional applicationsIke Ellis
 
Doing DevOps for Big Data? What You Need to Know About AIOps
Doing DevOps for Big Data? What You Need to Know About AIOpsDoing DevOps for Big Data? What You Need to Know About AIOps
Doing DevOps for Big Data? What You Need to Know About AIOpsDevOps.com
 
Performing Oracle Health Checks Using APEX
Performing Oracle Health Checks Using APEXPerforming Oracle Health Checks Using APEX
Performing Oracle Health Checks Using APEXDatavail
 
Understanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application QualityUnderstanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application QualityDevOps.com
 
Qiagram
QiagramQiagram
Qiagramjwppz
 
From Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender systemFrom Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender systemPierre Gutierrez
 
Measuring Data Quality with DataOps
Measuring Data Quality with DataOpsMeasuring Data Quality with DataOps
Measuring Data Quality with DataOpsSteven Ensslen
 
data science chapter-4,5,6
data science chapter-4,5,6data science chapter-4,5,6
data science chapter-4,5,6varshakumar21
 
التنقيب في البيانات - Data Mining
التنقيب في البيانات -  Data Miningالتنقيب في البيانات -  Data Mining
التنقيب في البيانات - Data Miningnabil_alsharafi
 
Ideas spracklen-final
Ideas spracklen-finalIdeas spracklen-final
Ideas spracklen-finalsupportlogic
 

Ähnlich wie February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Problems while Operationalizing Big Data Apps (20)

Meeting Performance Goals in multi-tenant Hadoop Clusters
Meeting Performance Goals in multi-tenant Hadoop ClustersMeeting Performance Goals in multi-tenant Hadoop Clusters
Meeting Performance Goals in multi-tenant Hadoop Clusters
 
LanceShivnathHadoopSummit2015
LanceShivnathHadoopSummit2015LanceShivnathHadoopSummit2015
LanceShivnathHadoopSummit2015
 
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
 
Spark Application Development Made Easy
Spark Application Development Made EasySpark Application Development Made Easy
Spark Application Development Made Easy
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applications
 
Blackboard DevCon 2011 - Developing B2 for Performance and Scalability
Blackboard DevCon 2011 - Developing B2 for Performance and ScalabilityBlackboard DevCon 2011 - Developing B2 for Performance and Scalability
Blackboard DevCon 2011 - Developing B2 for Performance and Scalability
 
Better Visibility into Spark Execution for Faster Application Development-(S...
 Better Visibility into Spark Execution for Faster Application Development-(S... Better Visibility into Spark Execution for Faster Application Development-(S...
Better Visibility into Spark Execution for Faster Application Development-(S...
 
SparkApplicationDevMadeEasy_Spark_Summit_2015
SparkApplicationDevMadeEasy_Spark_Summit_2015SparkApplicationDevMadeEasy_Spark_Summit_2015
SparkApplicationDevMadeEasy_Spark_Summit_2015
 
Relational data modeling trends for transactional applications
Relational data modeling trends for transactional applicationsRelational data modeling trends for transactional applications
Relational data modeling trends for transactional applications
 
Doing DevOps for Big Data? What You Need to Know About AIOps
Doing DevOps for Big Data? What You Need to Know About AIOpsDoing DevOps for Big Data? What You Need to Know About AIOps
Doing DevOps for Big Data? What You Need to Know About AIOps
 
Performing Oracle Health Checks Using APEX
Performing Oracle Health Checks Using APEXPerforming Oracle Health Checks Using APEX
Performing Oracle Health Checks Using APEX
 
Understanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application QualityUnderstanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application Quality
 
Qiagram
QiagramQiagram
Qiagram
 
From Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender systemFrom Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender system
 
Measuring Data Quality with DataOps
Measuring Data Quality with DataOpsMeasuring Data Quality with DataOps
Measuring Data Quality with DataOps
 
Data science unit2
Data science unit2Data science unit2
Data science unit2
 
data science chapter-4,5,6
data science chapter-4,5,6data science chapter-4,5,6
data science chapter-4,5,6
 
unit 1 big data.pptx
unit 1 big data.pptxunit 1 big data.pptx
unit 1 big data.pptx
 
التنقيب في البيانات - Data Mining
التنقيب في البيانات -  Data Miningالتنقيب في البيانات -  Data Mining
التنقيب في البيانات - Data Mining
 
Ideas spracklen-final
Ideas spracklen-finalIdeas spracklen-final
Ideas spracklen-final
 

Mehr von Yahoo Developer Network

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaYahoo Developer Network
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Yahoo Developer Network
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanYahoo Developer Network
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Yahoo Developer Network
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathYahoo Developer Network
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuYahoo Developer Network
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolYahoo Developer Network
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Yahoo Developer Network
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Yahoo Developer Network
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathYahoo Developer Network
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Yahoo Developer Network
 
Moving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathMoving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathYahoo Developer Network
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsYahoo Developer Network
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Yahoo Developer Network
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondYahoo Developer Network
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Yahoo Developer Network
 

Mehr von Yahoo Developer Network (17)

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
 
CICD at Oath using Screwdriver
CICD at Oath using ScrewdriverCICD at Oath using Screwdriver
CICD at Oath using Screwdriver
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
 
Moving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathMoving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, Oath
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI Applications
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step Beyond
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
 

Kürzlich hochgeladen

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 

Kürzlich hochgeladen (20)

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 

February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Problems while Operationalizing Big Data Apps

  • 1. Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Problems While Operationalizing Big Data Apps Shivnath Babu CTO @ Unravel Data shivnath@unraveldata.com
  • 2. About me Shivnath Babu Co-founder/CTO, Unravel Data Systems Adjunct Professor, Duke University Menlo Park, CA 94025 • R&D on Hadoop, Spark, NoSQL, streaming, & MPP to simplify ongoing app/system management • Led work at Duke on first self-tuning Hadoop platform: Starfish • Awards from NSF, IBM, HP • PhD, Stanford University
  • 3. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout Content • Challenges in operationalizing big data apps • How we can improve state-of-the-art • Real-life examples
  • 4. Image source: http://bits2byte.blogspot.com/ Alerts Insights TYPICAL BIG DATA ENVIRONMENT BI/SQL ETL Pipeline Big Data Apps ML/Modeling Pipeline Data Cubes Real-time Stream Processing AI / Deep Learning Graph Analytics
  • 5. Image source: http://bits2byte.blogspot.com/ Alerts Insights TYPICAL BIG DATA ENVIRONMENT Doesn’t Scale FailedOperationalizing Big Data Apps is Challenging! Slow Stuck Unpredictable / unreliable performance Missed SLA Rogue / Contention
  • 6. Events Activity Data MLlib Machine Learning Analytics Business Intelligence Data Services RDBMS Why is my pipeline slow? Why is my app failing? Devs How do I ensure apps are meeting SLAs? How to maximize resource utilization? How to plan for growth? Ops Big Data Stack Stack is Complex Expert DataOps Wanted!
  • 7. What can go wrong? • Failures • My query failed after 6 hours! • What does this Exception mean?
  • 8. What can go wrong? • Failures • My query failed after 6 hours! • What does this Exception mean? • Bad performance • My app is very slow • Pipeline is not meeting 4hr SLA • Unreliable performance • My app is stuck • Latency is 3x worse today • Poor scalability • Oh, but it worked on the dev cluster! • Bad App(le)s • Tom’s query brought the cluster down • Application Problems • Poor joins/transformations • Ineffective caching • Bloated data structures • Data/Storage Problems • Skewed data, load imbalance • Small files, poor data partitioning • Configuration Problems • Suboptimal container sizes • Scheduler weight/capacity settings • Resource Problems • Resource contention • Service degradation (ex: NameNode) And why?
  • 9. Network Settings Scheduler Settings Machine Degradation File Formats Wrong Results Data Layout Bad Joins Config Settings Code Bugs So Many Problems, So Little Time DataOps
  • 10. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout How do DataOps address this problem today?
  • 11. Look at Logs? Logs in distributed systems are spread out, incomplete, & usually very difficult to understand
  • 12.
  • 13.
  • 14. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout There has to be a better way Full Stack Performance Intelligence
  • 15. HW HW HW Hadoop Spark Kafka Cassandra Elasticsearch MPP Applications: ETL, BI/SQL, Data Pipelines, Streaming, ML Cloud Big Data Stack Logs,Profiles,Metrics,Events Cloud Full Stack Performance Intelligence from 30k ft ApplyPredictive Analytics Intelligence needed by DataOps
  • 16. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout Why Full Stack? • Because problems can happen all over the stack o Otherwise, we will be blindsided and give wrong insights • Because it is now possible to: o Get full-stack telemetry data (high volume, velocity, & variety) o Reuse distributed systems to process and store this data
  • 17. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout What is “Intelligence”? • Not just graphs and time-series charts • And not simply throwing some AI/ML and seeing what comes out Intelligence = Automation to Augment DataOps
  • 18. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout Let us Dig Deeper • We surveyed 250+ DataOps professionals across many verticals to understand where and how intelligence can benefit them • Use cases from this survey fall into three categories (aka the Three P’s) 1. I have a Problem that I need to fix 2. I want to be Proactive in detecting and fixing problems 3. I need to Plan for future use
  • 19. Intelligence = Automation to Augment DataOps 1. Automated Diagnosis 2. Automated Remediation DataOps Need Intelligence Needed I have a problem 1. Automated Prediction 2. Automated What-if Analysis I need to plan Underutilized clusters Low throughput Unused datasets Poor data layout1. Automated Detection 2. Automated Diagnosis 3. Automated Prevention/Remediation I want to be proactive
  • 20. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout Real-life Example: Proactive Alerts for SLA Management
  • 21. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unpredictable Pipeline Performance is Common Duration Anomal y Duration trending upwards
  • 22. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout Duration trending upwards Duration
  • 23. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout I/O I/O is steady
  • 25.
  • 26.
  • 27. Underutilized clusters Low throughput But, This is Just One Type of Contention • At Resource Manager Level • App admission time • Container allocation for Application Master • Container allocation for tasks • Container allocation for Executor • At Application Level • Workflow Scheduler, e.g., Oozie • Query Engine, e.g., HiveServer2 • At Master Daemon Level • NameNode • Hive MetaStore
  • 28. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout Key Takeaways Resource contention at different levels affects app performance • Different apps (Oozie workflows, MapReduce, Spark, Tez) are affected differently • Manual diagnosis can be hard and time-consuming It is possible to diagnose and remedy such problems automatically • By analyzing full-stack telemetry data • By combining: Automated Baselining, Anomaly Detection, & Correlation Analysis
  • 29. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout Real-life Problem: Hive/Spark App Failure • My SQL query failed. Why? • A MapReduce job failed. Why? • A Task failed. Why? • JVM went Out-of-Memory. Why? • Data skew. Where? • Reduce-side. Got it! • How to Fix it? 1. At Resource layer, e.g., larger containers 2. At Configuration layer, e.g., turn on dynamic adaptation to skew 3. At Data layer, e.g., separate skewed keys from others 4. At App layer, e.g., filter skew keys or change algorithm 5. Some combination of the above
  • 30. Underutilized clusters Low throughput Unused datasets Poor data layout Real-life Planning: Migrate Apps to Cloud • How to create perf baselines for on- prem Vs. cloud comparison? • What type of instances to get for same performance on cloud? • How many permanent Vs. spun-on- demand instances are needed? • Which configuration settings will need tuning for on-prem Vs. cloud?
  • 31. Underutilized clusters Low throughput Unused datasets Poor data layout Real-life Planning: Migrate Apps to Cloud Needs what-if analysis on full-stack data
  • 32. Missed SLAs Poor performance Failed applications Underutilized clusters Low throughput Unused datasets Poor data layout To Summarize • Operationalizing big data apps is very challenging for DataOps • Full Stack Performance Intelligence will augment DataOps to: 1. Deliver quick and high ROI on the Big Data Stack 2. Do more in less time 3. Help them sleep better