Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulzar and Matteo Interlandi

Debugging Big Data Analytics in
Apache Spark with BigDebug
MATTEO INTERLANDI - MUHAMMAD ALI GULZAR

Open Source Data-Intensive Scalable Computing (DISC)
Platforms: Hadoop MapReduce and Spark
◦ Severe lack of debugging support in these systems
◦ Programs (i.e., queries, jobs) are batch executed / black boxes
So what to do?
◦ Trial and error debugging on subsample
◦ Post-mortem analysis of error logs
◦ Analyze physical view of the execution (a job id, failed node, etc).
Debugging Cloud Computing Programs

BigDebug Project Overview
Titian: Data Provenance for Fine-
Grained Tracing
[PVLDB 2016]
Vega: Incremental Computation for
Interactive Debugging
[SoCC 2016]
Collaboration with Tyson Condie, Miryung Kim, and Todd Millstein
BigDebug: Debugging Primitives
for Interactive Big Data Processing
in Spark
[ICSE 2016]
Automated Debugging in Data
Intensive Scalable Computing
Systems
[Under Submission]

Simulated Breakpoint1
BigDebug: Interactive Debugger Features

Guarded Watchpoint2

Guarded Watchpoint2
Crash Culprit Identification3

Guarded Watchpoint2
Backward Tracing4
$>Crash inducing input records:
9K23 Cruz TX 1440023645
2FSD Cruz KS 1440026456
9909 Cruz KS 1440023768

Guarded Watchpoint2
Backward Tracing4
Automated Fault Localization5
Test Fails
Split
Test PassesTest Fails
…
..
.

Feature 1:
Simulated
Breakpoint
Stage 0 Stage 1
map
simulated
breakpoint
groupByKey
map
map

Feature 1:
Simulated
Breakpoint
Simulated breakpoint enables user to inspect intermediate
program state without pausing the computation

Feature 2: On Demand
Guarded Watchpoint map
simulated
breakpoint
map
map
groupBy
Key
Stage 0
Stage 1
watch
point

Feature 2: On Demand
Guarded Watchpoint
A user can inspect intermediate data using a guard
and also update it on the fly

Feature 3: Crash Culprit
Identification and
Remediation
map
simulated
breakpoint
map
Stage 0
Stage 1
watch
point
map
groupBy
Key

Feature 3: Crash Culprit
Identification and
Remediation
A user can use BigDebug to identify the crashing records
and remediate from the failure

Feature 4: Backward Tracing
Data Provenance enables users to identify crash inducing
inputs records

Feature 5: Automated Fault Localization
Goal: Given a test function and set of failing results
◦ Identify the minimum set of input records that can reproduce the failure
After all is said
and done more is
said than done
than done
done more is
said
After all is
said and
After
all
is
said
and
done
more
is
said
than
done
(After,1)
(all,1)
(is,1)
(said,1)
(and,1)
(done,1)
(more,1)
(is,1)
(said,1)
(than,1)
(done,1)
(After,1)
(all,1)
(is,2)
(and,1)
(more,1)
(said,2)
(than,1)
(done,2)
After,1
all, 1
is, 2
and, 1
more, 1
said, 2
done, 2
than, 1
FlatMap Map ReduceByKey CollectTextFile
Task 1
Task 3
Task 1
Task 3
Task 2Task 2
Stage 2Stage 1

Goal: Given a test function and set of failing results
◦ Identify the minimum set of input records that can reproduce the failure
◦ We apply data provenance and delta debugging in tandem
After all is said
and done more is
said than done
than done
done more is
said
After all is
said and
After
all
is
said
and
done
more
is
said
than
done
(After,1)
(all,1)
(is,1)
(said,1)
(and,1)
(done,1)
(more,1)
(is,1)
(said,1)
(than,1)
(done,1)
(After,1)
(all,1)
(is,2)
(and,1)
(more,1)
(said,2)
(than,1)
(done,2)
After,1
all, 1
is, 2
and, 1
more, 1
said, 2
done, 2
than, 1
FlatMap Map ReduceByKey CollectTextFile
Task 1
Task 3
Task 1
Task 3
Task 2Task 2
Stage 2Stage 1

We apply data provenance and delta debugging [Zeller et al. ] in tandem

Test

Test Split

Test Split
In average BigDebug is able to localize faults within 63% of
the original job running time

….
Test Split

Running Example
val log = "s3n://xcr:wJY@ws/logs/enroll.log"
val text_file = sc.textFile(log)
text_file
.map{line=>(line.split()[2],line.split()[3])}
.map{t => (t._1 , getYears(t._2))}
.groupByKey()
.map(v => (v._1 , average(v._2)))
.collect()
1 Michael Sophomore 03/12/1996
2 Justin Freshman 05/01/1998
.. .. ..

Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulzar and Matteo Interlandi

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulzar and Matteo Interlandi

Similar to Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulzar and Matteo Interlandi (20)

More from Databricks

More from Databricks (20)

Recently uploaded

Recently uploaded (20)

Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulzar and Matteo Interlandi