A Longitudinal Study of
Programmers’ Backtracking
YoungSeok Yoon
(youngseok@cs.cmu.edu)
Institute for Software Research
Carnegie Mellon University
Brad Myers
(bam@cs.cmu.edu)
Human-Computer Interaction Institute
Carnegie Mellon University
Background	

VL/HCC 2014 2
What is Backtracking?
•  Reverting code fragments to an earlier state
•  Examples
– Reverting a parameter to a previously used value
– Removing debugging statements after fixing a bug
– Restoring some deleted code
– …
Previous Studies of Backtracking
•  Two qualitative studies of backtracking
[Yoon+, CHASE’12]
1.  Preliminary lab study (12 programmers)
2.  Online survey (48 respondents)
•  Observation
– Programmers face challenges when backtracking
•  locating the right code to be backtracked
•  restoring some deleted code correctly
•  reverting inter-related code fragments together
– Programmers backtrack relatively often
(75% answered at least “sometimes”)
Limitations of the Previous Studies
•  Lab study tasks required participants to backtrack
•  Survey results may not correctly reflect the reality
(e.g., programmers might backtrack unconsciously)
•  The analyses were mostly qualitative
As a follow-up: A Longitudinal Study of Backtracking
Longitudinal Study of Backtracking
•  Two main goals
– Obtain backtracking statistics in order to quantify
the need for backtracking tools
– Identify backtracking situations that are not very
well supported by existing programming tools
Data Collection – Fluorite Logger
http://www.cs.cmu.edu/~fluorite/	
•  Eclipse logger for fine-grained code
editing data [Yoon+, PLATEAU’11]
•  Information collected:
–  Initial snapshot of each source file
–  All edit operations (insert, delete, or replace)
–  Timestamps, executed editor commands, etc.
•  Distributed to programmers since April 2012
[Image Src:Attribution: Rob Lavinsky, iRocks.com - CC-BY-SA-3.0]
Study Participants
Group Description                                 No. of Participants   Coding Time (hours) [min / avg / max / sum]
The first author (myself)                         1                     294 / 294 / 294 / 294
Graduate students @ CMU                           13                    3 / 40 / 216 / 520
Research programmers / System scientists @ CMU    5                     6 / 118 / 446 / 588
Graduate students @ UPitt                         2                     6 / 29 / 51 / 57
Total                                             21 people             1,460 hours
Analysis Process
•  The data was too big for manual inspection
– 1,345,241 coding events in the logs
•  Key idea of the automated analysis
– Keep the evolution history of individual AST nodes
of interest throughout the lifetime of the nodes
– Detect backtracking instances within each node
Analysis Process Illustrated
[Example Source Code Being Processed]
package example;

public class Example {

    public void printRectangleInfo() {
        Rectangle rect = getEnclosingRect();      // S1
        int value = rect.getHeight();             // S2
        System.out.println("Value:" + value);     // S3
    }

    public Rectangle getEnclosingRect() {
        // return some rectangle here
        // actual code omitted
        // ...
    }

}

[Memory of the Analyzer]
Change history of S1
[v1] Rectangle rect = getEnclosingRect();
Change history of S2
[v1] int value = rect.getHeight();
[v2] int value = rect.getWidth();
[v3] int value = rect.getSize();
[v4] int value = rect.getHeight();    <- Backtracking Detected! (same as v1)
Change history of S3
[v1] System.out.println(value);
[v2] System.out.println("Value:" + value);
Backtracking Instance
Version   v1            v2           v3          v4            v5           v6            (time →)
State     A             B            C           A             B            A
Code      getHeight();  getWidth();  getSize();  getHeight();  getWidth();  getHeight();
Three Backtracking Instances:
•  v1..v4
•  v2..v5
•  v4..v6
NOTE: v1..v6 is NOT a backtracking instance
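The pairing rule in this example can be sketched in a few lines of Java. This is a minimal sketch under one assumption that the slides do not spell out: each version is paired with the most recent earlier version that has identical content, which would explain why v6 pairs with v4 rather than v1. The class and method names are hypothetical and are not taken from the study's analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names are illustrative, not the study's actual analyzer.
class BacktrackingDetector {

    /**
     * Takes the successive versions of one tracked node and returns each
     * backtracking instance as {startVersion, endVersion}, numbered from 1.
     */
    static List<int[]> detect(List<String> versions) {
        List<int[]> instances = new ArrayList<>();
        Map<String, Integer> lastSeen = new HashMap<>(); // content -> latest version number
        for (int v = 1; v <= versions.size(); v++) {
            String content = versions.get(v - 1);
            Integer earlier = lastSeen.get(content);
            // Same content as an earlier version, with at least one other version in between
            if (earlier != null && v - earlier > 1) {
                instances.add(new int[] {earlier, v});
            }
            lastSeen.put(content, v);
        }
        return instances;
    }

    public static void main(String[] args) {
        List<String> history = Arrays.asList(
            "getHeight();", "getWidth();", "getSize();",
            "getHeight();", "getWidth();", "getHeight();");
        for (int[] instance : detect(history)) {
            System.out.println("v" + instance[0] + "..v" + instance[1]);
        }
        // prints v1..v4, v2..v5, v4..v6 (one per line)
    }
}
```

Because only the latest occurrence of each content is remembered, v1..v6 is never reported, which matches the note above.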
Research Questions
1.  How frequently do programmers backtrack in reality?
2.  How large are the backtrackings?
3.  How exactly do programmers perform backtracking?
Are they backtracking manually?
4.  Is there evidence of exploratory programming?
5.  Are there backtrackings performed across multiple editing
sessions?
6.  Are there selective backtrackings, which cannot be
performed by the undo command?
7.  Do programmers backtrack to the same code repeatedly?
1. Frequency of Backtracking
“How frequently do programmers backtrack in reality?”
•  A total of 15,095
backtracking instances
detected
•  10.3 instances/hour
on average
[Figure: bar chart of Backtracking Instances per Hour for each participant (P0 to P20); x-axis from 0 to 30; 3.8 (min), 28.4 (max), Average: 10.3/h]
Rate varied across participants (min=3.8/h, max=28.4/h), but all of them backtracked frequently
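As a rough consistency check, the reported average agrees with the total instance count divided by the total coding time from the Study Participants table. This assumes the average is pooled over all logged hours rather than a mean of per-participant rates, which the slides do not state.

```latex
\[
\frac{15{,}095\ \text{instances}}{1{,}460\ \text{hours}} \approx 10.3\ \text{instances/hour}
\]
```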
2. Size of Backtracking
“How large are the backtrackings?”
•  How did we define the size of a backtracking?
–  Measured the edit distance (Levenshtein distance) between the original
version and each of the other versions
–  Took the maximum value as the size of the backtracking instance
[Diagram: versions v1 to v6 over time with states A B C D E A; v1 is the original version; forward changes lead to the farthest version (max edit distance), and backward changes lead back to A at v6]
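The size measure described above can be sketched as follows. This is a minimal sketch, assuming a standard character-level Levenshtein distance with unit costs; the slides do not say which variant the analysis used. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names are illustrative, not the study's actual analyzer.
class BacktrackingSize {

    /** Levenshtein distance where insert, delete, and substitute each cost 1. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Size of one backtracking instance: the largest edit distance between the
     * original version (first element) and any version in the instance.
     */
    static int size(List<String> versions) {
        String original = versions.get(0);
        int max = 0;
        for (String version : versions) {
            max = Math.max(max, levenshtein(original, version));
        }
        return max;
    }

    public static void main(String[] args) {
        // Made-up contents standing in for the states A B C D E A
        List<String> instance = Arrays.asList("x", "xy", "xyzw", "xyz", "x1", "x");
        System.out.println(size(instance)); // prints 3 ("xyzw" is the farthest version)
    }
}
```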
Backtracking Size (No. of Characters)    Number of Backtracking Instances
1                                        1304
2-9                                      3752
10-49                                    5269
50-99                                    2026
100-499                                  2259
500-999                                  265
≥1000                                    220
Examples, from smaller to larger backtrackings:
•  Method / variable names
•  String literals
•  Number literals
•  Simple parameter changes
•  Reverting renaming changes on methods or variables
•  Single statement changes
•  Surrounding existing code (e.g., try-catch) then reverting
•  Adding, removing, or modifying multiple statements and then reverting them altogether
•  Significant algorithmic changes
•  Adding / removing / modifying multiple methods and then reverting
Programmers backtrack at varying granularities, from simple name changes to significant algorithmic changes
3. Backtracking Tactics
“How exactly do programmers perform backtracking?”
How were the backtrackings performed?
•  Using Existing Tools: 49%
–  Undo (37%)
–  Paste (6%)
–  Redo (3%)
–  Content Assist (2%)
–  Toggle Comment (1%)
•  Manually: 38%
–  Manual Deletion (25%)
–  Manual Typing (13%)
•  Others: 13%
–  Unidentified (9%)
–  Multiple (4%)
38% of the backtracking instances were NOT supported by existing tools, indicating programmers need better backtracking tools
4. Cross-Run Backtracking
“Is there evidence of exploratory programming?”
•  Make some changes → run the application → revert the code
back to the way it was before
•  20.4% of all instances were cross-run instances on average.
[Figure: bar chart of Cross-Run Backtracking Percentage for each participant (P0 to P20); x-axis from 0% to 50%; Average: 20.4%]
This supports the view that programmers do this kind of
exploratory programming.
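A minimal sketch of how an instance might be flagged as cross-run, assuming each instance records when the code diverged from the original version and when it was restored, and that application launches are logged with timestamps. The class and parameter names are hypothetical; the slides do not describe the analyzer's actual check.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names and the timestamp model are illustrative assumptions.
class CrossRunCheck {

    /**
     * True if the application was run at least once after the code diverged
     * from the original version and before the original version was restored.
     */
    static boolean isCrossRun(long divergedAt, long restoredAt, List<Long> runTimestamps) {
        for (long run : runTimestamps) {
            if (run > divergedAt && run < restoredAt) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Long> runs = Arrays.asList(5L, 30L);
        System.out.println(isCrossRun(10L, 50L, runs)); // prints true (the run at 30 falls inside)
        System.out.println(isCrossRun(40L, 50L, runs)); // prints false
    }
}
```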
5. Cross-Session Backtracking
“Are there backtrackings performed across multiple editing sessions?”
Editing Session Distance    Cumulative Percentage of All BIs (backtracking instances)
Same Session                96.7%
≤1                          98.2%
≤2                          98.8%
≤3                          99.0%
≤4                          99.2%
≤5                          99.3%
A backtracking tool would
work for 97% of the cases
with only the history within
the same editing session.
6. Selective Backtracking
“Are there backtrackings that could not have been done by regular undo?”
•  Selective backtracking?
–  There are edits in the middle of a backtracking that change
other parts of the same file and that are not backtracked together
[Figure: bar chart of Selective Backtracking Percentage for each participant (P0 to P20); x-axis from 0% to 20%; Average: 9.5%]
On average, 9.5% of all backtracking instances were
selective, suggesting that programmers need better
selective backtracking tools
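One way to read the definition above is sketched below. It rests on a simplifying assumption: the file is modeled as a map from node id to content, captured once when the target node held its original version and once right after it was restored. An instance counts as selective if any other node differs between the two snapshots. The names are hypothetical, and the example supposes that S3 in the earlier analyzer illustration was edited while S2 was being changed and reverted, which the slides do not state.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch: the snapshot model and all names are illustrative assumptions.
class SelectiveBacktrackingCheck {

    /**
     * @param target     id of the node that was backtracked
     * @param atOriginal node id -> content when the target held its original version
     * @param atRestore  node id -> content right after the target was restored
     * @return true if some other node changed in between and was not reverted
     */
    static boolean isSelective(String target,
                               Map<String, String> atOriginal,
                               Map<String, String> atRestore) {
        Set<String> ids = new HashSet<>(atOriginal.keySet());
        ids.addAll(atRestore.keySet());
        for (String id : ids) {
            if (id.equals(target)) {
                continue; // the backtracked node itself is expected to match
            }
            if (!Objects.equals(atOriginal.get(id), atRestore.get(id))) {
                return true; // a linear undo would have reverted this edit as well
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> atOriginal = new LinkedHashMap<>();
        atOriginal.put("S2", "int value = rect.getHeight();");
        atOriginal.put("S3", "System.out.println(value);");

        Map<String, String> atRestore = new LinkedHashMap<>();
        atRestore.put("S2", "int value = rect.getHeight();");
        atRestore.put("S3", "System.out.println(\"Value:\" + value);");

        System.out.println(isSelective("S2", atOriginal, atRestore)); // prints true
    }
}
```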
7. Repeat Count
“Do programmers backtrack to the same code repeatedly?”
Repeat Count    Percentage of Backtracked Nodes
1               85.0%
2               11.1%
3               2.7%
4               0.7%
≥5              0.6%
Most (85%) of the time, programmers backtrack once
and then never get back to the same state after
diverging from it
Wrapping Up	

Limitations of the Analysis
•  Only exact and successful backtracking instances were
detected
•  Only for Java / Eclipse
•  Could not determine the semantic relationships
among the backtracking instances
Main Takeaways
•  Programmers backtrack quite frequently (10.3/hr)
•  38% of the backtrackings are done purely manually
•  9.5% of the backtrackings are selective, meaning that
they are not supported by conventional undo
•  Programmers would benefit from better
backtracking tools!
Azurite – Selective Undo Tool
http://www.cs.cmu.edu/~azurite/
•  A selective undo plug-in for Eclipse IDE
–  can handle the 9.5% of selective backtrackings
•  Presented at VL/HCC
–  Initial User Interfaces of the Tool:
Yoon, Myers, & Koo, “Visualization of Fine-Grained Code Change History”,
Full Paper at VL/HCC’13
–  Tool Demonstration (yesterday):
Yoon & Myers, “A Demonstration of Azurite: Backtracking Tool for Programmers”,
Showpiece at VL/HCC’14
[Image Src:Attribution: cobalt, flickr.com - CC-BY-SA-2.0 ]
Thank You!
•  FLUORITE: A logging plug-in for Eclipse
(Full of Low-level User Operations Recorded In The Editor)
available at: http://www.cs.cmu.edu/~fluorite/
•  AZURITE: A selective undo plug-in for Eclipse
(Adding Zest to Undoing and Restoring Improves Textual Exploration)
available at: http://www.cs.cmu.edu/~azurite/
•  Thanks for funding from: