2. Motivations
• Software systems lack adequate documentation
• Developers try to understand systems through
- Static analyses, visualizations built upon static data
- Dynamic analyses, requiring the execution of the system
• (Dynamic) concept identification
- Identify sets of method calls in execution traces responsible
for the implementation of domain concepts or user-observable
features
- Existing approaches based on static analysis [Anquetil and
Lethbridge (1998)], dynamic analysis [Wilde and Scully (1995),
Tonella and Ceccato (2004)], IR techniques [Poshyvanyk et
al. (2007)], or hybrid ones [Eaddy et al. (2008)]
CSMR 2010 - Madrid (Spain) 2
3. Proposed approach
A novel approach that analyzes execution traces and
groups together method calls that are:
(i) invoked in sequence
(ii) cohesive and decoupled from a conceptual point of view
Assumptions
Consider a feature being executed in a scenario
- e.g., "Open a Web page from a browser"
or "Save an image in a paint application"
The set of methods related to the feature is likely to be:
- (i) conceptually cohesive
- (ii) decoupled from those of other features
- (iii) sequentially invoked
4. Proposed approach
Step I - System instrumentation
Step II - Execution trace collection
Step III - Trace pruning and compression
Step IV - Textual analysis of methods' source code
Step V - Search-based concept identification
5. Step I and Step II - Getting Traces
Step I - System instrumentation
System instrumented using the MoDeC instrumentor
- MoDeC: a tool to extract and model sequence diagrams for
Java systems
Java bytecode instrumentation tool
- Inserts dedicated method invocations into the
system at method/constructor entry and exit points
- Allows for trace tagging
Step II - Execution trace collection
We exercise a system following operation sequences
taken from user manuals or use case descriptions
6. Step III - Trace Pruning and Compression
Removing methods that are not useful for feature identification
Methods occurring in many scenarios
- Are often utility methods
- We use the same idea as tf-idf in Information Retrieval
Too frequent methods
- Could, for example, be related to crosscutting concerns
- We remove methods having a frequency greater than
Q3 + 2 × IQR (75th percentile + 2 × the interquartile range)
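A minimal sketch of the frequency-based pruning rule, assuming a trace is simply a list of method names and that "frequency" means the number of calls to a method within that trace. The function name `prune_frequent` and the parameter `k` are illustrative, not part of the original tool chain.

```python
from collections import Counter
from statistics import quantiles


def prune_frequent(trace, k=2.0):
    """Drop calls to methods whose call count exceeds Q3 + k * IQR."""
    counts = Counter(trace)
    if len(counts) < 2:
        # quantiles() needs at least two data points; nothing to prune
        return list(trace)
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3
    q1, _, q3 = quantiles(list(counts.values()), n=4)
    threshold = q3 + k * (q3 - q1)
    return [method for method in trace if counts[method] <= threshold]
```

With a trace in which one logging method is called 100 times and every other method at most 3 times, only the logging calls are removed.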
Trace compression
Aim: collapse repetitions in execution traces
Purpose: reduce the search space for Step V
Examples:
- m1(); m1(); m1(); m1(); → m1();
- m1(); m2(); m1(); m2(); → m1(); m2();
Performed using Run Length Encoding (RLE)
Applied to sub-sequences of arbitrary length
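The compression step can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: it keeps one copy of each immediately repeated sub-sequence and discards the repetition count that a full RLE would store. The name `compress` and the parameter `max_len` are hypothetical.

```python
def compress(trace, max_len=None):
    """Collapse immediate repetitions of sub-sequences of length 1..max_len."""
    out = list(trace)
    limit = max_len if max_len is not None else len(out) // 2
    for size in range(1, limit + 1):
        i = 0
        while i + 2 * size <= len(out):
            if out[i:i + size] == out[i + size:i + 2 * size]:
                # the block is immediately repeated: drop the second copy
                del out[i + size:i + 2 * size]
            else:
                i += 1
    return out
```

Shorter repetitions are collapsed first, so a loop body that itself contains repeated calls is reduced before the loop as a whole is considered.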
7. Step IV
Conceptual cohesion and coupling determined according
to [Marcus et al., 2008] and [Poshyvanyk et al., 2006]
Index identifiers and comments contained in methods
Extraction of identifiers and comment words
Camel-case splitting of compound identifiers
Stop word removal (English + Java keywords)
Stemming using the Porter stemmer
Indexing using tf-idf
Reduce the term-document space into a (smaller) concept-document
space using Latent Semantic Indexing (LSI)
- Helps to cope with synonymy and homonymy
- Concept space size = 50
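A rough sketch of the first indexing stages (word extraction, camel-case splitting, stop word removal, tf-idf), using only the standard library. Porter stemming and the LSI projection are left out because they need external libraries. The stop list is a tiny illustrative sample, and the function names are assumptions.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list (assumption); a real one would cover
# English stop words and all Java keywords.
STOP_WORDS = {"the", "a", "of", "to", "and",
              "public", "void", "int", "return", "new", "this"}


def tokenize(source):
    """Extract identifier and comment words, split camel case, drop stop words."""
    words = []
    for ident in re.findall(r"[A-Za-z]+", source):
        # split camelCase / PascalCase, keeping acronyms such as "URL" together
        parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", ident)
        words.extend(part.lower() for part in parts)
    return [word for word in words if word not in STOP_WORDS]


def tf_idf(documents):
    """documents: {method name: token list} -> {method name: {term: weight}}."""
    n = len(documents)
    # document frequency: in how many methods each term occurs
    df = Counter(term for tokens in documents.values() for term in set(tokens))
    weights = {}
    for name, tokens in documents.items():
        tf = Counter(tokens)
        weights[name] = {t: tf[t] * math.log(n / df[t]) for t in tf}
    return weights
```

Each method body is treated as one document, so a term that appears in every method receives weight zero and does not contribute to similarity.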
8. Step V
We use a search-based optimization technique based on Genetic
Algorithms (GA) to split traces into segments
Representation: a bit-vector where 1 indicates the end of a segment
Trace splitting   m1 m2 m1 m3 m4 m1 m4 m6 m1
Representation    0  1  0  0  1  0  0  0  1
Mutation: randomly flips a bit (i.e., splits or merges segments)
[Figure: a bit vector before and after one bit flip]
Crossover: two-points
Parents            Offspring
0 1 0 0 1 0 0 0 1  0 1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0 1  0 0 1 0 1 0 0 0 1
Selection: Roulette Wheel
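The representation and the two operators can be sketched as below. The helper names are hypothetical. Keeping the final bit fixed during mutation is an assumption, made so that the last segment always stays closed.

```python
import random


def decode(trace, bits):
    """Split a trace into segments; a 1 marks the last call of a segment."""
    segments, current = [], []
    for call, bit in zip(trace, bits):
        current.append(call)
        if bit == 1:
            segments.append(current)
            current = []
    if current:  # tolerate a vector that does not end with 1
        segments.append(current)
    return segments


def mutate(bits, rng=random):
    """Flip one randomly chosen bit, which splits or merges segments."""
    child = list(bits)
    i = rng.randrange(len(child) - 1)  # assumption: the final 1 is never flipped
    child[i] ^= 1
    return child


def two_point_crossover(p1, p2, a, b):
    """Exchange the slice [a:b] between two parents."""
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]
```

With cut points after positions 4 and 6, the parents 0 1 0 0 1 0 0 0 1 and 0 0 1 0 0 1 0 0 1 yield 0 1 0 0 0 1 0 0 1 and 0 0 1 0 1 0 0 0 1.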
9. Step V â Quality of the Solution
Fitness Function:
Segment Cohesion is the average (textual) similarity
between any pair of methods in a segment
Segment Coupling is the average (textual) similarity
between a segment and all other segments in the trace
Other GA parameters
200 individuals
2,000 generations for JHotDraw and 3,000 for ArgoUML
5% mutation probability, 70% crossover probability
Distributed GA implementation (across 4 servers)
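A sketch of how segment cohesion and coupling could be computed from a pairwise similarity function `sim` (for example, cosine similarity in the LSI space). The slide text does not preserve the fitness formula itself, so the way the two measures are combined here (mean cohesion/coupling ratio over segments) is an assumption. Reading coupling as the average over method pairs across segments is also an interpretation.

```python
from itertools import combinations


def cohesion(segment, sim):
    """Average similarity over all pairs of calls inside one segment."""
    pairs = list(combinations(segment, 2))
    if not pairs:
        return 0.0  # assumption: a single-call segment has zero cohesion
    return sum(sim(a, b) for a, b in pairs) / len(pairs)


def coupling(index, segments, sim):
    """Average similarity between calls of one segment and all other calls."""
    inside = segments[index]
    outside = [m for k, seg in enumerate(segments) if k != index for m in seg]
    if not outside:
        return 0.0
    total = sum(sim(a, b) for a in inside for b in outside)
    return total / (len(inside) * len(outside))


def fitness(segments, sim, eps=1e-9):
    """Assumed combination: mean cohesion/coupling ratio over all segments."""
    ratios = [cohesion(seg, sim) / (coupling(i, segments, sim) + eps)
              for i, seg in enumerate(segments)]
    return sum(ratios) / len(ratios)
```

Under this reading, a segmentation that groups textually similar methods together and separates dissimilar ones scores higher than one that mixes them.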
10. Empirical Study
• Goal: analyze the novel concept location approach
• Purpose: evaluate its capability of identifying
meaningful concepts
• Quality focus: accuracy and completeness of the
identified concepts
• Context: an implementation of our approach and
execution traces extracted from two open source
systems, JHotDraw and ArgoUML
11. Research Questions
RQ1: How stable is the GA, across
multiple runs, when identifying concepts
in execution traces?
RQ2: To what extent do the identified
concepts match the ones in the oracle?
RQ3: How accurate is the identification of
concepts in execution traces?
12. RQ1: GA stability
We compute the overlap between segmentations
obtained in multiple runs using the Jaccard overlap
score
Two segments overlap when they contain calls at the same positions
of the trace
Because a segment of trace T1 may overlap with several segments of T2,
the highest similarity is chosen
Run 1    m1 m2 m1 m3 m4 m1 m4 m6 m1
Run 2    m1 m2 m1 m3 m4 m1 m4 m6 m1
Overlap  2/3 2/4 3/4
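The overlap computation can be sketched with segments represented as sets of trace positions. The segment boundaries of the two runs are not preserved in this text, so the boundaries below are hypothetical, chosen so that they reproduce the scores 2/3, 2/4 and 3/4 shown above.

```python
from fractions import Fraction


def jaccard(a, b):
    """Jaccard overlap of two segments given as sets of trace positions."""
    return Fraction(len(a & b), len(a | b))


def best_overlaps(run_a, run_b):
    """For each segment of run_a, keep its highest overlap with any segment of run_b."""
    return [max(jaccard(seg, other) for other in run_b) for seg in run_a]


# Hypothetical segmentations of the 9-call trace (positions 0..8)
run1 = [{0, 1}, {2, 3, 4}, {5, 6, 7, 8}]
run2 = [{0, 1, 2}, {3, 4, 5}, {6, 7, 8}]
scores = best_overlaps(run1, run2)  # 2/3, 2/4 (reduced to 1/2), 3/4
```

Averaging these per-segment scores gives a single stability figure for a pair of runs.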
13. RQ1: Results
Average overlap between 72% and 84%
Slightly higher convergence for ArgoUML
The algorithm is able to converge despite the
relatively large search space
14. RQ2: Matching with the Oracle
We manually tag the start and end of features while
executing the system
Using the MoDeC instrumentation tool
While executing the instrumented system, the user triggers the
introduction of <Start> and <Stop> tags in the trace
Matching between identified traces and oracle
computed as in RQ1
Run 1    m1 m2 m1 m3 m4 m1 m4 m6 m1
Oracle   m1 m2 m1 m3 m4 m1 m4 m6 m1
Overlap  2/3 2/4 3/4
15. RQ2: Results
High overlap for some features
e.g., Draw rectangle or Draw circle
Lower for features obtained by adapting other ones
e.g., Add text, obtained by adapting Draw rectangle
In other cases, low overlap is due to large segments
being split into smaller, more cohesive ones
16. RQ3: Accuracy in trace identification
Computed as in RQ2, but using
precision instead of the Jaccard overlap score
Run 1      m1 m2 m1 m3 m4 m1 m4 m6 m1
Oracle     m1 m2 m1 m3 m4 m1 m4 m6 m1
Precision  2/2 2/3 3/4
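A sketch of the precision variant, assuming precision is the fraction of calls in an identified segment that also belong to the best-matching oracle segment. As before, the segment boundaries are hypothetical and were chosen to reproduce the scores 2/2, 2/3 and 3/4 shown above.

```python
from fractions import Fraction


def precision(identified, oracle):
    """Share of the identified segment's positions that lie in the oracle segment."""
    return Fraction(len(identified & oracle), len(identified))


def best_precisions(identified_segments, oracle_segments):
    """For each identified segment, keep its highest precision against the oracle."""
    return [max(precision(seg, ref) for ref in oracle_segments)
            for seg in identified_segments]


# Hypothetical segmentations of the 9-call trace (positions 0..8)
identified = [{0, 1}, {2, 3, 4}, {5, 6, 7, 8}]
oracle = [{0, 1, 2}, {3, 4, 5}, {6, 7, 8}]
scores = best_precisions(identified, oracle)  # 2/2 (reduced to 1), 2/3, 3/4
```

Unlike the Jaccard score, precision does not penalize an identified segment for being smaller than the oracle feature, which fits the observation that the approach tends to split features into finer segments.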
17. RQ3: Results
Precision often very high
In most cases above 85% and often equal to 100%
Low precision (mean 32%) for Add text
Relatively low (mean 69%) for Draw rectangle
These two features are difficult to distinguish
18. Inspection of the obtained segments
Add class (ArgoUML)
The approach split this long feature, a sequence of 199 method calls, into 5 segments
related to sub-features (creation of objects, adding the project class, handling
namespace, setting object properties, handling persistence of the diagram)
Create note (ArgoUML)
Only the first part (50 methods) of the trace composed of 88 calls was identified
Problems related to multi-threading
Problems related to collapsing (during compression) loops containing variants
Cut rectangle (JHotDraw)
Only the last 39 out of 172 calls were included in the segment
Methods related to adding to the clipboard and showing the rectangle as "cut"
The first methods are related to GUI events and were split into many small segments
Spawn window (JHotDraw)
72 out of 197 methods included
The remaining ones were related to setting up menu command properties
19. Threats to Validity
Construct validity (relation between theory and observation)
Multi-threading can change the ordering of calls in multiple
executions of the same scenario
A better assessment of the actual content of the obtained
segments is needed
Internal validity (presence of confounding factors)
Trace tagging may be imprecise, again due to multi-threading
Noise due to utility methods
GA intrinsic randomness
External validity (generalization of findings)
We analyzed two different systems, multiple traces
As usual, further empirical evaluation is needed
20. Conclusions
We proposed a search-based approach to automatically locate
concepts in execution traces
By splitting traces into conceptually cohesive and decoupled segments
Empirical study on traces from JHotDraw and ArgoUML shows that
The approach is stable
Identified segments are highly precise
Splitting is finer than high-level features
Limitations due to multi-threading, GUI events, feature adaptation
Work-in-progress:
Improve performance
Use enhanced compression techniques
Automatically label identified concepts
Perform an extensive empirical validation
21. Thank You!
Questions?