Data Mining-based Tools to Support Library Update
Oleksandr ZAITSEV
PhD Thesis Defence
Composition of the jury:
Reviewers:
Olga KOUCHNARENKO
Examiner:
Romain ROBBES
Coen DE ROOVER
Stéphane DUCASSE
Nicolas ANQUETIL
Supervisors:
Arnaud THIEFAINE
Co-supervisor:
2
(software company) (research institute)
Advisor: Arnaud THIEFAINE Supervisors: Stéphane DUCASSE
Nicolas ANQUETIL
Paris Lille
Cifre PhD
3
Arolla is a consulting company specialised in
the advanced techniques of software development:
Clean Code, TDD, BDD, Legacy Remediation, etc.
https://www.arolla.fr/
Introduction
Part 1:
5
Library Update Problem
Client
System
Library
v1.0
Client
Developer
Library
Developer
depends
Part 1 / 6
6
Client
System
Library
v1.0
Library
v2.0
Client
Developer
Library
Developer
depends
Library Evolution
Library Update Problem
Part 1 / 6
7
Client
System
Library
v1.0
Library
v2.0
Client
Developer
Library
Developer
depends
Library Evolution
! Errors!
Library Update Problem
Part 1 / 6
8
Updated
Client
System
Client
System
Library
v1.0
Library
v2.0
Client
Developer
Library
Developer
depends
Library Evolution
Library Update
Library Update Problem
Part 1 / 6
9
How to change client code
to use the new version of a library?
Supporting developers during library update
by building tools
My thesis is about:
Library update problem:
Scope of my Thesis
Part 1 / 6
10
How to use the new version of the same library?
How to use a different (although similar) library?
Library Update
Disambiguation
Library Migration
v1.0 v2.0
A B
(e.g., update Struts v1.0 to Struts v1.2)
(e.g., replace EasyMock with Mockito)
Part 1 / 6
11
Motivating Example
from ailib.models import LinearRegression
from ailib.data import readData
data = readData('dataset.csv', type='CSV')
model = LinearRegression()
model.train(data, ycolumn='salary')
salary = model.predict([26, 'female'])
ailib v1.0
Client
Developer
depends
Part 1 / 6
12
Motivating Example
from ailib.models import LinearRegression
from ailib.data import readData
data = readData('dataset.csv', type='CSV')
model = LinearRegression()
model.train(data, ycolumn='salary')
salary = model.predict([26, 'female'])
ailib v2.0
Client
Developer
depends
Part 1 / 6
! Error: LinearRegression not found
! Error: readData() not found
! Error: wrong argument passed to predict()
?
13
Motivating Example
ailib v2.0
Client
Developer
depends
Part 1 / 6
1. Rename LinearRegression to AILinearRegression
2. Replace readData(file, type) with readCsv(file)
3. predict() now accepts a list of rows instead of a single row
Library
Developer
from ailib.models import AILinearRegression
from ailib.data import readCsv
data = readCsv('dataset.csv')
model = AILinearRegression()
model.train(data, ycolumn='salary')
salary = model.predict([[26, 'female']])
communicate
14
Understand library
update in practice:
‣ What problems do
developers face?
‣ What support do
they need?
Propose a language to
express code
transformations
Automatic
transformations
discovery
3 Aspects of the Problem
Empirical Language Automation
Part 1 / 6
15
Pharo is a pure object-oriented
dynamically-typed programming
language.
We focus on Pharo because:
1. We have access to its
core developers
2. Pharo is convenient for
manipulating source code
Pharo Programming Language
Part 1 / 6
1. Introduction
2. State of the Art
3. Developer Survey of Library Update
4. Deprewriter: Smart deprecation rewriting
5. DepMiner: Recommending transformation rules
6. Conclusion
Plan
State of the Art
Part 2:
18
Technique Perspective
Client developer
Library developer
Both
Developer survey
Code analysis
Empirical Studies
Part 2 / 6
19
Paper            Survey  Code Analysis  Client Dev.  Library Dev.
[Robbes 2012a]   X       ✓              ✓            X
[Jezek 2015]     X       ✓              ✓            ✓
[Hora 2015]      X       ✓              ✓            X
[Bogart 2016]    ✓       X              ✓            ✓
[Sawant 2016]    X       ✓              ✓            X
[Xavier 2017a]   X       ✓              ✓            ✓
[Xavier 2017b]   ✓       X              X            ✓
[Hora 2018]      X       ✓              ✓            X
[Kula 2018a]     ✓       ✓              ✓            X
[Kula 2018b]     X       ✓              X            ✓
[Brito 2019]     ✓       X              ✓            ✓
Empirical Studies
Part 2 / 6
20
Shortcomings:
1. Most surveys analyse specific cases of breaking changes.
2. No survey has asked client developers about what makes
library update hard and what makes it easy.
Empirical Studies
Part 2 / 6
21
Problem Perspective
Client developer
Library developer
Both
Library update
Library migration
v1.0 v2.0
A B
Tools to Support Library Update
Part 2 / 6
22
Sources of information:
exp: developer expertise
doc: documentation
hist: commit history
2v: two versions of source code
Subject of the source, given in parentheses:
L: library
C: already updated clients
T: unit tests
Technique:
TS: textual similarity
SS: structural similarity
CD: call dependency
Tools to Support Library Update
Part 2 / 6
23
Paper            Update  Library dev.  Source           Technique   Dynamic Rewriting
[Chow 1996]      ✓       ✓             exp(L)           -           X
[Henkel 2005]    ✓       ✓             exp(L)           -           X
[Kim 2007]       ✓       -             2v(L)            TS          X
[Xing 2007]      ✓       X             2v(L)            SS, TS      X
[Dagenais 2008]  ✓       X             hist(L)          CD          X
[Schäfer 2008]   ✓       X             2v(C,T)          CD          X
[Wu 2010]        ✓       X             2v(L)            CD, TS      X
[Nguyen 2010]    ✓       X             2v(L,C,T)        CD, TS, SS  X
[Meng 2012]      ✓       X             hist(L)          CD          X
[Teyton 2013]    X       X             hist(L)          CD          X
[Hora 2014]      ✓       X             hist(L)          CD          ✓
[Pandita 2015]   X       X             2v(L)            TS          X
[Alrubaye 2019]  X       X             hist(C), doc(L)  CD, TS      X
[Alrubaye 2020]  X       X             doc(L)           CD, TS      X
Tools to Support Library Update
Part 2 / 6
24
Shortcomings:
1. No studies propose automated tools to support library developers.
2. Most studies do not consider dynamic rewriting.
3. Most studies focus on statically-typed programming languages.
Tools to Support Library Update
Part 2 / 6
(empirical aspect)
Developer Survey
Part 3:
O. Zaitsev, S. Ducasse, N. Anquetil,
and A. Thiefaine. How Libraries Evolve:
A Survey of Two Industrial Companies
and an Open-Source Community.
APSEC (industrial track), 2022.
26
Updated
Client
System
Client
System
Library
v1.0
Library
v2.0
Client
Developer
Library
Developer
depends
Understanding the Developers
Part 3 / 6
27
Updated
Client
System
Client
System
Library
v1.0
Library
v2.0
Client
Developer
Library
Developer
depends
‣ How do they behave
in the context of
library update?
‣ What problems do
they face?
‣ What support do
they need?
Understanding the Developers
Part 3 / 6
28
Open-Source Industry
Library developers 11 5
Client Developers 22 9
We selected
developers from
diverse communities
for our survey
Developer Population
Part 3 / 6
29
Three times a year or more often
Twice a year
Once a year
Less often
We do not do it regularly
Q: How often do they face the problem of library update?
Selected Findings
17
10
3
3
3
Part 3 / 6
Client
Developers
30
Selected Findings
Part 3 / 6
Q: What makes library update easy?
Factor                            dev.
Documentation                     15
Absence of breaking changes       11
Test coverage                     6
Tool support                      5
Deprecations                      4
Simple breaking changes           4
Community support                 3
Q: What makes it hard?
Factor                            dev.
Breaking changes                  11
Absent or bad documentation       10
Indirect dependencies             7
Big changes to the API            7
Poor test coverage                4
Removed functionality             3
Changed hooks or abstract classes 3
Client
Developers
31
Very small impact
Small impact
Moderate impact
Big impact
Very big impact
Q: What is the impact of breaking changes
on their clients?
Selected Findings
1
0
9
6
2
Part 3 / 6
Library
Developers
32
Q: Is it important for library developers to encourage
their clients to update?
Selected Findings
Not important at all
Of little importance
Of average importance
Very important
Absolutely essential
0
1
5
8
4
Part 3 / 6
Library
Developers
33
Library
Developers
Client
Developers
‣ Often have to deal with the
problem of library update
‣ Need documentation and support
for breaking changes
‣ Want to help their clients
to update
Survey Conclusion
Part 3 / 6
(language aspect)
Deprewriter.
On-the-fly rewriting
method deprecations
Part 4:
S. Ducasse, G. Polito, O. Zaitsev, M.
Denker, and P. Tesone. Deprewriter: On
the fly rewriting method deprecations.
JOT, 2021.
35
Deprecation
Deprecation message
Method Deprecations in Pharo
isSpecial
self deprecated: 'Renamed to #needsFullDefinition'.
^ self needsFullDefinition
Part 4 / 6
36
Deprecation
Deprecation message
Transformation rule
Rewriting Deprecations
isSpecial
self
deprecated: 'Renamed to #needsFullDefinition'
transformWith:
'`@receiver isSpecial' -> '`@receiver needsFullDefinition'.
^ self needsFullDefinition
Part 4 / 6
37
isSpecial
self
deprecated: 'Renamed to #needsFullDefinition'
transformWith:
'`@receiver isSpecial' -> '`@receiver needsFullDefinition'.
^ self needsFullDefinition
Antecedent
(left hand side)
matches the method calls
that should be replaced
Consequent
(right hand side)
defines the replacement
Transformation Rule
Part 4 / 6
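To make the antecedent/consequent idea concrete, here is a minimal Python sketch of what applying such a rule to client code could look like. It is only an illustration: it works on raw text with a regular expression, assumes the receiver is a single identifier, and handles unary selectors only. The function name `apply_unary_rule` and the sample client line are hypothetical and not part of Deprewriter, which would presumably match parsed code rather than text.

```python
import re


def apply_unary_rule(source: str, old_selector: str, new_selector: str) -> str:
    """Rewrite '<receiver> old_selector' into '<receiver> new_selector'.

    Toy, text-based stand-in for a rule such as
    '`@receiver isSpecial' -> '`@receiver needsFullDefinition'.
    Assumption: the receiver is a single identifier.
    """
    pattern = re.compile(
        # receiver, whitespace, the old selector, and no ':' right after it
        # (so keyword selectors such as 'isSpecial:' are left alone)
        r"\b(?P<receiver>\w+)\s+" + re.escape(old_selector) + r"\b(?!:)"
    )
    return pattern.sub(
        lambda match: f"{match.group('receiver')} {new_selector}", source
    )


# Hypothetical client code that still uses the deprecated selector
client_code = "aClass isSpecial ifTrue: [ aClass printDefinition ]"
rewritten = apply_unary_rule(client_code, "isSpecial", "needsFullDefinition")
print(rewritten)
```

The left-hand side of the rule plays the role of the regular expression (what to find), and the right-hand side plays the role of the substitution (what to put there instead).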
38
Expressing Complex Rules
Part 4 / 6
Antecedent:
`@receiver whenSelectionIndexChanged: `@argument
Consequent:
`@receiver selection whenChangedDo:
[ :selection |
`@argument value: selection selectedIndex ]
39
method
"do something"
method
self
deprecated: 'Explain why'
transformWith:
'`@rec method' ->
'`@rec newMethod'.
Library
Developer
x.method()
x.newMethod()
"do something"
!
Client
Developer
Step 1 Step 2
Step 3
Step 4
Step 5
Client code is
rewritten
Execution
continues
Rewriting in Action
Part 4 / 6
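The run-time side of this workflow can be approximated in Python with a decorator. This is a sketch under stated assumptions, not the Pharo mechanism: it does not rewrite the caller's source, it only records the call site where a rewrite would be applied and then forwards the call so that execution continues. The names `deprecated`, `call_sites`, `Node`, `next` and `next_node` are all hypothetical.

```python
import functools
import sys
import warnings

# (filename, line number) of every call to a deprecated method seen at run time
call_sites = []


def deprecated(message, replacement):
    """Mark a method as deprecated and forward calls to `replacement`."""
    def decorate(old_method):
        @functools.wraps(old_method)
        def wrapper(self, *args, **kwargs):
            # The calling frame is where a rewriting tool would patch the code
            caller = sys._getframe(1)
            call_sites.append((caller.f_code.co_filename, caller.f_lineno))
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            # Execution continues with the new method
            return getattr(self, replacement)(*args, **kwargs)
        return wrapper
    return decorate


class Node:
    def __init__(self, following=None):
        self._following = following

    def next_node(self):
        return self._following

    @deprecated("Use next_node() instead.", replacement="next_node")
    def next(self):
        return self.next_node()


tail = Node()
head = Node(tail)
result = head.next()       # old call still works
print(result is tail)      # True
print(len(call_sites))     # 1
```

The recorded call sites are the places a tool in the spirit of the slide would rewrite, so the deprecated call is hit only once per location.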
40
Pharo
41 %
59 %
Rewriting deprecations
(contain a transformation rule)
Non-rewriting deprecations
(no transformation rule)
Analysis of Deprecations in Pharo 8
(367)
Part 4 / 6
41
Pharo
9 %
32 %
59 %
Rewriting deprecations
(contain a transformation rule)
Non-rewriting deprecations
(no transformation rule)
Obvious opportunity
Analysis of Deprecations in Pharo 8
(367)
Part 4 / 6
42
Java
33 %
67 %
Deprecations with helpful
replacement messages
Deprecations without helpful
replacement messages
C#
22 %
78 %
JS
33 %
67 %
[Brito et al., 2018] [Brito et al., 2018] [Nascimento et al., 2020]
Replacement Messages
Part 4 / 6
43
Deprewriter Summary
Part 4 / 6
Contributions:
✓ First documentation of the Deprewriter approach
✓ Validity criteria for the transformation rules
(detected 8 invalid rules in Pharo; merged PR)
✓ Analysis and discussion of non-rewriting deprecations in Pharo 8
✓ Survey of developers who use Deprewriter
(automation aspect)
DepMiner: Inferring rules
from the commit history
Part 5:
O. Zaitsev, S. Ducasse, N. Anquetil, and
A. Thiefaine. DepMiner: Automatic
Recommendation of Transformation Rules
for Method Deprecation. ICSR, 2022.
45
The Need for Automation
Deprewriter
Library
Developer
Client
System
rules update
Part 5 / 6
46
The Need for Automation
Deprewriter
Library
Developer
Client
System
update
Tool
Commit
History
rules
Part 5 / 6
47
Missing methods — public methods that were present in the old version
and no longer exist in the new version.
new API
old API
Step 1. Detect Breaking Changes
Part 5 / 6
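Detecting missing methods amounts to a set difference between the public methods of the two versions. The sketch below assumes each API is given as a mapping from class name to a set of public method names; the function name and the sample data are hypothetical.

```python
def missing_methods(old_api: dict, new_api: dict) -> set:
    """Return (class, method) pairs present in old_api but absent from new_api."""
    old_public = {(cls, m) for cls, methods in old_api.items() for m in methods}
    new_public = {(cls, m) for cls, methods in new_api.items() for m in methods}
    return old_public - new_public


# Hypothetical public APIs of two versions of a small library
old_api = {"Node": {"next", "value"}, "LinkedList": {"head", "insert"}}
new_api = {"Node": {"nextNode", "value"}, "LinkedList": {"head", "insert"}}

print(missing_methods(old_api, new_api))  # {('Node', 'next')}
```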
48
Step 1. Detect Breaking Changes
Part 5 / 6
Challenge 1:
Absence of method visibility.
How we address it:
Define language-specific heuristics.
https://github.com/olekscode/VisibilityDeductor
49
{
Id: ef4fdd35fb05e74aa12aad4d22a37e17a8d87b5b,
Removed methods: […],
Added methods: […],
Modified methods: [
{
Old source code: …,
New source code: …,
Removed method calls: [smartDescription],
Added method calls: [description],
}],
Added classes: […],
Removed classes: […],
…
}
Line-based diffs
Q: Which lines of code were added or removed?
High-level commits
Q: Which methods, classes, or packages were added, removed, or modified?
Step 2. Data Representation
Part 5 / 6
50
Transactions:
Customer 1: { bread, butter, avocado }
Customer 2: { bread, butter, bananas }
Customer 3: { bread, butter, milk, cereal }
Customer 4: { bread, milk, cereal }
Customer 5: { butter, milk, cereal }
Q1: What are the products that are frequently purchased together? (frequent itemsets)
Q2: What can we recommend to people who buy bread? (association rules)
Step 3. Market Basket Analysis
Part 5 / 6
51
Q1: What are the products that are frequently purchased together?
{ bread, butter } Support: 60%
{ milk, cereal } Support: 60%
Q2: What can we recommend to people who buy bread?
{ bread } → { butter } Confidence: 75%
Step 3. Market Basket Analysis
Part 5 / 6
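The support and confidence values on this slide can be checked with a few lines of Python. Support is the share of transactions that contain an itemset, and confidence of a rule is the share of transactions containing the antecedent that also contain the consequent. The function names below are illustrative.

```python
transactions = [
    {"bread", "butter", "avocado"},
    {"bread", "butter", "bananas"},
    {"bread", "butter", "milk", "cereal"},
    {"bread", "milk", "cereal"},
    {"butter", "milk", "cereal"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    both = count(antecedent | consequent, transactions)
    return both / count(antecedent, transactions)


print(support({"bread", "butter"}, transactions))       # 0.6
print(support({"milk", "cereal"}, transactions))        # 0.6
print(confidence({"bread"}, {"butter"}, transactions))  # 0.75
```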
52
Q1: What are the operations that frequently appear together in method changes?
{ remove(next), add(nextNode) } Support: 60%
Q2: What can we recommend as a replacement for next()?
{ next } → { nextNode } Confidence: 75%
Part 5 / 6
Step 3. Market Basket Analysis
53
Missing Method
Node >> next
Association Rule
{next} → {nextNode}
generate
Node >> next
self
deprecated: 'Use #nextNode instead.'
transformWith:
'`@receiver next' ->
'`@receiver nextNode'.
^ self nextNode
Step 4. Generate Deprecations
Part 5 / 6
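Generating the deprecation is essentially template filling once the missing method and its replacement are known. The sketch below is an assumption-laden illustration: it covers unary selectors only (no arguments), and the function name `generate_deprecation` and the indentation of the output are hypothetical choices, not DepMiner's actual generator.

```python
def generate_deprecation(class_name: str, old_selector: str, new_selector: str) -> str:
    """Build the source text of a rewriting deprecation for a unary selector."""
    lines = [
        f"{class_name} >> {old_selector}",
        "    self",
        f"        deprecated: 'Use #{new_selector} instead.'",
        "        transformWith:",
        f"            '`@receiver {old_selector}' ->",
        f"            '`@receiver {new_selector}'.",
        f"    ^ self {new_selector}",
    ]
    return "\n".join(lines)


# Missing method Node >> next, association rule {next} -> {nextNode}
deprecation_source = generate_deprecation("Node", "next", "nextNode")
print(deprecation_source)
```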
54
Step 4. Generate Deprecations
Part 5 / 6
Challenge 2:
Absence of static type information.
How we address it:
Retain only those association rules where the methods in the antecedent and the consequent of the rule are defined in the same class.
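A possible reading of this filter, sketched in Python. The assumption here is that `methods_by_class` lists, for each class, the selectors it defines in either version (the old selector comes from the old version, the new one from the new version). The function name and the sample data are hypothetical.

```python
def same_class_rules(rules, methods_by_class):
    """Keep only rules whose old and new selectors are defined in one class.

    rules: iterable of (old_selector, new_selector) pairs
    methods_by_class: class name -> selectors defined in either version (assumption)
    Returns (class, old_selector, new_selector) triples.
    """
    kept = []
    for old_selector, new_selector in rules:
        for class_name, selectors in methods_by_class.items():
            if old_selector in selectors and new_selector in selectors:
                kept.append((class_name, old_selector, new_selector))
    return kept


methods_by_class = {
    "Node": {"next", "nextNode"},
    "Stream": {"next", "atEnd"},
    "Logger": {"log"},
}
rules = [("next", "nextNode"), ("next", "log")]

print(same_class_rules(rules, methods_by_class))  # [('Node', 'next', 'nextNode')]
```

The rule {next} → {log} is discarded because no single class defines both selectors, which stands in for the type information that a dynamically-typed language does not provide.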
55
Prototype Tool for Pharo
Part 5 / 6
56
Project: size (num. methods)
Famix: 6,538
Moose: 1,670
Pillar: 5,848
Pharo: 116,212
DataFrame: 661
Project type: Tool, Library, SDK
Other columns: N, Reviewers
Selecting Projects
Part 5 / 6
57
113
56
5
52
33
1
32
87
28
40
19
1
1
11
4
7
Pharo Famix DataFrame
Accepted recommendations
Rewriting
Non-rewriting
Rejected recommendations
Moose Pillar
Recommended Deprecations
Part 5 / 6
58
Limitations of DepMiner
Limitation 1: Simplified recommendations
Developer may not want to deprecate ("It's correct but I will not deprecate").
Limitation 2: Unused / untested methods
Ineffective for methods that are not called internally (test coverage: 80%).
Limitation 3: Naive search
Search entire commit history (v1.0 → v2.0: breaking change, effect).
Part 5 / 6
Overcoming Limitation 1
Does replacement exist? / Is it automatable? / Does developer want to deprecate? → Action
yes / yes / yes → Deprecation with a transformation rule
yes / no / yes → Deprecation with a replacement message
no / - / yes → Deprecation without a replacement message
yes / yes / no → Library update script with automatic rules
yes / no / no → Documentation with replacement message
no / - / no → Documentation: list of breaking changes
Identified with the help of developers from the Pharo open-source community
59
Part 5 / 6
Conclusion
Part 6:
61
1. Developer survey
to understand the
practice of library
update
Contributions
2. First documentation
of the Deprewriter
approach.
3. Study of its adoption
by the community
4. DepMiner — a novel
approach to mine
transformation rules.
5. Generalisation of
DepMiner as a
holistic approach.
Empirical Language Automation
Part 6 / 6
Future Work. Improve DepMiner
62
Limitation 1: Simplified recommendations
Consider other actions of library developers (based on the table)
Limitation 2: Unused / untested methods
Combine with existing approaches to detect refactoring.
Limitation 3: Naive search
Detect the commit that introduced the breaking change (BC)
Search neighbour commits
Part 6 / 6
Future Work. Challenging Cases
63
1. Reassigning the existing name
2. Circular renaming
3. Modifying abstract hooks
4. Cleaning up spurious objects
5. String literals used as identifiers
Method-to-method mapping:
Part 6 / 6
Documented challenges:
old API new API
64
2 Journal Papers
‣ N. Anquetil, J. Delplanque, S. Ducasse, O. Zaitsev, C. Fuhrman, and Y.-G. Guéhéneuc.
What Do Developers Consider Magic literals? A Smalltalk Perspective. IST, 2022.
‣ S. Ducasse, G. Polito, O. Zaitsev, M. Denker, and P. Tesone. Deprewriter: On the fly
rewriting method deprecations. JOT, 2021.
3 Conference Papers
‣ O. Zaitsev, S. Ducasse, N. Anquetil, and A. Thiefaine. How Libraries
Evolve: A Survey of Two Industrial Companies and an Open-Source
Community. APSEC (industrial track), 2022.
‣ O. Zaitsev, S. Ducasse, N. Anquetil, and A. Thiefaine. DepMiner:
Automatic Recommendation of Transformation Rules for Method
Deprecation. ICSR, 2022.
‣ O. Zaitsev, S. Ducasse, A. Bergel, and M. Eveillard. Suggesting
Descriptive Method Names: An Exploratory Study of Two Machine
Learning Approaches. QUATIC, 2020.
+ 2 Workshop Papers
& 1 technical report
2nd best paper award at IWST’22
Best poster award at GDR GPL
Publications (7 papers)
Part 6 / 6
✓ Library update survey of developers.
✓ First documentation of Deprewriter
✓ Analysis of its adoption by the community.
✓ DepMiner — a novel approach to mine rules.
✓ DepMiner as holistic approach.
Summary
Contributions:
Updated
Client
System
Client
System
Library
v1.0
Library
v2.0
Client
Developer
Library
Developer
depends
Library Update:
DepMiner tool:
7
2
> 100
papers
awards
merged pull requests
DepMiner
Support A:
Static & Dynamic Analysis
68
Deprewriter
Client
System
Commit
History
DepMiner
Statically
mine rules
from history
Dynamically
apply rules to
client code
Mining Method Call Replacements
69
Local
subset of
commits
missing
method m()
history
Method changes
that remove a
call to m()
{remove(m), add(m’), add(x)}
{remove(m), remove(n), add(m’)}
…
{remove(m), add(m’)}
{remove(m), add(m’)}
count: 15
{remove(m), remove(n), add(x)}
count: 5
{m} → {m’}
support: 15
confidence: 0.5
{m, n} → {x}
support: 5
confidence: 0.3
A-Priori
Transactions
Frequent Itemsets
Association Rules
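The pipeline on this slide can be approximated for a single missing method with plain counting. The sketch below does not implement the A-Priori algorithm named on the slide: it only considers rules with one removed and one added call, and takes confidence as the share of transactions removing `m` that also add the candidate. The function name, the threshold and the sample transactions are hypothetical.

```python
from collections import Counter


def recommend_replacements(missing, transactions, min_confidence=0.5):
    """Rank candidate replacements for a missing method.

    Returns (missing, candidate, count, confidence) tuples, best first.
    """
    # Only method changes that removed a call to the missing method matter
    relevant = [t for t in transactions if f"remove({missing})" in t]
    if not relevant:
        return []
    added = Counter(
        item[len("add("):-1]
        for t in relevant
        for item in t
        if item.startswith("add(")
    )
    rules = [
        (missing, candidate, n, n / len(relevant))
        for candidate, n in added.items()
        if n / len(relevant) >= min_confidence
    ]
    return sorted(rules, key=lambda rule: rule[3], reverse=True)


# Hypothetical transactions, one per method change
transactions = [
    {"remove(next)", "add(nextNode)"},
    {"remove(next)", "add(nextNode)", "add(size)"},
    {"remove(next)", "remove(first)", "add(nextNode)"},
    {"remove(next)", "add(isEmpty)"},
    {"remove(first)", "add(head)"},
]

print(recommend_replacements("next", transactions))
# [('next', 'nextNode', 3, 0.75)]
```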
70
public static LinkedList insert(LinkedList list, int data)
{
Node new_node = new Node(data);
- new_node.setNext(null);
+ new_node.setNextNode(null);
if (list.head() == null) {
list.setHead(new_node);
}
else {
Node last = list.head;
- while (last.next() != null) {
- last = last.next();
+ while (last.nextNode() != null) {
+ last = last.nextNode();
}
last.next = new_node;
}
return list;
}
Method change —
one method modified
by one commit
Method Change
Part 5 / 6
71
{
remove(setNext),
add(setNextNode),
remove(next),
remove(next),
add(nextNode),
add(nextNode)
}
Transaction — set of
added and removed
method calls in a
method change:
Method Change as Transaction
Part 5 / 6
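Turning a method change into a transaction can be sketched as a comparison of the method calls found in the old and new source. This is a rough illustration, assuming a call looks like `.name(` and using a regular expression instead of a parser. Unlike the slide, which lists repeated operations, the sketch collapses them into a set. All names are hypothetical.

```python
import re
from collections import Counter

# Naive assumption: a method call is a dot, an identifier, and an opening parenthesis
CALL = re.compile(r"\.(\w+)\(")


def transaction(old_source: str, new_source: str) -> set:
    """Set of remove(...) / add(...) operations between two versions of a method."""
    old_calls = Counter(CALL.findall(old_source))
    new_calls = Counter(CALL.findall(new_source))
    removed = old_calls - new_calls   # calls that occur less often in the new version
    added = new_calls - old_calls     # calls that occur more often in the new version
    return {f"remove({m})" for m in removed} | {f"add({m})" for m in added}


old_source = """
new_node.setNext(null);
while (last.next() != null) {
    last = last.next();
}
"""
new_source = """
new_node.setNextNode(null);
while (last.nextNode() != null) {
    last = last.nextNode();
}
"""

print(sorted(transaction(old_source, new_source)))
# ['add(nextNode)', 'add(setNextNode)', 'remove(next)', 'remove(setNext)']
```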
Challenging Problems
of Library Update
Support B:
Rewriting Abstract Hooks
73
FileReader
read()
Library v1.0
CSVFileReader
read()
Client
Rewriting Abstract Hooks
74
FileReader
read()
FileReader
readFile()
Library v2.0
Library v1.0
CSVFileReader
read()
CSVFileReader
read()
Client
Client
renamed
Method is
never called
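Why this case is hard can be shown with a small Python analogue. The class and hook names mirror the slide; the `load()` entry point and the default behaviour of `readFile()` in v2.0 are assumptions made for the example. The client override is not a call that can be rewritten, and after the rename it silently stops being invoked.

```python
class FileReaderV1:
    """Library v1.0: the framework calls the hook read()."""
    def load(self, path):            # hypothetical entry point
        return self.read(path)

    def read(self, path):
        return "default rows"


class FileReaderV2:
    """Library v2.0: the hook was renamed to readFile()."""
    def load(self, path):
        return self.readFile(path)

    def readFile(self, path):
        return "default rows"


class CSVFileReaderOnV1(FileReaderV1):
    def read(self, path):            # client override, called by the library
        return "csv rows"


class CSVFileReaderOnV2(FileReaderV2):
    def read(self, path):            # same client code: now never called
        return "csv rows"


print(CSVFileReaderOnV1().load("data.csv"))  # csv rows
print(CSVFileReaderOnV2().load("data.csv"))  # default rows
```

No error is raised in the second case, so neither the client developer nor a rewriting tool triggered by a deprecated call gets a signal that something changed.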
Removing Spurious Objects
75
300 px
200
px
50
px offsets
(a Rectangle)
-71
46
120 px
fractions
(a Rectangle)
0.4
0.25
Removing Spurious Objects
76
300 px
200
px
-71
46
bottomOffset
rightOffset
0.4
0.25
bottomFraction
rightFraction
Removing Spurious Objects
77
frame := LayoutFrame new
topFraction: 0 offset: 0;
leftFraction: 0 offset: 0;
bottomFraction: 0.4 offset: 46;
rightFraction: 0.25 offset: -71;
yourself.
fractionRectangle := 0@0 extent: 0.4@0.25.
offsetRectangle := 0@0 extent: 46@(-71).
frame := LayoutFrame
fractions: fractionRectangle
offsets: offsetRectangle.
v1.0
v2.0
Removing Spurious Objects
78
We can rewrite
method calls
Removing Spurious Objects
79
Hard to find
and remove
a
b
b
c
Library v1.0 Library v2.0
Problematic Renaming
80
a
b
b
a
Library v1.0 Library v2.0
Reassigning the Name Circular Renaming
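The circular case (a becomes b, b becomes a) shows why renames cannot be applied one after another. The Python sketch below works on text with whole-word matching, which is a simplification; the function names and the sample client line are hypothetical.

```python
import re


def rename_sequentially(source, renames):
    """Apply each rename in turn; later renames see the output of earlier ones."""
    for old, new in renames.items():
        source = re.sub(rf"\b{re.escape(old)}\b", new, source)
    return source


def rename_simultaneously(source, renames):
    """Apply all renames in a single pass over the original text."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, renames)) + r")\b")
    return pattern.sub(lambda match: renames[match.group(1)], source)


renames = {"a": "b", "b": "a"}       # circular renaming between v1.0 and v2.0
client = "x.a(); y.b();"

print(rename_sequentially(client, renames))    # x.a(); y.a();
print(rename_simultaneously(client, renames))  # x.b(); y.a();
```

Sequential application collapses both names into one, while the single-pass version maps each original name exactly once. A rule applied on the fly, one deprecated call at a time, faces the same ambiguity because the new API reuses a name that the old API also defines.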

Weitere ähnliche Inhalte

Ähnlich wie Data Mining-based Tools to Support Library Update. PhD Defence of Oleksandr ZAITSEV

Scientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & SociologyScientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & Sociology
Neil Chue Hong
 
Reverse engineering and theory building v3
Reverse engineering and theory building v3Reverse engineering and theory building v3
Reverse engineering and theory building v3
ClarkTony
 

Ähnlich wie Data Mining-based Tools to Support Library Update. PhD Defence of Oleksandr ZAITSEV (20)

Open Source Library System Software: Libraries Are Doing it For Themselves
Open Source Library System Software: Libraries Are Doing it For ThemselvesOpen Source Library System Software: Libraries Are Doing it For Themselves
Open Source Library System Software: Libraries Are Doing it For Themselves
 
A Practical Road to SaaS in Python
A Practical Road to SaaS in PythonA Practical Road to SaaS in Python
A Practical Road to SaaS in Python
 
Integration Patterns for Big Data Applications
Integration Patterns for Big Data ApplicationsIntegration Patterns for Big Data Applications
Integration Patterns for Big Data Applications
 
Lambdas & Streams
Lambdas & StreamsLambdas & Streams
Lambdas & Streams
 
The essence of the VivaCore code analysis library
The essence of the VivaCore code analysis libraryThe essence of the VivaCore code analysis library
The essence of the VivaCore code analysis library
 
Analyzing Changes in Software Systems From ChangeDistiller to FMDiff
Analyzing Changes in Software Systems From ChangeDistiller to FMDiffAnalyzing Changes in Software Systems From ChangeDistiller to FMDiff
Analyzing Changes in Software Systems From ChangeDistiller to FMDiff
 
Clipper: A Low-Latency Online Prediction Serving System
Clipper: A Low-Latency Online Prediction Serving SystemClipper: A Low-Latency Online Prediction Serving System
Clipper: A Low-Latency Online Prediction Serving System
 
PetroWiki - The Moderator's Guide
PetroWiki - The Moderator's GuidePetroWiki - The Moderator's Guide
PetroWiki - The Moderator's Guide
 
Modern Release Engineering in a Nutshell - Why Researchers should Care!
Modern Release Engineering in a Nutshell - Why Researchers should Care!Modern Release Engineering in a Nutshell - Why Researchers should Care!
Modern Release Engineering in a Nutshell - Why Researchers should Care!
 
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
 
(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures
 
2014 01-ticosa
2014 01-ticosa2014 01-ticosa
2014 01-ticosa
 
Fundamentals of Data Structures Unit 1.pptx
Fundamentals of Data Structures Unit 1.pptxFundamentals of Data Structures Unit 1.pptx
Fundamentals of Data Structures Unit 1.pptx
 
Cnpm bkdn
Cnpm bkdnCnpm bkdn
Cnpm bkdn
 
Digital Fabrication Studio v.0.2: Information
Digital Fabrication Studio v.0.2: InformationDigital Fabrication Studio v.0.2: Information
Digital Fabrication Studio v.0.2: Information
 
Scientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & SociologyScientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & Sociology
 
Reverse engineering and theory building v3
Reverse engineering and theory building v3Reverse engineering and theory building v3
Reverse engineering and theory building v3
 
Contribute to TYPO3 CMS
Contribute to TYPO3 CMSContribute to TYPO3 CMS
Contribute to TYPO3 CMS
 
Source-to-source transformations: Supporting tools and infrastructure
Source-to-source transformations: Supporting tools and infrastructureSource-to-source transformations: Supporting tools and infrastructure
Source-to-source transformations: Supporting tools and infrastructure
 
How to implement continuous delivery with enterprise java middleware?
How to implement continuous delivery with enterprise java middleware?How to implement continuous delivery with enterprise java middleware?
How to implement continuous delivery with enterprise java middleware?
 

Mehr von Oleksandr Zaitsev

Mehr von Oleksandr Zaitsev (14)

Cormas: Modelling for Citizens with Citizens. Building accessible and reliabl...
Cormas: Modelling for Citizens with Citizens. Building accessible and reliabl...Cormas: Modelling for Citizens with Citizens. Building accessible and reliabl...
Cormas: Modelling for Citizens with Citizens. Building accessible and reliabl...
 
Cormas RMoD
Cormas RMoDCormas RMoD
Cormas RMoD
 
Cirad Parcours
Cirad ParcoursCirad Parcours
Cirad Parcours
 
Cirad Concours
Cirad ConcoursCirad Concours
Cirad Concours
 
Agent-Based Modelling in Pharo Using Cormas
Agent-Based Modelling in Pharo Using CormasAgent-Based Modelling in Pharo Using Cormas
Agent-Based Modelling in Pharo Using Cormas
 
AI for Software Engineering:
Research & Innovation
AI for Software Engineering:
Research & InnovationAI for Software Engineering:
Research & Innovation
AI for Software Engineering:
Research & Innovation
 
PolyMath (ESUG 2022)
PolyMath (ESUG 2022)PolyMath (ESUG 2022)
PolyMath (ESUG 2022)
 
How Fast is AI in Pharo? Benchmarking Linear Regression
How Fast is AI in Pharo? Benchmarking Linear RegressionHow Fast is AI in Pharo? Benchmarking Linear Regression
How Fast is AI in Pharo? Benchmarking Linear Regression
 
DepMiner: Automatic Recommendation of Transformation Rules for Method Depreca...
DepMiner: Automatic Recommendation of Transformation Rules for Method Depreca...DepMiner: Automatic Recommendation of Transformation Rules for Method Depreca...
DepMiner: Automatic Recommendation of Transformation Rules for Method Depreca...
 
Suggesting Descriptive Method Names: An Exploratory Study of Two Machine Lear...
Suggesting Descriptive Method Names: An Exploratory Study of Two Machine Lear...Suggesting Descriptive Method Names: An Exploratory Study of Two Machine Lear...
Suggesting Descriptive Method Names: An Exploratory Study of Two Machine Lear...
 
Introduction to Git Version Control System
Introduction to Git Version Control SystemIntroduction to Git Version Control System
Introduction to Git Version Control System
 
PhD Roadmap
PhD RoadmapPhD Roadmap
PhD Roadmap
 
Magic Literals in Pharo
Magic Literals in PharoMagic Literals in Pharo
Magic Literals in Pharo
 
Aspects of software naturalness through the generation of IdentifierNames
Aspects of software naturalness through the generation of IdentifierNamesAspects of software naturalness through the generation of IdentifierNames
Aspects of software naturalness through the generation of IdentifierNames
 

Kürzlich hochgeladen

%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
masabamasaba
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
VictoriaMetrics
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 

Kürzlich hochgeladen (20)

%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
WSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security Program
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaS
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 

Data Mining-based Tools to Support Library Update. PhD Defence of Oleksandr ZAITSEV

  • 1. Data Mining-based Tools to Support Library Update Oleksandr ZAITSEV PhD Thesis Defence Composition of the jury: Reviewers: Olga KOUCHNARENKO Examiner: Romain ROBBES Coen DE ROOVER Stéphane DUCASSE Nicolas ANQUETIL Supervisors: Arnaud THIEFAINE Co-supervisor:
  • 2. 2 (software company) (research institute) Advisor: Arnaud THIEFAINE Supervisors: Stéphane DUCASSE Nicolas ANQUETIL Paris Lille Cifre PhD
  • 3. 3 Arolla is a consulting company specialised in the advanced techniques of software development: Clean Code, TDD, BDD, Legacy Remediation, etc. https://www.arolla.fr/
  • 9. Scope of my Thesis. Library update problem: How to change client code to use the new version of a library? My thesis is about: supporting developers during library update by building tools.
  • 10. Disambiguation. Library Update (v1.0 → v2.0): How to use the new version of the same library? (e.g., update Struts v1.0 to Struts v1.2). Library Migration (A → B): How to use a different (although similar) library? (e.g., replace EasyMock with Mockito).
  • 11. 11 Motivating Example from ailib.models import LinearRegression from ailib.data import readData data = readData(‘dataset.csv’, type=‘CSV’) model = LinearRegression() model.train(data, ycolumn=‘salary’) salary = model.predict([26, ‘female’]) ailib v1.0 Client Developer depends Part 1 / 6
  • 12. 12 Motivating Example from ailib.models import LinearRegression from ailib.data import readData data = readData(‘dataset.csv’, type=‘CSV’) model = LinearRegression() model.train(data, ycolumn=‘salary’) salary = model.predict([26, ‘female’]) ailib v2.0 Client Developer depends Part 1 / 6 ! Error: LinearRegression not found ! Error: readData() not found ! Error: wrong argument passed to predict() ?
  • 13. 13 Motivating Example ailib v2.0 Client Developer depends Part 1 / 6 1. Rename LinearRegression to AILinearRegression 2. Replace readData(file, type) with readCsv(data) 3. predict() should now accept a list of rows instead of a single row Library Developer from ailib.models import AILinearRegression from ailib.data import readCsv data = readCsv(‘dataset.csv’) model = AILinearRegression() model.train(data, ycolumn=‘salary’) salary = model.predict([[26, ‘female’]]) communicate
  • 14. 3 Aspects of the Problem. Empirical: Understand library update in practice: ‣ What problems do developers face? ‣ What support do they need? Language: Propose a language to express code transformations. Automation: Automatic discovery of transformations.
  • 15. 15 Pharo is a pure object-oriented dynamically-typed programming language. We focus on Pharo because: 1. We have access to its core developers 2. Pharo is convenient for manipulating source code Pharo Programming Language Part 1 / 6
  • 16. 1. Introduction 2. State of the Art 3. Developer Survey of Library Update 4. Deprewriter: Smart deprecation rewriting 5. DepMiner: Recommending transformation rules 6. Conclusion Plan
  • 17. State of the Art Part 2:
  • 18. 18 Technique Perspective Client developer Library developer Both Developer survey Code analysis Empirical Studies Part 2 / 6
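  • 19. Empirical Studies

      | Paper          | Survey | Code Analysis | Client Dev. | Library Dev. |
      |----------------|--------|---------------|-------------|--------------|
      | [Robbes 2012a] | X      | ✓             | ✓           | X            |
      | [Jezek 2015]   | X      | ✓             | ✓           | ✓            |
      | [Hora 2015]    | X      | ✓             | ✓           | X            |
      | [Bogart 2016]  | ✓      | X             | ✓           | ✓            |
      | [Sawant 2016]  | X      | ✓             | ✓           | X            |
      | [Xavier 2017a] | X      | ✓             | ✓           | ✓            |
      | [Xavier 2017b] | ✓      | X             | X           | ✓            |
      | [Hora 2018]    | X      | ✓             | ✓           | X            |
      | [Kula 2018a]   | ✓      | ✓             | ✓           | X            |
      | [Kula 2018b]   | X      | ✓             | X           | ✓            |
      | [Brito 2019]   | ✓      | X             | ✓           | ✓            |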
  • 20. 20 Shortcomings: 1. Most surveys analyse specific cases of breaking changes. 2. No survey has asked client developers about what makes library update hard and what makes it easy. Empirical Studies Part 2 / 6
  • 21. 21 Problem Perspective Client developer Library developer Both Library update Library migration v1.0 v2.0 A B Tools to Support Library Update Part 2 / 6
  • 22. 22 Sources of Information Technique exp — developer expertise doc — documentation hist — commit history 2v — two versions of source code TS — textual similarity SS — structural similarity CD — call dependency Tools to Support Library Update Part 2 / 6 L — library C — already updated clients T — unit tests
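  • 23. Tools to Support Library Update

      | Paper           | Update | Library dev. | Source           | Technique  | Dynamic Rewriting |
      |-----------------|--------|--------------|------------------|------------|-------------------|
      | [Chow 1996]     | ✓      | ✓            | exp(L)           | -          | X                 |
      | [Henkel 2005]   | ✓      | ✓            | exp(L)           | -          | X                 |
      | [Kim 2007]      | ✓      | -            | 2v(L)            | TS         | X                 |
      | [Xing 2007]     | ✓      | X            | 2v(L)            | SS, TS     | X                 |
      | [Dagenais 2008] | ✓      | X            | hist(L)          | CD         | X                 |
      | [Schäfer 2008]  | ✓      | X            | 2v(C,T)          | CD         | X                 |
      | [Wu 2010]       | ✓      | X            | 2v(L)            | CD, TS     | X                 |
      | [Nguyen 2010]   | ✓      | X            | 2v(L,C,T)        | CD, TS, SS | X                 |
      | [Meng 2012]     | ✓      | X            | hist(L)          | CD         | X                 |
      | [Teyton 2013]   | X      | X            | hist(L)          | CD         | X                 |
      | [Hora 2014]     | ✓      | X            | hist(L)          | CD         | ✓                 |
      | [Pandita 2015]  | X      | X            | 2v(L)            | TS         | X                 |
      | [Alrubaye 2019] | X      | X            | hist(C), doc(L)  | CD, TS     | X                 |
      | [Alrubaye 2020] | X      | X            | doc(L)           | CD, TS     | X                 |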
  • 24. 24 Shortcomings: 1. No studies propose automated tools to support library developers. 2. Most studies do not consider dynamic rewriting 3. Most studies focus on statically-typed programming languages. Tools to Support Library Update Part 2 / 6
  • 25. (empirical aspect) Developer Survey Part 3: O. Zaitsev, S. Ducasse, N. Anquetil, and A. Thiefaine. How Libraries Evolve: A Survey of Two Industrial Companies and an Open-Source Community. APSEC (industrial track), 2022.
  • 27. 27 Updated Client System Client System Library v1.0 Library v2.0 Client Developer Library Developer depends ‣ How do they behave in the context of library update? ‣ What problems do they face? ‣ What support do they need? Understanding the Developers Part 3 / 6
  • 28. 28 Open-Source Industry Library developers 11 5 Client Developers 22 9 We selected developers from diverse communities for our survey Developer Population Part 3 / 6
  • 29. 29 Three times a year or more often Twice a year Once a year Less often We do not do it regularly Q: How often do they face the problem of library update? Selected Findings 17 10 3 3 3 Part 3 / 6 Client Developers
  • 30. Selected Findings (Client Developers). Q: What makes library update easy? (factor: number of developers) Documentation: 15; Absence of breaking changes: 11; Test coverage: 6; Tool support: 5; Deprecations: 4; Simple breaking changes: 4; Community support: 3. Q: What makes it hard? (factor: number of developers) Breaking changes: 11; Absent or bad documentation: 10; Indirect dependencies: 7; Big changes to the API: 7; Poor test coverage: 4; Removed functionality: 3; Changed hooks or abstract classes: 3.
  • 31. 31 Very small impact Small impact Moderate impact Big impact Very big impact Q: What is the impact of breaking changes on their clients? Selected Findings 1 0 9 6 2 Part 3 / 6 Library Developers
  • 32. 32 Q: Is it important for library developers to encourage their clients to update? Selected Findings Not important at all Of little importance Of average importance Very important Absolutely essential 0 1 5 8 4 Part 3 / 6 Library Developers
  • 33. 33 Library Developers Client Developers ‣ Often have to deal with the problem of library update ‣ Need documentation and support for breaking changes ‣ Want to help their clients to update Survey Conclusion Part 3 / 6
  • 34. (language aspect) Deprewriter. On-the-fly rewriting method deprecations Part 4: S. Ducasse, G. Polito, O. Zaitsev, M. Denker, and P. Tesone. Deprewriter: On the fly rewriting method deprecations. JOT, 2021.
  • 35. 35 Deprecation Deprecation message Method Deprecations in Pharo isSpecial self deprecated: ‘Renamed to #needsFullDefinition’. ^ self needsFullDefinition Part 4 / 6
  • 36. 36 Deprecation Deprecation message Transformation rule Rewriting Deprecations isSpecial self deprecated: ‘Renamed to #needsFullDefinition’ transformWith: ‘`@receiver isSpecial’ -> ’`@receiver needsFullDefinition’. ^ self needsFullDefinition Part 4 / 6
  • 37. Transformation Rule. isSpecial self deprecated: 'Renamed to #needsFullDefinition' transformWith: '`@receiver isSpecial' -> '`@receiver needsFullDefinition'. ^ self needsFullDefinition Antecedent (left-hand side): matches the method calls that should be replaced. Consequent (right-hand side): defines the replacement.
  • 38. Expressing Complex Rules. Antecedent: `@receiver whenSelectionIndexChanged: `@argument Consequent: `@receiver selection whenChangedDo: [ :selection | `@argument value: selection selectedIndex ]
  • 39. 39 method “do something” method self deprecated: ‘Explain why’ transformWith: ‘`@rec method’ -> ‘`@rec newMethod’. Library Developer x.method() x.newMethod() “do something” ! Client Developer Step 1 Step 2 Step 3 Step 4 Step 5 Client code is rewritten Execution continues Rewriting in Action Part 4 / 6
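The rewriting step on this slide can be approximated outside Pharo. What follows is a minimal Python sketch, assuming a rule that only renames a method, in the spirit of `` `@rec method `` -> `` `@rec newMethod ``. The names `RenameCall` and `rewrite` are hypothetical and are not part of Deprewriter. Unlike Deprewriter, which rewrites the caller at run time when the deprecated method is executed, this sketch rewrites source text statically. It relies on `ast.unparse`, available from Python 3.9.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rewrite `<receiver>.old()` into `<receiver>.new()` (hypothetical helper)."""

    def __init__(self, old_name: str, new_name: str) -> None:
        self.old_name = old_name
        self.new_name = new_name

    def visit_Call(self, node: ast.Call) -> ast.Call:
        # Visit children first so that nested calls are rewritten too.
        self.generic_visit(node)
        func = node.func
        # Only method calls (attribute access followed by a call) are matched.
        if isinstance(func, ast.Attribute) and func.attr == self.old_name:
            func.attr = self.new_name
        return node


def rewrite(source: str, old_name: str, new_name: str) -> str:
    """Apply the rename rule to every matching call in `source`."""
    tree = RenameCall(old_name, new_name).visit(ast.parse(source))
    return ast.unparse(tree)


client_code = "result = x.method()\nother = y.method().method()"
rewritten = rewrite(client_code, "method", "newMethod")
print(rewritten)
```

The receiver is left untouched, which mirrors the role of the `` `@rec `` pattern variable: it matches any expression and is copied into the replacement.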
  • 40. 40 Pharo 41 % 59 % Rewriting deprecations (contain a transformation rule) Non-rewriting deprecations (no transformation rule) Analysis of Deprecations in Pharo 8 (367) Part 4 / 6
  • 41. 41 Pharo 9 % 32 % 59 % Rewriting deprecations (contain a transformation rule) Non-rewriting deprecations (no transformation rule) Obvious opportunity Analysis of Deprecations in Pharo 8 (367) Part 4 / 6
  • 42. 42 Java 33 % 67 % Deprecations with helpful replacement messages Deprecations without helpful replacement messages C# 22 % 78 % JS 33 % 67 % [Brito et al., 2018] [Brito et al., 2018] [Nascimento et al., 2020] Replacement Messages Part 4 / 6
  • 43. 43 Deprewriter Summary Part 4 / 6 ✓ First documentation of Deprewriter approach ✓ Validity criteria for the transformation rules (detected 8 invalid rules in Pharo — merged PR) ✓ Analysis and discussion of non-rewriting deprecations in Pharo 8 ✓ Survey of developers who use Deprewriter Contributions:
  • 44. (automation aspect) DepMiner: Inferring rules from the commit history Part 5: O. Zaitsev, S. Ducasse, N. Anquetil, and A. Thiefaine. DepMiner: Automatic Recommendation of Transformation Rules for Method Deprecation. ICSR, 2022.
  • 45. 45 The Need for Automation Deprewriter Library Developer Client System rules update Part 5 / 6
  • 46. 46 The Need for Automation Deprewriter Library Developer Client System update Tool Commit History rules Part 5 / 6
  • 47. 47 Missing methods — public methods that were present in the old version and no longer exist in the new version. new API old API Step 1. Detect Breaking Changes Part 5 / 6
  • 48. Step 1. Detect Breaking Changes. Challenge 1: Absence of method visibility. How we address it: Define language-specific heuristics. https://github.com/olekscode/VisibilityDeductor
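Step 1 can be sketched as a set difference over two API snapshots. The Python sketch below is illustrative only: the `(class, method)` representation and the function names are assumptions, and the leading-underscore test is a stand-in heuristic chosen for the example, not the heuristic used by VisibilityDeductor.

```python
Method = tuple[str, str]  # (class name, method name); an assumed representation


def is_public(method: Method) -> bool:
    # Stand-in visibility heuristic (assumption): a leading underscore
    # marks a method as private.
    _, name = method
    return not name.startswith("_")


def missing_methods(old_api: set[Method], new_api: set[Method]) -> set[Method]:
    """Public methods present in the old version and absent from the new one."""
    return {method for method in old_api - new_api if is_public(method)}


old_api = {("Node", "next"), ("Node", "value"), ("Node", "_link")}
new_api = {("Node", "nextNode"), ("Node", "value")}

missing = missing_methods(old_api, new_api)
print(sorted(missing))
```

Here `_link` also disappeared, but it is filtered out because the heuristic classifies it as private, so only `Node.next` is reported as a breaking change.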
  • 49. Step 2. Data Representation. Line-based diffs answer: Which lines of code were added or removed? High-level commits answer: Which methods, classes, or packages were added, removed, or modified? Example of a high-level commit: { Id: ef4fdd35fb05e74aa12aad4d22a37e17a8d87b5b, Removed methods: […], Added methods: […], Modified methods: [ { Old source code: …, New source code: …, Removed method calls: [smartDescription], Added method calls: [description], }], Added classes: […], Removed classes: […], … }
  • 50. Step 3. Market Basket Analysis. Transactions: Customer 1: { bread, butter, avocado }; Customer 2: { bread, butter, bananas }; Customer 3: { bread, butter, milk, cereal }; Customer 4: { bread, milk, cereal }; Customer 5: { butter, milk, cereal }. Q1: What are the products that are frequently purchased together? (frequent itemsets) Q2: What can we recommend to people who buy bread? (association rules)
  • 51. Step 3. Market Basket Analysis. Transactions: Customer 1: { bread, butter, avocado }; Customer 2: { bread, butter, bananas }; Customer 3: { bread, butter, milk, cereal }; Customer 4: { bread, milk, cereal }; Customer 5: { butter, milk, cereal }. Q1: What are the products that are frequently purchased together? { bread, butter } (support: 60%); { milk, cereal } (support: 60%). Q2: What can we recommend to people who buy bread? { bread } → { butter } (confidence: 75%).
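The support and confidence values in the market-basket example can be checked with a few lines of Python. This is a minimal sketch using the standard definitions (support: share of transactions containing an itemset; confidence: share of transactions containing the antecedent that also contain the consequent). The function names are illustrative.

```python
transactions = [
    {"bread", "butter", "avocado"},
    {"bread", "butter", "bananas"},
    {"bread", "butter", "milk", "cereal"},
    {"bread", "milk", "cereal"},
    {"butter", "milk", "cereal"},
]


def count(itemset: set, transactions: list) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for transaction in transactions if itemset <= transaction)


def support(itemset: set, transactions: list) -> float:
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent: set, consequent: set, transactions: list) -> float:
    # Computed from raw counts to avoid floating-point noise.
    both = count(antecedent | consequent, transactions)
    return both / count(antecedent, transactions)


print(support({"bread", "butter"}, transactions))       # 0.6
print(support({"milk", "cereal"}, transactions))        # 0.6
print(confidence({"bread"}, {"butter"}, transactions))  # 0.75
```

Bread appears in four baskets and three of them also contain butter, which gives the 75% confidence shown on the slide.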
  • 52. Step 3. Market Basket Analysis. Q1: What are the operations that frequently appear together in method changes? { remove(next), add(nextNode) } (support: 60%). Q2: What can we recommend as a replacement for next()? { next } → { nextNode } (confidence: 75%).
  • 53. 53 Node >> next self deprecated: ‘Use #nextNode instead.’ transformWith: ‘`@receiver next’ -> ’`@receiver nextNode’. ^ self nextNode Missing Method Node >> next Association Rule {next} {nextNode} Step 4. Generate Deprecations Part 5 / 6 generate
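The generation step can be pictured as filling a template. The Python sketch below renders Pharo source text for a deprecation like the one on this slide. It is an assumption-laden illustration: `generate_deprecation` is a hypothetical name, the template handles only unary selectors (no arguments), and the real DepMiner prototype is implemented in Pharo and may generate its output differently.

```python
def generate_deprecation(class_name: str, old_selector: str, new_selector: str) -> str:
    """Render Pharo source for a rewriting deprecation of a unary method."""
    return "\n".join([
        f"{class_name} >> {old_selector}",
        "    self",
        f"        deprecated: 'Use #{new_selector} instead.'",
        f"        transformWith: '`@receiver {old_selector}' -> '`@receiver {new_selector}'.",
        f"    ^ self {new_selector}",
    ])


# Missing method Node >> next, association rule {next} -> {nextNode}.
source = generate_deprecation("Node", "next", "nextNode")
print(source)
```

The generated method keeps the old selector alive, forwards to the replacement, and carries the transformation rule that lets client call sites be rewritten.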
  • 54. Step 4. Generate Deprecations. Challenge 2: Absence of static type information. How we address it: Retain only those association rules where the methods in the antecedent and the consequent are defined in the same class.
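One way to read the same-class filter is sketched below in Python. The interpretation is an assumption: a rule is kept when some class defined the antecedent method in the old version and defines the consequent method in the new version. The `Rule` type and the dictionaries mapping class names to method names are hypothetical.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    antecedent: str  # method that disappeared from the old API
    consequent: str  # candidate replacement in the new API


def defined_in_same_class(rule: Rule, old_methods: dict, new_methods: dict) -> bool:
    """True if one class owned the old method and now owns the replacement."""
    return any(
        rule.antecedent in selectors and rule.consequent in new_methods.get(cls, set())
        for cls, selectors in old_methods.items()
    )


old_methods = {"Node": {"next", "value"}}
new_methods = {"Node": {"nextNode", "value"}, "Collection": {"size"}}

rules = [Rule("next", "nextNode"), Rule("next", "size")]
kept = [rule for rule in rules if defined_in_same_class(rule, old_methods, new_methods)]
print(kept)
```

The rule `next` → `size` is dropped because `size` lives in a different class, so without type information there is no evidence that it can replace `Node >> next`.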
  • 55. 55 Prototype Tool for Pharo Part 5 / 6
  • 57. Recommended Deprecations [bar chart; projects: Pharo, Famix, DataFrame, Moose, Pillar; series: accepted recommendations (rewriting, non-rewriting) and rejected recommendations; extracted values: 113 56 5 52 33 1 32 87 28 40 19 1 1 11 4 7]
  • 58. Limitations of DepMiner. Limitation 1, simplified recommendations: the developer may not want to deprecate (“It’s correct but I will not deprecate”). Limitation 2, unused / untested methods: ineffective for methods that are not called internally. Limitation 3, naive search: searches the entire commit history. [Diagram labels: test, test, test, Coverage: 80%; v1.0, v2.0, breaking change, effect]
  • 59. Overcoming Limitation 1 Deprecation with a transformation rule Deprecation with a replacement message Deprecation without a replacement message Library update script with automatic rules Documentation with replacement message Documentation: list of breaking changes Does replacement exist? Is it automatable? yes no no yes Does developer want to deprecate? yes no Identified with the help of developers from Pharo open-source community 59 Part 5 / 6
  • 61. 61 1. Developer survey to understand the practice of library update Contributions 2. First documentation of the Deprewriter approach. 3. Study of its adoption by the community 4. DepMiner — a novel approach to mine transformation rules. 5. Generalisation of DepMiner as a holistic approach. Empirical Language Automation Part 6 / 6
  • 62. Future Work. Improve DepMiner. Limitation 1 (simplified recommendations): Consider other actions of library developers (based on the table). Limitation 2 (unused / untested methods): Combine with existing approaches to detect refactoring. Limitation 3 (naive search): Detect the commit that introduced the breaking change and search neighbour commits.
  • 63. Future Work. Challenging Cases 63 1. Reassigning the existing name 2. Circular renaming 3. Modifying abstract hooks 4. Cleaning up spurious objects 5. String literals used as identifiers Method-to-method mapping: Part 6 / 6 Documented challenges: old API new API
  • 64. 64 2 Journal Papers ‣ N. Anquetil, J. Delplanque, S. Ducasse, O. Zaitsev, C. Fuhrman, and Y.-G. Guéhéneuc. What Do Developers Consider Magic literals? A Smalltalk Perspective. IST, 2022. ‣ S. Ducasse, G. Polito, O. Zaitsev, M. Denker, and P. Tesone. Deprewriter: On the fly rewriting method deprecations. JOT, 2021. 3 Conference Papers ‣ O. Zaitsev, S. Ducasse, N. Anquetil, and A. Thiefaine. How Libraries Evolve: A Survey of Two Industrial Companies and an Open-Source Community. APSEC (industrial track), 2022. ‣ O. Zaitsev, S. Ducasse, N. Anquetil, and A. Thiefaine. DepMiner: Automatic Recommendation of Transformation Rules for Method Deprecation. ICSR, 2022. ‣ O. Zaitsev, S. Ducasse, A. Bergel, and M. Eveillard. Suggesting Descriptive Method Names: An Exploratory Study of Two Machine Learning Approaches. QUATIC, 2020. + 2 Workshop Papers & 1 technical report 2nd best paper award at IWST’22 Best poster award at GDR GPL Publications (7 papers) Part 6 / 6
  • 65. Summary. Contributions: ✓ Library update survey of developers. ✓ First documentation of Deprewriter. ✓ Analysis of its adoption by the community. ✓ DepMiner: a novel approach to mine rules. ✓ DepMiner as a holistic approach. In numbers: 7 papers, 2 awards, > 100 merged pull requests.
  • 68. Static & Dynamic Analysis 68 Deprewriter Client System Commit History DepMiner Statically mine rules from history Dynamically apply rules to client code
  • 69. Mining Method Call Replacements. Input: a missing method m() and a local subset of commits from the history. Transactions (method changes that remove a call to m()): {remove(m), add(m’), add(x)}, {remove(m), remove(n), add(m’)}, …, {remove(m), add(m’)}. Frequent itemsets (A-Priori): {remove(m), add(m’)} count: 15; {remove(m), remove(n), add(x)} count: 5. Association rules: {m} → {m’} support: 15, confidence: 0.5; {m, n} → {x} support: 5, confidence: 0.3.
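The pipeline on this slide (transactions, frequent itemsets, association rules) can be illustrated with a deliberately simplified Python sketch. It is not A-Priori: it only counts one-to-one pairs of a removed call and an added call, and scores each pair by how often the added call accompanies the removal. The function name, the data layout and the threshold are assumptions, and the toy data does not reproduce the counts on the slide.

```python
from collections import Counter
from itertools import product


def mine_replacements(transactions, min_confidence=0.5):
    """Return (old, new, count, confidence) for one-to-one call replacements.

    Each transaction is a pair (removed_calls, added_calls) of sets taken
    from one method change.
    """
    removed_count = Counter()
    pair_count = Counter()
    for removed, added in transactions:
        removed_count.update(removed)
        pair_count.update(product(removed, added))

    rules = []
    for (old, new), pair_hits in pair_count.items():
        rule_confidence = pair_hits / removed_count[old]
        if rule_confidence >= min_confidence:
            rules.append((old, new, pair_hits, rule_confidence))
    # Most confident rules first; names break ties deterministically.
    return sorted(rules, key=lambda rule: (-rule[3], rule[0], rule[1]))


transactions = [
    ({"m"}, {"m2", "x"}),
    ({"m", "n"}, {"m2"}),
    ({"m"}, {"m2"}),
    ({"n"}, {"y"}),
]

rules = mine_replacements(transactions, min_confidence=0.6)
print(rules)
```

With this toy history, every change that removes a call to `m` adds a call to `m2`, so `m` → `m2` is the only rule above the threshold; the weaker pairs (`m` → `x`, `n` → `m2`, `n` → `y`) are discarded.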
  • 70. 70 public static LinkedList insert(LinkedList list, int data) { Node new_node = new Node(data); - new_node.setNext(null); + new_node.setNextNode(null); if (list.head() == null) { list.setHead(new_node); } else { Node last = list.head; - while (last.next() != null) { - last = last.next(); + while (last.nextNode() != null) { + last = last.nextNode(); } last.next = new_node; } return list; } Method change — one method modified by one commit Method Change Part 5 / 6
  • 71. 71 public static LinkedList insert(LinkedList list, int data) { Node new_node = new Node(data); - new_node.setNext(null); + new_node.setNextNode(null); if (list.head() == null) { list.setHead(new_node); } else { Node last = list.head; - while (last.next() != null) { - last = last.next(); + while (last.nextNode() != null) { + last = last.nextNode(); } last.next = new_node; } return list; } { remove(setNext), add(setNextNode), remove(next), remove(next), add(nextNode), add(nextNode) } Transaction — set of added and removed method calls in a method change: Method Change as Transaction Part 5 / 6
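Turning a method change into a transaction can be sketched with a regular expression over the diff lines. The Python sketch below is a rough approximation under stated assumptions: it only recognises calls written as `receiver.name(`, it treats the diff as plain text instead of parsing Java, and it cancels out calls that appear on both a removed and an added line. The names `CALL` and `transaction` are illustrative.

```python
import re
from collections import Counter

# Matches `.name(` and captures the method name (receiver-less calls are ignored).
CALL = re.compile(r"\.(\w+)\s*\(")


def transaction(diff_lines):
    """Return (removed_calls, added_calls) as multisets for one method change."""
    removed, added = Counter(), Counter()
    for line in diff_lines:
        if line.startswith("-"):
            removed.update(CALL.findall(line))
        elif line.startswith("+"):
            added.update(CALL.findall(line))
    # Calls that were merely moved or kept on a modified line cancel out.
    return removed - added, added - removed


diff = [
    "  Node new_node = new Node(data);",
    "- new_node.setNext(null);",
    "+ new_node.setNextNode(null);",
    "- while (last.next() != null) {",
    "-   last = last.next();",
    "+ while (last.nextNode() != null) {",
    "+   last = last.nextNode();",
]

removed, added = transaction(diff)
print(dict(removed))
print(dict(added))
```

For the change shown on the slide this yields one removed `setNext`, two removed `next`, one added `setNextNode` and two added `nextNode`, matching the listed transaction.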
  • 72. Challenging Problems of Library Update Support B:
  • 73. Rewriting Abstract Hooks 73 FileReader read() Library v1.0 CSVFileReader read() Client
  • 74. Rewriting Abstract Hooks 74 FileReader read() FileReader readFile() Library v2.0 Library v1.0 CSVFileReader read() CSVFileReader read() Client Client renamed Method is never called
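Why a renamed hook is hard to support by rewriting call sites can be shown with a small Python sketch. The classes mirror the slide, but the template method `load` and the default hook bodies are assumptions added to make the example executable.

```python
class FileReaderV1:
    """Library v1.0: clients override the hook `read`."""

    def read(self) -> str:
        return ""

    def load(self) -> str:  # hypothetical template method that calls the hook
        return self.read().strip()


class FileReaderV2:
    """Library v2.0: the hook was renamed to `readFile`."""

    def readFile(self) -> str:
        return ""

    def load(self) -> str:
        return self.readFile().strip()


def make_client(base):
    """Build the client subclass against a given library version."""

    class CSVFileReader(base):
        def read(self) -> str:  # client code still overrides the old hook
            return "a,b,c\n"

    return CSVFileReader


print(repr(make_client(FileReaderV1)().load()))  # client hook is used
print(repr(make_client(FileReaderV2)().load()))  # client hook is silently ignored
```

The client never calls `read` itself; it defines it. After the rename the library calls `readFile`, so the client override is never executed and no deprecation on the library side is triggered by client code.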
  • 75. Removing Spurious Objects 75 300 px 200 px 50 px offsets (a Rectangle) -71 46 120 px fractions (a Rectangle) 0.4 0.25
  • 76. Removing Spurious Objects 76 300 px 200 px -71 46 bottomOffset rightOffset 0.4 0.25 bottomFraction rightFraction
  • 77. Removing Spurious Objects 77 frame := LayoutFrame new topFraction: 0 offset: 0; leftFraction: 0 offset: 0; bottomFraction: 0.4 offset: 46; rightFraction: 0.25 offset: -71; yourself. fractionRectangle := 0@0 extent: 0.4@0.25. offsetRectangle := 0@0 extent: 46@(-71). frame := LayoutFrame fractions: fractionRectangle offsets: offsetRectangle. v1.0 v2.0
  • 78. Removing Spurious Objects 78 frame := LayoutFrame new topFraction: 0 offset: 0; leftFraction: 0 offset: 0; bottomFraction: 0.4 offset: 46; rightFraction: 0.25 offset: -71; yourself. fractionRectangle := 0@0 extent: 0.4@0.25. offsetRectangle := 0@0 extent: 46@(-71). frame := LayoutFrame fractions: fractionRectangle offsets: offsetRectangle. v1.0 v2.0 We can rewrite method calls
  • 79. Removing Spurious Objects 79 frame := LayoutFrame new topFraction: 0 offset: 0; leftFraction: 0 offset: 0; bottomFraction: 0.4 offset: 46; rightFraction: 0.25 offset: -71; yourself. fractionRectangle := 0@0 extent: 0.4@0.25. offsetRectangle := 0@0 extent: 46@(-71). frame := LayoutFrame fractions: fractionRectangle offsets: offsetRectangle. v1.0 v2.0 Hard to find and remove
  • 80. a b b c Library v1.0 Library v2.0 Problematic Renaming 80 a b b a Library v1.0 Library v2.0 Reassigning the Name Circular Renaming
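One facet of why circular renaming is problematic can be shown with a Python sketch that applies rename rules to client code as text. This is an illustration under assumptions: real rules operate on parse trees rather than on regular expressions, and the function names are hypothetical. Applying the rules one after another undoes the first rename, while applying them in a single pass gives the intended result.

```python
import re


def rename_sequentially(code: str, renames: dict) -> str:
    """Apply each rename rule in turn; later rules see earlier results."""
    for old, new in renames.items():
        code = re.sub(rf"\b{re.escape(old)}\b", new, code)
    return code


def rename_simultaneously(code: str, renames: dict) -> str:
    """Apply all rename rules in one pass over the original code."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, renames)) + r")\b")
    return pattern.sub(lambda match: renames[match.group(1)], code)


circular = {"a": "b", "b": "a"}  # v1.0 a becomes b, v1.0 b becomes a
code = "x.a(); x.b()"

print(rename_sequentially(code, circular))    # x.a(); x.a()
print(rename_simultaneously(code, circular))  # x.b(); x.a()
```

Even the single-pass version assumes the client code has not been touched yet: once part of it is updated, a call to `b` may refer to either the old or the new meaning of the name, which is the ambiguity behind both cases on the slide.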