SlideShare ist ein Scribd-Unternehmen logo
1 von 28
Heterogeneous Defect
Prediction
ESEC/FSE 2015
September 3, 2015
Jaechang Nam and Sunghun Kim
Department of Computer Science and Engineering
HKUST
2
Predict
Training
?
?
Model
Project A
: Metric value
: Buggy-labeled instance
: Clean-labeled instance
?: Unlabeled instance
Software Defect Prediction
Related Work
Munson@TSE`92, Basili@TSE`95, Menzies@TSE`07,
Hassan@ICSE`09, Bird@FSE`11,D’ambros@EMSE112
Lee@FSE`11,...
What if labeled instances do not
exist?
3
?
?
?
?
?
Project X
Unlabeled
Dataset
?: Unlabeled instance
: Metric value
Model
New projects
Projects lacking in
historical data
Existing Solutions?
4
?
?
?
?
?
(New) Project X
Unlabeled
Dataset
?: Unlabeled instance
: Metric value
Cross-Project Defect Prediction
(CPDP)
5
?
?
?
?
?
Training
Predict
Model
Project A
(source)
Project X
(target)
Unlabeled
Dataset
: Metric value
: Buggy-labeled instance
: Clean-labeled instance
?: Unlabeled instance
Related Work
Watanabe@PROMISE`08, Turhan@EMSE`09
Zimmermann@FSE`09, Ma@IST`12, Zhang@MSR`14
Zhang@MSR`14, Panichella@WCRE`14,
Canfora@STVR15
Challenge
Same metric set
(same feature space)
• Heterogeneous
metrics between
source and target
Motivation
6
?
Training
Test
Model
Project A
(source)
Project C
(target)
?
?
?
?
?
?
?
Heterogeneous metric sets
(different feature spaces
or different domains)
Possible to Reuse all the existing defect datasets for CPDP!
Heterogeneous Defect Prediction (HDP)
Key Idea
• Consistent defect-proneness tendency of
metrics
– Defect prediction metrics measure complexity of
software and its development process.
• e.g.
– The number of developers touching a source code file
(Bird@FSE`11)
– The number of methods in a class (D’Ambroas@ESEJ`12)
– The number of operands (Menzies@TSE`08)
More complexity implies more defect-proneness
(Rahman@ICSE`13)
• Distributions between source and target should
be the same to build a strong prediction model.
7
Match source and target metrics that
have similar distribution
Heterogeneous Defect Prediction (HDP)
- Overview -
8
X1 X2 X3 X4 Label
1 1 3 10 Buggy
8 0 1 0 Clean
⋮ ⋮ ⋮ ⋮ ⋮
9 0 1 1 Clean
Metric
Matching
Source: Project A Target: Project B
Cross-
prediction Model
Build
(training)
Predict
(test)
Metric
Selection
Y1 Y2 Y3 Y4 Y5 Y6 Y7 Label
3 1 1 0 2 1 9 ?
1 1 9 0 2 3 8 ?
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
0 1 1 1 2 1 1 ?
1 3 10 Buggy
8 1 0 Clean
⋮ ⋮ ⋮ ⋮
9 1 1 Clean
1 3 10 Buggy
8 1 0 Clean
⋮ ⋮ ⋮ ⋮
9 1 1 Clean
9 1 1 ?
8 3 9 ?
⋮ ⋮ ⋮ ⋮
1 1 1 ?
Metric Selection
• Why? (Guyon@JMLR`03)
– Select informative metrics
• Remove redundant and irrelevant metrics
– Decrease complexity of metric matching combination
• Feature Selection Approaches (Gao@SPE`11,Shivaji@TSE`13)
– Gain Ratio
– Chi-square
– Relief-F
– Significance attribute evaluation
9
Metric Matching
10
Source Metrics Target Metrics
X1
X2
Y1
Y2
0.8
0.5
* We can apply different cutoff values of matching score
* It can be possible that there is no matching at all.
Compute Matching Score
KSAnalyzer
• Use p-value of Kolmogorov-Smirnov Test
(Massey@JASA`51)
11
Matching Score M of i-th source and j-th target metrics:
Mij = pij
Heterogeneous Defect Prediction
- Overview -
12
X1 X2 X3 X4 Label
1 1 3 10 Buggy
8 0 1 0 Clean
⋮ ⋮ ⋮ ⋮ ⋮
9 0 1 1 Clean
Metric
Matching
Source: Project A Target: Project B
Cross-
prediction Model
Build
(training)
Predict
(test)
Metric
Selection
Y1 Y2 Y3 Y4 Y5 Y6 Y7 Label
3 1 1 0 2 1 9 ?
1 1 9 0 2 3 8 ?
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
0 1 1 1 2 1 1 ?
1 3 10 Buggy
8 1 0 Clean
⋮ ⋮ ⋮ ⋮
9 1 1 Clean
1 3 10 Buggy
8 1 0 Clean
⋮ ⋮ ⋮ ⋮
9 1 1 Clean
9 1 1 ?
8 3 9 ?
⋮ ⋮ ⋮ ⋮
1 1 1 ?
EVALUATION
13
Baselines
• WPDP
• CPDP-CM (Turhan@EMSE`09,Ma@IST`12,He@IST`14)
– Cross-project defect prediction using only
common metrics between source and target
datasets
• CPDP-IFS (He@CoRR`14)
– Cross-project defect prediction on
Imbalanced Feature Set (i.e. heterogeneous
metric set)
– 16 distributional characteristics of values of
an instance as features (e.g., mean, std,
maximum,...)
14
Research Questions (RQs)
• RQ1
– Is heterogeneous defect prediction comparable
to WPDP?
• RQ2
– Is heterogeneous defect prediction comparable
to CPDP-CM?
• RQ3
– Is Heterogeneous defect prediction comparable
to CPDP-IFS?
15
Benchmark Datasets
Group Dataset
# of instances # of
metrics
Granularity
All Buggy (%)
AEEEM
EQ 325 129 (39.7%)
61 Class
JDT 997 206 (20.7%)
LC 399 64 (9.36%)
ML 1862 245 (13.2%)
PDE 1492 209 (14.0%)
MORP
H
ant-1.3 125 20 (16.0%)
20 Class
arc 234 27 (11.5%)
camel-1.0 339 13 (3.8%)
poi-1.5 237 141 (75.0%)
redaktor 176 27 (15.3%)
skarbonka 45 9 (20.0%)
tomcat 858 77 (9.0%)
velocity-1.4 196 147 (75.0%)
xalan-2.4 723 110 (15.2%)
xerces-1.2 440 71 (16.1%)
16
Group Dataset
# of instances # of
metrics
Granularity
All Buggy (%)
ReLink
Apache 194 98 (50.5%)
26 FileSafe 56 22 (39.3%)
ZXing 399
118
(29.6%)
NASA
cm1 327 42 (12.8%)
37 Function
mw1 253 27 (10.7%)
pc1 705 61 (8.7%)
pc3 1077
134
(12.4%)
pc4 1458
178
(12.2%)
SOFTLA
B
ar1 121 9 (7.4%)
29 Function
ar3 63 8 (12.7%)
ar4 107 20 (18.7%)
ar5 36 8 (22.2%)
ar6 101 15 (14.9%)
600 prediction combinations in total!
Experimental Settings
• Logistic Regression
• HDP vs. WPDP, CPDP-CM, and CPDP-IFS
17
Test set
(50%)
Training set
(50%)
Project
1
Project
2
Project
n
...
...
X 1000
Project
1
Project
2
Project
n
...
...
CPDP-CM
CPDP-IFS
HDP
WPDP
Project A
Evaluation Measures
• False Positive Rate = FP/(TN+FP)
• True Positive Rate = Recall
• AUC (Area Under receiver operating characteristic Curve)
18
False Positive rate
TruePositiverate
0
1
1
Evaluation Measures
• Win/Tie/Loss (Valentini@ICML`03, Li@JASE`12, Kocaguneli@TSE`13)
– Wilcoxon signed-rank test (p<0.05) for 1000
prediction results
– Win
• # of outperforming HDP prediction combinations with
statistical significance. (p<0.05)
– Tie
• # of HDP prediction combinations with no statistical
significance. (p≥0.05)
– Loss
• # of outperforming baseline prediction results with
statistical significance. (p>0.05)
19
RESULT
20
Prediction Results in median
AUC
Target WPDP
CPDP-
CM
CPDP-
IFS
HDPKS
(cutoff
=0.05)
EQ 0.583 0.776 0.461 0.783
JDT 0.795 0.781 0.543 0.767
MC 0.575 0.636 0.584 0.655
ML 0.734 0.651 0.557 0.692*
PDE 0.684 0.682 0.566 0.717
ant-1.3 0.670 0.611 0.500 0.701
arc 0.670 0.611 0.523 0.701
camel-1.0 0.550 0.590 0.500 0.639
poi-1.5 0.707 0.676 0.606 0.537
redaktor 0.744 0.500 0.500 0.537
skarbonka 0.569 0.736 0.528 0.694*
tomcat 0.778 0.746 0.640 0.818
velocity-
1.4
0.725 0.609 0.500 0.391
xalan-2.4 0.755 0.658 0.499 0.751
xerces-1.2 0.624 0.453 0.500 0.489
21
Target WPDP
CPDP-
CM
CPDP-
IFS
HDPKS
(cutoff
=0.05)
Apache 0.714 0.689 0.635 0.717*
Safe 0.706 0.749 0.616 0.818*
ZXing 0.605 0.619 0.530 0.650*
cm1 0.653 0.622 0.551 0.717*
mw1 0.612 0.584 0.614 0.727
pc1 0.787 0.675 0.564 0.752*
pc3 0.794 0.665 0.500 0.738*
pc4 0.900 0.773 0.589 0.682*
ar1 0.582 0.464 0.500 0.734*
ar3 0.574 0.862 0.682 0.823*
ar4 0.657 0.588 0.575 0.816*
ar5 0.804 0.875 0.585 0.911*
ar6 0.654 0.611 0.527 0.640
All 0.657 0.636 0.555 0.724*
HDPKS: Heterogeneous defect prediction using KSAnalyzer
Win/Tie/Loss Results
Target
Against
WPDP
Against
CPDP-CM
Against
CPDP-IFS
W T L W T L W T L
EQ 4 0 0 2 2 0 4 0 0
JDT 0 0 5 3 0 2 5 0 0
LC 6 0 1 3 3 1 3 1 3
ML 0 0 6 4 2 0 6 0 0
PDE 3 0 2 2 0 3 5 0 0
ant-1.3 6 0 1 6 0 1 5 0 2
arc 3 1 0 3 0 1 4 0 0
camel-1.0 3 0 2 3 0 2 4 0 1
poi-1.5 2 0 2 3 0 1 2 0 2
redaktor 0 0 4 2 0 2 3 0 1
skarbonka 11 0 0 4 0 7 9 0 2
tomcat 2 0 0 1 1 0 2 0 0
velocity-
1.4
0 0 3 0 0 3 0 0 3
xalan-2.4 0 0 1 1 0 0 1 0 0
xerces-1.2 0 0 3 3 0 0 1 0 2 22
Target
Against
WPDP
Against
CPDP-CM
Against
CPDP-IFS
W T L W T L W T L
Apach
e
6 0 5 8 1 2 9 0 2
Safe 14 0 3 12 0 5 15 0 2
ZXing 8 0 0 6 0 2 7 0 1
cm1 7 1 2 8 0 2 9 0 1
mw1 5 0 1 4 0 2 4 0 2
pc1 1 0 5 5 0 1 6 0 0
pc3 0 0 7 7 0 0 7 0 0
pc4 0 0 7 2 0 5 7 0 0
ar1 14 0 1 14 0 1 11 0 4
ar3 15 0 0 5 0 10 10 2 3
ar4 16 0 0 14 1 1 15 0 1
ar5 14 0 4 14 0 4 16 0 2
ar6 7 1 7 8 4 3 12 0 3
Total 147 3 72 147 14 61 182 3 35
%
66.2
%
1.4%
32.4
%
66.2
%
6.3%
27.5
%
82.0
%
1.3%
16.7
%
Matched Metrics (Win)
23
MetricValues
Distribution
(Source metric: RFC-the number of method invoked by a class, Target metric: the number of operand
Matching Score = 0.91
AUC = 0.946 (ant1.3  ar5)
Matched Metrics (Loss)
24
MetricValues
Distribution
(Source metric: LOC, Target metric: average number of LOC in a method)
Matching Score = 0.13
AUC = 0.391 (Safe  velocity-1.4)
Different Feature Selections
(median AUCs, Win/Tie/Loss)
25
Approach
Against
WPDP
Against
CPDP-CM
Against
CPDP-IFS
HDP
AUC Win% AUC Win% AUC Win% AUC
Gain Ratio 0.657 63.7% 0.645 63.2% 0.536 80.2% 0.720
Chi-Square 0.657 64.7% 0.651 66.4% 0.556 82.3% 0.727
Significanc
e
0.657 66.2% 0.636 66.2% 0.553 82.0% 0.724
Relief-F 0.670 57.0% 0.657 63.1% 0.543 80.5% 0.709
None 0.657 47.3% 0.624 50.3% 0.536 66.3% 0.663
Results in Different Cutoffs
26
Cutoff
Against
WPDP
Against
CPDP-CM
Against
CPDP-IFS
HDP Target
Coverage
AUC Win% AUC Win% AUC Win% AUC
0.05 0.657 66.2% 0.636 66.2% 0.553 82.4% 0.724* 100%
0.90 0.657 100% 0.761 71.4% 0.624 100% 0.852* 21%
Conclusion
• HDP
– Potential for CPDP across datasets with
different metric sets.
• Future work
– Filtering out noisy metric matching
– Determine the best probability threshold
27
Q&A
THANK YOU!
28

Weitere ähnliche Inhalte

Was ist angesagt?

AI search techniques
AI search techniquesAI search techniques
AI search techniquesOmar Isaid
 
Reinforcement learning, Q-Learning
Reinforcement learning, Q-LearningReinforcement learning, Q-Learning
Reinforcement learning, Q-LearningKuppusamy P
 
POST’s CORRESPONDENCE PROBLEM
POST’s CORRESPONDENCE PROBLEMPOST’s CORRESPONDENCE PROBLEM
POST’s CORRESPONDENCE PROBLEMRajendran
 
Artificial Intelligence Searching Techniques
Artificial Intelligence Searching TechniquesArtificial Intelligence Searching Techniques
Artificial Intelligence Searching TechniquesDr. C.V. Suresh Babu
 
Quick sort-Data Structure
Quick sort-Data StructureQuick sort-Data Structure
Quick sort-Data StructureJeanie Arnoco
 
Queue AS an ADT (Abstract Data Type)
Queue AS an ADT (Abstract Data Type)Queue AS an ADT (Abstract Data Type)
Queue AS an ADT (Abstract Data Type)Self-Employed
 
Adversarial search with Game Playing
Adversarial search with Game PlayingAdversarial search with Game Playing
Adversarial search with Game PlayingAman Patel
 
Tic Tac Toe using Mini Max Algorithm
Tic Tac Toe using Mini Max AlgorithmTic Tac Toe using Mini Max Algorithm
Tic Tac Toe using Mini Max AlgorithmUjjawal Poudel
 
Ll(1) Parser in Compilers
Ll(1) Parser in CompilersLl(1) Parser in Compilers
Ll(1) Parser in CompilersMahbubur Rahman
 
Propositional logic
Propositional logicPropositional logic
Propositional logicRushdi Shams
 
String matching with finite state automata
String matching with finite state automataString matching with finite state automata
String matching with finite state automataAnmol Hamid
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learningSung Kim
 
Computer Security Lecture 4: Block Ciphers and the Data Encryption Standard
Computer Security Lecture 4: Block Ciphers and the Data Encryption StandardComputer Security Lecture 4: Block Ciphers and the Data Encryption Standard
Computer Security Lecture 4: Block Ciphers and the Data Encryption StandardMohamed Loey
 
Information and network security 13 playfair cipher
Information and network security 13 playfair cipherInformation and network security 13 playfair cipher
Information and network security 13 playfair cipherVaibhav Khanna
 

Was ist angesagt? (20)

Heap sort
Heap sortHeap sort
Heap sort
 
Data Analysis With Pandas
Data Analysis With PandasData Analysis With Pandas
Data Analysis With Pandas
 
AI search techniques
AI search techniquesAI search techniques
AI search techniques
 
Reinforcement learning, Q-Learning
Reinforcement learning, Q-LearningReinforcement learning, Q-Learning
Reinforcement learning, Q-Learning
 
Naive string matching
Naive string matchingNaive string matching
Naive string matching
 
POST’s CORRESPONDENCE PROBLEM
POST’s CORRESPONDENCE PROBLEMPOST’s CORRESPONDENCE PROBLEM
POST’s CORRESPONDENCE PROBLEM
 
Artificial Intelligence Searching Techniques
Artificial Intelligence Searching TechniquesArtificial Intelligence Searching Techniques
Artificial Intelligence Searching Techniques
 
Quick sort-Data Structure
Quick sort-Data StructureQuick sort-Data Structure
Quick sort-Data Structure
 
Queue AS an ADT (Abstract Data Type)
Queue AS an ADT (Abstract Data Type)Queue AS an ADT (Abstract Data Type)
Queue AS an ADT (Abstract Data Type)
 
Adversarial search with Game Playing
Adversarial search with Game PlayingAdversarial search with Game Playing
Adversarial search with Game Playing
 
Tic Tac Toe using Mini Max Algorithm
Tic Tac Toe using Mini Max AlgorithmTic Tac Toe using Mini Max Algorithm
Tic Tac Toe using Mini Max Algorithm
 
Ll(1) Parser in Compilers
Ll(1) Parser in CompilersLl(1) Parser in Compilers
Ll(1) Parser in Compilers
 
Propositional logic
Propositional logicPropositional logic
Propositional logic
 
Stack and queue
Stack and queueStack and queue
Stack and queue
 
String matching with finite state automata
String matching with finite state automataString matching with finite state automata
String matching with finite state automata
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learning
 
Computer Security Lecture 4: Block Ciphers and the Data Encryption Standard
Computer Security Lecture 4: Block Ciphers and the Data Encryption StandardComputer Security Lecture 4: Block Ciphers and the Data Encryption Standard
Computer Security Lecture 4: Block Ciphers and the Data Encryption Standard
 
Hmm viterbi
Hmm viterbiHmm viterbi
Hmm viterbi
 
Information and network security 13 playfair cipher
Information and network security 13 playfair cipherInformation and network security 13 playfair cipher
Information and network security 13 playfair cipher
 
Hill climbing algorithm
Hill climbing algorithmHill climbing algorithm
Hill climbing algorithm
 

Andere mochten auch

A Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesA Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesSung Kim
 
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...Sung Kim
 
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)Sung Kim
 
Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)Sung Kim
 
Source code comprehension on evolving software
Source code comprehension on evolving softwareSource code comprehension on evolving software
Source code comprehension on evolving softwareSung Kim
 
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...Sung Kim
 
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Sung Kim
 
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)Sung Kim
 
Personalized Defect Prediction
Personalized Defect PredictionPersonalized Defect Prediction
Personalized Defect PredictionSung Kim
 
STAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash ReproductionSTAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash ReproductionSung Kim
 
Tensor board
Tensor boardTensor board
Tensor boardSung Kim
 
Software Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSoftware Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSung Kim
 
Time series classification
Time series classificationTime series classification
Time series classificationSung Kim
 

Andere mochten auch (13)

A Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesA Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution Techniques
 
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
 
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
 
Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)
 
Source code comprehension on evolving software
Source code comprehension on evolving softwareSource code comprehension on evolving software
Source code comprehension on evolving software
 
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
 
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
 
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
 
Personalized Defect Prediction
Personalized Defect PredictionPersonalized Defect Prediction
Personalized Defect Prediction
 
STAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash ReproductionSTAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash Reproduction
 
Tensor board
Tensor boardTensor board
Tensor board
 
Software Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSoftware Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled Datasets
 
Time series classification
Time series classificationTime series classification
Time series classification
 

Ähnlich wie Heterogeneous Defect Prediction (

ESEC/FSE 2015)

A tale of experiments on bug prediction
A tale of experiments on bug predictionA tale of experiments on bug prediction
A tale of experiments on bug predictionMartin Pinzger
 
Heuristic design of experiments w meta gradient search
Heuristic design of experiments w meta gradient searchHeuristic design of experiments w meta gradient search
Heuristic design of experiments w meta gradient searchGreg Makowski
 
Artificial software diversity: automatic synthesis of program sosies
Artificial software diversity: automatic synthesis of program sosiesArtificial software diversity: automatic synthesis of program sosies
Artificial software diversity: automatic synthesis of program sosiesFoCAS Initiative
 
Towards Evaluating Size Reduction Techniques for Software Model Checking
Towards Evaluating Size Reduction Techniques for Software Model CheckingTowards Evaluating Size Reduction Techniques for Software Model Checking
Towards Evaluating Size Reduction Techniques for Software Model CheckingAkos Hajdu
 
A Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug PredictionA Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug PredictionMartin Pinzger
 
Automation of building reliable models
Automation of building reliable modelsAutomation of building reliable models
Automation of building reliable modelsEszter Szabó
 
Variable Selection Methods
Variable Selection MethodsVariable Selection Methods
Variable Selection Methodsjoycemi_la
 
Variable Selection Methods
Variable Selection MethodsVariable Selection Methods
Variable Selection Methodsjoycemi_la
 
Comparing Machine Learning Algorithms in Text Mining
Comparing Machine Learning Algorithms in Text MiningComparing Machine Learning Algorithms in Text Mining
Comparing Machine Learning Algorithms in Text MiningAndrea Gigli
 
DSUS_MAO_2012_Jie
DSUS_MAO_2012_JieDSUS_MAO_2012_Jie
DSUS_MAO_2012_JieMDO_Lab
 
MuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CMuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CSusumu Tokumoto
 
Two methods for optimising cognitive model parameters
Two methods for optimising cognitive model parametersTwo methods for optimising cognitive model parameters
Two methods for optimising cognitive model parametersUniversity of Huddersfield
 
Dependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software BugsDependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software BugsRoberto Natella
 
A multi-sensor based uncut crop edge detection method for head-feeding combin...
A multi-sensor based uncut crop edge detection method for head-feeding combin...A multi-sensor based uncut crop edge detection method for head-feeding combin...
A multi-sensor based uncut crop edge detection method for head-feeding combin...Institute of Agricultural Machinery, NARO
 
Deep_Learning__INAF_baroncelli.pdf
Deep_Learning__INAF_baroncelli.pdfDeep_Learning__INAF_baroncelli.pdf
Deep_Learning__INAF_baroncelli.pdfasdfasdf214078
 
Human_Activity_Recognition_Predictive_Model
Human_Activity_Recognition_Predictive_ModelHuman_Activity_Recognition_Predictive_Model
Human_Activity_Recognition_Predictive_ModelDavid Ritchie
 

Ähnlich wie Heterogeneous Defect Prediction (

ESEC/FSE 2015) (20)

A tale of experiments on bug prediction
A tale of experiments on bug predictionA tale of experiments on bug prediction
A tale of experiments on bug prediction
 
Heuristic design of experiments w meta gradient search
Heuristic design of experiments w meta gradient searchHeuristic design of experiments w meta gradient search
Heuristic design of experiments w meta gradient search
 
Artificial software diversity: automatic synthesis of program sosies
Artificial software diversity: automatic synthesis of program sosiesArtificial software diversity: automatic synthesis of program sosies
Artificial software diversity: automatic synthesis of program sosies
 
Podem_Report
Podem_ReportPodem_Report
Podem_Report
 
Towards Evaluating Size Reduction Techniques for Software Model Checking
Towards Evaluating Size Reduction Techniques for Software Model CheckingTowards Evaluating Size Reduction Techniques for Software Model Checking
Towards Evaluating Size Reduction Techniques for Software Model Checking
 
A Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug PredictionA Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug Prediction
 
Automation of building reliable models
Automation of building reliable modelsAutomation of building reliable models
Automation of building reliable models
 
CodeChecker summary 21062021
CodeChecker summary 21062021CodeChecker summary 21062021
CodeChecker summary 21062021
 
Variable Selection Methods
Variable Selection MethodsVariable Selection Methods
Variable Selection Methods
 
Variable Selection Methods
Variable Selection MethodsVariable Selection Methods
Variable Selection Methods
 
AIRS2016
AIRS2016AIRS2016
AIRS2016
 
Comparing Machine Learning Algorithms in Text Mining
Comparing Machine Learning Algorithms in Text MiningComparing Machine Learning Algorithms in Text Mining
Comparing Machine Learning Algorithms in Text Mining
 
DSUS_MAO_2012_Jie
DSUS_MAO_2012_JieDSUS_MAO_2012_Jie
DSUS_MAO_2012_Jie
 
MuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CMuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for C
 
Two methods for optimising cognitive model parameters
Two methods for optimising cognitive model parametersTwo methods for optimising cognitive model parameters
Two methods for optimising cognitive model parameters
 
BLAST and sequence alignment
BLAST and sequence alignmentBLAST and sequence alignment
BLAST and sequence alignment
 
Dependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software BugsDependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software Bugs
 
A multi-sensor based uncut crop edge detection method for head-feeding combin...
A multi-sensor based uncut crop edge detection method for head-feeding combin...A multi-sensor based uncut crop edge detection method for head-feeding combin...
A multi-sensor based uncut crop edge detection method for head-feeding combin...
 
Deep_Learning__INAF_baroncelli.pdf
Deep_Learning__INAF_baroncelli.pdfDeep_Learning__INAF_baroncelli.pdf
Deep_Learning__INAF_baroncelli.pdf
 
Human_Activity_Recognition_Predictive_Model
Human_Activity_Recognition_Predictive_ModelHuman_Activity_Recognition_Predictive_Model
Human_Activity_Recognition_Predictive_Model
 

Mehr von Sung Kim

DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence LearningDeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence LearningSung Kim
 
Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Sung Kim
 
A Survey on Dynamic Symbolic Execution for Automatic Test Generation
A Survey on  Dynamic Symbolic Execution  for Automatic Test GenerationA Survey on  Dynamic Symbolic Execution  for Automatic Test Generation
A Survey on Dynamic Symbolic Execution for Automatic Test GenerationSung Kim
 
Survey on Software Defect Prediction
Survey on Software Defect PredictionSurvey on Software Defect Prediction
Survey on Software Defect PredictionSung Kim
 
MSR2014 opening
MSR2014 openingMSR2014 opening
MSR2014 openingSung Kim
 
Automatic patch generation learned from human written patches
Automatic patch generation learned from human written patchesAutomatic patch generation learned from human written patches
Automatic patch generation learned from human written patchesSung Kim
 
The Anatomy of Developer Social Networks
The Anatomy of Developer Social NetworksThe Anatomy of Developer Social Networks
The Anatomy of Developer Social NetworksSung Kim
 
A Survey on Automatic Test Generation and Crash Reproduction
A Survey on Automatic Test Generation and Crash ReproductionA Survey on Automatic Test Generation and Crash Reproduction
A Survey on Automatic Test Generation and Crash ReproductionSung Kim
 
How Do Software Engineers Understand Code Changes? FSE 2012
How Do Software Engineers Understand Code Changes? FSE 2012How Do Software Engineers Understand Code Changes? FSE 2012
How Do Software Engineers Understand Code Changes? FSE 2012Sung Kim
 
Defect, defect, defect: PROMISE 2012 Keynote
Defect, defect, defect: PROMISE 2012 Keynote Defect, defect, defect: PROMISE 2012 Keynote
Defect, defect, defect: PROMISE 2012 Keynote Sung Kim
 
Predicting Recurring Crash Stacks (ASE 2012)
Predicting Recurring Crash Stacks (ASE 2012)Predicting Recurring Crash Stacks (ASE 2012)
Predicting Recurring Crash Stacks (ASE 2012)Sung Kim
 
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...Sung Kim
 
Software Development Meets the Wisdom of Crowds
Software Development Meets the Wisdom of CrowdsSoftware Development Meets the Wisdom of Crowds
Software Development Meets the Wisdom of CrowdsSung Kim
 
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)Sung Kim
 
Self-defending software: Automatically patching errors in deployed software ...
Self-defending software: Automatically patching  errors in deployed software ...Self-defending software: Automatically patching  errors in deployed software ...
Self-defending software: Automatically patching errors in deployed software ...Sung Kim
 
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)Sung Kim
 

Mehr von Sung Kim (16)

DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence LearningDeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
 
Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)
 
A Survey on Dynamic Symbolic Execution for Automatic Test Generation
A Survey on  Dynamic Symbolic Execution  for Automatic Test GenerationA Survey on  Dynamic Symbolic Execution  for Automatic Test Generation
A Survey on Dynamic Symbolic Execution for Automatic Test Generation
 
Survey on Software Defect Prediction
Survey on Software Defect PredictionSurvey on Software Defect Prediction
Survey on Software Defect Prediction
 
MSR2014 opening
MSR2014 openingMSR2014 opening
MSR2014 opening
 
Automatic patch generation learned from human written patches
Automatic patch generation learned from human written patchesAutomatic patch generation learned from human written patches
Automatic patch generation learned from human written patches
 
The Anatomy of Developer Social Networks
The Anatomy of Developer Social NetworksThe Anatomy of Developer Social Networks
The Anatomy of Developer Social Networks
 
A Survey on Automatic Test Generation and Crash Reproduction
A Survey on Automatic Test Generation and Crash ReproductionA Survey on Automatic Test Generation and Crash Reproduction
A Survey on Automatic Test Generation and Crash Reproduction
 
How Do Software Engineers Understand Code Changes? FSE 2012
How Do Software Engineers Understand Code Changes? FSE 2012How Do Software Engineers Understand Code Changes? FSE 2012
How Do Software Engineers Understand Code Changes? FSE 2012
 
Defect, defect, defect: PROMISE 2012 Keynote
Defect, defect, defect: PROMISE 2012 Keynote Defect, defect, defect: PROMISE 2012 Keynote
Defect, defect, defect: PROMISE 2012 Keynote
 
Predicting Recurring Crash Stacks (ASE 2012)
Predicting Recurring Crash Stacks (ASE 2012)Predicting Recurring Crash Stacks (ASE 2012)
Predicting Recurring Crash Stacks (ASE 2012)
 
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
 
Software Development Meets the Wisdom of Crowds
Software Development Meets the Wisdom of CrowdsSoftware Development Meets the Wisdom of Crowds
Software Development Meets the Wisdom of Crowds
 
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
 
Self-defending software: Automatically patching errors in deployed software ...
Self-defending software: Automatically patching  errors in deployed software ...Self-defending software: Automatically patching  errors in deployed software ...
Self-defending software: Automatically patching errors in deployed software ...
 
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
 

Kürzlich hochgeladen

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 

Kürzlich hochgeladen (20)

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 

Heterogeneous Defect Prediction (

ESEC/FSE 2015)

  • 1. Heterogeneous Defect Prediction ESEC/FSE 2015 September 3, 2015 Jaechang Nam and Sunghun Kim Department of Computer Science and Engineering HKUST
  • 2. 2 Predict Training ? ? Model Project A : Metric value : Buggy-labeled instance : Clean-labeled instance ?: Unlabeled instance Software Defect Prediction Related Work Munson@TSE`92, Basili@TSE`95, Menzies@TSE`07, Hassan@ICSE`09, Bird@FSE`11,D’ambros@EMSE112 Lee@FSE`11,...
  • 3. What if labeled instances do not exist? 3 ? ? ? ? ? Project X Unlabeled Dataset ?: Unlabeled instance : Metric value Model New projects Projects lacking in historical data
  • 4. Existing Solutions? 4 ? ? ? ? ? (New) Project X Unlabeled Dataset ?: Unlabeled instance : Metric value
  • 5. Cross-Project Defect Prediction (CPDP) 5 ? ? ? ? ? Training Predict Model Project A (source) Project X (target) Unlabeled Dataset : Metric value : Buggy-labeled instance : Clean-labeled instance ?: Unlabeled instance Related Work Watanabe@PROMISE`08, Turhan@EMSE`09 Zimmermann@FSE`09, Ma@IST`12, Zhang@MSR`14 Zhang@MSR`14, Panichella@WCRE`14, Canfora@STVR15 Challenge Same metric set (same feature space) • Heterogeneous metrics between source and target
  • 6. Motivation 6 ? Training Test Model Project A (source) Project C (target) ? ? ? ? ? ? ? Heterogeneous metric sets (different feature spaces or different domains) Possible to Reuse all the existing defect datasets for CPDP! Heterogeneous Defect Prediction (HDP)
  • 7. Key Idea • Consistent defect-proneness tendency of metrics – Defect prediction metrics measure complexity of software and its development process. • e.g. – The number of developers touching a source code file (Bird@FSE`11) – The number of methods in a class (D’Ambroas@ESEJ`12) – The number of operands (Menzies@TSE`08) More complexity implies more defect-proneness (Rahman@ICSE`13) • Distributions between source and target should be the same to build a strong prediction model. 7 Match source and target metrics that have similar distribution
  • 8. Heterogeneous Defect Prediction (HDP) - Overview - 8 X1 X2 X3 X4 Label 1 1 3 10 Buggy 8 0 1 0 Clean ⋮ ⋮ ⋮ ⋮ ⋮ 9 0 1 1 Clean Metric Matching Source: Project A Target: Project B Cross- prediction Model Build (training) Predict (test) Metric Selection Y1 Y2 Y3 Y4 Y5 Y6 Y7 Label 3 1 1 0 2 1 9 ? 1 1 9 0 2 3 8 ? ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ 0 1 1 1 2 1 1 ? 1 3 10 Buggy 8 1 0 Clean ⋮ ⋮ ⋮ ⋮ 9 1 1 Clean 1 3 10 Buggy 8 1 0 Clean ⋮ ⋮ ⋮ ⋮ 9 1 1 Clean 9 1 1 ? 8 3 9 ? ⋮ ⋮ ⋮ ⋮ 1 1 1 ?
  • 9. Metric Selection • Why? (Guyon@JMLR`03) – Select informative metrics • Remove redundant and irrelevant metrics – Decrease complexity of metric matching combination • Feature Selection Approaches (Gao@SPE`11,Shivaji@TSE`13) – Gain Ratio – Chi-square – Relief-F – Significance attribute evaluation 9
  • 10. Metric Matching 10 Source Metrics Target Metrics X1 X2 Y1 Y2 0.8 0.5 * We can apply different cutoff values of matching score * It can be possible that there is no matching at all.
  • 11. Compute Matching Score KSAnalyzer • Use p-value of Kolmogorov-Smirnov Test (Massey@JASA`51) 11 Matching Score M of i-th source and j-th target metrics: Mij = pij
  • 12. Heterogeneous Defect Prediction - Overview - 12 X1 X2 X3 X4 Label 1 1 3 10 Buggy 8 0 1 0 Clean ⋮ ⋮ ⋮ ⋮ ⋮ 9 0 1 1 Clean Metric Matching Source: Project A Target: Project B Cross- prediction Model Build (training) Predict (test) Metric Selection Y1 Y2 Y3 Y4 Y5 Y6 Y7 Label 3 1 1 0 2 1 9 ? 1 1 9 0 2 3 8 ? ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ 0 1 1 1 2 1 1 ? 1 3 10 Buggy 8 1 0 Clean ⋮ ⋮ ⋮ ⋮ 9 1 1 Clean 1 3 10 Buggy 8 1 0 Clean ⋮ ⋮ ⋮ ⋮ 9 1 1 Clean 9 1 1 ? 8 3 9 ? ⋮ ⋮ ⋮ ⋮ 1 1 1 ?
  • 14. Baselines • WPDP • CPDP-CM (Turhan@EMSE`09,Ma@IST`12,He@IST`14) – Cross-project defect prediction using only common metrics between source and target datasets • CPDP-IFS (He@CoRR`14) – Cross-project defect prediction on Imbalanced Feature Set (i.e. heterogeneous metric set) – 16 distributional characteristics of values of an instance as features (e.g., mean, std, maximum,...) 14
  • 15. Research Questions (RQs) • RQ1 – Is heterogeneous defect prediction comparable to WPDP? • RQ2 – Is heterogeneous defect prediction comparable to CPDP-CM? • RQ3 – Is Heterogeneous defect prediction comparable to CPDP-IFS? 15
  • 16. Benchmark Datasets Group Dataset # of instances # of metrics Granularity All Buggy (%) AEEEM EQ 325 129 (39.7%) 61 Class JDT 997 206 (20.7%) LC 399 64 (9.36%) ML 1862 245 (13.2%) PDE 1492 209 (14.0%) MORP H ant-1.3 125 20 (16.0%) 20 Class arc 234 27 (11.5%) camel-1.0 339 13 (3.8%) poi-1.5 237 141 (75.0%) redaktor 176 27 (15.3%) skarbonka 45 9 (20.0%) tomcat 858 77 (9.0%) velocity-1.4 196 147 (75.0%) xalan-2.4 723 110 (15.2%) xerces-1.2 440 71 (16.1%) 16 Group Dataset # of instances # of metrics Granularity All Buggy (%) ReLink Apache 194 98 (50.5%) 26 FileSafe 56 22 (39.3%) ZXing 399 118 (29.6%) NASA cm1 327 42 (12.8%) 37 Function mw1 253 27 (10.7%) pc1 705 61 (8.7%) pc3 1077 134 (12.4%) pc4 1458 178 (12.2%) SOFTLA B ar1 121 9 (7.4%) 29 Function ar3 63 8 (12.7%) ar4 107 20 (18.7%) ar5 36 8 (22.2%) ar6 101 15 (14.9%) 600 prediction combinations in total!
  • 17. Experimental Settings • Logistic Regression • HDP vs. WPDP, CPDP-CM, and CPDP-IFS 17 Test set (50%) Training set (50%) Project 1 Project 2 Project n ... ... X 1000 Project 1 Project 2 Project n ... ... CPDP-CM CPDP-IFS HDP WPDP Project A
  • 18. Evaluation Measures • False Positive Rate = FP/(TN+FP) • True Positive Rate = Recall • AUC (Area Under receiver operating characteristic Curve) 18 False Positive rate TruePositiverate 0 1 1
  • 19. Evaluation Measures • Win/Tie/Loss (Valentini@ICML`03, Li@JASE`12, Kocaguneli@TSE`13) – Wilcoxon signed-rank test (p<0.05) for 1000 prediction results – Win • # of outperforming HDP prediction combinations with statistical significance. (p<0.05) – Tie • # of HDP prediction combinations with no statistical significance. (p≥0.05) – Loss • # of outperforming baseline prediction results with statistical significance. (p>0.05) 19
  • 21. Prediction Results in median AUC Target WPDP CPDP- CM CPDP- IFS HDPKS (cutoff =0.05) EQ 0.583 0.776 0.461 0.783 JDT 0.795 0.781 0.543 0.767 MC 0.575 0.636 0.584 0.655 ML 0.734 0.651 0.557 0.692* PDE 0.684 0.682 0.566 0.717 ant-1.3 0.670 0.611 0.500 0.701 arc 0.670 0.611 0.523 0.701 camel-1.0 0.550 0.590 0.500 0.639 poi-1.5 0.707 0.676 0.606 0.537 redaktor 0.744 0.500 0.500 0.537 skarbonka 0.569 0.736 0.528 0.694* tomcat 0.778 0.746 0.640 0.818 velocity- 1.4 0.725 0.609 0.500 0.391 xalan-2.4 0.755 0.658 0.499 0.751 xerces-1.2 0.624 0.453 0.500 0.489 21 Target WPDP CPDP- CM CPDP- IFS HDPKS (cutoff =0.05) Apache 0.714 0.689 0.635 0.717* Safe 0.706 0.749 0.616 0.818* ZXing 0.605 0.619 0.530 0.650* cm1 0.653 0.622 0.551 0.717* mw1 0.612 0.584 0.614 0.727 pc1 0.787 0.675 0.564 0.752* pc3 0.794 0.665 0.500 0.738* pc4 0.900 0.773 0.589 0.682* ar1 0.582 0.464 0.500 0.734* ar3 0.574 0.862 0.682 0.823* ar4 0.657 0.588 0.575 0.816* ar5 0.804 0.875 0.585 0.911* ar6 0.654 0.611 0.527 0.640 All 0.657 0.636 0.555 0.724* HDPKS: Heterogeneous defect prediction using KSAnalyzer
  • 22. Win/Tie/Loss Results Target Against WPDP Against CPDP-CM Against CPDP-IFS W T L W T L W T L EQ 4 0 0 2 2 0 4 0 0 JDT 0 0 5 3 0 2 5 0 0 LC 6 0 1 3 3 1 3 1 3 ML 0 0 6 4 2 0 6 0 0 PDE 3 0 2 2 0 3 5 0 0 ant-1.3 6 0 1 6 0 1 5 0 2 arc 3 1 0 3 0 1 4 0 0 camel-1.0 3 0 2 3 0 2 4 0 1 poi-1.5 2 0 2 3 0 1 2 0 2 redaktor 0 0 4 2 0 2 3 0 1 skarbonka 11 0 0 4 0 7 9 0 2 tomcat 2 0 0 1 1 0 2 0 0 velocity- 1.4 0 0 3 0 0 3 0 0 3 xalan-2.4 0 0 1 1 0 0 1 0 0 xerces-1.2 0 0 3 3 0 0 1 0 2 22 Target Against WPDP Against CPDP-CM Against CPDP-IFS W T L W T L W T L Apach e 6 0 5 8 1 2 9 0 2 Safe 14 0 3 12 0 5 15 0 2 ZXing 8 0 0 6 0 2 7 0 1 cm1 7 1 2 8 0 2 9 0 1 mw1 5 0 1 4 0 2 4 0 2 pc1 1 0 5 5 0 1 6 0 0 pc3 0 0 7 7 0 0 7 0 0 pc4 0 0 7 2 0 5 7 0 0 ar1 14 0 1 14 0 1 11 0 4 ar3 15 0 0 5 0 10 10 2 3 ar4 16 0 0 14 1 1 15 0 1 ar5 14 0 4 14 0 4 16 0 2 ar6 7 1 7 8 4 3 12 0 3 Total 147 3 72 147 14 61 182 3 35 % 66.2 % 1.4% 32.4 % 66.2 % 6.3% 27.5 % 82.0 % 1.3% 16.7 %
  • 23. Matched Metrics (Win) 23 MetricValues Distribution (Source metric: RFC-the number of method invoked by a class, Target metric: the number of operand Matching Score = 0.91 AUC = 0.946 (ant1.3  ar5)
  • 24. Matched Metrics (Loss) 24 MetricValues Distribution (Source metric: LOC, Target metric: average number of LOC in a method) Matching Score = 0.13 AUC = 0.391 (Safe  velocity-1.4)
  • 25. Different Feature Selections (median AUCs, Win/Tie/Loss) 25 Approach Against WPDP Against CPDP-CM Against CPDP-IFS HDP AUC Win% AUC Win% AUC Win% AUC Gain Ratio 0.657 63.7% 0.645 63.2% 0.536 80.2% 0.720 Chi-Square 0.657 64.7% 0.651 66.4% 0.556 82.3% 0.727 Significanc e 0.657 66.2% 0.636 66.2% 0.553 82.0% 0.724 Relief-F 0.670 57.0% 0.657 63.1% 0.543 80.5% 0.709 None 0.657 47.3% 0.624 50.3% 0.536 66.3% 0.663
  • 26. Results in Different Cutoffs 26 Cutoff Against WPDP Against CPDP-CM Against CPDP-IFS HDP Target Coverage AUC Win% AUC Win% AUC Win% AUC 0.05 0.657 66.2% 0.636 66.2% 0.553 82.4% 0.724* 100% 0.90 0.657 100% 0.761 71.4% 0.624 100% 0.852* 21%
  • 27. Conclusion • HDP – Potential for CPDP across datasets with different metric sets. • Future work – Filtering out noisy metric matching – Determine the best probability threshold 27

Hinweis der Redaktion

  1. Oggioni Room 17 PM (Session in 16:30 – 18:00)
  2. Here is Project A and some software entities. Let say these entities are source code files. I want to predict whether these files are buggy or clean. To do this, we need a prediction model. Since defect prediction models are trained by machine learning algorithms, we need labeled instances collected from previous releases. This is an labeled instance. An instance consists of features and labels. Various software metrics such as LoC, # of functions in a file, and # of authors touching a source file, are used as features for machine learning. Software metrics measure complexity of software and its development process Each instance can be labeled by past bug information. Software metrics and past bug information can be collected from software archives such as version control systems and bug report systems. With these labeled instances, we can build a prediction model and predict the unlabeled instances. This prediction is conducted within the same project. So, we call this Within-project defect prediction (WPDP). There are many studies about WPDP and showed good prediction performance. ( like prediction accuracy is 0.7.)
  3. What if there are no labeled instances. This can happen in new projects and projects lacking in historical data. New projects do not have past defect information to label instances. Some projects also does not have defect information because of lacking in historical data from software archives. When I participated in an industrial project for Samsung electronics, it was really difficult to generate labeled instances because their software archives are not well managed by developers. So, in some real industrial projects, we may not generate labeled instances to build a prediction model. Without labeled instances, we can not build a prediction model. After experiencing this limitation form the industry, I decided to address this problem.
  4. There are existing solutions to build a prediction model for unlabeled datasets. The first solution is cross-project defect prediction. We can reuse labeled instances from other projects.
  5. Various feature selection approaches can be applied
  6. By doing that, we can investigate how higher matching scores can impact defect prediction performance.
  7. 16 distribution characteristics: mode, median, mean, harmonic mean, minimum, maximum, range, variation ratio, first quartile, third quartile, interquartile range, vari- ance, standard deviation, coefficient of variance, skewness, and kurtosis
  8. AEEEM: object- oriented (OO) metrics, previous-defect metrics, entropy met- rics of change and code, and churn-of-source-code metrics [4]. MORPH: McCabe’s cyclomatic metrics, CK metrics, and other OO metrics [36]. ReLink: code complexity metrics NASA: Halstead metrics and McCabe’s cyclomatic metrics, additional complexity metrics such as parameter count and percentage of comments SOFTLAB: Halstead metrics and McCabe’s cyclomatic metrics
  9. all 222 prediction combinations among 600 predictions