3. Introduction
• Software defect prediction models reveal defect-prone parts of the software to guide managers in allocating testing resources efficiently
• Popular studies
o Estimate the number of defects remaining in software systems
o Discover defect associations
o Classify the defect-proneness of software components into two classes, defect-prone and not defect-prone
• Metrics
o Static code
o History
o Social
4. Introduction
• Numerous defect prediction studies in the last 40 years
• Statistical techniques and machine learning algorithms are adopted
o Nagappan et al., Ostrand et al., Zimmermann et al., Fenton et al., Khoshgoftaar et al.
• Benchmarking studies
o Lessmann et al. and Menzies et al.
• Systematic literature surveys
o Hall et al.
• Industrial case studies
o Tosun et al.
5. Introduction
Ø Major challenges in building defect prediction models:
Ø High dimensionality of software defect data
Ø The number of available software metrics is too large for a classifier to work
Ø Skewed, imbalanced data sets
Ø Proportion of one of the classes is quite larger than the proportion of the
other class.
Ø Performance limitations
Ø Limited information content
Ø Performance ceiling effect
Ø Incomplete datasets
Ø Features of the train set may differ from the features of test set
Ø Some of the test set attributes may be missing
Ø There may be extra attributes in test sets
Ø Building model with several datasets.
Ø Different datasets may have different attribute sets.
6. Introduction
• The missing value pattern may take different forms:
o Data may be missing at individual points
o Some attribute values may be treated as outliers, so data may be missing in chunks
o You may want to build your model with several datasets whose attribute sets differ; when these datasets are concatenated, there will probably be missing chunks
• Possible solutions (see the sketch below):
o Use the largest common attribute set, OR
o Impute the missing attributes
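The sketch below illustrates both options on two hypothetical metric tables; the column names and the mean-value imputation are illustrative assumptions, not the paper's exact pipeline:

    import pandas as pd

    # Two datasets with partially overlapping attribute sets (illustrative column names).
    ds_a = pd.DataFrame({"loc": [120, 45], "churn": [3, 1], "defective": [1, 0]})
    ds_b = pd.DataFrame({"loc": [80, 200], "social": [5, 2], "defective": [0, 1]})

    # Option 1: keep only the largest common attribute set.
    common = ds_a.columns.intersection(ds_b.columns)
    combined_common = pd.concat([ds_a[common], ds_b[common]], ignore_index=True)

    # Option 2: concatenate everything and impute the resulting missing chunks
    # (mean-value imputation, as used for the benchmark model later on).
    combined_full = pd.concat([ds_a, ds_b], ignore_index=True)
    combined_imputed = combined_full.fillna(combined_full.mean())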
8. Related Work
• Recommendation systems
o Netflix Prize competition
• Koren, Bell, and Volinsky
o Collective Matrix Factorization
• Singh et al. and Lippert et al.
9. Matrix Factorization
• Netflix competition
o Matrix factorization models are superior to classical nearest-neighbor techniques because they allow additional information to be incorporated and offer scalable predictive accuracy (Bell et al.)
• Matrix factorization decomposes a large matrix into two smaller matrices called factors.
• The factors are multiplied to (approximately) reconstruct the original matrix, as in the sketch below.
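A minimal numerical illustration of the idea; truncated SVD is used here only for brevity and is not the algorithm applied in the paper:

    import numpy as np

    rng = np.random.default_rng(0)
    R = rng.random((6, 4))            # original m x n matrix (e.g. components x metrics)
    k = 2                             # number of latent factors

    # Factorize R into W (m x k) and H (k x n).
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    W = U[:, :k] * s[:k]
    H = Vt[:k, :]

    R_hat = W @ H                     # multiplying the factors approximates R
    print(np.linalg.norm(R - R_hat))  # small reconstruction error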
10. Matrix Factorization
• Nonnegative MF algorithms (Berry et al.)
o Multiplicative update algorithms
o Gradient descent algorithms
• Easiest to implement and to scale
o Alternating least squares algorithms
• Multi-relational Matrix Factorization by Lippert et al.
o Low-norm matrix factorization based on a gradient descent algorithm (see the sketch below)
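A rough sketch of gradient-descent matrix factorization over only the observed entries; this is a simplified stand-in for the low-norm/collective formulation, not the authors' implementation:

    import numpy as np

    def factorize(R, k=2, steps=500, lr=0.01, reg=0.05, seed=0):
        # Factor R ~ W @ H by gradient descent on the observed (non-NaN) entries only.
        rng = np.random.default_rng(seed)
        m, n = R.shape
        W = rng.random((m, k))
        H = rng.random((k, n))
        mask = ~np.isnan(R)                  # observed entries
        R_obs = np.where(mask, R, 0.0)
        for _ in range(steps):
            E = (R_obs - W @ H) * mask       # error only where data is present
            W += lr * (E @ H.T - reg * W)    # gradient step with an L2 (low-norm) penalty
            H += lr * (W.T @ E - reg * H)
        return W, H

    # NaN cells are skipped in the loss and later predicted by W @ H.
    R = np.array([[5, 3, np.nan], [4, np.nan, 1], [1, 1, 5]], dtype=float)
    W, H = factorize(R)
    print(W @ H)                             # reconstruction, including the missing cells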
11. Experiment
Datasets

Dataset        Static Code Metrics   Churn Metrics   Social Metrics   Instances   Defective %
Android        106                   15              25               12981       6.4
Linux Kernel   106                   15              25               14801       5.5
Perl           106                   15              25               125         61.6
VLC            106                   15              25               936         39.2
Datasets
• Android
o Open source Operating System designed for mobile devices
• Linux Kernel
o Open source operating system
• Perl
o Stable, cross-platform, open source interpreted language
• VLC
o Open source multimedia player
13. Experiment
Experiment 1
• The performance of the Naive Bayes algorithm is explored
• 10 times 10-fold cross validation is run while gradually removing attributes from the datasets
• Attributes are removed according to their correlation with the class attribute (see the sketch after this slide)
o Pearson correlation is used
• 4 (datasets) x 10 (removal steps) x 10x10 (folds) = 4000 Naive Bayes prediction models are built
Experiment 2
• The performances of Naive Bayes with imputation and Matrix Factorization are compared
• Attributes are chosen according to their correlation with the class attribute
o Pearson correlation is used
• Imputation or removal is applied to the chosen attributes in increasing proportions
• 4 (datasets) x 10 (attribute selection steps) x 10 (imputation steps) x 10 (folds) = 4000 Naive Bayes and Matrix Factorization models are built
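A rough sketch of one removal step in Experiment 1, using scikit-learn's GaussianNB and repeated stratified 10-fold cross validation as stand-ins; the paper reports balance rather than accuracy, and dropping the least-correlated attributes first is an assumption:

    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

    def rank_by_pearson(X, y):
        # Attribute indices ordered by absolute Pearson correlation with the class, weakest first.
        corrs = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
        return np.argsort(corrs)

    rng = np.random.default_rng(0)
    X = rng.random((200, 20))                    # placeholder metric matrix
    y = rng.integers(0, 2, 200)                  # placeholder defect labels

    order = rank_by_pearson(X, y)
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)   # 10x10-fold CV

    # One removal step: drop the 10% of attributes least correlated with the class.
    keep = order[len(order) // 10:]
    scores = cross_val_score(GaussianNB(), X[:, keep], y, cv=cv)
    print(scores.mean())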
14. Results (Exp. 1)
[Figure: Balance values of Naive Bayes with respect to feature reduction percentage; panels: Android, Kernel, Perl, VLC]
15. Results (Exp. 2)
[Figure: Balance values of MF with respect to the missing Churn and Social attribute data, and NB with imputation on Churn and Social attributes; panels: Android, Kernel, Perl, VLC]
16. Threats to Validity
• Internal validity
o Naive Bayes, mean-value imputation and Matrix Factorization are widely used in previous studies.
o The performance measures used for evaluation have also been adopted by several researchers in the past.
o Studies discussing static code, history and social metrics are abundant.
o The datasets are extracted from open source project repositories and have also been used in previous studies.
• External validity
o Four different datasets extracted from open source project repositories.
o Nevertheless, our results are limited to the analyzed data and context.
17. Conclusion
• Collective matrix factorization from recommender systems is applied to the missing data problem in defect prediction
• Two experiments were conducted
o The performance of NB with feature reduction
o The performance of NB with mean-value imputation vs. the performance of MF with missing data
• NB performance decreases as the number of features is reduced.
• Matrix Factorization performs better on datasets with missing data than the benchmark model with imputation
• Future Work
o Support the findings with more complex imputation techniques
o Different missing data scenarios may be adopted