SlideShare ist ein Scribd-Unternehmen logo
1 von 19
Customization Support for CBR-Based Defect Prediction Elham Paikari Department of Electrical and Computer EngineeringUniversity of Calgary2500 University Drive, NWCalgary, AB, Canadaepaikari@ucalgary.ca   Bo Sun Department of Computer ScienceUniversity of Calgary2500 University Drive, NWCalgary, AB, Canadasbo@ucalgary.ca Guenther Ruhe Department of Computer Science & Department of Electrical and Computer EngineeringUniversity of Calgary2500 University Drive, NWCalgary, AB, Canadaruhe@ucalgary.ca Emadoddin Livani Department of Electrical and Computer EngineeringUniversity of Calgary2500 University Drive, NWCalgary, AB, Canadaelivani@ucalgary.ca
Agenda Parameters of a CBR Model Parameters Instantiation Weighting Method SANN Frequency Analysis Dependency Network and the Customization Support Rules Transferability Conclusions and Future Work 2
CBR and the Parameters ,[object Object]
Instantiation of the general CBR-based prediction method Solution Algorithm Similarity Function Prediction Performance of CBR model Number of Nearest Neighbor Case Weighting Technique used for Attributes 3
Instantiation Parameters of the CBR  4
Sensitivity Analysis Based On NeuralNetwork (SANN) 5 Dataset CC …………… LOC Xmin(A1) NN OUTPUTmin(A1) ∆1= |OUTPUTmin(A1) - OUTPUTmax(A1)| Xmax (A1) NN OUTPUTmax(A1)
6 What is the evaluation result in comparison with existing methods  (un-weighted) What is the evaluation result in comparison with existing methods  (MLR) How different numbers of the nearest neighbors  can affect the results?
Instantiation Parameters of the CBR  7
Data Repository PROMISE Repository  120 different CBR instantiations were created  and applied to 11 data sets from PROMISE repository Characterization of data sets 8
Is One Instantiation Always the Best? 9 MW1 PC1 MC2 KC3 AR5 CM1
Experimental Design for Frequency Analysis 10 Min(MMRE) Dataset 120 different instantiation Max(Pred(0.25)) Min(MMRE) Dataset 120 different instantiation Max(Pred(0.25)) 11different Datasets Min(MMRE) Dataset 120 different instantiation Max(Pred(0.25))
Frequency Analysis Frequency of the best performance in single attribute analysis  Neural network based sensitivity analysis (as the weighting technique) Un-weighted average (as the solution algorithm) Maximum number of nearest neighbors (as the number of nearest neighbors) 11
12
13 Customization Support Using DNA Dataset Eight attributes defined as condition attributes Four data set-related attributes: 		(NumOfModule),(DefectRatio),(Language),(LOC) Four CBR-related attributes: 	(SimFunc),(WeightingTech),(NumOfNN),(SolutionAlgorithm)  The decision  attributes: Pred(0.25) and MMRE (a1,a2,a3,a4) (p1,p2,p3,p4) Rule Induction Customization Support CBR model instantiated by (p1,p2,p3,p4)   Data set DNA New data (a1,a2,a3,a4) Recommendation  f (a1,a2,a3,a4)   Rule Set
14 Application of DNA Results  Generation of the Decision Trees Given: NumOfModule = 	High DefectRatio = 	High LOC = 		Medium Language = 	JAVA Question: How to customize a CBR defect prediction model towards achieving high prediction accuracy measured in MMRE? Recommendation: Customize CBR model by means of: WeightingTech = 	SANN NumOfNN  ≥ 10 SolutionAlgorithm = Rank-weighted Average Justification: Based on the data set characteristics, assumptions of rules 3, 4, 5, 11 and 12 are fulfilled.  By comparing the probability distributions of MMRE rule No. 11 is the best in terms of having the highest probability (69.2%) to achieve “Low” MMRE.
Transferability of Rules across Sets of Data 15 ,[object Object]
 Prediction  on MMRE = 62.92%  ,[object Object],Rule Induction Customization Support Min(MMRE) Dataset 120 different instantiation CBR model instantiated by (p1,p2,p3,p4)   Data set Max(Pred(0.25)) 9 different Datasets DNA New data (a1,a2,a3,a4) Min(MMRE) Dataset 120 different instantiation Recommendation  f (a1,a2,a3,a4)   Max(Pred(0.25)) Rule Set
Validation and Limitations  Tools used for attribute selection, and modeling tasks Neural network, regression analysis, CBR, and dependency network analysis  Only four parameters of the CBR instantiation The composition of the training and testing data sets Another aspect of the analysis undertaken is the definition of classification intervals for dependency networks,  Two discretization algorithms Sensitivity analysis 16
Conclusions and Future Work Starting with 11 data sets from the PROMISE repository Calculating the prediction performance of 120 instantiations of the CBR-based defect prediction model based on the value of the MMRE and Pred(0.25)  The frequency analysis on the top performances Generating the DNA to provide a customization support for a new data set The compatibility of rule sets extracted from different contexts Enhancement of the validity with inclusion of further data sets Comparing the performance against other measures  Other methods for rule induction 17

Weitere ähnliche Inhalte

Was ist angesagt?

A Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmA Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmIRJET Journal
 
Neural networks for the prediction and forecasting of water resources variables
Neural networks for the prediction and forecasting of water resources variablesNeural networks for the prediction and forecasting of water resources variables
Neural networks for the prediction and forecasting of water resources variablesJonathan D'Cruz
 
Artificial neural network for load forecasting in smart grid
Artificial neural network for load forecasting in smart gridArtificial neural network for load forecasting in smart grid
Artificial neural network for load forecasting in smart gridEhsan Zeraatparvar
 
Learning from data for wind–wave forecasting
Learning from data for wind–wave forecastingLearning from data for wind–wave forecasting
Learning from data for wind–wave forecastingJonathan D'Cruz
 
One–day wave forecasts based on artificial neural networks
One–day wave forecasts based on artificial neural networksOne–day wave forecasts based on artificial neural networks
One–day wave forecasts based on artificial neural networksJonathan D'Cruz
 
Novel framework of retaining maximum data quality and energy efficiency in re...
Novel framework of retaining maximum data quality and energy efficiency in re...Novel framework of retaining maximum data quality and energy efficiency in re...
Novel framework of retaining maximum data quality and energy efficiency in re...IJECEIAES
 
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...Punit Sharnagat
 
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control PoliciesModel-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control PoliciesLionel Briand
 
Improving Spam Mail Filtering Using Classification Algorithms With Partition ...
Improving Spam Mail Filtering Using Classification Algorithms With Partition ...Improving Spam Mail Filtering Using Classification Algorithms With Partition ...
Improving Spam Mail Filtering Using Classification Algorithms With Partition ...IRJET Journal
 

Was ist angesagt? (13)

A Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmA Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means Algorithm
 
Database Searching
Database SearchingDatabase Searching
Database Searching
 
Neural networks for the prediction and forecasting of water resources variables
Neural networks for the prediction and forecasting of water resources variablesNeural networks for the prediction and forecasting of water resources variables
Neural networks for the prediction and forecasting of water resources variables
 
Artificial neural network for load forecasting in smart grid
Artificial neural network for load forecasting in smart gridArtificial neural network for load forecasting in smart grid
Artificial neural network for load forecasting in smart grid
 
Learning from data for wind–wave forecasting
Learning from data for wind–wave forecastingLearning from data for wind–wave forecasting
Learning from data for wind–wave forecasting
 
One–day wave forecasts based on artificial neural networks
One–day wave forecasts based on artificial neural networksOne–day wave forecasts based on artificial neural networks
One–day wave forecasts based on artificial neural networks
 
Novel framework of retaining maximum data quality and energy efficiency in re...
Novel framework of retaining maximum data quality and energy efficiency in re...Novel framework of retaining maximum data quality and energy efficiency in re...
Novel framework of retaining maximum data quality and energy efficiency in re...
 
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
 
2015-03-31_MotifGP
2015-03-31_MotifGP2015-03-31_MotifGP
2015-03-31_MotifGP
 
KCC2017 28APR2017
KCC2017 28APR2017KCC2017 28APR2017
KCC2017 28APR2017
 
Ju3517011704
Ju3517011704Ju3517011704
Ju3517011704
 
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control PoliciesModel-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
 
Improving Spam Mail Filtering Using Classification Algorithms With Partition ...
Improving Spam Mail Filtering Using Classification Algorithms With Partition ...Improving Spam Mail Filtering Using Classification Algorithms With Partition ...
Improving Spam Mail Filtering Using Classification Algorithms With Partition ...
 

Ähnlich wie Promise 2011: "Customization Support for CBR-Based Defect Prediction"

A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...IOSR Journals
 
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...Arinze Akutekwe
 
Assisting Code Search with Automatic Query Reformulation for Bug Localization
Assisting Code Search with Automatic Query Reformulation for Bug LocalizationAssisting Code Search with Automatic Query Reformulation for Bug Localization
Assisting Code Search with Automatic Query Reformulation for Bug LocalizationBunyamin Sisman
 
A Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug PredictionA Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug PredictionMartin Pinzger
 
A tale of experiments on bug prediction
A tale of experiments on bug predictionA tale of experiments on bug prediction
A tale of experiments on bug predictionMartin Pinzger
 
CSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning ProjectCSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning Projectbutest
 
COSMOS1_Scitech_2014_Ali
COSMOS1_Scitech_2014_AliCOSMOS1_Scitech_2014_Ali
COSMOS1_Scitech_2014_AliMDO_Lab
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learningSung Kim
 
AIAA-SciTech-ModelSelection-2014-Mehmani
AIAA-SciTech-ModelSelection-2014-MehmaniAIAA-SciTech-ModelSelection-2014-Mehmani
AIAA-SciTech-ModelSelection-2014-MehmaniOptiModel
 
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET Journal
 
Cukic Promise08 V3
Cukic Promise08 V3Cukic Promise08 V3
Cukic Promise08 V3gregoryg
 
powerpoint feb
powerpoint febpowerpoint feb
powerpoint febimu409
 
POSTERIOR RESOLUTION AND STRUCTURAL MODIFICATION FOR PARAMETER DETERMINATION ...
POSTERIOR RESOLUTION AND STRUCTURAL MODIFICATION FOR PARAMETER DETERMINATION ...POSTERIOR RESOLUTION AND STRUCTURAL MODIFICATION FOR PARAMETER DETERMINATION ...
POSTERIOR RESOLUTION AND STRUCTURAL MODIFICATION FOR PARAMETER DETERMINATION ...IJCI JOURNAL
 
An Automated Tool for MC/DC Test Data Generation
An Automated Tool for MC/DC Test Data GenerationAn Automated Tool for MC/DC Test Data Generation
An Automated Tool for MC/DC Test Data GenerationAriful Haque
 
A survey of fault prediction using machine learning algorithms
A survey of fault prediction using machine learning algorithmsA survey of fault prediction using machine learning algorithms
A survey of fault prediction using machine learning algorithmsAhmed Magdy Ezzeldin, MSc.
 
Multidisciplinary analysis and optimization under uncertainty
Multidisciplinary analysis and optimization under uncertaintyMultidisciplinary analysis and optimization under uncertainty
Multidisciplinary analysis and optimization under uncertaintyChen Liang
 
APPLICATION OF STATISTICAL LEARNING TECHNIQUES AS PREDICTIVE TOOLS FOR MACHIN...
APPLICATION OF STATISTICAL LEARNING TECHNIQUES AS PREDICTIVE TOOLS FOR MACHIN...APPLICATION OF STATISTICAL LEARNING TECHNIQUES AS PREDICTIVE TOOLS FOR MACHIN...
APPLICATION OF STATISTICAL LEARNING TECHNIQUES AS PREDICTIVE TOOLS FOR MACHIN...Shibaprasad Bhattacharya
 
A defect prediction model based on the relationships between developers and c...
A defect prediction model based on the relationships between developers and c...A defect prediction model based on the relationships between developers and c...
A defect prediction model based on the relationships between developers and c...Vrije Universiteit Brussel
 
COSMOS-ASME-IDETC-2014
COSMOS-ASME-IDETC-2014COSMOS-ASME-IDETC-2014
COSMOS-ASME-IDETC-2014OptiModel
 

Ähnlich wie Promise 2011: "Customization Support for CBR-Based Defect Prediction" (20)

A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...
 
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
 
Assisting Code Search with Automatic Query Reformulation for Bug Localization
Assisting Code Search with Automatic Query Reformulation for Bug LocalizationAssisting Code Search with Automatic Query Reformulation for Bug Localization
Assisting Code Search with Automatic Query Reformulation for Bug Localization
 
A Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug PredictionA Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug Prediction
 
A tale of experiments on bug prediction
A tale of experiments on bug predictionA tale of experiments on bug prediction
A tale of experiments on bug prediction
 
CSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning ProjectCSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning Project
 
COSMOS1_Scitech_2014_Ali
COSMOS1_Scitech_2014_AliCOSMOS1_Scitech_2014_Ali
COSMOS1_Scitech_2014_Ali
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learning
 
AIAA-SciTech-ModelSelection-2014-Mehmani
AIAA-SciTech-ModelSelection-2014-MehmaniAIAA-SciTech-ModelSelection-2014-Mehmani
AIAA-SciTech-ModelSelection-2014-Mehmani
 
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
 
Cukic Promise08 V3
Cukic Promise08 V3Cukic Promise08 V3
Cukic Promise08 V3
 
powerpoint feb
powerpoint febpowerpoint feb
powerpoint feb
 
POSTERIOR RESOLUTION AND STRUCTURAL MODIFICATION FOR PARAMETER DETERMINATION ...
POSTERIOR RESOLUTION AND STRUCTURAL MODIFICATION FOR PARAMETER DETERMINATION ...POSTERIOR RESOLUTION AND STRUCTURAL MODIFICATION FOR PARAMETER DETERMINATION ...
POSTERIOR RESOLUTION AND STRUCTURAL MODIFICATION FOR PARAMETER DETERMINATION ...
 
An Automated Tool for MC/DC Test Data Generation
An Automated Tool for MC/DC Test Data GenerationAn Automated Tool for MC/DC Test Data Generation
An Automated Tool for MC/DC Test Data Generation
 
A survey of fault prediction using machine learning algorithms
A survey of fault prediction using machine learning algorithmsA survey of fault prediction using machine learning algorithms
A survey of fault prediction using machine learning algorithms
 
Multidisciplinary analysis and optimization under uncertainty
Multidisciplinary analysis and optimization under uncertaintyMultidisciplinary analysis and optimization under uncertainty
Multidisciplinary analysis and optimization under uncertainty
 
APPLICATION OF STATISTICAL LEARNING TECHNIQUES AS PREDICTIVE TOOLS FOR MACHIN...
APPLICATION OF STATISTICAL LEARNING TECHNIQUES AS PREDICTIVE TOOLS FOR MACHIN...APPLICATION OF STATISTICAL LEARNING TECHNIQUES AS PREDICTIVE TOOLS FOR MACHIN...
APPLICATION OF STATISTICAL LEARNING TECHNIQUES AS PREDICTIVE TOOLS FOR MACHIN...
 
2. visualization in data mining
2. visualization in data mining2. visualization in data mining
2. visualization in data mining
 
A defect prediction model based on the relationships between developers and c...
A defect prediction model based on the relationships between developers and c...A defect prediction model based on the relationships between developers and c...
A defect prediction model based on the relationships between developers and c...
 
COSMOS-ASME-IDETC-2014
COSMOS-ASME-IDETC-2014COSMOS-ASME-IDETC-2014
COSMOS-ASME-IDETC-2014
 

Mehr von CS, NcState

Talks2015 novdec
Talks2015 novdecTalks2015 novdec
Talks2015 novdecCS, NcState
 
GALE: Geometric active learning for Search-Based Software Engineering
GALE: Geometric active learning for Search-Based Software EngineeringGALE: Geometric active learning for Search-Based Software Engineering
GALE: Geometric active learning for Search-Based Software EngineeringCS, NcState
 
Big Data: the weakest link
Big Data: the weakest linkBig Data: the weakest link
Big Data: the weakest linkCS, NcState
 
Three Laws of Trusted Data Sharing: (Building a Better Business Case for Dat...
Three Laws of Trusted Data Sharing:(Building a Better Business Case for Dat...Three Laws of Trusted Data Sharing:(Building a Better Business Case for Dat...
Three Laws of Trusted Data Sharing: (Building a Better Business Case for Dat...CS, NcState
 
Lexisnexis june9
Lexisnexis june9Lexisnexis june9
Lexisnexis june9CS, NcState
 
Welcome to ICSE NIER’15 (new ideas and emerging results).
Welcome to ICSE NIER’15 (new ideas and emerging results).Welcome to ICSE NIER’15 (new ideas and emerging results).
Welcome to ICSE NIER’15 (new ideas and emerging results).CS, NcState
 
Icse15 Tech-briefing Data Science
Icse15 Tech-briefing Data ScienceIcse15 Tech-briefing Data Science
Icse15 Tech-briefing Data ScienceCS, NcState
 
Kits to Find the Bits that Fits
Kits to Find  the Bits that Fits Kits to Find  the Bits that Fits
Kits to Find the Bits that Fits CS, NcState
 
Ai4se lab template
Ai4se lab templateAi4se lab template
Ai4se lab templateCS, NcState
 
Automated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSUAutomated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSUCS, NcState
 
Requirements Engineering
Requirements EngineeringRequirements Engineering
Requirements EngineeringCS, NcState
 
172529main ken and_tim_software_assurance_research_at_west_virginia
172529main ken and_tim_software_assurance_research_at_west_virginia172529main ken and_tim_software_assurance_research_at_west_virginia
172529main ken and_tim_software_assurance_research_at_west_virginiaCS, NcState
 
Automated Software Engineering
Automated Software EngineeringAutomated Software Engineering
Automated Software EngineeringCS, NcState
 
Next Generation “Treatment Learning” (finding the diamonds in the dust)
Next Generation “Treatment Learning” (finding the diamonds in the dust)Next Generation “Treatment Learning” (finding the diamonds in the dust)
Next Generation “Treatment Learning” (finding the diamonds in the dust)CS, NcState
 
Tim Menzies, directions in Data Science
Tim Menzies, directions in Data ScienceTim Menzies, directions in Data Science
Tim Menzies, directions in Data ScienceCS, NcState
 
Dagstuhl14 intro-v1
Dagstuhl14 intro-v1Dagstuhl14 intro-v1
Dagstuhl14 intro-v1CS, NcState
 
The Art and Science of Analyzing Software Data
The Art and Science of Analyzing Software DataThe Art and Science of Analyzing Software Data
The Art and Science of Analyzing Software DataCS, NcState
 

Mehr von CS, NcState (20)

Talks2015 novdec
Talks2015 novdecTalks2015 novdec
Talks2015 novdec
 
Future se oct15
Future se oct15Future se oct15
Future se oct15
 
GALE: Geometric active learning for Search-Based Software Engineering
GALE: Geometric active learning for Search-Based Software EngineeringGALE: Geometric active learning for Search-Based Software Engineering
GALE: Geometric active learning for Search-Based Software Engineering
 
Big Data: the weakest link
Big Data: the weakest linkBig Data: the weakest link
Big Data: the weakest link
 
Three Laws of Trusted Data Sharing: (Building a Better Business Case for Dat...
Three Laws of Trusted Data Sharing:(Building a Better Business Case for Dat...Three Laws of Trusted Data Sharing:(Building a Better Business Case for Dat...
Three Laws of Trusted Data Sharing: (Building a Better Business Case for Dat...
 
Lexisnexis june9
Lexisnexis june9Lexisnexis june9
Lexisnexis june9
 
Welcome to ICSE NIER’15 (new ideas and emerging results).
Welcome to ICSE NIER’15 (new ideas and emerging results).Welcome to ICSE NIER’15 (new ideas and emerging results).
Welcome to ICSE NIER’15 (new ideas and emerging results).
 
Icse15 Tech-briefing Data Science
Icse15 Tech-briefing Data ScienceIcse15 Tech-briefing Data Science
Icse15 Tech-briefing Data Science
 
Kits to Find the Bits that Fits
Kits to Find  the Bits that Fits Kits to Find  the Bits that Fits
Kits to Find the Bits that Fits
 
Ai4se lab template
Ai4se lab templateAi4se lab template
Ai4se lab template
 
Automated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSUAutomated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSU
 
Requirements Engineering
Requirements EngineeringRequirements Engineering
Requirements Engineering
 
172529main ken and_tim_software_assurance_research_at_west_virginia
172529main ken and_tim_software_assurance_research_at_west_virginia172529main ken and_tim_software_assurance_research_at_west_virginia
172529main ken and_tim_software_assurance_research_at_west_virginia
 
Automated Software Engineering
Automated Software EngineeringAutomated Software Engineering
Automated Software Engineering
 
Next Generation “Treatment Learning” (finding the diamonds in the dust)
Next Generation “Treatment Learning” (finding the diamonds in the dust)Next Generation “Treatment Learning” (finding the diamonds in the dust)
Next Generation “Treatment Learning” (finding the diamonds in the dust)
 
Tim Menzies, directions in Data Science
Tim Menzies, directions in Data ScienceTim Menzies, directions in Data Science
Tim Menzies, directions in Data Science
 
Goldrush
GoldrushGoldrush
Goldrush
 
Dagstuhl14 intro-v1
Dagstuhl14 intro-v1Dagstuhl14 intro-v1
Dagstuhl14 intro-v1
 
Know thy tools
Know thy toolsKnow thy tools
Know thy tools
 
The Art and Science of Analyzing Software Data
The Art and Science of Analyzing Software DataThe Art and Science of Analyzing Software Data
The Art and Science of Analyzing Software Data
 

Kürzlich hochgeladen

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 

Kürzlich hochgeladen (20)

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

Promise 2011: "Customization Support for CBR-Based Defect Prediction"

  • 1. Customization Support for CBR-Based Defect Prediction Elham Paikari Department of Electrical and Computer EngineeringUniversity of Calgary2500 University Drive, NWCalgary, AB, Canadaepaikari@ucalgary.ca   Bo Sun Department of Computer ScienceUniversity of Calgary2500 University Drive, NWCalgary, AB, Canadasbo@ucalgary.ca Guenther Ruhe Department of Computer Science & Department of Electrical and Computer EngineeringUniversity of Calgary2500 University Drive, NWCalgary, AB, Canadaruhe@ucalgary.ca Emadoddin Livani Department of Electrical and Computer EngineeringUniversity of Calgary2500 University Drive, NWCalgary, AB, Canadaelivani@ucalgary.ca
  • 2. Agenda Parameters of a CBR Model Parameters Instantiation Weighting Method SANN Frequency Analysis Dependency Network and the Customization Support Rules Transferability Conclusions and Future Work 2
  • 3.
  • 4. Instantiation of the general CBR-based prediction method Solution Algorithm Similarity Function Prediction Performance of CBR model Number of Nearest Neighbor Case Weighting Technique used for Attributes 3
  • 6. Sensitivity Analysis Based On NeuralNetwork (SANN) 5 Dataset CC …………… LOC Xmin(A1) NN OUTPUTmin(A1) ∆1= |OUTPUTmin(A1) - OUTPUTmax(A1)| Xmax (A1) NN OUTPUTmax(A1)
  • 7. 6 What is the evaluation result in comparison with existing methods (un-weighted) What is the evaluation result in comparison with existing methods (MLR) How different numbers of the nearest neighbors can affect the results?
  • 9. Data Repository PROMISE Repository 120 different CBR instantiations were created and applied to 11 data sets from PROMISE repository Characterization of data sets 8
  • 10. Is One Instantiation Always the Best? 9 MW1 PC1 MC2 KC3 AR5 CM1
  • 11. Experimental Design for Frequency Analysis 10 Min(MMRE) Dataset 120 different instantiation Max(Pred(0.25)) Min(MMRE) Dataset 120 different instantiation Max(Pred(0.25)) 11different Datasets Min(MMRE) Dataset 120 different instantiation Max(Pred(0.25))
  • 12. Frequency Analysis Frequency of the best performance in single attribute analysis Neural network based sensitivity analysis (as the weighting technique) Un-weighted average (as the solution algorithm) Maximum number of nearest neighbors (as the number of nearest neighbors) 11
  • 13. 12
  • 14. 13 Customization Support Using DNA Dataset Eight attributes defined as condition attributes Four data set-related attributes: (NumOfModule),(DefectRatio),(Language),(LOC) Four CBR-related attributes: (SimFunc),(WeightingTech),(NumOfNN),(SolutionAlgorithm) The decision attributes: Pred(0.25) and MMRE (a1,a2,a3,a4) (p1,p2,p3,p4) Rule Induction Customization Support CBR model instantiated by (p1,p2,p3,p4)   Data set DNA New data (a1,a2,a3,a4) Recommendation f (a1,a2,a3,a4)   Rule Set
  • 15. 14 Application of DNA Results Generation of the Decision Trees Given: NumOfModule = High DefectRatio = High LOC = Medium Language = JAVA Question: How to customize a CBR defect prediction model towards achieving high prediction accuracy measured in MMRE? Recommendation: Customize CBR model by means of: WeightingTech = SANN NumOfNN ≥ 10 SolutionAlgorithm = Rank-weighted Average Justification: Based on the data set characteristics, assumptions of rules 3, 4, 5, 11 and 12 are fulfilled. By comparing the probability distributions of MMRE rule No. 11 is the best in terms of having the highest probability (69.2%) to achieve “Low” MMRE.
  • 16.
  • 17.
  • 18. Validation and Limitations Tools used for attribute selection, and modeling tasks Neural network, regression analysis, CBR, and dependency network analysis Only four parameters of the CBR instantiation The composition of the training and testing data sets Another aspect of the analysis undertaken is the definition of classification intervals for dependency networks, Two discretization algorithms Sensitivity analysis 16
  • 19. Conclusions and Future Work Starting with 11 data sets from the PROMISE repository Calculating the prediction performance of 120 instantiations of the CBR-based defect prediction model based on the value of the MMRE and Pred(0.25) The frequency analysis on the top performances Generating the DNA to provide a customization support for a new data set The compatibility of rule sets extracted from different contexts Enhancement of the validity with inclusion of further data sets Comparing the performance against other measures Other methods for rule induction 17
  • 20. References Brady, A. and Menzies, T. 2010. Case-based reasoning vs parametric models for software quality optimization. In Proceedings of the 6thInternational Conference on Predictive Models in Software Engineering, pp. 3:1-3:10. Catal, C. and Diri, B. 2009. A systematic review of software fault prediction studies. Expert Systems with Applications, vol. 36 (4), pp. 7346-7354. El Emam, K., Benlarbi, S., Goel, N., and Rai, S. N. 2001. Comparing case-based reasoning classifiers for predicting high risk software components. The Journal of Systems and Software, vol. 55, pp. 301-320. Foss, T., Stensrud, E., Kitchenham, B., and Myrtveit , I. 2003. A simulation study of the model evaluation criterion MMRE. IEEE Transactions on Software Engineering, vol. 29 (11), pp. 985- 995. Ganesan, K., Khoshgoftaar, T. M., and Allen, E. B. 2000. Case-based software quality prediction. International Journal of Software Engineering and Knowledge Engineering, vol. 10(2), pp. 139–152. Paikari, E., Richter, M. M., and Ruhe, G. 2010. A comparative study of attribute weighting techniques for software defect prediction using case-based reasoning. In Proceeding of the 22nd International Conference on Software Engineering and Knowledge Engineering, pp. 380-386. 18