by Raja Balusamy, Group Manager & Shivakumar Balur, Senior Chief Engineer, Samsung R&D at STeP-IN SUMMIT 2018 - 15th International Conference on Software Testing on August 30, 2018 at Taj, MG Road, Bengaluru
Predictive Analytics based Regression Test Optimization
1. Predictive Analytics based Regression Test Optimization
Raja Balusamy, Group Manager
Shivakumar Balur, Senior Chief Engineer
Samsung R&D Institute India - Bangalore
2. Contents
• Why Predictions Are Important
• Predictions in Different Industries
• Confidence Level and Data Sampling
• Preprocessing
• Machine Learning
• Our Model
• Sample Study
4. Predictive Modeling Is..
Predictive modeling uses statistics to predict outcomes. The goal is to go beyond knowing what has happened to providing a best assessment of what will happen in the future.
[Flow: Past (Reports / Records) → Current (Predict Results) → Future (Review Predictions → Action)]
10. Machine Learning Algorithms to Predict
Logistic Regression
• Fast training time
• Most widely used
Artificial Neural Network
• More accurate
• Slow training time
Decision Tree
• Fast training time
• Requires a high memory footprint
Naïve Bayes Classifier
• Quicker than other alternatives
• Highly feature dependent
Support Vector Machine
• Highly accurate
• Requires 100+ independent variables
K-Nearest Neighbors
• High computation cost
• Applicable to diverse training-set data
Logistic Regression was selected for our model, as it is the most widely used and has a fast training time.
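A minimal, self-contained sketch (not the authors' code) of the selected approach: logistic regression trained by gradient descent to score how likely each test case is to catch a defect in the current release. The feature names and training data below are invented for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.5, epochs=1000):
    """Fit logistic-regression weights (plus bias) by gradient descent."""
    w = [0.0] * (len(X[0]) + 1)          # w[0] is the bias term
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi                 # gradient of the log loss
            w[0] -= lr * err
            for j, xj in enumerate(xi):
                w[j + 1] -= lr * err * xj
    return w

def predict(w, x):
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Hypothetical features per test case:
# [defect_weight, changelist_hits, module_interaction]
# Label 1 = the test case failed (found a defect) in past cycles.
X = [[0.9, 5, 3], [0.1, 0, 0], [0.7, 3, 2], [0.2, 1, 0], [0.8, 4, 3], [0.0, 0, 1]]
y = [1, 0, 1, 0, 1, 0]
w = train(X, y)
risky = predict(w, [0.85, 4, 2])   # resembles past failing test cases
safe = predict(w, [0.05, 0, 0])    # resembles past passing test cases
print(risky > safe)
```

Test cases whose predicted failure probability exceeds a chosen threshold would be kept in the optimized suite; the rest are candidates for omission in that cycle.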
11. Our Prediction Model || Optimized Test Suite

Input (x1..x7), each with a weight (w1..w7):
• x1: Defects History
• x2: Change List
• x3: Module Mapping
• x4: Code Changes
• x5: Release Info
• x6: # New Features
• x7: Test Cases

Data Set: Raw Data → Remove Errors → Input for Training Model

Training Model: ∑ (weight × input), with weights based on priority or AI
Algorithm: Logistic Regression

Output: Optimized TC Suite
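The training-model step above, ∑ (weight × input), can be sketched as a plain weighted sum over the seven inputs; the input values and the priority-based weights below are hypothetical, not the authors' figures.

```python
# Each normalized input x1..x7 scaled by a weight w1..w7 (priority-based
# here; the slide also allows AI-derived weights). Values are illustrative.
inputs = {                    # normalized per-test-case signals
    "defects_history": 0.8,   # x1
    "change_list":     0.6,   # x2
    "module_mapping":  1.0,   # x3
    "code_changes":    0.4,   # x4
    "release_info":    0.5,   # x5
    "new_features":    0.2,   # x6
    "test_cases":      0.7,   # x7
}
weights = {                   # w1..w7, hypothetical priorities summing to 1
    "defects_history": 0.25,
    "change_list":     0.20,
    "module_mapping":  0.15,
    "code_changes":    0.15,
    "release_info":    0.10,
    "new_features":    0.05,
    "test_cases":      0.10,
}
score = sum(weights[k] * inputs[k] for k in inputs)
print(round(score, 3))
```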
12. Input || X1: Defects Analysis
Considered:
• Defects from code changes / documents
• Defect severity (High / Medium / Low)
• Occurrence frequency
• Defect classification (Display, Fatal, Function, Performance, Text, Usability)
• Resolution option (Code Changes, UI/UX)
Summation ∑ = Weighted Average
Note: Only defects accepted by the Development Team are considered; invalid defects, including 3rd-party defects, are not.
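The "Summation ∑ = Weighted Average" step over the considered defect attributes can be sketched as below; the severity scores, the frequency cap, and the attribute weights are assumptions, not the authors' values.

```python
# Weighted average over defect attributes (all constants are hypothetical).
SEVERITY = {"High": 1.0, "Medium": 0.6, "Low": 0.3}

def defect_weight(severity, occurrence_freq, resolved_by_code_change):
    """Combine the considered defect attributes into one 0..1 weight."""
    parts = [
        (0.5, SEVERITY[severity]),                    # defect severity
        (0.3, min(occurrence_freq / 10.0, 1.0)),      # frequency, capped at 10
        (0.2, 1.0 if resolved_by_code_change else 0.4),  # resolution option
    ]
    return sum(w * v for w, v in parts) / sum(w for w, _ in parts)

print(defect_weight("High", 5, True))
```

A frequent, high-severity defect fixed in code ends up with a larger weight than a rare, low-severity one, which is the ordering the X1 input needs.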
13. Input || X2: Change List Information
Considered:
• [Title] Feature change / defect fix / etc.
• [Checking Method] Steps to check
• [Type & Feature Name] Feature change
• [Cause & Measure] Exception not handled
• [Developer] Name
• [Modules Affected] Module A and Module B
Keywords are used to identify module-fix information, narrow down the TC group-set selection, and evaluate weightage.
Note: Change list data is taken from the Configuration Management Tool, as submitted by the Development Team.
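The keyword step can be sketched as a simple search over the change-list fields to decide which modules (and hence which TC groups) a change touches; the module keywords and the sample entry below are invented.

```python
import re

# Hypothetical keyword map: words that, if found in a change-list entry,
# implicate a module's test-case group.
MODULE_KEYWORDS = {
    "Module A": ["display", "render"],
    "Module B": ["network", "timeout"],
}

def affected_modules(change_entry):
    """Return modules whose keywords appear in any field of the entry."""
    text = " ".join(change_entry.values()).lower()
    return sorted(
        module
        for module, words in MODULE_KEYWORDS.items()
        if any(re.search(r"\b" + w + r"\b", text) for w in words)
    )

entry = {
    "title": "Defect fix: network timeout on resume",
    "cause_and_measure": "Exception not handled",
}
print(affected_modules(entry))
```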
14. Input || X3: Module Mapping
Possible Scenarios:
• No interaction
• Average interaction
• Less interaction
• More interaction

Modules /
Sub-modules   C1   C2   C3   C4
A1            X    X    X    X
A2            X    Y    Y    X
A3            Y    Y    X    Y
A4            X    X    X    Y

No interaction      → Consider only A1
Average interaction → Consider A2 + C2 + C3
Less interaction    → Consider A4 + C4
More interaction    → Consider A3 + C1 + C2 + C4

C2 & C4 are duplicates, as they are already considered along with A2 & A4.
Note: Module mapping is done by experts, from class diagrams, or via code coverage, to find the interactions between modules.
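The scenario table can be encoded as a direct lookup from interaction level to the module set whose test cases are kept; the module sets come from the slide, while the dictionary structure itself is ours.

```python
# Interaction level -> modules/sub-modules to include (sets from the slide).
SCENARIO_MODULES = {
    "no":      ["A1"],
    "average": ["A2", "C2", "C3"],
    "less":    ["A4", "C4"],
    "more":    ["A3", "C1", "C2", "C4"],
}

def modules_to_test(interaction):
    """Return the module set for the given interaction scenario."""
    return SCENARIO_MODULES[interaction]

print(modules_to_test("average"))
```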
15. Input || X4: Code Changes
Lines of Code:
• Added
• Deleted
• Modified
Note: An SLOC tool is used for calculating KLOC changes.

Input || X5: Release Info
Release Information:
• Release version
• Type of release (Sanity / Full / Patch)
Note: Release information and type are taken from internal tools.
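The X4 counts can be sketched by parsing a unified diff (the authors use an SLOC tool for this; the parser below is a stand-in, and the sample diff is invented).

```python
def diff_loc(diff_text):
    """Count added/deleted lines in a unified diff, ignoring file headers."""
    added = deleted = 0
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            added += 1
        elif line.startswith("-") and not line.startswith("---"):
            deleted += 1
    return {"added": added, "deleted": deleted}

diff = """--- a/app.c
+++ b/app.c
@@ -1,3 +1,3 @@
-int x = 1;
+int x = 2;
+int y = 3;
"""
print(diff_loc(diff))
```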
16. Input || X6: New Features
No. of new features implemented:
• Impact of new features
Note: PM and the Development Team share the new features.

Input || X7: Test Cases
• Module name
• Sub-module name
• Priority
• Title
• Steps to execute
• Failed test cases
Note: Test case data is prepared by the Test Team.
17. Sample Study

Input:
• 3536 Defects
• 2932 Change Lists
• 43 Modules
• 1245 KLOC Changes
• 32 Releases
• 5 New Features
• 6300 Test cases

Predicted Test Suite || Regression Test Cycle (4200 Test cases)
Release Cycle   Manual   Prediction Model
A1              6300     2654
A2              6300     2250
A3              6300     1973
A4              6300     1275
A5              6300     960

Defects Identified
Release Cycle   Prediction Model   Non-Prediction   % Accuracy
A1              415                353              54%
A2              365                215              63%
A3              300                130              70%
A4              199                30               87%
A5              144                20               89%

Non-Predicted Defects Category
• Defects other than those caused by source-code changes are not predicted by the above model, e.g. document-related defects (requirements documentation, UI) and 3rd-party defects.
Ex: text cut, document mismatch, dependency on a 3rd party.
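The % Accuracy column of the defects table is consistent with predicted / (predicted + non-predicted). A short check using the slide's own figures (A5 computes to 88% here against the slide's 89%, presumably a rounding difference):

```python
# Recompute % Accuracy = predicted / (predicted + non-predicted)
# from the defect counts in the slide's "Defects Identified" table.
cycles = {
    "A1": (415, 353),
    "A2": (365, 215),
    "A3": (300, 130),
    "A4": (199, 30),
    "A5": (144, 20),
}
accuracy = {
    c: round(100 * pred / (pred + non_pred))
    for c, (pred, non_pred) in cycles.items()
}
print(accuracy)
```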
19. Limitations..
• History cannot always predict the future with 100% accuracy.
• The issue of unknown unknowns.
• Self-defeat of an algorithm.

Challenges..
• Defect-resolution comments from developers
• Module mapping
• New-features TC optimization (LOC, etc.)