Fuzzing for CPS Mutation Testing

Lionel Briand, Professor, Canada Research Chair (Tier 1), ERC Advanced grant recipient at EECS, U. of Ottawa and SnT Centre, U. of Luxembourg
Jaekwon Lee (1,2), Enrico Viganò (1), Oscar Cornejo (1),
Fabrizio Pastore (1), Lionel Briand (1,2)
(1) University of Luxembourg, (2) University of Ottawa
ASE 2023 - September 14th, 2023
2
Mutation Testing
[Figure: copies of the SUT are run against the test suite, with outcomes FAIL, PASS, FAIL, FAIL, PASS. For a SUT copy on which the test suite gives PASS, New test 1, New test 2, and New test 3 give FAIL.]
Improve with automatically generated test cases
3
Our focus:
C/C++ software deployed on CPS
4
State-of-the-art Tool for C: SEMu
§ Based on symbolic execution (KLEE)
§ Excellent for testing command-line utilities
§ Inapplicable to CPS (limitations of KLEE)
  § Unable to test functions with floating-point parameters
  § Unable to test functions communicating over a network
  § Dependency on LLVM
5
Grey-box Fuzzing
§ Can be an ideal solution for mutation testing
§ Generates test cases by exercising the compiled software
§ Not affected by the limitations of symbolic execution
6
Evolutionary Process in Grey-box Fuzzing
[Figure: the grey-box fuzzing loop. Seed files populate a queue. The fuzzer selects a file from the queue, randomly modifies it, then tests the SUT and collects coverage. Inputs that crash the SUT are saved as crashing inputs. New behaviour observed (number of times branches are covered)? Yes: add to queue. No: discard.]
The grey-box fuzzing process has proven useful for generating diverse inputs that expose different faults.
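The loop on this slide can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not AFL++'s implementation: `toy_sut`, `is_new_behaviour`, `next_rand`, and `fuzz` are hypothetical names, the SUT is a toy function with three branches, inputs have a fixed size, and crash handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INPUT_LEN   4    /* fixed input size, for simplicity */
#define N_BRANCHES  3
#define MAX_HITS    8    /* hit counts are capped at this value */
#define QUEUE_CAP   64

/* Hypothetical instrumented SUT: counts how often each branch is taken. */
static void toy_sut(const uint8_t in[INPUT_LEN], uint8_t hits[N_BRANCHES]) {
    memset(hits, 0, N_BRANCHES);
    for (size_t i = 0; i < INPUT_LEN; i++) {
        if (in[i] == 0)        hits[0]++;
        else if (in[i] < 128)  hits[1]++;
        else                   hits[2]++;
    }
}

/* seen[b][h] is set once some input covered branch b exactly h times. */
static uint8_t seen[N_BRANCHES][MAX_HITS + 1];

/* Returns 1 if the coverage has a (branch, hit count) pair not seen before. */
static int is_new_behaviour(const uint8_t hits[N_BRANCHES]) {
    int is_new = 0;
    for (size_t b = 0; b < N_BRANCHES; b++) {
        uint8_t h = hits[b] > MAX_HITS ? MAX_HITS : hits[b];
        if (!seen[b][h]) { seen[b][h] = 1; is_new = 1; }
    }
    return is_new;
}

/* Small deterministic PRNG (xorshift32) so that runs are reproducible. */
static uint32_t rng_state = 2463534242u;
static uint32_t next_rand(void) {
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return rng_state;
}

/* Runs `iterations` fuzzing steps from one seed; returns the queue size. */
size_t fuzz(const uint8_t seed[INPUT_LEN], size_t iterations) {
    static uint8_t queue[QUEUE_CAP][INPUT_LEN];
    uint8_t hits[N_BRANCHES];
    size_t queue_len = 0;

    assert(seed != NULL);
    memset(seen, 0, sizeof seen);
    memcpy(queue[queue_len++], seed, INPUT_LEN);
    toy_sut(seed, hits);
    (void)is_new_behaviour(hits);   /* record the coverage of the seed */

    for (size_t it = 0; it < iterations; it++) {
        uint8_t candidate[INPUT_LEN];
        /* Select from queue. */
        memcpy(candidate, queue[next_rand() % queue_len], INPUT_LEN);
        /* Randomly modify file: overwrite one byte. */
        candidate[next_rand() % INPUT_LEN] = (uint8_t)next_rand();
        /* Test and collect coverage. */
        toy_sut(candidate, hits);
        /* New behaviour: add to queue. Otherwise: discard. */
        if (is_new_behaviour(hits) && queue_len < QUEUE_CAP)
            memcpy(queue[queue_len++], candidate, INPUT_LEN);
    }
    return queue_len;
}
```

Calling `fuzz(seed, 10000)` with an all-zero seed returns the number of inputs retained in the queue. In this toy setting each retained input contributes at least one new (branch, hit count) pair, so the queue stays small no matter how many iterations run.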
7
Grey-box Fuzzing May Facilitate Mutation Testing
§ When an input leads to a program state that differs between the original and the mutated function, differences in code coverage might be observed
§ Additional modifications of such an input may help propagate the infection and kill the mutant
8
int max(int x, int y, int z) {
    int t;
    if (x >= y)
        t = x;        /* branch 1 */
    else
        t = y;        /* branch 2 */
    if (t >= z)
        return t;     /* branch 3 */
    return z;         /* branch 4 */
}

int mut_max(int x, int y, int z) {
    int t;
    if (x <= y)       /* mutated condition: >= replaced with <= */
        t = x;        /* branch 5 */
    else
        t = y;        /* branch 6 */
    if (t >= z)
        return t;     /* branch 7 */
    return z;         /* branch 8 */
}

                    Fuzzed Input 1   Fuzzed Input 2   Fuzzed Input 3
                    x=1,y=1,z=5      x=2,y=1,z=5      x=2,y=1,z=0
max      branch 1   1                1                1
         branch 2   0                0                0
         branch 3   0                0                1
         branch 4   1                1                0
         return     5                5                2
mut_max  branch 5   1                0                0
         branch 6   0                1                1
         branch 7   0                0                1
         branch 8   1                1                0
         return     5                5                1
9
Same functions max and mut_max as on the previous slide, with different fuzzed inputs:

                    Fuzzed Input 1   Fuzzed Input 2   Fuzzed Input 3
                    x=1,y=1,z=5      x=4,y=1,z=5      x=4,y=1,z=0
max      branch 1   1                1                1
         branch 2   0                0                0
         branch 3   0                0                1
         branch 4   1                1                0
         return     5                5                4
mut_max  branch 5   1                0                0
         branch 6   0                1                1
         branch 7   0                0                1
         branch 8   1                1                0
         return     5                5                1

Infected state: from Fuzzed Input 2 on, mut_max takes branch 6 (t = y = 1) while max takes branch 1 (t = x = 4). With Fuzzed Input 2 both functions still return 5. Fuzzed Input 3 propagates the difference to the return value (4 vs 1) and kills the mutant.
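The table can be replayed with a small instrumented version of the two functions. This is only a sketch: `max_cov`, `mut_max_cov`, `Outcome`, `run_input`, and `check_slide_inputs` are hypothetical names, and a shared array stands in for the fuzzer's coverage map. It shows why a coverage-guided fuzzer would keep Fuzzed Input 2 (it covers branch 6 for the first time) even though that input does not yet kill the mutant.

```c
#include <assert.h>
#include <string.h>

enum { N_BRANCHES = 8 };

/* cov[i] is set to 1 when branch i+1 (numbered as on the slide) executes. */
static int max_cov(int x, int y, int z, int cov[N_BRANCHES]) {
    int t;
    if (x >= y) { t = x; cov[0] = 1; } else { t = y; cov[1] = 1; }
    if (t >= z) { cov[2] = 1; return t; }
    cov[3] = 1;
    return z;
}

static int mut_max_cov(int x, int y, int z, int cov[N_BRANCHES]) {
    int t;
    if (x <= y) { t = x; cov[4] = 1; } else { t = y; cov[5] = 1; }
    if (t >= z) { cov[6] = 1; return t; }
    cov[7] = 1;
    return z;
}

typedef struct {
    int new_coverage;  /* 1 if a branch was covered for the first time */
    int killed;        /* 1 if original and mutant return different values */
} Outcome;

/* Branches covered by any input so far, shared by both functions. */
static int global_cov[N_BRANCHES];

Outcome run_input(int x, int y, int z) {
    int cov[N_BRANCHES] = {0};
    Outcome out = {0, 0};
    int ret = max_cov(x, y, z, cov);
    int mut_ret = mut_max_cov(x, y, z, cov);
    for (int i = 0; i < N_BRANCHES; i++) {
        if (cov[i] && !global_cov[i]) { global_cov[i] = 1; out.new_coverage = 1; }
    }
    out.killed = (ret != mut_ret);
    return out;
}

/* Replays the three fuzzed inputs from the slide, in order. */
void check_slide_inputs(void) {
    memset(global_cov, 0, sizeof global_cov);
    Outcome o1 = run_input(1, 1, 5);   /* covers branches 1, 4, 5, 8 */
    assert(o1.new_coverage && !o1.killed);
    Outcome o2 = run_input(4, 1, 5);   /* branch 6 is new: infected state */
    assert(o2.new_coverage && !o2.killed);
    Outcome o3 = run_input(4, 1, 0);   /* return values differ: mutant killed */
    assert(o3.new_coverage && o3.killed);
}
```

Sharing one coverage array between the original and the mutated function is the design choice that matters here: a difference in the mutant's control flow counts as new behaviour, so the fuzzer keeps mutating the input that produced it.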
10
Fuzzing is not for System-level CPS Testing
[Figure labels: DDCM payload data, Sun sensor data, S-band antenna data]
11
We aim to generate test cases at the unit level, but grey-box fuzzers do not support this
12
MutatiOn Testing wIth Fuzzing (MOTIF)
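1. Generate fuzzing driver
[Figure: the live mutant and the SUT source are used to generate the fuzzing driver. Fuzzed input data is shown as bit strings: 0100100001110110000, 1100101101110110000, 0101101101110110000]
Fuzzing driver
int main(...){
    /* inputs for the original function, loaded from the fuzzed data */
    double x = load(..);
    double y = load(..);
    int z = load(..);
    /* inputs for the mutated function */
    double m_x = load(..);
    double m_y = load(..);
    int m_z = load(..);
    ret = max(x,y,z);
    mut_ret = mut_max(m_x,m_y,m_z);
    /* abort() makes the fuzzer report the input as crashing */
    if (!match(ret, mut_ret)) { abort(); }
    /* the arguments are compared after the calls as well */
    if (!match(x, m_x)) { abort(); }
    if (!match(y, m_y)) { abort(); }
    if (!match(z, m_z)) { abort(); }
}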
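The slide elides the arguments of load and match. One way such helpers could look is sketched below. The `Input` type and both signatures are assumptions made for illustration and are not taken from MOTIF. Here `load` copies the next bytes of the fuzzed data into a variable, and `match` compares two values byte by byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of the fuzzed file once it has been read into memory. */
typedef struct {
    const unsigned char *data;
    size_t len;
    size_t pos;   /* index of the next unread byte */
} Input;

/* Copies the next n bytes into dst. If the file is too short, the missing
 * bytes are zero-filled, so the driver always works on defined values. */
void load(Input *in, void *dst, size_t n) {
    assert(in != NULL && dst != NULL);
    size_t avail = in->len - in->pos;
    size_t take = n < avail ? n : avail;
    memset(dst, 0, n);
    if (take > 0) memcpy(dst, in->data + in->pos, take);
    in->pos += take;
}

/* Byte-wise comparison of two values of the same size. Returns 1 when
 * the values are identical, 0 otherwise. */
int match(const void *a, const void *b, size_t n) {
    return memcmp(a, b, n) == 0;
}
```

A byte-wise comparison is one possible design choice. It treats two NaN values with the same bit pattern as matching, which `==` on doubles would not, but it also distinguishes 0.0 from -0.0, and it is unreliable for structs with padding bytes. A driver that needs different semantics would use type-specific comparisons instead.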
13
MutatiOn Testing wIth Fuzzing (MOTIF)
1. Generate fuzzing driver (from the live mutant and the SUT source)
2. Generate seed inputs (seed files)
3. Compile (fuzzing driver into executable fuzzing driver)
4. Mutation testing
   § Execute fuzzer (AFL++), which runs the executable fuzzing driver on fuzzed files
   § Post-processing of the crashing files yields the files killing the mutant
5. Inspection
   § Generate test case
14
MutatiOn Testing wIth Fuzzing (MOTIF)
1. Generate fuzzing driver (from the live mutant and the SUT source)
2. Generate seed inputs (seed files)
3. Compile (fuzzing driver into executable fuzzing driver)
4. Mutation testing
   § Execute fuzzer (AFL++), which runs the executable fuzzing driver on fuzzed files
   § Post-processing of the crashing files yields the files killing the mutant
5. Inspection
   § Generate test case
   § Engineer compares outputs with specifications
   § Outcomes: bug found, new regression test
Callouts: assign fuzzer inputs to input variables; inspect results
15
Empirical Evaluation
RQ1. How does MOTIF compare to mutation testing based on symbolic execution?
RQ2. How does MOTIF perform with software that cannot be tested with symbolic execution?
RQ3. How does MOTIF’s seeding strategy contribute to its results?
16
Case Study Subjects
Software deployed on space CPS from an ESA project:
§ MLFS: the Mathematical Library for Flight Software
§ LIBU: a utility library from one of our industry partners
§ ASN1Lib: a serialization/deserialization library generated with the ASN1SCC compiler
17
RQ1. MOTIF vs Symb. Execution
§ We created SEMuP: a modified MOTIF pipeline that relies on KLEE/SEMu instead of AFL++ to generate test cases
§ We considered subjects where symbolic execution is applicable (e.g., no floating-point variables):
  § ASN1Lib
  § 27 source files of LIBU
§ 1,499 mutants not killed by existing test suites
§ Executed both approaches for 10,000 seconds for each mutant
§ Repeated 10 times
18
RQ1 Results
[Figure: two plots, ASN1Lib and LIBU, of killed mutants (0% to 100%) against execution time (0 to 10,000 seconds) for MOTIF and SEMuP, with datapoints belonging to each of the 10 runs. Annotations on the plots: 10.5 (ASN1Lib), 46.8 (LIBU).]
MOTIF kills 86.08% (ASN1Lib) and 73.79% (LIBU) of the mutants on average. It outperforms symbolic execution.
19
RQ1 Results
[Figure: two plots, ASN1Lib and LIBU, of killed mutants (0% to 100%) against execution time (0 to 10,000 seconds) for MOTIF and SEMuP, with datapoints belonging to each of the 10 runs. Annotations on the plots: 10.5 (ASN1Lib), 46.8 (LIBU).]
Complementarity
ASN1Lib
§ MOTIF kills 252 mutants not killed by SEMuP.
§ SEMuP kills 103 mutants not killed by MOTIF.
LIBU
§ MOTIF kills 74 mutants not killed by SEMuP.
§ SEMuP kills 1 mutant not killed by MOTIF.
20
MutatiOn Testing wIth Fuzzing (MOTIF)
https://github.com/SNTSVV/MOTIF
[Figure: the MOTIF pipeline. 1. Generate fuzzing driver, 2. Generate seed inputs, 3. Compile, 4. Mutation testing with AFL++, 5. Inspection]
Results
[Figure: RQ1 plots of killed mutants against execution time for MOTIF and SEMuP; RQ2 plot of killed mutants against execution time for MLFS and LIBU]
https://faqas.uni.lu
22
Backup Slides
23
RQ2
Executed MOTIF on subjects where symbolic execution is not applicable
[Figure: killed mutants (0% to 100%) against execution time (0 to 10,000 seconds) for MLFS and LIBU]
§ Percentage of killed mutants is lower than for the other cases, but MOTIF is still effective
  § Especially considering that MLFS is a math library with an MC/DC test suite
  § Live mutants are hard to kill
§ Some of the mutants can be killed only with inputs belonging to a narrow portion of a large input domain
  § Numbers in a small range
  § Input strings that match a string stored in a global variable
24
RQ3: Seed Inputs vs Fuzzing
§ We focus on the proportion of mutants killed with seed inputs in the experiments for RQ1 and RQ2
§ Mutants killed by seeds:
  § RQ1 experiments:
    § LIBU: one mutant (less than 1% of all the mutants killed)
    § ASN1Lib: 280 mutants (24.15%)
  § RQ2 experiments:
    § MLFS: 76 mutants (5.43%)
    § LIBU: 26 mutants (21.66%)