Software Vulnerability
Detection and Repair
Prof. Abhik Roychoudhury
National University of Singapore
KLEE Workshop London 2018
Vulnerability Discovery, Binary Hardening, Verification, Data Protection
Agency Collaboration, Industry Collaboration
Education – NUS (Bachelors in Infosec)
Research Outputs – Publications, Tools, Academic Collaboration, Exchanges, Seminars, Workshops
Enhancing local capabilities
Space of Problems
• Fuzz testing
  - Feed semi-random inputs to find hangs and crashes
• Continuous fuzzing
  - Incrementally find new “problems” in software
• Crash reproduction
  - Reconstruct a reported crash when the crashing input is withheld for privacy reasons
• Reaching nooks and corners
• Localizing reported observable errors
• Patching reported errors from input-output examples
Space of Techniques
Search
• Random
• Biased-random
• Genetic (AFL Fuzzer)
• …
• Low set-up overhead
• Fast, less accurate
• Use objective function to steer
Symbolic Execution
• Dynamic symbolic execution
• Concolic execution
• Cluster paths based on symbolic expressions of variables
• …
• High set-up overhead
• Slow, more accurate
• Use logical formula to steer
In this talk …
Search
• Enhance the effectiveness of search techniques, with symbolic execution as inspiration
• [CCS16, CCS17, ICSE15]
Symbolic Execution
• Explore capabilities of symbolic execution beyond search, in program repair
• [ICSE13, 15, 16, 18]
History of fuzzing
Developed by Barton Miller, see
http://pages.cs.wisc.edu/~bart/fuzz/
Fuzz testing is a simple technique for feeding random input to applications. The approach has three characteristics.
• The input is random. We do not use any model of program behavior, application type, or system description. This is sometimes called black box testing.
• The reliability criterion is simple: if the application crashes or hangs, it is considered to fail the test, otherwise it passes. Note that the application does not have to respond in a sensible manner to the input, and it can even quietly exit.
• As a result of the first two characteristics, fuzz testing can be automated to a high degree and results can be compared across applications, operating systems, and vendors.
Grey-box Fuzzing, as in AFL
[Figure: AFL workflow with test suite, input queue (enqueue/dequeue), mutators, and mutated files]
Grey-box Fuzzing Algorithm
Input: Seed inputs S
T✗ = ∅
T = S
if T = ∅ then
    add empty file to T
end if
repeat
    t = chooseNext(T)
    p = assignEnergy(t)
    for i from 1 to p do
        t' = mutate_input(t)
        if t' crashes then
            add t' to T✗
        else if isInteresting(t') then
            add t' to T
        end if
    end for
until timeout reached or abort-signal
Output: Crashing inputs T✗
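The loop above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not AFL's implementation: `run_target` is a hypothetical toy target that checks for a four-byte magic value, coverage is reduced to the depth reached in that check, `chooseNext` is plain round robin, `assignEnergy` is a constant, and the sketch stops at the first crash instead of collecting all crashing inputs. Every identifier is invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LEN 4         /* fixed input length of the toy target */
#define MAX_QUEUE 8   /* capacity of the seed queue T */

/* Hypothetical target: counts how many leading bytes match "bad!".
   Reaching depth LEN stands in for a crash. */
static int run_target(const uint8_t *s) {
    static const uint8_t magic[LEN] = { 'b', 'a', 'd', '!' };
    int depth = 0;
    while (depth < LEN && s[depth] == magic[depth])
        depth++;
    return depth;
}

/* xorshift32 PRNG with a fixed seed, so runs are repeatable. */
static uint32_t rng_state = 2463534242u;
static uint32_t next_rand(void) {
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return rng_state;
}

/* Returns 1 and stores the crashing input in crash_out, or 0 if the
   budget of target executions runs out first. */
int fuzz(uint8_t crash_out[LEN], long budget) {
    uint8_t queue[MAX_QUEUE][LEN];
    int seen[LEN + 1] = { 0 };      /* depths ("paths") covered so far */
    int n = 1, next = 0;

    assert(crash_out != NULL);
    memset(queue[0], 0, LEN);       /* T starts with one all-zero seed */
    seen[run_target(queue[0])] = 1;

    while (budget > 0) {
        const uint8_t *t = queue[next];     /* chooseNext: round robin */
        int energy = 64;                    /* assignEnergy: constant */
        next = (next + 1) % n;
        for (int i = 0; i < energy && budget > 0; i++, budget--) {
            uint8_t m[LEN];
            uint32_t pos = next_rand() % LEN;
            uint8_t val = (uint8_t)(next_rand() & 0xFF);
            int depth;
            memcpy(m, t, LEN);              /* mutate_input: set one byte */
            m[pos] = val;
            depth = run_target(m);
            if (depth == LEN) {             /* t' crashes: report it */
                memcpy(crash_out, m, LEN);
                return 1;
            }
            if (!seen[depth] && n < MAX_QUEUE) {  /* isInteresting: new path */
                seen[depth] = 1;
                memcpy(queue[n++], m, LEN);
            }
        }
    }
    return 0;
}
```

Because each newly reached depth is kept as a seed, the search only has to guess one byte at a time, which is the point of coverage feedback in this toy setting.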
Programming by experienced people
Schematic
    if (condition1)
        return   // short path, frequented by many many inputs
    else if (condition2)
        exit     // short paths, frequented by many inputs
    else ….
Prioritize low probability paths
• Use a grey-box fuzzer which keeps track of the path id for a test.
• Find the probabilities that fuzzing a test t which exercises path π leads to an input which exercises path π’.
• Give higher weightage to the low-probability paths discovered, to gravitate towards them -> discover new states in the Markov chain with minimal effort.
[Figure: transition from path π to path π']
    #include <stdlib.h>

    /* Calls abort() only when the input starts with the four bytes "bad!". */
    void crashme(char *s) {
        if (s[0] == 'b')
            if (s[1] == 'a')
                if (s[2] == 'd')
                    if (s[3] == '!')
                        abort();
    }
Power-Schedules
Constant: $p(i) = \alpha(i)$
• AFL uses this schedule (fuzzing ~1 minute)
• $\alpha(i)$: how AFL judges fuzzing time for the test exercising path $i$
Cut-off Exponential:
$$p(i) = \begin{cases} 0 & \text{if } f(i) > \mu \\ \min\left(\dfrac{\alpha(i)}{\beta} \cdot 2^{s(i)},\ M\right) & \text{otherwise} \end{cases}$$
• $\beta$: a constant
• $s(i)$: number of times the input exercising path $i$ has been chosen for fuzzing
• $f(i)$: number of fuzz exercising path $i$ (path frequency)
• $\mu$: mean number of fuzz exercising a discovered path (average path frequency)
• $M$: maximum energy expendable on a state
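As a small sketch of the cut-off exponential schedule, the function below computes the energy for one seed. It assumes the quantity AFL itself would assign is passed in as `alpha`; the function name, parameter names, and the use of `double` are choices made for this illustration only, not the fuzzer's actual code.

```c
#include <assert.h>

/* alpha:      energy the constant schedule would assign to this seed
   beta:       constant divisor, must be positive
   s:          number of times the seed has been chosen for fuzzing
   f:          path frequency of the seed's path
   mu:         mean path frequency over all discovered paths
   max_energy: upper bound M on the energy */
double coe_energy(double alpha, double beta, unsigned s,
                  double f, double mu, double max_energy) {
    double e;
    assert(beta > 0.0);
    if (f > mu)
        return 0.0;        /* path is exercised more than average: skip it */
    e = alpha / beta;
    for (unsigned k = 0; k < s && e < max_energy; k++)
        e *= 2.0;          /* multiply by 2^s, stopping once the cap is hit */
    return e < max_energy ? e : max_energy;
}
```

For example, with `alpha = 100`, `beta = 10` and a cap of 1000, a seed on a rare path gets energy 10 the first time it is picked, 80 after three picks, and is capped at 1000 after enough picks.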
Results
Independent evaluation found crashes 19x faster on DARPA Cyber Grand Challenge (CGC) binaries.
Integrated into the main line of the AFL fuzzer within a year of publication (CCS16); AFL is used on a daily basis by corporations for finding vulnerabilities.
Comments on the technologies
Use of Grey-box Fuzzing
• Greybox fuzzing is used frequently, daily in corporations
• State-of-the-art in automated vulnerability detection
• Extremely efficient coverage-based input generation
• All program analysis before/at instrumentation time.
• Start with a seed corpus, choose a seed file, fuzz it.
• Add to corpus only if new input increases coverage.
• Cannot be directed, unlike symbolic execution!
In this talk …
Search
• Enhance the effectiveness of search techniques, with symbolic execution as inspiration
• Enhance coverage, how to make it directed?
Symbolic Execution
• Explore capabilities of symbolic execution beyond directed search
Directed Fuzzing instead of Coverage
Crash reproduction supports
- In-house debugging and fixing
- Vulnerability checking
Using symbolic execution
[Figure: the Hercules toolset takes the program binary, benign input files, and crash data (crash instruction, loaded modules, call stack, register values) and produces crash input files]
Hercules Toolset
1. Directed Search Algorithm
2. Guided Selective Symbolic Execution
Symbolic Analyzer
Reproduced vulnerabilities in Acrobat Reader and Media Player within a 24-hour time bound
(Earlier) View-point
• Directed fuzzing: a classical constraint satisfaction problem.
• Program analysis to identify program paths that reach given program locations.
• Symbolic execution to derive path conditions for any of the identified paths.
• Constraint solving to find an input that satisfies the path condition and thus reaches a program location that was given.
[Figure: control-flow graph. The branch on x > y selects a = x or a = y; the branch on x+y>10 leads to b = a; then return b.]
φ1 = (x>y)∧(x+y>10)
φ2 = ¬(x>y)∧(x+y>10)
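To make the path conditions concrete, here is a small C sketch of the program in the figure together with a brute-force search that stands in for a constraint solver. The search over a bounded integer range is a deliberate simplification; a real tool would hand φ1 or φ2 to an SMT solver. The value of `b` when `x+y>10` is false is not shown on the slide, so initializing it to 0 is an assumption, as are all function names.

```c
#include <assert.h>

/* Program from the figure. The initial value of b is an assumption. */
int program(int x, int y) {
    int a, b = 0;
    if (x > y) a = x; else a = y;
    if (x + y > 10) b = a;
    return b;
}

/* Path conditions of the two paths that reach "b = a". */
static int phi1(int x, int y) { return (x > y) && (x + y > 10); }
static int phi2(int x, int y) { return !(x > y) && (x + y > 10); }

/* Brute-force stand-in for constraint solving: looks for (x, y) in
   [-bound, bound]^2 satisfying phi1 (which == 1) or phi2 (which == 2).
   Returns 1 and stores the model, or 0 if none exists in the range. */
int solve(int which, int bound, int *x_out, int *y_out) {
    assert(which == 1 || which == 2);
    assert(x_out != 0 && y_out != 0);
    for (int x = -bound; x <= bound; x++)
        for (int y = -bound; y <= bound; y++)
            if (which == 1 ? phi1(x, y) : phi2(x, y)) {
                *x_out = x;
                *y_out = y;
                return 1;
            }
    return 0;
}
```

Any model returned for φ1 drives `program` down the `a = x` path and any model for φ2 down the `a = y` path, which is exactly the sense in which the solver output is a directed test input.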
(Later) View-point
• Directed fuzzing as an optimization problem!
1. Instrumentation time:
• Instrument the program to aggregate distance values.
2. Runtime, for each input:
• Decide how long it is fuzzed based on its distance.
• If the input is closer to the targets, it is fuzzed for longer.
• If the input is further away from the targets, it is fuzzed for a shorter time.
Power Schedules - Recap
Recap of the grey-box fuzzing algorithm shown earlier: the power schedule is the assignEnergy(t) step, which sets the number p of mutated inputs generated from the chosen seed t.
Instrumentation
• Function-level target distance (FLTD) using the call graph (CG)
• BB-level target distance using the control-flow graph (CFG)
1. Identify target BBs and assign distance 0.
2. Identify BBs that call functions and assign 10*FLTD.
3. For each BB, compute the harmonic mean of (length of shortest path to any function-calling BB + 10*FLTD).
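Step 3 can be sketched as a small C function. It assumes the shortest-path lengths and function-level distances have already been computed and are passed in as arrays; the function name, the array layout, and the use of -1 for "unreachable" are all choices made for this illustration. It uses the textbook harmonic mean (count divided by the sum of reciprocals); the slide does not say whether the tool normalizes in exactly this way, so treat the normalization as an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define CALL_WEIGHT 10.0   /* the factor 10 applied to FLTD */

/* hops[k]: shortest-path length in the CFG from this BB to the k-th
            function-calling BB, or -1 if that BB is unreachable.
   fltd[k]: function-level target distance of the function called there.
   Returns the harmonic mean of hops[k] + 10*fltd[k] over the reachable
   call sites, or -1.0 if none is reachable (the "N/A" case). */
double bb_distance(const int *hops, const double *fltd, size_t n) {
    double sum_inv = 0.0;
    size_t reachable = 0;
    assert(hops != NULL && fltd != NULL);
    for (size_t k = 0; k < n; k++) {
        double term;
        if (hops[k] < 0)
            continue;                 /* unreachable call site */
        term = hops[k] + CALL_WEIGHT * fltd[k];
        assert(term > 0.0);           /* target BBs (distance 0) are handled in step 1 */
        sum_inv += 1.0 / term;
        reachable++;
    }
    return reachable ? (double)reachable / sum_inv : -1.0;
}
```

A harmonic mean, unlike an arithmetic mean, is dominated by the smallest term, so a BB that is close to at least one target keeps a small distance even if other targets are far away.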
[Figure: CFG for function b, with BB-level distances 8.7, 11, 10, 30, 13, 12 and N/A annotated on its nodes]
Directed fuzzing as optimization
• Integrating simulated annealing as the power schedule
• In the beginning (t = 0 min), assign the same energy to all seeds.
• Later (t = 10 min), assign a bit more energy to seeds that are closer.
• At exploitation (t = 80 min), assign maximal energy to seeds that are closest.
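The behaviour described above can be sketched with a temperature that falls from 1 to 0 as time passes. The sketch below uses linear cooling purely for simplicity; it is not the cooling schedule of any particular tool, and the function name, the normalized distance in [0, 1], and the 0.5 starting energy are assumptions made for the example.

```c
#include <assert.h>

/* t:         elapsed fuzzing time
   t_exploit: time at which exploitation begins (same unit as t)
   norm_dist: seed distance normalized to [0, 1], where 0 is closest
   Returns an energy factor in [0, 1]. */
double annealed_energy(double t, double t_exploit, double norm_dist) {
    double temperature;
    assert(t >= 0.0 && t_exploit > 0.0);
    assert(norm_dist >= 0.0 && norm_dist <= 1.0);
    /* Linear cooling from 1 to 0: an assumption of this sketch. */
    temperature = t < t_exploit ? 1.0 - t / t_exploit : 0.0;
    /* Hot: every seed gets 0.5. Cold: energy is 1 - distance. */
    return (1.0 - norm_dist) * (1.0 - temperature) + 0.5 * temperature;
}
```

At `t = 0` every seed receives the same factor 0.5; at `t = 10` with `t_exploit = 80` a closer seed already receives slightly more than a distant one; from `t = 80` on, the closest seed receives 1 and the farthest 0.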
In this talk …
Search
• Enhance the effectiveness of search techniques, with symbolic execution as inspiration
• Enhance coverage
• Achieve directed search
Symbolic Execution
• Explore capabilities of symbolic execution beyond directed search
[Figure: AFLGo vs. KLEE comparison, with labels 84, 139 and 59]
Program Synthesis and Repair
[Figure: a synthesizer takes input-output examples and a quality criterion and produces a program; a repair system takes a vulnerable program and tests and produces a repaired program]
Search-based approach
• 2009: GenProg [Weimer-et-al-ICSE]
[Figure: loop with the steps MUTATE, EVALUATE FITNESS, ACCEPT, DISCARD]
Over-fitting in Tests -> Program
Avoid
    if (input1) return output1
    else if (input2) return output2
    else if (input3) return output3
    ….
[Figure: a repair system takes a vulnerable program and tests, works on artifacts (symbolic formulae), and produces a repaired program]
Generalize beyond the provided tests using symbolic reasoning.
View-point on Repair
1. Where to fix – in which line?
2. Generate the candidate patches in this line.
3. Validate the candidate patches.
Syntactic approach
Semantic approach
Example
    int is_upward(int inhibit, int up_sep, int down_sep) {
        int bias;
        if (inhibit)
            bias = down_sep;   // fix: bias = up_sep + 100
        else
            bias = up_sep;
        if (bias > down_sep)
            return 1;
        else
            return 0;
    }

inhibit  up_sep  down_sep  Observed output  Expected output  Result
1        0       100       0                0                pass
1        11      110       0                1                fail
0        100     50        1                1                pass
1        -20     60        0                1                fail
0        0       10        0                0                pass
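The validation step for this example can be sketched by running the function against the five tests from the table. The `use_patch` flag, the `test_case` struct, and `count_failures` are names introduced only for this illustration; the flag switches between the original assignment `bias = down_sep` and the fix noted in the comment, `bias = up_sep + 100`.

```c
#include <assert.h>
#include <stddef.h>

struct test_case { int inhibit, up_sep, down_sep, expected; };

/* The five tests from the table, with their expected outputs. */
static const struct test_case TESTS[] = {
    { 1,   0, 100, 0 },
    { 1,  11, 110, 1 },
    { 0, 100,  50, 1 },
    { 1, -20,  60, 1 },
    { 0,   0,  10, 0 },
};

/* is_upward with a switch between the buggy and the patched assignment. */
int is_upward(int inhibit, int up_sep, int down_sep, int use_patch) {
    int bias;
    if (inhibit)
        bias = use_patch ? up_sep + 100 : down_sep;
    else
        bias = up_sep;
    return bias > down_sep;
}

/* Returns the number of tests whose output differs from the expected one. */
size_t count_failures(int use_patch) {
    size_t failures = 0;
    assert(use_patch == 0 || use_patch == 1);
    for (size_t i = 0; i < sizeof TESTS / sizeof TESTS[0]; i++) {
        const struct test_case *t = &TESTS[i];
        if (is_upward(t->inhibit, t->up_sep, t->down_sep, use_patch) != t->expected)
            failures++;
    }
    return failures;
}
```

Calling `count_failures(0)` reports the two failing rows of the table, and `count_failures(1)` reports none, so the candidate patch passes validation on this test suite.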
Patch synthesis
• Accumulated constraints
  - f(1, 11, 110) > 110
  - f(1, 0, 100) ≤ 100
  - …
• Find an f satisfying these constraints
  - By fixing the set of operators appearing in f
• Candidate methods
  - Search over the space of expressions
  - Program synthesis with a fixed set of operators
• Generated fix
  - f(inhibit, up_sep, down_sep) = up_sep + 100
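A search over a small space of expressions can be sketched as follows. The sketch fixes the template `f = up_sep + c` and enumerates the constant `c`; the template, the search range, and all names are assumptions made for the example. The first two constraints are the ones listed above; the third is not on the slide but is derived here from the test with inputs (1, -20, 60) and expected output 1.

```c
#include <assert.h>

/* One constraint on f for inhibit = 1: f must exceed down_sep when the
   expected output is 1, and must not exceed it when the output is 0. */
struct constraint { int up_sep, down_sep, must_exceed; };

static const struct constraint CONSTRAINTS[] = {
    {  11, 110, 1 },   /* f(1, 11, 110) > 110 */
    {   0, 100, 0 },   /* f(1, 0, 100) <= 100 */
    { -20,  60, 1 },   /* f(1, -20, 60) > 60, derived from the test table */
};

/* Enumerates c in [lo, hi] for the template f = up_sep + c.
   Returns 1 and stores the smallest satisfying c, or 0 if none exists. */
int synthesize_offset(int lo, int hi, int *c_out) {
    assert(c_out != 0 && lo <= hi);
    for (int c = lo; c <= hi; c++) {
        int ok = 1;
        for (unsigned i = 0; i < sizeof CONSTRAINTS / sizeof CONSTRAINTS[0]; i++) {
            int f = CONSTRAINTS[i].up_sep + c;
            int exceeds = f > CONSTRAINTS[i].down_sep;
            if (exceeds != CONSTRAINTS[i].must_exceed) {
                ok = 0;
                break;
            }
        }
        if (ok) {
            *c_out = c;
            return 1;
        }
    }
    return 0;
}
```

Within this template the constraints leave exactly one constant, c = 100, which matches the generated fix `up_sep + 100`.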
Semantic Repair
[Figure: the program is run on the test input by concrete execution, which yields concrete values; symbolic execution, together with the expected output of the program, then yields the output: a value-set or constraint]
High-level view
Buggy program:
    …
    var = a + b - c;    (the right-hand side is treated as an unknown x)
Test input -> concrete execution -> symbolic execution with x as the only unknown -> path conditions, output expressions
x = f(Live Vars)
• Get properties of function f via symbolic execution.
• Construct a function f which satisfies these properties!
Program Repair given tests
• Generate-and-test patches (GenProg, CMU/Michigan)
• Specification inference and patch synthesis
  - Infer specification or properties about the patch to be synthesized.
  - Meet the specification by enumeration, or by solving constraints.
  - Various works: SemFix, Angelix (NUS), Nopol (KTH), SPR (MIT), …
• Ordering of search space of patches
  - Use minimality to prioritize the search space. (NUS)
  - Use learning approaches to prioritize the search space. (MIT)
  - Patch templates can be learnt from human fixes. (HKUST)
http://angelix.io [ICSE13,16]
[Figure: Angelix architecture, with components Clang, debugger, KLEE runtime, and synthesis with Z3; the artifacts shown are buggy source, instrumented source, suspicious locations, angelic forest, and patch]
State of the technology
Defect                         Fixed expressions
Libtiff-4a24508-cc79c2b        2
Libtiff-829d8c4-036d7bb        2
CoreUtils-00743a1f-ec48bead    3
CoreUtils-1dd8a331-d461bfd2    2
CoreUtils-c5ccf29b-a04ddb8d    3

Scalability
Subject      LoC      Repair time (min)
wireshark    2814K    23
php          1046K    62
gzip         491K     4
gmp          145K     14
libtiff      77K      14

Quality: Fewer functionality-deleting repairs than any other tool.
Heartbleed
    if (hbtype == TLS1_HB_REQUEST) {
        ...
        memcpy(bp, pl, payload);
        ...
    }
(a) The buggy part of the Heartbleed-vulnerable OpenSSL

    if (hbtype == TLS1_HB_REQUEST
        && payload + 18 < s->s3->rrec.length) {
        ...
    }
(b) A fix generated automatically

    if (1 + 2 + payload + 16 > s->s3->rrec.length)
        return 0;
    ...
    if (hbtype == TLS1_HB_REQUEST) {
        ...
    }
    else if (hbtype == TLS1_HB_RESPONSE) {
        ...
    }
    return 0;
(c) The developer-provided repair
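The effect of the length check in (c) can be illustrated with a self-contained C sketch that does not use any OpenSSL types. It assumes the heartbeat record layout of one type byte, a two-byte big-endian payload length, the payload, and at least 16 bytes of padding; the function name, its parameters, and the -1 return value are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies the heartbeat payload out of a received record.
   rec, rec_len: the record bytes actually received
   out, out_cap: destination buffer and its capacity
   Returns the number of payload bytes copied, or -1 if the record is
   discarded because its claimed payload length does not fit. */
long copy_heartbeat_payload(const uint8_t *rec, size_t rec_len,
                            uint8_t *out, size_t out_cap) {
    size_t payload;
    assert(rec != NULL && out != NULL);
    if (rec_len < 3)
        return -1;                                 /* header is incomplete */
    payload = ((size_t)rec[1] << 8) | rec[2];      /* claimed payload length */
    /* Same shape as the developer check: 1 + 2 + payload + 16 > length. */
    if (1 + 2 + payload + 16 > rec_len)
        return -1;
    if (payload > out_cap)
        return -1;
    memcpy(out, rec + 3, payload);                 /* now within the record */
    return (long)payload;
}
```

Without the check, a record that claims a large payload but carries only a few bytes would make `memcpy` read past the end of the record, which is the over-read behind Heartbleed.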
Use a reference implementation
    int search(int x, int a[], int length) {
        int i;
        for (i = 0; i < length; i++) {
            if (x == a[i])
                return i;
        }
        return -1;
    }
(a) Correct linear search

    int search(int x, int a[], int length) {
        int L = 0;
        int R = length - 1;
        do {
            int m = (L + R) / 2;
            if (x == a[m]) {
                return m;
            } else if (x < a[m]) {   // bug fix: x > a[m]
                L = m + 1;
            } else {
                R = m - 1;
            }
        } while (L <= R);
        return -1;
    }
(b) Buggy binary search
User-defined condition: length = 3 & a[0] < a[1] < a[2]
Verification condition
Experiments on embedded Linux Busybox
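The idea of checking a patch against a reference implementation can be sketched with a bounded, exhaustive comparison. This enumeration is only a stand-in for the verification condition, which a real tool would discharge symbolically; the value range, the `fixed` flag, and the function names are assumptions made for the example. The user-defined condition length = 3 and a[0] < a[1] < a[2] is built into the loops.

```c
#include <assert.h>

/* Reference implementation: linear search. */
static int search_linear(int x, const int a[], int length) {
    for (int i = 0; i < length; i++)
        if (x == a[i])
            return i;
    return -1;
}

/* Binary search from the slide. fixed = 0 keeps the buggy comparison
   x < a[m]; fixed = 1 uses the corrected comparison x > a[m]. */
static int search_binary(int x, const int a[], int length, int fixed) {
    int L = 0, R = length - 1;
    do {
        int m = (L + R) / 2;
        if (x == a[m])
            return m;
        else if (fixed ? x > a[m] : x < a[m])
            L = m + 1;
        else
            R = m - 1;
    } while (L <= R);
    return -1;
}

/* Counts the inputs on which the two searches disagree, over all arrays
   with length 3 and a[0] < a[1] < a[2] drawn from [0, max_val], and all
   keys x in [-1, max_val + 1]. */
int count_mismatches(int fixed, int max_val) {
    int mismatches = 0;
    assert(max_val >= 2);
    for (int a0 = 0; a0 <= max_val; a0++)
        for (int a1 = a0 + 1; a1 <= max_val; a1++)
            for (int a2 = a1 + 1; a2 <= max_val; a2++) {
                int a[3] = { a0, a1, a2 };
                for (int x = -1; x <= max_val + 1; x++)
                    if (search_linear(x, a, 3) != search_binary(x, a, 3, fixed))
                        mismatches++;
            }
    return mismatches;
}
```

The buggy version disagrees with the reference on some inputs, for example a = {0, 1, 2} with x = 0, while the fixed version agrees on every input in the bounded range.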
SemGraft (ICSE18)
[Figure: SemGraft workflow. Elements shown: buggy program, reference program, symbolic analysis, verification condition, counterexample, angelic forest, component library, candidate patch, a negation step, three "Is SAT?" checks with yes/no branches, and "Patch found"]
SemGraft results
GNU Coreutils as reference
Program  Commit   Bug                    Angelix    SemGraft
sed      c35545a  Handle empty match     Correct    Correct
seq      f7d1c59  Wrong output           Correct    Correct
sed      7666fa1  Wrong output           Incorrect  Correct
sort     d1ed3e6  Wrong output           Incorrect  Correct
seq      d86d20b  Doesn't accept 0       Incorrect  Correct
sed      3a9365e  Handle s///            Incorrect  Correct

Linux Busybox as reference
Program  Commit   Bug                    Angelix    SemGraft
mkdir    f7d1c59  Segmentation fault     Incorrect  Correct
mkfifo   cdb1682  Segmentation fault     Incorrect  Correct
mknod    cdb1682  Segmentation fault     Incorrect  Correct
copy     f3653f0  Failed to copy a file  Correct    Correct
md5sum   739cf4e  Segmentation fault     Correct    Correct
cut      6f374d7  Wrong output           Incorrect  Correct
Novel applications
Use program repair in intelligent tutoring systems to give students individual attention.
Study in IIT-Kanpur (FSE17)
Acknowledgments
Co-authors:
Umair Ahmed & Amey Karkare (IIT-K)
Marcel Boehme (Monash)
Satish Chandra (Facebook)
Lars Grunske & Yannic Noller (Humboldt)
Sergey Mechtaev (NUS)
HDT Nguyen and Dawei Qi
Manh-Dung Nguyen & Van-Thuan Pham (NUS)
Mukul Prasad & Hiroaki Yoshida (Fujitsu)
Shin Hwei Tan (SUSTech)
Jooyong Yi (Innopolis)
Relevant papers:
http://www.comp.nus.edu.sg/~abhik/projects/Repair/index.html
http://www.comp.nus.edu.sg/~abhik/projects/Fuzz/
Grants:
NRF NCR program TSUNAMi project (2015-2020)
DSO grant (2013-15)
Airbus grant (2017-18)
Vulnerability detection and patching
Search
• Enhance the effectiveness of search techniques, with symbolic execution as inspiration
• Enhance coverage
• Achieve directed search
Symbolic Execution
• Explore capabilities of symbolic execution beyond directed search
• Specification inference
• Program synthesis/repair
For more details
Own website
• http://www.comp.nus.edu.sg/~abhik
Project website
• http://www.comp.nus.edu.sg/~tsunami/
Links on Repair
http://www.comp.nus.edu.sg/~abhik/projects/Repair/index.html
Links on Fuzzing
http://www.comp.nus.edu.sg/~abhik/projects/Fuzz/
Let us talk at the reception now, or tomorrow, if you are interested.
Reflections on Symbolic Execution
• Reachability of a location, e.g. KATCH, AFLGo
• Bug finding, e.g. KLEE, AFLFast
• Specification inference from a buggy program, e.g. SemFix, Angelix

Procuring digital preservation CAN be quick and painless with our new dynamic...
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 

Symbexecsearch

  • 1. Software Vulnerability Detection and Repair. Prof. Abhik Roychoudhury, National University of Singapore. KLEE Workshop London 2018
  • 2. Vulnerability Discovery; Binary Hardening; Verification; Data Protection. Agency Collaboration; Industry Collaboration; Education – NUS (Bachelors in Infosec). Research Outputs – Publications, Tools, Academic Collaboration, Exchanges, Seminars, Workshops. Enhancing local capabilities.
  • 3. Space of Problems
      - Fuzz testing: feed semi-random inputs to find hangs and crashes
      - Continuous fuzzing: incrementally find new "problems" in software
      - Crash reproduction: reconstruct a reported crash when the crashing input is not included for privacy reasons
      - Reaching nooks and corners
      - Localizing reported observable errors
      - Patching reported errors from input-output examples
  • 4. Space of Techniques
      Search
      - Random
      - Biased-random
      - Genetic (AFL Fuzzer)
      - …
      - Low set-up overhead
      - Fast, less accurate
      - Uses an objective function to steer
      Symbolic Execution
      - Dynamic symbolic execution
      - Concolic execution
      - Cluster paths based on symbolic expressions of variables
      - …
      - High set-up overhead
      - Slow, more accurate
      - Uses a logical formula to steer
  • 5. In this talk …
      Search
      - Enhance the effectiveness of search techniques, with symbolic execution as inspiration
      - [CCS16, CCS17, ICSE15]
      Symbolic Execution
      - Explore capabilities of symbolic execution beyond search, in program repair
      - [ICSE13, 15, 16, 18]
  • 6. History of fuzzing
      Developed by Barton Miller, see http://pages.cs.wisc.edu/~bart/fuzz/
      Fuzz testing is a simple technique for feeding random input to applications. The approach has three characteristics.
      - The input is random. We do not use any model of program behavior, application type, or system description. This is sometimes called black box testing.
      - The reliability criteria is simple: if the application crashes or hangs, it is considered to fail the test, otherwise it passes. Note that the application does not have to respond in a sensible manner to the input, and it can even quietly exit.
      - As a result of the first two characteristics, fuzz testing can be automated to a high degree and results can be compared across applications, operating systems, and vendors.
  • 7. Grey-box Fuzzing, as in AFL
      [Figure labels: Mutators, Test suite, Mutated files, Input Queue, Enqueue, Dequeue]
  • 8. Grey-box Fuzzing Algorithm
      Input: Seed Inputs S
      T✗ = ∅
      T = S
      if T = ∅ then
          add empty file to T
      end if
      repeat
          t = chooseNext(T)
          p = assignEnergy(t)
          for i from 1 to p do
              t' = mutate_input(t)
              if t' crashes then
                  add t' to T✗
              else if isInteresting(t') then
                  add t' to T
              end if
          end for
      until timeout reached or abort-signal
      Output: Crashing Inputs T✗
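The loop on this slide can be sketched in C as follows. This is a minimal illustration under stated assumptions, not AFL's implementation: the target `run_target` is a hypothetical stand-in modeled on a nested-comparison function that crashes on "bad!", coverage feedback is reduced to a single path id, `chooseNext` is plain FIFO order, energy is a constant, and `mutate_input` is a deterministic single-byte overwrite so that the run is reproducible. The loop stops when the queue is exhausted, which stands in for the timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LEN 4               /* fixed input length of the toy target */
#define NUM_PATHS 4         /* non-crashing paths of the toy target */
#define MAX_QUEUE 16
#define ENERGY (LEN * 256)  /* constant energy: every (position, value) pair */

/* Hypothetical target: returns a path id, or -1 to signal a crash. */
static int run_target(const unsigned char *s) {
    if (s[0] != 'b') return 0;
    if (s[1] != 'a') return 1;
    if (s[2] != 'd') return 2;
    if (s[3] != '!') return 3;
    return -1;
}

/* i-th deterministic mutation of t: overwrite one byte with one value. */
static void mutate_input(const unsigned char *t, int i, unsigned char *out) {
    memcpy(out, t, LEN);
    out[i % LEN] = (unsigned char)(i / LEN);
}

/* Returns the number of crashing inputs found; the first one is copied
   to crash_out. Stops when the queue is exhausted (stand-in for timeout). */
int greybox_fuzz(const unsigned char *seed, unsigned char *crash_out) {
    unsigned char queue[MAX_QUEUE][LEN];
    unsigned char mutant[LEN];
    int seen[NUM_PATHS] = {0};
    int size = 0, crashes = 0;

    assert(seed != NULL && crash_out != NULL);
    int seed_path = run_target(seed);
    if (seed_path < 0) {                 /* the seed itself crashes */
        memcpy(crash_out, seed, LEN);
        return 1;
    }
    seen[seed_path] = 1;
    memcpy(queue[size++], seed, LEN);    /* T = S */

    for (int next = 0; next < size; next++) {       /* t = chooseNext(T) */
        for (int i = 0; i < ENERGY; i++) {          /* p = assignEnergy(t) */
            mutate_input(queue[next], i, mutant);
            int path = run_target(mutant);
            if (path < 0) {                         /* crashing input */
                if (crashes++ == 0) memcpy(crash_out, mutant, LEN);
            } else if (!seen[path] && size < MAX_QUEUE) {  /* isInteresting */
                seen[path] = 1;
                memcpy(queue[size++], mutant, LEN);
            }
        }
    }
    return crashes;
}
```

Starting from an all-zero seed, each round of mutation uncovers one more byte of the magic string because inputs that reach a new path are kept in the queue, which is the effect the feedback loop is meant to have.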
  • 9. Programming by experienced people. Schematic:
      if (condition1)
          return    // short path, frequented by many, many inputs
      else if (condition2)
          exit      // short paths, frequented by many inputs
      else ….
  • 10. Prioritize low probability paths
      - Use a grey-box fuzzer that keeps track of the path id for each test.
      - Find the probability p that fuzzing a test t which exercises path π leads to an input which exercises path π'.
      - Give higher weight to the low-probability paths discovered, so that the search gravitates toward them and discovers new states in the Markov chain with minimal effort.
      void crashme (char* s) {
        if (s[0] == 'b')
          if (s[1] == 'a')
            if (s[2] == 'd')
              if (s[3] == '!')
                abort ();
      }
  • 11. Power Schedules
      Constant: $p(i) = \alpha(i)$. AFL uses this schedule (fuzzing ~1 minute).
          α(i): how AFL judges fuzzing time for the test exercising path i
      Cut-off Exponential:
          $p(i) = \begin{cases} 0 & \text{if } f(i) > \mu \\ \min\left(\frac{\alpha(i)}{\beta} \cdot 2^{s(i)},\, M\right) & \text{otherwise} \end{cases}$
      where
          β     is a constant
          s(i)  #times the input exercising path i has been chosen for fuzzing
          f(i)  #fuzz exercising path i (path-frequency)
          µ     mean #fuzz exercising a discovered path (avg. path-frequency)
          M     maximum energy expendable on a state
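The cut-off exponential schedule can be written as a small C function. This is a sketch under assumptions: `alpha` stands for the base energy AFL would assign to the input, and the function name, parameter order and use of `double` are illustrative choices, not taken from any fuzzer's source.

```c
#include <assert.h>

/* Cut-off exponential energy for the input exercising path i.
   alpha: base energy AFL would assign, beta: constant divisor,
   s: times this input was chosen so far, f: path frequency,
   mu: mean path frequency, max_energy: cap M. */
double coe_energy(double alpha, double beta, unsigned s,
                  double f, double mu, double max_energy) {
    assert(beta > 0.0);
    if (f > mu) return 0.0;          /* over-exercised path: skip it */
    double e = alpha / beta;
    /* multiply by 2^s, stopping early once the cap is reached */
    for (unsigned k = 0; k < s && e < max_energy; k++) e *= 2.0;
    return e < max_energy ? e : max_energy;
}
```

With alpha = 100, beta = 10 and s = 3, a path whose frequency is below the mean gets 10 · 2³ = 80 units of energy, while a path above the mean gets none until the mean catches up.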
  • 12. Results
      - Independent evaluation found crashes 19x faster on DARPA Cyber Grand Challenge (CGC) binaries.
      - Integrated into the main line of the AFL fuzzer within a year of publication (CCS16); AFL is used on a daily basis by corporations for finding vulnerabilities.
  • 13. Comments on the technologies 1
  • 14. Use of Grey-box Fuzzing
      - Greybox fuzzing is used frequently, daily in corporations
      - State-of-the-art in automated vulnerability detection
      - Extremely efficient coverage-based input generation
      - All program analysis happens before or at instrumentation time.
      - Start with a seed corpus, choose a seed file, fuzz it.
      - Add an input to the corpus only if it increases coverage.
      - Cannot be directed, unlike symbolic execution!
  • 15. In this talk …
      Search
      - Enhance the effectiveness of search techniques, with symbolic execution as inspiration
      - Enhance coverage; how to make it directed?
      Symbolic Execution
      - Explore capabilities of symbolic execution beyond directed search
  • 16. Directed Fuzzing instead of Coverage
      Crash reproduction supports:
      - In-house debugging and fixing
      - Vulnerability checking
  • 17. Using symbolic execution
      [Figure labels: Program binary; Benign input files; (Crash instruction, loaded modules, call stack, register values); Crash input files]
      Hercules Toolset
      1. Directed Search Algorithm
      2. Guided Selective Symbolic Execution
  • 18. Symbolic Analyzer. Reproduced vulnerabilities in Acrobat Reader and Media Player within a 24-hour time bound.
  • 19. (Earlier) View-point
      - Directed fuzzing as a classical constraint satisfaction problem.
      - Program analysis to identify program paths that reach given program locations.
      - Symbolic execution to derive path conditions for any of the identified paths.
      - Constraint solving to find an input that satisfies the path condition and thus reaches a program location that was given.
      [Figure: control-flow graph with nodes x > y, a = x, a = y, x+y>10, b = a, return b]
      φ1 = (x>y)∧(x+y>10)
      φ2 = ¬(x>y)∧(x+y>10)
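To make the constraint-solving step concrete, the two path conditions φ1 and φ2 can be encoded as C predicates and handed to a search routine. This is only a sketch: the exhaustive search over a small integer range is a stand-in for a real constraint solver, and the names `phi1`, `phi2` and `solve` are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

typedef bool (*path_cond)(int x, int y);

/* Path conditions from the slide. */
static bool phi1(int x, int y) { return (x > y) && (x + y > 10); }
static bool phi2(int x, int y) { return !(x > y) && (x + y > 10); }

/* Brute-force search over [lo, hi] x [lo, hi]; stands in for a solver.
   On success stores a satisfying input in *x, *y and returns true. */
bool solve(path_cond phi, int lo, int hi, int *x, int *y) {
    assert(lo <= hi && x && y);
    for (int a = lo; a <= hi; a++)
        for (int b = lo; b <= hi; b++)
            if (phi(a, b)) { *x = a; *y = b; return true; }
    return false;
}
```

Any input returned for φ1 drives execution through `a = x` and then `b = a`; an input for φ2 takes the `a = y` branch instead. A real solver replaces the enumeration but plays the same role.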
  • 20. (Later) View-point
      - Directed fuzzing as an optimization problem!
      1. Instrumentation time:
          - Instrument the program to aggregate distance values.
      2. Runtime, for each input:
          - Decide how long it is fuzzed based on distance.
          - If the input is closer to the targets, it is fuzzed for longer.
          - If the input is further away from the targets, it is fuzzed for a shorter time.
  • 21. Power Schedules - Recap
      Input: Seed Inputs S
      T✗ = ∅
      T = S
      if T = ∅ then
          add empty file to T
      end if
      repeat
          t = chooseNext(T)
          p = assignEnergy(t)
          for i from 1 to p do
              t' = mutate_input(t)
              if t' crashes then
                  add t' to T✗
              else if isInteresting(t') then
                  add t' to T
              end if
          end for
      until timeout reached or abort-signal
      Output: Crashing Inputs T✗
  • 22. Instrumentation
      - Function-level target distance (FLTD) using the call graph (CG)
      - BB-level target distance using the control-flow graph (CFG)
          1. Identify target BBs and assign distance 0
          2. Identify BBs that call functions and assign 10*FLTD
          3. For each BB, compute harmonic mean of (length of shortest path to any function-calling BB + 10*FLTD).
      [Figure: CFG for function b, with node distances 8.7, 11, 10, 30, 13, 12, N/A]
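Step 3 can be sketched as a C function that combines, for one basic block, the terms (hops to a function-calling BB + 10*FLTD of that BB). One assumption needs flagging: the sketch uses the reciprocal of the sum of reciprocals, without the factor n of the textbook harmonic mean; multiply the result by n if the normalized form is wanted. The function name and array layout are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Distance of one basic block to the targets.
   hops[k]: shortest CFG path length to the k-th function-calling BB,
   fltd[k]: function-level target distance of the function called there.
   Assumes the unnormalized form 1 / sum(1 / term). */
double bb_distance(const int *hops, const double *fltd, size_t n) {
    assert(hops && fltd && n > 0);
    double inv_sum = 0.0;
    for (size_t k = 0; k < n; k++)
        inv_sum += 1.0 / (hops[k] + 10.0 * fltd[k]);
    return 1.0 / inv_sum;
}
```

For a block that is two hops from a call site with 10*FLTD = 10 and one hop from a call site with 10*FLTD = 30, this gives 1 / (1/12 + 1/31) ≈ 8.65. That is close to the 8.7 label in the slide's figure, although the figure's exact edges are not recoverable from the transcript.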
  • 23. Directed fuzzing as optimization
      - Integrating simulated annealing as the power schedule
      - In the beginning (t = 0 min), assign the same energy to all seeds.
      - Later (t = 10 min), assign a bit more energy to seeds that are closer.
      - At exploitation (t = 80 min), assign maximal energy to seeds that are closest.
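The three time points on this slide can be reproduced with a simple annealing-style weight. The sketch below is an assumption-laden illustration: it uses linear cooling for simplicity, which is a swapped-in choice and not necessarily the cooling schedule used in the work presented, and the name `annealed_weight` and the parameter `t_x` (time at which exploitation starts) are hypothetical.

```c
#include <assert.h>

/* Energy weight in [0,1] for a seed.
   d:   seed distance to the targets, normalized to [0,1] (0 = closest)
   t:   elapsed time, t_x: time at which exploitation starts
   Linear cooling is assumed here for simplicity. */
double annealed_weight(double d, double t, double t_x) {
    assert(d >= 0.0 && d <= 1.0 && t >= 0.0 && t_x > 0.0);
    double temp = t >= t_x ? 0.0 : 1.0 - t / t_x;   /* 1 = hot, 0 = cold */
    /* hot: every seed gets 0.5; cold: weight depends only on distance */
    return (1.0 - d) * (1.0 - temp) + 0.5 * temp;
}
```

At t = 0 every seed gets weight 0.5 regardless of distance; partway through cooling closer seeds get somewhat more; once t reaches `t_x` the closest seed gets weight 1 and the farthest gets 0.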
  • 24. In this talk …
      Search
      - Enhance the effectiveness of search techniques, with symbolic execution as inspiration
      - Enhance coverage
      - Achieve directed search
      Symbolic Execution
      - Explore capabilities of symbolic execution beyond directed search
      [Figure: AFLGo vs. KLEE, with counts 84, 139, 59]
  • 25. Program Synthesis and Repair
      [Figure labels: Synthesizer; Input-output Examples; Quality Criterion; Program]
      [Figure labels: Repair System; Vulnerable program; Tests; Repaired Program]
  • 26. Search-based approach
      - 2009: GenProg [Weimer-et-al-ICSE]
      [Figure labels: EVALUATE FITNESS, DISCARD, ACCEPT, MUTATE]
  • 27. Over-fitting in Tests -> Program
      Avoid:
          if (input1) return output1
          else if (input2) return output2
          else if (input3) return output3
          ….
      [Figure labels: Vulnerable program; Tests; Repair System; Repaired Program; ARTIFACTS (symbolic formulae)]
      Generalize beyond the provided tests using symbolic reasoning.
  • 28. View-point on Repair
      1. Where to fix – in which line?
      2. Generate the candidate patches in this line.
      3. Validate the candidate patches.
      Syntactic approach; Semantic approach
  • 29. Example
      int is_upward(int inhibit, int up_sep, int down_sep) {
        int bias;
        if (inhibit)
          bias = down_sep;   // bias = up_sep + 100
        else
          bias = up_sep;
        if (bias > down_sep)
          return 1;
        else
          return 0;
      }

      inhibit  up_sep  down_sep  Observed output  Expected output  Result
      1        0       100       0                0                pass
      1        11      110       0                1                fail
      0        100     50        1                1                pass
      1        -20     60        0                1                fail
      0        0       10        0                0                pass
  • 31. Patch synthesis
      - Accumulated constraints
          f(1,11,110) > 110 ∧ f(1,0,100) ≤ 100 ∧ …
      - Find an f satisfying this constraint
          - By fixing the set of operators appearing in f
      - Candidate methods
          - Search over the space of expressions
          - Program synthesis with a fixed set of operators
      - Generated fix
          f(inhibit,up_sep,down_sep) = up_sep + 100
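The "search over the space of expressions" option can be sketched in C for the is_upward example. Assumptions: the candidate space is restricted to the hypothetical template `variable + constant`, the constraints come from the three tests with inhibit = 1 (the only ones that execute the buggy assignment), and `struct test` and `synthesize` are illustrative names. Real synthesizers use richer operator sets and constraint solving rather than enumeration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One test: inputs (inhibit, up_sep, down_sep) and whether the patched
   expression must make "bias > down_sep" true (expected output 1). */
struct test { int in[3]; bool want_greater; };

/* Enumerates candidates in[var] + c for var in 0..2, c in [c_lo, c_hi].
   Returns true and stores the first candidate that satisfies all tests. */
bool synthesize(const struct test *tests, size_t n,
                int c_lo, int c_hi, int *var, int *c) {
    assert(tests && var && c && c_lo <= c_hi);
    for (int v = 0; v < 3; v++)
        for (int k = c_lo; k <= c_hi; k++) {
            bool ok = true;
            for (size_t t = 0; t < n && ok; t++) {
                int bias = tests[t].in[v] + k;
                ok = (bias > tests[t].in[2]) == tests[t].want_greater;
            }
            if (ok) { *var = v; *c = k; return true; }
        }
    return false;
}
```

On these three constraints the only candidate of this shape that passes is `up_sep + 100` (variable index 1, constant 100), matching the generated fix on the slide. With fewer tests more candidates would pass, which is the over-fitting risk that motivates generalizing beyond the tests.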
  • 32. Semantic Repair
      [Figure labels: Test input; Concrete values; Expected output of program; Output: Value-set or Constraint; Symbolic execution; Program; Concrete Execution]
  • 33. High-level view
      [Figure labels: Buggy Program (… var = a + b – c; marked as x); Test input; Concrete Execution; Symbolic Execution with x as the only unknown; Path conditions, Output Expressions]
      x = f(Live Vars)
      Get properties of function f via symbolic execution. Construct a function f which satisfies these properties!
  • 34. Program Repair given tests
      - Generate-and-test patches (GenProg, CMU/Michigan)
      - Specification inference and patch synthesis
          - Infer specification or properties about the patch to be synthesized.
          - Meet the specification by enumeration, or by solving constraints.
          - Various works: SemFix, Angelix (NUS), Nopol (KTH), SPR (MIT), …
      - Ordering of search space of patches
          - Use minimality to prioritize the search space. (NUS)
          - Use learning approaches to prioritize the search space. (MIT)
          - Patch templates can be learnt from human fixes. (HKUST)
  • 36. State-of-the technology

      Defect                          Fixed Expressions
      Libtiff-4a24508-cc79c2b         2
      Libtiff-829d8c4-036d7bb         2
      CoreUtils-00743a1f-ec48bead     3
      CoreUtils-1dd8a331-d461bfd2     2
      CoreUtils-c5ccf29b-a04ddb8d     3

      Subject      LoC      Repair time (min)
      wireshark    2814K    23
      php          1046K    62
      gzip         491K     4
      gmp          145K     14
      libtiff      77K      14

      Scalability
      Quality: Less functionality-deleting repair than any other tool.
  • 37. Heartbleed
      (a) The buggy part of the Heartbleed-vulnerable OpenSSL
          if (hbtype == TLS1_HB_REQUEST) {
            ...
            memcpy(bp, pl, payload);
            ...
          }
      (b) A fix generated automatically
          if (hbtype == TLS1_HB_REQUEST
              && payload + 18 < s->s3->rrec.length) {
            ...
          }
      (c) The developer-provided repair
          if (1 + 2 + payload + 16 > s->s3->rrec.length)
            return 0;
          ...
          if (hbtype == TLS1_HB_REQUEST) {
            ...
          }
          else if (hbtype == TLS1_HB_RESPONSE) {
            ...
          }
          return 0;
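The generated guard in (b) and the developer's check in (c) look different, but on integers `payload + 18 < length` holds exactly when `1 + 2 + payload + 16 > length` does not. The C sketch below checks this by brute force over a small range. It only compares the two arithmetic conditions; the function names and the use of plain `unsigned` values are assumptions for illustration, and it says nothing about the rest of the OpenSSL code.

```c
#include <assert.h>
#include <stdbool.h>

/* Developer check from (c): processing continues only if this holds. */
static bool developer_accepts(unsigned payload, unsigned rec_len) {
    return !(1 + 2 + payload + 16 > rec_len);
}

/* Generated guard from (b). */
static bool generated_accepts(unsigned payload, unsigned rec_len) {
    return payload + 18 < rec_len;
}

/* Compares both guards for all payload and record lengths below limit. */
bool guards_agree(unsigned limit) {
    assert(limit <= 4096);   /* keeps the sums far from unsigned overflow */
    for (unsigned p = 0; p < limit; p++)
        for (unsigned r = 0; r < limit; r++)
            if (developer_accepts(p, r) != generated_accepts(p, r))
                return false;
    return true;
}
```

For example, a 16-byte payload is accepted by both guards when the record length is 35 and rejected by both when it is 34.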
  • 38. Use a reference implementation
      (a) Correct linear search
          int search(int x, int a[], int length) {
            int i;
            for (i=0; i<length; i++) {
              if (x == a[i])
                return i;
            }
            return -1;
          }
      (b) Buggy binary search
          int search(int x, int a[], int length) {
            int L = 0;
            int R = length-1;
            do {
              int m = (L+R)/2;
              if (x == a[m]) {
                return m;
              } else if (x < a[m]) { // bug fix: x > a[m]
                L = m+1;
              } else {
                R = m-1;
              }
            } while (L <= R);
            return -1;
          }
      User-defined condition: length = 3 & a[0] < a[1] < a[2]
      Verification condition
      Experiments on embedded Linux Busybox
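The idea of using the linear search as a reference can be illustrated by searching for an input on which the two implementations disagree, under the user-defined condition length = 3 and a[0] < a[1] < a[2]. The C sketch below enumerates small arrays instead of solving a verification condition symbolically, so it is a simplified stand-in; `find_counterexample` and the bound `max` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Reference: correct linear search. */
static int linear_search(int x, const int a[], int length) {
    for (int i = 0; i < length; i++)
        if (x == a[i]) return i;
    return -1;
}

/* Buggy binary search from the slide (comparison direction is wrong). */
static int buggy_binary_search(int x, const int a[], int length) {
    int L = 0, R = length - 1;
    do {
        int m = (L + R) / 2;
        if (x == a[m]) return m;
        else if (x < a[m]) L = m + 1;   /* bug: should test x > a[m] */
        else R = m - 1;
    } while (L <= R);
    return -1;
}

/* Enumerates sorted arrays of length 3 and keys x with values in [0, max];
   returns true and stores the first input where the outputs differ. */
bool find_counterexample(int max, int a_out[3], int *x_out) {
    assert(max >= 2 && a_out && x_out);
    int a[3];
    for (a[0] = 0; a[0] <= max; a[0]++)
        for (a[1] = a[0] + 1; a[1] <= max; a[1]++)
            for (a[2] = a[1] + 1; a[2] <= max; a[2]++)
                for (int x = 0; x <= max; x++)
                    if (linear_search(x, a, 3) != buggy_binary_search(x, a, 3)) {
                        a_out[0] = a[0]; a_out[1] = a[1]; a_out[2] = a[2];
                        *x_out = x;
                        return true;
                    }
    return false;
}
```

The first disagreement is a = {0, 1, 2} with x = 0: the reference returns index 0, while the buggy version moves L to the right twice and returns -1. A counterexample of this kind is what drives the next round of patch refinement.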
  • 39. SemGraft (ICSE18)
      [Flowchart labels: Buggy program; Reference program; Symbolic analysis; Verification condition; Is SAT? (Yes/No); Counterexample; Negate; Angelic forest; Component library; Candidate patch; Patch found]
  • 40. SemGraft results

      GNU Coreutils as reference
      Program  Commit   Bug                  Angelix    SemGraft
      sed      c35545a  Handle empty match   Correct    Correct
      seq      f7d1c59  Wrong output         Correct    Correct
      sed      7666fa1  Wrong output         Incorrect  Correct
      sort     d1ed3e6  Wrong output         Incorrect  Correct
      seq      d86d20b  Don’t accepts 0      Incorrect  Correct
      sed      3a9365e  Handle s///          Incorrect  Correct

      Linux Busybox as reference
      Program  Commit   Bug                    Angelix    SemGraft
      mkdir    f7d1c59  Segmentation fault     Incorrect  Correct
      mkfifo   cdb1682  Segmentation fault     Incorrect  Correct
      mknod    cdb1682  Segmentation fault     Incorrect  Correct
      copy     f3653f0  Failed to copy a file  Correct    Correct
      md5sum   739cf4e  Segmentation fault     Correct    Correct
      cut      6f374d7  Wrong output           Incorrect  Correct
  • 41. Novel applications. Use program repair in intelligent tutoring systems to give students individual attention. Study at IIT-Kanpur (FSE17).
  • 42. Acknowledgments
      Co-authors: Umair Ahmed & Amey Karkare (IIT-K); Marcel Boehme (Monash); Satish Chandra (Facebook); Lars Grunske & Yannic Noller (Humboldt); Sergey Mechtaev (NUS); HDT Nguyen and Dawei Qi; Manh-Dung Nguyen & Van-Thuan Pham (NUS); Mukul Prasad & Hiroaki Yoshida (Fujitsu); Shin Hwei Tan (SUSTech); Jooyong Yi (Innopolis)
      Relevant papers:
          http://www.comp.nus.edu.sg/~abhik/projects/Repair/index.html
          http://www.comp.nus.edu.sg/~abhik/projects/Fuzz/
      Grants: NRF NCR program TSUNAMi project (2015-2020); DSO grant (2013-15); Airbus grant (2017-18).
  • 43. Vulnerability detection and patching
      Search
      - Enhance the effectiveness of search techniques, with symbolic execution as inspiration
      - Enhance coverage
      - Achieve directed search
      Symbolic Execution
      - Explore capabilities of symbolic execution beyond directed search
      - Specification inference
      - Program synthesis / repair
  • 44. For more details
      Own website: http://www.comp.nus.edu.sg/~abhik
      Project website: http://www.comp.nus.edu.sg/~tsunami/
      Links on Repair: http://www.comp.nus.edu.sg/~abhik/projects/Repair/index.html
      Links on Fuzzing: http://www.comp.nus.edu.sg/~abhik/projects/Fuzz/
      Let us talk in the reception now, or tomorrow, if you are interested.
  • 45. Reflections on Symbolic Execution
      - Reachability of a location, e.g. KATCH, AFLGo
      - Bug finding, e.g. KLEE, AFLFast
      - Specification inference from buggy program, e.g. SemFix, Angelix

Editor's Notes

  1. More satisfying to me as a security researcher than any academic award.