[Abstract]
When developing a 1-day exploit code, patch diffing (binary diffing) is one of the major techniques to identify the part that security fixes are applied. This technique is well-known since long ago among reverse engineers, and thus to support the diffing, various tools such as BinDiff, TurboDiff, and Diaphora have been developed. However, although those fantastic tools greatly support the analysis, patch diffing is still a difficult task because it requires deep knowledge and experience. In order to address this issue, we conducted a pilot study with the goal to achieve a semi-automated patch diffing by applying machine-learning techniques. Based on the hypothesis that “similar types of vulnerabilities will be fixed in a similar manner,” we have applied the unsupervised machine learning technique to extract those patterns and considered the way to achieve semi-automated patch diffing. In the talk, we will show the details of our pilot study and share the insights that we have gained it. We believe that our insights will help other researchers who will conduct similar research in the future.
[ROOTCON13] Pilot Study on Semi-Automated Patch Diffing by Applying Machine-Learning Techniques
1. Pilot Study on
Semi-Automated Patch Diffing by
Applying Machine-Learning Techniques
Asuka Nakajima (@AsuNa_jp)
NTT Secure Platform Laboratories
ROOTCON 13
2. 2
#whoami
Asuka Nakajima (@AsuNa_jp)
Security Researcher at NTT Secure Platform Laboratories
Vulnerability Discovery / Reverse Engineering / IoT Security
Founder of “CTF for GIRLS”
First female infosec community in Japan (est.2014)
Black Hat Asia Review Board
From 2018-Present
Veteran Conference/Event Speaker
BlackHatUSA 2019, AsiaCCS 2019, AIS3 2018/2016, PHDays IV, SECCON,etc
3. 3
Agenda
Background
Extracting Security Fix Patterns Using
Unsupervised Machine Learning Algorithm*
Classifying Security Fixes and Other Fixes*
Conclusion
PART 1
PART 2
*Original Paper
Asuka Nakajima, Ren Kimura, Yuhei Kawakoya, Makoto Iwamura, Takeo
Hariu, “An Investigation of Method to Assist Identification of Patched Part of the Vulnerable Software Based on Patch
Diffing” Multimedia, Distributed, Cooperative, and Mobile Symposium, June 2017, Japan
4. 4
What is Patch Diffing?
Before Patched After Patched
Identify vulnerable part & Create 1-day exploit
Compare
5. 5
Example: CVE-2006-4691 (MS06-70)
Before Patched (netapi32.dll) After Patched (netapi32.dll)
if ( !v5 )
{
_wcscpy(&Dest, L"");
v6 = (wchar_t *)&v24;
}
if ( _wcslen(Str) > 0x101 ){
NetpLogPrintHelper(“NetpManageIPCConnect:
server name %ws too long
- error outn“, (char)Str);
return 87;
}
if ( *Str != 92 ){
_wcscpy(&Dest, L"");
v4 = (wchar_t *)&v24;
}
Assembly
Pseudo Code
Assembly
Pseudo Code
Stacked-Based Buffer overflow in NetpManageIPCConnect Function
Windows
Security Check
6. 6
Tools for Patch Diffing
However, patch diffing is still a difficult task
because it requires deep knowledge and experience
Bindiff (Zynamics)
https://www.zynamics.com/bindiff.html
Turbodiff (Core SECURITY)
https://www.coresecurity.com/corelabs-research/open-source-tools/turbodiff
Diaphora (Joxean Koret)
http://diaphora.re/
Acquired by Google
Semi-automated patch diffing
7. 7
Previous Work
Machine learning techniques could be applied
DarunGrim
Shows the candidate functions that
security fixes might have been applied
Approach
Use heuristics pattern-matching rules to identify the candidate functions
These patterns are manually defined by the developer
Pattern Type Score
cmp Opcode +1
test Opcode +1
0xFFFFFFF Immediate Value +3
wcslen Function Name +2
strlen Function Name +2
StringCchCopyW Function Name +2
ULongLongToUlong Function Name +2
DarunGrim (Jeongwook Oh)
9. 9
Hypothesis
@@ -672,10 +675,6 @@ static int do_ssl3_write(SSL *s,
+ if (wb->buf == NULL)
+ if (!ssl3_setup_write_buffer(s))
+ return -1;
+
if (len == 0 && !create_empty_fragment)
return 0;
CVE-2014-0198
@@ -92,8 +92,6 @@ X509_REQ *X509_to_X509_REQ(X509 *x,
pktmp = X509_get_pubkey(x);
+ if (pktmp == NULL)
+ goto err;
i = X509_REQ_set_pubkey(ret, pktmp);
EVP_PKEY_free(pktmp);
CVE-2015-0288
Similar Types of Vulnerabilities
will be Fixed in a Similar Manner
Null Pointer Dereference
Extract Fix Patterns Using Unsupervised
Machine Learning Algorithm (Cluster Analysis)
Occurs when a program
attempts to read or write
to memory with a NULL
pointer
Check weather the
pointer is NULL or not
10. 10
Challenges
Challenge 1 : Optimization
Challenge 2 : Other Fixes May Have Been Applied
1. Basic Block Reordering
2. Instruction Reordering
3. Operand Changes
4. Inline Expansion / Loop Unrolling
15. 15
Challenge 2
Other fixes may have been applied
1. Bug Fixes
2. Refactoring
3. Feature Updates
16. 16
Experiment Overview
Dataset
Target Software: OpenSSL 1.0.1
Collected 62 Security Fixes
Cluster Analysis
Hierarchical Clustering Algorithm
Extract security fix patterns which could be used
to support the semi-automated patch diffing
GOAL
17. 17
Data Collection Method [1/3]
OpenSSL 1.0.1 (git / 4675a56 (openssl 1.0.1 stable)
CVE-
ID Type Hash value
CVE-
2012-
2110
Before
Patched
d36e0ee460f41d6b64015
455c4f5414a319865c3
After
Patched
8d5505d099973a06781b7
e0e5b65861859a7d994
CVE-
2016-
6304
Before
Patched
151adf2e5cc23284a059e0
f155505006a1c9fad9
After
Patched
2c0d295e26306e15a92eb
23a84a1802005c1c137
@@ -260,7 +265,11 @@ int BN_dec2bn(BIGNUM **bn, const char *a)
a++;
}
- for (i = 0; isdigit((unsigned char)a[i]); i++) ;
+ for (i = 0; i <= (INT_MAX/4) && isdigit((unsigned char)a[i]); i++)
+ continue;
+
+ if (i > INT_MAX/4)
Compile/Disassemble before & after patched source code
Diff the source code &
Identify the patched part (Function)
Analyze Commit Log &
Release note*
*OpenSSL 1.0.1 Series Release Notes
https://web.archive.org/web/20170208161711/https://www.openssl.org/news/openssl-1.0.1-notes.html
STEP 1 STEP 2
STEP 3
18. 18
STEP 1 STEP 2
Extract the increased
instructions
Normalize the
instructions
After
Normalization
mov
mov
cmp
jne
test
je
push
jmp
lea
pop
…
trans
trans
cmp
jump
cmp
jump
stack
jump
lea
stack
…
# of occurrences
(each instruction)
trans: 2
cmp: 2
jump: 3
stack:2
lea: 1
…
Before
Normalization
Count the number of
occurrences of each
normalized instruction
trans
trans
cmp
jump
cmp
jump
stack
jump
lea
stack
…
STEP 3
mov
mov
cmp
jne
test
je
push
jmp
lea
pop
…
Before
Patched
After
Patched
push
mov
sub
jmp
…
push
mov
sub
+ mov
+ mov
jmp
…
Data Collection Method [2/3]
Feature Extraction Method
Normalized
Instructions
Increased
Instructions
19. 19
Normalized
Instruction
Type of Instruction Target Instuctions
jump Branch jns, jle, jne, jge, jae, jmp, js, jl, je, jg, ja, jb, jbe
trans Data Transfer movxz, mov, movsx, xchg, cdq
ctrans Conditional Data Transfer cmovge, cmovae, cmovs, cmovns, cmove, cmovne
stack Stack Manipulation push,pop
logical Logical Operation and, xor, or, not
arith Arithmetic Operation sub, add, imul, neg, adc
nop No Operation nop
bop Bit/Byte Operation bt, setne, sete
shift Shift Operation shr, shl, sar
func Function Operation call, ret
str String Operation repz *
cmp Comparison test, cmp
lea Address Computation lea
Opcode (Instruction) Normalization
Summarized and expressed the instructions that fall into similar
categories by one (normalized) instruction
e.g.) Branch instruction such as jns,jle,jne,jge are normalized as “jump”
Normalized the instructions which appeared in the security fixes (Function)
Data Collection Method [3/3]
20. 20
Cluster Analysis [1/2]
Divides data into groups that are meaningful/useful
Before Clustering After Clustering
Cluster 1
Cluster 2
Cluster 3
21. 21
Cluster Analysis [2/2]
Hierarchical Clustering Algorithm
Produce a classification in which small clusters of very similar data points
are nested within larger clusters of less closely-related data points*
e.g.) Agglomerative Hierarchical Clustering
Non-Hierarchical Clustering Algorithm
Generates a classification by partitioning dataset*
e.g.) K-means Clustering
*Hierarchical and non-Hierarchical Clustering
https://www.daylight.com/meetings/mug96/barnard/E-MUG95.html
Divides data into groups that are meaningful/useful
22. 22
Cluster Analysis [2/2]
Hierarchical Clustering Algorithm
Produce a classification in which small clusters of very similar data points
are nested within larger clusters of less closely-related data points*
e.g.) Agglomerative Hierarchical Clustering
*Hierarchical and non-Hierarchical Clustering
https://www.daylight.com/meetings/mug96/barnard/E-MUG95.html
Divides data into groups that are meaningful/useful
Euclidean Distance
&
Ward’s Method
d(A,B) = E(AUB) – E(P) – E(Q)
Cluster A
Cluster B
23. 23
CWE [1/2]
Classic Buffer Over flow
CWE (Common Weakness Enumeration)
List of Software Weakness Types
Gives a unique identifier (CWE-ID) to each types
e.g, CWE-120:Buffer Copy without Checking Size of Input
Latest Version:3.4 / Total 808 Weaknesses.
https://cwe.mitre.org/data/definitions/120.html
Used as a label
for each data
24. 24
CWE [2/2]
CWE organizes a wide variety of
weakness types in a hierarchical structure
*CWE Overview, IPA
https://www.ipa.go.jp/security/english/vuln/CWE_en.html
The weakness types at higher
levels in the structure gives a more
abstract and broader concept*
Structure Types
Development Concepts
Research Concepts
Architectural Concepts
[Parent] CWE-119 -> [Child] CWE-120
25. 25
CWE [2/2]
CWE organizes a wide variety of
weakness types in a hierarchical structure
*CWE Overview, IPA
https://www.ipa.go.jp/security/english/vuln/CWE_en.html
The weakness types at higher
levels in the structure gives a more
abstract and broader concept*
Structure Types
Development Concepts
Research Concepts
Architectural Concepts
CWE-ID Description
CWE-17 Code
CWE-19 Data Processing Error
CWE-254 Security Features
CWE-361 Time and State
CWE-398 Indicator of Poor Code Quality
CWE-399 Resource Management Errors
Used these root
CWE-IDs as labels
[Parent] CWE-119 -> [Child] CWE-120
27. 27
Details of the Cluster 1
CWE-ID CVE-ID
Feature Vectors
(Normalized Instruction:# of Occurrences)
CWE-19 CVE-2016-6306 jump: 9, trans: 7, cmp: 6, lea: 4, arith: 3, func: 1
CWE-19 CVE-2016-0797 jump: 7, lea: 4, cmp: 4, trans: 2, arith: 2
CWE-19 CVE-2015-0206 jump: 7, trans: 5, func: 4, stack: 4, cmp: 4, arith: 2, lea: 1
CWE-19 CVE-2014-3508
jump: 9, trans: 5, cmp: 5, stack: 4, nop: 3, lea: 2, arith: 2,
bop: 1, func: 1
CWE-398 CVE-2014-5139
jump: 7, stack: 4, cmp: 3, logical: 2, trans: 2, nop: 2,
func': 1
Most of the labels are CWE-19 (Data Processing Error)
Most of the vulnerabilities in this cluster are related to the memory or value
manipulation error, which was not initially expected by developers
• e.g.) Out-of-bounds read, info-leak, integer overflow)
A certain number of Comparison/Branch/Arithmetic Operation instructions exist
Summary
28. 28
CVE-2016-0797
Patched Part of CVE-2016-0797
@@ -190,11 +189,7 @@ int BN_hex2bn(BIGNUM **bn,
}
+ for (i = 0; i <= (INT_MAX/4) &&
isxdigit((unsigned char)a[i]); i++)
+ continue;
+
+ if (i > INT_MAX/4)
+ goto err;
- for (i = 0; isxdigit((unsigned char)a[i]); i++) ;
Added a check to confirm the
integer value is under the expected upper limit
Integer Overflow Vulnerability
Will be used in bn_expand(ret, i*4)
29. 29
CWE-ID CVE-ID
Feature Vectors:
(Normalized Instruction:# of Occurrences)
CWE-254 CVE-2015-1793 trans: 4, jump: 3, ctrans: 1, nop: 1, func: 1
CWE-254 CVE-2014-3567 trans: 4, jump: 2, func: 1
CWE-254 CVE-2014-3470 trans: 4, jump: 2, nop: 2, lea: 1, func: 1, cmp: 1
CWE-254 CVE-2015-0205 trans: 5, func: 1
CWE-254 CVE-2014-0224 trans: 5, jump: 3, logical: 2, cmp: 1
CWE-19 CVE-2014-0195 trans: 6, jump: 3, cmp: 1
Details of the Cluster 2
Most of the labels are CWE-254 (Security Features)
Most of the security fixes for the vulnerabilities in this cluster contain some sort
of error handling function
A certain number of Data Transfer/Branch/Function related instructions exist
Summary
30. 30
CVE-2014-3470
Patched Part of CVE-2014-3470
@@ -2512,13 +2512,6 @@ int ssl3_send_client_key_exchange
int field_size = 0;
+ if (s->session->sess_cert == NULL)
+ {
+ ssl3_send_alert(s,SSL3_AL_FATAL,
SSL_AD_UNEXPECTED_MESSAGE);
+ SSLerr(SSL_F_SSL3_SEND_CLIENT_KEY_EXCHANGE,
SSL_R_UNEXPECTED_MESSAGE);
+ goto err;
+ }
Added two error handling function + exit the function
NULL Pointer Dereference
31. 31
Discussions
Why Only Two Clusters?
Some vulnerabilities are found in multiple functions
Similar functions contain same vulnerability
How to Improve
Include other features such as function name?
Collect more security fixes
Use vulnerability corpus generation tools? (e.g, LAVA)
Use other machine learning techniques
For Semi-Automated Patch-Diffing
Calculate the similarity between the extracted security fix patterns
(instructions) and the difference (increased instructions) found by the
patch diffing
*LAVA: Large-scale Automated Vulnerability Addition
https://www.andreamambretti.com/files/papers/oakland2016_lava.pdf
We count the number of
occurrences of normalized instructions
Centroid
33. 33
Classifying Security Fixes and Other Fixes
Dataset
OpenSSL 1.0.1 (62 Security Fixes / 377 Other fixes)
Classification Method
Supervised Linear Classifier
Soft Margin Support Vector Machine (SVM)
Kernel: RBF (C=10, γ = 0.001)
Experiment
Used 62 Security Fixes and 62 Other fixes (Random sampling)
Conducted 10-fold Cross-Validation 3 times
• Perform random sampling for each cross-validation
Environment
OS: Ubuntu 14.04, Compiler: gcc 5.4.0
34. 34
Support Vector Machine (SVM)
Method used for classification (+regression) tasks
Before After
A
A
A
A
B
B
BB
BB
B
A
BA
B
A
A
A
A
A
B
B
BB
BB
B
A
BA
B
A
A A
35. 35
Result
Type of Fix Dataset 1 Dataset 2 Dataset 3 Average
Accuracy 0.62 0.54 0.54 0.56
Precision
Security Fix 0.70 0.57 0.57 0.61
Other 0.59 0.54 0.54 0.55
Recall
Security Fix 0.42 0.49 0.42 0.41
Other 0.82 0.71 0.68 0.73
F-Score
Security Fix 0.53 0.46 0.44 0.47
Other 0.68 0.61 0.63 0.64
36. 36
Summary of Result
Summary
Overall Accuracy : 56% (average)
Security Fixes: Precision 61% / Recall 41% (average)
Other Fixes: Precision 55% / Recall 73% (average)
Discussions
Use other metrics? (e.g., Cyclomatic complexity)
Accuracy Ratio of the number of correctly labeled fixes to the number of all fixes in the dataset
Precision
Ratio of the number of correctly labeled security fixes to the number of all fixes
labeled as “security fix” by the program
Recall
Ratio of the number of correctly labeled security fixes to the number of all security
fixes in the dataset
F-Score Harmonic mean of the Precision and Recall
Glossary
37. 37
Summary & Conclusion
Patch diffing is still a difficult task because it requires
a deep knowledge and experience
✔
Extracted security fix patterns which could be used
to support the semi-automated patch diffing
Conducted an experiment to see if it is possible to
distinguish between security fixes and other fixes
✔
✔
Provided insights for future research
related to the semi-automated patch diffing
38. 38
Appendix [1/3]
Original Paper
Asuka Nakajima, Ren Kimura, Yuhei Kawakoya, Makoto Iwamura, Takeo Hariu, “An
Investigation of Method to Assist Identification of Patched Part of the Vulnerable
Software Based on Patch Diffing” Multimedia, Distributed, Cooperative, and Mobile
Symposium, June 2017, Japan
Download URL
https://ipsj.ixsq.nii.ac.jp/ej/?action=repository_uri&item_id=190132&file
_id=1&file_no=1
39. 39
Appendix [2/3]
Other Research (1)
Asuka Nakajima, Takuya Watanabe, Eitaro Shioji, Mitsuaki Akiyama, and
Maverick Woo “A Pilot Study on Consumer IoT Device Vulnerability
Disclosure and Patch Release in Japan and the United States”
Proceedings of the 14th ACM ASIA Conference on Information, Computer and
Communications Security (ASIA CCS 2019)
[PDF] https://www.cylab.cmu.edu/_files/pdfs/tech_reports/CMUCyLab19001.pdf
Revealed Significant 1-Day Risk Related to IoT
ASIA CCS 2019
Example: CVE-2017-7852
Patch Release
Timeline
DCS-932L RevA
2015/Nov/18
DCS-932L RevA
2016/Jul/19
244 Days Vendor: D-Link, Product: Network Camera
1-Day Risk: Unsynchronized Patch Release (Geographical Arbitrage)
40. 40
Appendix [3/3]
Other Activities (Female InfoSec Community)
CTF for GIRLS: http://girls.seccon.jp (Twitter:@ctf4g)
Asuka Nakajima, Suhee Kang, Hazel Yen, “Women in Security:
Building a Female InfoSec Community in Korea, Japan, and Taiwan”,
BlackHatUSA 2019
Women-Only CTF Workshop Talk about Asian Female InfoSec Community