SlideShare ist ein Scribd-Unternehmen logo
1 von 40
Downloaden Sie, um offline zu lesen
Pilot Study on
Semi-Automated Patch Diffing by
Applying Machine-Learning Techniques
Asuka Nakajima (@AsuNa_jp)
NTT Secure Platform Laboratories
ROOTCON 13
2
#whoami
 Asuka Nakajima (@AsuNa_jp)
Security Researcher at NTT Secure Platform Laboratories
Vulnerability Discovery / Reverse Engineering / IoT Security
 Founder of “CTF for GIRLS”
First female infosec community in Japan (est.2014)
 Black Hat Asia Review Board
From 2018-Present
 Veteran Conference/Event Speaker
BlackHatUSA 2019, AsiaCCS 2019, AIS3 2018/2016, PHDays IV, SECCON,etc
3
Agenda
Background
Extracting Security Fix Patterns Using
Unsupervised Machine Learning Algorithm*
Classifying Security Fixes and Other Fixes*
Conclusion
PART 1
PART 2
*Original Paper
Asuka Nakajima, Ren Kimura, Yuhei Kawakoya, Makoto Iwamura, Takeo
Hariu, “An Investigation of Method to Assist Identification of Patched Part of the Vulnerable Software Based on Patch
Diffing” Multimedia, Distributed, Cooperative, and Mobile Symposium, June 2017, Japan
4
What is Patch Diffing?
Before Patched After Patched
Identify vulnerable part & Create 1-day exploit
Compare
5
Example: CVE-2006-4691 (MS06-70)
Before Patched (netapi32.dll) After Patched (netapi32.dll)
if ( !v5 )
{
_wcscpy(&Dest, L"");
v6 = (wchar_t *)&v24;
}
if ( _wcslen(Str) > 0x101 ){
NetpLogPrintHelper(“NetpManageIPCConnect:
server name %ws too long
- error outn“, (char)Str);
return 87;
}
if ( *Str != 92 ){
_wcscpy(&Dest, L"");
v4 = (wchar_t *)&v24;
}
Assembly
Pseudo Code
Assembly
Pseudo Code
Stacked-Based Buffer overflow in NetpManageIPCConnect Function
Windows
Security Check
6
Tools for Patch Diffing
However, patch diffing is still a difficult task
because it requires deep knowledge and experience
 Bindiff (Zynamics)
 https://www.zynamics.com/bindiff.html
 Turbodiff (Core SECURITY)
 https://www.coresecurity.com/corelabs-research/open-source-tools/turbodiff
 Diaphora (Joxean Koret)
 http://diaphora.re/
Acquired by Google
Semi-automated patch diffing
7
Previous Work
Machine learning techniques could be applied
 DarunGrim
 Shows the candidate functions that
security fixes might have been applied
 Approach
 Use heuristics pattern-matching rules to identify the candidate functions
 These patterns are manually defined by the developer
Pattern Type Score
cmp Opcode +1
test Opcode +1
0xFFFFFFF Immediate Value +3
wcslen Function Name +2
strlen Function Name +2
StringCchCopyW Function Name +2
ULongLongToUlong Function Name +2
DarunGrim (Jeongwook Oh)
Extracting Security Fix Patterns Using
Unsupervised Machine Learning Algorithm
PART 1
9
Hypothesis
@@ -672,10 +675,6 @@ static int do_ssl3_write(SSL *s,
+ if (wb->buf == NULL)
+ if (!ssl3_setup_write_buffer(s))
+ return -1;
+
if (len == 0 && !create_empty_fragment)
return 0;
CVE-2014-0198
@@ -92,8 +92,6 @@ X509_REQ *X509_to_X509_REQ(X509 *x,
pktmp = X509_get_pubkey(x);
+ if (pktmp == NULL)
+ goto err;
i = X509_REQ_set_pubkey(ret, pktmp);
EVP_PKEY_free(pktmp);
CVE-2015-0288
Similar Types of Vulnerabilities
will be Fixed in a Similar Manner
Null Pointer Dereference
Extract Fix Patterns Using Unsupervised
Machine Learning Algorithm (Cluster Analysis)
 Occurs when a program
attempts to read or write
to memory with a NULL
pointer
 Check weather the
pointer is NULL or not
10
Challenges
Challenge 1 : Optimization
Challenge 2 : Other Fixes May Have Been Applied
1. Basic Block Reordering
2. Instruction Reordering
3. Operand Changes
4. Inline Expansion / Loop Unrolling
11
Challenge1: Basic Block ReorderingChallenge 1
CVE-ID Program1 (Before Patched) Program2 (After Patched)
CVE-
2015-
1788
call 0x81031d0 <BN_copy>
test eax,eax
setnebl
jmp 0x820337a <BN_GF2m_mod_inv+970>
mov eax,DWORD PTR [esp+0x38]
mov DWORD PTR [esp+0x4],edi
mov DWORD PTR [esp],eax
call 0x8102f90 <bn_expand2>
jmp 0x8203149 <BN_GF2m_mod_inv+409>
lea eax,[esp+0x58]
mov edx,eax
jmp 0x820339a <BN_GF2m_mod_inv+1002>
shl ecx,0x5
mov DWORD PTR [esp+0x20],ecx
jmp 0x8203307 <BN_GF2m_mod_inv+855>
mov eax,DWORD PTR [esp+0x30]
mov DWORD PTR [esp+0x4],eax
mov eax,DWORD PTR [esp+0x3c]
mov DWORD PTR [esp],eax
call 0x8102f90 <bn_expand2>
jmp 0x82031dd <BN_GF2m_mod_inv+557>
mov eax,DWORD PTR [esp+0x30]
mov DWORD PTR [esp+0x4],eax
mov eax,DWORD PTR [esp+0x34]
mov DWORD PTR [esp],eax
call 0x8102f90 <bn_expand2>
jmp 0x8203190 <BN_GF2m_mod_inv+480>
call 0x81031d0 <BN_copy>
test eax,eax
setnebl
jmp 0x820338a <BN_GF2m_mod_inv+986>
mov eax,DWORD PTR [esp+0x30]
mov DWORD PTR [esp+0x4],eax
mov eax,DWORD PTR [esp+0x3c]
mov DWORD PTR [esp],eax
call 0x8102f90 <bn_expand2>
jmp 0x82031dd <BN_GF2m_mod_inv+557>
mov eax,DWORD PTR [esp+0x30]
mov DWORD PTR [esp+0x4],eax
mov eax,DWORD PTR [esp+0x34]
mov DWORD PTR [esp],eax
call 0x8102f90 <bn_expand2>
jmp 0x8203190 <BN_GF2m_mod_inv+480>
mov eax,DWORD PTR [esp+0x38]
mov DWORD PTR [esp+0x4],edi
mov DWORD PTR [esp],eax
call 0x8102f90 <bn_expand2>
jmp 0x8203149 <BN_GF2m_mod_inv+409>
lea eax,[esp+0x58]
mov edx,eax
jmp 0x82033aa <BN_GF2m_mod_inv+1018>
shl ecx,0x5
mov DWORD PTR [esp+0x20],ecx
jmp 0x8203317 <BN_GF2m_mod_inv+871>​​​​​​​
12
Challenge1: Instruction ReorderingChallenge 1
CVE-ID Program1 Program2
CVE-
2015-1789
mov [esp+44h+var_4], eax
push esi
mov ebp, [esp+4Ch+arg_4]
push esi
mov esi, [esp+4Ch+arg_0]
mov ecx, [esi]
mov eax, [esi + 8]
push edi
mov edi, [esi+4]
cmp edi, 17h
mov [esp+44h+var_4], eax
push esi
mov ebp, [esp+4Ch+arg_4]
push esi
mov esi, [esp+4Ch+arg_0]
push edi
mov edi, [esi+4]
mov ecx, [esi]
mov eax, [esi + 8]
cmp edi, 17h
IDA Pro
13
Challenge1: Operand Changes
CVE-ID Program1 Program2
CVE-
2008-5023
xor ebx, ebx
add rsp, 38h
mov eax, ebx
pop rbx
pop rbp
pop r12
pop r13
retn
xor r12d, r12d
add rsp, 38h
mov eax, r12d
pop rbx
pop rbp
pop r12
pop r13
retn
Register is different (ebx -> r12d)
Challenge 1
14
Challenges1: Inline Expansion/Loop UnrollingChallenge 1
Source code Program1 (Before) Program2 (After)
Inline
Expansion
void my_print(int n){
printf("%d", n);
}
int main(){
int n = 1;
my_print(n);
return 0;
}
<main>:
push ebp
mov ebp,esp
sub esp,0x4
mov DWORD PTR [ebp-0x4],0x1
push DWORD PTR [ebp-0x4]
call 804840b <my_print>
add esp,0x4
mov eax,0x0
leave
ret
<main>:
push 0x1
push 0x80484e0
push 0x1
call 8048310 <__printf_chk@plt>
add esp,0xc
xor eax,eax
ret
Loop
Unrolling
int main(){
int i;
for(i = 0; i < 3; i++){
printf("HelloWorld!");
}
return 0;
}
<main>:
push ebp
mov ebp,esp
sub esp,0x4
mov DWORD PTR [ebp-0x4],0x0
jmp 804842b <main+0x20>
push 0x80484c0
call 80482e0 <printf@plt>
add esp,0x4
add DWORD PTR [ebp-0x4],0x1
cmp DWORD PTR [ebp-0x4],0x2
jle 804841a <main+0xf>
<main>:
push 0x80484d0
push 0x1
call 8048310 <__printf_chk@plt>
push 0x80484d0
push 0x1
call 8048310 <__printf_chk@plt>
push 0x80484d0
push 0x1
call 8048310 <__printf_chk@plt>
15
Challenge 2
 Other fixes may have been applied
1. Bug Fixes
2. Refactoring
3. Feature Updates
16
Experiment Overview
Dataset
 Target Software: OpenSSL 1.0.1
Collected 62 Security Fixes
 Cluster Analysis
 Hierarchical Clustering Algorithm
Extract security fix patterns which could be used
to support the semi-automated patch diffing
GOAL
17
Data Collection Method [1/3]
 OpenSSL 1.0.1 (git / 4675a56 (openssl 1.0.1 stable)
CVE-
ID Type Hash value
CVE-
2012-
2110
Before
Patched
d36e0ee460f41d6b64015
455c4f5414a319865c3
After
Patched
8d5505d099973a06781b7
e0e5b65861859a7d994
CVE-
2016-
6304
Before
Patched
151adf2e5cc23284a059e0
f155505006a1c9fad9
After
Patched
2c0d295e26306e15a92eb
23a84a1802005c1c137
@@ -260,7 +265,11 @@ int BN_dec2bn(BIGNUM **bn, const char *a)
a++;
}
- for (i = 0; isdigit((unsigned char)a[i]); i++) ;
+ for (i = 0; i <= (INT_MAX/4) && isdigit((unsigned char)a[i]); i++)
+ continue;
+
+ if (i > INT_MAX/4)
Compile/Disassemble before & after patched source code
Diff the source code &
Identify the patched part (Function)
Analyze Commit Log &
Release note*
*OpenSSL 1.0.1 Series Release Notes
https://web.archive.org/web/20170208161711/https://www.openssl.org/news/openssl-1.0.1-notes.html
STEP 1 STEP 2
STEP 3
18
STEP 1 STEP 2
Extract the increased
instructions
Normalize the
instructions
After
Normalization
mov
mov
cmp
jne
test
je
push
jmp
lea
pop
…
trans
trans
cmp
jump
cmp
jump
stack
jump
lea
stack
…
# of occurrences
(each instruction)
trans: 2
cmp: 2
jump: 3
stack:2
lea: 1
…
Before
Normalization
Count the number of
occurrences of each
normalized instruction
trans
trans
cmp
jump
cmp
jump
stack
jump
lea
stack
…
STEP 3
mov
mov
cmp
jne
test
je
push
jmp
lea
pop
…
Before
Patched
After
Patched
push
mov
sub
jmp
…
push
mov
sub
+ mov
+ mov
jmp
…
Data Collection Method [2/3]
 Feature Extraction Method
Normalized
Instructions
Increased
Instructions
19
Normalized
Instruction
Type of Instruction Target Instuctions
jump Branch jns, jle, jne, jge, jae, jmp, js, jl, je, jg, ja, jb, jbe
trans Data Transfer movxz, mov, movsx, xchg, cdq
ctrans Conditional Data Transfer cmovge, cmovae, cmovs, cmovns, cmove, cmovne
stack Stack Manipulation push,pop
logical Logical Operation and, xor, or, not
arith Arithmetic Operation sub, add, imul, neg, adc
nop No Operation nop
bop Bit/Byte Operation bt, setne, sete
shift Shift Operation shr, shl, sar
func Function Operation call, ret
str String Operation repz *
cmp Comparison test, cmp
lea Address Computation lea
 Opcode (Instruction) Normalization
 Summarized and expressed the instructions that fall into similar
categories by one (normalized) instruction
 e.g.) Branch instruction such as jns,jle,jne,jge are normalized as “jump”
 Normalized the instructions which appeared in the security fixes (Function)
Data Collection Method [3/3]
20
Cluster Analysis [1/2]
Divides data into groups that are meaningful/useful
Before Clustering After Clustering
Cluster 1
Cluster 2
Cluster 3
21
Cluster Analysis [2/2]
 Hierarchical Clustering Algorithm
 Produce a classification in which small clusters of very similar data points
are nested within larger clusters of less closely-related data points*
 e.g.) Agglomerative Hierarchical Clustering
 Non-Hierarchical Clustering Algorithm
 Generates a classification by partitioning dataset*
 e.g.) K-means Clustering
*Hierarchical and non-Hierarchical Clustering
https://www.daylight.com/meetings/mug96/barnard/E-MUG95.html
Divides data into groups that are meaningful/useful
22
Cluster Analysis [2/2]
 Hierarchical Clustering Algorithm
 Produce a classification in which small clusters of very similar data points
are nested within larger clusters of less closely-related data points*
 e.g.) Agglomerative Hierarchical Clustering
*Hierarchical and non-Hierarchical Clustering
https://www.daylight.com/meetings/mug96/barnard/E-MUG95.html
Divides data into groups that are meaningful/useful
Euclidean Distance
&
Ward’s Method
d(A,B) = E(AUB) – E(P) – E(Q)
Cluster A
Cluster B
23
CWE [1/2]
Classic Buffer Over flow
 CWE (Common Weakness Enumeration)
 List of Software Weakness Types
 Gives a unique identifier (CWE-ID) to each types
 e.g, CWE-120:Buffer Copy without Checking Size of Input
 Latest Version:3.4 / Total 808 Weaknesses.
https://cwe.mitre.org/data/definitions/120.html
Used as a label
for each data
24
CWE [2/2]
CWE organizes a wide variety of
weakness types in a hierarchical structure
*CWE Overview, IPA
https://www.ipa.go.jp/security/english/vuln/CWE_en.html
 The weakness types at higher
levels in the structure gives a more
abstract and broader concept*
 Structure Types
 Development Concepts
 Research Concepts
 Architectural Concepts
[Parent] CWE-119 -> [Child] CWE-120
25
CWE [2/2]
CWE organizes a wide variety of
weakness types in a hierarchical structure
*CWE Overview, IPA
https://www.ipa.go.jp/security/english/vuln/CWE_en.html
 The weakness types at higher
levels in the structure gives a more
abstract and broader concept*
 Structure Types
 Development Concepts
 Research Concepts
 Architectural Concepts
CWE-ID Description
CWE-17 Code
CWE-19 Data Processing Error
CWE-254 Security Features
CWE-361 Time and State
CWE-398 Indicator of Poor Code Quality
CWE-399 Resource Management Errors
Used these root
CWE-IDs as labels
[Parent] CWE-119 -> [Child] CWE-120
26
Result
② ①
Cluster 2
Cluster 1
Dendrogram
CVE-ID + Label (CWE-ID)
27
Details of the Cluster 1
CWE-ID CVE-ID
Feature Vectors
(Normalized Instruction:# of Occurrences)
CWE-19 CVE-2016-6306 jump: 9, trans: 7, cmp: 6, lea: 4, arith: 3, func: 1
CWE-19 CVE-2016-0797 jump: 7, lea: 4, cmp: 4, trans: 2, arith: 2
CWE-19 CVE-2015-0206 jump: 7, trans: 5, func: 4, stack: 4, cmp: 4, arith: 2, lea: 1
CWE-19 CVE-2014-3508
jump: 9, trans: 5, cmp: 5, stack: 4, nop: 3, lea: 2, arith: 2,
bop: 1, func: 1
CWE-398 CVE-2014-5139
jump: 7, stack: 4, cmp: 3, logical: 2, trans: 2, nop: 2,
func': 1
Most of the labels are CWE-19 (Data Processing Error)
 Most of the vulnerabilities in this cluster are related to the memory or value
manipulation error, which was not initially expected by developers
• e.g.) Out-of-bounds read, info-leak, integer overflow)
 A certain number of Comparison/Branch/Arithmetic Operation instructions exist
Summary
28
CVE-2016-0797
Patched Part of CVE-2016-0797
@@ -190,11 +189,7 @@ int BN_hex2bn(BIGNUM **bn,
}
+ for (i = 0; i <= (INT_MAX/4) &&
isxdigit((unsigned char)a[i]); i++)
+ continue;
+
+ if (i > INT_MAX/4)
+ goto err;
- for (i = 0; isxdigit((unsigned char)a[i]); i++) ;
Added a check to confirm the
integer value is under the expected upper limit
Integer Overflow Vulnerability
Will be used in bn_expand(ret, i*4)
29
CWE-ID CVE-ID
Feature Vectors:
(Normalized Instruction:# of Occurrences)
CWE-254 CVE-2015-1793 trans: 4, jump: 3, ctrans: 1, nop: 1, func: 1
CWE-254 CVE-2014-3567 trans: 4, jump: 2, func: 1
CWE-254 CVE-2014-3470 trans: 4, jump: 2, nop: 2, lea: 1, func: 1, cmp: 1
CWE-254 CVE-2015-0205 trans: 5, func: 1
CWE-254 CVE-2014-0224 trans: 5, jump: 3, logical: 2, cmp: 1
CWE-19 CVE-2014-0195 trans: 6, jump: 3, cmp: 1
Details of the Cluster 2
Most of the labels are CWE-254 (Security Features)
 Most of the security fixes for the vulnerabilities in this cluster contain some sort
of error handling function
 A certain number of Data Transfer/Branch/Function related instructions exist
Summary
30
CVE-2014-3470
Patched Part of CVE-2014-3470
@@ -2512,13 +2512,6 @@ int ssl3_send_client_key_exchange
int field_size = 0;
+ if (s->session->sess_cert == NULL)
+ {
+ ssl3_send_alert(s,SSL3_AL_FATAL,
SSL_AD_UNEXPECTED_MESSAGE);
+ SSLerr(SSL_F_SSL3_SEND_CLIENT_KEY_EXCHANGE,
SSL_R_UNEXPECTED_MESSAGE);
+ goto err;
+ }
Added two error handling function + exit the function
NULL Pointer Dereference
31
Discussions
 Why Only Two Clusters?
 Some vulnerabilities are found in multiple functions
 Similar functions contain same vulnerability
 How to Improve
 Include other features such as function name?
 Collect more security fixes
 Use vulnerability corpus generation tools? (e.g, LAVA)
 Use other machine learning techniques
 For Semi-Automated Patch-Diffing
 Calculate the similarity between the extracted security fix patterns
(instructions) and the difference (increased instructions) found by the
patch diffing
*LAVA: Large-scale Automated Vulnerability Addition
https://www.andreamambretti.com/files/papers/oakland2016_lava.pdf
We count the number of
occurrences of normalized instructions
Centroid
Classifying Security Fixes and Other Fixes
PART 2
33
Classifying Security Fixes and Other Fixes
 Dataset
 OpenSSL 1.0.1 (62 Security Fixes / 377 Other fixes)
 Classification Method
 Supervised Linear Classifier
 Soft Margin Support Vector Machine (SVM)
 Kernel: RBF (C=10, γ = 0.001)
 Experiment
 Used 62 Security Fixes and 62 Other fixes (Random sampling)
 Conducted 10-fold Cross-Validation 3 times
• Perform random sampling for each cross-validation
 Environment
 OS: Ubuntu 14.04, Compiler: gcc 5.4.0
34
Support Vector Machine (SVM)
Method used for classification (+regression) tasks
Before After
A
A
A
A
B
B
BB
BB
B
A
BA
B
A
A
A
A
A
B
B
BB
BB
B
A
BA
B
A
A A
35
Result
Type of Fix Dataset 1 Dataset 2 Dataset 3 Average
Accuracy 0.62 0.54 0.54 0.56
Precision
Security Fix 0.70 0.57 0.57 0.61
Other 0.59 0.54 0.54 0.55
Recall
Security Fix 0.42 0.49 0.42 0.41
Other 0.82 0.71 0.68 0.73
F-Score
Security Fix 0.53 0.46 0.44 0.47
Other 0.68 0.61 0.63 0.64
36
Summary of Result
 Summary
 Overall Accuracy : 56% (average)
 Security Fixes: Precision 61% / Recall 41% (average)
 Other Fixes: Precision 55% / Recall 73% (average)
 Discussions
 Use other metrics? (e.g., Cyclomatic complexity)
Accuracy Ratio of the number of correctly labeled fixes to the number of all fixes in the dataset
Precision
Ratio of the number of correctly labeled security fixes to the number of all fixes
labeled as “security fix” by the program
Recall
Ratio of the number of correctly labeled security fixes to the number of all security
fixes in the dataset
F-Score Harmonic mean of the Precision and Recall
Glossary
37
Summary & Conclusion
 Patch diffing is still a difficult task because it requires
a deep knowledge and experience
✔
 Extracted security fix patterns which could be used
to support the semi-automated patch diffing
 Conducted an experiment to see if it is possible to
distinguish between security fixes and other fixes
✔
✔
Provided insights for future research
related to the semi-automated patch diffing
38
Appendix [1/3]
 Original Paper
Asuka Nakajima, Ren Kimura, Yuhei Kawakoya, Makoto Iwamura, Takeo Hariu, “An
Investigation of Method to Assist Identification of Patched Part of the Vulnerable
Software Based on Patch Diffing” Multimedia, Distributed, Cooperative, and Mobile
Symposium, June 2017, Japan
Download URL
https://ipsj.ixsq.nii.ac.jp/ej/?action=repository_uri&item_id=190132&file
_id=1&file_no=1
39
Appendix [2/3]
 Other Research (1)
 Asuka Nakajima, Takuya Watanabe, Eitaro Shioji, Mitsuaki Akiyama, and
Maverick Woo “A Pilot Study on Consumer IoT Device Vulnerability
Disclosure and Patch Release in Japan and the United States”
Proceedings of the 14th ACM ASIA Conference on Information, Computer and
Communications Security (ASIA CCS 2019)
 [PDF] https://www.cylab.cmu.edu/_files/pdfs/tech_reports/CMUCyLab19001.pdf
Revealed Significant 1-Day Risk Related to IoT
ASIA CCS 2019
Example: CVE-2017-7852
Patch Release
Timeline
DCS-932L RevA
2015/Nov/18
DCS-932L RevA
2016/Jul/19
244 Days Vendor: D-Link, Product: Network Camera
1-Day Risk: Unsynchronized Patch Release (Geographical Arbitrage)
40
Appendix [3/3]
 Other Activities (Female InfoSec Community)
 CTF for GIRLS: http://girls.seccon.jp (Twitter:@ctf4g)
 Asuka Nakajima, Suhee Kang, Hazel Yen, “Women in Security:
Building a Female InfoSec Community in Korea, Japan, and Taiwan”,
BlackHatUSA 2019
Women-Only CTF Workshop Talk about Asian Female InfoSec Community

Weitere ähnliche Inhalte

Was ist angesagt?

Csw2016 gong pwn_a_nexus_device_with_a_single_vulnerability
Csw2016 gong pwn_a_nexus_device_with_a_single_vulnerabilityCsw2016 gong pwn_a_nexus_device_with_a_single_vulnerability
Csw2016 gong pwn_a_nexus_device_with_a_single_vulnerability
CanSecWest
 
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
Positive Hack Days
 
Advanced cfg bypass on adobe flash player 18 defcon russia 23
Advanced cfg bypass on adobe flash player 18 defcon russia 23Advanced cfg bypass on adobe flash player 18 defcon russia 23
Advanced cfg bypass on adobe flash player 18 defcon russia 23
DefconRussia
 
20190521 pwn 101_by_roy
20190521 pwn 101_by_roy20190521 pwn 101_by_roy
20190521 pwn 101_by_roy
Roy
 

Was ist angesagt? (20)

Cisco IOS shellcode: All-in-one
Cisco IOS shellcode: All-in-oneCisco IOS shellcode: All-in-one
Cisco IOS shellcode: All-in-one
 
深入淺出C語言
深入淺出C語言深入淺出C語言
深入淺出C語言
 
Дмитрий Демчук. Кроссплатформенный краш-репорт
Дмитрий Демчук. Кроссплатформенный краш-репортДмитрий Демчук. Кроссплатформенный краш-репорт
Дмитрий Демчук. Кроссплатформенный краш-репорт
 
Csw2016 gong pwn_a_nexus_device_with_a_single_vulnerability
Csw2016 gong pwn_a_nexus_device_with_a_single_vulnerabilityCsw2016 gong pwn_a_nexus_device_with_a_single_vulnerability
Csw2016 gong pwn_a_nexus_device_with_a_single_vulnerability
 
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
 
Коварный code type ITGM #9
Коварный code type ITGM #9Коварный code type ITGM #9
Коварный code type ITGM #9
 
Metaprogramming and Reflection in Common Lisp
Metaprogramming and Reflection in Common LispMetaprogramming and Reflection in Common Lisp
Metaprogramming and Reflection in Common Lisp
 
Process management
Process managementProcess management
Process management
 
ITGM #9 - Коварный CodeType, или от segfault'а к работающему коду
ITGM #9 - Коварный CodeType, или от segfault'а к работающему кодуITGM #9 - Коварный CodeType, или от segfault'а к работающему коду
ITGM #9 - Коварный CodeType, или от segfault'а к работающему коду
 
Advanced cfg bypass on adobe flash player 18 defcon russia 23
Advanced cfg bypass on adobe flash player 18 defcon russia 23Advanced cfg bypass on adobe flash player 18 defcon russia 23
Advanced cfg bypass on adobe flash player 18 defcon russia 23
 
Inside Winnyp
Inside WinnypInside Winnyp
Inside Winnyp
 
Евгений Крутько, Многопоточные вычисления, современный подход.
Евгений Крутько, Многопоточные вычисления, современный подход.Евгений Крутько, Многопоточные вычисления, современный подход.
Евгений Крутько, Многопоточные вычисления, современный подход.
 
Windbg랑 친해지기
Windbg랑 친해지기Windbg랑 친해지기
Windbg랑 친해지기
 
Implementing Lightweight Networking
Implementing Lightweight NetworkingImplementing Lightweight Networking
Implementing Lightweight Networking
 
Evgeniy Muralev, Mark Vince, Working with the compiler, not against it
Evgeniy Muralev, Mark Vince, Working with the compiler, not against itEvgeniy Muralev, Mark Vince, Working with the compiler, not against it
Evgeniy Muralev, Mark Vince, Working with the compiler, not against it
 
Java, Up to Date Sources
Java, Up to Date SourcesJava, Up to Date Sources
Java, Up to Date Sources
 
Kirk Shoop, Reactive programming in C++
Kirk Shoop, Reactive programming in C++Kirk Shoop, Reactive programming in C++
Kirk Shoop, Reactive programming in C++
 
20190521 pwn 101_by_roy
20190521 pwn 101_by_roy20190521 pwn 101_by_roy
20190521 pwn 101_by_roy
 
Network security mannual (2)
Network security mannual (2)Network security mannual (2)
Network security mannual (2)
 
[HITB Malaysia 2011] Exploit Automation
[HITB Malaysia 2011] Exploit Automation[HITB Malaysia 2011] Exploit Automation
[HITB Malaysia 2011] Exploit Automation
 

Ähnlich wie [ROOTCON13] Pilot Study on Semi-Automated Patch Diffing by Applying Machine-Learning Techniques

Georgy Nosenko - An introduction to the use SMT solvers for software security
Georgy Nosenko - An introduction to the use SMT solvers for software securityGeorgy Nosenko - An introduction to the use SMT solvers for software security
Georgy Nosenko - An introduction to the use SMT solvers for software security
DefconRussia
 
Cypherock Assessment (1).pdf
Cypherock Assessment (1).pdfCypherock Assessment (1).pdf
Cypherock Assessment (1).pdf
PARNIKA GUPTA
 

Ähnlich wie [ROOTCON13] Pilot Study on Semi-Automated Patch Diffing by Applying Machine-Learning Techniques (20)

Georgy Nosenko - An introduction to the use SMT solvers for software security
Georgy Nosenko - An introduction to the use SMT solvers for software securityGeorgy Nosenko - An introduction to the use SMT solvers for software security
Georgy Nosenko - An introduction to the use SMT solvers for software security
 
Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...
Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...
Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...
 
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
 
[CCC-28c3] Post Memory Corruption Memory Analysis
[CCC-28c3] Post Memory Corruption Memory Analysis[CCC-28c3] Post Memory Corruption Memory Analysis
[CCC-28c3] Post Memory Corruption Memory Analysis
 
NYU hacknight, april 6, 2016
NYU hacknight, april 6, 2016NYU hacknight, april 6, 2016
NYU hacknight, april 6, 2016
 
16. Java stacks and queues
16. Java stacks and queues16. Java stacks and queues
16. Java stacks and queues
 
6. processes and threads
6. processes and threads6. processes and threads
6. processes and threads
 
A Unicorn Seeking Extraterrestrial Life: Analyzing SETI@home's Source Code
A Unicorn Seeking Extraterrestrial Life: Analyzing SETI@home's Source CodeA Unicorn Seeking Extraterrestrial Life: Analyzing SETI@home's Source Code
A Unicorn Seeking Extraterrestrial Life: Analyzing SETI@home's Source Code
 
Return Oriented Programming, an introduction
Return Oriented Programming, an introductionReturn Oriented Programming, an introduction
Return Oriented Programming, an introduction
 
Heap overflows for humans – 101
Heap overflows for humans – 101Heap overflows for humans – 101
Heap overflows for humans – 101
 
Advanced Debugging Using Java Bytecodes
Advanced Debugging Using Java BytecodesAdvanced Debugging Using Java Bytecodes
Advanced Debugging Using Java Bytecodes
 
Cypherock Assessment (1).pdf
Cypherock Assessment (1).pdfCypherock Assessment (1).pdf
Cypherock Assessment (1).pdf
 
Awesome_fuzzing_for _pentester_red-pill_2017
Awesome_fuzzing_for _pentester_red-pill_2017Awesome_fuzzing_for _pentester_red-pill_2017
Awesome_fuzzing_for _pentester_red-pill_2017
 
Static analysis of C++ source code
Static analysis of C++ source codeStatic analysis of C++ source code
Static analysis of C++ source code
 
Static analysis of C++ source code
Static analysis of C++ source codeStatic analysis of C++ source code
Static analysis of C++ source code
 
NDC TechTown 2023_ Return Oriented Programming an introduction.pdf
NDC TechTown 2023_ Return Oriented Programming an introduction.pdfNDC TechTown 2023_ Return Oriented Programming an introduction.pdf
NDC TechTown 2023_ Return Oriented Programming an introduction.pdf
 
Python Peculiarities
Python PeculiaritiesPython Peculiarities
Python Peculiarities
 
Online test program generator for RISC-V processors
Online test program generator for RISC-V processorsOnline test program generator for RISC-V processors
Online test program generator for RISC-V processors
 
Taint scope
Taint scopeTaint scope
Taint scope
 
How to drive a malware analyst crazy
How to drive a malware analyst crazyHow to drive a malware analyst crazy
How to drive a malware analyst crazy
 

Mehr von Asuka Nakajima

Mehr von Asuka Nakajima (7)

[Dagstuhl Seminar 17281] Similarity Calculation Method for Binary Executables
[Dagstuhl Seminar 17281] Similarity Calculation Method for Binary Executables[Dagstuhl Seminar 17281] Similarity Calculation Method for Binary Executables
[Dagstuhl Seminar 17281] Similarity Calculation Method for Binary Executables
 
技術紹介: S2E: Selective Symbolic Execution Engine
技術紹介: S2E: Selective Symbolic Execution Engine技術紹介: S2E: Selective Symbolic Execution Engine
技術紹介: S2E: Selective Symbolic Execution Engine
 
[JPCERT/CC POC Meeting] 研究紹介 + DLLハイジャックの脆弱性
[JPCERT/CC POC Meeting] 研究紹介 + DLLハイジャックの脆弱性[JPCERT/CC POC Meeting] 研究紹介 + DLLハイジャックの脆弱性
[JPCERT/CC POC Meeting] 研究紹介 + DLLハイジャックの脆弱性
 
[CSS×2.0 2014] Polyglotシェルコードの最高記録に挑戦しよう☆
[CSS×2.0 2014] Polyglotシェルコードの最高記録に挑戦しよう☆[CSS×2.0 2014] Polyglotシェルコードの最高記録に挑戦しよう☆
[CSS×2.0 2014] Polyglotシェルコードの最高記録に挑戦しよう☆
 
[セキュリティ・キャンプフォーラム 2014] 卒業生プレゼンテーション 『私とセキュリティと過去と未来』
[セキュリティ・キャンプフォーラム 2014] 卒業生プレゼンテーション  『私とセキュリティと過去と未来』[セキュリティ・キャンプフォーラム 2014] 卒業生プレゼンテーション  『私とセキュリティと過去と未来』
[セキュリティ・キャンプフォーラム 2014] 卒業生プレゼンテーション 『私とセキュリティと過去と未来』
 
[AsiaCCS2019] A Pilot Study on Consumer IoT Device Vulnerability Disclosure a...
[AsiaCCS2019] A Pilot Study on Consumer IoT Device Vulnerability Disclosure a...[AsiaCCS2019] A Pilot Study on Consumer IoT Device Vulnerability Disclosure a...
[AsiaCCS2019] A Pilot Study on Consumer IoT Device Vulnerability Disclosure a...
 
2014年10月江戸前セキュリティ勉強会資料 -セキュリティ技術者になるには-
2014年10月江戸前セキュリティ勉強会資料 -セキュリティ技術者になるには-2014年10月江戸前セキュリティ勉強会資料 -セキュリティ技術者になるには-
2014年10月江戸前セキュリティ勉強会資料 -セキュリティ技術者になるには-
 

Kürzlich hochgeladen

Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Kandungan 087776558899
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
mphochane1998
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
Epec Engineered Technologies
 
Verification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptxVerification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptx
chumtiyababu
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ssuser89054b
 

Kürzlich hochgeladen (20)

NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEGEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Verification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptxVerification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptx
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 

[ROOTCON13] Pilot Study on Semi-Automated Patch Diffing by Applying Machine-Learning Techniques

  • 1. Pilot Study on Semi-Automated Patch Diffing by Applying Machine-Learning Techniques Asuka Nakajima (@AsuNa_jp) NTT Secure Platform Laboratories ROOTCON 13
  • 2. 2 #whoami  Asuka Nakajima (@AsuNa_jp) Security Researcher at NTT Secure Platform Laboratories Vulnerability Discovery / Reverse Engineering / IoT Security  Founder of “CTF for GIRLS” First female infosec community in Japan (est.2014)  Black Hat Asia Review Board From 2018-Present  Veteran Conference/Event Speaker BlackHatUSA 2019, AsiaCCS 2019, AIS3 2018/2016, PHDays IV, SECCON,etc
  • 3. 3 Agenda Background Extracting Security Fix Patterns Using Unsupervised Machine Learning Algorithm* Classifying Security Fixes and Other Fixes* Conclusion PART 1 PART 2 *Original Paper Asuka Nakajima, Ren Kimura, Yuhei Kawakoya, Makoto Iwamura, Takeo Hariu, “An Investigation of Method to Assist Identification of Patched Part of the Vulnerable Software Based on Patch Diffing” Multimedia, Distributed, Cooperative, and Mobile Symposium, June 2017, Japan
  • 4. 4 What is Patch Diffing? Before Patched After Patched Identify vulnerable part & Create 1-day exploit Compare
  • 5. 5 Example: CVE-2006-4691 (MS06-70) Before Patched (netapi32.dll) After Patched (netapi32.dll) if ( !v5 ) { _wcscpy(&Dest, L""); v6 = (wchar_t *)&v24; } if ( _wcslen(Str) > 0x101 ){ NetpLogPrintHelper(“NetpManageIPCConnect: server name %ws too long - error outn“, (char)Str); return 87; } if ( *Str != 92 ){ _wcscpy(&Dest, L""); v4 = (wchar_t *)&v24; } Assembly Pseudo Code Assembly Pseudo Code Stacked-Based Buffer overflow in NetpManageIPCConnect Function Windows Security Check
  • 6. 6 Tools for Patch Diffing However, patch diffing is still a difficult task because it requires deep knowledge and experience  Bindiff (Zynamics)  https://www.zynamics.com/bindiff.html  Turbodiff (Core SECURITY)  https://www.coresecurity.com/corelabs-research/open-source-tools/turbodiff  Diaphora (Joxean Koret)  http://diaphora.re/ Acquired by Google Semi-automated patch diffing
  • 7. 7 Previous Work Machine learning techniques could be applied  DarunGrim  Shows the candidate functions that security fixes might have been applied  Approach  Use heuristics pattern-matching rules to identify the candidate functions  These patterns are manually defined by the developer Pattern Type Score cmp Opcode +1 test Opcode +1 0xFFFFFFF Immediate Value +3 wcslen Function Name +2 strlen Function Name +2 StringCchCopyW Function Name +2 ULongLongToUlong Function Name +2 DarunGrim (Jeongwook Oh)
  • 8. Extracting Security Fix Patterns Using Unsupervised Machine Learning Algorithm PART 1
  • 9. 9 Hypothesis @@ -672,10 +675,6 @@ static int do_ssl3_write(SSL *s, + if (wb->buf == NULL) + if (!ssl3_setup_write_buffer(s)) + return -1; + if (len == 0 && !create_empty_fragment) return 0; CVE-2014-0198 @@ -92,8 +92,6 @@ X509_REQ *X509_to_X509_REQ(X509 *x, pktmp = X509_get_pubkey(x); + if (pktmp == NULL) + goto err; i = X509_REQ_set_pubkey(ret, pktmp); EVP_PKEY_free(pktmp); CVE-2015-0288 Similar Types of Vulnerabilities will be Fixed in a Similar Manner Null Pointer Dereference Extract Fix Patterns Using Unsupervised Machine Learning Algorithm (Cluster Analysis)  Occurs when a program attempts to read or write to memory with a NULL pointer  Check weather the pointer is NULL or not
  • 10. 10 Challenges Challenge 1 : Optimization Challenge 2 : Other Fixes May Have Been Applied 1. Basic Block Reordering 2. Instruction Reordering 3. Operand Changes 4. Inline Expansion / Loop Unrolling
  • 11. 11 Challenge1: Basic Block ReorderingChallenge 1 CVE-ID Program1 (Before Patched) Program2 (After Patched) CVE- 2015- 1788 call 0x81031d0 <BN_copy> test eax,eax setnebl jmp 0x820337a <BN_GF2m_mod_inv+970> mov eax,DWORD PTR [esp+0x38] mov DWORD PTR [esp+0x4],edi mov DWORD PTR [esp],eax call 0x8102f90 <bn_expand2> jmp 0x8203149 <BN_GF2m_mod_inv+409> lea eax,[esp+0x58] mov edx,eax jmp 0x820339a <BN_GF2m_mod_inv+1002> shl ecx,0x5 mov DWORD PTR [esp+0x20],ecx jmp 0x8203307 <BN_GF2m_mod_inv+855> mov eax,DWORD PTR [esp+0x30] mov DWORD PTR [esp+0x4],eax mov eax,DWORD PTR [esp+0x3c] mov DWORD PTR [esp],eax call 0x8102f90 <bn_expand2> jmp 0x82031dd <BN_GF2m_mod_inv+557> mov eax,DWORD PTR [esp+0x30] mov DWORD PTR [esp+0x4],eax mov eax,DWORD PTR [esp+0x34] mov DWORD PTR [esp],eax call 0x8102f90 <bn_expand2> jmp 0x8203190 <BN_GF2m_mod_inv+480> call 0x81031d0 <BN_copy> test eax,eax setnebl jmp 0x820338a <BN_GF2m_mod_inv+986> mov eax,DWORD PTR [esp+0x30] mov DWORD PTR [esp+0x4],eax mov eax,DWORD PTR [esp+0x3c] mov DWORD PTR [esp],eax call 0x8102f90 <bn_expand2> jmp 0x82031dd <BN_GF2m_mod_inv+557> mov eax,DWORD PTR [esp+0x30] mov DWORD PTR [esp+0x4],eax mov eax,DWORD PTR [esp+0x34] mov DWORD PTR [esp],eax call 0x8102f90 <bn_expand2> jmp 0x8203190 <BN_GF2m_mod_inv+480> mov eax,DWORD PTR [esp+0x38] mov DWORD PTR [esp+0x4],edi mov DWORD PTR [esp],eax call 0x8102f90 <bn_expand2> jmp 0x8203149 <BN_GF2m_mod_inv+409> lea eax,[esp+0x58] mov edx,eax jmp 0x82033aa <BN_GF2m_mod_inv+1018> shl ecx,0x5 mov DWORD PTR [esp+0x20],ecx jmp 0x8203317 <BN_GF2m_mod_inv+871>​​​​​​​
  • 12. 12 Challenge1: Instruction ReorderingChallenge 1 CVE-ID Program1 Program2 CVE- 2015-1789 mov [esp+44h+var_4], eax push esi mov ebp, [esp+4Ch+arg_4] push esi mov esi, [esp+4Ch+arg_0] mov ecx, [esi] mov eax, [esi + 8] push edi mov edi, [esi+4] cmp edi, 17h mov [esp+44h+var_4], eax push esi mov ebp, [esp+4Ch+arg_4] push esi mov esi, [esp+4Ch+arg_0] push edi mov edi, [esi+4] mov ecx, [esi] mov eax, [esi + 8] cmp edi, 17h IDA Pro
  • 13. 13 Challenge1: Operand Changes CVE-ID Program1 Program2 CVE- 2008-5023 xor ebx, ebx add rsp, 38h mov eax, ebx pop rbx pop rbp pop r12 pop r13 retn xor r12d, r12d add rsp, 38h mov eax, r12d pop rbx pop rbp pop r12 pop r13 retn Register is different (ebx -> r12d) Challenge 1
  • 14. 14 Challenges1: Inline Expansion/Loop UnrollingChallenge 1 Source code Program1 (Before) Program2 (After) Inline Expansion void my_print(int n){ printf("%d", n); } int main(){ int n = 1; my_print(n); return 0; } <main>: push ebp mov ebp,esp sub esp,0x4 mov DWORD PTR [ebp-0x4],0x1 push DWORD PTR [ebp-0x4] call 804840b <my_print> add esp,0x4 mov eax,0x0 leave ret <main>: push 0x1 push 0x80484e0 push 0x1 call 8048310 <__printf_chk@plt> add esp,0xc xor eax,eax ret Loop Unrolling int main(){ int i; for(i = 0; i < 3; i++){ printf("HelloWorld!"); } return 0; } <main>: push ebp mov ebp,esp sub esp,0x4 mov DWORD PTR [ebp-0x4],0x0 jmp 804842b <main+0x20> push 0x80484c0 call 80482e0 <printf@plt> add esp,0x4 add DWORD PTR [ebp-0x4],0x1 cmp DWORD PTR [ebp-0x4],0x2 jle 804841a <main+0xf> <main>: push 0x80484d0 push 0x1 call 8048310 <__printf_chk@plt> push 0x80484d0 push 0x1 call 8048310 <__printf_chk@plt> push 0x80484d0 push 0x1 call 8048310 <__printf_chk@plt>
  • 15. 15 Challenge 2  Other fixes may have been applied 1. Bug Fixes 2. Refactoring 3. Feature Updates
  • 16. 16 Experiment Overview Dataset  Target Software: OpenSSL 1.0.1 Collected 62 Security Fixes  Cluster Analysis  Hierarchical Clustering Algorithm Extract security fix patterns which could be used to support the semi-automated patch diffing GOAL
  • 17. 17 Data Collection Method [1/3]  OpenSSL 1.0.1 (git / 4675a56 (openssl 1.0.1 stable) CVE- ID Type Hash value CVE- 2012- 2110 Before Patched d36e0ee460f41d6b64015 455c4f5414a319865c3 After Patched 8d5505d099973a06781b7 e0e5b65861859a7d994 CVE- 2016- 6304 Before Patched 151adf2e5cc23284a059e0 f155505006a1c9fad9 After Patched 2c0d295e26306e15a92eb 23a84a1802005c1c137 @@ -260,7 +265,11 @@ int BN_dec2bn(BIGNUM **bn, const char *a) a++; } - for (i = 0; isdigit((unsigned char)a[i]); i++) ; + for (i = 0; i <= (INT_MAX/4) && isdigit((unsigned char)a[i]); i++) + continue; + + if (i > INT_MAX/4) Compile/Disassemble before & after patched source code Diff the source code & Identify the patched part (Function) Analyze Commit Log & Release note* *OpenSSL 1.0.1 Series Release Notes https://web.archive.org/web/20170208161711/https://www.openssl.org/news/openssl-1.0.1-notes.html STEP 1 STEP 2 STEP 3
  • 18. 18 STEP 1 STEP 2 Extract the increased instructions Normalize the instructions After Normalization mov mov cmp jne test je push jmp lea pop … trans trans cmp jump cmp jump stack jump lea stack … # of occurrences (each instruction) trans: 2 cmp: 2 jump: 3 stack:2 lea: 1 … Before Normalization Count the number of occurrences of each normalized instruction trans trans cmp jump cmp jump stack jump lea stack … STEP 3 mov mov cmp jne test je push jmp lea pop … Before Patched After Patched push mov sub jmp … push mov sub + mov + mov jmp … Data Collection Method [2/3]  Feature Extraction Method Normalized Instructions Increased Instructions
  • 19. 19 Normalized Instruction Type of Instruction Target Instuctions jump Branch jns, jle, jne, jge, jae, jmp, js, jl, je, jg, ja, jb, jbe trans Data Transfer movxz, mov, movsx, xchg, cdq ctrans Conditional Data Transfer cmovge, cmovae, cmovs, cmovns, cmove, cmovne stack Stack Manipulation push,pop logical Logical Operation and, xor, or, not arith Arithmetic Operation sub, add, imul, neg, adc nop No Operation nop bop Bit/Byte Operation bt, setne, sete shift Shift Operation shr, shl, sar func Function Operation call, ret str String Operation repz * cmp Comparison test, cmp lea Address Computation lea  Opcode (Instruction) Normalization  Summarized and expressed the instructions that fall into similar categories by one (normalized) instruction  e.g.) Branch instruction such as jns,jle,jne,jge are normalized as “jump”  Normalized the instructions which appeared in the security fixes (Function) Data Collection Method [3/3]
  • 20. 20 Cluster Analysis [1/2] Divides data into groups that are meaningful/useful Before Clustering After Clustering Cluster 1 Cluster 2 Cluster 3
  • 21. 21 Cluster Analysis [2/2]  Hierarchical Clustering Algorithm  Produce a classification in which small clusters of very similar data points are nested within larger clusters of less closely-related data points*  e.g.) Agglomerative Hierarchical Clustering  Non-Hierarchical Clustering Algorithm  Generates a classification by partitioning dataset*  e.g.) K-means Clustering *Hierarchical and non-Hierarchical Clustering https://www.daylight.com/meetings/mug96/barnard/E-MUG95.html Divides data into groups that are meaningful/useful
  • 22. 22 Cluster Analysis [2/2]  Hierarchical Clustering Algorithm  Produce a classification in which small clusters of very similar data points are nested within larger clusters of less closely-related data points*  e.g.) Agglomerative Hierarchical Clustering *Hierarchical and non-Hierarchical Clustering https://www.daylight.com/meetings/mug96/barnard/E-MUG95.html Divides data into groups that are meaningful/useful Euclidean Distance & Ward’s Method d(A,B) = E(AUB) – E(P) – E(Q) Cluster A Cluster B
  • 23. 23 CWE [1/2] Classic Buffer Over flow  CWE (Common Weakness Enumeration)  List of Software Weakness Types  Gives a unique identifier (CWE-ID) to each types  e.g, CWE-120:Buffer Copy without Checking Size of Input  Latest Version:3.4 / Total 808 Weaknesses. https://cwe.mitre.org/data/definitions/120.html Used as a label for each data
  • 24. 24 CWE [2/2] CWE organizes a wide variety of weakness types in a hierarchical structure *CWE Overview, IPA https://www.ipa.go.jp/security/english/vuln/CWE_en.html  The weakness types at higher levels in the structure gives a more abstract and broader concept*  Structure Types  Development Concepts  Research Concepts  Architectural Concepts [Parent] CWE-119 -> [Child] CWE-120
  • 25. 25 CWE [2/2] CWE organizes a wide variety of weakness types in a hierarchical structure *CWE Overview, IPA https://www.ipa.go.jp/security/english/vuln/CWE_en.html  The weakness types at higher levels in the structure gives a more abstract and broader concept*  Structure Types  Development Concepts  Research Concepts  Architectural Concepts CWE-ID Description CWE-17 Code CWE-19 Data Processing Error CWE-254 Security Features CWE-361 Time and State CWE-398 Indicator of Poor Code Quality CWE-399 Resource Management Errors Used these root CWE-IDs as labels [Parent] CWE-119 -> [Child] CWE-120
  • 26. 26 Result ② ① Cluster 2 Cluster 1 Dendrogram CVE-ID + Label (CWE-ID)
  • 27. 27 Details of the Cluster 1 CWE-ID CVE-ID Feature Vectors (Normalized Instruction:# of Occurrences) CWE-19 CVE-2016-6306 jump: 9, trans: 7, cmp: 6, lea: 4, arith: 3, func: 1 CWE-19 CVE-2016-0797 jump: 7, lea: 4, cmp: 4, trans: 2, arith: 2 CWE-19 CVE-2015-0206 jump: 7, trans: 5, func: 4, stack: 4, cmp: 4, arith: 2, lea: 1 CWE-19 CVE-2014-3508 jump: 9, trans: 5, cmp: 5, stack: 4, nop: 3, lea: 2, arith: 2, bop: 1, func: 1 CWE-398 CVE-2014-5139 jump: 7, stack: 4, cmp: 3, logical: 2, trans: 2, nop: 2, func': 1 Most of the labels are CWE-19 (Data Processing Error)  Most of the vulnerabilities in this cluster are related to the memory or value manipulation error, which was not initially expected by developers • e.g.) Out-of-bounds read, info-leak, integer overflow)  A certain number of Comparison/Branch/Arithmetic Operation instructions exist Summary
  • 28. 28 CVE-2016-0797 Patched Part of CVE-2016-0797 @@ -190,11 +189,7 @@ int BN_hex2bn(BIGNUM **bn, } + for (i = 0; i <= (INT_MAX/4) && isxdigit((unsigned char)a[i]); i++) + continue; + + if (i > INT_MAX/4) + goto err; - for (i = 0; isxdigit((unsigned char)a[i]); i++) ; Added a check to confirm the integer value is under the expected upper limit Integer Overflow Vulnerability Will be used in bn_expand(ret, i*4)
  • 29. 29 CWE-ID CVE-ID Feature Vectors: (Normalized Instruction:# of Occurrences) CWE-254 CVE-2015-1793 trans: 4, jump: 3, ctrans: 1, nop: 1, func: 1 CWE-254 CVE-2014-3567 trans: 4, jump: 2, func: 1 CWE-254 CVE-2014-3470 trans: 4, jump: 2, nop: 2, lea: 1, func: 1, cmp: 1 CWE-254 CVE-2015-0205 trans: 5, func: 1 CWE-254 CVE-2014-0224 trans: 5, jump: 3, logical: 2, cmp: 1 CWE-19 CVE-2014-0195 trans: 6, jump: 3, cmp: 1 Details of the Cluster 2 Most of the labels are CWE-254 (Security Features)  Most of the security fixes for the vulnerabilities in this cluster contain some sort of error handling function  A certain number of Data Transfer/Branch/Function related instructions exist Summary
  • 30. 30 CVE-2014-3470 Patched Part of CVE-2014-3470 @@ -2512,13 +2512,6 @@ int ssl3_send_client_key_exchange int field_size = 0; + if (s->session->sess_cert == NULL) + { + ssl3_send_alert(s,SSL3_AL_FATAL, SSL_AD_UNEXPECTED_MESSAGE); + SSLerr(SSL_F_SSL3_SEND_CLIENT_KEY_EXCHANGE, SSL_R_UNEXPECTED_MESSAGE); + goto err; + } Added two error handling function + exit the function NULL Pointer Dereference
  • 31. 31 Discussions  Why Only Two Clusters?  Some vulnerabilities are found in multiple functions  Similar functions contain same vulnerability  How to Improve  Include other features such as function name?  Collect more security fixes  Use vulnerability corpus generation tools? (e.g, LAVA)  Use other machine learning techniques  For Semi-Automated Patch-Diffing  Calculate the similarity between the extracted security fix patterns (instructions) and the difference (increased instructions) found by the patch diffing *LAVA: Large-scale Automated Vulnerability Addition https://www.andreamambretti.com/files/papers/oakland2016_lava.pdf We count the number of occurrences of normalized instructions Centroid
  • 32. Classifying Security Fixes and Other Fixes PART 2
  • 33. 33 Classifying Security Fixes and Other Fixes  Dataset  OpenSSL 1.0.1 (62 Security Fixes / 377 Other fixes)  Classification Method  Supervised Linear Classifier  Soft Margin Support Vector Machine (SVM)  Kernel: RBF (C=10, γ = 0.001)  Experiment  Used 62 Security Fixes and 62 Other fixes (Random sampling)  Conducted 10-fold Cross-Validation 3 times • Perform random sampling for each cross-validation  Environment  OS: Ubuntu 14.04, Compiler: gcc 5.4.0
  • 34. 34 Support Vector Machine (SVM) Method used for classification (+regression) tasks Before After A A A A B B BB BB B A BA B A A A A A B B BB BB B A BA B A A A
  • 35. 35 Result Type of Fix Dataset 1 Dataset 2 Dataset 3 Average Accuracy 0.62 0.54 0.54 0.56 Precision Security Fix 0.70 0.57 0.57 0.61 Other 0.59 0.54 0.54 0.55 Recall Security Fix 0.42 0.49 0.42 0.41 Other 0.82 0.71 0.68 0.73 F-Score Security Fix 0.53 0.46 0.44 0.47 Other 0.68 0.61 0.63 0.64
  • 36. 36 Summary of Result  Summary  Overall Accuracy : 56% (average)  Security Fixes: Precision 61% / Recall 41% (average)  Other Fixes: Precision 55% / Recall 73% (average)  Discussions  Use other metrics? (e.g., Cyclomatic complexity) Accuracy Ratio of the number of correctly labeled fixes to the number of all fixes in the dataset Precision Ratio of the number of correctly labeled security fixes to the number of all fixes labeled as “security fix” by the program Recall Ratio of the number of correctly labeled security fixes to the number of all security fixes in the dataset F-Score Harmonic mean of the Precision and Recall Glossary
  • 37. 37 Summary & Conclusion  Patch diffing is still a difficult task because it requires a deep knowledge and experience ✔  Extracted security fix patterns which could be used to support the semi-automated patch diffing  Conducted an experiment to see if it is possible to distinguish between security fixes and other fixes ✔ ✔ Provided insights for future research related to the semi-automated patch diffing
  • 38. 38 Appendix [1/3]  Original Paper Asuka Nakajima, Ren Kimura, Yuhei Kawakoya, Makoto Iwamura, Takeo Hariu, “An Investigation of Method to Assist Identification of Patched Part of the Vulnerable Software Based on Patch Diffing” Multimedia, Distributed, Cooperative, and Mobile Symposium, June 2017, Japan Download URL https://ipsj.ixsq.nii.ac.jp/ej/?action=repository_uri&item_id=190132&file _id=1&file_no=1
  • 39. 39 Appendix [2/3]  Other Research (1)  Asuka Nakajima, Takuya Watanabe, Eitaro Shioji, Mitsuaki Akiyama, and Maverick Woo “A Pilot Study on Consumer IoT Device Vulnerability Disclosure and Patch Release in Japan and the United States” Proceedings of the 14th ACM ASIA Conference on Information, Computer and Communications Security (ASIA CCS 2019)  [PDF] https://www.cylab.cmu.edu/_files/pdfs/tech_reports/CMUCyLab19001.pdf Revealed Significant 1-Day Risk Related to IoT ASIA CCS 2019 Example: CVE-2017-7852 Patch Release Timeline DCS-932L RevA 2015/Nov/18 DCS-932L RevA 2016/Jul/19 244 Days Vendor: D-Link, Product: Network Camera 1-Day Risk: Unsynchronized Patch Release (Geographical Arbitrage)
  • 40. 40 Appendix [3/3]  Other Activities (Female InfoSec Community)  CTF for GIRLS: http://girls.seccon.jp (Twitter:@ctf4g)  Asuka Nakajima, Suhee Kang, Hazel Yen, “Women in Security: Building a Female InfoSec Community in Korea, Japan, and Taiwan”, BlackHatUSA 2019 Women-Only CTF Workshop Talk about Asian Female InfoSec Community