Malware Detection using Machine Learning & Deep Learning
Minh Đức + Đình Phúc
CyRadar Team at SBC 2019
Disclaimer: This topic is about Machine Learning & Deep Learning
Contents
1. Reality
2. In Research
3. In CyRadar
4. Demo
5. Conclusion
6. Q&A
1. Reality
Malware
Over 350,000 new malware samples per day.
- Is a very big threat in today's computing world.
- Continues to grow in volume and evolve in complexity.
- There are many malware generators.
- The number of websites distributing malware is increasing at an alarming rate and is getting out of control.
Malware detection
- Signature-based: code, hash, behavior, rules, ...
Advantages:
• High accuracy.
Disadvantages:
• Unable to detect new malware.
• Easy to bypass.
• Requires frequent database updates.
• Relies on human expertise to create the signatures.
*A Theoretical Feature-wise Study of Malware Detection Techniques
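The hash variant of signature-based detection, and why it is easy to bypass, can be sketched in a few lines. This is a minimal illustration, not a description of any product: the signature set `KNOWN_BAD_SHA256` and the sample bytes are made up for the example.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known-malicious files.
# The single entry is the digest of a placeholder byte string, standing in
# for a real signature feed.
KNOWN_BAD_SHA256 = {hashlib.sha256(b"known-bad-sample").hexdigest()}

def is_known_malware(data, signatures=KNOWN_BAD_SHA256):
    """Return True when the SHA-256 of `data` appears in the signature set."""
    return hashlib.sha256(data).hexdigest() in signatures

original = b"known-bad-sample"
variant = original + b"\x00"   # one appended byte, e.g. padding

print(is_known_malware(original))  # True
print(is_known_malware(variant))   # False
```

Changing a single byte produces a different digest, so the variant is missed until its hash is added to the database.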
2. In Research
1. Malware Detection using Machine Learning and Deep Learning | Hemant Rathore, Swati Agarwal, Sanjay K. Sahay and Mohit Sewak | BITS, Pilani, Dept. of CS & IS, Goa Campus, Goa, India | 4 Apr 2019
2. Malware Detection using Windows Api Sequence and Machine Learning | Chandrasekar Ravi, R Manoharan | Department of Computer Science and Engineering, Pondicherry Engineering College, Pillaichavady, Puducherry - 605014, India | April 2012
3. DeepSign: Deep Learning for Automatic Malware Signature Generation and Classification | Eli (Omid) David | Dept. of Computer Science, Bar-Ilan University | 23 Nov 2017
4. DeepAM: a heterogeneous deep learning framework for intelligent malware detection | Yanfang Ye, Lingwei Chen, Shifu Hou, William Hardy, Xin Li | 12 May 2016
5. Behavior-based features model for malware detection | Hisham Shehata Galal, Yousef Bassyouni Mahdy, Mohammed Ali Atiea | 12 December 2014
6. A Fast Malware Detection Algorithm Based on Objective-Oriented Association Mining | Yuxin Ding, Xuebing Yuan, Ke Tang, Xiao Xiao, Yibin Zhang | 19 January 2013
Machine learning principle
Training phase: Benign/malware -> Extract features -> Training -> Predictive model
Detection phase: Unknown -> Extract features -> Predictive model -> Model decision
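The two phases can be sketched with a deliberately tiny model. The example below uses a nearest-centroid classifier, chosen only because it fits in a few lines of standard-library Python; it is not the model used in the talk, and the feature extractor and training samples are invented for illustration.

```python
from statistics import fmean

def extract_features(opcodes):
    """Toy feature extractor: fraction of 'push' and of 'xor' mnemonics."""
    total = len(opcodes) or 1
    return [opcodes.count("push") / total, opcodes.count("xor") / total]

def train(samples, labels):
    """Training phase: compute one centroid (mean feature vector) per label."""
    model = {}
    for label in set(labels):
        rows = [extract_features(s) for s, lab in zip(samples, labels) if lab == label]
        model[label] = [fmean(col) for col in zip(*rows)]
    return model

def predict(model, sample):
    """Detection phase: return the label whose centroid is closest."""
    x = extract_features(sample)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))

# Made-up training data, for illustration only.
samples = [
    ["push", "push", "mov", "call"],   # benign
    ["push", "mov", "mov", "call"],    # benign
    ["xor", "xor", "xor", "jmp"],      # malware
    ["xor", "xor", "mov", "jmp"],      # malware
]
labels = ["benign", "benign", "malware", "malware"]

model = train(samples, labels)
print(predict(model, ["xor", "xor", "push", "jmp"]))  # malware
```

The same `extract_features` function is applied in both phases, which is what lets the trained model make a decision about an unknown file.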
Dataset:
• VirusTotal.
• Windows API library.
• VxHeavens website.
• Malicia project.
• ...
Small, outdated data.
Malware detection
Static analysis
Features:
• Raw bytes.
• Strings.
• Header.
• Metadata.
• Entropy.
• Opcodes.
• ...
Dynamic analysis
Features:
• API calls.
• Resource usage.
• Ports.
• Host.
• Arguments.
• ...
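As one concrete static feature from the list above, byte entropy can be computed without executing the file. The sketch below is a generic Shannon entropy calculation; the inputs are synthetic byte strings rather than real executables.

```python
import math
from collections import Counter

def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    # Sum p * log2(1/p) over the observed byte values.
    return sum((c / n) * math.log2(n / c) for c in Counter(data).values())

print(shannon_entropy(b"\x00" * 1024))          # 0.0, every byte identical
print(shannon_entropy(bytes(range(256)) * 4))   # 8.0, uniform over all byte values
```

Values close to 8 bits per byte are often associated with compressed or encrypted content, which is one reason entropy is commonly used as a feature.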
Malware detection
Static analysis
Advantages:
• Allows malicious files to be detected prior to execution.
• Easy to run.
• Fast identification.
Dynamic analysis
Advantages:
• Detects previously unseen types of malware attacks.
• Detects polymorphic malware.
Malware detection
Static analysis
Disadvantages:
• Fails to detect polymorphic malware.
• Needs a separate model per sub-type.
• Can be misled by encryption, fileless malware, ...
Dynamic analysis
Disadvantages:
• Hard to extract features.
• Storage complexity for behavioral patterns.
• Time complexity.
Algorithms:
• Supervised learning:
  • Decision tree.
  • Random forest.
  • Logistic regression.
  • SVM.
  • Deep learning.
  • ...
• Unsupervised learning:
  • KNN (k-nearest neighbors, which is normally classed as a supervised method).
• Many algorithms achieve good results (> 90%).
• Random forest has the best results.
The AV Industry
1. CrowdStrike
2. Cylance
3. Endgame
4. MAX
5. Trapmine
6. SentinelOne
7. Sophos ML
3. In CyRadar
Opcode frequency models
• Binary classification problem
• Static analysis
PE32 files -> opcode frequency vectors:
push    xor     ...   call    jm
0.125   0.23    ...   0.345   0.098
0.071   0.123   ...   0.32    0.22
Opcode: the portion of a machine language instruction that specifies what operation is to be performed by the central processing unit (CPU).
Step 1: Collect data:
• Benign: download pages, Windows system files.
• Malware: VirusTotal.
Step 2: Data cleaning:
1. Remove duplicated files.
2. Verify with VirusTotal's API.
Source                      Number of files
Crawl from download pages   10899
Windows 7                   4804
Windows 8                   7768
Windows 10                  8394
VirusTotal                  44984
Benign   Malware
31865    44984
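Removing duplicated files is commonly done by comparing content hashes rather than file names. The sketch below assumes that approach; the slides do not say how duplicates were detected, and the in-memory `collected` dictionary is a hypothetical stand-in for a directory of samples. Verification against VirusTotal is left out because it depends on an external service.

```python
import hashlib

def deduplicate(files):
    """Keep the first file seen for each distinct SHA-256 content hash.

    `files` maps a file name to its raw bytes.
    """
    seen = set()
    unique = {}
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[name] = data
    return unique

collected = {
    "setup.exe": b"MZ\x90\x00 sample A",
    "setup_copy.exe": b"MZ\x90\x00 sample A",   # same content, different name
    "tool.exe": b"MZ\x90\x00 sample B",
}
print(sorted(deduplicate(collected)))  # ['setup.exe', 'tool.exe']
```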
Step 3: Extract features:
1. Disassemble files.
2. Calculate each opcode's frequency.
• Feature matrix: 63730 files × 1230 opcodes
mov   push   ...   xor   and
120   150    ...   100   30
065   12     ...   239   123
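Once a file has been disassembled, building one row of the feature matrix is a counting exercise. The sketch below assumes the disassembler output is already available as a list of mnemonics; the vocabulary and the two example files are invented, and the disassembly step itself is not shown.

```python
from collections import Counter

def opcode_counts(mnemonics, vocabulary):
    """Count how often each opcode in `vocabulary` occurs in one file."""
    counts = Counter(mnemonics)
    return [counts[op] for op in vocabulary]

# Hypothetical vocabulary and disassembly output, for illustration only.
vocabulary = ["mov", "push", "xor", "and"]
files = {
    "a.exe": ["push", "mov", "mov", "xor", "call", "mov"],
    "b.exe": ["and", "push", "push", "xor"],
}

matrix = [opcode_counts(ops, vocabulary) for ops in files.values()]
for row in matrix:
    print(row)
# [3, 1, 1, 0]
# [0, 2, 1, 1]
```

Opcodes outside the vocabulary (here `call`) are simply ignored, so every file yields a row of the same length.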
Step 4: Data preprocessing:
1. Variance threshold (0.1).
2. Calculate opcode percentage.
3. Remove NaNs.
4. Standardize features.
• Feature matrix: 50388 files × 681 opcodes
• Reduces the number of features by ~45%
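The four preprocessing operations can be sketched in plain Python, applied in the order listed. The slides do not name a library or give further parameters, so the function names, the use of population variance, and the small count matrix are all assumptions made for this example.

```python
import math
from statistics import fmean, pstdev, pvariance

def variance_threshold(matrix, threshold=0.1):
    """Drop columns whose population variance is not above `threshold`."""
    cols = list(zip(*matrix))
    keep = [i for i, col in enumerate(cols) if pvariance(col) > threshold]
    return [[row[i] for i in keep] for row in matrix], keep

def to_percentages(matrix):
    """Divide each row by its sum; an all-zero row becomes a row of NaNs."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total else math.nan for v in row])
    return out

def drop_nan_rows(matrix):
    """Remove rows that contain at least one NaN."""
    return [row for row in matrix if not any(math.isnan(v) for v in row)]

def standardize(matrix):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*matrix))
    means = [fmean(c) for c in cols]
    stds = [pstdev(c) or 1.0 for c in cols]   # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in matrix]

# Made-up opcode counts: the third column is constant, the last row is empty
# once that column is removed.
counts = [
    [120, 150, 7, 30],
    [65, 12, 7, 123],
    [10, 40, 7, 50],
    [0, 0, 7, 0],
]
reduced, kept = variance_threshold(counts)
features = standardize(drop_nan_rows(to_percentages(reduced)))
print(kept)                             # [0, 1, 3]
print(len(features), len(features[0]))  # 3 3
```

In this toy run the constant column is removed by the variance threshold and the empty file is removed as a NaN row, mirroring how both the feature count and the file count shrink in the slide.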
Step 5: Training:
1. Split train-test data:
• Train: (45349, 681)
• Test: (5039, 681)
2. Try with algorithms:
• Random forest.
• SVM.
• Linear regression.
• Neural network (9 layers).
[Figures: random forest; neural network]
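The train and test shapes above are consistent with holding out roughly 10% of the 50388 rows. The sketch below assumes a shuffled 90/10 split; the fraction and the seed are inferred or arbitrary, not stated in the slides.

```python
import random

def train_test_split(rows, test_fraction=0.1, seed=0):
    """Shuffle row indices and split them into train and test index lists."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    return indices[n_test:], indices[:n_test]

rows = range(50388)   # stand-in for the 50388 feature rows
train_idx, test_idx = train_test_split(rows)
print(len(train_idx), len(test_idx))  # 45349 5039
```

Shuffling before splitting matters here because the collected files are grouped by source, so an unshuffled split could put one class almost entirely in the test set.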
Step 6: Evaluate models:
• Test set: ~5000 files:
  • ~2900 malware
  • ~2100 benign
Algorithm                          Precision         Recall
Random Forest (Machine Learning)   98% (1996/2037)   97% (2037/2100)
Deep learning (DNN, 9 layers)      96% (1955/2037)   97% (2037/2100)
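For reference, precision and recall are computed from true positives (TP), false positives (FP) and false negatives (FN). The sketch below uses hypothetical counts to show the definitions, then checks that the percentages in the table match the ratios printed next to them.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of actual positives that the model found."""
    return tp / (tp + fn)

# Hypothetical confusion-matrix counts, not taken from the slides.
tp, fp, fn = 90, 10, 30
print(f"precision = {precision(tp, fp):.2f}")  # precision = 0.90
print(f"recall    = {recall(tp, fn):.2f}")     # recall    = 0.75

# The ratios listed in the table, rounded to whole percent.
for num, den in [(1996, 2037), (2037, 2100), (1955, 2037)]:
    print(f"{num}/{den} = {round(100 * num / den)}%")
# 1996/2037 = 98%
# 2037/2100 = 97%
# 1955/2037 = 96%
```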
4. Demo
5. Conclusion
1. Malware continues to grow in volume and evolve in complexity.
2. Traditional approaches are less effective at detecting new malware.
3. There is a lot of research using ML & DL to detect malware.
4. Industry is trying to apply it in real-world products.
Internet Shield
Advanced Threat Detection
Web, Email, DNS, EDR
EDR integrated with Threat Intelligence Platform
6. Q&A

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Machine Learning & Deep Learning for Effective Malware Detection

  • 1. Malware Detection using Machine Learning & Deep Learning. Minh Đức + Đình Phúc, CyRadar Team, at SBC 2019
  • 2. Disclaimer: This topic is about Machine Learning & Deep Learning
  • 3. Contents 1. Reality 2. In Research 3. In CyRadar 4. Demo 5. Conclusion 6. Q&A
  • 5. Over 350,000 new malware samples per day
  • 6. Malware:
      - is a very big threat in today’s computing world;
      - continues to grow in volume and evolve in complexity;
      - is produced by a lot of malware generators;
      - is distributed by a number of websites that is increasing at an alarming rate and getting out of control.
  • 7. Malware detection. Signature-based: code, hash, behavior, rules, ...
      Advantages:
      - High accuracy.
      Disadvantages:
      - Unable to detect new malware.
      - Easy to bypass.
      - Requires frequent database updates.
      - Relies on human expertise in creating the signatures.
  • 8. *A Theoretical Feature-wise Study of Malware Detection Techniques
  • 10. Research papers:
      1. Malware Detection using Machine Learning and Deep Learning | Hemant Rathore, Swati Agarwal, Sanjay K. Sahay and Mohit Sewak | BITS, Pilani, Dept. of CS & IS, Goa Campus, Goa, India | 4 Apr 2019
      2. Malware Detection using Windows Api Sequence and Machine Learning | Chandrasekar Ravi, R Manoharan | Department of Computer Science and Engineering, Pondicherry Engineering College, Pillaichavady, Puducherry - 605014, India | April 2012
      3. DeepSign: Deep Learning for Automatic Malware Signature Generation and Classification | Eli (Omid) David | Dept. of Computer Science, Bar-Ilan University | 23 Nov 2017
      4. DeepAM: a heterogeneous deep learning framework for intelligent malware detection | Yanfang Ye, Lingwei Chen, Shifu Hou, William Hardy, Xin Li | 12 May 2016
      5. Behavior-based features model for malware detection | Hisham Shehata Galal, Yousef Bassyouni Mahdy, Mohammed Ali Atiea | 12 December 2014
      6. A Fast Malware Detection Algorithm Based on Objective-Oriented Association Mining | Yuxin Ding, Xuebing Yuan, Ke Tang, Xiao Xiao, Yibin Zhang | 19 January 2013
  • 11. Machine learning principle:
      - Training phase: benign/malware samples → extract features → training → predictive model.
      - Detection phase: unknown sample → extract features → predictive model → model decision.
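The two phases on slide 11 can be sketched in a few lines. This is only an illustration: the nearest-centroid classifier, the function names `train` and `predict`, and the two-feature vectors are assumptions made for the example, not the model or data used in the talk.

```python
from statistics import fmean


def train(samples, labels):
    """Training phase: build one centroid (mean feature vector) per label."""
    grouped = {}
    for vector, label in zip(samples, labels):
        grouped.setdefault(label, []).append(vector)
    return {
        label: [fmean(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(model, vector):
    """Detection phase: return the label whose centroid is closest."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vector, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical feature vectors, for illustration only.
training_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
training_labels = ["benign", "benign", "malware", "malware"]

model = train(training_vectors, training_labels)   # predictive model
decision = predict(model, [0.15, 0.8])             # model decision for an unknown sample
```

Any classifier with the same fit-then-predict shape (random forest, SVM, neural network) slots into the same two phases.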
  • 12. Dataset: VirusTotal; Windows API library; VxHeavens website; Malicia project; ... Limitation: the data is small and outdated.
  • 13. Malware detection features:
      - Static analysis: raw bytes, strings, header, metadata, entropy, opcodes, ...
      - Dynamic analysis: API calls, resource usage, ports, host, arguments, ...
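Two of the static features listed on slide 13, entropy and strings, can be computed directly from a file's raw bytes. The sketch below is a minimal illustration; the function names, the 4-character minimum string length, and the sample bytes are assumptions, not details from the talk.

```python
import math
import re
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Runs of printable ASCII characters that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


# Hypothetical byte blob standing in for part of an executable.
blob = b"\x00\x01kernel32.dll\x00ab\x00CreateFileA"
strings_found = printable_strings(blob)
entropy = byte_entropy(blob)
```

High entropy is commonly associated with packed or encrypted content, which is one reason it appears among static features.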
  • 14. Malware detection, advantages:
      - Static analysis: allows malicious files to be detected prior to execution; easy to run; fast identification.
      - Dynamic analysis: detects previously unseen types of malware attacks; detects polymorphic malware.
  • 15. Malware detection, disadvantages:
      - Static analysis: fails to detect polymorphic malware; needs one model per sub-type; makes mistakes on encrypted files, fileless malware, ...
      - Dynamic analysis: features are hard to extract; storage complexity for behavioral patterns; time complexity.
  • 16. Algorithms:
      - Supervised learning: Decision tree, Random forest, Logistic regression, SVM, Deep learning, ...
      - Unsupervised learning: KNN.
      - A lot of algorithms have good results (> 90%).
      - Random forest has the best results.
  • 17. The AV Industry: 1. CrowdStrike 2. Cylance 3. Endgame 4. MAX 5. Trapmine 6. SentinelOne 7. Sophos ML
  • 19. Opcode frequency models: binary classification problem, static analysis. Each PE32 file is represented as a vector of opcode frequencies:
      push    xor     ...   call    jm
      0.125   0.23    ...   0.345   0.098
      0.071   0.123   ...   0.32    0.22
  • 20. Opcode is the portion of a machine language instruction that specifies what operation is to be performed by the central processing unit (CPU).
  • 21. Step 1: Collect data. Benign: download pages, Windows system files. Malware: VirusTotal.
  • 22. Step 1: Collect data. Step 2: Data cleaning: 1. Remove duplicated files. 2. Verify with VirusTotal's API.
      Source                       Number of files
      Crawl from download pages    10899
      Windows 7                    4804
      Windows 8                    7768
      Windows 10                   8394
      VirusTotal                   44984
      Totals: Benign 31865, Malware 44984.
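The "remove duplicated files" step on slide 22 is typically done by comparing content hashes. The sketch below assumes SHA-256 and works on in-memory `(name, content)` pairs so that it is self-contained; the talk does not say which hash or tooling was used, and the verification against VirusTotal's API is not shown.

```python
import hashlib


def deduplicate(samples):
    """Keep the first sample name seen for each SHA-256 content digest."""
    unique = {}
    for name, content in samples:
        digest = hashlib.sha256(content).hexdigest()
        unique.setdefault(digest, name)  # later copies of the same content are ignored
    return unique


# Hypothetical samples: the second entry is a byte-for-byte copy of the first.
samples = [
    ("a.exe", b"MZ\x90\x00"),
    ("copy_of_a.exe", b"MZ\x90\x00"),
    ("b.exe", b"MZ\x90\x01"),
]
unique = deduplicate(samples)
kept_names = list(unique.values())
```

Hash-based deduplication only removes exact copies; files that differ by a single byte are treated as distinct.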
  • 23. Step 1: Collect data. Step 2: Data cleaning. Step 3: Extract features: 1. Disassemble files. 2. Calculate each opcode's frequency. Features matrix: 63730 files × 1230 opcodes.
      mov   push   ...   xor   and
      120   150    ...   100   30
      065   12     ...   239   123
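Once files are disassembled, building the features matrix from slide 23 amounts to counting mnemonics per file. The sketch below assumes the disassembler has already produced a list of opcode mnemonics for each file; the function names and the tiny example listings are hypothetical.

```python
from collections import Counter


def opcode_counts(mnemonics):
    """Count how often each opcode mnemonic occurs in one disassembled file."""
    return Counter(m.lower() for m in mnemonics)


def feature_matrix(files, vocabulary):
    """One row per file, one column per opcode in the vocabulary."""
    rows = []
    for mnemonics in files:
        counts = opcode_counts(mnemonics)
        rows.append([counts.get(opcode, 0) for opcode in vocabulary])
    return rows


# Hypothetical disassembly output for two files.
files = [
    ["mov", "push", "mov", "xor", "call"],
    ["push", "push", "and", "mov"],
]
vocabulary = sorted({opcode for listing in files for opcode in listing})
matrix = feature_matrix(files, vocabulary)
```

With the real corpus the vocabulary would be the set of all opcodes observed across every file, giving the 1230 columns reported on the slide.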
  • 24. Step 1: Collect data. Step 2: Preprocessing. Step 3: Extract features. Step 4: Data preprocessing: 1. Variance threshold (0.1). 2. Remove NaNs. Features matrix: 50388 files × 681 opcodes. Reduces the number of features by ~45%.
  • 25. Step 1: Collect data. Step 2: Preprocessing. Step 3: Extract features. Step 4: Data preprocessing: 1. Variance threshold (0.1). 2. Calculate opcode percentage. 3. Remove NaNs. 4. Standardize features.
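The preprocessing steps on slide 25 can be sketched with the standard library alone. The details here are assumptions: population variance is used for the threshold, "remove NaNs" is interpreted as dropping files whose opcode counts sum to zero (which would otherwise give 0/0), and the count matrix is made up for the example.

```python
from statistics import fmean, pstdev, pvariance


def variance_threshold(rows, threshold=0.1):
    """Drop columns whose population variance is not above the threshold."""
    columns = list(zip(*rows))
    keep = [i for i, column in enumerate(columns) if pvariance(column) > threshold]
    return [[row[i] for i in keep] for row in rows], keep


def to_fractions(rows):
    """Convert counts to per-file fractions, skipping rows that sum to zero."""
    result = []
    for row in rows:
        total = sum(row)
        if total > 0:
            result.append([value / total for value in row])
    return result


def standardize(rows):
    """Scale each column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [fmean(column) for column in columns]
    stds = [pstdev(column) or 1.0 for column in columns]  # avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical opcode counts: columns 1 and 3 are constant and carry no signal.
counts = [[4, 1, 5, 0], [2, 1, 8, 0], [6, 1, 2, 0]]
filtered, kept = variance_threshold(counts, threshold=0.1)
fractions = to_fractions(filtered)
scaled = standardize(fractions)
```

Dropping near-constant columns is what takes the matrix from 1230 to 681 opcodes in the slides; the toy example drops two of four columns in the same way.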
  • 26. Step 1: Collect data. Step 2: Preprocessing. Step 3: Extract features. Step 4: Dimension reduction. Step 5: Training:
      1. Split train-test data: Train: (45349, 681); Test: (5039, 681).
      2. Try algorithms: Random forest, SVM, Linear regression, Neural network (9 layers).
      [Figures: Random forest; Neural network]
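The train and test shapes on slide 26 are consistent with holding out 10% of the 50388 files, rounded up. The sketch below reproduces those sizes with a shuffled split; the 10% fraction is inferred from the reported shapes rather than stated in the talk, and the function name, seed, and placeholder data are assumptions.

```python
import math
import random


def split_train_test(rows, labels, test_fraction=0.1, seed=0):
    """Shuffle indices, then hold out ceil(n * test_fraction) samples for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [rows[i] for i in train_idx],
        [rows[i] for i in test_idx],
        [labels[i] for i in train_idx],
        [labels[i] for i in test_idx],
    )


# Placeholder data with the same number of samples as the cleaned feature matrix.
rows = list(range(50388))
labels = [i % 2 for i in rows]
x_train, x_test, y_train, y_test = split_train_test(rows, labels)
sizes = (len(x_train), len(x_test))
```

Fixing the seed keeps the split reproducible, so different algorithms are compared on the same held-out files.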
  • 27. Step 1: Collect data. Step 2: Preprocessing. Step 3: Extract features. Step 4: Dimension reduction. Step 5: Training. Step 6: Evaluate models:
  • 28. Step 1: Collect data. Step 2: Preprocessing. Step 3: Extract features. Step 4: Dimension reduction. Step 5: Training. Step 6: Evaluate models. Test set: ~5000 files (~2900 malware, ~2100 benign).
      Algorithm                          Precision          Recall
      Random Forest (Machine Learning)   98% (1996/2037)    97% (2037/2100)
      Deep learning (DNN 9 layers)       96% (1955/2037)    97% (2037/2100)
  • 30. 5. Conclusion: 1. Malware continues to grow in volume and evolve in complexity. 2. Traditional approaches are less effective at detecting new malware. 3. There is a lot of research using ML & DL to detect malware. 4. The industry is trying to apply it in real-world products.
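For reference, precision and recall follow the standard definitions below. The confusion counts in the first example are hypothetical and not taken from the slides; the second part only evaluates the fractions exactly as they are printed on slide 28 to show how they round to the reported percentages.

```python
def precision(true_positives, false_positives):
    """Share of positive predictions that are correct."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    """Share of actual positives that the model finds."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical confusion counts, for illustration only.
example_precision = precision(true_positives=90, false_positives=10)
example_recall = recall(true_positives=90, false_negatives=30)

# Fractions as printed on slide 28, rounded to whole percentages.
reported = {
    "Random Forest precision": (1996, 2037),
    "Random Forest recall": (2037, 2100),
    "DNN precision": (1955, 2037),
    "DNN recall": (2037, 2100),
}
rounded = {name: round(100 * num / den) for name, (num, den) in reported.items()}
```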