Submitted By:
Prashant Chopra
Tapesh Kumar
Shaweta Bhadwal
Sakshi Saini
Ujjalson Preet Singh
Payal Sharma
• With the rapid development of the Internet,
malware has become one of the major cyber
threats.
• Any software performing malicious actions,
including information stealing, espionage, etc.
can be referred to as malware. Kaspersky Labs
(2017) defines malware as “a type of computer
program designed to infect a legitimate user's
computer and inflict harm on it in multiple
ways.”
• Its capabilities extend beyond
compromising computers to destroy data or
make them useless: it can also steal
sensitive details such as credit card numbers and bank
account information and send them to the
malware's author without the user’s knowledge.
• Attackers exploit vulnerabilities in web services,
browsers and operating systems, or use social
engineering techniques to make users run the
• A security practitioner is not only
interested in how accurately a learning
system performs, but also needs to
understand how such performance is
achieved – a requirement not satisfied
by many “black-box” applications of
machine learning. In this section we
supplement our proposed methodology
and provide a procedure for explaining
classification results obtained using our
method.
• To develop a proof of concept for
machine-learning-based malware
classification built on Cuckoo
Sandbox.
• To determine the best feature
representation method and how the
features should be extracted, the most
accurate algorithm that can distinguish
• While the diversity of malware is increasing, anti-virus scanners cannot fulfill
the needs of protection, resulting in millions of hosts being attacked.
• According to Kaspersky Labs (2016), 6,563,145 different hosts were attacked,
and 4,000,000 unique malware objects were detected in 2015.
• There is a decrease in the skill level that is required for malware
development, due to the high availability of attacking tools on the Internet
nowadays.
• The high availability of anti-detection techniques, together with the ability to buy
malware on the black market, gives anyone the opportunity to become an attacker,
regardless of skill level.
1. To propose a framework for Malware Classification System (MCS) to analyse
malware behavior dynamically using a concept of information theory and a
machine learning technique.
2. To extract behavioral patterns from execution reports of malware in terms of its
features and generate a data repository.
3. To select the most promising features using information theory based concepts
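One common information-theoretic way to rank features is information gain, i.e. how much knowing a feature reduces the entropy of the class label. The slides do not say which measure is used, so the following is only a minimal sketch under that assumption; the feature names and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy after grouping samples by a feature's value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical toy data: 1 means the sample showed the behaviour, 0 means it did not.
labels = ["malware", "malware", "benign", "benign"]
features = {
    "writes_registry_run_key": [1, 1, 0, 0],  # separates the two classes perfectly
    "opens_file": [1, 0, 1, 0],               # carries no class information
}

# Rank feature names from most to least informative.
ranked = sorted(features, key=lambda name: information_gain(features[name], labels), reverse=True)
```

Features at the top of `ranked` would be kept; those with a gain near zero would be dropped.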
• Malware, or malicious software, is any
program or file that is harmful to a computer
user.
• These malicious programs can perform a
variety of functions, including stealing,
encrypting or deleting sensitive data, altering
or hijacking core computing functions and
monitoring users' computer activity without
their permission.
• Malware includes computer viruses, worms,
Trojan horses, spyware, etc.
• A. Viruses
• B. Worms
• C. Trojan horses
• D. Spyware
• E. Adware
• F. Backdoors
• G. Keyloggers
• H. Ransomware
• Effectively capture and represent
knowledge of the malware.
• The representation should enable
classifiers to efficiently and
effectively correlate data across
a large number of objects.
• Malicious software is classified into
families, each family originating
from a single source base and
exhibiting a set of consistent
behaviors.
• Malware analysis is the process of
identifying malware behaviour: what
it does, what it targets, and
what its main goals are.
• Malware analysis is a complex
process. Forensics,
reverse engineering, disassembly and
debugging all take a lot of
time.
• The goal of malware analysis is to gain
an understanding of how a malware
works, so that we can protect our
organization by preventing malware
attacks.
• Analysing malicious software without executing it is called static analysis.
• The detection patterns used in static analysis include:
1. String signature
2. Byte-sequence n-grams
3. Syntactic library call
4. Control flow graph
5. Opcode (operation code) frequency distribution, etc.
The executable has to be unpacked and decrypted before doing static
analysis.
• Tools for unpacking/decrypting:
1. Disassembler/debugger tools (IDA Pro and OllyDbg), which provide a lot
of insight into what the malware is doing and provide patterns to identify the
attackers.
2. Memory dumper tools (LordPE and OllyDump), used to obtain protected
code located in the system’s memory and dump it to a file.
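As an illustration of one of the static detection patterns listed above, byte-sequence n-grams, the sketch below counts overlapping n-byte sequences in a byte string. The sample bytes are hypothetical; a real pipeline would read an unpacked binary from disk.

```python
from collections import Counter

def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count every overlapping n-byte sequence in a byte string."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))

# Hypothetical sample bytes standing in for the contents of an unpacked executable.
sample = bytes([0x55, 0x8B, 0xEC, 0x55, 0x8B, 0xEC, 0x90])

counts = byte_ngrams(sample, n=2)
most_frequent = counts.most_common(2)  # the two most frequent 2-grams
```

The resulting counts can be used directly as features or compared against known signatures.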
• Dynamic malware analysis is the analysis of an infected file during its
execution. The infected file is run in a simulated environment, such as a
virtual machine, and malware researchers then use tools such as System
Analyzer and Process Explorer to identify its general behaviour. The file is
assessed only after it has actually been executed: during execution, its
interaction with the system, its behaviour and its effect on the system are
observed.
• The advantage of dynamic analysis is that it accurately analyses known as
well as unknown malware; however, this technique is more time consuming,
because the analysis environment, such as a virtual machine, has to be
prepared first.
STATIC ANALYSIS
1. Fast and safe
2. Good at analyzing multipath
malware (global view)
3. Can't analyze obfuscated and
polymorphic malware
4. Can't detect new, unknown malware
5. Low level of false positives (accuracy
is high)
DYNAMIC ANALYSIS
1. Time consuming and vulnerable
2. Difficult to analyze multipath
malware
3. Can analyze obfuscated and
polymorphic malware
4. Detects known as well as unknown
malware
5. High level of false positives (accuracy
is low)
• Binary Collection
Maltrieve Installation
• Dynamic Analysis
Cuckoo Sandbox Installation
• Analytics
Feature Extraction
• Classification
Machine Learning Algorithm
• Label
Labelling of Malware
• Final Result
Evaluation of Algorithm
• Maltrieve originated as a fork of mwcrawler. It retrieves malware directly from the
sources as listed at a number of sites. Currently we crawl the following:
• Malc0de
• Malware Domain List
• Malware URLs
• VX Vault
• URLquery
• CleanMX
• ZeusTracker
Malware binaries are collected via
honeypots and spam-traps, and
malware family labels are generated
by running an anti-virus tool on each
binary.
To assess behavioural patterns
shared by instances of the same
malware family, the behaviour of
each binary is monitored in a
sandbox environment and behavior-
based analysis reports summarizing
operations, such as opening an
outgoing IRC connection or stopping
• VirusTotal is a free online service that
analyses suspicious files and URLs and
facilitates the quick identification of
viruses, worms, Trojans and other kinds
of malicious content detected by
antivirus engines and website scanners.
• It may be used as a means to detect
false positives, i.e. innocuous
resources detected as malicious by
• Cuckoo is a malware sandboxing utility
that puts the dynamic analysis approach
into practice. Instead of being analyzed
statically, the binary file is
executed and monitored in real time.
• Cuckoo is an open source automated
malware analysis system that allows you
to perform analysis on sandboxed
malware.
• Cuckoo Sandbox started as a Google
Summer of Code project in 2010 within
the Honeynet Project. After the initial
work during the summer of 2010, the first
beta release was published on February
5th, 2011, when Cuckoo was publicly
announced and distributed for the first
time.
Cuckoo is designed for use in analyzing the following
kinds of files:
• Generic Windows executables
• DLL files
• PDF documents
• Microsoft Office documents
• URLs
• PHP scripts
• Almost everything else
• Traces of Win32 API calls performed
by all processes spawned by the
malware
• Files being created, deleted, and
downloaded by the malware during
its execution
• Memory dumps of the malware
processes
• Network traffic trace in PCAP format
• Screenshots of the Windows desktop
taken during the execution of
the malware
• Full memory dumps of the machines
• The process of extracting data
from the files is called feature
extraction.
• The goal of feature extraction is
to obtain a set of informative and
non-redundant data. It is
essential to understand that
features should represent the
important and relevant
information about our dataset
since without it we cannot make
an accurate prediction.
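To make feature extraction concrete, the sketch below turns the API call traces from a sandbox report into a fixed-length count vector. The report layout (`behavior`, `processes`, `calls`, `api`) is an assumption made for illustration and should be checked against the report your sandbox version actually produces; the report fragment and the vocabulary are hand-written, not real sandbox output.

```python
from collections import Counter

def api_call_counts(report: dict) -> Counter:
    """Count how often each API name appears across all monitored processes."""
    counts = Counter()
    for process in report.get("behavior", {}).get("processes", []):
        for call in process.get("calls", []):
            counts[call["api"]] += 1
    return counts

def to_feature_vector(counts: Counter, vocabulary: list) -> list:
    """Map counts onto a fixed, ordered vocabulary so every sample has the same length."""
    return [counts.get(name, 0) for name in vocabulary]

# Hypothetical report fragment; the key names are assumed, not taken from the slides.
report = {
    "behavior": {
        "processes": [
            {"calls": [{"api": "CreateFileW"}, {"api": "WriteFile"}, {"api": "WriteFile"}]},
            {"calls": [{"api": "RegSetValueExW"}]},
        ]
    }
}

vocabulary = ["CreateFileW", "WriteFile", "RegSetValueExW", "connect"]
vector = to_feature_vector(api_call_counts(report), vocabulary)
```

Using a fixed vocabulary keeps the vectors of different samples comparable, which most classifiers require.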
• An excessive number of raw features is available (e.g. in image classification
and spam detection).
• Learning algorithms are already well defined.
• No machine learning algorithm can perform well without feature
extraction, but if features are extracted well, even linear methods
show great results.
• Companies invest in feature extraction pipelines.
• Various machine learning approaches, such as Association Rules, Support Vector
Machines, Decision Trees, Random Forests, Naive Bayes and Clustering, have
been proposed for detecting unknown samples and either classifying them into
known malware families or highlighting samples that exhibit unseen
behavior for detailed analysis.
• The basic idea of any machine learning task is to train a model, based on
some algorithm, to perform a certain task: classification, clustering,
regression, etc.
[Figure: Machine learning workflow process. Data intake → data transformation → model training → model testing (using the test dataset) → model deployment.]
1. Data intake: The dataset is first loaded from the file and saved in
memory.
2. Data transformation: The data loaded in step 1 is transformed,
cleaned and normalized so that it lies in the same range, has the same
format, etc., and feature extraction and selection are performed. The data
is then separated into two sets: a ‘training set’ and a ‘test set’.
3. Model training: At this stage, a model is built using the selected
algorithm.
4. Model testing: The model built or trained in step 3 is tested
using the test data set, and the result is used for building a new
model that takes previous models into account, i.e. “learns” from them.
5. Model deployment: At this stage, the best model is selected (either after
a defined number of iterations or as soon as the required result is achieved).
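The data transformation step can be sketched as follows: min-max normalization brings every feature into the same range, and a reproducible shuffle separates the samples into a training set and a test set. This is a minimal illustration using only the standard library; the feature rows, labels and the 75/25 split ratio are hypothetical.

```python
import random

def min_max_normalize(rows):
    """Rescale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def train_test_split(rows, labels, test_ratio=0.25, seed=0):
    """Shuffle sample indices reproducibly and split them into training and test sets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_ratio))
    train, test = indices[:cut], indices[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

# Hypothetical feature rows (e.g. two API call counts per sample) and their labels.
rows = [[10, 200], [12, 180], [1, 20], [0, 40], [11, 190], [2, 30], [9, 210], [1, 25]]
labels = ["malware", "malware", "benign", "benign", "malware", "benign", "malware", "benign"]

normalized = min_max_normalize(rows)
x_train, y_train, x_test, y_test = train_test_split(normalized, labels)
```

The sketch normalizes before splitting, following the order given in the steps; a stricter setup would compute the scaling bounds on the training set only.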
• K-Nearest Neighbours (KNN) is one of the simplest yet still accurate
machine learning algorithms. KNN is a non-parametric algorithm, meaning
that it makes no assumptions about the underlying data distribution.
• In real-world problems, data rarely obeys general theoretical
assumptions, which makes non-parametric algorithms a good fit for such
problems.
• The KNN model representation is as simple as the dataset itself: no training
is required, and the entire training set is stored.
• KNN can be used for both classification and regression problems.
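Because KNN simply stores the training set, a classifier fits in a few lines. The sketch below predicts the majority label among the k nearest training points using Euclidean distance; the distance metric, the value of k and the toy data are assumptions, not choices stated in the slides.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points closest to the query."""
    by_distance = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    nearest = [label for _, label in by_distance[:k]]
    return Counter(nearest).most_common(1)[0][0]

# Hypothetical two-feature training set (already normalized to [0, 1]).
train_x = [[0.9, 0.8], [1.0, 0.9], [0.8, 1.0], [0.1, 0.2], [0.0, 0.1], [0.2, 0.0]]
train_y = ["malware", "malware", "malware", "benign", "benign", "benign"]

prediction = knn_predict(train_x, train_y, query=[0.85, 0.9], k=3)
```

Every prediction scans the whole training set, which is the price paid for having no training phase.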
• In Support Vector Machines (SVM), the term ‘support vectors’ refers to the
points lying closest to the hyperplane, which would change the hyperplane's
position if removed. The distance between the support vectors and the
hyperplane is referred to as the margin.
• The further from the hyperplane our classes lie, the more accurate the predictions
we can make. That is why, although multiple hyperplanes can be found per
problem, the goal of the SVM algorithm is to find the hyperplane that
results in the maximum margin.
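The notion of margin can be illustrated by measuring how far the closest point lies from a candidate hyperplane w·x + b = 0. The sketch below does not train an SVM; it only compares two hand-picked separating lines on hypothetical 2-D points to show why the one with the larger margin is preferred.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w·x + b = 0."""
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / math.sqrt(sum(wi * wi for wi in w))

def margin(w, b, points):
    """Smallest absolute distance from any point to the hyperplane."""
    return min(abs(signed_distance(w, b, p)) for p in points)

# Hypothetical points: the first two belong to one class, the last two to the other.
points = [[1.0, 1.0], [2.0, 2.0], [4.0, 4.0], [5.0, 5.0]]

candidate_a = ([1.0, 1.0], -6.0)  # the line x + y = 6, midway between the classes
candidate_b = ([1.0, 1.0], -5.0)  # the line x + y = 5, closer to the first class

margin_a = margin(*candidate_a, points)
margin_b = margin(*candidate_b, points)
```

Both lines separate the classes, but `candidate_a` leaves the larger margin; for that line the points `[2.0, 2.0]` and `[4.0, 4.0]` play the role of support vectors.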
• Users can protect themselves or take precautions only if there is a method that
lets them recognize malware when it enters their system.
• Despite all the anti-virus packages currently available, malware still finds its way
into our personal computers.
• Signature-based antivirus products can detect only malware that
has already caused damage and been registered.
• The reports generated by dynamic analysis can be compiled into behavioural
profiles, which can be clustered to group samples with similar behaviour into
coherent families.
• The machine learning technologies currently used for detecting and
classifying malware are not adequate to handle the challenges arising from the
huge amount of dynamic and severely imbalanced network data.
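The idea of clustering behavioural profiles into families can be sketched with a simple similarity measure. The example below represents each profile as a set of observed operations, compares profiles with Jaccard similarity, and greedily groups similar ones. The similarity measure, the 0.5 threshold, the greedy strategy and all sample data are assumptions chosen for illustration, not the method used in the slides.

```python
def jaccard(a: set, b: set) -> float:
    """Share of behaviours two profiles have in common (1.0 means identical)."""
    return len(a & b) / len(a | b) if a | b else 1.0

def cluster_profiles(profiles: dict, threshold: float = 0.5) -> list:
    """Greedy clustering: join the first cluster whose representative is similar enough."""
    clusters = []  # each cluster is a list of sample names; its first member is the representative
    for name, behaviours in profiles.items():
        for cluster in clusters:
            if jaccard(behaviours, profiles[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([name])
    return clusters

# Hypothetical behavioural profiles (sets of operations observed during execution).
profiles = {
    "sample_1": {"open_irc_connection", "write_run_key", "stop_service"},
    "sample_2": {"open_irc_connection", "write_run_key"},
    "sample_3": {"encrypt_files", "delete_shadow_copies"},
    "sample_4": {"encrypt_files", "delete_shadow_copies", "drop_ransom_note"},
}

families = cluster_profiles(profiles)
```

Samples that share most of their behaviours end up in the same group, which approximates a malware family.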

Malware Classification and Analysis

  • 1. Submitted By: Prashant Chopra Tapesh Kumar Shaweta Bhadwal Sakshi Saini Ujjalson Preet Singh Payal Sharma
  • 2. • With the rapid development of the Internet, malware has become one of the major cyber threats. • Any software that performs malicious actions, such as information theft or espionage, can be referred to as malware. Kaspersky Labs (2017) define malware as “a type of computer program designed to infect a legitimate user's computer and inflict harm on it in multiple ways.” • Its capabilities are not limited to compromising computers, destroying data, or rendering machines unusable; it can also steal sensitive details such as credit card numbers and bank account information and send them to the malware's author without the user’s knowledge. • Attackers exploit vulnerabilities in web services, browsers and operating systems, or use social engineering techniques to make users run the
  • 3. • A security practitioner is interested not only in how accurately a learning system performs, but also in how that performance is achieved – a requirement not satisfied by many “black-box” applications of machine learning. In this section we supplement our proposed methodology and provide a procedure for explaining classification results obtained using our method. • To develop a proof of concept for machine learning based malware classification built on Cuckoo Sandbox. • To determine the best feature representation method and how the features should be extracted, the most accurate algorithm that can distinguish
  • 4. • While the diversity of malware is increasing, anti-virus scanners cannot meet protection needs, resulting in millions of hosts being attacked. • According to Kaspersky Labs (2016), 6,563,145 different hosts were attacked, and 4,000,000 unique malware objects were detected in 2015. • The skill level required for malware development has decreased because attack tools are now widely available on the Internet. • The wide availability of anti-detection techniques, together with the ability to buy malware on the black market, gives anyone the opportunity to become an attacker, regardless of skill level.
  • 5. 1. To propose a framework for a Malware Classification System (MCS) that analyses malware behavior dynamically using concepts from information theory and a machine learning technique. 2. To extract behavioral patterns from malware execution reports in terms of their features and generate a data repository. 3. To select the most promising features using information theory based concepts
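The slides do not say which information-theoretic measure is used for feature selection. As a minimal sketch, assuming information gain (the reduction in label entropy after splitting on a feature), candidate features could be ranked as follows. The function names, feature names, and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: 1 = the sample made the API call, 0 = it did not.
labels = ["trojan", "trojan", "worm", "worm"]
features = {
    "CreateRemoteThread": [1, 1, 0, 0],  # separates the two families perfectly
    "ReadFile": [1, 0, 1, 0],            # tells us nothing about the family
}

# Rank features from most to least informative.
ranked = sorted(
    features,
    key=lambda name: information_gain(features[name], labels),
    reverse=True,
)
```

Features with a gain close to zero are candidates for removal before training.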
  • 6. • Malware, or malicious software, is any program or file that is harmful to a computer user. • These malicious programs can perform a variety of functions, including stealing, encrypting or deleting sensitive data, altering or hijacking core computing functions and monitoring users' computer activity without their permission. • Malware includes computer viruses, worms, Trojan horses, spyware, etc.
  • 7. • A. Viruses • B. Worms • C. Trojan Horses • D. Spyware • E. Adware • F. Backdoors • G. Keyloggers • H. Ransomware
  • 8. • The representation should effectively capture knowledge of the malware. • It should enable classifiers to correlate data efficiently and effectively across a large number of objects. • Malicious software is classified into families, each family originating from a single source base and exhibiting a set of consistent behaviors.
  • 9. • Malware analysis is the process of identifying malware behaviour: what the malware does, what it targets, and what its main goals are. • Malware analysis is a complex process. Its activities, such as forensics, reverse engineering, disassembly, and debugging, take a lot of time. • The goal of malware analysis is to understand how a piece of malware works so that we can protect our organization by preventing malware attacks.
  • 10. • Analysing malicious software without executing it is called static analysis. • The detection patterns used in static analysis include: 1. String signatures 2. Byte-sequence n-grams 3. Syntactic library calls 4. Control flow graphs 5. Opcode (operation code) frequency distributions, etc.
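To illustrate one of the static patterns listed above, the following minimal sketch counts overlapping byte-sequence n-grams in a binary blob. The function names and the sample bytes are hypothetical; a real pipeline would read the unpacked executable from disk instead.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count every overlapping n-byte sequence in a byte string."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def relative_frequencies(counts: Counter) -> dict:
    """Turn raw n-gram counts into a frequency distribution."""
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


# Hypothetical stand-in for the bytes of an unpacked executable.
sample = bytes.fromhex("90905589e5905589e5")
counts = byte_ngrams(sample, n=2)
freqs = relative_frequencies(counts)
```

The same counting idea applies to opcode frequency distributions once a disassembler has produced the opcode sequence.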
  • 11. The executable has to be unpacked and decrypted before static analysis can be performed. • Tools for unpacking/decrypting: 1. Disassembler/debugger tools: IDA Pro and OllyDbg, which provide a lot of insight into what the malware is doing and provide patterns to identify the attackers. 2. Memory dumper tools: LordPE and OllyDump, which are used to obtain protected code located in the system’s memory and dump it to a file.
  • 12. • Dynamic malware analysis is the analysis of an infected file during its execution. Infected files are analysed in a simulated environment, such as a virtual machine. Malware researchers then use tools such as System Analyzer and Process Explorer to identify the general behaviour of the file. The file is executed, and its interaction with the system, its behaviour, and its effect on the system are observed during execution. • The advantage of dynamic analysis is that it accurately analyses known as well as unknown malware; however, this technique is more time consuming, because it requires preparing an analysis environment such as a virtual machine.
  • 13. STATIC ANALYSIS: 1. Fast and safe 2. Good at analyzing multipath malware (global view) 3. Can't analyze obfuscated and polymorphic malware 4. Can't detect new, unknown malware 5. Low level of false positives (accuracy is high) • DYNAMIC ANALYSIS: 1. Time consuming and vulnerable 2. Difficult to analyze multipath malware 3. Can analyze obfuscated and polymorphic malware 4. Detects known as well as unknown malware 5. High level of false positives (accuracy is low)
  • 14. • Binary Collection: Maltrieve installation • Dynamic Analysis: Cuckoo Sandbox installation • Analytics: Feature extraction • Classification: Machine learning algorithm • Label: Labelling of malware • Final Result: Evaluation of algorithm
  • 15. • Maltrieve originated as a fork of mwcrawler. It retrieves malware directly from the sources as listed at a number of sites. Currently we crawl the following: • Malc0de • Malware Domain List • Malware URLs • VX Vault • URLquery • CleanMX • ZeusTracker
  • 16.
  • 17. Malware binaries are collected via honeypots and spam-traps, and malware family labels are generated by running an anti-virus tool on each binary. To assess behavioural patterns shared by instances of the same malware family, the behaviour of each binary is monitored in a sandbox environment and behaviour-based analysis reports summarizing operations, such as opening an outgoing IRC connection or stopping
  • 18. • VirusTotal is a free service that analyses suspicious files and URLs and facilitates the quick detection of viruses, worms, Trojans, and all kinds of malware. • VirusTotal is a free online service that analyses files and URLs enabling the identification of viruses, worms, Trojans and other kinds of malicious content detected by antivirus engines and website scanners. • It may be used as a means to detect false positives, i.e. innocuous resources detected as malicious by
  • 19. • Cuckoo is a malware sandboxing utility that puts the dynamic analysis approach into practice. Instead of being analysed statically, the binary file is executed and monitored in real time. • Cuckoo is an open source automated malware analysis system that allows you to perform analysis on sandboxed malware. • Cuckoo Sandbox started as a Google Summer of Code project in 2010 within the Honeynet Project. After the initial work during the summer of 2010, the first beta release was published on February 5th, 2011, when Cuckoo was publicly announced and distributed for the first time.
  • 20. Cuckoo is designed for use in analyzing the following kinds of files: • Generic Windows executables • DLL files • PDF documents • Microsoft Office documents • URLs • PHP scripts • Almost everything else
  • 21. Cuckoo can produce the following results: • Traces of Win32 API calls performed by all processes spawned by the malware • Files created, deleted, and downloaded by the malware during its execution • Memory dumps of the malware processes • Network traffic trace in PCAP format • Screenshots of the Windows desktop taken during the execution of the malware • Full memory dumps of the machines
  • 22. • The process of extracting data from the files is called feature extraction. • The goal of feature extraction is to obtain a set of informative and non-redundant data. It is essential to understand that features should represent the important and relevant information about our dataset since without it we cannot make an accurate prediction.
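As a hedged illustration of feature extraction from a sandbox report, the sketch below counts API calls in a report dictionary and maps them onto a fixed vocabulary. It assumes a layout of `behavior` → `processes` → `calls` → `api`, modelled on Cuckoo's JSON report; the exact field names should be checked against the Cuckoo version in use. The function names, the vocabulary, and the toy report are hypothetical.

```python
from collections import Counter


def api_call_counts(report: dict) -> Counter:
    """Count API calls across all processes in a sandbox report.

    Assumed layout: report["behavior"]["processes"][i]["calls"][j]["api"].
    """
    counts = Counter()
    for process in report.get("behavior", {}).get("processes", []):
        for call in process.get("calls", []):
            api = call.get("api")
            if api:
                counts[api] += 1
    return counts


def to_feature_vector(counts: Counter, vocabulary: list) -> list:
    """Map the counts onto a fixed, ordered list of API names."""
    return [counts.get(name, 0) for name in vocabulary]


# Hypothetical, heavily trimmed report used only for illustration.
report = {
    "behavior": {
        "processes": [
            {"calls": [{"api": "CreateFileW"}, {"api": "WriteFile"}]},
            {"calls": [{"api": "CreateFileW"}]},
        ]
    }
}
vocabulary = ["CreateFileW", "WriteFile", "RegSetValueExW"]
vector = to_feature_vector(api_call_counts(report), vocabulary)
```

In practice the report would be loaded with `json.load` from the analysis output, and the vocabulary would be built from all reports in the dataset.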
  • 23. • An excessive amount of raw features is available (image classification, spam detection). • Learning algorithms are already well defined. • No machine learning algorithm can perform well without feature extraction, but if features are extracted well, even linear methods show great results. • Companies invest in feature extraction pipelines.
  • 24.
  • 25. • Various machine learning approaches, such as Association Rules, Support Vector Machines, Decision Trees, Random Forests, Naive Bayes and Clustering, have been proposed for detecting unknown samples and either classifying them into known malware families or flagging samples that exhibit unseen behavior for detailed analysis. • The basic idea of any machine learning task is to train a model, based on some algorithm, to perform a certain task: classification, clustering, regression, etc.
  • 26. Machine Learning Workflow Process: Data intake → Data transformation → Model training → Model testing (using the test dataset) → Model deployment
  • 27. 1. Data intake: First, the dataset is loaded from a file and held in memory. 2. Data transformation: The data loaded in step 1 is transformed, cleaned, and normalized so that it lies in the same range, has the same format, etc., and feature extraction and selection are performed. The data is then separated into two sets: a ‘training set’ and a ‘test set’. 3. Model training: At this stage, a model is built using the selected algorithm.
  • 28. 4. Model testing: The model that was built or trained in step 3 is tested using the test dataset, and the result is used to build a new model that takes previous models into account, i.e. “learns” from them. 5. Model deployment: At this stage, the best model is selected (either after a defined number of iterations or as soon as the required result is achieved).
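The transformation, splitting, and testing steps above can be sketched in plain Python. This is only an illustration under stated assumptions: min-max scaling stands in for "normalization", a 75/25 split is chosen arbitrarily, and a majority-class baseline stands in for the trained model. All names and the toy data are hypothetical.

```python
import random
from collections import Counter


def min_max_normalize(rows):
    """Scale every feature column into the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def train_test_split(rows, labels, test_ratio=0.25, seed=0):
    """Shuffle the samples and separate them into training and test sets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_ratio))
    train, test = indices[:cut], indices[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def accuracy(predicted, actual):
    """Fraction of test samples whose predicted label is correct."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical toy dataset: two features per sample.
rows = [[3, 200], [5, 180], [4, 220], [40, 10],
        [38, 12], [45, 8], [6, 210], [42, 9]]
labels = ["benign", "benign", "benign", "malware",
          "malware", "malware", "benign", "malware"]

X_train, y_train, X_test, y_test = train_test_split(min_max_normalize(rows), labels)

# Placeholder "model": always predict the most common training label.
majority = Counter(y_train).most_common(1)[0][0]
score = accuracy([majority] * len(y_test), y_test)
```

Any real classifier can replace the majority-class placeholder; the surrounding steps stay the same.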
  • 29. • K-Nearest Neighbours (KNN) is one of the simplest yet still accurate machine learning algorithms. KNN is a non-parametric algorithm, meaning that it does not make any assumptions about the structure of the data. • In real-world problems, data rarely obeys general theoretical assumptions, which makes non-parametric algorithms a good solution for such problems. • The KNN model representation is as simple as the dataset itself: no learning is required, and the entire training set is stored. • KNN can be used for both classification and regression problems.
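A minimal sketch of KNN classification follows, assuming Euclidean distance and a simple majority vote among the k nearest stored samples. The function name, the choice of k, and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # No training step: the stored samples are simply sorted by distance.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical normalized feature vectors.
train_X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15],
           [0.9, 0.8], [0.8, 0.9], [0.85, 0.85]]
train_y = ["benign", "benign", "benign", "malware", "malware", "malware"]

prediction = knn_predict(train_X, train_y, [0.8, 0.8], k=3)
```

Because every prediction scans the whole training set, the cost grows with the number of stored samples, which is the price paid for skipping the training step.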
  • 30.
  • 31. • In Support Vector Machines (SVM), the term ‘support vectors’ refers to the points lying closest to the hyperplane, which would change the hyperplane's position if removed. The distance between the support vectors and the hyperplane is referred to as the margin. • The further from the hyperplane our classes lie, the more accurate the predictions we can make. That is why, although multiple separating hyperplanes can be found for a problem, the goal of the SVM algorithm is to find the hyperplane that results in the maximum margin.
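The margin idea can be made concrete with a small sketch. It does not train an SVM; it only compares two hand-picked candidate hyperplanes of the form w·x + b = 0 and measures the margin of each as the distance from the hyperplane to its closest point. The points and candidate hyperplanes are hypothetical.

```python
import math


def distance_to_hyperplane(point, w, b):
    """Distance from a point to the hyperplane w.x + b = 0."""
    return abs(sum(wi * xi for wi, xi in zip(w, point)) + b) / math.hypot(*w)


def margin(points, w, b):
    """Margin of a hyperplane: distance to the closest point in the data."""
    return min(distance_to_hyperplane(p, w, b) for p in points)


# Hypothetical 2-D samples: the first two belong to one class, the last two to the other.
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 4.0), (5.0, 4.0)]

# Two candidate separating hyperplanes, given as (w, b).
candidates = {
    "A": ((1.0, 1.0), -5.5),  # x + y - 5.5 = 0
    "B": ((1.0, 0.0), -3.0),  # x - 3 = 0
}

# An SVM prefers the separating hyperplane with the largest margin.
best = max(candidates, key=lambda name: margin(points, *candidates[name]))
```

Both candidates separate the two classes, but they differ in how close the nearest points lie; an actual SVM solver searches over all hyperplanes rather than a fixed list.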
  • 32.
  • 33. • Users can protect themselves or take precautions only if they have a way to recognize malware when it enters their system. • Despite all the anti-virus packages currently available, malware still finds its way into our personal computers. • Signature-based antivirus products can detect only malware that has already caused damage and has been registered. • The reports generated by dynamic analysis can be compiled into behavioural profiles, which can be clustered to group samples with similar behaviour into coherent families. • The machine learning technologies currently used to detect and classify malware are not adequate to handle the challenges arising from the huge amount of dynamic and severely imbalanced network data.