Malware Detection Using Machine Learning Techniques

Submitted To: Maam Tahira Mehboob
Presented By:
Anum Nisa
Sumaiya Arshad
MAY 18, 2016 | Machine Learning

ABOUT MALWARE & ITS DETECTION TECHNIQUES:
INTRODUCTION:

Malware is …
Malicious software
Virus, Spam, …
Increasing threats
*Continuous and increased attacks on infrastructure
*Threats to business, national security & personal
security of PCs
Attacks are becoming more advanced and
sophisticated!

MALWARE Executables
Host vs Network based approaches
Limitations of existing techniques
-Signature-based approach
* Fails to detect zero-day attacks.
* Fails to detect threats with evolving capabilities
such as metamorphic and polymorphic malware.
-Anomaly-based approach
*Produces a high false positive rate.
-Supervised learning based approach
*Poor performance on new and evolving malware
*Building a classifier model is challenging due to
the diversity of malware classes, imbalanced
distribution, data imperfection issues, etc.
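
The signature-based limitation above can be illustrated with a minimal sketch. It assumes a signature is simply a SHA-256 hash of the file contents (real products use richer signatures), and the sample bytes are harmless placeholders invented for this example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical "known malware" sample: harmless placeholder bytes.
known_sample = b"\x90\x90PAYLOAD-v1\xc3"

# Assumed signature database: hashes of previously seen samples.
signature_db = {sha256_hex(known_sample)}


def is_known_malware(data: bytes) -> bool:
    """Signature check: exact hash lookup in the database."""
    return sha256_hex(data) in signature_db


# A variant that differs by a single byte, standing in for a
# polymorphic or metamorphic rewrite of the same program.
variant = known_sample.replace(b"v1", b"v2")

print(is_known_malware(known_sample))  # True
print(is_known_malware(variant))       # False
```

The variant behaves like the original in this toy setting, yet the lookup misses it because its hash is different. The same reasoning applies to zero-day samples that were never added to the database.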

Red Hocks (Viruses)

Our Goal
Machine Learning based approach
-Two level:
*Supervised learning approach to detect malicious
flows and then identify the specific malware type
*Combine unsupervised learning with supervised
learning to address the new class discovery problem

Two level malware detection framework:
Macro-level classifier
Used to isolate malicious flows from the
non-malicious ones.
Micro-level classifier
Further categorizes the malicious flows into
one of the pre-existing malware classes or a
new malware class.
Proposed Framework
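
A minimal sketch of the two-level idea follows. It is not the framework's actual implementation: nearest-centroid classifiers stand in for the real macro and micro classifiers, a simple distance threshold stands in for the unsupervised new-class discovery step, and the two flow features, the family names, and `new_class_radius` are all hypothetical.

```python
import math


def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def nearest(x, centroids):
    """Return (label, distance) of the centroid closest to x."""
    return min(((label, math.dist(x, c)) for label, c in centroids.items()),
               key=lambda pair: pair[1])


class TwoLevelDetector:
    def __init__(self, new_class_radius):
        # Assumed threshold: flows farther than this from every known
        # malware family are reported as a new malware class.
        self.new_class_radius = new_class_radius
        self.macro = {}
        self.micro = {}

    def fit(self, flows, labels):
        # labels are "benign" or a malware family name.
        benign = [f for f, y in zip(flows, labels) if y == "benign"]
        malicious = [f for f, y in zip(flows, labels) if y != "benign"]
        # Macro level: malicious vs. non-malicious.
        self.macro = {"benign": centroid(benign),
                      "malicious": centroid(malicious)}
        # Micro level: one centroid per known malware family.
        families = sorted({y for y in labels if y != "benign"})
        self.micro = {
            fam: centroid([f for f, y in zip(flows, labels) if y == fam])
            for fam in families
        }
        return self

    def predict(self, flow):
        verdict, _ = nearest(flow, self.macro)
        if verdict == "benign":
            return "benign"
        family, distance = nearest(flow, self.micro)
        return family if distance <= self.new_class_radius else "new-malware"


# Toy flows with two made-up numeric features.
flows = [[1, 1], [2, 1], [1, 2],      # benign
         [8, 8], [9, 8], [8, 9],      # hypothetical "worm" family
         [8, 1], [9, 2], [9, 1]]      # hypothetical "trojan" family
labels = ["benign"] * 3 + ["worm"] * 3 + ["trojan"] * 3

detector = TwoLevelDetector(new_class_radius=2.0).fit(flows, labels)
for flow in ([1, 1], [8.5, 8.5], [9, 1.5], [14, 5]):
    print(flow, detector.predict(flow))
```

The last flow is judged malicious at the macro level but lies far from both known families, so the micro level reports it as a new class instead of forcing it into an existing one.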

Proposed Framework Block diagram

Classification Process
Machine learning, data mining, and text classification
methods used to detect malicious executables include:
Classifies executables as unknown or malicious using
ML algorithms
Random Forest classifier
Boosted J48 decision tree
KNN, Naive Bayes, SVM, Multilayer
Perceptron (MLP)
Mal-ID basic detection algorithm
Both the Bayes network and random forest
classifiers produced relatively accurate results,
but the boosted decision tree (J48) was the best classifier.

Experimental Evaluation
Our analysis shows that among the three major forms of
malware (computer viruses, Internet worms,
and Trojan horses), Trojans are the most dangerous.

ANALYSIS

This section introduces analysis techniques for mobile
and PC malware. It transfers well-known techniques
from the common computer world to the platforms of
mobile devices.
The main idea of dynamic analysis is to execute a given
sample in a controlled environment, monitor its behavior,
and obtain information about its nature and purpose.
This is especially important in the field of malware research
because a malware analyst must be able to assess a program’s
threat and create proper countermeasures.
While static analysis might provide more precise results, the
sheer mass of newly emerging malware each day makes it
impossible to conduct a static analysis for even a small
portion of today’s malware.

ANALYSIS OF PARAMETERS:
To analyze malware detection techniques, some
evaluation parameters are used to assess quality
factors (non-functional requirements):
Category/Type of Virus
Detection Techniques
Algorithm/ Technology/ Mechanism
Best Classification methodology
Evaluation criterion
Implementation Tools
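
As one possible reading of the "Evaluation criterion" parameter, the sketch below computes common detection metrics from predicted and actual labels. The slides do not say which metrics were used, so the metric choice, the function name `evaluate`, and the toy labels are assumptions.

```python
def evaluate(actual, predicted, positive="malware"):
    """Compute basic detection metrics for a binary malware/benign task."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)

    def ratio(num, den):
        # Avoid division by zero when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "detection_rate": ratio(tp, tp + fn),        # also called recall
        "false_positive_rate": ratio(fp, fp + tn),
        "precision": ratio(tp, tp + fp),
    }


# Toy labels invented for illustration only.
actual = ["malware"] * 3 + ["benign"] * 5
predicted = ["malware", "malware", "benign",
             "malware", "benign", "benign", "benign", "benign"]

metrics = evaluate(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The false positive rate is reported separately from accuracy because an anomaly-based detector can score well on accuracy while still flagging many benign files.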

J48 is an extension of ID3.
The additional features of J48 are:
accounting for missing values,
decision tree pruning,
continuous attribute value ranges,
derivation of rules, etc.
In the WEKA data mining tool, J48 is an
open source Java implementation of the
C4.5 algorithm.
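
C4.5 selects split attributes by gain ratio (information gain divided by split information). The sketch below shows that calculation for a categorical attribute only; it is not WEKA's J48 code, and the attribute names `packed` and `signed` and the toy labels are hypothetical.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data set of four executables (made up for illustration).
labels = ["malware", "malware", "benign", "benign"]
packed = ["yes", "yes", "no", "no"]    # separates the classes perfectly
signed = ["yes", "no", "yes", "no"]    # carries no class information

print("packed:", gain_ratio(packed, labels))
print("signed:", gain_ratio(signed, labels))
```

A tree builder in this style would split on `packed` first, since it has the higher gain ratio. Handling continuous attributes, missing values, and pruning is left out of the sketch.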
Boosted J48 Decision Tree
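
Boosting combines several weak classifiers into a weighted vote. The sketch below is a plain AdaBoost loop, assuming one-level decision stumps as the base learner in place of full J48 trees; the single numeric feature and the labels (+1 for malware, -1 for benign) are invented for illustration.

```python
import math


def stump_predict(stump, row):
    """A stump predicts `polarity` when the feature is <= threshold."""
    feature, threshold, polarity = stump
    return polarity if row[feature] <= threshold else -polarity


def best_stump(X, y, w):
    """Return (weighted_error, stump) with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(stump, row) != yi)
                if best is None or err < best[0]:
                    best = (err, stump)
    return best


def adaboost(X, y, rounds):
    """Train `rounds` stumps; return a list of (alpha, stump) pairs."""
    n = len(X)
    w = [1 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, stump = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold can separate.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]

single = adaboost(X, y, rounds=1)
boosted = adaboost(X, y, rounds=3)
print("1 round :", [predict(single, row) for row in X])
print("3 rounds:", [predict(boosted, row) for row in X])
```

On this toy set a single stump misclassifies some samples, while three boosted stumps fit the training data. This only illustrates the mechanism and is not evidence for the accuracy results reported in the slides.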


Conclusion:
We proposed an effective malware detection framework
based on data mining & machine learning techniques:
 Two level ML based classifier
 New class detection
 Encrypted data
A tree based kernel for SVM was proposed to handle the
data imperfection issue in network flow data.
The boosted J48 decision tree classifier was found to be the
best classifier among a number of different classifiers.

Conclusion Contd:
This paper compares the efficiency rates
of different malware detection techniques,
including KNN, Naive Bayes, boosted J48, and SVM
(Support Vector Machine).
We explain the feasibility of some detection methods
and highlight the major causes of the increasing number of
malware files, but more research is necessary.

Future Works
Develop a hierarchical multi-class learning
method to enhance the testing efficiency when
the number of malware classes becomes
extremely large.
Malware detection accuracy can be
improved through further research into
classification algorithms and ways to label
malware data more accurately.
Most of the classifiers used are not
optimized for hardware operations or
applications. Hardware-oriented algorithm
design could increase both accuracy and
efficiency.

Extra
 Metamorphic malware is rewritten with each iteration so
that each succeeding version of the code is different from
the preceding one. The code changes make it difficult for
signature-based antivirus software programs to recognize
that different iterations are the same malicious program.
 Polymorphic malware also makes changes to code to avoid
detection. It has two parts, but one part remains the same
with each iteration, which makes the malware a little easier
to identify.
 Can you imagine that a piece of malware code can change its
shape and signature each time it appears, making it
extremely hard for signature-based antivirus to detect?
This is called polymorphic or metamorphic malware.

 Trojans can be employed by cyber-thieves and
hackers trying to gain access to users' systems. Users are
typically tricked by some form of social engineering into
loading and executing Trojans on their systems. Once
activated, Trojans can enable cyber-criminals to spy on you,
steal your sensitive data, and gain backdoor access to your
system. These actions can include:
 Deleting data
 Blocking data
 Modifying data
 Copying data
 Disrupting the performance of computers or computer networks
