Malware can be detected using machine learning techniques such as the k-means algorithm, the KNN algorithm, boosted J48 decision trees, and other data mining techniques. Among them, J48 proved the most effective at detecting computer viruses and emerging network worms...
Malware Detection Using Machine Learning Techniques
1. Submitted To: Maam Tahira Mehboob
Presented By:
Anum Nisa
Sumaiya Arshad
MAY 18, 2016 | Machine Learning
2. ABOUT MALWARE & ITS
DETECTION TECHNIQUES:
INTRODUCTION:
3. ABOUT MALWARE & ITS DETECTION
TECHNIQUES:
Malware is …
Malicious software
Virus, Spam, …
Increasing threats
*Continuous and increased attacks on infrastructure
*Threats to business, national security & personal
security of PCs
Attacks are becoming more advanced and
sophisticated!
4. MALWARE Executables
Host vs Network based approaches
Limitation of existing techniques
-Signature-based approach
* Fails to detect zero-day attacks.
* Fails to detect threats with evolving capabilities
such as metamorphic and polymorphic malware.
-Anomaly-based approach
*Produces a high false positive rate.
-Supervised Learning based approach
*Poor performance on new and evolving malware
*Building classifier model is challenging due to
diversity of malware classes, imbalanced
distribution, data imperfection issues, etc.
6. Our Goal
Machine Learning based approach
-Two level:
*Supervised learning approach to detect malicious
flows and further identify specific type
*Combine unsupervised learning with supervised
learning to address new class discovery problem
7. Two level malware detection framework:
Macro-level classifier
Used to isolate malicious flows from the
non-malicious ones.
Micro-level classifier
Further categorizes the malicious flows into
one of the preexisting malware classes or as new
malware
Proposed Framework
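The two-level idea can be sketched roughly as follows. This is only an illustration, not the framework from the slides: it swaps in a nearest-centroid rule for both levels and uses a distance threshold as a stand-in for new-class discovery. The class name `TwoLevelDetector`, the parameter `new_class_radius`, the two numeric flow features, and the family labels are all assumptions made for the example.

```python
import math

def centroid(rows):
    """Mean feature vector of a list of equally long rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class TwoLevelDetector:
    """Hypothetical sketch: macro level separates benign from malicious,
    micro level names a known family or reports new malware."""

    def __init__(self, new_class_radius):
        # Assumed threshold: malicious flows farther than this from every
        # known family centroid are reported as new malware.
        self.new_class_radius = new_class_radius
        self.benign_centroid = None
        self.family_centroids = {}

    def fit(self, flows, labels):
        benign = [f for f, l in zip(flows, labels) if l == "benign"]
        self.benign_centroid = centroid(benign)
        families = {}
        for f, l in zip(flows, labels):
            if l != "benign":
                families.setdefault(l, []).append(f)
        self.family_centroids = {n: centroid(r) for n, r in families.items()}
        return self

    def predict(self, flow):
        # Distance to the closest known malware family.
        name, d = min(
            ((n, dist(flow, c)) for n, c in self.family_centroids.items()),
            key=lambda pair: pair[1],
        )
        # Macro level: benign if the benign centroid is at least as close.
        if dist(flow, self.benign_centroid) <= d:
            return "benign"
        # Micro level: known family, or new malware if too far from all.
        return name if d <= self.new_class_radius else "new malware"

# Toy flows with two made-up numeric features.
flows = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9], [8, 1], [9, 1], [9, 2]]
labels = ["benign"] * 3 + ["worm"] * 3 + ["trojan"] * 3
detector = TwoLevelDetector(new_class_radius=2.0).fit(flows, labels)
```

In a real system each level would be a trained classifier, and new-class discovery would rely on an unsupervised method rather than a fixed radius.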
9. Classification Process
Machine learning, data mining, and text classification &
detection methods used to detect malicious executables
include:
Classifies Unknown or Malicious using
ML algorithms
Random Forest Classifier
Boosted J48 decision tree
KNN, Naïve Bayes, SVM, Multilayer
Perceptron (MLP)
Mal-ID Basic Detection Algorithm
Both the Bayes network and random forest
classifiers produced more accurate results,
but the boosted decision tree (J48) was the best classifier.
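To give a feel for what boosting adds to a tree learner, here is a minimal sketch. It assumes AdaBoost as the boosting scheme (the slides do not say which one was used) and replaces full J48 trees with one-level decision stumps to keep the example short. The one-dimensional feature and the labels (+1 for malicious, -1 for benign) are made up.

```python
import math

def train_stump(X, y, w):
    """Return (error, feature, threshold, polarity) of the stump with the
    lowest weighted error; the first one found wins ties."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] >= thr else -polarity for row in X]
                err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] >= thr else -polarity

def adaboost(X, y, rounds):
    """Train `rounds` stumps, reweighting misclassified samples each round."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def boosted_predict(ensemble, row):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy data: the benign samples sit in the middle of the feature range,
# so no single threshold separates the two classes.
X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]
single = adaboost(X, y, rounds=1)
boosted = adaboost(X, y, rounds=3)
```

On this toy set one stump misclassifies two samples, while three boosted stumps classify every training sample correctly. It says nothing about how boosted J48 compares with the other classifiers on real malware data.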
10. Experimental Evaluation
Our analysis shows that among the three major forms of
malware (computer viruses, Internet worms,
and Trojan horses), the most dangerous is the Trojan horse.
12. ANALYSIS
This section introduces analysis techniques for mobile
and PC malware. It transfers well-known techniques
from the common computer world to the platforms of
mobile devices.
The main idea of dynamic analysis is executing a given
sample in a controlled environment, monitoring its behavior,
and obtaining information about its nature and purpose.
This is especially important in the field of malware research
because a malware analyst must be able to assess a program’s
threat and create proper counter-measures.
While static analysis might provide more precise results, the
sheer mass of newly emerging malware each day makes it
impossible to conduct a static analysis for even a small
portion of today’s malware.
13. ANALYSIS OF PARAMETERS:
To analyze malware detection techniques,
some evaluation parameters are used to assess
quality factors (Non-Functional Requirements):
Category/Type of Virus
Detection Techniques
Algorithm/ Technology/ Mechanism
Best Classification methodology
Evaluation criterion
Implementation Tools
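The slides do not say which evaluation criteria were used. As one plausible reading, the sketch below computes three common ones from true and predicted labels: accuracy, detection rate (true positive rate), and false positive rate. The function name `detection_metrics` and the label strings are assumptions for illustration.

```python
def detection_metrics(y_true, y_pred, positive="malware"):
    """Compute accuracy, detection rate, and false positive rate,
    treating `positive` as the malicious class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of actual malware that was flagged.
        "detection_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of benign samples that were wrongly flagged.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }

# Made-up labels: 3 malware samples and 5 benign ones.
y_true = ["malware"] * 3 + ["benign"] * 5
y_pred = ["malware", "malware", "benign", "malware",
          "benign", "benign", "benign", "benign"]
metrics = detection_metrics(y_true, y_pred)
```

Reporting the false positive rate next to accuracy matters because malware datasets are often imbalanced, so accuracy alone can look good while many benign files are flagged.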
14.
15. J48 is an extension of ID3.
The additional features of J48 are:
accounting for missing values,
decision trees pruning,
continuous attribute value ranges,
derivation of rules, etc.
In the WEKA data mining tool, J48 is an
open source Java implementation of the
C4.5 algorithm.
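C4.5 chooses split attributes by gain ratio, that is, information gain divided by the split information of the attribute. The sketch below shows that computation for a single categorical attribute. It is not WEKA's J48 code, and the attribute values and labels are made up.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(items).values())

def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` on a categorical attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0

labels = ["malware", "malware", "benign", "benign"]
# Hypothetical attribute that separates the classes perfectly.
informative = gain_ratio(["yes", "yes", "no", "no"], labels)
# Hypothetical attribute that carries no information about the class.
uninformative = gain_ratio(["a", "b", "a", "b"], labels)
```

Here the first attribute gets a gain ratio of 1 and the second a gain ratio of 0, so a C4.5-style learner would split on the first.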
Boosted J48 Decision Tree
16. Boosted J48 Decision Tree
17. Conclusion:
We proposed an effective malware detection framework
based on data mining & machine learning techniques:
Two level ML based classifier
New class detection
Encrypted data
A tree-based kernel for SVM was proposed to handle the
data imperfection issue in network flow data
The boosted J48 decision tree classifier was found to be the
best among a number of different classifiers
18. Conclusion Contd:
However, this paper compares the efficiency
of different malware detection techniques,
including KNN, Naïve Bayes, boosted J48, and SVM
(Support Vector Machine).
We explain the feasibility of some detection methods
and highlight the major causes of the increasing number of
malware files, but more research is necessary.
20. Future Works
Develop a hierarchical multi-class learning
method to enhance the testing efficiency when
the number of malware classes becomes
extremely large.
Malware detection accuracy can be
improved through further research into
classification algorithms and ways to label
malware data more accurately.
Most of the classifiers used are not
optimized for hardware operations or
applications. Additionally, hardware algorithm
design can increase precision or accuracy and
efficiency.
23. Extra
Metamorphic malware is rewritten with each iteration so
that each succeeding version of the code is different from
the preceding one. The code changes make it difficult for
signature-based antivirus software programs to recognize
that different iterations are the same malicious program.
Polymorphic malware also makes changes to code to avoid
detection. It has two parts, but one part remains the same
with each iteration, which makes the malware a little easier
to identify.
Can you imagine that a piece of malware code can change its
shape and signature each time it appears, making it
extremely hard for signature-based antivirus to detect?
This is called polymorphic or metamorphic malware.
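A small sketch shows why signature matching struggles with code that changes between iterations. It simplifies heavily: a whole-sample SHA-256 digest stands in for an antivirus signature (real products typically match byte patterns, not only whole-file hashes), and the byte strings are harmless placeholders, not real malware.

```python
import hashlib

def signature(payload: bytes) -> str:
    """SHA-256 digest, used here as a stand-in for an antivirus signature."""
    return hashlib.sha256(payload).hexdigest()

# Placeholder bytes standing in for a known malicious sample.
original = b"\x90\x90PAYLOAD"
# Same placeholder body with one extra padding byte, imitating a
# variant that was rewritten between iterations.
variant = b"\x90\x90\x90PAYLOAD"

# The signature database only knows the original sample.
known_signatures = {signature(original)}

def is_known(payload: bytes) -> bool:
    """Return True if the payload matches a stored signature exactly."""
    return signature(payload) in known_signatures
```

`is_known(original)` is true, while `is_known(variant)` is false even though the two samples differ by a single byte, which is the gap that behavior-based and learning-based detection try to close.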
24. software. Trojans can be employed by cyber-thieves and
hackers trying to gain access to users' systems. Users are
typically tricked by some form of social engineering into
loading and executing Trojans on their systems. Once
activated, Trojans can enable cyber-criminals to spy on you,
steal your sensitive data, and gain backdoor access to your
system. These actions can include:
Deleting data
Blocking data
Modifying data
Copying data
Disrupting the performance of computers or computer networks