With the emergence of smart devices and the Internet of Things (IoT), millions of users connected to the network produce massive network traffic datasets. Such Big Data volumes of network traffic are challenging to store, process, and analyse on a single computer. In this paper we develop a parallel implementation, on a High Performance Computer (HPC), of the Non-Negative Matrix Factorization technique as the engine of an Intrusion Detection System (HPC-NMF-IDS). The large IoT traffic datasets, on the order of millions of samples, are distributed evenly across the computing cores for both storage and speedup. The distribution of the computing tasks involved in the matrix factorization is designed to reduce the communication cost between the computing cores. The experiments we conducted on the proposed HPC-NMF-IDS give better results than traditional ML-based intrusion detection systems. We could train the HPC model on a dataset of one million samples in only 31 seconds instead of 40 minutes on one processor, a speedup of 87 times. Moreover, we obtained an excellent detection accuracy rate of 98% on the KDD dataset.
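The abstract above uses NMF as its detection engine. As a point of reference, a minimal single-core sketch of the standard Lee-Seung multiplicative-update factorization (not the paper's parallel HPC implementation, whose task distribution is not reproduced here) might look like this:

```python
import random

def nmf(V, k, iters=500, eps=1e-9):
    """Factor a non-negative matrix V (m x n) as W (m x k) times H (k x n)
    using Lee & Seung multiplicative updates."""
    m, n = len(V), len(V[0])
    rnd = random.Random(0)
    W = [[rnd.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rnd.random() + 0.1 for _ in range(n)] for _ in range(k)]

    def matmul(A, B):
        return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
                 for j in range(len(B[0]))] for i in range(len(A))]

    def transpose(A):
        return [list(row) for row in zip(*A)]

    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise
        WT = transpose(W)
        num = matmul(WT, V)
        den = matmul(matmul(WT, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(n)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        HT = transpose(H)
        num = matmul(V, HT)
        den = matmul(W, matmul(H, HT))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(k)] for i in range(m)]
    return W, H

def recon_error(V, W, H):
    """Squared Frobenius error of the approximation V ~ W H."""
    return sum((V[i][j] - sum(W[i][t] * H[t][j] for t in range(len(H)))) ** 2
               for i in range(len(V)) for j in range(len(V[0])))
```

In an anomaly-detection setting, traffic samples whose rows reconstruct poorly under the learned factors are the candidates for flagging; the multiplicative updates keep W and H non-negative by construction.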
COPYRIGHT
This thesis is copyright material protected under the Berne Convention, the Copyright Act 1999 and other international and national enactments in that behalf on intellectual property. It may not be reproduced by any means, in full or in part, except for short extracts in fair dealing for research or private study, critical scholarly review or discourse with acknowledgment, or with the written permission of the Dean, School of Graduate Studies, on behalf of both the author and XXX XXX University.

ABSTRACT
With the fast-growing internet, the risk of intrusion has also increased, and as a result the Intrusion Detection System (IDS) has become a prominent research field. An IDS is used to identify any suspicious activity or pattern in the network or machine that subverts the security features or compromises the machine. IDSs typically use all the features of the data, yet it is a keen observation that not all features are of equal relevance for the detection of attacks; moreover, not every feature contributes significantly to system performance. The main aim of this work is to develop an efficient denial-of-service (DoS) network intrusion classification model. The specific objectives included: to analyse the existing literature on intrusion detection systems (the techniques used to model IDSs, the types of network attacks, the performance of various machine learning tools, and how network intrusion detection systems are assessed); to find the top network traffic attributes that can be used to model denial-of-service intrusion detection; and to develop a machine learning model for detecting denial-of-service network intrusions. Methods: The research design was experimental, and data were collected by simulation using the NSL-KDD dataset. By implementing the Correlation Feature Selection (CFS) mechanism with three search algorithms, the smallest set of features containing all of the most frequently selected features was obtained. Findings: The smallest subset of features chosen is the most minimal among all the feature subsets found. Further, the performance of Artificial Neural Network (ANN), decision tree, Support Vector Machine (SVM), and K-Nearest Neighbour (KNN) classifiers was compared over seven subsets found by the filter model and over all 41 attributes. Results: The outcome indicates a remarkable improvement in the performance metrics used for comparison of the classifiers.
The results show that using the 17/18 selected features improves DoS classification accuracy compared to using all 41 features in the NSL-KDD dataset. It was further observed that an ensemble of three classifiers with decision fusion performs better than any single classifier for DoS classification. Among the machine learning tools tested, ANN achieved the best classification accuracy, followed by SVM and DT; KNN registered the lowest. Application: the proposed work offers an improved detection rate and a shorter classification time.
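The CFS mechanism this abstract relies on scores a candidate feature subset by its "merit": high mean feature-class correlation, penalized by feature-feature redundancy. A pure-Python sketch of that score follows; the use of Pearson correlation is an assumption, since the abstract does not name the correlation measure:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def cfs_merit(features, labels):
    """CFS merit of a subset of k features:
    k * r_cf / sqrt(k + k*(k-1) * r_ff), where r_cf is the mean
    feature-class correlation and r_ff the mean pairwise
    feature-feature correlation."""
    k = len(features)
    r_cf = sum(abs(pearson(f, labels)) for f in features) / k
    if k == 1:
        return r_cf
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    r_ff = sum(abs(pearson(features[i], features[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)
```

A search algorithm (the abstract mentions three) would then explore subsets and keep the one with the highest merit; a class-correlated feature scores far above a noise feature, and adding a redundant or irrelevant feature lowers the merit.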
COMBINING NAIVE BAYES AND DECISION TREE FOR ADAPTIVE INTRUSION DETECTION (IJNSA Journal)
In this paper, a new learning algorithm for adaptive network intrusion detection using a naive Bayesian classifier and a decision tree is presented. It performs balanced detection, keeps false positives at an acceptable level for different types of network attacks, and eliminates redundant attributes as well as contradictory examples from the training data that make the detection model complex. The proposed algorithm also addresses some difficulties of data mining, such as handling continuous attributes, dealing with missing attribute values, and reducing noise in training data. Due to the large volumes of security audit data as well as the complex and dynamic properties of intrusion behaviours, several data mining-based intrusion detection techniques have been applied to network-based traffic data and host-based data over the last decades; however, various issues in current intrusion detection systems (IDS) still need to be examined. We tested the performance of our proposed algorithm against existing learning algorithms on the KDD99 benchmark intrusion detection dataset. The experimental results show that the proposed algorithm achieves high detection rates (DR) and significantly reduces false positives (FP) for different types of network intrusions using limited computational resources.
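The naive-Bayes half of the hybrid above can be sketched compactly; what follows is a categorical naive Bayes with Laplace smoothing, a standard construction rather than this paper's exact algorithm, and it omits the paper's decision-tree step that prunes redundant attributes:

```python
import math
from collections import defaultdict

def train_nb(rows, labels):
    """Train a categorical naive Bayes: class priors P(c) and smoothed
    per-attribute likelihoods P(attr=value | c)."""
    classes = set(labels)
    prior = {c: labels.count(c) / len(labels) for c in classes}
    counts = defaultdict(lambda: defaultdict(int))  # (class, attr) -> value -> n
    values = defaultdict(set)                       # attr -> observed values
    for row, c in zip(rows, labels):
        for a, v in enumerate(row):
            counts[(c, a)][v] += 1
            values[a].add(v)
    return prior, counts, values, classes

def predict_nb(model, row):
    """Pick the class maximizing log P(c) + sum_a log P(row[a] | c)."""
    prior, counts, values, classes = model
    best, best_lp = None, -math.inf
    for c in classes:
        lp = math.log(prior[c])
        for a, v in enumerate(row):
            n_cv = counts[(c, a)][v] + 1                       # add-one smoothing
            n_c = sum(counts[(c, a)].values()) + len(values[a])
            lp += math.log(n_cv / n_c)
        if lp > best_lp:
            best, best_lp = c, lp
    return best
```

Trained on a handful of (protocol, service) connection records, it assigns the majority class of matching evidence, which is the probabilistic core the paper's tree-based attribute elimination would then refine.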
Machine learning-based intrusion detection system for detecting web attacks (IAES IJAI)
The increasing use of smart devices results in a huge amount of data, which raises concerns about personal data, including health and financial data. This data circulates on the network and can be exposed to malicious network traffic at any time. This traffic can either be normal traffic or an intrusion created by hackers aiming to inject abnormal traffic into the network. Firewalls and traditional intrusion detection systems detect attacks based on signature patterns; however, this is not sufficient to detect advanced or unknown attacks, for which intelligent techniques are essential. In this paper, we analyse several machine learning techniques proposed in recent years. In this study, several classifiers were built to detect anomalous behaviour in network traffic. The models were built and evaluated on the Canadian Institute for Cybersecurity intrusion detection systems dataset released in 2017 (CIC-IDS-2017), which includes both current and historical attacks. The experiments were conducted using decision tree, random forest, logistic regression, Gaussian naïve Bayes, adaptive boosting, and their ensemble. The models were evaluated using metrics such as accuracy, precision, recall, F1-score, false positive rate, receiver operating characteristic curve, and calibration curve.
A novel signature based traffic classification engine to reduce false alarms ... (IJCNC Journal)
Pattern matching plays a significant role in ascertaining network attacks, and the foremost prerequisite for a trusted intrusion detection system (IDS) is accurate pattern matching. During the pattern matching process, packets are scanned against pre-defined rule sets and then marked as alert or benign by the detection system. Sometimes the detection system generates false alarms, i.e., good traffic is identified as bad traffic. The rate of false positives varies with the performance of the detection engine used to scan incoming packets. Intrusion detection systems deploy algorithmic procedures to reduce false positives, yet still produce a good number of false alarms. Out of this necessity, we have been working on optimizing these algorithms and procedures so that false positives can be reduced to a great extent. As an effort in this direction, we propose a signature-based traffic classification technique that categorizes incoming packets based on traffic characteristics and behaviour, which would eventually reduce the rate of false alarms.
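The idea of classifying traffic before signature matching can be illustrated with a toy scanner: bucket each packet by a traffic characteristic (protocol here, an illustrative choice, not necessarily the paper's criterion) and scan it only against that bucket's signatures instead of the full rule set:

```python
def classify_and_scan(packets, rules_by_proto):
    """Classify each (protocol, payload) packet, then scan it only against
    the signature set for its traffic class; a payload containing any
    signature is marked 'alert', otherwise 'benign'."""
    verdicts = []
    for proto, payload in packets:
        sigs = rules_by_proto.get(proto, ())
        verdicts.append("alert" if any(s in payload for s in sigs) else "benign")
    return verdicts
```

Scanning a DNS packet only against DNS rules avoids spurious matches of, say, HTTP attack strings inside unrelated binary payloads, which is one mechanism by which pre-classification can lower the false-alarm rate.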
EFFICIENT ATTACK DETECTION IN IOT DEVICES USING FEATURE ENGINEERING-LESS MACH... (ijcsit)
Through the generalization of deep learning, the research community has addressed critical challenges in the network security domain, like malware identification and anomaly detection. However, it has yet to address deploying these models on Internet of Things (IoT) devices for day-to-day operations. IoT devices are often limited in memory and processing power, rendering the compute-intensive deep learning environment unusable. This research proposes a way to overcome this barrier by bypassing feature engineering in the deep learning pipeline and using raw packet data as input. We introduce a feature-engineering-less machine learning (ML) process to perform malware detection on IoT devices. Our proposed model, "Feature-engineering-less ML (FEL-ML)," is a lighter-weight detection algorithm that expends no extra computation on "engineered" features. It effectively accelerates the low-powered IoT edge. It is trained on unprocessed byte-streams of packets. Aside from providing better results, it is quicker than traditional feature-based methods. FEL-ML facilitates resource-sensitive network traffic security with the added benefit of eliminating the significant investment in feature engineering by subject matter experts.
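One common way to feed "unprocessed byte-streams of packets" to a model, sketched below as an assumption about what such a pipeline's input stage looks like rather than as FEL-ML's actual code, is to pad or truncate each packet to a fixed length and scale the bytes:

```python
def bytes_to_vector(packet, length=256):
    """Feature-engineering-less preprocessing: truncate or zero-pad a raw
    packet byte-stream to a fixed length and scale each byte to [0, 1],
    so the stream can feed a model directly with no hand-crafted features."""
    buf = packet[:length] + b"\x00" * max(0, length - len(packet))
    return [b / 255.0 for b in buf]
```

The only choices here are the fixed length and the scaling; there is no protocol parsing, flow aggregation, or expert-designed statistic, which is exactly the investment the abstract says this approach eliminates.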
INTRUSION DETECTION USING FEATURE SELECTION AND MACHINE LEARNING ALGORITHM WI... (ijcsit)
Intrusion detection over the network is one of the critical issues in preventing illegitimate use by intruders. An intruder may enter a network, system, or server by injecting malicious packets in order to steal, sniff, manipulate, or corrupt useful and secret information; this process is referred to as intrusion, whereas the transmission of packets by an intruder over the network for the purpose of intrusion is referred to as an attack. With expanding networking technology, millions of servers communicate with each other, and this expansion progresses every day. As a result, more and more intruders are attracted, so a smart intrusion detection model is a primary requirement.
By analysing feature selection methods, the essential features of the NSL-KDD data set are identified; then, using the selected features, a machine learning approach, and an analysis of the basic network features over the data set, a hybrid algorithm is constructed. Finally, a model containing rules for the network features is produced over the algorithm.
A hybrid misuse intrusion detection model is built to find attacks on the system and improve intrusion detection. Based on prior features, intrusions on the system can be detected without any previous learning. This model combines the advantages of feature selection and machine learning techniques with misuse detection.
Owing to the growth of the Internet and local networks, intrusion incidents against computer systems are rising. Intrusion detection systems are becoming progressively vital in maintaining appropriate network safety. An IDS is a software or hardware device that deals with attacks by gathering information from numerous system and network sources and then evaluating it for signs of security problems. Enterprise networked systems are unsurprisingly unprotected against the growing threats posed by hackers as well as malicious users inside the network, and IDS technology is one of the significant tools used nowadays to counter such threats. In this research we propose a framework in which an advanced feature selection and dimensionality reduction technique reduces the IDS data, after which a Fuzzy ARTMAP classifier finds intrusions, so that accurate results are obtained in less time. Feature selection is an active research area for decreasing dimensionality, eliminating irrelevant data, improving learning accuracy, and making results more comprehensible.
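A "model containing rules for the network features," as described above, reduces to predicates over a record of selected attributes. The sketch below is a generic illustration, not the paper's algorithm; the attribute names `count` and `serror_rate` are borrowed from NSL-KDD-style connection records, and the thresholds are invented for the example:

```python
def make_rule(feature, op, threshold):
    """Build one condition over a named feature of a connection record."""
    ops = {">": lambda a, b: a > b,
           "<": lambda a, b: a < b,
           "==": lambda a, b: a == b}
    return lambda rec: ops[op](rec[feature], threshold)

def detect(record, rules):
    """Misuse detection: flag the record if every condition of any one
    rule (a conjunction of conditions) fires."""
    return any(all(cond(record) for cond in rule) for rule in rules)
```

A SYN-flood-style rule might require both a high connection count and a high SYN-error rate; records matching no rule pass as normal, which is the defining behaviour (and the known-attack-only limitation) of misuse detection.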
FORTIFICATION OF HYBRID INTRUSION DETECTION SYSTEM USING VARIANTS OF NEURAL ... (IJNSA Journal)
Intrusion Detection Systems (IDS) form a key part of system defence, identifying abnormal activities happening in a computer system. In recent years, different soft-computing-based techniques have been proposed for the development of IDS. On the other hand, intrusion detection is not yet a perfect technology, which has given data mining an opportunity to make several important contributions to the field. In this paper we propose a new hybrid technique utilizing data mining methods such as fuzzy C-means clustering, a fuzzy neural network (neuro-fuzzy), and a radial basis function (RBF) SVM to fortify the intrusion detection system. The proposed technique has five major steps: first, relevance analysis is performed; then the input data are clustered using fuzzy C-means. After that, the neuro-fuzzy models are trained, such that each data point is trained with the neuro-fuzzy classifier associated with its cluster. Subsequently, a vector for SVM classification is formed, and in the last step, classification using RBF-SVM is performed to detect whether an intrusion has happened. The dataset used is the KDD Cup 1999 dataset, and precision, recall, F-measure, and accuracy are the evaluation metrics. Our technique achieved better accuracy for all types of intrusions, and comparison of its results with other existing techniques proves its effectiveness.
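The fuzzy C-means step in the pipeline above assigns each point a graded membership in every cluster rather than a hard label. The membership formula is the standard one; this sketch covers only that update, not the full alternating optimization or the downstream neuro-fuzzy training:

```python
def memberships(point, centers, m=2.0, eps=1e-12):
    """Fuzzy C-means membership of one point in each cluster:
    u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)),
    where d_i is the Euclidean distance to center i and m > 1 is the
    fuzzifier. Closer centers get higher, but not exclusive, membership."""
    d = [max(eps, sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5)
         for center in centers]
    return [1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1)) for k in range(len(d)))
            for i in range(len(centers))]
```

With the default fuzzifier m = 2, a point three times closer to one center than another receives 90% membership in the nearer cluster; the memberships always sum to 1, which is what lets each neuro-fuzzy classifier be trained on the points weighted toward its cluster.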
Hyperparameters optimization XGBoost for network intrusion detection using CS... (IAES IJAI)
With the introduction of high-speed internet access, the demand for secure and dependable networks has grown. In recent years, network attacks have become more complex and intense, making security a vital component of organizational information systems. Network intrusion detection systems (NIDS) have become an essential detection technology to protect data integrity and system availability against such attacks. NIDS is one of the most well-known applications of machine learning in the security field, with machine learning algorithms constantly being developed to improve performance. This research focuses on detecting infiltration anomalies using the hyperparameters optimization XGBoost (HO-XGB) algorithm with the Communications Security Establishment-The Canadian Institute for Cybersecurity-Intrusion Detection System 2018 (CSE-CICIDS2018) dataset to get the best potential results. When compared to typical machine learning methods published in the literature, HO-XGB outperforms them; the study shows that XGBoost outperforms other detection algorithms. We refined the HO-XGB model's hyperparameters, which included learning_rate, subsample, max_leaves, max_depth, gamma, colsample_bytree, min_child_weight, n_estimators, and reg_alpha. The experimental findings reveal that HO-XGB1 outperforms multiple parameter settings for intrusion detection, effectively optimizing XGBoost's hyperparameters.
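Hyperparameter optimization of the kind described can be sketched as a randomized search over a parameter space. The parameter names below come from the abstract's list of tuned XGBoost settings, but the candidate values and the scoring function are stand-ins: in real use, `score` would train and cross-validate an XGBoost booster on CSE-CICIDS2018.

```python
import random

def random_search(space, score, n_trials=100, seed=0):
    """Randomized hyperparameter search: sample each parameter from its
    candidate list and keep the configuration with the best score."""
    rnd = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {name: rnd.choice(candidates) for name, candidates in space.items()}
        s = score(cfg)
        if s > best_score:
            best_cfg, best_score = cfg, s
    return best_cfg, best_score

# Candidate grid named after the abstract's tuned settings (values invented).
space = {
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "max_depth": [3, 6, 9],
    "subsample": [0.6, 0.8, 1.0],
    "n_estimators": [100, 300, 500],
}
```

Random search is one simple strategy among several (grid search and Bayesian optimization are others); its appeal is that cost is controlled by `n_trials` regardless of how many parameters are tuned.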
ATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIER (CSEIJ Journal)
The widespread use of the Internet has the adverse effect of vulnerability to cyber attacks. Defensive mechanisms such as firewalls and IDSs have evolved, with many research contributions in these areas. Machine learning techniques have been used successfully in these defence mechanisms, especially IDSs. Although they are effective to some extent in identifying new patterns and variants of existing malicious patterns, many attacks are still left undetected. The objective is to develop an algorithm for detecting malicious domains based on passive traffic measurements. In this paper, an anomaly-based intrusion detection system built on an ensemble machine learning classifier, Random Forest with gradient boosting, is deployed. The NSL-KDD Cup dataset is used for analysis, and out of 41 features, 32 were identified as significant using feature discretion. Our observations confirm the conjecture that both feature selection and stochastic genetic operators improve the accuracy and effectiveness. The training time is reduced tremendously, by 98.59%, and accuracy is improved to 98.75%.
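Ranking 41 features and keeping the significant ones requires a per-feature score; the abstract does not say which criterion its "feature discretion" uses, so the sketch below uses information gain, one common choice for discrete features in tree-based pipelines:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(feature, labels):
    """Information gain of a categorical feature: class-label entropy minus
    the entropy remaining after splitting the data on that feature."""
    n = len(labels)
    total = entropy(labels)
    for value in set(feature):
        subset = [l for f, l in zip(feature, labels) if f == value]
        total -= len(subset) / n * entropy(subset)
    return total
```

A perfectly predictive feature scores the full label entropy and an independent feature scores zero, so sorting the 41 NSL-KDD attributes by this score and keeping the top portion is one way to arrive at a reduced set like the 32 features reported.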
CLASSIFICATION PROCEDURES FOR INTRUSION DETECTION BASED ON KDD CUP 99 DATA SET (IJNSA Journal)
In the network security framework, intrusion detection is a benchmark component and a fundamental way to protect computers from many threats. The major issue in intrusion detection is the huge number of false alerts; this issue motivates many researchers to seek ways of minimizing false alerts through data mining, an analysis procedure used on large data such as KDD Cup 99. This paper reviews various data mining classification procedures for handling false alerts in intrusion detection. According to the results of testing many data mining procedures on KDD Cup 99, no individual procedure can reveal every attack class with high accuracy and without false alerts. The best accuracy, 92%, is achieved by the Multilayer Perceptron, while the best training time, 4 seconds, is achieved by the rule-based model. It is concluded that various procedures should be combined to handle the several kinds of network attacks.
DDOS ATTACK DETECTION ON INTERNET OF THINGS USING UNSUPERVISED ALGORITHMS (ijfls)
The increase in the deployment of IoT networks has improved the productivity of humans and organisations. However, IoT networks are increasingly becoming platforms for launching DDoS attacks due to the inherently weaker security and resource-constrained nature of IoT devices. This paper focusses on detecting DDoS attacks in IoT networks by classifying incoming network packets on the transport layer as either “Suspicious” or “Benign” using unsupervised machine learning algorithms. In this work, two deep learning algorithms and two clustering algorithms were independently trained for mitigating DDoS attacks. The emphasis is on exploitation-based DDoS attacks, which include TCP SYN-flood and UDP-Lag attacks. The Mirai, BASHLITE and CICDDoS2019 datasets are used to train the algorithms during the experimentation phase. The accuracy score and normalized mutual information score are used to quantify the classification performance of the four algorithms. The results show that the autoencoder performed best overall, with the highest accuracy across all the datasets.
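The reconstruction-error principle behind the autoencoder result can be illustrated with a linear autoencoder (equivalently, truncated PCA) standing in for the paper's deep model; the traffic features here are synthetic, not the Mirai/BASHLITE/CICDDoS2019 data:

```python
import numpy as np

# Benign traffic features lie (by construction here) on a 1-D subspace of R^3.
benign = np.outer(np.array([1.0, 2.0, 3.0, 4.0]), np.array([1.0, 2.0, 3.0]))

# "Train": fit a rank-k linear autoencoder, i.e. keep the top-k principal directions.
k = 1
mean = benign.mean(axis=0)
_, _, vt = np.linalg.svd(benign - mean, full_matrices=False)
components = vt[:k]          # encoder/decoder weights of the linear autoencoder

def reconstruction_error(x):
    z = (x - mean) @ components.T            # encode to k dimensions
    x_hat = z @ components + mean            # decode back to feature space
    return float(np.sum((x - x_hat) ** 2))

benign_errors = [reconstruction_error(row) for row in benign]
attack_error = reconstruction_error(np.array([5.0, -3.0, 1.0]))  # off-subspace packet
print(max(benign_errors), attack_error)
```

Packets that resemble the benign training traffic reconstruct almost perfectly, while an off-distribution packet incurs a large error and is flagged "Suspicious" by thresholding.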
Trust Metric-Based Anomaly Detection via Deep Deterministic Policy Gradient R... (IJCNCJournal)
Addressing real-time network security issues is paramount due to the rapidly expanding IoT landscape. The erratic rise in the use of inadequately secured IoT-based sensory devices, such as mobile users' wearables, autonomous vehicles, smartphones and appliances, by an ever larger user community is fuelling the need for a trustworthy, high-performance security framework. An efficient anomaly detection system would address this problem by devising a competent attack detection model. This paper delves into the Deep Deterministic Policy Gradient (DDPG) approach, a promising Reinforcement Learning platform for combating the noisy sensor samples instigated by network attacks. The authors propose an enhanced DDPG approach based on trust metrics and belief networks, referred to as the Deep Deterministic Policy Gradient Belief Network (DDPG-BN). This deep-learning-based approach is projected as an algorithm that provides “Deep-Defense” against the plethora of network attacks. A confidence interval is chosen as the trust metric to decide on the termination of sensor sample collection. Once an enlisted attack is detected, the collection of samples from the particular sensor automatically ceases. The experimental evaluations highlight a better detection accuracy of 98.37%, compared to 97.46% for the conventional DDPG implementation. The paper also covers work based on a contemporary Deep Reinforcement Learning (DRL) algorithm, the Actor Critic (AC). The proposed deep learning binary classification model is validated on the NSL-KDD dataset, and its performance is compared to several deep learning implementations.
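The confidence-interval trust metric described above amounts to a sequential stopping rule: keep collecting sensor samples until the interval around the running mean is narrow enough to trust. A sketch under assumed values (the 0.5 half-width threshold and the readings are illustrative, not from the paper):

```python
import math

def ci_half_width(samples, z=1.96):
    # 95% normal-approximation confidence interval on the sample mean.
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return z * math.sqrt(var / n)

def collect_until_trusted(stream, threshold=0.5, min_samples=3):
    # Pull sensor readings until the CI half-width drops below the threshold.
    samples = []
    for reading in stream:
        samples.append(reading)
        if len(samples) >= min_samples and ci_half_width(samples) < threshold:
            break
    return samples

readings = [10.1, 10.3, 9.9, 10.0, 10.2, 9.8, 55.0]
kept = collect_until_trusted(iter(readings))
print(len(kept))  # 3 -- collection terminates before the outlier is ever read
```

Low-variance readings terminate collection early, which is the energy-saving behaviour the trust metric is meant to deliver.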
A Lightweight Method for Detecting Cyber Attacks in High-traffic Large Networ... (IJCNCJournal)
Protecting information systems is a difficult, long-term task. Computer networks vary widely in size and traffic intensity, and no single protection solution is universal for all cases: a solution that protects a campus network well is unlikely to protect a service provider's network equally well. A key component of a cyber defence system is the network attack detector, which must be designed so that its detection capability scales with network size and traffic intensity beyond those of a campus network. From this point of view, this paper builds a network attack detection method suited to large, high-traffic networks, based on machine learning models that use clustering techniques together with a proposed detection technique. The detection technique differs from the outlier detection commonly used in clustering-based anomaly detection applications. The method was evaluated with different feature extraction methods and different clustering algorithms. Experimental results on the NSL-KDD dataset are positive, with a detection accuracy of over 97%.
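The cluster-then-label idea (as opposed to outlier scoring) can be sketched as follows; this is a generic nearest-centroid scheme for illustration, not the authors' specific detection technique:

```python
import numpy as np

def fit_cluster_detector(X, labels, centroids_init, iters=10):
    # Plain k-means with fixed initial centroids, then majority-label each cluster.
    centroids = centroids_init.astype(float)
    for _ in range(iters):
        assign = np.argmin(((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)
        for c in range(len(centroids)):
            if np.any(assign == c):
                centroids[c] = X[assign == c].mean(axis=0)
    cluster_label = []
    for c in range(len(centroids)):
        members = [labels[i] for i in range(len(X)) if assign[i] == c]
        cluster_label.append(max(set(members), key=members.count) if members else "normal")
    return centroids, cluster_label

def detect(x, centroids, cluster_label):
    # Classify a new flow by the label of its nearest cluster centroid.
    c = int(np.argmin(((centroids - x) ** 2).sum(axis=1)))
    return cluster_label[c]

X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],      # normal flows
              [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]])     # attack flows
labels = ["normal"] * 3 + ["attack"] * 3
centroids, cluster_label = fit_cluster_detector(X, labels, X[[0, 3]])
print(detect(np.array([5.1, 5.1]), centroids, cluster_label))  # attack
```

Because detection reduces to a handful of centroid distance checks per packet, this style of model stays cheap at high traffic rates, which is the "lightweight" property the paper targets.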
Feature Selection using the Concept of Peafowl Mating in IDS (IJCNCJournal)
Cloud computing has high applicability as an Internet-based service that relies on shared computing resources, providing infrastructure-based, platform-based and software-based services. The popularity of this technology is due to its performance, high computing ability, low cost of services, scalability, availability and flexibility. The obtainability and openness of data in the cloud environment make it vulnerable to cyber-attacks. To detect these attacks, an Intrusion Detection System (IDS) is used that can identify attacks and ensure information security. Such a coherent and proficient IDS is proposed in this paper to achieve higher levels of certainty regarding safety in the cloud environment. The mating behavior of peafowl is incorporated into an optimization algorithm, which in turn is used as a feature selection algorithm. The algorithm reduces the huge size of cloud data so that the IDS can work efficiently on the cloud to detect intrusions. The proposed model was evaluated on the NSL-KDD dataset as well as the Kyoto dataset and proved to be an efficient IDS.
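The peafowl-mating optimizer itself is not reproduced here. As a stand-in, a tiny exhaustive subset search scored by a nearest-centroid rule shows the objective such a metaheuristic optimizes; the metaheuristic replaces the exhaustive loop once the feature space is too large to enumerate. The toy records are hypothetical:

```python
from itertools import combinations

# Toy records: feature 0 separates the classes; features 1 and 2 are noise.
X = [[0.0, 7.0, 3.0], [0.1, 1.0, 9.0], [0.2, 5.0, 5.0],
     [1.0, 6.0, 4.0], [1.1, 2.0, 8.0], [0.9, 4.0, 6.0]]
y = ["normal", "normal", "normal", "attack", "attack", "attack"]

def score(subset):
    # Training-set accuracy of a nearest-centroid rule on the chosen features.
    def proj(row):
        return [row[j] for j in subset]
    cents = {}
    for cls in set(y):
        rows = [proj(X[i]) for i in range(len(X)) if y[i] == cls]
        cents[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    def predict(row):
        p = proj(row)
        return min(cents, key=lambda c: sum((a - b) ** 2 for a, b in zip(p, cents[c])))
    return sum(predict(X[i]) == y[i] for i in range(len(X))) / len(X)

# Prefer the highest score, and among ties the smallest subset.
best = max((s for k in range(1, 4) for s in combinations(range(3), k)),
           key=lambda s: (score(s), -len(s)))
print(best)  # the single informative feature survives the search
```

The search lands on the one discriminative feature, mirroring how the paper's optimizer shrinks the cloud data before the IDS runs.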
Vehicular Ad Hoc Networks (VANETs) have become a viable technology for improving traffic flow and safety on the roads. Due to its effectiveness and scalability, the Wingsuit Search-based Optimised Link State Routing protocol (WS-OLSR) is frequently used for data distribution in VANETs. However, the selection of MultiPoint Relays (MPRs) plays a pivotal role in WS-OLSR's performance. This paper presents an improved MPR selection algorithm tailored to WS-OLSR, designed to enhance overall routing efficiency and reduce overhead. The analysis found that the current OLSR protocol suffers from problems such as redundant HELLO and TC message packets and failure to update routing information in time, so a WS-OLSR routing protocol based on an improved MPR selection algorithm is proposed. First, factors such as node mobility and link changes are considered comprehensively to reflect network topology changes, and the broadcast cycle of node HELLO messages is controlled through those changes. Second, a new MPR selection algorithm is proposed that takes link stability and node properties into account. Finally, its effectiveness is evaluated in terms of packet delivery ratio, end-to-end delay, and control message overhead. Simulation results demonstrate the superior performance of the improved MPR selection algorithm compared to traditional approaches.
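The improved WS-OLSR selector is not reproduced here, but the classic greedy MPR heuristic that such algorithms refine, covering every two-hop neighbour with as few one-hop relays as possible, can be sketched (the topology is hypothetical):

```python
def select_mprs(one_hop, two_hop_of):
    # one_hop: set of direct neighbours; two_hop_of[n]: two-hop nodes reachable via n.
    uncovered = set().union(*two_hop_of.values()) - one_hop
    mprs = set()
    while uncovered:
        # Greedy set cover: take the neighbour covering the most uncovered nodes.
        best = max(one_hop - mprs, key=lambda n: len(two_hop_of[n] & uncovered))
        if not two_hop_of[best] & uncovered:
            break  # remaining nodes unreachable via any neighbour
        mprs.add(best)
        uncovered -= two_hop_of[best]
    return mprs

one_hop = {"A", "B", "C"}
two_hop_of = {"A": {"X", "Y"}, "B": {"Y", "Z"}, "C": {"Z"}}
mprs = select_mprs(one_hop, two_hop_of)
covered = set().union(*(two_hop_of[m] for m in mprs))
print(mprs, covered)
```

Only MPRs rebroadcast control traffic, so a smaller relay set directly reduces the HELLO/TC overhead the paper is concerned with; the improved algorithm additionally weighs link stability when breaking ties.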
A Novel Medium Access Control Strategy for Heterogeneous Traffic in Wireless ... (IJCNCJournal)
So far, Wireless Body Area Networks (WBANs) have played a pivotal role in driving the development of intelligent healthcare systems with broad applicability across various domains. Each WBAN consists of one or more types of sensors that can be embedded in clothing, attached directly to the body, or even implanted beneath an individual's skin. These sensors typically serve a single application. However, the traffic generated by each sensor may have distinct requirements. This diversity necessitates a dual approach: tailored treatment based on the specific needs of each traffic type, and the fulfilment of application requirements such as reliability and timeliness. Nevertheless, energy constraints and the unreliable nature of wireless communications make QoS provisioning in such networks a non-trivial task. In this context, the current paper introduces a novel Medium Access Control (MAC) strategy for the regular traffic applications of WBANs, designed to significantly improve on the established MAC protocols IEEE 802.15.4 and IEEE 802.15.6, with a particular focus on reliability, timeliness, and energy efficiency.
More related content
Similar to High Performance NMF Based Intrusion Detection System for Big Data IOT Traffic
FORTIFICATION OF HYBRID INTRUSION DETECTION SYSTEM USING VARIANTS OF NEURAL ... (IJNSA Journal)
Intrusion Detection Systems (IDS) form a key part of system defence, identifying abnormal activities happening in a computer system. In recent years different soft-computing-based techniques have been proposed for the development of IDS. On the other hand, intrusion detection is not yet a perfect technology, which has given data mining the opportunity to make several important contributions to the field. In this paper we propose a new hybrid technique utilizing data mining techniques such as fuzzy C-means clustering, fuzzy neural networks (neuro-fuzzy) and radial basis function (RBF) SVM for fortification of the intrusion detection system. The proposed technique has five major steps: first, relevance analysis is performed; then the input data is clustered using fuzzy C-means clustering. After that, the neuro-fuzzy models are trained, such that each data point is trained with the neuro-fuzzy classifier associated with its cluster. Subsequently, a vector for SVM classification is formed, and in the last step, classification using RBF-SVM is performed to detect whether an intrusion has happened or not. The dataset used is the KDD Cup 1999 dataset, and precision, recall, F-measure and accuracy are used as evaluation metrics. The technique achieves better accuracy for all types of intrusion, and comparisons with other existing techniques prove its effectiveness.
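Only the first stage of the hybrid pipeline, fuzzy C-means clustering, is sketched below with the standard FCM updates; the neuro-fuzzy and RBF-SVM stages are omitted, and the data is synthetic rather than KDD Cup 1999:

```python
import numpy as np

def fuzzy_c_means(X, centers, m=2.0, iters=20):
    # Standard FCM: memberships from inverse distance ratios,
    # centers from membership-weighted means.
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)                   # avoid division by zero
        ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
        u = 1.0 / ratio.sum(axis=2)                # u[i, c]: membership of point i in c
        w = u ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
    return u, centers

X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [4.0, 4.0], [4.2, 3.9], [3.9, 4.1]])
u, centers = fuzzy_c_means(X, X[[0, 3]].astype(float))
print(u[0], u[3])  # each point belongs almost entirely to its own cluster
```

Unlike hard k-means, every point receives a graded membership in every cluster; in the hybrid pipeline these memberships determine which cluster's neuro-fuzzy classifier a data point trains.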
Hyperparameters optimization XGBoost for network intrusion detection using CS... (IAESIJAI)
With the introduction of high-speed internet access, the demand for secure and dependable networks has grown. In recent years, network attacks have become more complex and intense, making security a vital component of organizational information systems. Network intrusion detection systems (NIDS) have become an essential detection technology to protect data integrity and system availability against such attacks. NIDS is one of the best-known applications of machine learning in the security field, with machine learning algorithms constantly being developed to improve performance. This research focuses on detecting network infiltration anomalies using the hyperparameters optimization XGBoost (HO-XGB) algorithm with the Communications Security Establishment-The Canadian Institute for Cybersecurity-Intrusion Detection System 2018 (CSE-CICIDS2018) dataset to get the best potential results. When compared to typical machine learning methods published in the literature, HO-XGB outperforms them. The study shows that XGBoost outperforms other detection algorithms. The HO-XGB model's hyperparameters were refined, including learning_rate, subsample, max_leaves, max_depth, gamma, colsample_bytree, min_child_weight, n_estimators, and reg_alpha. The experimental findings reveal that HO-XGB1 outperforms multiple parameter settings for intrusion detection, effectively optimizing XGBoost's hyperparameters.
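The tuning loop can be sketched as a grid search over a parameter dictionary with a pluggable scorer. The search space and the stand-in evaluate function below are hypothetical, and xgboost itself is not invoked; in practice evaluate would return cross-validated detection accuracy on CSE-CICIDS2018 folds:

```python
import itertools

# Hypothetical search space, a small subset of the parameters the paper tunes.
space = {
    "learning_rate": [0.05, 0.1, 0.3],
    "max_depth": [4, 6, 8],
    "n_estimators": [100, 200],
}

def evaluate(params):
    # Stand-in scorer: peaks at one arbitrary setting so the search has a target.
    target = {"learning_rate": 0.1, "max_depth": 6, "n_estimators": 200}
    return -sum(abs(params[k] - target[k]) for k in params)

def grid_search(space, evaluate):
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*space.values()):
        params = dict(zip(space.keys(), values))
        s = evaluate(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params

best = grid_search(space, evaluate)
print(best)
```

Swapping the exhaustive product for random or Bayesian sampling changes only the loop, not the structure, which is why hyperparameter optimizers are usually written against a scorer interface like this.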
Attack Detection Availing Feature Discretion using Random Forest ClassifierCSEIJJournal
The widespread use of the Internet has an adverse effect of being vulnerable to cyber attacks. Defensive
mechanisms like firewalls and IDSs have evolved with a lot of research contributions happening in these
areas. Machine learning techniques have been successfully used in these defense mechanisms especially
IDSs. Although they are effective to some extent in identifying new patterns and variants of existing
malicious patterns, many attacks are still left as undetected. The objective is to develop an algorithm for
detecting malicious domains based on passive traffic measurements. In this paper, an anomaly-based
intrusion detection system based on an ensemble based machine learning classifier called Random Forest
with gradient boosting is deployed. NSL-KDD cup dataset is used for analysis and out of 41 features, 32
features were identified as significant using feature discretion.
To Get any Project for CSE, IT ECE, EEE Contact Me @ 09666155510, 09849539085 or mail us - ieeefinalsemprojects@gmail.com-Visit Our Website: www.finalyearprojects.org
IEEE 2014 DOTNET PARALLEL DISTRIBUTED PROJECTS A system-for-denial-of-service...IEEEMEMTECHSTUDENTPROJECTS
To Get any Project for CSE, IT ECE, EEE Contact Me @ 09666155510, 09849539085 or mail us - ieeefinalsemprojects@gmail.com-Visit Our Website: www.finalyearprojects.org
CLASSIFICATION PROCEDURES FOR INTRUSION DETECTION BASED ON KDD CUP 99 DATA SETIJNSA Journal
In network security framework, intrusion detection is one of a benchmark part and is a fundamental way to protect PC from many threads. The huge issue in intrusion detection is presented as a huge number of false alerts; this issue motivates several experts to discover the solution for minifying false alerts according to data mining that is a consideration as analysis procedure utilized in a large data e.g. KDD CUP 99. This paper presented various data mining classification for handling false alerts in intrusion detection as reviewed. According to the result of testing many procedure of data mining on KDD CUP 99 that is no individual procedure can reveal all attack class, with high accuracy and without false alerts. The best accuracy in Multilayer Perceptron is 92%; however, the best Training Time in Rule based model is 4 seconds . It is concluded that ,various procedures should be utilized to handle several of network attacks.
CLASSIFICATION PROCEDURES FOR INTRUSION DETECTION BASED ON KDD CUP 99 DATA SETIJNSA Journal
In network security framework, intrusion detection is one of a benchmark part and is a fundamental way to protect PC from many threads. The huge issue in intrusion detection is presented as a huge number of false alerts; this issue motivates several experts to discover the solution for minifying false alerts according to data mining that is a consideration as analysis procedure utilized in a large data e.g. KDD CUP 99. This paper presented various data mining classification for handling false alerts in intrusion detection as reviewed. According to the result of testing many procedure of data mining on KDD CUP 99 that is no individual procedure can reveal all attack class, with high accuracy and without false alerts. The best accuracy in Multilayer Perceptron is 92%; however, the best Training Time in Rule based model is 4 seconds . It is concluded that ,various procedures should be utilized to handle several of network attacks.
DDOS ATTACK DETECTION ON INTERNET OF THINGS USING UNSUPERVISED ALGORITHMSijfls
The increase in the deployment of IoT networks has improved productivity of humans and organisations.
However, IoT networks are increasingly becoming platforms for launching DDoS attacks due to inherent
weaker security and resource-constrained nature of IoT devices. This paper focusses on detecting DDoS
attack in IoT networks by classifying incoming network packets on the transport layer as either
“Suspicious” or “Benign” using unsupervised machine learning algorithms. In this work, two deep
learning algorithms and two clustering algorithms were independently trained for mitigating DDoS
attacks. We lay emphasis on exploitation based DDOS attacks which include TCP SYN-Flood attacks and
UDP-Lag attacks. We use Mirai, BASHLITE and CICDDoS2019 dataset in training the algorithms during
the experimentation phase. The accuracy score and normalized-mutual-information score are used to
quantify the classification performance of the four algorithms. Our results show that the autoencoder
performed overall best with the highest accuracy across all the datasets.
DDoS Attack Detection on Internet o Things using Unsupervised Algorithmsijfls
The increase in the deployment of IoT networks has improved productivity of humans and organisations. However, IoT networks are increasingly becoming platforms for launching DDoS attacks due to inherent weaker security and resource-constrained nature of IoT devices. This paper focusses on detecting DDoS attack in IoT networks by classifying incoming network packets on the transport layer as either “Suspicious” or “Benign” using unsupervised machine learning algorithms. In this work, two deep learning algorithms and two clustering algorithms were independently trained for mitigating DDoS attacks. We lay emphasis on exploitation based DDOS attacks which include TCP SYN-Flood attacks and UDP-Lag attacks. We use Mirai, BASHLITE and CICDDoS2019 dataset in training the algorithms during the experimentation phase. The accuracy score and normalized-mutual-information score are used to quantify the classification performance of the four algorithms. Our results show that the autoencoder performed overall best with the highest accuracy across all the datasets.
Trust Metric-Based Anomaly Detection Via Deep Deterministic Policy Gradient R...IJCNCJournal
Addressing real-time network security issues is paramount due to the rapidly expanding IoT jargon. The erratic rise in usage of inadequately secured IoT- based sensory devices like wearables of mobile users, autonomous vehicles, smartphones and appliances by a larger user community is fuelling the need for a trustable, super-performant security framework. An efficient anomaly detection system would aim to address the anomaly detection problem by devising a competent attack detection model. This paper delves into the Deep Deterministic Policy Gradient (DDPG) approach, a promising Reinforcement Learning platform to combat noisy sensor samples which are instigated by alarming network attacks. The authors propose an enhanced DDPG approach based on trust metrics and belief networks, referred to as Deep Deterministic Policy Gradient Belief Network (DDPG-BN). This deep-learning-based approach is projected as an algorithm to provide “Deep-Defense” to the plethora of network attacks. Confidence interval is chosen as the trust metric to decide on the termination of sensor sample collection. Once an enlisted attack is detected, the collection of samples from the particular sensor will automatically cease. The evaluations and results of the experiments highlight a better detection accuracy of 98.37% compared to its counterpart conventional DDPG implementation of 97.46%. The paper also covers the work based on a contemporary Deep Reinforcement Learning (DRL) algorithm, the Actor Critic (AC). The proposed deep learning binary classification model is validated using the NSL-KDD dataset and the performance is compared to a few deep learning implementations as well.
Trust Metric-Based Anomaly Detection via Deep Deterministic Policy Gradient R...IJCNCJournal
Addressing real-time network security issues is paramount due to the rapidly expanding IoT jargon. The erratic rise in usage of inadequately secured IoT- based sensory devices like wearables of mobile users, autonomous vehicles, smartphones and appliances by a larger user community is fuelling the need for a trustable, super-performant security framework. An efficient anomaly detection system would aim to address the anomaly detection problem by devising a competent attack detection model. This paper delves into the Deep Deterministic Policy Gradient (DDPG) approach, a promising Reinforcement Learning platform to combat noisy sensor samples which are instigated by alarming network attacks. The authors propose an enhanced DDPG approach based on trust metrics and belief networks, referred to as Deep Deterministic Policy Gradient Belief Network (DDPG-BN). This deep-learning-based approach is projected as an algorithm to provide “Deep-Defense” to the plethora of network attacks. Confidence interval is chosen as the trust metric to decide on the termination of sensor sample collection. Once an enlisted attack is detected, the collection of samples from the particular sensor will automatically cease. The evaluations and results of the experiments highlight a better detection accuracy of 98.37% compared to its counterpart conventional DDPG implementation of 97.46%. The paper also covers the work based on a contemporary Deep Reinforcement Learning (DRL) algorithm, the Actor Critic (AC). The proposed deep learning binary classification model is validated using the NSL-KDD dataset and the performance is compared to a few deep learning implementations as well.
A Lightweight Method for Detecting Cyber Attacks in High-traffic Large Networ...IJCNCJournal
Protecting information systems is a difficult and long-term task. The size and traffic intensity of computer networks are diverse and no one protection solution is universal for all cases. A certain solution protects well in the campus network, but it is unlikely to protect well in the service provider's network. A key component of a cyber defence system is a network attack detector. This component needs to be designed to have a good way to scale detection capabilities with network size and traffic intensity beyond the size and intensity of a campus network. From this point of view, this paper aims to build a network attack detection method suitable for the scale of large and high-traffic networks based on machine learning models using clustering techniques and our proposed detection technique. The detection technique is different from outlier detection commonly used in clustering-based anomaly detection applications. The method was evaluated in cases using different feature extraction methods and different clustering algorithms. Experimental results on the NSL-KDD data set are positive with a detection accuracy of over 97%.
A LIGHTWEIGHT METHOD FOR DETECTING CYBER ATTACKS IN HIGH-TRAFFIC LARGE NETWOR...IJCNCJournal
Protecting information systems is a difficult and long-term task. The size and traffic intensity of computer
networks are diverse and no one protection solution is universal for all cases. A certain solution protects
well in the campus network, but it is unlikely to protect well in the service provider's network. A key
component of a cyber defence system is a network attack detector. This component needs to be designed to
have a good way to scale detection capabilities with network size and traffic intensity beyond the size and
intensity of a campus network. From this point of view, this paper aims to build a network attack detection
method suitable for the scale of large and high-traffic networks based on machine learning models using
clustering techniques and our proposed detection technique. The detection technique is different from
outlier detection commonly used in clustering-based anomaly detection applications. The method was
evaluated in cases using different feature extraction methods and different clustering algorithms.
Experimental results on the NSL-KDD data set are positive with a detection accuracy of over 97%.
Feature Selection using the Concept of Peafowl Mating in IDSIJCNCJournal
Cloud computing has high applicability as an Internet based service that relies on sharing computing resources. Cloud computing provides services that are Infrastructure based, Platform based and Software based. The popularity of this technology is due to its superb performance, high level of computing ability, low cost of services, scalability, availability and flexibility. The obtainability and openness of data in cloud environment make it vulnerable to the world of cyber-attacks. To detect the attacks Intrusion Detection System is used, that can identify the attacks and ensure information security. Such a coherent and proficient Intrusion Detection System is proposed in this paper to achieve higher certainty levels regarding safety in cloud environment. In this paper, the mating behavior of peafowl is incorporated into an optimization algorithm which in turn is used as a feature selection algorithm. The algorithm is used to reduce the huge size of cloud data so that the IDS can work efficiently on the cloud to detect intrusions. The proposed model has been experimented with NSL-KDD dataset as well as Kyoto dataset and have proved to be a better as well as an efficient IDS.
Vehicle Ad Hoc Networks (VANETs) have become a viable technology to improve traffic flow and safety on the roads. Due to its effectiveness and scalability, the Wingsuit Search-based Optimised Link State Routing Protocol (WS-OLSR) is frequently used for data distribution in VANETs. However, the selection of MultiPoint Relays (MPRs) plays a pivotal role in WS-OLSR's performance. This paper presents an improved MPR selection algorithm tailored to WS-OLSR, designed to enhance overall routing efficiency and reduce overhead. The analysis found that the current OLSR protocol suffers from problems such as redundant HELLO and TC message packets and failure to update routing information in time, so a WS-OLSR routing protocol based on an improved MPR selection algorithm is proposed. Firstly, factors such as node mobility and link changes are comprehensively considered to reflect network topology changes, and the broadcast cycle of node HELLO messages is controlled through topology changes. Secondly, a new MPR selection algorithm is proposed that takes into account link stability and node characteristics. Finally, its effectiveness is evaluated in terms of packet delivery ratio, end-to-end delay, and control message overhead. Simulation results demonstrate the superior performance of our improved MPR selection algorithm when compared to traditional approaches.
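For readers unfamiliar with MPR selection, the classic OLSR heuristic that improved variants build on can be sketched as a greedy cover of the two-hop neighbourhood: a node keeps adding one-hop neighbours until every strict two-hop neighbour is reachable through a chosen relay. The neighbourhood data below is hypothetical, and the sketch omits the link-stability weighting that improved algorithms add.

```python
def select_mprs(one_hop, two_hop_via):
    """Greedy MPR selection (simplified OLSR heuristic): choose one-hop
    neighbours until every strict two-hop neighbour is covered."""
    uncovered = set()
    for n in one_hop:
        uncovered |= two_hop_via[n]
    uncovered -= set(one_hop)          # keep only strict two-hop nodes
    mprs = set()
    while uncovered:
        # pick the neighbour covering the most still-uncovered nodes
        best = max(one_hop, key=lambda n: len(two_hop_via[n] & uncovered))
        gain = two_hop_via[best] & uncovered
        if not gain:                   # nothing left is reachable in two hops
            break
        mprs.add(best)
        uncovered -= gain
    return mprs

# Hypothetical neighbourhood of a node S with one-hop neighbours A, B, C;
# two_hop_via maps each neighbour to the nodes reachable through it.
one_hop = ["A", "B", "C"]
two_hop_via = {"A": {"D", "E"}, "B": {"E", "F", "G"}, "C": {"G"}}
print(select_mprs(one_hop, two_hop_via))  # B covers E, F, G; A then covers D
```

Fewer MPRs means fewer TC retransmissions, which is exactly why MPR selection dominates OLSR's control overhead.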
A Novel Medium Access Control Strategy for Heterogeneous Traffic in Wireless ... (IJCNCJournal)
So far, Wireless Body Area Networks (WBANs) have played a pivotal role in driving the development of intelligent healthcare systems with broad applicability across various domains. Each WBAN consists of one or more types of sensors that can be embedded in clothing, attached directly to the body, or even implanted beneath an individual's skin. These sensors typically serve a single application. However, the traffic generated by each sensor may have distinct requirements. This diversity necessitates a dual approach: tailored treatment based on the specific needs of each traffic type and the fulfillment of application requirements, such as reliability and timeliness. Nevertheless, the presence of energy constraints and the unreliable nature of wireless communications make QoS provisioning in such networks a non-trivial task. In this context, the current paper introduces a novel Medium Access Control (MAC) strategy for the regular traffic applications of WBANs, designed to significantly enhance efficiency when compared to the established MAC protocols IEEE 802.15.4 and IEEE 802.15.6, with a particular focus on improving reliability, timeliness, and energy efficiency.
May 2024 - Top 10 Read Articles in Computer Networks & Communications (IJCNCJournal)
The International Journal of Computer Networks & Communications (IJCNC) is a bimonthly open-access peer-reviewed journal that publishes articles contributing new results in all areas of computer networks and communications. The journal focuses on all technical and practical aspects of computer networks and data communications. Its goal is to bring together researchers and practitioners from academia and industry to focus on advanced networking concepts and to establish new collaborations in these areas.
A Topology Control Algorithm Taking into Account Energy and Quality of Transm... (IJCNCJournal)
The efficient use of energy in wireless sensor networks is critical for extending node lifetime. The network topology is one of the factors that has a significant impact on energy usage at the nodes and on the quality of transmission (QoT) in the network. We propose a topology control algorithm for software-defined wireless sensor networks (SDWSNs) in this paper. Our method formulates topology control as a nonlinear programming (NP) problem with the objective of optimizing two metrics: maximum communication range and desired degree. This NP problem is solved at the SDWSN controller by employing a genetic algorithm (GA) to determine the best topology. The simulation results show that the proposed algorithm outperforms the MaxPower algorithm in terms of average node degree and energy expansion ratio.
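The GA-based optimization step can be illustrated with a deliberately simplified, single-variable sketch: a population of candidate communication ranges evolves under a toy fitness that penalizes both energy use and loss of connectivity. The fitness function, the connectivity cutoff, and all parameters are invented for illustration and are not the paper's actual objective.

```python
import random

random.seed(1)

def fitness(r):
    """Toy objective: energy grows with range squared, and ranges below a
    (made-up) connectivity cutoff of 4 incur a heavy penalty."""
    energy = r * r
    connectivity_penalty = 0.0 if r >= 4 else (4 - r) * 100
    return -(energy + connectivity_penalty)

def ga(pop_size=20, gens=50, lo=1.0, hi=10.0):
    """Elitist GA over a single real-valued gene (the communication range)."""
    pop = [random.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            child = (a + b) / 2                  # arithmetic crossover
            child += random.gauss(0, 0.1)        # small Gaussian mutation
            children.append(min(max(child, lo), hi))
        pop = parents + children
    return max(pop, key=fitness)

best = ga()
print(round(best, 2))  # converges near 4, the cheapest connected range
```

The real algorithm optimizes two metrics over a whole topology rather than one scalar, but the select-crossover-mutate loop is the same.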
Multi-Server user Authentication Scheme for Privacy Preservation with Fuzzy C... (IJCNCJournal)
The integration of artificial intelligence technology with a scalable Internet of Things (IoT) platform facilitates diverse smart communication services, allowing remote users to access services from anywhere at any time. The multi-server environment within IoT introduces a flexible security service model, enabling users to interact with any server through a single registration. To ensure secure and privacy-preserving services for resources, an authentication scheme is essential. Zhao et al. recently introduced a user authentication scheme for the multi-server environment, utilizing passwords and smart cards, claiming resilience against well-known attacks. This paper conducts cryptanalysis on Zhao et al.'s scheme, focusing on denial of service and privacy attacks, and reveals a lack of user-friendliness. Subsequently, we propose a new multi-server user authentication scheme for privacy preservation with fuzzy commitment over the IoT environment, addressing the shortcomings of Zhao et al.'s scheme. Formal security verification of the proposed scheme is conducted using the ProVerif simulation tool. Through both formal and informal security analyses, we demonstrate that the proposed scheme is resilient against various known attacks and those identified in Zhao et al.'s scheme.
Advanced Privacy Scheme to Improve Road Safety in Smart Transportation Systems (IJCNCJournal)
In a Vehicular Ad-Hoc Network (VANET), vehicles continuously transmit and receive spatiotemporal data with neighboring vehicles, thereby establishing a comprehensive 360-degree traffic awareness system. Vehicular network safety applications facilitate the transmission of messages between vehicles that are near each other at regular intervals, enhancing drivers' contextual understanding of the driving environment and significantly improving traffic safety. Privacy schemes in VANETs are vital to safeguard vehicles' identities and their associated owners or drivers. Privacy schemes prevent unauthorized parties from linking a vehicle's communications to a specific real-world identity by employing techniques such as pseudonyms, randomization, or cryptographic protocols. Nevertheless, these communications frequently contain important vehicle information that malevolent groups could use to monitor the vehicle over a long period. The acquisition of this shared data has the potential to facilitate the reconstruction of vehicle trajectories, thereby posing a risk to the privacy of the driver. Addressing the critical challenge of developing effective and scalable privacy-preserving protocols for communication in vehicle networks is of the highest priority. These protocols aim to reduce the transmission of confidential data while ensuring the required level of communication. This paper proposes an Advanced Privacy Vehicle Scheme (APV) that periodically changes pseudonyms to protect vehicle identities and improve privacy. The APV scheme utilizes a concept called the silent period, which involves changing the pseudonym of a vehicle periodically based on the tracking of neighboring vehicles. The pseudonym is a temporary identifier that vehicles use to communicate with each other in a VANET. By changing the pseudonym regularly, the APV scheme makes it difficult for unauthorized entities to link a vehicle's communications to its real-world identity.
The proposed APV is compared to the SLOW, RSP, CAPS, and CPN techniques. The data indicate that APV delivers a clear improvement in privacy metrics. It is evident that the APV offers enhanced safety for vehicles during transportation in the smart city.
DEF: Deep Ensemble Neural Network Classifier for Android Malware Detection (IJCNCJournal)
Malware is one of the threats to the security of computer networks and information systems. Since malware instances are abundantly available, there is increased interest among researchers in the use of Artificial Intelligence (AI). Of late, AI-enabled methods such as machine learning (ML) and deep learning have paved the way for solving many real-world problems. As these are learning-based approaches, accumulated training samples help improve the quality of training and thus the malware detection accuracy. Existing deep learning methods focus on learning-based malware detection systems. However, there is a need to improve the state of the art through an ensemble approach. Towards this end, in this paper we propose a framework known as the Deep Ensemble Framework (DEF) for automatic malware detection. The framework obtains features from training samples: from a given malware instance a grayscale image is generated, and a separate process extracts the opcode sequences. Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) techniques are used to process the grayscale image and the opcode sequence, respectively. Afterwards, a stacking ensemble is employed in order to achieve efficient malware detection and classification. Malware samples collected from Internet sources and Microsoft are used for the empirical study. An algorithm known as Ensemble Learning for Automatic Malware Detection (EL-AML) is proposed to realize our framework. Another algorithm, named Pre-Process, is proposed to assist the EL-AML algorithm in obtaining the intermediate features required by the CNN and LSTM. The empirical study reveals that our framework outperforms many existing methods in terms of speed-up and accuracy.
High Performance NMF based Intrusion Detection System for Big Data IoT Traffic (IJCNCJournal)
With the emergence of smart devices and the Internet of Things (IoT), millions of users connected to the network produce massive network traffic datasets. These vast network traffic datasets (Big Data) are challenging to store, process, and analyse on a single computer. In this paper we developed a parallel implementation, on a High Performance Computer (HPC), of the Non-Negative Matrix Factorization technique as the engine of an Intrusion Detection System (HPC-NMF-IDS). The large IoT traffic datasets, on the order of millions of samples, are distributed evenly over all the computing cores for both storage and speed-up purposes. The distribution of the computing tasks involved in the matrix factorization takes into account the reduction of the communication cost between the computing cores. The experiments we conducted on the proposed HPC-NMF-IDS give better results than traditional ML-based intrusion detection systems. We could train the model on a dataset of one million samples in only 31 seconds instead of 40 minutes on a single processor, a speed-up of 87 times. Moreover, we obtained an excellent detection accuracy of 98% on the KDD dataset.
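For reference, the serial core of Non-Negative Matrix Factorization can be sketched with the standard Lee-Seung multiplicative updates. This is a single-process illustration only, not the paper's parallel HPC implementation, and the toy "traffic" matrix below is invented.

```python
import numpy as np

def nmf(V, k, iters=500, eps=1e-9):
    """Factor a non-negative V (m x n) into W (m x k) @ H (k x n) using
    Lee-Seung multiplicative updates for the Frobenius loss."""
    rng = np.random.default_rng(0)
    m, n = V.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update coefficients
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update basis
    return W, H

# Toy "traffic" matrix: rows are samples, columns are features.
V = np.array([[1.0, 0.0, 2.0],
              [2.0, 0.0, 4.0],
              [0.0, 3.0, 0.0]])
W, H = nmf(V, k=2)
print(np.round(W @ H, 2))  # reconstruction close to V
```

In the parallel setting the rows of V (and hence of W) can be sharded across cores, which is what makes even distribution of samples matter for both storage and speed-up.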
IoT Guardian: A Novel Feature Discovery and Cooperative Game Theory Empowered... (IJCNCJournal)
Cyber intrusion attacks increasingly target the Internet of Things (IoT) ecosystem, exploiting vulnerable devices and networks. Malicious activities must be identified early to minimize damage and mitigate threats. Using actual benign and attack traffic from the CICIoT2023 dataset, this work aims to evaluate and benchmark machine-learning techniques for IoT intrusion detection. There are four main phases to the system. First, the CICIoT2023 dataset is refined to remove irrelevant features and clean up missing and duplicate data. The second phase employs statistical models and artificial intelligence to discover novel features. The most significant features are then selected in the third phase based on cooperative game theory. Using the original CICIoT2023 dataset and a dataset containing only novel features, we train and evaluate a variety of machine learning classifiers. On the original dataset, Random Forest achieved the highest accuracy of 99%. Still, with novel features, Random Forest's performance dropped only slightly (96%) while other models achieved significantly lower accuracy. As a whole, the work makes substantial contributions to tailored feature engineering, feature selection, and rigorous benchmarking of IoT intrusion detection techniques. IoT networks and devices face continuously evolving threats, making it necessary to develop robust intrusion detection systems.
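Cooperative-game-theoretic feature selection is typically based on Shapley values, which credit each feature with its average marginal contribution across all coalitions. A minimal exact-Shapley sketch (feasible only for a handful of features) follows; the coalition "accuracies" are a made-up lookup table, not results from CICIoT2023.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley value of each feature under a coalition value function
    (cost is exponential in the feature count; fine for a toy example)."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for r in range(n):
            for S in combinations(others, r):
                # weight of coalitions of this size in the Shapley average
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (value(frozenset(S) | {f}) - value(frozenset(S)))
        phi[f] = total
    return phi

# Hypothetical coalition value: validation accuracy of a classifier trained
# on each feature subset (a made-up lookup, not CICIoT2023 results).
ACC = {
    frozenset(): 0.50, frozenset({"a"}): 0.70, frozenset({"b"}): 0.60,
    frozenset({"c"}): 0.52, frozenset({"a", "b"}): 0.90,
    frozenset({"a", "c"}): 0.72, frozenset({"b", "c"}): 0.62,
    frozenset({"a", "b", "c"}): 0.91,
}
phi = shapley_values(["a", "b", "c"], lambda s: ACC[frozenset(s)])
selected = sorted(phi, key=phi.get, reverse=True)[:2]  # keep top contributors
print(selected)  # ['a', 'b']
```

By the efficiency axiom, the values sum to the gain of the full feature set over the empty set, which gives a quick sanity check on any implementation.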
Enhancing Traffic Routing Inside a Network through IoT Technology & Network C... (IJCNCJournal)
IoT networking uses real-world objects as stationary or mobile nodes, and mobile nodes complicate networking. Internet of Things (IoT) networks carry a large number of control overhead messages because devices are mobile. These messages are generated by the constant flow of control data such as device identity, geographical position, node mobility, device configuration, and others. Network clustering is a popular method for managing this communication overhead, and many cluster-based routing methods have been developed to address system restrictions. Node clustering based on the Internet of Things (IoT) protocol may be used to cluster all network nodes according to predefined criteria. Each cluster has a Smart Designated Node (SDN), which makes cluster management efficient. Many intelligent nodes remain in the network, and the network design spreads these signals. This paper presents an intelligent and responsive routing approach for clustered nodes in IoT networks, in which an existing method builds a new sub-area clustered topology. The Nodes Clustering Based on the Internet of Things (NCIoT) method improves message transmission between any two nodes, which will facilitate the secure and reliable interchange of healthcare data between professionals and patients. NCIoT organizes nodes in the Internet of Things (IoT) by grouping them together based on their proximity, and it also picks SDN routes for these nodes. This approach involves selecting one option from a range of choices and preparing for likely outcomes; addressing limitations on activities is a primary focus during the review process. Predictive inquiry employs the process of analyzing data to forecast and anticipate future events. This paper also provides an explanation of compact units. The Predictive Inquiry Small Packets (PISP) mechanism improves the backup system and partners with the SDN to establish a routing information table for each intelligent node, resulting in higher routing performance.
Both primary and secondary routes are available for use. The simulation findings indicate that NCIoT algorithms outperform CBR protocols. Enhancements lead to a substantial 78% boost in network performance. In addition, end-to-end latency dropped by 12.5%. The PISP methodology produces 5.9% more inquiry packets compared to alternative approaches. The algorithms are constructed and evaluated against academic ones.
Multipoint Relay Path for Efficient Topology Maintenance Algorithm in Optimiz... (IJCNCJournal)
The Optimized Link State Routing (OLSR) protocol employs multipoint relay (MPR) nodes to disseminate topology control (TC) messages, enabling network topology discovery and maintenance. However, this approach increases control overhead and wastes network bandwidth in stable topology scenarios due to fixed flooding periods. To address these challenges, this paper presents an Efficient Topology Maintenance Algorithm (ETM-OLSR) for enhanced link-state routing protocols. By reducing the number of MPR nodes, TC message generation and forwarding frequency are minimized. Furthermore, the algorithm selects a smaller subset of TC messages based on changes in the MPR selection set from the previous cycle, adapting to stable and fluctuating network conditions. Additionally, the sending cycle of TC messages is dynamically adjusted in response to network topology changes. Simulation results demonstrate that the ETM-OLSR algorithm effectively reduces network control overhead, minimizes end-to-end delay, and improves network throughput compared to traditional OLSR and HTR-OLSR algorithms.
This Gasta posits a strategic approach to integrating AI into HEIs to prepare staff, students and the curriculum for an evolving world and workplace. We will highlight the advantages of working with these technologies beyond the realm of teaching, learning and assessment by considering prompt engineering skills, industry impact, curriculum changes, and the need for staff upskilling. In contrast, not engaging strategically with Generative AI poses risks, including falling behind peers, missed opportunities and failing to ensure our graduates remain employable. The rapid evolution of AI technologies necessitates a proactive and strategic approach if we are to remain relevant.
Macroeconomics- Movie Location
This will be used as part of your Personal Professional Portfolio once graded.
Objective:
Prepare a presentation or a paper using research, basic comparative analysis, data organization and application of economic information. You will make an informed assessment of an economic climate outside of the United States to accomplish an entertainment industry objective.
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...Levi Shapiro
Letter from the Congress of the United States regarding Anti-Semitism sent June 3rd to MIT President Sally Kornbluth, MIT Corp Chair, Mark Gorenberg
Dear Dr. Kornbluth and Mr. Gorenberg,
The US House of Representatives is deeply concerned by ongoing and pervasive acts of antisemitic
harassment and intimidation at the Massachusetts Institute of Technology (MIT). Failing to act decisively to ensure a safe learning environment for all students would be a grave dereliction of your responsibilities as President of MIT and Chair of the MIT Corporation.
This Congress will not stand idly by and allow an environment hostile to Jewish students to persist. The House believes that your institution is in violation of Title VI of the Civil Rights Act, and the inability or
unwillingness to rectify this violation through action requires accountability.
Postsecondary education is a unique opportunity for students to learn and have their ideas and beliefs challenged. However, universities receiving hundreds of millions of federal funds annually have denied
students that opportunity and have been hijacked to become venues for the promotion of terrorism, antisemitic harassment and intimidation, unlawful encampments, and in some cases, assaults and riots.
The House of Representatives will not countenance the use of federal funds to indoctrinate students into hateful, antisemitic, anti-American supporters of terrorism. Investigations into campus antisemitism by the Committee on Education and the Workforce and the Committee on Ways and Means have been expanded into a Congress-wide probe across all relevant jurisdictions to address this national crisis. The undersigned Committees will conduct oversight into the use of federal funds at MIT and its learning environment under authorities granted to each Committee.
• The Committee on Education and the Workforce has been investigating your institution since December 7, 2023. The Committee has broad jurisdiction over postsecondary education, including its compliance with Title VI of the Civil Rights Act, campus safety concerns over disruptions to the learning environment, and the awarding of federal student aid under the Higher Education Act.
• The Committee on Oversight and Accountability is investigating the sources of funding and other support flowing to groups espousing pro-Hamas propaganda and engaged in antisemitic harassment and intimidation of students. The Committee on Oversight and Accountability is the principal oversight committee of the US House of Representatives and has broad authority to investigate “any matter” at “any time” under House Rule X.
• The Committee on Ways and Means has been investigating several universities since November 15, 2023, when the Committee held a hearing entitled From Ivory Towers to Dark Corners: Investigating the Nexus Between Antisemitism, Tax-Exempt Universities, and Terror Financing. The Committee followed the hearing with letters to those institutions on January 10, 202
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
High Performance NMF Based Intrusion Detection System for Big Data IOT Traffic
International Journal of Computer Networks & Communications (IJCNC) Vol.16, No.2, March 2024
DOI: 10.5121/ijcnc.2024.16203
HIGH PERFORMANCE NMF BASED INTRUSION
DETECTION SYSTEM FOR BIG DATA IOT TRAFFIC
Abderezak Touzene, Ahmed Al Farsi, Nasser Al Zeidi
Department of Computer Science, College of Science, Sultan Qaboos University, Oman
ABSTRACT
With the emergence of smart devices and the Internet of Things (IoT), millions of users connected to the
network produce massive network traffic datasets. Such vast volumes of network traffic (Big Data) are
challenging to store, process, and analyse on a single computer. In this paper we develop a parallel
implementation on a High Performance Computer (HPC) of the Non-Negative Matrix Factorization
technique as the engine of an Intrusion Detection System (HPC-NMF-IDS). The large IoT traffic datasets,
of the order of millions of samples, are distributed evenly over all the computing cores, for both storage
and speed-up purposes. The distribution of the computing tasks involved in the matrix factorization is
designed to reduce the communication cost between the computing cores. The experiments we conducted on
the proposed HPC-NMF-IDS give better results than traditional ML-based intrusion detection systems: we
could train the model on a dataset of one million samples in only 31 seconds instead of 40 minutes on a
single processor, a speed-up of 87 times. Moreover, we obtained an excellent detection accuracy rate of
98% on the KDD dataset.
KEYWORDS
Intrusion Detection Systems, Machine Learning, Dimensionality Reduction, High Performance Computing,
IoT traffic.
1. INTRODUCTION
Based on reports published by cybersecurity institutions in several countries worldwide, network
cyber-attacks have increased exponentially in recent decades. We are witnessing the era of the Fourth
Industrial Revolution (4IR) and its emerging technologies such as the Internet of Things, Quantum
Computing, and Artificial Intelligence: millions of users have become connected to the Internet, and
hundreds of millions of devices connected to the network produce millions of network traffic records.
Storing and analysing these massive network traffic datasets on a single computer becomes difficult and
highly inefficient, especially for real-time detection and prevention of traffic attacks.
Many machine learning algorithms can handle relatively large datasets by applying dimensionality
reduction techniques for faster analytics, but they become very slow as the dataset grows to Big Data
scale. It is therefore necessary to use a High Performance Computer and parallel implementations of
machine learning algorithms to overcome both the storage and speed limitations.
This paper focuses on parallel Nonnegative Matrix Factorization on a High Performance Computer (HPC),
using the Message Passing Interface (MPI) for inter-processor communication, to implement an efficient
real-time intrusion detection system for Big Data analytics on large-scale IoT traffic datasets.
Nonnegative Matrix Factorization (NMF) is a numerical approximation method that decomposes a data
matrix A into simpler lower-rank factor matrices W and H. NMF is an unsupervised machine learning
technique widely used in data mining, dimension
reduction, clustering, factor analysis, text mining, computer vision, bioinformatics, image recognition,
and recommendation systems, to name a few. In contrast to Singular Value Decomposition (SVD) and
Principal Component Analysis (PCA), NMF requires that A, W, and H be nonnegative. For many real-world
data, non-negativity is inherent, so the factors have a natural interpretation, which is one of the
advantages of NMF over PCA and SVD.
Formally, the Nonnegative Matrix Factorization problem is to find two low-rank factor matrices
W ∈ R+^(m×k) and H ∈ R+^(k×n) for a given nonnegative matrix A ∈ R+^(m×n), such that A ≈ WH. The most
common optimization techniques include Hierarchical Alternating Least Squares (HALS), Multiplicative
Updates (MU), Stochastic Gradient Descent, and Alternating Nonnegative Least Squares with Block
Principal Pivoting (ANLS-BPP), all of which alternately optimize W and H while keeping the other fixed.
1.1. Intrusion Detection System Background
This section discusses the background of Intrusion detection systems, including their definition
and diverse types. Moreover, it discusses several papers on machine learning IDS algorithms.
1.1.1. Intrusion Detection System (IDS)
An Intrusion Detection System is a hardware device or software that observes systems for malicious
network traffic or policy violations. The purpose of an IDS is to detect the various types of malicious
network traffic or malicious computer use that a firewall cannot recognize. This is critical to achieving
high protection against actions threatening the availability, integrity, or confidentiality of computer
systems [1].
1.1.2. Types of Intrusion Detection Systems (IDS)
There are many classifications of intrusion detection systems (see Figure 1). The following
classification, based on the data collection method, has been used extensively in previous studies:
1. The Network Intrusion Detection System (NIDS) observes and analyses data traffic to detect an
attack or malicious behaviour.
2. The Host-based Intrusion Detection System (HIDS) monitors and analyses data from log files.
Based on the detection technique, an IDS can be categorized into three main categories:
specification-based IDS, anomaly-based IDS, and signature-based IDS.
Figure 1: Intrusion Detection Systems Classification
Signature-based (Misuse) Intrusion Detection Systems: Signature-based intrusion
detection analyses the network traffic, searching for events or combinations of events similar to
a predefined pattern that describes a known attack.
Advantages: Signature-based Intrusion Detection Systems effectively detect intrusions
with almost no false detections.
Disadvantages: Signature-based IDS can only identify known attacks, requiring constant
updating of the attack signatures. Because their detectors are trained for specific types of
attacks, they may fail to detect new kinds of attacks.
Anomaly-based Intrusion Detection Systems: Anomaly-based intrusion detection
systems identify abnormal behaviour on a network. Attacks commonly differ from regular
legitimate network traffic, so the IDS can detect them by analysing these changes and
differences. Anomaly IDS are trained on normal network traffic from collected historical data,
so they can spot abnormal behaviours easily.
Advantages: Anomaly-based Intrusion Detection Systems can detect abnormal behaviour, so
they can detect an attack without any prior knowledge of it, from its behaviour alone.
Disadvantages: Because of variations in user and network behaviour, anomaly-based
IDS may fire many false alarms, and anomaly detection approaches must be trained on huge
datasets of normal behaviour activities.
1.2. Motivation
In this paper we aim at overcoming some of the limitations of traditional machine-learning-based
Intrusion Detection Systems (IDS), such as the considerable amount of training and testing required
on Big Data datasets and the need to detect multiple types of attack in real time. In this study we
analyse the performance of ML-based IDS on the KDD and CIC datasets by applying NMF, which reduces
the datasets into lower-rank matrices that can be used for analysing and testing any new network
traffic in real time.
The rest of the paper is structured as follows. Section 2 starts with a brief background on Machine
Learning based IDS and NMF and introduces the related work on which the proposed solution is built.
Section 3 discusses the design of the proposed HPC parallel NMF-based IDS, including the learning and
detection phases. Section 4 is dedicated to the experimental work: it describes the implementation
environment and the datasets used, and presents the performance evaluation of our IDS. Finally,
Section 5 summarizes the paper and highlights some limitations along with future work.
2. BACKGROUND AND RELATED WORK
This section will discuss the background of Machine Learning IDS and the background related to
the Nonnegative Matrix Factorization (NMF).
2.1. Machine Learning-based IDS
Many recent Anomaly Intrusion Detection Systems (AIDS) are based on Machine Learning
methods. Many ML algorithms and methods are used for ML-based IDS, such as neural
networks, nearest neighbour, decision trees, and clustering methods, applied to discover the
meaningful features of IDS datasets [1] [2].
2.1.1. Supervised Learning in Intrusion Detection System
Supervised ML-based IDS techniques identify attacks based on labelled training datasets.
A supervised ML technique is divided into a training phase and a testing phase. In the training phase,
important features are selected and processed, and the model is trained on these datasets.
There are many applications of supervised machine-learning-based IDS. Li et al. [3] used an
SVM classifier with an RBF kernel to classify the KDD 1999 dataset into predefined classes;
from a total of 41 attributes, a subset of features was carefully chosen using a feature
selection approach [3]. K-Nearest Neighbours (KNN) classifier: the k-Nearest Neighbour (k-NN)
method is a typical non-parametric classifier used in machine learning [4]. It assigns an
unlabelled data sample to the class of its k nearest neighbours.
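As an illustration of the k-NN rule just described, here is a minimal NumPy sketch (the data and the function name are our own, not taken from [3] or [4]):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Assign x the majority class among its k nearest training samples."""
    dists = np.linalg.norm(X_train - x, axis=1)        # Euclidean distances
    nearest = np.argsort(dists)[:k]                    # indices of the k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                   # majority vote

# Tiny illustration: two well-separated clusters labelled 0 ("normal") and 1 ("attack").
X = np.array([[0.0, 0.0], [0.1, 0.1], [0.9, 1.0], [1.0, 0.9]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.05, 0.0])))  # -> 0
print(knn_predict(X, y, np.array([0.95, 1.0])))  # -> 1
```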
2.1.2. Unsupervised ML-based Intrusion Detection System
Unsupervised ML can be defined as an ML technique that obtains information of interest from
input data sets without class labels. The input data points are usually treated as a set of random
variables. A standard density model is then generated for the data set. In supervised learning,
output labels are presented and used to train the machine to obtain the desired results for an
unseen data point. By contrast, in unsupervised learning, no label is provided. Instead, the data is
automatically grouped into different categories through the learning process [5].
2.2. Non-Negative Matrix Factorization (NMF)
Non-negative matrix factorization is an algorithm that takes a nonnegative input matrix A
and decomposes it into lower-rank matrices W and H based on a low-rank parameter k. NMF is an
unsupervised machine learning technique commonly used in clustering, dimensionality
reduction, factor analysis, data/text mining, computer vision, bioinformatics, image recognition,
and recommendation systems, to name a few. In contrast to Principal Component Analysis (PCA)
and Singular Value Decomposition (SVD), NMF requires that A, W, and H be nonnegative. For
many real-world data, non-negativity is inherent, so the factors have a natural interpretation,
which is one of the advantages of NMF over PCA and SVD [10] [11].
2.2.1. Foundations of the Non-Negative Matrix Factorization Framework
NMF takes a nonnegative input matrix A ∈ R+^(m×n), where m is the number of rows (features) and n is
the number of columns (samples), and a low-rank parameter k, a positive integer with k < min{m, n}.
NMF algorithms aim to find two low-rank matrices W ∈ R+^(m×k) and H ∈ R+^(k×n) such that A ≈ WH.
NMF aims to minimize the following cost function:

min f(W, H) = || A − WH ||_F^2 (1)
subject to W ≥ 0 (2)
and H ≥ 0 (3)
The most common optimization techniques include Hierarchical Alternating Least Squares (HALS),
Multiplicative Updates (MU), Stochastic Gradient Descent, and Alternating Nonnegative Least Squares
with Block Principal Pivoting (ANLS-BPP), all of which alternately optimize W and H while keeping the
other fixed. The NMF-IDS system consists of three major phases. In phase 1, the network dataset file is
converted into a two-dimensional matrix A. In phase 2, the matrix A is factorized into the two low-rank
matrices W and H. Phase 3 is the detection phase. The same phases are carried out in the parallel High
Performance Computing version, HPC-NMF-IDS.
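Phase 1 can be sketched as follows (a minimal illustration with made-up numeric records; real traffic files first need the pre-processing described in Section 4.4):

```python
import io
import numpy as np

# Hypothetical numeric traffic records: one row per connection, one column per feature.
csv_data = io.StringIO("0.1,0.0,1.0\n0.2,0.5,0.0\n0.0,0.9,0.3\n")

# Phase 1: convert the dataset file into a two-dimensional nonnegative matrix.
A = np.loadtxt(csv_data, delimiter=",")
print(A.shape)        # -> (3, 3)
print(A.min() >= 0)   # -> True (NMF requires nonnegative entries)
```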
2.2.2. NMF Factorization Phase
Lee and Seung [6] proposed a multiplicative updates (MU) algorithm to solve the NMF
factorization problem, in which the factor matrices W and H are updated using the following
formulas:

W ← W ⊙ (A Hᵀ) ⊘ (W H Hᵀ) (4)
H ← H ⊙ (Wᵀ A) ⊘ (Wᵀ W H) (5)

where Mᵀ denotes the transpose of the matrix M, and ⊙ and ⊘ denote element-wise multiplication and
division. The MU algorithm can be divided into individual smaller sub-problems of matrix products:
in step 1 we update W based on A Hᵀ and W H Hᵀ, then in step 2 we update H based on Wᵀ A and Wᵀ W H.
See Algorithm 1 below.
Algorithm (1)
The while loop in Algorithm 1 stops when the stopping criteria are satisfied: either it reaches the
maximum number of iterations specified by the user, or it converges according to the Frobenius norm
|| A − WH ||_F.
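The MU iteration of Algorithm 1 can be sketched in NumPy as follows (a minimal single-processor illustration, not the paper's HPC code; the small epsilon guarding against division by zero is our addition):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 20, 50, 5                    # features, samples, rank (toy sizes)
A = rng.random((m, n))                 # nonnegative data matrix
W = rng.random((m, k))                 # random nonnegative initial factors
H = rng.random((k, n))
eps = 1e-10                            # guard against division by zero

err_start = np.linalg.norm(A - W @ H, "fro")
for _ in range(200):                   # or stop early on Frobenius-norm convergence
    W *= (A @ H.T) / (W @ H @ H.T + eps)    # step 1: update W (eq. 4)
    H *= (W.T @ A) / (W.T @ W @ H + eps)    # step 2: update H (eq. 5)
err_end = np.linalg.norm(A - W @ H, "fro")

print(err_end < err_start)             # -> True: reconstruction error decreased
```

The multiplicative form preserves nonnegativity automatically: every update multiplies the current factor by a ratio of nonnegative matrices.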
2.2.3. NMF Detection Phase
After the MU algorithm reaches convergence, or reaches the maximum number of iterations specified
by the user, we obtain the factor matrices W and H, which can be used to represent every sample
of A as a weighted linear combination of the columns of W. Each column of W is called a basis, and
the corresponding column of H is called its weights or encoding.
Now, given a new sample y, to check whether it matches one or more of the samples from the training
set, we compute its encoding using W:
We then compare this encoding with every encoding in H. The closest match (sample class) is the
sample whose encoding is closest to that of the new sample (multi-class detection). The matching
score between encodings is computed using the following formula:
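One concrete way to realize this detection step is sketched below (our own illustration: the encoding is recovered by least squares, where in practice a nonnegative least-squares solver would be used, and encodings are compared by cosine similarity as one possible matching score):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 12, 8, 4
W = rng.random((m, k))            # basis columns learned by NMF
H = rng.random((k, n))            # encodings of the n training samples

y = W @ H[:, 3]                   # a "new" sample identical to training sample 3

# Encoding of the new sample: solve min ||W h - y|| (NNLS in practice).
h_new = np.linalg.lstsq(W, y, rcond=None)[0]

def cosine(a, b):
    """Cosine similarity between two encodings."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

scores = [cosine(h_new, H[:, j]) for j in range(n)]
best = int(np.argmax(scores))     # closest match -> its class is the prediction
print(best)                       # -> 3
```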
2.3. Related Work
2.3.1. Serial NMF-based IDS
X. Guan in [7] presented an efficient and fast anomalous intrusion detection model that handles
large amounts of data from different sources. A new method based on non-negative matrix factorization
(NMF) is discussed to characterize program and user behaviours in a computer system. A large
amount of high-dimensional data was collected in their experiments. NMF was used to reduce
the vectors to a smaller length, after which any simple classifier can be applied to the
low-dimensional data instead of the entire dataset. With the low-dimensional features, the model
can easily differentiate between normal and abnormal traffic by using a threshold: any user
behaviour above the threshold is considered an attack.
Limitations: Although the implemented NMF-based IDS gives good accuracy, the datasets
were nonstandard. Moreover, the threshold technique used in the testing phase cannot be
applied to multi-class network attacks.
2.3.2. Parallel NMF
To overcome the low performance of NMF when applied to larger datasets, researchers have proposed
parallel NMF in different forms. We discuss here two approaches: the first is based on MapReduce,
and the second on the Message Passing Interface (MPI).
2.3.2.1. Hadoop MapReduce based Parallel NMF
Yin et al., 2018 [12] proposed scalable distributed Nonnegative Matrix Factorization based on
Hadoop MapReduce for different applications. See Figure 2.
Figure 2: Map Reduce Applications
The authors proposed a technique for the NMF matrix update using block-wise updates in an
efficient MapReduce implementation. Moreover, they propose frequent and lazy block-wise
updates to optimize the operations. They claim that their solution is faster than any existing
MapReduce NMF implementation with a traditional NMF update algorithm.
Limitations:
Although their implementation can handle large files efficiently, the reported NMF running times
are relatively large (see Figure 2). This is mainly because Hadoop MapReduce-based algorithms
involve reading and writing data to and from disk, which hurts performance.
2.3.2.2. Message Passing Interface based NMF
MPI-FAUN by R. Kannan et al. 2018 [13] outperforms the Hadoop implementation, showing
better speed-up results. They tested their algorithm on more extensive datasets, of order
millions × millions, in seconds using MPI-based parallel high-performance NMF. See Figure 3.
Figure 3: MPI based applications
The authors proposed a parallel distributed high-performance NMF framework based on MPI that
iteratively updates the low-rank factors in an alternating fashion. The framework can be combined
with many different NMF update algorithms, giving efficient results for dense and sparse matrices
of massive sizes, up to hundreds of millions of entries. Its parallelism is designed to minimize
communication, and it scales up to more than 1000 cores.
Figure 4: Processors data distribution
Data Distribution:
They divided the matrix W into p blocks of rows (W1, …, Wp), where p is the number of processors,
and the matrix H into p blocks of columns (H1, …, Hp). Then, based on this distribution, the
matrix A is distributed both by rows (A1, …, Ap) and by columns (A^1, …, A^p), as shown in
Figure 4, so that processor i holds column block A^i and row block Ai. Using this distribution,
an alternating update algorithm such as Multiplicative Update (MU) or Hierarchical Alternating
Least Squares (HALS) can be implemented.
Limitations: As far as our research is concerned, the authors only developed a parallel version of
NMF, without testing it on a specific application.
In this paper, we use the same methodology, applying and testing it in the context of developing a
real-time Intrusion Detection System using our university's High Performance Computing facility.
3. PARALLEL NMF-IDS PROPOSED SOLUTION
This section discusses the proposed solution of a distributed parallel NMF-based IDS on the HPC
of Sultan Qaboos University (Luban).
3.1. Proposed Solution
This section discusses the proposed solution to the problem of low performance (speed and
accuracy) of Machine Learning Anomaly Intrusion Detection Systems when dealing with huge
datasets of the order of millions of samples.
3.1.1. NMF based Intrusion Detection System
The KDD99 and CIC datasets (discussed in Section 4.3) are Big Data sets containing millions of
network traffic records. Based on the results of the above-mentioned literature, NMF-based IDS has
proven to give better performance than other ML-based AIDS, in terms of both the speed of
training/testing on high-dimensional datasets and accuracy. Therefore, NMF was selected. An initial
experiment on NMF-based IDS (one processor) showed promising results.
3.1.2. Parallel NMF based Intrusion Detection System
Although NMF-based IDS showed good performance on relatively small datasets, our experiments
showed that it takes a long time on larger datasets. In this study, we solved this issue by
applying a parallel MPI-based NMF implemented on the Luban high-performance computing system
(Section 4.2).
As mentioned in Section 2.2.2, the non-negative matrix factorization algorithm aims to
decompose the input matrix A into the low-rank matrices W and H. Lee and Seung [6]
proposed a multiplicative updates (MU) algorithm to solve the NMF problem. The matrices W
and H are updated using the following formulas:

1. W ← W ⊙ (A Hᵀ) ⊘ (W H Hᵀ)
2. H ← H ⊙ (Wᵀ A) ⊘ (Wᵀ W H)

where Mᵀ denotes the transpose of the matrix M, and ⊙ and ⊘ denote element-wise multiplication and
division. The MU algorithm can be divided into individual smaller sub-problems of matrix products:
in step 1 we update W based on A Hᵀ and W H Hᵀ, then in step 2 we update H based on Wᵀ A and
Wᵀ W H. Looking at the
dimensions of the matrices involved in each product, we can see that Wᵀ W and H Hᵀ have low
dimension (k × k) only, so they can be computed on all processors without distribution to reduce
communication costs. Now, to solve the rest of the operations in parallel, we divided W into
blocks of rows equal to the number of processors (W1, …, Wp) and the matrix H into blocks of
columns (H1, …, Hp). Then, based on this distribution, we distributed the matrix A once by rows
(A1, …, Ap) and once by columns (A^1, …, A^p), as shown in Figure 5, so that processor i has
column block A^i and row block Ai.
Figure 5: Parallel data distribution and communication
With this distribution of data and variables, we can now apply the parallel MU algorithm using only
two communications per iteration, as shown in Algorithm 2. Several parallel instances of steps (4)
and (6) update the blocks Wi and Hi, respectively. In step (3), we apply an MPI all-gather to
collect the parts of the updated matrix H from each processor and distribute the full matrix to
all processors; we do the same for W in step (5).
Algorithm (2): [W, H] = Parallel_NMF(A, k)
○ A is the input matrix, distributed both row-wise and column-wise across the processors
○ k is the rank of the approximation
○ (1): initialize H by processors
○ (2): while stopping criteria are not satisfied do
/* compute W given H */
(3): collect H on each processor using all-gather communication
(4): update the local row block Wi
/* compute H given W */
(5): collect the matrix W on each processor using all-gather communication
(6): update the local column block Hi
(7): end while

The all-gather communication steps collect from all processors the updated row blocks Wi and
column blocks Hi to form the full matrices W and H on each processor, in order to start a new
iteration.
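The block distribution and all-gather pattern of Algorithm 2 can be simulated in a single process (a sketch: each list entry below stands for the data held by one processor, and np.concatenate plays the role of the MPI all-gather):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, k, p = 8, 12, 3, 4               # matrix sizes, rank, number of "processors"
A = rng.random((m, n))
W = rng.random((m, k))
H = rng.random((k, n))
eps = 1e-10

# Row blocks of W and A, column blocks of H and A: one block per processor.
W_blocks = np.array_split(W, p, axis=0)
A_rows = np.array_split(A, p, axis=0)
H_blocks = np.array_split(H, p, axis=1)
A_cols = np.array_split(A, p, axis=1)

err_start = np.linalg.norm(A - W @ H, "fro")
for _ in range(50):
    # Step (3): all-gather H so every processor holds the full matrix.
    H_full = np.concatenate(H_blocks, axis=1)
    HHt = H_full @ H_full.T            # small k x k product, computed everywhere
    # Step (4): each processor updates its own row block of W.
    W_blocks = [Wi * (Ai @ H_full.T) / (Wi @ HHt + eps)
                for Wi, Ai in zip(W_blocks, A_rows)]
    # Step (5): all-gather W.
    W_full = np.concatenate(W_blocks, axis=0)
    WtW = W_full.T @ W_full
    # Step (6): each processor updates its own column block of H.
    H_blocks = [Hi * (W_full.T @ Ai) / (WtW @ Hi + eps)
                for Hi, Ai in zip(H_blocks, A_cols)]

W_full = np.concatenate(W_blocks, axis=0)
H_full = np.concatenate(H_blocks, axis=1)
err_end = np.linalg.norm(A - W_full @ H_full, "fro")
print(err_end < err_start)             # -> True: same descent as the serial MU
```

Because the rows of the W update (and the columns of the H update) are independent, the blocked computation produces exactly the same iterates as the serial MU algorithm.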
4. IMPLEMENTATION OF HPC-NMF-IDS AND EXPERIMENTAL RESULTS
We first explain the implementation environment, namely the software and hardware specifications
of the Luban High Performance Computing system at Sultan Qaboos
University. Then the datasets used in this study are described. After that, the parallel
multiplicative update algorithm is discussed, and the rest of the section presents the
performance results.
4.1. NMF-IDS Methodology
In this study we test our NMF-IDS on two well-known datasets (KDD and CIC) after a
pre-processing phase, a training phase using the NMF factorization (eq. 4, 5), and detection
testing (eq. 7, 8, 9).
Figure 6: HPC-NMF-IDS Methodology
4.2. High Performance Computing (Luban) @ SQU
Luban is the High Performance Computing system of Sultan Qaboos University (SQU), launched
in February 2020. It provides 50 teraflops of computing power delivered by 15 advanced
compute nodes with around 400 terabytes of storage space, all connected by a high-speed
interconnect. The hardware and software specifications of the Luban system are as follows
(see Figure 7):
● Compute Node Specification:
Each compute node (Lenovo ThinkSystem SD530) has the CentOS 7 Linux operating system, dual
20-core Intel Xeon Gold 6230 2.10 GHz CPUs, 197 GB RAM, a 10 Gb Ethernet interface, and a 100 Gb
Intel Omni-Path Architecture (OPA) 100 Series interconnect.
● Login and Master Node Specification:
Each node (Lenovo ThinkSystem SR630) has the CentOS 7 Linux operating system, dual 14-core
CPUs, 197 GB RAM, and 480 TB + 12 GB (SSD) storage.
Figure 7: High Performance Computing Facility HPC Luban @SQU
4.3. Datasets
This section discusses in detail the datasets used in this study and how we pre-process them.
To study the performance of the proposed solution, we applied it and analysed its efficiency
and accuracy on two different datasets, namely KDD and CIC.
KDD99: The KDD dataset was used in an international competition held at the University of
California [8], where the goal was to build an intrusion detection system that can
differentiate between a normal, good connection and a bad connection, called an intrusion or
attack. The KDD dataset contains about 5 million connection records generated from 7 days of
network traffic. It has 41 features, and each record is labelled either Normal or with a specific
type of attack. The KDD attacks can be categorized into Denial of Service (DoS), Remote to Local
(R2L), and User to Root (U2R).
CIC-IDS2017: The second dataset used in this study is CICIDS2017, created by I. Sharafaldin [9]
at the Canadian Institute for Cybersecurity. It is a benchmark dataset consisting of 2,830,743
samples and 78 features. The CIC dataset contains more recent attacks, for example Brute Force
FTP, DoS, infiltration, and DDoS.
4.4. Datasets Pre-Processing
This section explains the pre-processing methods for KDD and CIC datasets before applying
NMF to them to get the best results.
Label Encoding: To apply NMF to a dataset, we must ensure that all elements within the
dataset are non-negative numbers. The KDD dataset contains some features with text values,
namely service, protocol_type, and flag, so they need to be converted to numeric values using
the LabelEncoder from Python's scikit-learn library.
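The behaviour of scikit-learn's LabelEncoder can be sketched in a few lines (distinct values are sorted, then mapped to consecutive integer codes):

```python
def label_encode(values):
    """Map each distinct string to an integer code (sorted order, as LabelEncoder does)."""
    classes = sorted(set(values))
    mapping = {c: i for i, c in enumerate(classes)}
    return [mapping[v] for v in values], classes

# Example with protocol_type-like values from KDD.
codes, classes = label_encode(["tcp", "udp", "icmp", "tcp"])
print(classes)  # -> ['icmp', 'tcp', 'udp']
print(codes)    # -> [1, 2, 0, 1]
```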
Normalization: Some features in the datasets contain large numbers. For example, src_bytes
and dst_bytes from KDD can reach values in the thousands. Also, in the CIC dataset,
Flow_Duration contains values that exceed one million and Destination_Port can reach
thousands. These large values may hurt the model's performance, as the model will be biased
towards them. Therefore, normalization is applied to ensure that all the dataset's values are
in the same range. In this study, we apply min-max normalization to make sure that all values
fall between 0 and 1.
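A minimal sketch of the min-max step, applied per feature column (the numbers below are illustrative, standing in for a large-valued column such as src_bytes):

```python
import numpy as np

# Min-max normalization: rescale a feature column to the [0, 1] range.
x = np.array([0.0, 500.0, 1000.0, 250.0])

x_norm = (x - x.min()) / (x.max() - x.min())
print(x_norm)  # [0.   0.5  1.   0.25]
```

In practice the same rescaling is applied independently to every column of the dataset, so no single large-valued feature dominates the factorization.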
Train/Test Split: We divided the KDD and CIC datasets into several training data sizes to
apply the proposed parallel NMF to them.
Training dataset size: 30K samples
Testing dataset size: 3K samples
The original shape of the datasets is (samples × features); to reduce the number of features
and to get correct results out of NMF, we transpose the input matrix so that it has the shape
(features × samples).
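The split and transpose can be sketched as follows; the sizes match the 30K/3K split described above, and the random matrix is only a stand-in for the pre-processed dataset:

```python
import numpy as np

# Illustrative sizes: 30K training samples, 3K test samples, 41 KDD features.
n_train, n_test, n_features = 30_000, 3_000, 41
data = np.random.rand(n_train + n_test, n_features)

train, test = data[:n_train], data[n_train:]

# NMF is applied to the transposed matrix, so features become the rows.
A = train.T
print(A.shape)  # (41, 30000)
```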
12. International Journal of Computer Networks & Communications (IJCNC) Vol.16, No.2, March 2024
54
4.5. Experiments and Results
4.5.1. Experiment (1) Find Best Rank K for KDD Dataset
To make the most of the NMF algorithm, we must select the rank hyper-parameter K carefully to
ensure the best results, striking a balance between reducing the dimension of the problem and
keeping enough of the features to ensure good detection accuracy. In theory it is difficult to
assess which rank K will be best, so we selected the best rank experimentally by running 1000
iterations of NMF with different values of K, as shown in Figure 8. By analyzing the results
of these experiments, the best rank K was selected, as it gives an accuracy rate reaching up
to 98% in 1000 iterations.
Figure 8: NMF-IDS Best Rank selection for KDD
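The rank sweep can be sketched with scikit-learn's NMF; here reconstruction error is used as the comparison metric for brevity (the paper compares detection accuracy), and the toy matrix, candidate ranks, and iteration count are illustrative:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
A = rng.random((41, 200))  # toy stand-in for the transposed KDD matrix

# Factorize for several candidate ranks K and compare the results.
for k in (2, 5, 10):
    model = NMF(n_components=k, init="random", max_iter=200, random_state=0)
    W = model.fit_transform(A)     # (41, k) basis matrix
    H = model.components_          # (k, 200) coefficient matrix
    err = np.linalg.norm(A - W @ H)
    print(k, round(err, 3))
```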
4.5.2. Experiment (2) Training & Testing on 30K samples KDD
Using the rank K selected in the previous experiment, we implemented NMF on 30,000 samples of
42 features extracted from the KDD dataset.
Table 1. 30K samples KDD dataset results

Iterations   Training Time (s)   Accuracy (%)
100          3.9                 76
200          7.7                 83
400          15.0                91
600          23                  97
800          31                  98
As shown in Table 1, increasing the number of iterations gives better detection accuracy at
the expense of a higher training (factorization) time. NMF on one processor finished 100
iterations in approximately 4 seconds with a detection accuracy of 76%, compared to 800
iterations in approximately 31 seconds with a detection accuracy of 98%.
Table 2. 30K samples KDD dataset parallel results

Number of processors   Training Time (s)   Speedup
1                      39.5                1
4                      3.9                 10.1
8                      6.6                 5.9
32                     9                   4.3
64                     8                   4.9
120                    9.8                 4
Table 2 summarizes the results of running our proposed parallel NMF. We reduced the training
time from 39.5 seconds on 1 processor to 3.9 seconds on 4 processors, which corresponds to a
speedup of about 10. This is called super-linear speedup, as it exceeds the number of
processors (4). Super-linear speedup happens here because, when NMF runs on one processor
with a dataset that may not fit into main memory, the time-consuming use of virtual memory
(paging to disk) slows the sequential run down. On the other hand, we noticed unexpected
behaviour: as we increase the number of processors further, the speedup decreases. This
happens because the cost of communication between processors overwhelms the gain in
processing when the dataset is small.
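The speedup and parallel efficiency figures discussed above follow directly from the timings in Table 2:

```python
# Speedup S(p) = T(1)/T(p) and efficiency E(p) = S(p)/p from Table 2.
t1 = 39.5                          # 1-processor training time (s)
times = {4: 3.9, 8: 6.6, 32: 9.0}  # processors -> training time (s)

for p, tp in times.items():
    speedup = t1 / tp
    efficiency = speedup / p
    print(p, round(speedup, 1), round(efficiency, 2))
# For p = 4 the speedup is ~10.1 > 4, i.e. super-linear (efficiency > 1),
# consistent with the memory/paging explanation above.
```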
4.5.3. Experiment (3) Training & Testing on 1M samples KDD
In this experiment we applied parallel NMF to one million samples of KDD; after one thousand
iterations we obtained a detection accuracy of 97%. The results using different numbers of
processors are shown in Table 3.
Table 3. 1M samples KDD dataset results

Number of processors   Training Time (s)   Speedup
1                      2432                -
4                      977                 2.5
8                      550                 4.4
32                     259                 9.3
64                     167.7               14.5
120                    159                 15.3
240                    61                  39.9
320                    39                  62.3
420                    31                  87.5
As we can see from Table 3, training the model on one processor took 2432 seconds,
approximately 40 minutes. Using the parallel NMF on 420 cores, we reduced this to only 31
seconds, a speedup of 87.5 times.
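The block-wise parallelization can be sketched in serial NumPy, simulating each worker's local computation and the all-reduce-style sum that combines their partial products. This is a hedged illustration of the pattern using the standard Lee and Seung multiplicative updates, not the paper's MPI implementation, and all sizes and the 4-worker split are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k, workers = 41, 400, 5, 4   # features, samples, rank, worker count

# Columns of A and H are split across workers; W is replicated on all of them.
A_blocks = np.hsplit(rng.random((m, n)), workers)          # each worker's A_j
H_blocks = [rng.random((k, b.shape[1])) for b in A_blocks]  # each worker's H_j
W = rng.random((m, k))
eps = 1e-9  # guard against division by zero

for _ in range(50):
    # H update is purely local on each worker: no communication needed.
    for j, A_j in enumerate(A_blocks):
        H_j = H_blocks[j]
        H_blocks[j] = H_j * (W.T @ A_j) / (W.T @ W @ H_j + eps)
    # W update needs sums of per-worker partial products (the all-reduce step).
    num = sum(A_j @ H_j.T for A_j, H_j in zip(A_blocks, H_blocks))
    den = sum(H_j @ H_j.T for H_j in H_blocks)
    W = W * num / (W @ den + eps)

err = np.linalg.norm(np.hstack(A_blocks) - W @ np.hstack(H_blocks))
print(round(err, 2))
```

Only the small k-by-k and m-by-k partial products cross worker boundaries, never the data blocks themselves, which is why the communication cost stays low relative to the computation on large datasets.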
4.5.4. Experiment (4) Find Best Rank K for CIC Dataset
As the best selection of the rank K depends on the dataset, in this experiment we performed
different runs of NMF with different values of K for the CIC dataset, each for 1000
iterations, as shown in Figure 9. By analyzing the results of these experiments, the best
rank K was selected, as it gives an accuracy rate reaching up to 90% in 1000 iterations.
Figure 9: NMF-IDS Best Rank selection for CIC
4.5.5. Experiment (5) Training on 30K samples CIC
Using the rank K selected in experiment 4, we implemented NMF on 30,000 samples of CIC.
Running different numbers of iterations, we obtained the results shown in Table 4.
Table 4. 30K samples CIC dataset results

Iterations   Training Time (s)   Accuracy (%)
100          20.5                66
200          41.0                74
400          82.1                84
600          124.0               87
800          164.0               90
As shown in Table 4, increasing the number of iterations improves the accuracy rate, but at
an additional training cost. NMF-IDS reaches a detection accuracy of 90% in 164 seconds.
4.5.6. Experiment (6) Training on 30K samples CIC parallel
Using the best rank K from experiment 4, we implemented parallel NMF on 30,000 samples of
CIC. From Table 5, we notice behaviour similar to the KDD dataset in terms of speedup, due to
the relatively small size of the dataset.
Table 5. 30K samples CIC dataset parallel results

Number of processors   Training Time (s)   Speedup
1                      164                 1
4                      36                  4.5
8                      19                  8.6
32                     8                   20.5
64                     10                  16.4
120                    12                  13.6
4.5.7. Experiment (7) Training on 1M samples CIC parallel
Using the rank K selected in experiment 4, we implemented parallel NMF on one million samples
of CIC. Table 6 shows a better speedup compared to the smaller 30K dataset. Using 120
processors, our parallel HPC-NMF-IDS reduces the training time from 1.6 hours to 3.45 minutes.
Table 6. One million samples CIC dataset results

Number of processors   Training Time (s)   Speedup
1                      5953                1
4                      1778                3.3
8                      1126                5.2
32                     521                 11.4
64                     302                 19.7
120                    207                 28.8
5. CONCLUSION
Millions of users and smart devices connected to the network produce millions of network
traffic records, and storing and analyzing those traffic datasets is a challenging task. In
this paper we proposed a parallel and distributed intrusion detection system based on
dimensionality reduction, using a High Performance Computing facility for Non-Negative Matrix
Factorization, able to analyze large IoT traffic datasets efficiently. To achieve higher
speedups using as many cores of the HPC as possible, the NMF algorithm distributes the blocks
of rows and columns of the matrices A, W, and H, taking into account data locality and the
minimization of communication between the computing nodes. Unlike previous work, which
focused only on binary classification of the network traffic, our implementation can detect
multiple classes of network attacks. Experimental results show a detection accuracy of 98%
for the KDD dataset and 90% for the CIC dataset. In terms of efficiency of the HPC
implementation, we could train our model on a KDD dataset on the order of a million samples
in only 31 seconds, whereas the sequential implementation (one processor) took approximately
40 minutes, a speedup of 87 times.
In future work, we will investigate different approaches for data distribution to make the
most of the parallelism and reduce the communication overhead to the minimum possible. We
will also investigate different update methods for NMF, including Block Principal Pivoting
(BPP) and Hierarchical Alternating Least Squares (HALS), which may give faster results by
reducing the number of iterations and thus the computation cost.
REFERENCES
[1] L. Xiao, X. Wan, X. Lu, Y. Zhang, and D. Wu, “IoT Security Techniques Based on Machine
Learning: How Do IoT Devices Use AI to Enhance Security?”, IEEE Signal Process. Mag., vol.
35, no. 5, pp. 41–49, 2018.
[2] N. Kshetri and J. Voas, “Hacking Power,” pp. 91–95, Dec. 2017.
[3] Y. Li, J. Xia, S. Zhang, J. Yan, X. Ai, and K. Dai, “An efficient intrusion detection
system based on support vector machines and gradually feature removal method,” Expert Syst.
Appl., vol. 39, no. 1, pp. 424–430, 2012.
[4] W. C. Lin, S. W. Ke, and C. F. Tsai, “CANN: An intrusion detection system based on
combining cluster centers and nearest neighbors,” Knowledge-Based Syst., vol. 78, no. 1, pp.
13–21, 2015.
[5] A. Khraisat, I. Gondal, P. Vamplew, and J. Kamruzzaman, “Survey of intrusion detection
systems: techniques, datasets and challenges,” Cybersecurity, vol. 2, no. 1, 2019.
[6] D. Lee and H. S. Seung, “Algorithms for Non-Negative Matrix Factorization,” in Advances
in Neural Information Processing Systems, 2000, pp. 556–562.
[7] X. Guan, W. Wang, and X. Zhang, “Fast intrusion detection based on a non-negative matrix
factorization model,” J. Netw. Comput. Appl., vol. 32, no. 1, pp. 31–44, 2009.
[8] S. Stolfo, “KDD-99 Dataset,” online, 1999.
http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html (accessed Dec. 24, 2022).
[9] I. Sharafaldin, A. H. Lashkari, and A. A. Ghorbani, “Toward generating a new intrusion
detection dataset and intrusion traffic characterization,” in Proc. 4th Int. Conf. Inf. Syst.
Secur. Priv. (ICISSP 2018), 2018, pp. 108–116.
[10] R. Hedjam, A. Abdesselam, and F. Melgani, “NMF with feature relationship preservation
penalty term for clustering problems,” Pattern Recognit., vol. 112, 2021.
[11] T. Masuda, T. Migita, and N. Takahashi, “An Algorithm for Randomized Nonnegative Matrix
Factorization and Its Global Convergence,” in 2021 IEEE Symposium Series on Computational
Intelligence (SSCI), Orlando, FL, USA, 2021, pp. 1–7.
[12] J. Yin, L. Gao, and Z. Zhang, “Scalable Distributed Nonnegative Matrix Factorization
with Block-Wise Updates,” IEEE Trans. Knowl. Data Eng., vol. 30, no. 6, pp. 1136–1149, 2018,
doi: 10.1109/TKDE.2017.2785326.
[13] R. Kannan, G. Ballard, and H. Park, “MPI-FAUN: An MPI-Based Framework for
Alternating-Updating Nonnegative Matrix Factorization,” vol. 30, no. 3, pp. 544–558, 2018.