Machine learning and artificial intelligence techniques are increasingly being used in cyber security to detect threats like malware, fraud, and intrusions. By analyzing large amounts of data, machine learning algorithms can learn patterns of both normal and anomalous behavior and make predictions about new or unseen data. This allows threats to be identified more accurately and in real-time without being explicitly programmed. Some key benefits of machine learning for cyber security include improved spam filtering, malware detection, identifying advanced threats, and detecting insider threats and data leaks. It is helping to address challenges of data overload, speed of threats, and unknown threats that traditional rule-based detection was unable to handle effectively.
2. Machine Learning
What is it?
[Diagram labels: Machine Learning, Artificial Intelligence, Data Mining, Statistics]
• Machine Learning is an application of Artificial Intelligence (AI) that allows computers to learn without being explicitly programmed to do so.
• Machine learning is the modern science of finding patterns and making predictions from data, based on work in multivariate statistics, data mining, pattern recognition and advanced/predictive analytics.
Ex: when detecting fraud in the milliseconds it takes to swipe a credit card, machine learning relies not only on information associated with the transaction, such as value and location, but also on historical and social network data for an accurate evaluation of potential fraud.
Why is it useful?
• Manage your team instead of the data
• Innovation
• Discover hidden patterns
• Adaptability
• Predictive analysis
Why now?
• With falling profit margins, rising end-user expectations and increasing competition, companies need to cut costs and improve their offering.
• The ability to extract value from such vast amounts of data has never been cheaper or more effective.
3. Machine Learning: How does it learn?
Machine Learning algorithms are categorised as supervised or unsupervised. Supervised algorithms apply what has been learned from labelled past data to new data. Unsupervised algorithms draw inferences from unlabelled datasets.
Determine Objective
Decide what you would like the machine to handle that has previously been done based on expert knowledge or intuition.
Training Data
Collect and prepare relevant data to support analysis. If the learning objective includes “expert” judgment, also collect the historical “right answers.”
Algorithms
Algorithms learn to recognise patterns in training data. Teach the programme how to know when it is doing well or poorly, and how to self-correct in the future.
Trained Machine
The machine is now trained and ready to spot patterns in real-world examples in order to drive business value.
[Diagram label: Feedback]
Supervised Learning
What? The output variable is specified. The algorithm learns a mapping function from input to output.
Why? To make predictions.
Example: Predicting credit default risk.
OR
Unsupervised Learning
What? The output variable is unspecified, so the algorithm looks for structure in the data.
Why? To describe the hidden distribution or structure of the data.
Example: Customer segmentation and product targeting.
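A minimal sketch of the two learning modes described on this slide, in plain Python. The function names (`train_centroids`, `predict`, `two_means`) and all numbers are illustrative assumptions, not taken from the slides. The supervised part learns a nearest-centroid mapping from labelled examples; the unsupervised part splits unlabelled values into two groups with a tiny one-dimensional k-means (it assumes at least two distinct values).

```python
from statistics import mean

def train_centroids(samples, labels):
    """Supervised: learn one centroid (mean feature vector) per known label."""
    grouped = {}
    for x, y in zip(samples, labels):
        grouped.setdefault(y, []).append(x)
    return {y: tuple(mean(col) for col in zip(*rows)) for y, rows in grouped.items()}

def predict(centroids, x):
    """Assign x to the label whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist(centroids[y]))

def two_means(values, iterations=10):
    """Unsupervised: split unlabelled 1-D values into two clusters (k-means, k=2)."""
    lo, hi = min(values), max(values)  # initial cluster centres
    for _ in range(iterations):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(a), mean(b)  # move each centre to its cluster mean
    return sorted(a), sorted(b)

# Hypothetical labelled data: (debt ratio, missed payments) -> outcome
X = [(0.1, 0), (0.2, 1), (0.8, 4), (0.9, 5)]
y = ["repaid", "repaid", "default", "default"]
model = train_centroids(X, y)
label = predict(model, (0.85, 3))

# Hypothetical monthly spend of six customers, with no labels supplied
segments = two_means([20, 25, 30, 200, 220, 210])
```

Here `label` is a prediction for an unseen applicant, while `segments` is a grouping the algorithm found by itself, which mirrors the credit default and customer segmentation examples on the slide.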
5. How does Machine Learning benefit Cyber Security?
Traditional Cyber Security
Dealt with problems that could be solved with the aid of mathematical models.
E.g. data transformation (cryptography)
Modern Cyber Security
Deals with abstract threats that cannot be solved by mathematical models alone.
E.g. malware detection, intrusion detection, data leakage, spam mitigation, etc.
Solution of Modern Cyber Security
6. How does Machine Learning benefit Cyber Security?
A good example of applying ML: spam filtering
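To make the spam filtering example concrete, here is a hedged sketch of a multinomial Naive Bayes filter in plain Python. The slide does not say which algorithm is meant, so Naive Bayes is an assumption, as are the function names and the four training messages. The sketch also assumes both classes appear in the training data.

```python
import math
from collections import Counter

def train(messages):
    """messages: list of (text, label) pairs with label 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def classify(model, text):
    """Return the label with the higher log-probability (Laplace smoothing)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training messages
training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]
model = train(training)
```

Calling `classify(model, "claim your free money")` scores the message against both classes and returns the more likely one. No rule such as "block messages containing 'free'" is ever written by hand; the word weights come from the labelled examples.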
7. Areas Improved by Machine Learning
• Spam Mitigation
• Malware Detection
• Mitigating Denial of Service Attacks
• Reputation in Cyber Space
• User Identification
• Detecting Identity Theft
• Information Leakage Detection & Prevention
• Social Network Security
• Detecting Advanced Persistent Threats
• Detecting Hidden Channels
8. Cyber Risk Analytics with Machine Learning
KEY CHALLENGES
• Data overload
• Disconnected & low-quality data
• High false positive alerts
• Unknown unknowns: no baseline
• Slow & manual investigation processes
SOLUTIONS
• Focused insight from Big Data
• Managing & rationalizing data
• Machine Learning identifies hidden patterns
• Diagnostics for understanding ‘normal’
• Targeted alerts based on anomalies
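The last two points (understanding ‘normal’, then alerting on anomalies) can be sketched as a simple statistical baseline. This is an illustrative assumption rather than the method behind the slide: the baseline is the mean and standard deviation of past counts, and the 3-sigma threshold, function names and sample numbers are all hypothetical.

```python
from statistics import mean, stdev

def build_baseline(history):
    """Summarise 'normal' behaviour as the mean and standard deviation of past counts."""
    return mean(history), stdev(history)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu  # perfectly flat history: any change is unusual
    return abs(value - mu) / sigma > threshold

# Hypothetical failed-login counts per hour for one host
history = [4, 6, 5, 7, 5, 6, 4, 5]
baseline = build_baseline(history)

# Only values far outside the learned baseline raise an alert
alerts = [v for v in [5, 8, 40] if is_anomalous(v, baseline)]
```

Because the threshold is relative to each host's own history, a busy server and a quiet workstation get different notions of ‘normal’, which is one way to cut down on false positive alerts.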
9. Threat Analytics
Areas
Why we need analytics?
• Cyber security refresh rate
• Custom payloads from attackers
• Servers not the target
• Speed with volume
Dissecting detection systems
• Signature based
• Anomaly engines
• Analytics workbench
• Learning systems
Benefits of ML
• Credible / clean training data
• Positive & timely feedback
• Picking the right features
• Consistent feature variation
• Consistent data pattern
Improvement done by ML
• DNS-based detection
• DDoS / traffic anomaly
• Spam mail filters
• Authentication
• Application modelling
• Threat intelligence
11. Fraud Detection
With regulations evolving in response to the financial crisis, and technology developing at an exponential rate, companies should invest in the latest software to reduce their exposure to risk.
Comparison by method (Summary, Human Involvement, Accuracy, Speed)
Machine Learning
• Summary: Algorithms analyse historical transaction data for each customer to understand their individual spending patterns. They can therefore spot subtle anomalies that indicate fraud. Algorithms self-learn, meaning they quickly adapt to new means of fraud and can stay ahead of fraudsters.
• Human involvement: Low. Automatic; humans maintain the algorithmic models.
• Accuracy: High. Preventive rather than corrective, meaning higher rates of fraud detection and fewer false alarms.
• Speed: High. Real-time, automatic reviews of transactions using vast amounts of data from multiple sources.
Traditional Detection
• Summary: Relies on pattern matching against recognised past fraud types. Transactions are then assessed based on general rules, such as whether the customer is buying abroad. Humans identify trends and manually update their models to account for changes in fraudulent activity.
• Human involvement: High. Requires significant manual analysis and review, with regular updates to fraud systems.
• Accuracy: Medium. Often corrective rather than preventive, with limited use of data, meaning lower detection success rates.
• Speed: Medium. More human involvement, often using audit trails to identify fraud. Less computing power.
Benefits of Machine Learning
• Lower fraud losses
• Lower operational costs
• Improved customer service
• Reduced reputational risk
• Reduced regulatory risk
Credit Card Fraud Detection Scenario
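As a hedged illustration of learning "individual spending patterns", the sketch below keeps one profile per customer and scores each new transaction against that customer's own history. The technique (median plus median absolute deviation, the so-called modified z-score with its conventional 0.6745 constant and 3.5 cutoff) is a swapped-in assumption chosen for brevity, not the method behind the slide. The customer names and amounts are made up.

```python
from statistics import median

def spending_profile(amounts):
    """Per-customer profile: median spend and median absolute deviation (MAD)."""
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    return med, mad

def fraud_score(amount, profile):
    """Modified z-score: how far the amount sits from this customer's usual spend."""
    med, mad = profile
    if mad == 0:
        return 0.0 if amount == med else float("inf")
    return 0.6745 * abs(amount - med) / mad

# Hypothetical transaction histories
history = {
    "alice": [12.0, 15.5, 9.9, 14.0, 11.2],
    "bob": [900.0, 1200.0, 1100.0, 950.0, 1050.0],
}
profiles = {name: spending_profile(a) for name, a in history.items()}

def review(customer, amount, cutoff=3.5):
    """Return True when the transaction should be held for review."""
    return fraud_score(amount, profiles[customer]) > cutoff
```

The same 1000.00 purchase is flagged for `alice` but not for `bob`, which is the difference between per-customer models and a single general rule such as "flag anything over 500".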
12. Improvement of Security Incident
Internet-scale measurement & data collection (external)
• Malicious activities: spam, phishing, scanning
• Network mismanagement, e.g. untrusted HTTPS
• Security incident reports: victims vs. non-victims
Data processing & feature extraction
• Alignment in time & space
• Aggregation at the organisation level
• 258 features: raw data & 1st/2nd order statistics
Advanced data mining & machine learning
• Classifier training
• Correlational analysis
Prediction: the likelihood of a future incident & the type of incident
Understanding causality among features and security inter-dependence
Incentive mechanism design
14. Cognitive Automation: Understanding the Automation Landscape
The IT sector is evolving, which has left it with a web of overly complex procedures built on multiple legacy platforms. Developments in Robotics and Machine Learning mean automation of these processes is now more feasible and powerful than ever.
[Chart axes: Business Impact (Transformational / Tactical) and Nature of Work (Judgement Based / Rules Based)]
Foundation
Simple, ad-hoc, project-level automation that can undertake simple rule-based actions of a single task within an application when prompted (e.g. macros).
Robotic Process Automation
Also rule-based, but robots can respond to external stimuli and have their functions reprogrammed. They can open and move structured data between multiple applications, from legacy systems to third-party APIs (application programming interfaces).
Cognitive Automation
Self-learning, autonomous systems driven by Machine Learning and Natural Language Processing (NLP) that can read and understand unstructured information and instruct a computer to act.
15. Cognitive Automation in Action: Document Processing Example
Cognitive automation has the power to automate many business processes, in particular risk and regulatory reporting.
Step 1: Open email
Step 2: Classify according to type
Step 3: Comprehend & extract relevant information
Step 4: Validate information against rules
Step 5: Populate data into the Enterprise Resource Planning system
Process & Technology: each step is carried out by either Machine Learning & NLP or Robotics.
• Robotics can be thought of as the ‘hand’ work and cognitive as the ‘head’ work. Together they form a powerful alliance and can automate even those processes that involve comprehending unstructured text, recognising voices, or making subjective decisions.
• Benefits of cognitive automation include:
  • Reduced headcount and associated operational costs
  • Decreased cycle times for processes that can operate 24 hours per day (e.g. risk/regulatory reporting)
  • Improved accuracy through the reduction of human error
16. Cognitive Automation: Process Checklist
The following purpose, process and location checklist can be used to help you understand whether Machine Learning can be successfully applied to a process.
Checklist item, followed by why it matters:
• Purpose: Prediction? Supervised learning: algorithms spot trends in historical data and use these to make predictions based on new data.
• Purpose: Segmentation? Unsupervised learning: Machine Learning can spot differences and similarities between data points that are not visible to the human eye, and make sensible groupings based on these characteristics.
• Process: Digital? Processes that involve the use of paper and physical contact between people are not suited to Machine Learning.
• Process: Big Data? Algorithms thrive on large datasets, which yield better results. They also have the computing power to analyse big data at speed.
• Process: Repetitive & Judgement Based? Algorithms learn and improve from each repetition, and the automation of such processes offers huge cost-saving potential.
• Location: Front, Middle & Back Office. The advent of tools such as Natural Language Processing and Speech Recognition means that Machine Learning can be applied to processes with and without customer/client interaction.
Manage your team instead of the data. Machine Learning is based on algorithms that can learn from data without relying on rules-based programming, and its main benefit is the ability to relentlessly analyze data and every combination of variables.
Innovation: Machine Learning is designed to break benchmarks and reset the rules. Agents are not limited to the methods used the previous year, month, or day. Anything goes.
Automatically discover hidden patterns and anomalies within data through a simple visual interface. Instead of reports comprised of static data, get actionable feedback.
Adaptability is the foundation of Machine Learning. Challenges, target metrics and quizzes need to adapt to each individual agent’s pace. Without Machine Learning driving the system, progress is a one-size-fits-all proposition.
Machine Learning is the best model for combining hard science with human behavior. Predictive analysis provides insight into performance plateaus, engagement at work, and loyalty.
This is a predictive analytics approach to forecasting cyber security incidents. We start from Internet-scale measurement of the security postures of network entities. We also collect security incident reports to use as labels in a supervised learning framework. The collected data then goes through extensive processing and domain-specific feature extraction. The features are used to train a classifier that, given new input features, predicts the likelihood of a future incident for the entity associated with those features. We are also actively seeking to understand the causal relationships among different features and the security interdependence among different network entities. Lastly, risk prediction helps us design better incentive mechanisms, which is another facet of our research in this domain.
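The train-then-predict step in this pipeline can be sketched with a small logistic regression written in plain Python. The notes do not name the classifier that was actually used, so logistic regression is an assumption here, as are the two feature names, the six data rows, and the learning-rate and epoch settings.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b

def incident_likelihood(model, x):
    """Estimated probability of a future incident for one entity's features."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Hypothetical features per organisation:
# (share of misconfigured hosts, rate of spam originating from its network)
features = [(0.05, 0.01), (0.10, 0.02), (0.60, 0.30),
            (0.70, 0.40), (0.15, 0.05), (0.80, 0.35)]
incident = [0, 0, 1, 1, 0, 1]  # labels taken from (hypothetical) incident reports

model = train_logistic(features, incident)
risky = incident_likelihood(model, (0.75, 0.38))
safe = incident_likelihood(model, (0.08, 0.01))
```

The output is a likelihood between 0 and 1 rather than a yes/no answer, which is what makes it usable for ranking entities by risk.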
Analysis of data: Traffic can be analyzed at the packet, connection or session level. A connection represents a bidirectional flow, and a session represents multiple connections between the same source and destination. ‘Bro’ (since renamed Zeek) can monitor Transmission Control Protocol (TCP), User Datagram Protocol (UDP) and Internet Control Message Protocol (ICMP) traffic, and write the analyzed traffic to well-structured, tab-separated files suitable for post-processing. The platform interprets UDP and ICMP connections using flow semantics.
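A minimal sketch of post-processing such a tab-separated log in Python. It assumes the usual Bro layout in which metadata lines start with `#` and a `#fields` line names the columns. The sample rows, the helper names `read_bro_log` and `sessions`, and the reduced column set are illustrative assumptions; a real connection log carries more columns.

```python
import io
from collections import defaultdict

def read_bro_log(handle):
    """Yield one dict per record, using the '#fields' header line for column names."""
    fields = None
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            if line.startswith("#fields\t"):
                fields = line.split("\t")[1:]
            continue  # skip all other metadata lines
        if fields is None:
            raise ValueError("no #fields header seen before data rows")
        yield dict(zip(fields, line.split("\t")))

def sessions(records):
    """Group connection IDs into sessions keyed by (source, destination)."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[(rec["id.orig_h"], rec["id.resp_h"])].append(rec["uid"])
    return dict(grouped)

# Hypothetical sample in the tab-separated layout described above
SAMPLE = (
    "#separator \\x09\n"
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tid.resp_h\tid.resp_p\tproto\tservice\n"
    "1500000000.0\tCabc1\t10.0.0.5\t51234\t192.0.2.10\t80\ttcp\thttp\n"
    "1500000001.5\tCabc2\t10.0.0.5\t51235\t192.0.2.53\t53\tudp\tdns\n"
    "1500000002.0\tCabc3\t10.0.0.5\t51236\t192.0.2.10\t80\ttcp\thttp\n"
)
records = list(read_bro_log(io.StringIO(SAMPLE)))
by_session = sessions(records)
```

Reading the column names from the header instead of hard-coding positions keeps the parser working if the log is written with a different set of fields.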
Extraction of features: A log file, for example, contains generic information about each connection, such as the timestamp, connection ID, source IP, source port, destination IP and destination port. This information alone is not enough. To get more out of the network traffic, we need to create features and attributes that help us distinguish between normal and harmful traffic.
Selection of unique features: To add more depth to the analysis, we should determine whether the payload contains shellcode, JavaScript code, SQL commands or SQL injection queries, command injection, or other such content. These features can help the machine detect zero-day and web application attacks. To extract all the features, I limit the extraction process to the data sent by the source of the connection. Most features can be extracted using a regular expression or calculated directly from the connection content. Shellcode is a notable exception, because attackers can encrypt, compress or encode it. To solve this problem, at the suggestion of Dr. Ali Hadi, I used the malware analysis platform Cuckoo Sandbox. Hadi suggested extracting more features from the traffic, such as the sequence of application programming interfaces (APIs).
Creating useful datasets: Now that we have created a good dataset with features to detect advanced attacks, we can use it to train the computer to classify new connections.
Selecting & classifying features: We selected several important, generic features from the wider set to train the computer to recognize the attacks:
Ex: Protocol; Service; Entropy; Number of nonprintable characters; Number of punctuation characters; Contains JavaScript; Contains SQL statement; Contains command injection; and Class.
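A hedged sketch of how the features listed above might be computed from the bytes sent by the source of a connection. The three regular expressions are deliberately simplistic stand-ins written for this example, not the patterns used in the original work, and `extract_features` is a hypothetical helper name. Shellcode detection is left out because, as noted, it needs sandbox analysis rather than a regular expression.

```python
import math
import re
import string
from collections import Counter

# Simplistic illustrative patterns; real detection rules would be far richer
SQL_RE = re.compile(r"\b(select|union|insert|update|delete|drop)\b.+\b(from|into|table|set|select)\b", re.I)
JS_RE = re.compile(r"<script\b|javascript:|\beval\s*\(", re.I)
CMD_RE = re.compile(r"[;&|`]\s*(cat|ls|wget|curl|nc|bash|sh)\b", re.I)

def shannon_entropy(data: bytes) -> float:
    """Bits per byte; high values hint at encrypted, compressed or encoded content."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in Counter(data).values())

def extract_features(payload: bytes, protocol: str, service: str) -> dict:
    text = payload.decode("latin-1")  # one character per byte, never fails
    return {
        "protocol": protocol,
        "service": service,
        "entropy": shannon_entropy(payload),
        "nonprintable": sum(1 for ch in text if ch not in string.printable),
        "punctuation": sum(1 for ch in text if ch in string.punctuation),
        "contains_js": bool(JS_RE.search(text)),
        "contains_sql": bool(SQL_RE.search(text)),
        "contains_cmd_injection": bool(CMD_RE.search(text)),
    }

# Hypothetical request payload carrying a SQL injection attempt
row = extract_features(
    b"GET /item?id=1 UNION SELECT password FROM users HTTP/1.1", "tcp", "http"
)
```

Each connection becomes one flat record like `row`; adding the known Class label (normal or attack) to such records yields the training dataset.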
For the classification, we can use ‘Weka’, a collection of machine learning algorithms for data mining tasks.
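Weka reads datasets in its ARFF text format, so one plausible hand-off is to write the extracted features to an ARFF file. The sketch below is an assumption about how that step could look, not a description of the original workflow. The helper name `to_arff`, the attribute names and the two data rows are hypothetical.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text. attributes: list of (name, 'numeric') or (name, [nominal values])."""
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if kind == "numeric":
            lines.append(f"@attribute {name} numeric")
        else:
            values = ",".join(kind)
            lines.append(f"@attribute {name} {{{values}}}")  # nominal: {a,b,c}
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines) + "\n"

# Hypothetical attribute set mirroring the feature list above
attributes = [
    ("protocol", ["tcp", "udp", "icmp"]),
    ("service", ["http", "dns", "other"]),
    ("entropy", "numeric"),
    ("nonprintable", "numeric"),
    ("punctuation", "numeric"),
    ("contains_js", ["0", "1"]),
    ("contains_sql", ["0", "1"]),
    ("contains_cmd_injection", ["0", "1"]),
    ("class", ["normal", "attack"]),
]
rows = [
    ("tcp", "http", 4.2, 0, 5, 0, 1, 0, "attack"),
    ("udp", "dns", 3.1, 0, 2, 0, 0, 0, "normal"),
]
arff_text = to_arff("connections", attributes, rows)
```

Saving `arff_text` to a file with the `.arff` extension gives a dataset whose last attribute, `class`, can be chosen as the prediction target.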