Classifying Phishing URLs Using Recurrent Neural Networks
Sergio Villegas
Javier Vargas
*Alejandro Correa Bahnsen
Easy Solutions Research
Eduardo Contreras Bohorquez
Fabio A. Gonzalez
MindLab Research Group, Universidad Nacional de Colombia
About Easy Solutions®
A leading global provider of electronic fraud prevention for financial institutions and enterprise customers
• 385 customers in 30 countries
• 100 million users protected
• 27+ billion online connections monitored
Industry recognition
Easy Solutions to be Acquired by New Joint Venture Creating Global, Secure Infrastructure Company
Phishing
Phishing is the act of defrauding an online user in order to obtain personal information by posing as a trustworthy institution or entity.
Typical Phishing Example
Why Phishing Detection is Hard
[Figure panels: Original Website, Only Using Images, Subtle Changes]
Is It Phishing?
Ideal Phishing Detection System
Machine Learning Algorithm
Ideal Phishing Detection System - Issues
Issues with full content analysis:
• Time consuming
• Impractical to process millions of websites per day
• Hard to implement for small devices
There is always the need for a URL
Database of URLs
1,000,000 Phishing URLs from PhishTank
http://moviesjingle.com/auto/163.com/index.php
http://paypal.com.update.account.toughbook.cl/8a30e847925afc5975161aeabe8930f1/?cmd=_home&dispatch=d09b78f5812945a73610edf38
http://msystemtech.ru/components/com_users/Italy/zz/Login.php?run=_login-submit&session=68bbd43c854147324d77872062349924
1,000,000 Legitimate URLs from Common Crawl
https://www.sanfordhealth.org/ChildrensHealth/Article/73980
http://www.grahamleader.com/ci_25029538/these-are-5-worst-super-bowl-halftime-shows&defid=1634182
http://www.carolinaguesthouse.co.uk/onlinebooking/?industrytype=1&startdate=2013-09-05&nights=2&location&productid=25d47a24-6b74
CLASSIFYING PHISHING USING URL LEXICAL AND STATISTICAL FREQUENCIES
URL Lexical and Statistical Frequencies
Example URL: http://www.papaya.com/secure_login.php
Features:
• URL length
• Alexa Ranking
• Path length
• URL Entropy
• # of .com
• Punctuation count
• TLD count
• Is IP?
• Euclidean distance
• KS & KL distance
Is It Phishing?
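A few of the lexical features named on this slide can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the dictionary keys, and the choice of Shannon entropy over characters are assumptions. Alexa Ranking, TLD count, and the Euclidean, KS, and KL distances are left out because the slides do not say how they were computed.

```python
import math
import re
import string
from collections import Counter
from urllib.parse import urlparse

# Loose IPv4 pattern (assumption): four dot-separated groups of 1-3 digits,
# without checking that each group is <= 255.
IPV4_PATTERN = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def shannon_entropy(text):
    """Shannon entropy, in bits per character, of the characters in text."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(url):
    """Compute a hypothetical subset of the slide's URL features."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "path_length": len(parsed.path),
        "url_entropy": shannon_entropy(url),
        "dot_com_count": url.count(".com"),
        "punctuation_count": sum(ch in string.punctuation for ch in url),
        "is_ip": bool(IPV4_PATTERN.match(host)),
    }


features = lexical_features("http://www.papaya.com/secure_login.php")
```

A feature dictionary like this would be turned into one row of a numeric table, which is what a classical classifier consumes.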
URL Lexical and Statistical Frequencies
Results (3-fold CV):
            Accuracy   Recall   Precision
Average     93.47%     93.28%   93.64%
Deviation   0.01%      0.02%    0.03%
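For reference, the three reported metrics and their per-fold summary can be computed as in the sketch below. It assumes label 1 means phishing and that "Deviation" is the sample standard deviation across folds; the slides state neither, and the function names are illustrative.

```python
import statistics


def classification_metrics(y_true, y_pred):
    """Accuracy, recall, and precision for binary labels (1 = phishing, assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


def summarize_folds(scores):
    """Average and sample standard deviation of one metric across CV folds."""
    return statistics.mean(scores), statistics.stdev(scores)


# Toy labels, not data from the study.
metrics = classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
average, deviation = summarize_folds([0.9, 0.8, 1.0])
```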
URL Lexical and Statistical Frequencies
[Figure: feature importance]
MODELING PHISHING URLS WITH RECURRENT NEURAL NETWORKS
Normal Neural Network
Source: https://en.wikipedia.org/wiki/Artificial_neural_network
Recurrent Neural Networks (RNN)
RNNs have loops!
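The loop can be made concrete with a scalar recurrence: the hidden state computed at one step is fed back in at the next. This is a minimal sketch of a standard tanh RNN with single-number weights; the function and parameter names are illustrative and do not come from the slides.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a scalar tanh RNN over a sequence and return every hidden state."""
    h = h0
    states = []
    for x in inputs:
        # The previous hidden state h re-enters the computation: this is the loop.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# The second input is 0, yet the second state is non-zero because the
# first state is carried forward through w_h.
states = rnn_forward([1.0, 0.0], w_x=1.0, w_h=1.0, b=0.0)
```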
The Problem of Long-Term Dependencies
Short-term dependencies are easy; long-term …
Long Short-Term Memory Networks (LSTM)
An RNN contains a single layer
An LSTM contains four interacting layers
Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Long Short-Term Memory Networks (LSTM)
Key idea: Cell State
LSTM Step-by-Step
Step 1. Decide what information is going to be used
Step 2. Decide which new information is stored
Step 3. Update the old cell state
Step 4. Make the prediction
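The four steps can be traced in a scalar version of the standard LSTM cell. Mapping the steps to the forget, input, and output gates is an assumption based on the usual formulation, since the slide text does not name the gates, and the `params` layout is hypothetical. Real layers use weight matrices rather than single numbers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One scalar LSTM step; params maps a gate name to (w_x, w_h, bias)."""

    def pre_activation(gate):
        w_x, w_h, bias = params[gate]
        return w_x * x + w_h * h_prev + bias

    # Step 1: the forget gate scales how much of the old cell state is used.
    f = sigmoid(pre_activation("forget"))
    # Step 2: the input gate and candidate value decide what new information is stored.
    i = sigmoid(pre_activation("input"))
    g = math.tanh(pre_activation("candidate"))
    # Step 3: update the old cell state.
    c = f * c_prev + i * g
    # Step 4: the output gate produces the hidden state used for prediction.
    o = sigmoid(pre_activation("output"))
    h = o * math.tanh(c)
    return h, c


# With all-zero weights every gate equals 0.5 and the candidate is 0.
zero_params = {name: (0.0, 0.0, 0.0) for name in ("forget", "input", "candidate", "output")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=1.0, params=zero_params)
```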
Modeling Architecture for URL Classification
URL characters → One-hot Encoding → Embedding → LSTM → Sigmoid
[Figure: each character of the URL (h, t, t, p, :, /, /, w, w, w, ., p, a, p, a, y, a, ., c, o, m) is one-hot encoded, mapped to an embedding vector, and passed in sequence through LSTM cells that feed a final sigmoid unit]
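The input side of this architecture can be sketched without any deep learning library: each character becomes an integer index, the sequence is padded to a fixed length, and an index corresponds to a one-hot vector. The vocabulary (Python's `string.printable`), the `max_len` default, the left padding, and the function names are all assumptions for illustration; the slides do not give these details.

```python
import string

# Index 0 is reserved for padding; characters outside the vocabulary
# share one extra "unknown" index. Both choices are assumptions.
VOCAB = {ch: i + 1 for i, ch in enumerate(string.printable)}
UNKNOWN = len(VOCAB) + 1
VOCAB_SIZE = len(VOCAB) + 2


def encode_url(url, max_len=150):
    """Map a URL to a fixed-length list of character indices, left-padded with 0."""
    ids = [VOCAB.get(ch, UNKNOWN) for ch in url[:max_len]]
    return [0] * (max_len - len(ids)) + ids


def one_hot(index, size=VOCAB_SIZE):
    """One-hot vector for a single character index."""
    vector = [0] * size
    vector[index] = 1
    return vector


encoded = encode_url("http://www.papaya.com", max_len=30)
first_char_vector = one_hot(encoded[9])  # the 'h' after 9 padding positions
```

In a full model, an embedding layer would replace each index with a learned dense vector before the LSTM consumes the sequence.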
Long Short-Term Memory Networks
Results (3-fold CV):
            Accuracy   Recall   Precision
Average     98.76%     98.93%   98.60%
Deviation   0.04%      0.02%    0.02%
Models Comparison
[Figure: bar chart of Accuracy, Recall, and Precision for the Long Short-Term Memory Network and the Random Forest; vertical axis from 90% to 100%]
Models Comparison
Model                            Memory Consumption (MB)   Evaluation Time (URLs per sec)   Training Time (minutes)
Random Forest                    289                       942                              2.95
Long Short-Term Memory Network   0.56                      281                              238.7
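To put the table in operational terms, the throughput and memory figures can be turned into a rough estimate for a batch of URLs. The batch size of one million is a hypothetical workload chosen for illustration; the other numbers are taken from the table.

```python
# Figures from the comparison table above.
MODELS = {
    "Random Forest": {"memory_mb": 289, "urls_per_sec": 942, "train_minutes": 2.95},
    "LSTM": {"memory_mb": 0.56, "urls_per_sec": 281, "train_minutes": 238.7},
}

BATCH_SIZE = 1_000_000  # hypothetical workload, not from the slides

# Seconds needed to score the whole batch at the reported throughput.
scoring_seconds = {
    name: BATCH_SIZE / stats["urls_per_sec"] for name, stats in MODELS.items()
}

# How many times more memory the Random Forest needs than the LSTM.
memory_ratio = MODELS["Random Forest"]["memory_mb"] / MODELS["LSTM"]["memory_mb"]
```

Under these figures the Random Forest scores the batch in roughly 18 minutes and the LSTM in roughly 59 minutes, while the LSTM needs about 1/500 of the memory.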
What we learned
• The patterns in a URL are a good predictor of phishing websites
• The LSTM model shows higher overall prediction performance without needing expert knowledge to create the features
Free to use
Thank you!
Any questions or comments, please let me know.
Alejandro Correa Bahnsen, PhD
Chief Data Scientist
acorrea@easysol.net

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 

Classifying Phishing URLs Using Recurrent Neural Networks

  • 1. Classifying Phishing URLs Using Recurrent Neural Networks. Sergio Villegas, Javier Vargas, *Alejandro Correa Bahnsen, Easy Solutions Research; Eduardo Contreras Bohorquez, Fabio A. Gonzalez, MindLab Research Group, Universidad Nacional de Colombia
  • 2. About Easy Solutions®: a leading global provider of electronic fraud prevention for financial institutions and enterprise customers. 385 customers in 30 countries; 100 million users protected; 27+ billion online connections monitored. Industry recognition. Easy Solutions to be Acquired by New Joint Venture Creating Global, Secure Infrastructure Company
  • 3. Phishing: the act of defrauding an online user in order to obtain personal information by posing as a trustworthy institution or entity.
  • 5. Why Phishing Detection is Hard: Original Website; Only Using Images; Subtle Changes
  • 6.
  • 7. Ideal Phishing Detection System: Machine Learning Algorithm -> Is It Phishing?
  • 8. Ideal Phishing Detection System - Issues. Issues with full content analysis:
      • Time consuming
      • Impractical to process millions of websites per day
      • Hard to implement for small devices
  • 9. There is always the need for a URL
  • 10. Database of URLs
      1,000,000 Phishing URLs from PhishTank:
        http://moviesjingle.com/auto/163.com/index.php
        http://paypal.com.update.account.toughbook.cl/8a30e847925afc5975161aeabe8930f1/?cmd=_home&dispatch=d09b78f5812945a73610edf38
        http://msystemtech.ru/components/com_users/Italy/zz/Login.php?run=_login-submit&session=68bbd43c854147324d77872062349924
      1,000,000 Legitimate URLs from Common Crawl:
        https://www.sanfordhealth.org/ChildrensHealth/Article/73980
        http://www.grahamleader.com/ci_25029538/these-are-5-worst-super-bowl-halftime-shows&defid=1634182
        http://www.carolinaguesthouse.co.uk/onlinebooking/?industrytype=1&startdate=2013-09-05&nights=2&location&productid=25d47a24-6b74
  • 11. CLASSIFYING PHISHING USING URL LEXICAL AND STATISTICAL FREQUENCIES
  • 12. URL Lexical and Statistical Frequencies. Example URL: http://www.papaya.com/secure_login.php. Features: URL length, Alexa ranking, path length, URL entropy, # of .com, punctuation count, TLD count, is IP?, Euclidean distance, KS & KL distance
  • 13. URL Lexical and Statistical Frequencies. Example URL: http://www.papaya.com/secure_login.php. Features: URL length, Alexa ranking, path length, URL entropy, # of .com, punctuation count, TLD count, is IP?, Euclidean distance, KS & KL distance -> Is It Phishing?
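
A few of the lexical features named on slides 12 and 13 can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' feature extractor: the function names are hypothetical, entropy is assumed to be Shannon entropy over characters, and the Alexa ranking, TLD count, and distance features are left out because the slides do not define how they are computed.

```python
import ipaddress
import math
import string
from collections import Counter
from urllib.parse import urlparse


def shannon_entropy(text):
    """Shannon entropy (bits per character) of the character distribution."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_ip_host(host):
    """True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def lexical_features(url):
    """Hypothetical helper: a subset of the features listed on the slide."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "path_length": len(parsed.path),
        "url_entropy": shannon_entropy(url),
        "dot_com_count": url.count(".com"),
        "punctuation_count": sum(ch in string.punctuation for ch in url),
        "is_ip": is_ip_host(host),
    }


features = lexical_features("http://www.papaya.com/secure_login.php")
print(features)
```

A feature dictionary like this would then be fed to a conventional classifier; the later comparison slides suggest a Random Forest was used for that role.
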
  • 14. URL Lexical and Statistical Frequencies. Results:
      3-Fold CV   Accuracy   Recall   Precision
      Average     93.47%     93.28%   93.64%
      Deviation   0.01%      0.02%    0.03%
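
For readers unfamiliar with how a results table like this is produced, the sketch below computes accuracy, recall, and precision per fold and then averages them. The labels are toy data invented for illustration (1 = phishing, 0 = legitimate), the helper names are hypothetical, and "Deviation" is assumed to mean the standard deviation across folds, which the slide does not state.

```python
from statistics import mean, pstdev


def binary_metrics(y_true, y_pred):
    """Accuracy, recall, and precision, treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


def summarize_folds(fold_metrics):
    """Return {metric: (average, deviation)} across the folds."""
    summary = {}
    for name in fold_metrics[0]:
        values = [m[name] for m in fold_metrics]
        summary[name] = (mean(values), pstdev(values))
    return summary


# Toy (true, predicted) labels for three held-out folds; not the talk's data.
folds = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([1, 0, 1, 0], [1, 0, 1, 1]),
    ([0, 1, 1, 0], [0, 1, 1, 0]),
]
summary = summarize_folds([binary_metrics(t, p) for t, p in folds])
for name, (avg, dev) in summary.items():
    print(f"{name}: average={avg:.2%} deviation={dev:.2%}")
```
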
  • 15. URL Lexical and Statistical Frequencies: Feature Importance
  • 16. MODELING PHISHING URLS WITH RECURRENT NEURAL NETWORKS
  • 17. Normal Neural Network. Source: https://en.wikipedia.org/wiki/Artificial_neural_network
  • 18.
  • 19. Recurrent Neural Networks (RNN): have loops!
  • 20. The Problem of Long-Term Dependencies: short-term dependencies are easy; long-term …
  • 21. Long Short-Term Memory Networks (LSTM): an RNN contains a single layer; an LSTM contains four interacting layers. Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  • 22. Long Short-Term Memory Networks (LSTM). Key idea: Cell State
  • 23. LSTM Step-by-Step. Step 1. Decide what information is going to be used
  • 24. LSTM Step-by-Step. Step 2. Which new information is stored
  • 25. LSTM Step-by-Step. Step 3. Update old cell state
  • 26. LSTM Step-by-Step. Step 4. Make prediction
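
The four steps can be written out as equations. The block below is the standard LSTM formulation, on the assumption that the slides follow the walkthrough in the source cited on slide 21; the symbols are not taken from the slides themselves. Here $x_t$ is the input at step $t$, $h_t$ the hidden state, $C_t$ the cell state, $\sigma$ the logistic sigmoid, and $\odot$ the element-wise product.

```latex
\begin{align*}
% Step 1: forget gate, how much of the old cell state is kept
f_t &= \sigma\left(W_f \, [h_{t-1}, x_t] + b_f\right) \\
% Step 2: input gate and candidate values, which new information is stored
i_t &= \sigma\left(W_i \, [h_{t-1}, x_t] + b_i\right) \\
\tilde{C}_t &= \tanh\left(W_C \, [h_{t-1}, x_t] + b_C\right) \\
% Step 3: update the old cell state
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
% Step 4: output gate and new hidden state, used for the prediction
o_t &= \sigma\left(W_o \, [h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh\left(C_t\right)
\end{align*}
```
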
  • 27. Modeling Architecture for URL Classification (diagram): URL characters (h t t p : / / w w w . p a p a y a . c o m) -> One-hot encoding -> Embedding -> LSTM -> Sigmoid
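
The pipeline in the diagram (characters, encoding, embedding, LSTM, sigmoid) can be traced end to end with a small forward pass in plain Python. This is only a sketch under stated assumptions: the vocabulary, the layer sizes, and all function names are hypothetical, and the weights are random and untrained, so the score it prints carries no meaning. A real model would be trained with a deep learning framework.

```python
import math
import random
import string

# Assumption: printable-ASCII vocabulary; index 0 is reserved for unknown characters.
VOCAB = {ch: idx + 1 for idx, ch in enumerate(string.printable)}
EMBED_DIM = 4   # illustrative sizes only, not the values used in the talk
HIDDEN_DIM = 3


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def encode(url):
    """Map each character to an integer index (the position of its one-hot 1)."""
    return [VOCAB.get(ch, 0) for ch in url]


def lstm_step(x, h, c, weights):
    """One LSTM step over the concatenated vector [h, x]."""
    z = h + x  # list concatenation

    def gate(name, activation):
        matrix, bias = weights[name]
        return [activation(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(matrix, bias)]

    f = gate("forget", sigmoid)
    i = gate("input", sigmoid)
    g = gate("candidate", math.tanh)
    o = gate("output", sigmoid)
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def phishing_score(url, embedding, weights, out_w, out_b):
    """Run the URL through embedding -> LSTM -> sigmoid and return a value in (0, 1)."""
    h = [0.0] * HIDDEN_DIM
    c = [0.0] * HIDDEN_DIM
    for idx in encode(url):
        h, c = lstm_step(embedding[idx], h, c, weights)
    return sigmoid(sum(w * v for w, v in zip(out_w, h)) + out_b)


rng = random.Random(0)  # fixed seed so the untrained example is repeatable
embedding = random_matrix(len(VOCAB) + 1, EMBED_DIM, rng)
weights = {
    name: (random_matrix(HIDDEN_DIM, HIDDEN_DIM + EMBED_DIM, rng), [0.0] * HIDDEN_DIM)
    for name in ("forget", "input", "candidate", "output")
}
out_w = [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN_DIM)]
score = phishing_score("http://www.papaya.com", embedding, weights, out_w, 0.0)
print(f"untrained score: {score:.4f}")
```

Looking up a row of the embedding table by character index is equivalent to multiplying the one-hot vector by the embedding matrix, which is why the sketch never materializes the one-hot vectors.
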
  • 28. Long Short-Term Memory Networks. Results:
      3-Fold CV   Accuracy   Recall   Precision
      Average     98.76%     98.93%   98.60%
      Deviation   0.04%      0.02%    0.02%
  • 29. Models Comparison (bar chart): Accuracy, Recall, and Precision for the Long Short-Term Memory Network vs. Random Forest, on an axis from 90% to 100%
  • 30. Models Comparison
      Model                            Random Forest   Long Short-Term Memory Network
      Memory Consumption (MB)          289             0.56
      Evaluation Time (URLs per sec)   942             281
      Training Time (minutes)          2.95            238.7
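
The trade-off in the slide 30 table is easier to see as ratios. The short calculation below uses only the figures from that table; the variable names are illustrative.

```python
# Figures copied from the comparison table on slide 30.
random_forest = {"memory_mb": 289, "urls_per_sec": 942, "training_min": 2.95}
lstm = {"memory_mb": 0.56, "urls_per_sec": 281, "training_min": 238.7}

memory_ratio = random_forest["memory_mb"] / lstm["memory_mb"]
throughput_ratio = random_forest["urls_per_sec"] / lstm["urls_per_sec"]
training_ratio = lstm["training_min"] / random_forest["training_min"]

print(f"Random Forest needs about {memory_ratio:.0f}x the memory of the LSTM")
print(f"Random Forest evaluates about {throughput_ratio:.2f}x as many URLs per second")
print(f"The LSTM takes about {training_ratio:.1f}x as long to train")
```

In other words, the LSTM is far smaller in memory, while the Random Forest is faster both to evaluate and to train.
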
  • 31. What we learned
      • Patterns in the URL alone are a good predictor of phishing websites
      • The LSTM model shows higher overall prediction performance without needing expert knowledge to create the features
  • 33. Thank you! Any questions or comments, please let me know. Alejandro Correa Bahnsen, PhD Chief Data Scientist acorrea@easysol.net