SlideShare ist ein Scribd-Unternehmen logo
1 von 9
Downloaden Sie, um offline zu lesen
KNOWLEDGECUDDLE PUBLICATION
International Journal of Computer Engineering and Science, August- 2014
1
Efficient String Matching Algorithm for
Intrusion Detection
Bhargavi Patel
Computer department, B.V.M Engineering College, India
bhargavi71291@yahoo.in
__________________________________________________________________________________________
ABSTRACT: Intrusion Detection Systems (IDSs) have become widely recognized as powerful tools for
identifying, deterring and deflecting malicious attacks over the network. Intrusion detection systems (IDSs)
are designed and installed to aid in deterring or mitigating the damage that can be caused by hacking, or
breaking into sensitive IT systems. . The attacks can come from outsider attackers on the Internet, authorized
insiders who misuse the privileges that have been given them and unauthorized insiders who attempt to gain
unauthorized privileges. IDSs cannot be used in isolation, but must be part of a larger framework of IT
security measures. Essential to almost every intrusion detection system is the ability to search through
packets and identify content that matches known attacks. Space and time efficient string matching
algorithms are therefore important for identifying these packets at line rate. In this paper we examine string
matching algorithm and their use for Intrusion Detection.
Keywords: System Design, Network Algorithm
___________________________________________________________________________
I. INTRODUCTION
With each passing day there is more critical data accessible in some form over the network. Any
publicly accessible system on the Internet today will be rapidly subjected to break-in attempts. These attacks
can range from email viruses, to corporate espionage, to general destruction of data, to attacks that hijack
servers from which to spread additional attacks. Even when a system cannot be directly broken into, denial of
service attacks can be just as harmful to individuals, and can cause nearly equal damage to the reputations
of companies that provide services over the Internet. Because of the increasing attacks held by the various users
of the internet, there has been widespread interest in combating these attacks at every level, from end hosts and
network taps to edge and core routers. Intrusion Detection Systems (or IDSs) are emerging as one of the
most promising ways of providing protection to systems on the network.
The Basis for Acquiring Idss
At least three reasons justify the acquisition of IDS. The three are:
1. To provide the means for detecting attacks and other security violations that cannot be prevented.
2. To prevent attackers from probing a network.
3. to document the intrusion threat to an organization.
KNOWLEDGECUDDLE PUBLICATION
International Journal of Computer Engineering and Science, August- 2014
2
As with firewalls, intrusion detection systems are growing in popularity because they provide a site
resilience to attacks without modifying end-node software. While firewalls only limit entry to a network
based on packet headers, intrusion detection systems go beyond this by identifying possible attacks that use
valid packet headers that pass through firewalls. Intrusion detection systems gain this capability by searching
both packet headers and payloads to identify attack signatures.
To define suspicious activities, IDS makes use of a set of rules which are applied to matching packets.
A rule consists at minimum of a type of packet to search, a string of content to match, a location where that
string is to be searched for, and an associated action to take if all the conditions of the rule are met.
In addition, as IDSs move from end-hosts into edge and core routers, the needs placed on algorithms for
intrusion detection will change. While common-case performance can be an acceptable metric for end-hosts
that are based on commodity processors, in order to be successful inside the network infrastructure,
algorithms must satisfy stringent worst-case performance bounds and tight constraints on memory. At the
heart of almost every modern intrusion detection system is a string matching algorithm. String matching is
crucial because it allows detection systems to base their actions on the content that is actually flowing to a
machine. From this sea of packets, the string identifies those packets that contain data matching the
fingerprint of a known attack. Essentially, the string matching algorithm compares the set of strings in the
rule-set to the data seen in the packets that flow across the network.String matching is computationally
intensive. Because string matching dominates the performance in this and many other IDS, in this we
concentrate our efforts on building smaller and faster string matching algorithms. We present optimized
techniques for matching large sets of strings in incoming packets in the context of network intrusion detection.
We characterize the properties of a real set of IDS string matching rules and examine both how the rules
have changed over time, and the effect of those changes on the data structures used. An important
contribution of this work is the development of an algorithm that performs well and has useful bounds on worst
case performance.
II. TYPES OF IDS
There are several types of IDS available. They are characterized by different monitoring and analysis
approaches. Each type has distinct uses, advantages, and disadvantages. IDSs can monitor events at three
different levels: network, host, and application. They can analyze these events using two techniques: signature
detection and anomaly detection. Some IDSs have the ability to respond automatically to attacks that are
detected.
1. Ids Monitoring Approaches
One way to define the types of IDSs is to look at what they monitor. Some IDSs listen on
network backbones and analyze network packets to find attackers. Other IDSs reside on the hosts that they are
defending and monitor the operating system for signs of intrusion. Still others monitor individual
KNOWLEDGECUDDLE PUBLICATION
International Journal of Computer Engineering and Science, August- 2014
3
applications.
1.1 Network base IDs
Network-based IDSs are the most common type of commercial product offering. These
mechanisms detect attacks by capturing and analyzing network packets. Listening on a network
backbone, a single network-based ID can monitor a large amount of information. Network-based IDSs usually
consist of a set of single-purpose hosts that “Sniff” or capture network traffic in various parts of a network and
report attacks to a single management console. Because no other applications run on the hosts that are used by
network-based IDS, they can be secured against attack. Many of them have “stealth” modes, which make it
extremely difficult for an attacker to detect their presence and to locate them.
Advantages: A few well-placed network-based IDSs can monitor a large network. The deployment of
network based IDSs has little impact on the performance of an existing network. Network-based IDSs are
typically passive devices that listen on a network wire without interfering with normal network operation.
Thus, usually, it is easy to retrofit a network to include network-based IDSs with a minimal installation effort.
Network-based IDSs can be made very secure against attack and can even be made invisible to many
attackers.
Disadvantages: Network-based IDSs may have difficulty processing all packets in a large or busy network.
Therefore, such mechanisms may fail to recognize an attack that is launched during periods of high traffic.
IDSs that are completely implemented in hardware are much faster than those that have been totally realized
in software. In addition, the need to analyze packets quickly forces vendors to try and detect attacks
with as few computing resources as possible. This may reduce detection effectiveness.
Many of the advantages of network-based IDSs do not always apply to the more modern switch-based
networks. Switches can subdivide networks into many small segments; this will usually be implemented
with one fast Ethernet wire per host. Switches can provide dedicated links between hosts that are serviced by
the same switch. Most switches do not provide universal monitoring ports. This reduces the monitoring
range of a network-based IDS sensor to a single host. In switches that do provide such monitoring ports,
the single port is frequently unable to mirror all the traffic that is moving through the switch.
Network-based IDs cannot analyze encrypted information. Increasingly, this limitation will become a
problem as the use of encryption, both by organizations and by the attackers, increases. Most network-based
IDSs do not report whether or not an attack was successful. These mechanisms only report that an attack was
initiated. After an attack has been detected, administrators must manually investigate each host that has been
attacked to determine which hosts were penetrated.
1.2 Host based Ids
Host-based IDSs analyze the activity on a particular computer. Thus, they must collect information
KNOWLEDGECUDDLE PUBLICATION
International Journal of Computer Engineering and Science, August- 2014
4
from the host they are monitoring. This allows IDS to analyze activities on the host at a very fine
granularity and to determine exactly which processes and users are performing malicious activities on the
operating system. Some host-based IDSs simplify the administration of a set of hosts by having the
administration functions and attack reports centralized at a single IT security console. Others generate
messages that are compatible with network administration systems.
Advantages Host-based IDSs can detect attacks that are not detectable by network-based IDS because this
type as a view of events that are local to a host. Host-based IDSs can operate in a network that is using
encryption when the encrypted information is decrypted on (or before) reaching the host that is being
monitored. Host based IDSs can operate in switched networks.
Disadvantages: the collection mechanisms must usually be installed and maintained on every host that is
to be monitored. Because portions of these systems reside on the host that is being attacked, host-based
IDSs may be attacked and disabled by a clever attacker. Host-based IDSs are not well-suited for detecting
network scans of all the hosts in a network because the IDS at each host sees only the network packets that the
host receives. Host- based IDSs frequently have difficulty detecting and operating in the face of denial-of-
service attacks. Host based IDSs use the computing resources of the hosts they are monitoring.
1.3 Application based Ids
Application-based IDSs monitor the events that are transpiring within an application. They often detect attacks
by analyzing the application‟s log files. By interfacing with an application directly and having significant
domain or application knowledge, application- based IDSs are more likely to have a more discerning or fine-
grained view of suspicious activity in the application.
Advantages: Application-based IDSs can monitor activity at a very fine granularity, which allows them,
often, to track unauthorized activity to individual users. Application-based IDSs can work in encrypted
environments, because they interface with the application that may be performing encryption.
Disadvantages: Application-based IDSs may be more vulnerable than host-based IDSs to being attacked and
disabled because they run as an application on the host that they are monitoring. The distinction between
application-based IDS and host-based IDS is not always clear. Thus, for the remainder of this article, both
types will be referred to as host- based IDSs.
2. IDS Event Approaches
There are two primary approaches to analyzing computer and networks events to detect attacks: signature
detection and anomaly detection. Signature detection is the primary technique used by most commercial IDS
KNOWLEDGECUDDLE PUBLICATION
International Journal of Computer Engineering and Science, August- 2014
5
products. However, anomaly detection is the subject of much research and is used in limited form by a
number of IDSs.
2.1 Signature based IDSs
Signature-based detection looks for activity that matches a predefined set of events that uniquely describe a
known attack. Signature-based IDSs must be specifically programmed to detect each known attack. This
technique is extremely effective and is the primary method used in commercial products for detecting attacks.
Advantages: Signature-based IDSs are very effective in detecting attacks without generating an
overwhelming number of false alarms.
Disadvantages: Signature-based IDSs must be programmed to detect each attack and thus must be constantly
updated with the signatures of new attacks. Many signatures based IDSs have narrowly defined signatures
that prevent them from detecting variants of common attacks.
2.2 Anomaly based IDS
Anomaly-based IDSs find attacks by identifying unusual behaviour(i.e., anomalies) that occurs on a host
or network. They function on the observation that some attackers behave differently than “normal” users and
thus can be detected by systems that identify these differences. Anomaly-based IDSs establish a baseline of
normal behaviour by profiling particular users or network connections and then statistically measure when the
activity being monitored deviates from the norm. These IDSs frequently produce a large number of false
alarms because normal user and network behaviours can vary widely. Despite this weakness, the researchers
working on applying this technology assert that anomaly-based IDSs are able to detect never-before-seen
attacks, unlike signature based IDSs that rely on an analysis of past attacks. Although some commercial
IDSs include restricted forms of anomaly detection, few, if any, rely solely on this technology.
However, research on anomaly detection IDS products continues.
Advantages: Anomaly-based IDSs detect unusual behaviour and thus have the ability to detect attacks
without having to be specifically programmed to detect them.
Disadvantages: Anomaly detection approaches typically produce a large number of false alarms due to the
unpredictable nature of computing and telecommunication users and networks. Anomaly detection
approaches frequently require extensive “training sets” of system event records to characterize normal
behaviour patterns.
III. EVASION TECHNIQUES
KNOWLEDGECUDDLE PUBLICATION
International Journal of Computer Engineering and Science, August- 2014
6
Fragmentation: by sending fragmented packets, the attacker will be under the radar and can easily by pass the
detection system's ability to detect the attack signature.
Avoiding defaults: The TCP port utilised by a protocol does not always provide an indication to the protocol
which is being transported. For example, IDS may expect to detect a Trojan on port 12345. If an attacker had
reconfigured it to use a different port the IDS may not be able to detect the presence of the Trojan.
Coordinated, low-bandwidth attacks: coordinating a scan among numerous attackers (or agents) and
allocating different ports or hosts to different attackers makes it difficult for the IDS to correlate the captured
packets and deduce that a network scan is in progress.
Address spoofing/proxying: attackers can increase the difficulty of the ability of Security Administrators to
determine the source of the attack by using poorly secured or incorrectly configured proxy servers to bounce an
attack. If the source is spoofed and bounced by a server then it makes it very difficult for IDS to detect the
origin of the attack.
Pattern change evasion: IDS generally rely on „pattern matching‟ to detect an attack. By changing the data
used in the attack slightly, it may be possible to evade detection. For example, an IMAP server may be
vulnerable to a buffer overflow, and IDS is able to detect the attack signature of 10 common attack tools. By
modifying the payload sent by the tool, so that it does not resemble the data that the IDS expect, it may be
possible to evade detection.
IV. STRING MATCHING FOR INTRUSION DETECTION
In the Introduction we motivated the need for string matching in Intrusion Detection Systems. In this
section we further demonstrate how string matching is used in an actual intrusion detection system. We also
examine the state of the art in string matching as it relates to intrusion detection.
Aho-Corasick string-matching algorithm
One of the earliest algorithms in precise multi-pattern string matching is due to Aho- Corasick , which
is able to match strings in worst case time linear in the size of the input. Aho-Corasick works by constructing a
state machine from the strings to be matched. The state machine starts with an empty root node which is the
default non-matching state. Each pattern to be matched adds states to the machine, starting at the root and going
to the end of the pattern. The state fig(2) machine is then traversed and failure pointers are added from each
node to the longest prefix of that node which also leads to a valid node in the trie.(Fig 3). Beyond this basic
notion, there are two choices for the algorithm. We can optimize the data structure further by using the failure
pointers to precompute the next state for every character from every state in the machine (Fig 1), or we can
leave these transitions undefined and traverse the failure pointers at run-time ( Fig 2).If the data structure is
optimized, then Aho- Corasick requires only a single memory reference (albeit a very wide memory reference)
KNOWLEDGECUDDLE PUBLICATION
International Journal of Computer Engineering and Science, August- 2014
7
per character in the input. If the data structure is left unoptimized, one can show via amortizedanalysis that only
two (again wide) memory references per character of input string are required to traverse the data structure. We
use the unoptimized data structure because the undefined pointers allow us significant opportunity for space
optimizations.
V. FIGURES
(1) Pattern matching machine.
(2) Aho-Corasick
KNOWLEDGECUDDLE PUBLICATION
International Journal of Computer Engineering and Science, August- 2014
8
VI. CONCLUSION
Guided by the analogy between IP lookup and string matching, our paper builds on the worst-case
guarantees of the classical Aho-Corasick string matching algorithm. As with multibit tries, Aho-Corasick is the
only string matching algorithm we know of that has deterministic worst-case lookup times and a data structure
friendly enough to use for wire speed hardware matching. Unfortunately, the classical Aho-Corasick data
structure takes more storage than is likely to fit in on-chip SRAM or the cache of a commodity processor.
The principal contribution of our paper is to apply bitmap node compression and path compression to
AhoCorasick to gain both compact storage and worst-case performance. In particular, we show that the use of
such compression gains factors of almost 50 times in database size reductions on current rule sets. While the
case is less clear for software implementations unless more predictable performance is desired; we believe that
our compressed AhoCorasick algorithms are the best choice for hardware implementations
of string matching for IDS using an FPGA, ASIC or network processor designs of the future.
REFERENCES
[1] Deterministic Memory-Efficient String Matching Algorithms for Intrusion Detection- By Nathan Tuck,
Timothy Sherwood, Brad Calder, George Varghese, Department of Computer Science and Engineering,
University of California, San Diego, Department of Computer Science, University of California, Santa
Barbara.
[2] EDPACS-THE EDP Audit, Control, And Security Newsletter, NOVEMBER 2001 Vol.XXIX, No.5
KNOWLEDGECUDDLE PUBLICATION
International Journal of Computer Engineering and Science, August- 2014
9
[3] M. Consens, G. Navarro, “N-Gram Similarity and Distance”. SPIRE 2005, LNCS 3772, 2005, pp. 115-
126
[4] Andreas Wespi, Marc Dacier, Herve Debar. “Intrusion Detection Using Variable-Length Audit Trail
Patterns”. Third International Workshop on the Recent Advances in Intrusion Detection(RAID 2000).
Toulouse, France, 2000, 110-129
[5] Andreas Wespi, Marc Dacier, Herve Debar. “An Intrusion-Detection System Based on the Teiresias Pattern-
Discovery Algorithm”. EICAR Proceedings, 1999, pp. 1-15
[6] Duan You-xiang, Huang Min, Xu Jiuyun, “Initialization Method of Gene Libray Based on Teiresias
Algorithm”. Networks Security, Wireless Communications and Trusted Computing, 2009. NSWCTC 09.
2009, pp, 294-297
[7] Jonassen Inge, “Efficient Discovery of Conserved Patterns using a Pattern Graph”. Comput Appl Biosci,
1997

Weitere ähnliche Inhalte

Was ist angesagt?

Intrusion detection systems
Intrusion detection systemsIntrusion detection systems
Intrusion detection systems
Seraphic Nazir
 
Intrusion Detection and Prevention System in an Enterprise Network
Intrusion Detection and Prevention System in an Enterprise NetworkIntrusion Detection and Prevention System in an Enterprise Network
Intrusion Detection and Prevention System in an Enterprise Network
Okehie Collins
 
Current Studies On Intrusion Detection System, Genetic Algorithm And Fuzzy Logic
Current Studies On Intrusion Detection System, Genetic Algorithm And Fuzzy LogicCurrent Studies On Intrusion Detection System, Genetic Algorithm And Fuzzy Logic
Current Studies On Intrusion Detection System, Genetic Algorithm And Fuzzy Logic
ijdpsjournal
 
Vol 6 No 1 - October 2013
Vol 6 No 1 - October 2013Vol 6 No 1 - October 2013
Vol 6 No 1 - October 2013
ijcsbi
 
TACTiCS_WP Security_Addressing Security in SDN Environment
TACTiCS_WP Security_Addressing Security in SDN EnvironmentTACTiCS_WP Security_Addressing Security in SDN Environment
TACTiCS_WP Security_Addressing Security in SDN Environment
Saikat Chaudhuri
 

Was ist angesagt? (20)

Intrusion detection system
Intrusion detection systemIntrusion detection system
Intrusion detection system
 
REAL-TIME INTRUSION DETECTION SYSTEM FOR BIG DATA
REAL-TIME INTRUSION DETECTION SYSTEM FOR BIG DATAREAL-TIME INTRUSION DETECTION SYSTEM FOR BIG DATA
REAL-TIME INTRUSION DETECTION SYSTEM FOR BIG DATA
 
Intrusion detection systems
Intrusion detection systemsIntrusion detection systems
Intrusion detection systems
 
An Extensive Survey of Intrusion Detection Systems
An Extensive Survey of Intrusion Detection SystemsAn Extensive Survey of Intrusion Detection Systems
An Extensive Survey of Intrusion Detection Systems
 
Intrusion Detection System
Intrusion Detection SystemIntrusion Detection System
Intrusion Detection System
 
Detection of Rogue Access Point in WLAN using Hopfield Neural Network
Detection of Rogue Access Point in WLAN using Hopfield Neural Network  Detection of Rogue Access Point in WLAN using Hopfield Neural Network
Detection of Rogue Access Point in WLAN using Hopfield Neural Network
 
N44096972
N44096972N44096972
N44096972
 
Introduction IDS
Introduction IDSIntroduction IDS
Introduction IDS
 
Intrusion Detection and Prevention System in an Enterprise Network
Intrusion Detection and Prevention System in an Enterprise NetworkIntrusion Detection and Prevention System in an Enterprise Network
Intrusion Detection and Prevention System in an Enterprise Network
 
Current Studies On Intrusion Detection System, Genetic Algorithm And Fuzzy Logic
Current Studies On Intrusion Detection System, Genetic Algorithm And Fuzzy LogicCurrent Studies On Intrusion Detection System, Genetic Algorithm And Fuzzy Logic
Current Studies On Intrusion Detection System, Genetic Algorithm And Fuzzy Logic
 
IPS Product Comparison of Cisco 4255 & TippingPoint 5000E
IPS Product Comparison of Cisco 4255 & TippingPoint 5000EIPS Product Comparison of Cisco 4255 & TippingPoint 5000E
IPS Product Comparison of Cisco 4255 & TippingPoint 5000E
 
Vol 6 No 1 - October 2013
Vol 6 No 1 - October 2013Vol 6 No 1 - October 2013
Vol 6 No 1 - October 2013
 
Intrusion Detection System
Intrusion Detection SystemIntrusion Detection System
Intrusion Detection System
 
Survey on Host and Network Based Intrusion Detection System
Survey on Host and Network Based Intrusion Detection SystemSurvey on Host and Network Based Intrusion Detection System
Survey on Host and Network Based Intrusion Detection System
 
TACTiCS_WP Security_Addressing Security in SDN Environment
TACTiCS_WP Security_Addressing Security in SDN EnvironmentTACTiCS_WP Security_Addressing Security in SDN Environment
TACTiCS_WP Security_Addressing Security in SDN Environment
 
Intrusion Detection System
Intrusion Detection SystemIntrusion Detection System
Intrusion Detection System
 
Autonomic Anomaly Detection System in Computer Networks
Autonomic Anomaly Detection System in Computer NetworksAutonomic Anomaly Detection System in Computer Networks
Autonomic Anomaly Detection System in Computer Networks
 
Computer Security and Intrusion Detection(IDS/IPS)
Computer Security and Intrusion Detection(IDS/IPS)Computer Security and Intrusion Detection(IDS/IPS)
Computer Security and Intrusion Detection(IDS/IPS)
 
A Performance Analysis of Chasing Intruders by Implementing Mobile Agents
A Performance Analysis of Chasing Intruders by Implementing Mobile AgentsA Performance Analysis of Chasing Intruders by Implementing Mobile Agents
A Performance Analysis of Chasing Intruders by Implementing Mobile Agents
 
Intrusion Detection in WLANs
Intrusion Detection in WLANsIntrusion Detection in WLANs
Intrusion Detection in WLANs
 

Andere mochten auch

Hipaa security officer perfomance appraisal 2
Hipaa security officer perfomance appraisal 2Hipaa security officer perfomance appraisal 2
Hipaa security officer perfomance appraisal 2
tonychoper6104
 
Ke ipsos spec_poll_narrative_report _6th_may_2015
Ke ipsos spec_poll_narrative_report _6th_may_2015Ke ipsos spec_poll_narrative_report _6th_may_2015
Ke ipsos spec_poll_narrative_report _6th_may_2015
Ipsos
 
Wilderness slidecast group 3
Wilderness slidecast group 3Wilderness slidecast group 3
Wilderness slidecast group 3
Ian Dandridge
 
Top 8 physical therapist assistant resume samples
Top 8 physical therapist assistant resume samplesTop 8 physical therapist assistant resume samples
Top 8 physical therapist assistant resume samples
RoyKeane789
 

Andere mochten auch (15)

simulation and hardware implementation of grid connected solar charge control...
simulation and hardware implementation of grid connected solar charge control...simulation and hardware implementation of grid connected solar charge control...
simulation and hardware implementation of grid connected solar charge control...
 
Matt Schultz 4.4
Matt Schultz 4.4Matt Schultz 4.4
Matt Schultz 4.4
 
Hipaa security officer perfomance appraisal 2
Hipaa security officer perfomance appraisal 2Hipaa security officer perfomance appraisal 2
Hipaa security officer perfomance appraisal 2
 
Towards a critical history of child protection social work
Towards a critical history of child protection social workTowards a critical history of child protection social work
Towards a critical history of child protection social work
 
Spirito Young People who self produce sexual images.
Spirito Young People who self produce sexual images.Spirito Young People who self produce sexual images.
Spirito Young People who self produce sexual images.
 
Saroj_Mahanta
Saroj_MahantaSaroj_Mahanta
Saroj_Mahanta
 
Question 1
Question 1Question 1
Question 1
 
KSP-New beat system
KSP-New beat systemKSP-New beat system
KSP-New beat system
 
Stc call sheet 1-1
Stc call sheet 1-1Stc call sheet 1-1
Stc call sheet 1-1
 
Ke ipsos spec_poll_narrative_report _6th_may_2015
Ke ipsos spec_poll_narrative_report _6th_may_2015Ke ipsos spec_poll_narrative_report _6th_may_2015
Ke ipsos spec_poll_narrative_report _6th_may_2015
 
The Kenyan Economy: Perceptions and Realities
The Kenyan Economy: Perceptions and Realities  The Kenyan Economy: Perceptions and Realities
The Kenyan Economy: Perceptions and Realities
 
Wilderness slidecast group 3
Wilderness slidecast group 3Wilderness slidecast group 3
Wilderness slidecast group 3
 
Top 8 physical therapist assistant resume samples
Top 8 physical therapist assistant resume samplesTop 8 physical therapist assistant resume samples
Top 8 physical therapist assistant resume samples
 
vocabu
vocabuvocabu
vocabu
 
Context and Support Factors in Elementary and Middle School STEM Programs
Context and Support Factors in Elementary and Middle School STEM ProgramsContext and Support Factors in Elementary and Middle School STEM Programs
Context and Support Factors in Elementary and Middle School STEM Programs
 

Ähnlich wie Efficient String Matching Algorithm for Intrusion Detection

Intrusion detection system
Intrusion detection systemIntrusion detection system
Intrusion detection system
Akhil Kumar
 
Optimized Intrusion Detection System using Deep Learning Algorithm
Optimized Intrusion Detection System using Deep Learning AlgorithmOptimized Intrusion Detection System using Deep Learning Algorithm
Optimized Intrusion Detection System using Deep Learning Algorithm
ijtsrd
 
Intrusion Detection Systems
Intrusion Detection SystemsIntrusion Detection Systems
Intrusion Detection Systems
vamsi_xmen
 
INTRUSION DETECTION SYSTEM USING CUSTOMIZED RULES FOR SNORT
INTRUSION DETECTION SYSTEM USING CUSTOMIZED RULES FOR SNORTINTRUSION DETECTION SYSTEM USING CUSTOMIZED RULES FOR SNORT
INTRUSION DETECTION SYSTEM USING CUSTOMIZED RULES FOR SNORT
IJMIT JOURNAL
 

Ähnlich wie Efficient String Matching Algorithm for Intrusion Detection (20)

Intrusion Detection System using AI and Machine Learning Algorithm
Intrusion Detection System using AI and Machine Learning AlgorithmIntrusion Detection System using AI and Machine Learning Algorithm
Intrusion Detection System using AI and Machine Learning Algorithm
 
A Study on Recent Trends and Developments in Intrusion Detection System
A Study on Recent Trends and Developments in Intrusion Detection SystemA Study on Recent Trends and Developments in Intrusion Detection System
A Study on Recent Trends and Developments in Intrusion Detection System
 
A Study On Recent Trends And Developments In Intrusion Detection System
A Study On Recent Trends And Developments In Intrusion Detection SystemA Study On Recent Trends And Developments In Intrusion Detection System
A Study On Recent Trends And Developments In Intrusion Detection System
 
50320130403001 2-3
50320130403001 2-350320130403001 2-3
50320130403001 2-3
 
50320130403001 2-3
50320130403001 2-350320130403001 2-3
50320130403001 2-3
 
Intrusion detection system
Intrusion detection systemIntrusion detection system
Intrusion detection system
 
AN IMPROVED METHOD TO DETECT INTRUSION USING MACHINE LEARNING ALGORITHMS
AN IMPROVED METHOD TO DETECT INTRUSION USING MACHINE LEARNING ALGORITHMSAN IMPROVED METHOD TO DETECT INTRUSION USING MACHINE LEARNING ALGORITHMS
AN IMPROVED METHOD TO DETECT INTRUSION USING MACHINE LEARNING ALGORITHMS
 
AN IMPROVED METHOD TO DETECT INTRUSION USING MACHINE LEARNING ALGORITHMS
AN IMPROVED METHOD TO DETECT INTRUSION USING MACHINE LEARNING ALGORITHMSAN IMPROVED METHOD TO DETECT INTRUSION USING MACHINE LEARNING ALGORITHMS
AN IMPROVED METHOD TO DETECT INTRUSION USING MACHINE LEARNING ALGORITHMS
 
06686259 20140405 205404
06686259 20140405 20540406686259 20140405 205404
06686259 20140405 205404
 
Optimized Intrusion Detection System using Deep Learning Algorithm
Optimized Intrusion Detection System using Deep Learning AlgorithmOptimized Intrusion Detection System using Deep Learning Algorithm
Optimized Intrusion Detection System using Deep Learning Algorithm
 
Bt33430435
Bt33430435Bt33430435
Bt33430435
 
Bt33430435
Bt33430435Bt33430435
Bt33430435
 
Automatic Intrusion Detection based on Artificial Intelligence Techniques: A ...
Automatic Intrusion Detection based on Artificial Intelligence Techniques: A ...Automatic Intrusion Detection based on Artificial Intelligence Techniques: A ...
Automatic Intrusion Detection based on Artificial Intelligence Techniques: A ...
 
Enhanced method for intrusion detection over kdd cup 99 dataset
Enhanced method for intrusion detection over kdd cup 99 datasetEnhanced method for intrusion detection over kdd cup 99 dataset
Enhanced method for intrusion detection over kdd cup 99 dataset
 
idps
idpsidps
idps
 
Intrusion Detection Systems
Intrusion Detection SystemsIntrusion Detection Systems
Intrusion Detection Systems
 
A STUDY ON INTRUSION DETECTION
A STUDY ON INTRUSION DETECTIONA STUDY ON INTRUSION DETECTION
A STUDY ON INTRUSION DETECTION
 
A STUDY ON INTRUSION DETECTION
A STUDY ON INTRUSION DETECTIONA STUDY ON INTRUSION DETECTION
A STUDY ON INTRUSION DETECTION
 
INTRUSION DETECTION SYSTEM USING CUSTOMIZED RULES FOR SNORT
INTRUSION DETECTION SYSTEM USING CUSTOMIZED RULES FOR SNORTINTRUSION DETECTION SYSTEM USING CUSTOMIZED RULES FOR SNORT
INTRUSION DETECTION SYSTEM USING CUSTOMIZED RULES FOR SNORT
 
1776 1779
1776 17791776 1779
1776 1779
 

Kürzlich hochgeladen

UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
rknatarajan
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Dr.Costas Sachpazis
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
ankushspencer015
 

Kürzlich hochgeladen (20)

UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 

Efficient String Matching Algorithm for Intrusion Detection

  • 1. KNOWLEDGECUDDLE PUBLICATION International Journal of Computer Engineering and Science, August- 2014 1 Efficient String Matching Algorithm for Intrusion Detection Bhargavi Patel Computer department, B.V.M Engineering College, India bhargavi71291@yahoo.in __________________________________________________________________________________________ ABSTRACT: Intrusion Detection Systems (IDSs) have become widely recognized as powerful tools for identifying, deterring and deflecting malicious attacks over the network. Intrusion detection systems (IDSs) are designed and installed to aid in deterring or mitigating the damage that can be caused by hacking, or breaking into sensitive IT systems. . The attacks can come from outsider attackers on the Internet, authorized insiders who misuse the privileges that have been given them and unauthorized insiders who attempt to gain unauthorized privileges. IDSs cannot be used in isolation, but must be part of a larger framework of IT security measures. Essential to almost every intrusion detection system is the ability to search through packets and identify content that matches known attacks. Space and time efficient string matching algorithms are therefore important for identifying these packets at line rate. In this paper we examine string matching algorithm and their use for Intrusion Detection. Keywords: System Design, Network Algorithm ___________________________________________________________________________ I. INTRODUCTION With each passing day there is more critical data accessible in some form over the network. Any publicly accessible system on the Internet today will be rapidly subjected to break-in attempts. These attacks can range from email viruses, to corporate espionage, to general destruction of data, to attacks that hijack servers from which to spread additional attacks. Even when a system cannot be directly broken into, denial of service attacks can be just as harmful to individuals, and can cause nearly equal damage to the reputations of companies that provide services over the Internet. Because of the increasing attacks held by the various users of the internet, there has been widespread interest in combating these attacks at every level, from end hosts and network taps to edge and core routers. Intrusion Detection Systems (or IDSs) are emerging as one of the most promising ways of providing protection to systems on the network. The Basis for Acquiring Idss At least three reasons justify the acquisition of IDS. The three are: 1. To provide the means for detecting attacks and other security violations that cannot be prevented. 2. To prevent attackers from probing a network. 3. to document the intrusion threat to an organization.
  • 2. KNOWLEDGECUDDLE PUBLICATION International Journal of Computer Engineering and Science, August- 2014 2 As with firewalls, intrusion detection systems are growing in popularity because they provide a site resilience to attacks without modifying end-node software. While firewalls only limit entry to a network based on packet headers, intrusion detection systems go beyond this by identifying possible attacks that use valid packet headers that pass through firewalls. Intrusion detection systems gain this capability by searching both packet headers and payloads to identify attack signatures. To define suspicious activities, IDS makes use of a set of rules which are applied to matching packets. A rule consists at minimum of a type of packet to search, a string of content to match, a location where that string is to be searched for, and an associated action to take if all the conditions of the rule are met. In addition, as IDSs move from end-hosts into edge and core routers, the needs placed on algorithms for intrusion detection will change. While common-case performance can be an acceptable metric for end-hosts that are based on commodity processors, in order to be successful inside the network infrastructure, algorithms must satisfy stringent worst-case performance bounds and tight constraints on memory. At the heart of almost every modern intrusion detection system is a string matching algorithm. String matching is crucial because it allows detection systems to base their actions on the content that is actually flowing to a machine. From this sea of packets, the string identifies those packets that contain data matching the fingerprint of a known attack. Essentially, the string matching algorithm compares the set of strings in the rule-set to the data seen in the packets that flow across the network.String matching is computationally intensive. Because string matching dominates the performance in this and many other IDS, in this we concentrate our efforts on building smaller and faster string matching algorithms. We present optimized techniques for matching large sets of strings in incoming packets in the context of network intrusion detection. We characterize the properties of a real set of IDS string matching rules and examine both how the rules have changed over time, and the effect of those changes on the data structures used. An important contribution of this work is the development of an algorithm that performs well and has useful bounds on worst case performance. II. TYPES OF IDS There are several types of IDS available. They are characterized by different monitoring and analysis approaches. Each type has distinct uses, advantages, and disadvantages. IDSs can monitor events at three different levels: network, host, and application. They can analyze these events using two techniques: signature detection and anomaly detection. Some IDSs have the ability to respond automatically to attacks that are detected. 1. Ids Monitoring Approaches One way to define the types of IDSs is to look at what they monitor. Some IDSs listen on network backbones and analyze network packets to find attackers. Other IDSs reside on the hosts that they are defending and monitor the operating system for signs of intrusion. Still others monitor individual
  • 3. KNOWLEDGECUDDLE PUBLICATION International Journal of Computer Engineering and Science, August- 2014 3 applications. 1.1 Network base IDs Network-based IDSs are the most common type of commercial product offering. These mechanisms detect attacks by capturing and analyzing network packets. Listening on a network backbone, a single network-based ID can monitor a large amount of information. Network-based IDSs usually consist of a set of single-purpose hosts that “Sniff” or capture network traffic in various parts of a network and report attacks to a single management console. Because no other applications run on the hosts that are used by network-based IDS, they can be secured against attack. Many of them have “stealth” modes, which make it extremely difficult for an attacker to detect their presence and to locate them. Advantages: A few well-placed network-based IDSs can monitor a large network. The deployment of network based IDSs has little impact on the performance of an existing network. Network-based IDSs are typically passive devices that listen on a network wire without interfering with normal network operation. Thus, usually, it is easy to retrofit a network to include network-based IDSs with a minimal installation effort. Network-based IDSs can be made very secure against attack and can even be made invisible to many attackers. Disadvantages: Network-based IDSs may have difficulty processing all packets in a large or busy network. Therefore, such mechanisms may fail to recognize an attack that is launched during periods of high traffic. IDSs that are completely implemented in hardware are much faster than those that have been totally realized in software. In addition, the need to analyze packets quickly forces vendors to try and detect attacks with as few computing resources as possible. This may reduce detection effectiveness. Many of the advantages of network-based IDSs do not always apply to the more modern switch-based networks. Switches can subdivide networks into many small segments; this will usually be implemented with one fast Ethernet wire per host. Switches can provide dedicated links between hosts that are serviced by the same switch. Most switches do not provide universal monitoring ports. This reduces the monitoring range of a network-based IDS sensor to a single host. In switches that do provide such monitoring ports, the single port is frequently unable to mirror all the traffic that is moving through the switch. Network-based IDs cannot analyze encrypted information. Increasingly, this limitation will become a problem as the use of encryption, both by organizations and by the attackers, increases. Most network-based IDSs do not report whether or not an attack was successful. These mechanisms only report that an attack was initiated. After an attack has been detected, administrators must manually investigate each host that has been attacked to determine which hosts were penetrated. 1.2 Host based Ids Host-based IDSs analyze the activity on a particular computer. Thus, they must collect information
  • 4. KNOWLEDGECUDDLE PUBLICATION International Journal of Computer Engineering and Science, August- 2014 4 from the host they are monitoring. This allows IDS to analyze activities on the host at a very fine granularity and to determine exactly which processes and users are performing malicious activities on the operating system. Some host-based IDSs simplify the administration of a set of hosts by having the administration functions and attack reports centralized at a single IT security console. Others generate messages that are compatible with network administration systems. Advantages Host-based IDSs can detect attacks that are not detectable by network-based IDS because this type as a view of events that are local to a host. Host-based IDSs can operate in a network that is using encryption when the encrypted information is decrypted on (or before) reaching the host that is being monitored. Host based IDSs can operate in switched networks. Disadvantages: the collection mechanisms must usually be installed and maintained on every host that is to be monitored. Because portions of these systems reside on the host that is being attacked, host-based IDSs may be attacked and disabled by a clever attacker. Host-based IDSs are not well-suited for detecting network scans of all the hosts in a network because the IDS at each host sees only the network packets that the host receives. Host- based IDSs frequently have difficulty detecting and operating in the face of denial-of- service attacks. Host based IDSs use the computing resources of the hosts they are monitoring. 1.3 Application based Ids Application-based IDSs monitor the events that are transpiring within an application. They often detect attacks by analyzing the application‟s log files. By interfacing with an application directly and having significant domain or application knowledge, application- based IDSs are more likely to have a more discerning or fine- grained view of suspicious activity in the application. Advantages: Application-based IDSs can monitor activity at a very fine granularity, which allows them, often, to track unauthorized activity to individual users. Application-based IDSs can work in encrypted environments, because they interface with the application that may be performing encryption. Disadvantages: Application-based IDSs may be more vulnerable than host-based IDSs to being attacked and disabled because they run as an application on the host that they are monitoring. The distinction between application-based IDS and host-based IDS is not always clear. Thus, for the remainder of this article, both types will be referred to as host- based IDSs. 2. IDS Event Approaches There are two primary approaches to analyzing computer and networks events to detect attacks: signature detection and anomaly detection. Signature detection is the primary technique used by most commercial IDS
  • 5. KNOWLEDGECUDDLE PUBLICATION International Journal of Computer Engineering and Science, August- 2014 5 products. However, anomaly detection is the subject of much research and is used in limited form by a number of IDSs. 2.1 Signature based IDSs Signature-based detection looks for activity that matches a predefined set of events that uniquely describe a known attack. Signature-based IDSs must be specifically programmed to detect each known attack. This technique is extremely effective and is the primary method used in commercial products for detecting attacks. Advantages: Signature-based IDSs are very effective in detecting attacks without generating an overwhelming number of false alarms. Disadvantages: Signature-based IDSs must be programmed to detect each attack and thus must be constantly updated with the signatures of new attacks. Many signatures based IDSs have narrowly defined signatures that prevent them from detecting variants of common attacks. 2.2 Anomaly based IDS Anomaly-based IDSs find attacks by identifying unusual behaviour(i.e., anomalies) that occurs on a host or network. They function on the observation that some attackers behave differently than “normal” users and thus can be detected by systems that identify these differences. Anomaly-based IDSs establish a baseline of normal behaviour by profiling particular users or network connections and then statistically measure when the activity being monitored deviates from the norm. These IDSs frequently produce a large number of false alarms because normal user and network behaviours can vary widely. Despite this weakness, the researchers working on applying this technology assert that anomaly-based IDSs are able to detect never-before-seen attacks, unlike signature based IDSs that rely on an analysis of past attacks. Although some commercial IDSs include restricted forms of anomaly detection, few, if any, rely solely on this technology. However, research on anomaly detection IDS products continues. Advantages: Anomaly-based IDSs detect unusual behaviour and thus have the ability to detect attacks without having to be specifically programmed to detect them. Disadvantages: Anomaly detection approaches typically produce a large number of false alarms due to the unpredictable nature of computing and telecommunication users and networks. Anomaly detection approaches frequently require extensive “training sets” of system event records to characterize normal behaviour patterns. III. EVASION TECHNIQUES
  • 6. KNOWLEDGECUDDLE PUBLICATION International Journal of Computer Engineering and Science, August- 2014 6 Fragmentation: by sending fragmented packets, the attacker will be under the radar and can easily by pass the detection system's ability to detect the attack signature. Avoiding defaults: The TCP port utilised by a protocol does not always provide an indication to the protocol which is being transported. For example, IDS may expect to detect a Trojan on port 12345. If an attacker had reconfigured it to use a different port the IDS may not be able to detect the presence of the Trojan. Coordinated, low-bandwidth attacks: coordinating a scan among numerous attackers (or agents) and allocating different ports or hosts to different attackers makes it difficult for the IDS to correlate the captured packets and deduce that a network scan is in progress. Address spoofing/proxying: attackers can increase the difficulty of the ability of Security Administrators to determine the source of the attack by using poorly secured or incorrectly configured proxy servers to bounce an attack. If the source is spoofed and bounced by a server then it makes it very difficult for IDS to detect the origin of the attack. Pattern change evasion: IDS generally rely on „pattern matching‟ to detect an attack. By changing the data used in the attack slightly, it may be possible to evade detection. For example, an IMAP server may be vulnerable to a buffer overflow, and IDS is able to detect the attack signature of 10 common attack tools. By modifying the payload sent by the tool, so that it does not resemble the data that the IDS expect, it may be possible to evade detection. IV. STRING MATCHING FOR INTRUSION DETECTION In the Introduction we motivated the need for string matching in Intrusion Detection Systems. In this section we further demonstrate how string matching is used in an actual intrusion detection system. We also examine the state of the art in string matching as it relates to intrusion detection. Aho-Corasick string-matching algorithm One of the earliest algorithms in precise multi-pattern string matching is due to Aho- Corasick , which is able to match strings in worst case time linear in the size of the input. Aho-Corasick works by constructing a state machine from the strings to be matched. The state machine starts with an empty root node which is the default non-matching state. Each pattern to be matched adds states to the machine, starting at the root and going to the end of the pattern. The state fig(2) machine is then traversed and failure pointers are added from each node to the longest prefix of that node which also leads to a valid node in the trie.(Fig 3). Beyond this basic notion, there are two choices for the algorithm. We can optimize the data structure further by using the failure pointers to precompute the next state for every character from every state in the machine (Fig 1), or we can leave these transitions undefined and traverse the failure pointers at run-time ( Fig 2).If the data structure is optimized, then Aho- Corasick requires only a single memory reference (albeit a very wide memory reference)
  • 7. KNOWLEDGECUDDLE PUBLICATION International Journal of Computer Engineering and Science, August- 2014 7 per character in the input. If the data structure is left unoptimized, one can show via amortizedanalysis that only two (again wide) memory references per character of input string are required to traverse the data structure. We use the unoptimized data structure because the undefined pointers allow us significant opportunity for space optimizations. V. FIGURES (1) Pattern matching machine. (2) Aho-Corasick
  • 8. KNOWLEDGECUDDLE PUBLICATION International Journal of Computer Engineering and Science, August- 2014 8 VI. CONCLUSION Guided by the analogy between IP lookup and string matching, our paper builds on the worst-case guarantees of the classical Aho-Corasick string matching algorithm. As with multibit tries, Aho-Corasick is the only string matching algorithm we know of that has deterministic worst-case lookup times and a data structure friendly enough to use for wire speed hardware matching. Unfortunately, the classical Aho-Corasick data structure takes more storage than is likely to fit in on-chip SRAM or the cache of a commodity processor. The principal contribution of our paper is to apply bitmap node compression and path compression to AhoCorasick to gain both compact storage and worst-case performance. In particular, we show that the use of such compression gains factors of almost 50 times in database size reductions on current rule sets. While the case is less clear for software implementations unless more predictable performance is desired; we believe that our compressed AhoCorasick algorithms are the best choice for hardware implementations of string matching for IDS using an FPGA, ASIC or network processor designs of the future. REFERENCES [1] Deterministic Memory-Efficient String Matching Algorithms for Intrusion Detection- By Nathan Tuck, Timothy Sherwood, Brad Calder, George Varghese, Department of Computer Science and Engineering, University of California, San Diego, Department of Computer Science, University of California, Santa Barbara. [2] EDPACS-THE EDP Audit, Control, And Security Newsletter, NOVEMBER 2001 Vol.XXIX, No.5
  • 9. KNOWLEDGECUDDLE PUBLICATION International Journal of Computer Engineering and Science, August- 2014 9 [3] M. Consens, G. Navarro, “N-Gram Similarity and Distance”. SPIRE 2005, LNCS 3772, 2005, pp. 115- 126 [4] Andreas Wespi, Marc Dacier, Herve Debar. “Intrusion Detection Using Variable-Length Audit Trail Patterns”. Third International Workshop on the Recent Advances in Intrusion Detection(RAID 2000). Toulouse, France, 2000, 110-129 [5] Andreas Wespi, Marc Dacier, Herve Debar. “An Intrusion-Detection System Based on the Teiresias Pattern- Discovery Algorithm”. EICAR Proceedings, 1999, pp. 1-15 [6] Duan You-xiang, Huang Min, Xu Jiuyun, “Initialization Method of Gene Libray Based on Teiresias Algorithm”. Networks Security, Wireless Communications and Trusted Computing, 2009. NSWCTC 09. 2009, pp, 294-297 [7] Jonassen Inge, “Efficient Discovery of Conserved Patterns using a Pattern Graph”. Comput Appl Biosci, 1997