Survey on spam filtering

A Study on Spam Filtering Techniques
Chippy Thomas (1), Dr. M. Azath (2)
(1) Department of CSE, MET’S School of Engineering, MALA
Email: chippythomas18@gmail.com
(2) HOD, Department of CSE, MET’S School of Engineering, MALA
Email: mailmeazath@gmail.com
Abstract
Spam is one of the major problems of today’s
Internet, bringing financial damage to companies and
annoying individual users. Spam messages are almost
always commercial or fraudulent messages, and are quite
often sent with false return address information. Among
the approaches developed to stop spam, filtering is an
important and popular one. A spam filter is a program
that is used to detect unsolicited and unwanted email and
prevent those messages from getting to a user's inbox.
Like other types of filtering programs, a spam filter looks
for certain criteria on which it bases judgments.
Researchers have proposed different types of spam filters
and this paper provides a review of them.
Keywords:
Spam, Spam Filter, Spam Sending, Traditional Methods
1. Introduction
A spam filter is a piece of software that scans a
message to determine whether it is spam. Spam filters are
typically installed on an email server, where their sole purpose is to
analyze incoming and outgoing emails and, based on a set
of rules, to determine whether each email is spam. Spam
filters can take on many different roles, including content
filtering, blacklist filtering, malware filtering and virus
detection. All incoming emails are filtered through a
complex set of rules and stamped with a spam score.
Depending on the settings, some emails are discarded completely
and never seen by the recipients. Figure 1
illustrates how an email goes from the sender to the email
server, through the spam filter, and finally ends up in
the Inbox, in the Junk folder or, in the worst case, simply
discarded. Companies that develop spam filters keep a
tight lid on the specifics of how they work. This is a good
thing, as it keeps the spammers guessing as to how to get
around the rules.
Figure 1: Spam Filtering
Basically, spam filter programs compare the rules
they are programmed with against each email. For example,
it is commonly accepted that sending an email with just an
embedded image and no text is an attempt to get around
the spam filter's rules. If the spam filter in question has a
rule that matches messages containing only an image and
no text, and it receives such an email,
that email will be marked as spam [2].
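The rule described above can be sketched in a few lines of Python. This is only an illustration: the rule names, the weights and the threshold are hypothetical and are not taken from any real spam filter, whose rule sets are kept private as noted earlier.

```python
from email.message import EmailMessage


def has_only_image(msg):
    """Rule: at least one image part and no non-empty text part."""
    has_image = False
    has_text = False
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        if part.get_content_maintype() == "image":
            has_image = True
        elif part.get_content_maintype() == "text":
            if part.get_content().strip():
                has_text = True
    return has_image and not has_text


def subject_is_shouting(msg):
    """Rule: every letter of the subject line is upper case."""
    letters = [c for c in str(msg.get("Subject", "")) if c.isalpha()]
    return bool(letters) and all(c.isupper() for c in letters)


# Hypothetical rule weights and threshold, chosen only for this example.
RULES = [(has_only_image, 5.0), (subject_is_shouting, 1.5)]
SPAM_THRESHOLD = 5.0


def spam_score(msg):
    """Sum the weights of all rules that match the message."""
    return sum(weight for rule, weight in RULES if rule(msg))


def classify(msg):
    return "spam" if spam_score(msg) >= SPAM_THRESHOLD else "ham"


# An image-only message and an ordinary text message.
image_only = EmailMessage()
image_only["Subject"] = "offer"
image_only.set_content(b"fake image bytes", maintype="image", subtype="png")

text_mail = EmailMessage()
text_mail["Subject"] = "Lunch"
text_mail.set_content("See you at noon.")
```

With these assumed weights the image-only message reaches the threshold on a single rule, while the text message matches no rule and stays in the inbox.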
Spam is a waste of time and resources. Filtering the
genuine mails from the spam on a daily basis wastes a
sizeable amount of time (and therefore money) over the
span of a year. Installing an anti-spam filter will provide
the necessary blocks needed to ensure that minimal time
is spent sifting through unwanted emails every morning.
Whilst no single anti-spam software can guarantee to
automatically remove all spam, there are spam controls
that will do a very good job of intercepting most spam
mail. Apart from removing spam messages that could
cause embarrassment or offense due to their content, an
anti-spam filter will also help by saving time that would
otherwise be spent manually filtering the spam messages
from the valid ones.
This paper reviews and discusses some of the research
works done in the field of spam filters, which detect and
prevent spam. The remaining sections of the paper are
organized as follows. Section 2 gives a brief overview of
the spam phenomenon and the general characteristics and
definitions of spam. Section 3 discusses spam in
different media. Section 4 discusses spam filtering
techniques and their classification. Section 5 presents spam
sending techniques.
2. The Spam Phenomenon
This section provides an introduction to the phenomenon
of spam, including the definition and general
characteristics of spam, as well as a brief overview of
non-filtering methods of anti-spam protection, namely
anti-spam legislation and changes in the process of email
transmission. Although not directly related to spam filtering,
these methods either influence the ways in which spam can
be formed and transmitted, or provide new architectures
in which a filter can be used. Therefore, a brief
introduction to these methods is needed before passing to
filtering itself.
2.1 Definition and General Characteristics of Spam
There exist various definitions of what spam (also called
junk mail) is and how it differs from legitimate mail (also
called non-spam, genuine mail or ham). The shortest
among the popular definitions characterizes spam as
“unsolicited bulk email”. Sometimes the word
“commercial” is added, but this extension is disputed. The
TREC Spam Track relies on a similar definition: spam is
“unsolicited, unwanted email that was sent
indiscriminately, directly or indirectly, by a sender having
no current relationship with the user”. Another widely
accepted definition states that “Internet spam is one or
more unsolicited messages, sent or posted as part of a
larger collection of messages, all having substantially
identical content”. The Direct Marketing Association
proposed to use the word “spam” only for messages with
certain kinds of content, such as pornography, but this
idea met with no enthusiasm, being considered an attempt to
legalize other kinds of spam. As we can see, the common
point is that spam is unsolicited; according to a widely
cited formula, “spam is about consent, not content”. It is
necessary to mention that the notion of being unsolicited
is hard to capture. In fact, despite the wide agreement on
this type of definition, filters have to rely on the content
and ways of delivery of messages to distinguish spam from
legitimate mail [7].
Spam Characteristics
When designing filters, anti-spam developers always take
the general characteristics of spam mail into
consideration. More than 99% of spam is sent for one or
more of the purposes given below.
• To advertise some goods, services, or ideas
• To cheat users out of their private information
and to deliver malicious software
• To cause a temporary crash of a mail server.
Each of the categories, in turn, is being studied by several
researchers. Advertising spam mails promote different
kinds of products or services; however, careful scrutiny
has shown that spammers change the percentage of
advertisements dedicated to each category of products or
services over time. The characteristics of spam traffic
differ from those of legitimate mail traffic: in particular,
legitimate mail is concentrated in diurnal periods, while
the spam arrival rate is stable over time. A very important
fact is that spammers are reactive: they actively
oppose every successful anti-spam effort, so that the
performance of a new method usually decreases after its
deployment. Pu and Webb analyzed the evolution of
spamming techniques. They showed that spam
construction methods become extinct if filters
cope with them effectively or if other successful efforts
are taken against them. A study of the network-level
behavior of spammers showed that the majority of
spam comes from a few concentrated parts of the IP address
space. Moreover, it found that only a small subset
of sophisticated spammers uses temporary route
announcements in order to remain untraceable [5].
3. Spam in Different Media
The most widely recognized form of spam is e-mail spam, but
the term is also applied to similar abuses in other
media: instant messaging spam, Usenet newsgroup
spam, Web search engine spam, spam in blogs, wiki
spam, online classified ads spam, mobile phone
messaging spam, Internet forum spam, junk fax
transmissions, social spam, television advertising and file
sharing spam.
E-mail spam, also known as unsolicited bulk E-mail
(UBE), junk mail, or unsolicited commercial e-mail
(UCE), is the practice of sending unwanted e-mail
messages, frequently with commercial content, in large
quantities to an indiscriminate set of recipients.
Instant messaging spam makes use of instant
messaging systems. Although it is less ubiquitous than its
e-mail counterpart, according to a report from Ferris
Research, 500 million spam IMs were sent in 2003, twice
the level of 2002. As instant messaging tends not to be
blocked by firewalls, it is an especially useful channel for
spammers. This kind of spam is very common on many instant
messaging systems such as Skype.
Newsgroup spam is a type of spam where the targets are
Usenet newsgroups. Spamming of Usenet newsgroups
actually pre-dates e-mail spam. Usenet convention
defines spamming as excessive multiple posting, that is,
the repeated posting of a message.
Forum spam is the creation of advertising messages
on Internet forums. It is generally done by
automated spambots. Most forum spam consists of links
to external sites, with the dual goals of increasing search
engine visibility in highly competitive areas such as
weight loss, pharmaceuticals, gambling, pornography,
real estate or loans, and generating more traffic for these
commercial websites. Some of these links contain code to
track the spambot's identity; if a sale goes through, the
spammer behind the spambot works on commission.
Mobile phone spam is directed at the text
messaging service of a mobile phone. This can be
especially irritating to customers, not only because of the
inconvenience, but also because of the fee they may be
charged per text message received in some markets.
Despite the high number of phone users, there has not
been much phone spam, because there is a charge for
sending SMS, and installing trojans that send spam on
others' phones (a common practice for e-mail spam) is hard
because applications normally must be downloaded from
a central database.
Social spam. Spreading beyond the centrally managed
social networking platforms, user-generated content
increasingly appears on business, government, and
nonprofit websites worldwide. Fake accounts and
comments planted by computers programmed to issue
social spam can infiltrate these websites.
Online game messaging. Many online games allow
players to contact each other via player-to-player
messaging, chat rooms, or public discussion areas. What
qualifies as spam varies from game to game, but usually
this term applies to all forms of message flooding that
violate the terms of service contract for the website.
This is particularly common in MMORPGs, where the
spammers are trying to sell game-related "items" for
real-world money, chief among them being in-game
currency.
Spam targeting search engines (spamdexing) refers to
a practice on the World Wide Web of
modifying HTML pages to increase their chances of
being placed high on search engine relevancy lists. These
sites use "black hat search engine optimization (SEO)
techniques" to deliberately manipulate their rank in search
engines. Many modern search engines have modified their
search algorithms to try to exclude web pages utilizing
spamdexing tactics.
SPIT (Spam over Internet Telephony) is VoIP (Voice
over Internet Protocol) spam, usually using SIP (Session
Initiation Protocol). This is almost identical to
telemarketing calls over traditional phone lines. When the
user answers the spam call, a pre-recorded
message is usually played back. This is generally easier
for the spammer as VoIP services are cheap and easy to
anonymize over the Internet, and there are many options
for sending large numbers of calls from a single location.
Accounts or IP addresses being used for VoIP spam can
usually be identified by a large number of outgoing calls,
low call completion and short call length.
Spam targeting video sharing sites. Video sharing sites,
such as YouTube, are now frequently targeted by
spammers. The most common technique involves
spammers posting links to sites, most
likely pornographic or dealing with online dating, in the
comments section of random videos or people's profiles.
Social networking spam. Facebook and Twitter are not
immune to messages containing spam links. Most
insidiously, spammers hack into accounts and send false
links under the guise of a user's trusted contacts such as
friends and family [8]. As for Twitter, spammers gain
credibility by following verified accounts such as that of
Lady Gaga; when that account owner follows the
spammer back, it legitimizes the spammer and allows him
or her to proliferate [1].
4. Spam Filtering Techniques
Today there are a large number of solutions designed to
help eliminate the spam problem. These solutions use
different techniques for analyzing email and determining
if it is indeed spam. The accuracy of spam blocking
techniques is evaluated on two dimensions: how much
spam is successfully filtered out, and how few legitimate
messages are accidentally deleted. Maintaining accuracy
can be difficult because spam is constantly changing. The
most effective spam blocking solutions combine more than
one of the following techniques to help ensure that all
spam, and only spam, is blocked [4].
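The two dimensions of accuracy mentioned above can be computed from a labelled test set. The following is a minimal sketch; the function name and the "spam"/"ham" label strings are assumptions made for this example.

```python
def filter_accuracy(labels, predictions):
    """Return (spam_caught_rate, false_positive_rate).

    spam_caught_rate: share of spam messages that were filtered out.
    false_positive_rate: share of legitimate messages wrongly filtered out.
    """
    pairs = list(zip(labels, predictions))
    spam_total = sum(1 for y, _ in pairs if y == "spam")
    ham_total = sum(1 for y, _ in pairs if y == "ham")
    spam_caught = sum(1 for y, p in pairs if y == "spam" and p == "spam")
    ham_lost = sum(1 for y, p in pairs if y == "ham" and p == "spam")
    caught_rate = spam_caught / spam_total if spam_total else 0.0
    false_positive_rate = ham_lost / ham_total if ham_total else 0.0
    return caught_rate, false_positive_rate


# Four spam and five legitimate messages with the filter's decisions.
labels = ["spam"] * 4 + ["ham"] * 5
predictions = ["spam", "spam", "spam", "ham",
               "ham", "ham", "ham", "ham", "spam"]
rates = filter_accuracy(labels, predictions)
```

In this example the filter catches three of the four spam messages and loses one of the five legitimate ones. The two rates are reported separately because losing a legitimate message is usually considered far more costly than letting one spam message through.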
As noted above, spam is “unsolicited, unwanted
email that was sent indiscriminately, directly or
indirectly, by a sender having no current relationship with
the recipient”. A huge amount of spam is generated
every day and wastes significant Internet resources as well
as users' time. Spam attacks both the computer and its
users. Spam email can contain viruses, key loggers,
phishing attacks and more. These types of malware can
compromise a user’s sensitive private data by capturing
bank account information, usernames and passwords.
A spam filter is a classifier which
classifies email messages sent to the user, as accurately as
possible, into spam or ham (non-spam). In this paper we
are primarily concerned with the online personal spam
filtering process shown in Figure 2.
Figure 2: Spam Filter Process
As the figure shows, when an email arrives the spam filter
classifies it either as ham, which is put in the inbox, or as spam,
which is quarantined (that is, kept in the junk folder).
It is assumed that the user reads the inbox regularly, while
the junk folder is not checked frequently, as it is
assumed not to contain legitimate emails. The
user can note the filter's misclassification errors
(spam emails in the inbox and legitimate emails in the junk
folder) and report them to the learning-based filter. The filter
then uses this feedback to update its internal model, with
the aim of improving its future predictive
performance. However, it is quite cumbersome for the user
to always report the errors [6].
4.1 Classification of Spam Filtering methods
Depending on the techniques used, spam filtering methods are
generally divided into two categories:
1) Methods to prevent spam distribution at its origin;
2) Methods to block spam at the destination point.
Let us consider these methods in detail.
4.1.1. Methods to Avoid Spam Distribution at its
Origin
Legislative measures limiting spam distribution, the
development of e-mail protocols using sender
authentication, and the blocking of mail servers which distribute
spam are methods which prevent spam distribution at its
origin. Using these methods alone does not give
considerable results. For example, there are many strict
legislative restrictions on spam distribution in the USA;
nevertheless, the greatest amount of spam is distributed
from this region. One of the reasons is the availability of
high-speed broadband Internet access in the USA. There are a
number of approaches that propose to make spam sending
economically unprofitable. One of these proposals is to
charge for the sending of each e-mail. The payment for one
e-mail should be extremely small, so that
for the usual user it will be imperceptible. For spammers,
who send thousands or millions of messages, the cost of
such mailing becomes considerable, which makes it
economically unprofitable.
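The economic argument can be illustrated with a short calculation. All figures below (the fee and the monthly message volumes) are hypothetical values chosen for the example, not data from the literature.

```python
from decimal import Decimal

# Hypothetical fee: one tenth of a cent per message.
FEE_PER_MESSAGE = Decimal("0.001")


def monthly_cost(messages_per_month, fee=FEE_PER_MESSAGE):
    """Total fee paid for one month of sending."""
    return messages_per_month * fee


# Assumed volumes: an ordinary user sending 50 messages a day for 30 days,
# and a spammer sending ten million messages in the same month.
ordinary_user = monthly_cost(50 * 30)
spammer = monthly_cost(10_000_000)
```

Under these assumptions the ordinary user pays 1.50 per month while the spammer pays 10,000, which is the asymmetry the proposal relies on.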
4.1.2. Methods to Avoid Spam at the Receiving Point
Methods which filter spam at the destination point can be
divided into the following categories:
I) Depending on the theoretical approaches used:
1. Traditional methods
2. Learning-based methods
3. Hybrid methods
II) Depending on filtration area:
1. Server side
2. Client side
3. Filtration in public mail-servers
4.1.2.1. Classification of Spam Filtering Methods
Depending on Theoretical Approaches
As noted above, depending on the theoretical
approaches used, spam filtering methods are divided into
traditional, learning-based and hybrid methods.
In traditional methods the classification model or the data
(rules, patterns, keywords, lists of IP addresses of
servers) on which the classification of messages is based is
defined by experts. The data store collected by experts is
called the knowledge base. Lists of trusted
and distrusted senders are also used, which help to select legitimate
mail. In practice only the creation of the “white”
list makes sense, because spammers use fictitious e-mail addresses.
This technique cannot serve on its own as a full-fledged
anti-spam filter, but it can considerably reduce the number of false
positives when used as part of an e-mail filtration system based
on other classification methods.
In learning-based methods the classification model is
developed using data mining techniques. From the point of
view of data mining, several problems are characteristic of
learning-based methods: the change of spam content over time,
the proportion of spam to legitimate mail, and an insufficient
amount of training data.
I Traditional methods
Traditional methods are divided into the following
categories:
1) Methods based on analysis of messages. The
received e-mail is analyzed for specific signs of spam on
the basis of:
• formal signs;
• content, using signatures in an updated database;
• content, applying statistical methods based on
Bayes' theorem;
• content, by means of SURBL (Spam URI
Real-time Block Lists), where the links found
in the e-mail are extracted and checked
against the SURBL database. This method is effective.
2) Detectors of mass distribution.
Their task is to detect distributions of similar e-mails to
a large number of users. The following methods are used for the
detection:
• users’ voting
• analysis of e-mails coming through the mail system
(DCC)
• receipt of e-mail by a spam “trap” and
its subsequent analysis
Regardless of the way bulk mail is detected, the idea of the
method is that a calculated e-mail
signature (checksum) is used for spam filtration. Two vital
issues are characteristic of the methods based
on detection of repetitions. The first is spam “personalization”:
each spam e-mail has insignificant differences,
because of which it is hard to collect stable signatures.
To solve this problem, various robust signatures are
used. For example, in the Yandex mail system the method of
shingles is implemented. The second problem is the detection of
legitimate bulk mailings.
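To make the idea of a signature that survives small personalized changes more concrete, the following is a minimal sketch of word shingling with a Jaccard resemblance score. It illustrates the general technique only; it is not a description of how any particular mail system implements it, and the shingle width and the 0.5 threshold are assumptions chosen for the example.

```python
def shingles(text, w=3):
    """Return the set of all runs of w consecutive words in the text."""
    words = text.lower().split()
    if len(words) < w:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + w]) for i in range(len(words) - w + 1)}


def resemblance(a, b, w=3):
    """Jaccard similarity of the two shingle sets (1.0 means identical)."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def is_near_duplicate(a, b, threshold=0.5):
    """Hypothetical decision rule: treat similar-enough mails as one mailing."""
    return resemblance(a, b) >= threshold


# Two copies of one mailing that differ only in the recipient's name.
mail_a = "dear alice buy cheap watches now from our store"
mail_b = "dear bob buy cheap watches now from our store"
unrelated = "meeting moved to friday at ten"
```

An exact checksum of `mail_a` and `mail_b` would differ completely, whereas five of the nine distinct shingles are shared, so the pair is still recognized as the same mailing.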
3) Methods based on recognition of the sender as a
spammer.
This method relies on different blackhole lists of IP and
e-mail addresses. It is possible to apply one's own black and
white lists or to use RBL (Real-time Blackhole
List) and DNSBL (DNS-based Blackhole List) services for
address verification. The advantage of these methods is the
detection of spam at an early step of the mail receiving process.
The disadvantage is that the policy of addition and deletion
of addresses is not always transparent. Often whole
subnets belonging to providers get onto the blacklists. For
such systems it is practically impossible to estimate the level
of false positives (legitimate e-mail wrongly classified
as spam) on real mail streams.
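A DNSBL lookup is conventionally made by reversing the octets of the sender's IPv4 address, appending the list's zone name and issuing an ordinary DNS query; an answer means the address is listed. The sketch below assumes this convention. The zone `dnsbl.example.org` is a placeholder, not a real service, and the helper names are hypothetical.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the DNSBL zone answers for this address.

    Any resolution failure is treated as "not listed", which also covers
    network errors; a production filter would distinguish the two cases.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


# Placeholder zone and a documentation-range address.
query = dnsbl_query_name("192.0.2.99", "dnsbl.example.org")
```

Because the check needs only the connecting IP address, it can run before the message body is received, which is the early-detection advantage described above.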
4) Methods based on verification of sender’s e-mail
address and domain name.
The simplest method of filtration is to check that the name
returned by a DNS request matches the sender's domain name. But
spammers can use real addresses, so this method
is ineffective. In this case it can be verified whether
the message may legitimately be sent from the current IP
address. Firstly, the Sender ID technology can be used,
where the sender’s e-mail address is protected from
falsification by publishing the policy of domain
name use in DNS. Secondly, SPF
(Sender Policy Framework) technology can be used, where the DNS
protocol is used for verification of the sender’s e-mail
address. The principle is that if a domain’s owner wants
to support SPF verification, he adds a special entry to
the DNS record of his domain, indicating the SPF version
and the ranges of IP addresses from which
e-mail from users of the domain may come.
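The SPF check can be sketched as follows. This is a deliberately simplified illustration: it takes the record text as a string instead of fetching it from DNS, and it understands only the version tag and the `ip4:`/`ip6:` mechanisms, ignoring the other mechanisms and qualifiers that real SPF records may contain. The function names and the sample record are assumptions made for the example.

```python
import ipaddress


def parse_spf_networks(record):
    """Return the IP networks listed by ip4:/ip6: terms of an SPF record."""
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    networks = []
    for term in terms[1:]:
        if term.startswith(("ip4:", "ip6:")):
            # A bare address such as ip4:198.51.100.7 becomes a /32 network.
            networks.append(ipaddress.ip_network(term[4:], strict=False))
    return networks


def sender_ip_allowed(ip, record):
    """True if the connecting IP falls inside one of the published ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in parse_spf_networks(record))


# Sample record of the kind a domain owner might publish (illustrative only).
record = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.7 -all"
```

A message claiming to come from the domain but arriving from an address outside the published ranges fails the check and can be rejected or scored as suspicious.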
5) Method based on SMTP server response emulation.
In this method the filter emulates an SMTP server error
response to the sending host. When real mail delivery systems,
which follow the SMTP protocol correctly, observe such an
error, they wait for some interval (1-2 hours) and repeat the
attempt. But the majority of spam bots have very short timeout
periods. So filters based on this method slow down the SMTP
transaction to the point that some spam senders will fail,
while real mail delivery systems will still continue
and deliver mail successfully.
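Greylisting is one common form of this idea: the first delivery attempt from an unknown sender is answered with a temporary failure, and only a retry after a minimum delay is accepted. The sketch below simulates that decision logic without any network code. The class name, the five-minute delay and the response strings are assumptions for the example; time is passed in explicitly so that the behaviour is easy to follow.

```python
class Greylist:
    """Decide on SMTP responses using the time of the first delivery attempt."""

    def __init__(self, min_delay_seconds=300):
        self.min_delay = min_delay_seconds
        self.first_seen = {}  # (client_ip, sender, recipient) -> first attempt time

    def check(self, client_ip, sender, recipient, now):
        """Return the response for an attempt made at time `now` (seconds)."""
        key = (client_ip, sender, recipient)
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "250 OK"
        # 4xx codes tell a compliant sender to queue the mail and retry later.
        return "451 Temporary failure, please try again later"


greylist = Greylist(min_delay_seconds=300)
attempt = ("192.0.2.1", "alice@example.com", "bob@example.org")
first_try = greylist.check(*attempt, now=0)
quick_retry = greylist.check(*attempt, now=10)
late_retry = greylist.check(*attempt, now=3600)
```

A spam bot that gives up after the first or an immediate second attempt never gets its message through, while a standards-compliant server that retries an hour later is accepted.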
All of the above methods are based on data for analysis
collected by experts of third-party suppliers, and this data is
the same for all users.
Consequently, traditional methods have the following
disadvantages:
• it is necessary to update the knowledge base regularly
• there is a dependence on update suppliers
• the security level is low
• the “impersonal” classification model does not
consider the individual specifics of a user’s
correspondence
• there is a dependence on the natural language of the
correspondence
• the level of detection is low because of general models
of classification
II Learning-based methods
Trainable or intelligent methods based on data mining
algorithms are now being actively developed for
e-mail filtration. These algorithms divide objects into
categories using a classification model previously
defined on the basis of precedent information.
Assume spam filtration is defined by the function
$$F(m, \theta) = \begin{cases} m_{spam}, & \text{if the decision is ``spam''} \\ m_{leg}, & \text{otherwise} \end{cases}$$
where $m$ is the mail being classified, $\theta$ is the vector of
parameters, and $m_{spam}$ and $m_{leg}$ denote spam and
legitimate e-mail.
Many spam filters are based on classification using machine
learning techniques. In learning-based methods the vector
of parameters is the result of training the classifier on
previously collected e-mails:
$$\theta = Z(M)$$
$$M = \{(m_1, y_1), (m_2, y_2), \ldots, (m_n, y_n)\}$$
$$y_i \in \{m_{spam}, m_{leg}\}$$
where $m_1, m_2, \ldots, m_n$ are previously collected messages,
$y_1, y_2, \ldots, y_n$ are the corresponding labels and $Z$ is the
training function [7].
The following types belong to learning-based
methods.
1) Image-based spam filtering. Image spam has become
a new type of e-mail spam. Spammers embed the
message into an image and then attach it to the mail.
Image-based spam mails are mails where the spammer’s
message is sent in the form of a graphic or an image that
is in human-readable format. Some traditional
methods based on analysis of text-based information do
not work in this case. Image filtering is costly and
time-consuming. Statistical feature
extraction has been proposed for the classification of
image-based spam using artificial neural networks. This
approach considers the statistical image features histogram
and mean value of image blocks for
image classification.
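The two statistical features named above, an intensity histogram and the mean value of image blocks, can be computed as in the following sketch. It assumes a non-empty grayscale image given as a list of rows of integer pixel values from 0 to 255; the function names, bin count and block size are choices made for this example, and the neural network that would consume the feature vector is not shown.

```python
def histogram(pixels, bins=8):
    """Normalized histogram of pixel intensities (each bin is a fraction)."""
    counts = [0] * bins
    for row in pixels:
        for value in row:
            counts[value * bins // 256] += 1
    total = sum(counts)
    return [count / total for count in counts]


def block_means(pixels, block=2):
    """Mean intensity of each block x block tile, scanned row by row."""
    means = []
    for top in range(0, len(pixels), block):
        for left in range(0, len(pixels[0]), block):
            tile = [value
                    for row in pixels[top:top + block]
                    for value in row[left:left + block]]
            means.append(sum(tile) / len(tile))
    return means


def feature_vector(pixels, bins=8, block=2):
    """Concatenate both feature groups into one input vector for a classifier."""
    return histogram(pixels, bins) + block_means(pixels, block)


# A tiny 4x4 black-and-white image used as sample input.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 0, 0],
    [255, 255, 255, 255],
]
features = feature_vector(image, bins=2, block=2)
```

The resulting vector has a fixed length for a fixed image size, which is what a neural network classifier needs as input.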
2) Bag of words Model. The bag-of-words model is a
simplifying assumption used in natural language
processing and information retrieval. In this model, a text
(such as a sentence or a document) is represented as an
unordered collection of words, disregarding grammar and
even word order. In spam filtering two bags of words are
considered. One bag is filled with words found in spam
e-mails, and the other bag is filled with words found in
legitimate e-mails. Considering an e-mail as a pile of words
from one of these bags, Bayesian probability is used to
determine to which bag this e-mail belongs. k-nearest
neighbour, SVM (Support Vector Machine) and boosting
classifiers are also applicable to the bag-of-words model.
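The two-bag idea can be sketched as a small naive Bayes classifier. This is a minimal illustration under simplifying assumptions: tokens are obtained by splitting on whitespace, add-one smoothing is used so that unseen words do not produce zero probabilities, and the class and method names are hypothetical.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Bag-of-words spam classifier with one word bag per class."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.message_counts = {label: 0 for label in self.LABELS}

    @staticmethod
    def tokenize(text):
        return text.lower().split()

    def train(self, text, label):
        """Add the words of one labelled message to the bag for its class."""
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def log_score(self, text, label):
        """Log of (smoothed prior) times (smoothed word likelihoods)."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
        # The extra +1 in the denominator reserves mass for unseen words.
        denominator = total_words + len(vocabulary) + 1
        for word in self.tokenize(text):
            score += math.log((self.word_counts[label][word] + 1) / denominator)
        return score

    def classify(self, text):
        return max(self.LABELS, key=lambda label: self.log_score(text, label))


spam_filter = NaiveBayesFilter()
spam_filter.train("win money now", "spam")
spam_filter.train("cheap money offer", "spam")
spam_filter.train("meeting at noon", "ham")
spam_filter.train("project meeting notes", "ham")
```

Because `train` only updates counts, the same method can be called again whenever a user reports a misclassified message, which gives the filter a simple way to keep learning.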
3) Collaborative spam filtering. In collaborative
approaches, server-side automatic monitoring systems
compare incoming messages to known spam as classified
by an automatic mechanism or by final recipients. These
solutions have achieved considerable success as they
overcome the single point of failure typical of centralized
architectures. Spam reports are gathered among P2P
users or from a mail server (Google Gmail).
Collaborative centralized spam filtration is more economical
in comparison with the personal approach, but only if
there are adequate procedures for the
analysis of false positives and the prompt reclassification
of incorrectly classified messages [8].
4) Social networking against spam. This is one of the
latest methods, where the information extracted from
social networks is used to fight spammers. Social
networks are arguably one of the most remarkable
Internet phenomena in recent years. They have emerged
as one of the most-sought applications of the World Wide
Web—a killer app. They provide a ubiquitous ecosystem
which allows users to identify themselves, interact, share,
and collaborate [9].
Thus, in learning-based methods the user defines the
classification model himself, so that the majority of the
disadvantages of traditional methods are solved
successfully: intelligent methods are autonomous,
independent of an external knowledge base, do not require
regular updates, are multilingual and independent of natural
language, and are able to learn new types of spam with the
user's help. An advantage is the construction of a personalized mail
classification model, where the user himself defines which
mail is legitimate and which is spam. Therefore
learning-based methods rank higher in spam detection.
In many spam filtration systems based on
learning-based methods, Bayes’ theorem, Markov chains and
other techniques are successfully applied. Learning-based methods
also have a number of disadvantages, such as overfitting,
dependence on the quality and composition of the training set, and
resource intensity. The application of statistical algorithms
with complicated mathematical calculations leads to a high
load on computing resources. For spam
filtering systems that process a large number of requests, the
performance of the algorithm is of prime importance, so
resource intensity is the most important
disadvantage of learning-based methods [3].
III Hybrid methods
One of the latest approaches in spam filtering is the hybrid
filtration system, which is a combination of different
algorithms, especially ones that use unrelated features to
produce a solution. In this case various
filtering techniques can be applied to obtain the advantages of
both the traditional and learning-based methods. Hybrid solutions
need to be carefully designed, as the combination might
increase time complexity while increasing security and
accuracy [5].
4.1.2.2. Classification of Spam Filtering Methods
Depending on Filtration Scope
Depending on filtration scope spam filtration methods are
divided into the following categories.
I Client Side/Personal Filters
Client side filters work directly on the user’s computer. In
client side filtration the e-mail is downloaded to the user’s local
computer in any case and only then classified, which
leads to additional data transfer load on the network.
Client side spam filtration is more accurate due to the use of
machine learning methods. Client side filters use the user's
personal information, whereas in server side filters the
filtration model is defined once for all users. In spite of
the fact that for the majority of users it is obvious what
spam is, the concept of spam is quite
personal for each of them. An e-mail message marked as spam by
one user may be important information for another.
From the point of view of filtration quality the personal model is
preferable, as the characteristics of the user’s
correspondence are considered. Generally, the absence of
personalization reduces the level of detection and
increases the number of false positives. On the other hand,
the use of a personal model of e-mail classification involves an
inevitable overhead cost. Firstly, the user must construct
his personal filtration model himself, as only he can
define what is legitimate e-mail and what is spam for him.
Secondly, the construction, storage and use of a personal model
demand additional computing resources.
II Server side/general filters.
Server side filters work at the mail server level. Generally,
server side filtration systems apply traditional methods of
filtration, while at the client level learning-based
or hybrid methods are applied. Server side filtration also has
its advantages: a centralized solution reduces expenses and
simplifies support and control of the system. Users are
becoming more mobile, so it is convenient to store mail
centrally on a server and to access it from different places
using different devices. Hence, classification at the mail-server
level is preferable and the development of these methods
is more relevant.
III Spam filtering in public mail-servers.
This solution is sometimes better than a client or server
solution. In this case users are mobile, as in the case of server
side filtration, and the filtration is personalized, as in the
case of the client side solution. But the disadvantage of using
public mail-servers
is that users depend on the filtration product installed there.
For example, Google's mail service
gmail.com uses its own products against spam. This
system considers personal information about the user to
minimize false positives. The public mail provider
Mail.ru uses the Kaspersky Anti-Spam product, based on
“Spamtest” technology, which relies entirely on
traditional filtration methods, such as RBL, a base of
fuzzy signatures of spam mails, and a heuristics base. These
knowledge bases are maintained and updated regularly,
up to 3 times an hour. Processing of attached files and
detection of repetitions are also supported. The system has a
general classification model applicable to all users,
but at the same time personalization is absent [10].
5. Spam Sending Techniques
To understand the issues involved in controlling spam,
the methods employed by spammers should be
investigated. First-generation spammers used the simplest
technique: sending out thousands or millions of e-mail
messages from their own e-mail accounts. However, this
was easily combated by service providers by blacklisting
these users. Using mail volumes, subject line and message
analysis, and user complaints, the service providers could
identify spammers and ban them from the network, a
simple policy that was easily enforced. Spammers quickly
switched to a new technique using open mail proxies. In
brief, an open mail proxy is a server that accepts
connections from any network address, acting as a blind
intermediary to virtually any other network address. To
the recipient (and the intervening network infrastructure),
the spam message seems to originate from the mail proxy,
effectively masking the sender's true identity. Service
providers responded with a second kind of blacklist, this
time of known mail servers that were sending spam. In
response to the server blacklist, spammers developed an
even more sophisticated method of attack: the spam
zombie. By infecting unprotected computers with a
Trojan horse program, a spammer effectively recruits an
army of unwitting users who can be activated by a remote
command to launch a spam attack. Such an attack has
characteristics similar to a DDoS attack: The large
number of attacking machines makes it difficult or
impossible either to identify the source of the attack or to
take effective corrective action in real time without
causing massive disruptions to legitimate users [4].
6. Conclusion
In this paper we have briefly discussed the problem of
spam and tried to give an overview of spam characteristics
and spam filter features. Spam is a problem that
continues to grow from day to day, costing corporations
billions of dollars in lost productivity. Fortunately, though,
there are different spam blocking techniques to help
counter the various types of spam. Because spammers are
always trying to bypass anti-spam techniques by changing
the methods they use to send spam, it’s best for
corporations to protect themselves with a spam blocking
solution that uses more than one spam blocking
technique. Each one of these techniques has advantages,
disadvantages, as well as limitations. To minimize the
amount of spam that enters an organization, a spam
blocking solution that includes a combination of the most
effective techniques should be implemented.
References
[1] Wikipedia, http://en.wikipedia.org/wiki/Spamming.
[2] Spamfilter, http://searchmidmarketsecurity.techtarget.com/definition/spam-filter
[3] Saadat Nazirova, “Survey on Spam Filtering
Techniques”, Communications and Network, 2011,
3, 153-160, doi:10.4236/cn.2011.33019 Published
Online
[4] Chris Lucas, Software Engineering 4C03, SPAM,
http://www4.ncsu.edu/~kksivara/sfwr4c03/projects/4c03projects/CGLucas-Project.pdf
[5] S. Dhanaraj, Dr. V. Karthikeyani, “A Study on
E-mail Image Spam Filtering Techniques”, Proceedings
of the 2013 International Conference on Pattern
Recognition, Informatics and Mobile Engineering
(PRIME), 2013 IEEE.
[6] Prachi Oswal and Prof. Anurag Jain, “Spam and The
Techniques Used for Spam Filters: A Review”,
International Journal of Engineering Trends and
Technology (IJETT), Volume 4, Issue 5, May 2013.
[7] Enrico Blanzieri, Anton Bryl, “A survey of Learning
Based Techniques of email spam filtering”, Springer
Science+Business Media B.V. 2009.
[8] Ernesto Damiani, Sabrina De Capitani di Vimercati,
Stefano Paraboschi, Pierangela Samarati, “P2P-Based
Collaborative Spam Detection and Filtering”.
[9] Godwin Caruana, Maozhen Li, “A Survey
of Emerging Approaches to Spam Filtering”, ACM
Computing Surveys, Vol. 44, No. 2, Article 9,
Publication date: February 2012.
[10] Bhawana S. Dakhare, Ujwala V. Gaikwad, “Spam
Detection and Filtering using Different Methods”,
MEDHA-2012, Proceedings published by
International Journal of Computer Applications®
(IJCA).