1. Adversarial pattern classification using multiple classifiers and randomization
Battista Biggio, Giorgio Fumera, Fabio Roli
Pattern Recognition and Applications Group
Department of Electrical and Electronic Engineering
University of Cagliari, Italy
S+SSPR 2008, Orlando, Florida, December 4th, 2008
2. Standard pattern classification model
[Figure: a physical process generates a pattern (image, text document, ...); the acquisition/measurement stage, affected by random noise, produces a feature vector (x1, x2, ..., xn) that feeds the classifier and the learning algorithm]
Example: OCR
But many security applications, such as spam filtering, do not fit this model well:
noise is not random but adversarial: errors are malicious
false negatives are not random; they are crafted to evade the classifier
training data can be “tainted” by the attacker
an important property of a classifier is its “hardness of evasion”, that is, the effort the attacker must make to evade it
3. Adversarial pattern classification
[Figure: the same model, but the pattern (e-mail, network packet, fingerprint, ...) is affected by adversarial noise; measurement produces the feature vector (x1, x2, ..., xn) fed to the classifier and the learning algorithm]
Example: spam e-mails
    CNBC Features MPRG on Power Lunch Today, Price Climbs 74%!
    The Motion Picture Group
    Symbol: MPRG
    Price: $0.33 UP 74%
It’s a game with two players: the classifier and the adversary
The adversary camouflages illegitimate patterns in an adversarial way to evade the classifier
The classifier should be adversary-aware, to handle the adversarial noise and to implement defence strategies
4. An example of adversarial classification
Spam filtering, 1st round
    From: spam@example.it
    Buy Viagra !
Linear classifier with feature weights:
    buy = 1.0
    viagra = 5.0
Total score = 6.0 > 5.0 (threshold) => Spam
Note that the popular SpamAssassin filter is really a linear classifier
See http://spamassassin.apache.org
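To make the scoring in this example concrete, here is a minimal Python sketch of such a linear filter; the whitespace tokenizer and the two-entry weight table are simplifications of ours, not SpamAssassin's actual rules.

    # Minimal sketch of the linear classifier in the example above.
    # The weights and the 5.0 threshold come from the slide; the whitespace
    # tokenizer is a deliberate simplification of what a real filter does.

    WEIGHTS = {"buy": 1.0, "viagra": 5.0}
    THRESHOLD = 5.0

    def score(message: str, weights=WEIGHTS) -> float:
        """Sum the weights of the known tokens occurring in the message."""
        return sum(weights.get(tok, 0.0) for tok in message.lower().split())

    def classify(message: str) -> str:
        return "Spam" if score(message) > THRESHOLD else "Ham"

    print(score("Buy Viagra !"), classify("Buy Viagra !"))  # 6.0 Spam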
5. A game in the feature space…
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
1st round
[Figure: in the feature space (X1, X2), the boundary yc(x) learnt with feature weights buy = 1.0, viagra = 5.0 separates legitimate (−) from spam (+) patterns; the message "Buy Viagra!" falls on the spam side]
Classifier’s weights are learnt using an initial “untainted” training set
See, for example, the case of the SpamAssassin filter
http://spamassassin.apache.org/full/3.0.x/dist/masses/README.perceptron
6. An example of adversarial classification
Spammer attacks by adding “good” words…
2nd round
    From: spam@example.it
    Buy Viagra!
    Florida University Nanjing
Linear classifier with feature weights:
    buy = 1.0
    viagra = 5.0
    University = -2.0
    Florida = -3.0
Total score = 1.0 < 5.0 (threshold) => Ham
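Continuing the toy sketch above (our illustration, not the actual filter), the good-word attack simply exploits the negative weights:

    # 2nd round: the spammer appends words with negative weights, pushing the
    # score of the same spam message below the 5.0 threshold.
    WEIGHTS_2 = {"buy": 1.0, "viagra": 5.0, "university": -2.0, "florida": -3.0}

    msg = "Buy Viagra ! Florida University Nanjing"
    total = sum(WEIGHTS_2.get(tok, 0.0) for tok in msg.lower().split())
    print(total, "Spam" if total > 5.0 else "Ham")  # 1.0 Ham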
7. A game in the feature space…
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
Spammer attacks by adding “good” words…
2nd round
[Figure: the camouflaged message "Buy Viagra! Florida University Nanjing" now falls on the legitimate (−) side of yc(x); feature weights: buy = 1.0, viagra = 5.0, University = −2.0, Florida = −3.0]
Adding good words is a typical trick used by spammers to evade a filter
The spammer’s goal is to modify the mail so that the filter is evaded while the message remains understandable to humans
8. Modelling the spammer’s attack strategy
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
[Figure: in the feature space (X1, X2), the strategy A(x) moves a malicious pattern x across the boundary yc(x) to a camouflaged pattern x′]
The adversary uses a strategy function A(x) to select the malicious patterns that can be camouflaged as innocent with minimum cost W(x, x′):
A(x) = argmax_{x′ ∈ X} [ U_A(yc(x′), +) − W(x, x′) ]
The adversary’s utility is higher when malicious patterns are misclassified: U_A(−, +) > U_A(+, +)
For spammers, the cost W(x, x′) is related to adding words, replacing words, etc.
The adversary transforms a malicious pattern x into an innocent-looking pattern x′ if the camouflage cost W(x, x′) is lower than the utility gain
In spam filtering, the adversary selects the spam mails which can be camouflaged as ham with a minimum number of modifications to the mail content
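As a toy illustration of A(x) in code (the utilities match the example on a later slide; the candidate generator and the per-word cost are assumptions of ours, not the paper’s model), the adversary enumerates candidate camouflages and keeps the most profitable one:

    # Toy sketch of the adversary's strategy A(x): among candidate camouflages
    # x' of a malicious pattern x, keep the one maximizing utility minus cost.
    # U_A, the candidate generator and the per-word cost W are illustrative.

    WEIGHTS = {"buy": 1.0, "viagra": 5.0, "university": -2.0, "florida": -3.0}
    THRESHOLD = 5.0
    U_A = {"+": 0.0, "-": 5.0}  # adversary's utility when a spam pattern is labelled +/-

    def yc(x: str) -> str:
        return "+" if sum(WEIGHTS.get(t, 0.0) for t in x.lower().split()) > THRESHOLD else "-"

    def candidates(x: str):
        # here: appending one good word, or both; real adversaries try many edits
        yield x + " university"
        yield x + " florida"
        yield x + " university florida"

    def W(x: str, x_prime: str) -> float:
        return float(len(x_prime.split()) - len(x.split()))  # cost = words added

    def A(x: str) -> str:
        best, best_gain = x, U_A[yc(x)]      # leaving x unchanged costs nothing
        for xp in candidates(x):
            gain = U_A[yc(xp)] - W(x, xp)
            if gain > best_gain:
                best, best_gain = xp, gain
        return best

    print(A("Buy Viagra !"))  # "Buy Viagra ! university" already crosses the threshold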
9. An example of adversarial classification
Classifier reaction by retraining…
3rd round
    From: spam@example.it
    Buy Viagra!
    Florida University Nanjing
Linear classifier with retrained feature weights:
    buy = 1.0
    viagra = 5.0
    University = -0.3
    Florida = -0.3
Total score = 5.4 > 5.0 (threshold) => Spam
10. Modelling classifier reaction
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
Classifier retraining…
3rd round
[Figure: the retrained boundary yc(x) puts the camouflaged message "Buy Viagra! Florida University Nanjing" back on the spam (+) side]
The classifier is adversary-aware: it takes into account the previous moves of the adversary
In real cases, this means that the filter’s user provides the correct labels for mislabelled mails
The classifier constructs a new decision boundary yc(x) if this move yields a utility higher than the cost of extracting features and re-training
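A toy sketch of this reaction step (the tiny bag-of-words data set is invented, and logistic regression stands in for whatever learner the filter uses): retraining on the user-corrected label shrinks the negative good-word weights the attack exploited.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    VOCAB = ["buy", "viagra", "university", "florida"]

    def vec(msg: str):
        toks = msg.lower().split()
        return [float(w in toks) for w in VOCAB]

    # initial (invented) training data: one spam mail, two ham mails
    X = [vec("buy viagra"), vec("university florida seminar"), vec("cheap florida vacation")]
    y = [1, 0, 0]  # 1 = spam

    w_before = LogisticRegression().fit(np.array(X), y).coef_.ravel()

    # user feedback: the good-word-padded spam receives its correct label
    X.append(vec("buy viagra florida university"))
    y.append(1)

    w_after = LogisticRegression().fit(np.array(X), y).coef_.ravel()
    for word, b, a in zip(VOCAB, w_before.round(2), w_after.round(2)):
        print(word, b, a)  # the negative "good word" weights shrink in magnitude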
11. Adversary-aware classifier
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
Results reported in this paper showed that classifier performance
significantly degrades if the adversarial nature of the task is not taken
into account, while an adversary-aware classifier can perform
significantly better
By anticipating the adversary’s strategy, we can defeat it.
Real anti-spam filters should be adversary-aware, which means that they should adapt to and anticipate the adversary’s moves: exploiting the feedback of the users, changing their operation, etc.
“If you know the enemy and know yourself, you need not fear
the result of a hundred battles”
(Sun Tzu, 500 BC)
12. Defence strategies in adversarial classification
Beyond classifier retraining…
Real anti-spam filters can be re-trained using the feedback of users, who can provide the correct labels for mislabelled mails. In the model of Dalvi et al., this corresponds to assuming perfect knowledge of the adversary’s strategy function A(x)
Beyond retraining, are there other defence strategies that we can implement?
[Figure: in the feature space (x1, x2), a malicious pattern x ("BUY VI@GRA!") is moved by minimum-cost camouflage(s) across the decision boundary between C(x) = + and C(x) = −]
13. A defence strategy: hiding information by randomization
[Figure: two random realizations y1(x), y2(x) of the decision boundary yc(x); the adversary, holding the camouflaged pattern x′, wonders "Am I evading it?"]
An intuitive strategy for making a classifier harder to evade is to hide information about it from the adversary
A possible implementation of this strategy is to introduce some randomness in
the placement of the classification boundary
“Keep the adversary guessing. If your strategy is a mystery, it cannot be
counteracted. This gives you a significant advantage”
(Sun Tzu, 500 BC)
14. A defence strategy: hiding information by randomization
[Figure: two random realizations y1(x), y2(x) of yc(x); the transformation A(x) = x′ evades y2(x) but not y1(x)]
Consider a randomized classifier yc(x, T), where the random variable is the training set T
Example: assume that U_A(−, +) = 5, U_A(+, +) = 0, W(x, x′) = 3
Case 1: the adversary knows that the actual boundary is y2(x)
The adversary’s gain if the pattern x is changed into x′ is U_A(−, +) − W(x, x′) = 5 − 3 = 2 > 0, so the adversary makes the transformation and evades the classifier
Case 2: two random boundaries with P(y1(x)) = P(y2(x)) = 0.5
Since x′ evades y2(x) but not y1(x), the expected gain is:
[U_A(+, +) · P(y1(x)) + U_A(−, +) · P(y2(x))] − W(x, x′) = [0 · 0.5 + 5 · 0.5] − 3 = 2.5 − 3 < 0,
so the adversary does not move, even though the move would have evaded the classifier whenever y2(x) is the deployed realization
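The example’s arithmetic can be checked directly with a trivial snippet using the slide’s numbers:

    # Utilities and cost from the slide's example.
    U_evaded, U_caught, W_cost = 5.0, 0.0, 3.0   # U_A(-,+), U_A(+,+), W(x, x')

    gain_known = U_evaded - W_cost                        # Case 1: 5 - 3 = 2 > 0 -> attack pays
    expected = 0.5 * U_caught + 0.5 * U_evaded - W_cost   # Case 2: 2.5 - 3 = -0.5 < 0 -> it doesn't
    print(gain_known, expected)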
15. A defence strategy: hiding information by randomization
[Figure: the adversary, not knowing which realization of yc(x) is deployed, wonders "Am I evading it?" before transforming x into x′]
Why is a randomized classifier harder to evade?
Key points:
yc(x) becomes a random variable Y_C
The adversary has to compute the expected value of A(x), averaging over the possible realizations of yc(x):
E{A(x)} = argmax_{x′ ∈ X} [ E_{Y_C}{ U_A(yc(x′), +) } − W(x, x′) ]
E{A(x)} ≠ A(x | yc(x′)) = A_opt(x)
In the Proceedings paper we show that the adversary’s strategy A(x) becomes suboptimal: the adversary either does not camouflage malicious patterns whose camouflage would have evaded the classifier, or camouflages malicious patterns which the classifier would have misclassified anyway.
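In code, the adversary’s best response against the randomized classifier looks roughly like the sketch below; it reuses the toy candidates, W and U_A from the earlier strategy example, and realizations stands for a sample of classifiers drawn from Y_C.

    def expected_A(x, realizations, candidates, W, U_A):
        """Suboptimal strategy vs. a randomized classifier: maximize the utility
        averaged over the realizations of yc, minus the camouflage cost."""
        options = [x] + list(candidates(x))  # W(x, x) = 0: staying put is an option
        def expected_gain(xp):
            avg_u = sum(U_A[yc(xp)] for yc in realizations) / len(realizations)
            return avg_u - W(x, xp)
        return max(options, key=expected_gain)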
16. Evade-hard MCS with randomization
[Figure: SpamAssassin architecture — filtering modules (Header Analysis, Black/White List, URL Filter, Signature Filter, Content Analysis, …) combined by a weighted sum Σ that assigns the class, spam or legitimate. See http://spamassassin.apache.org]
The defence strategy based on “randomization” can be implemented in several ways
We implemented it using the multiple classifier approach, by randomizing the combination function
For our experiments we used the SpamAssassin filter, which is basically a linearly weighted combination of classifiers, and randomized the weights by training-set bootstrapping
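A rough sketch of that randomization step, assuming NumPy arrays X (per-rule scores) and y (spam/ham labels); scikit-learn’s LinearSVC stands in for the linear-kernel SVM, and the function names are ours.

    import numpy as np
    from sklearn.svm import LinearSVC

    def bootstrap_weight_sets(X, y, n_sets=100, seed=0):
        """Learn one weight vector per bootstrap replicate of the training set."""
        rng = np.random.default_rng(seed)
        sets = []
        for _ in range(n_sets):
            idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
            # (a replicate containing a single class would need re-sampling; omitted)
            clf = LinearSVC().fit(X[idx], y[idx])
            sets.append((clf.coef_.ravel(), clf.intercept_[0]))
        return sets

    def randomized_classify(x, weight_sets, rng):
        """Score with one randomly drawn realization of the combining weights."""
        w, b = weight_sets[rng.integers(len(weight_sets))]
        return "spam" if x @ w + b > 0 else "legitimate"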
17. Experiments with multiple classifiers and randomization
E-mail data set: TREC 2007
75,419 real e-mail messages received between April and July 2007
25,220 ham, 50,199 spam
Experimental set-up
We used the SpamAssassin filter with a weighted sum as combination function (an SVM with linear kernel)
We assume that the adversary can make any modification which reduces the score of a rule
Key point: the adversary does not know the actual set of weights deployed for combining the multiple classifiers (the filtering rules), so it can devise only a suboptimal strategy A(x)
Randomization of the combination function is obtained by bootstrap: the adversary “sees” 100 different sets of weights with identical probability

              det      rnd
    U_A       0.98     0.56
    U_C       1.30     1.46
    FN (%)    19.55    11.21

The average false negative rate decreases from 19.55% to 11.21% when the classifier uses randomization
This is confirmed by the decrease of the adversary’s utility and the increase of the classifier’s utility