1. Adversarial pattern classification using multiple classifiers and randomization
Battista Biggio, Giorgio Fumera, Fabio Roli
Pattern Recognition and Applications Group
Department of Electrical and Electronic Engineering
University of Cagliari, Italy
S+SSPR 2008, Orlando, Florida, December 4th, 2008
2. Standard pattern classification model
[Figure: a physical process generates a pattern (image, text document, ...); the acquisition/measurement stage, affected by random noise, produces a feature vector (x1, x2, ..., xn) that feeds the classifier and the learning algorithm]
Example: OCR
But many security applications, such as spam filtering, do not fit this model well:
noise is not random but adversarial: errors are malicious
false negatives are not random; they are crafted to evade the classifier
training data can be “tainted” by the attacker
an important property of a classifier is its “hardness of evasion”, that is, the effort the attacker must make to evade it
3. Adversarial pattern classification
[Figure: the same model, but the pattern (e-mail, network packet, fingerprint, ...) is affected by adversarial noise; measurement produces the feature vector (x1, x2, ..., xn) fed to the classifier and the learning algorithm]
Example: spam e-mails
    CNBC Features MPRG on Power Lunch Today, Price Climbs 74%!
    The Motion Picture Group
    Symbol: MPRG
    Price: $0.33 UP 74%
It’s a game with two players: the classifier and the adversary
The adversary camouflages illegitimate patterns in an adversarial way to evade the classifier
The classifier should be adversary-aware, to handle the adversarial noise and to implement defence strategies
4. An example of adversarial classification
Spam filtering, 1st round
    From: spam@example.it
    Buy Viagra !
Linear classifier with feature weights:
    buy = 1.0
    viagra = 5.0
Total score = 6.0 > 5.0 (threshold) => Spam
Note that the popular SpamAssassin filter is really a linear classifier
See http://spamassassin.apache.org
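To make the scoring in this example concrete, here is a minimal Python sketch of such a linear filter; the whitespace tokenizer and the two-entry weight table are simplifications of ours, not SpamAssassin's actual rules.

    # Minimal sketch of the linear classifier in the example above.
    # The weights and the 5.0 threshold come from the slide; the whitespace
    # tokenizer is a deliberate simplification of what a real filter does.

    WEIGHTS = {"buy": 1.0, "viagra": 5.0}
    THRESHOLD = 5.0

    def score(message: str, weights=WEIGHTS) -> float:
        """Sum the weights of the known tokens occurring in the message."""
        return sum(weights.get(tok, 0.0) for tok in message.lower().split())

    def classify(message: str) -> str:
        return "Spam" if score(message) > THRESHOLD else "Ham"

    print(score("Buy Viagra !"), classify("Buy Viagra !"))  # 6.0 Spam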
5. A game in the feature space…
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
1st round
[Figure: in the feature space (X1, X2), the boundary yc(x) learnt with feature weights buy = 1.0, viagra = 5.0 separates legitimate (−) from spam (+) patterns; the message "Buy Viagra!" falls on the spam side]
Classifier’s weights are learnt using an initial “untainted” training set
See, for example, the case of the SpamAssassin filter
http://spamassassin.apache.org/full/3.0.x/dist/masses/README.perceptron
6. An example of adversarial classification
Spammer attacks by adding “good” words…
2nd round
    From: spam@example.it
    Buy Viagra!
    Florida University Nanjing
Linear classifier with feature weights:
    buy = 1.0
    viagra = 5.0
    University = -2.0
    Florida = -3.0
Total score = 1.0 < 5.0 (threshold) => Ham
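Continuing the toy sketch above (our illustration, not the actual filter), the good-word attack simply exploits the negative weights:

    # 2nd round: the spammer appends words with negative weights, pushing the
    # score of the same spam message below the 5.0 threshold.
    WEIGHTS_2 = {"buy": 1.0, "viagra": 5.0, "university": -2.0, "florida": -3.0}

    msg = "Buy Viagra ! Florida University Nanjing"
    total = sum(WEIGHTS_2.get(tok, 0.0) for tok in msg.lower().split())
    print(total, "Spam" if total > 5.0 else "Ham")  # 1.0 Ham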
7. A game in the feature space…
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
Spammer attacks by adding “good” words…
2nd round
[Figure: the camouflaged message "Buy Viagra! Florida University Nanjing" now falls on the legitimate (−) side of yc(x); feature weights: buy = 1.0, viagra = 5.0, University = −2.0, Florida = −3.0]
Adding good words is a typical trick used by spammers to evade a filter
The spammer’s goal is to modify the mail so that the filter is evaded while the message remains understandable to humans
8. Modelling the spammer’s attack strategy
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
[Figure: in the feature space (X1, X2), the strategy A(x) moves a malicious pattern x across the boundary yc(x) to a camouflaged pattern x′]
The adversary uses a strategy function A(x) to select the malicious patterns that can be camouflaged as innocent with minimum cost W(x, x′):
A(x) = argmax_{x′ ∈ X} [ U_A(yc(x′), +) − W(x, x′) ]
The adversary’s utility is higher when malicious patterns are misclassified: U_A(−, +) > U_A(+, +)
For spammers, the cost W(x, x′) is related to adding words, replacing words, etc.
The adversary transforms a malicious pattern x into an innocent-looking pattern x′ if the camouflage cost W(x, x′) is lower than the utility gain
In spam filtering, the adversary selects the spam mails which can be camouflaged as ham with a minimum number of modifications to the mail content
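As a toy illustration of A(x) in code (the utilities match the example on a later slide; the candidate generator and the per-word cost are assumptions of ours, not the paper’s model), the adversary enumerates candidate camouflages and keeps the most profitable one:

    # Toy sketch of the adversary's strategy A(x): among candidate camouflages
    # x' of a malicious pattern x, keep the one maximizing utility minus cost.
    # U_A, the candidate generator and the per-word cost W are illustrative.

    WEIGHTS = {"buy": 1.0, "viagra": 5.0, "university": -2.0, "florida": -3.0}
    THRESHOLD = 5.0
    U_A = {"+": 0.0, "-": 5.0}  # adversary's utility when a spam pattern is labelled +/-

    def yc(x: str) -> str:
        return "+" if sum(WEIGHTS.get(t, 0.0) for t in x.lower().split()) > THRESHOLD else "-"

    def candidates(x: str):
        # here: appending one good word, or both; real adversaries try many edits
        yield x + " university"
        yield x + " florida"
        yield x + " university florida"

    def W(x: str, x_prime: str) -> float:
        return float(len(x_prime.split()) - len(x.split()))  # cost = words added

    def A(x: str) -> str:
        best, best_gain = x, U_A[yc(x)]      # leaving x unchanged costs nothing
        for xp in candidates(x):
            gain = U_A[yc(xp)] - W(x, xp)
            if gain > best_gain:
                best, best_gain = xp, gain
        return best

    print(A("Buy Viagra !"))  # "Buy Viagra ! university" already crosses the threshold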
9. An example of adversarial classification
Classifier reaction by retraining…
3rd round
    From: spam@example.it
    Buy Viagra!
    Florida University Nanjing
Linear classifier with retrained feature weights:
    buy = 1.0
    viagra = 5.0
    University = -0.3
    Florida = -0.3
Total score = 5.4 > 5.0 (threshold) => Spam
10. Modelling classifier reaction
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
Classifier retraining…
3rd round
[Figure: the retrained boundary yc(x) puts the camouflaged message "Buy Viagra! Florida University Nanjing" back on the spam (+) side]
The classifier is adversary-aware: it takes into account the previous moves of the adversary
In real cases, this means that the filter’s user provides the correct labels for mislabelled mails
The classifier constructs a new decision boundary yc(x) if this move yields a utility higher than the cost of extracting features and re-training
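A toy sketch of this reaction step (the tiny bag-of-words data set is invented, and logistic regression stands in for whatever learner the filter uses): retraining on the user-corrected label shrinks the negative good-word weights the attack exploited.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    VOCAB = ["buy", "viagra", "university", "florida"]

    def vec(msg: str):
        toks = msg.lower().split()
        return [float(w in toks) for w in VOCAB]

    # initial (invented) training data: one spam mail, two ham mails
    X = [vec("buy viagra"), vec("university florida seminar"), vec("cheap florida vacation")]
    y = [1, 0, 0]  # 1 = spam

    w_before = LogisticRegression().fit(np.array(X), y).coef_.ravel()

    # user feedback: the good-word-padded spam receives its correct label
    X.append(vec("buy viagra florida university"))
    y.append(1)

    w_after = LogisticRegression().fit(np.array(X), y).coef_.ravel()
    for word, b, a in zip(VOCAB, w_before.round(2), w_after.round(2)):
        print(word, b, a)  # the negative "good word" weights shrink in magnitude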
11. Adversary-aware classifier
N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
Results reported in this paper showed that classifier performance
significantly degrades if the adversarial nature of the task is not taken
into account, while an adversary-aware classifier can perform
significantly better
By anticipating the adversary’s strategy, we can defeat it.
Real anti-spam filters should be adversary-aware, which means that they should adapt to and anticipate the adversary’s moves: exploiting the feedback of the users, changing their operation, etc.
“If you know the enemy and know yourself, you need not fear
the result of a hundred battles”
(Sun Tzu, 500 BC)
12. Defence strategies in adversarial classification
Beyond classifier retraining…
Real anti-spam filters can be re-trained using the feedback of users, who can provide the correct labels for mislabelled mails. In the model of Dalvi et al., this corresponds to assuming perfect knowledge of the adversary’s strategy function A(x)
Beyond retraining, are there other defence strategies that we can implement?
[Figure: in the feature space (x1, x2), a malicious pattern x ("BUY VI@GRA!") is moved by minimum-cost camouflage(s) across the decision boundary between C(x) = + and C(x) = −]
13. A defence strategy: hiding information by randomization
[Figure: two random realizations y1(x), y2(x) of the decision boundary yc(x); the adversary, holding the camouflaged pattern x′, wonders "Am I evading it?"]
An intuitive strategy for making a classifier harder to evade is to hide information about it from the adversary
A possible implementation of this strategy is to introduce some randomness in
the placement of the classification boundary
“Keep the adversary guessing. If your strategy is a mystery, it cannot be
counteracted. This gives you a significant advantage”
(Sun Tzu, 500 BC)
14. A defence strategy: hiding information by randomization
[Figure: two random realizations y1(x), y2(x) of yc(x); the transformation A(x) = x′ evades y2(x) but not y1(x)]
Consider a randomized classifier yc(x, T), where the random variable is the training set T
Example: assume that U_A(−, +) = 5, U_A(+, +) = 0, W(x, x′) = 3
Case 1: the adversary knows that the actual boundary is y2(x)
The adversary’s gain if the pattern x is changed into x′ is U_A(−, +) − W(x, x′) = 5 − 3 = 2 > 0, so the adversary makes the transformation and evades the classifier
Case 2: two random boundaries with P(y1(x)) = P(y2(x)) = 0.5
Since x′ evades y2(x) but not y1(x), the expected gain is:
[U_A(+, +) · P(y1(x)) + U_A(−, +) · P(y2(x))] − W(x, x′) = [0 · 0.5 + 5 · 0.5] − 3 = 2.5 − 3 < 0,
so the adversary does not move, even though the move would have evaded the classifier whenever y2(x) is the deployed realization
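The example’s arithmetic can be checked directly with a trivial snippet using the slide’s numbers:

    # Utilities and cost from the slide's example.
    U_evaded, U_caught, W_cost = 5.0, 0.0, 3.0   # U_A(-,+), U_A(+,+), W(x, x')

    gain_known = U_evaded - W_cost                        # Case 1: 5 - 3 = 2 > 0 -> attack pays
    expected = 0.5 * U_caught + 0.5 * U_evaded - W_cost   # Case 2: 2.5 - 3 = -0.5 < 0 -> it doesn't
    print(gain_known, expected)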
15. A defence strategy: hiding information by randomization
[Figure: the adversary, not knowing which realization of yc(x) is deployed, wonders "Am I evading it?" before transforming x into x′]
Why is a randomized classifier harder to evade?
Key points:
yc(x) becomes a random variable Y_C
The adversary has to compute the expected value of A(x), averaging over the possible realizations of yc(x):
E{A(x)} = argmax_{x′ ∈ X} [ E_{Y_C}{ U_A(yc(x′), +) } − W(x, x′) ]
E{A(x)} ≠ A(x | yc(x′)) = A_opt(x)
In the Proceedings paper we show that the adversary’s strategy A(x) becomes suboptimal: the adversary either does not camouflage malicious patterns whose camouflage would have evaded the classifier, or camouflages malicious patterns which the classifier would have misclassified anyway.
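In code, the adversary’s best response against the randomized classifier looks roughly like the sketch below; it reuses the toy candidates, W and U_A from the earlier strategy example, and realizations stands for a sample of classifiers drawn from Y_C.

    def expected_A(x, realizations, candidates, W, U_A):
        """Suboptimal strategy vs. a randomized classifier: maximize the utility
        averaged over the realizations of yc, minus the camouflage cost."""
        options = [x] + list(candidates(x))  # W(x, x) = 0: staying put is an option
        def expected_gain(xp):
            avg_u = sum(U_A[yc(xp)] for yc in realizations) / len(realizations)
            return avg_u - W(x, xp)
        return max(options, key=expected_gain)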
16. Evade-hard MCS with randomization
[Figure: SpamAssassin architecture — filtering modules (Header Analysis, Black/White List, URL Filter, Signature Filter, Content Analysis, …) combined by a weighted sum Σ that assigns the class, spam or legitimate. See http://spamassassin.apache.org]
The defence strategy based on “randomization” can be implemented in several ways
We implemented it using the multiple classifier approach, by randomizing the combination function
For our experiments we used the SpamAssassin filter, which is basically a linearly weighted combination of classifiers, and randomized the weights by training-set bootstrapping
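A rough sketch of that randomization step, assuming NumPy arrays X (per-rule scores) and y (spam/ham labels); scikit-learn’s LinearSVC stands in for the linear-kernel SVM, and the function names are ours.

    import numpy as np
    from sklearn.svm import LinearSVC

    def bootstrap_weight_sets(X, y, n_sets=100, seed=0):
        """Learn one weight vector per bootstrap replicate of the training set."""
        rng = np.random.default_rng(seed)
        sets = []
        for _ in range(n_sets):
            idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
            # (a replicate containing a single class would need re-sampling; omitted)
            clf = LinearSVC().fit(X[idx], y[idx])
            sets.append((clf.coef_.ravel(), clf.intercept_[0]))
        return sets

    def randomized_classify(x, weight_sets, rng):
        """Score with one randomly drawn realization of the combining weights."""
        w, b = weight_sets[rng.integers(len(weight_sets))]
        return "spam" if x @ w + b > 0 else "legitimate"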
17. Experiments with multiple classifiers and randomization
E-mail data set: TREC 2007
75,419 real e-mail messages received between April and July 2007
25,220 ham, 50,199 spam
Experimental set-up
We used the SpamAssassin filter with a weighted sum as combination function (an SVM with linear kernel)
We assume that the adversary can make any modification which reduces the score of a rule
Key point: the adversary does not know the actual set of weights deployed for combining the multiple classifiers (the filtering rules), so it can devise only a suboptimal strategy A(x)
Randomization of the combination function is obtained by bootstrap: the adversary “sees” 100 different sets of weights with identical probability

              det      rnd
    U_A       0.98     0.56
    U_C       1.30     1.46
    FN (%)    19.55    11.21

The average false negative rate decreases from 19.55% to 11.21% when the classifier uses randomization
This is confirmed by the decrease of the adversary’s utility and the increase of the classifier’s utility