6. A simpler time
• Sending spam is an incredibly complicated
scheme these days
• Highly distributed bot nets of unsuspecting,
heterogeneous machines
• The result of a long, long arms race
• That means that combatting it is
complicated as well
7. Skynet is here
• Bots/scripts are able to sign up for accounts
(including filling out captchas), log into
Flickr, upload photos, set their buddy icon,
and start sending spam.
• You can also buy these accounts in bulk...
9. The Harsh Truth
Someone whose time is really cheap is
constantly working to send spam through
your site
18. Social Sites as Gateways
• User-generated content
• “User” is a broad category that includes
“spammer asshole”
• Email notifications for said content
• Relationship based, trust inducing
• Mom gets excited any time she gets an
email from Flickr
19. What Trust Means
• Something familiar that a user is used to
opening
• Increases the likelihood that a user will
open the email and perform whatever it is
that you want them to
• Piggybacking on the research and work
done by the site itself!
20. More on Trust
• Very easy to lose - other services will
blackhole mail coming from your domains
• Users stop coming
• Very hard to regain - the burden of proof
ends up entirely on you
27. Econ Continued
The more well-known your site gets, the
higher the demand for your message delivery
mechanism - more likely a recipient will
actually open the message
30. Some Numbers
"Spamalytics: An Empirical Analysis of Spam
Marketing Conversion"
C. Kanich, C. Kreibich, K. Levchenko, B.
Enright, G. Voelker, V. Paxson, and S. Savage.
15th ACM Conference on Computer and
Communications Security (CCS), 27-31
October 2008, Alexandria, VA.
http://www.icsi.berkeley.edu/pubs/networking/2008-ccs-spamalytics.pdf
31. Some Numbers
• 0.0000081% overall conversion rate
• 28 conversions for every 347,590,389
emails attempted
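The two figures on this slide are the same number stated two ways. A minimal sketch of the arithmetic, using only the values quoted above (the variable names are illustrative):

```python
# Figures quoted on the slide (from the Spamalytics paper).
conversions = 28
emails_attempted = 347_590_389

rate = conversions / emails_attempted              # fraction of attempts that converted
rate_percent = rate * 100                          # the same value as a percentage
emails_per_conversion = emails_attempted / conversions

print(f"{rate_percent:.7f}%")                      # 0.0000081%
print(f"{emails_per_conversion:,.0f} emails per conversion")  # 12,413,942 emails per conversion
```

Put another way, roughly one sale for every 12.4 million emails attempted.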
32. Where we fit
• Only ~25% of the attempted emails were
actually accepted by the mail server
(first step in the funnel)
• Using a social site as a gateway almost
guarantees a higher number
• A whole lot of effort goes into making
sure notifications get delivered
33. Put some $$ on it
• $3.5 million of revenue in a year
• 5% increase in delivery rate = $175,000/yr
35. Back to Trust
• This can’t be ignored
• Remember, once you lose that trust, it’s a
long way back up
• As you lose your trustworthiness as a
message gateway, the spammers go away
• ... but so do the users
45. A Confession
I almost always have to type a captcha code twice
Bots consistently evolve to solve incrementally
complex variations
Draw your conclusions
50. The Tension
• Want to be able to allow users to send
messages and generally enjoy themselves
• Don’t want to make it too easy to send
spam
• Traditional prevention techniques like
captchas result in epic degradation of UX
and ultimately end up ineffective
51. Traditional Response
• User reports
• Manual account removal
• Manual message cleanup
• except you can’t clean up the email once it’s
sent
• Manually adding patterns to a list of things to
filter
• Engineers running mass deletion/cleanup scripts
54. What a Waste
• Responding to incidents this way is a huge
drain on resources and morale
• That’s time your team could be spending
on projects, features, being happy...
60. Make Time
• Product teams might be reluctant to put
spam on the roadmap and dedicate
resources to it
• ... until you miss a bunch of deadlines
because you’re too busy cleaning up spam
• ... and your notifications aren’t being
delivered because you’re blacklisted
62. Develop a Strategy
• A spam attack is no different than a typical
DoS or outage - you need a plan
• Figure out what data you need and whether
or not you already have it
• Figure out ways to consolidate and
automate the work
63. Build your Tools
• Make things reusable
• a user should look the same in all tools
• tools that show lists of users should
reuse the same logic for batch ops
• Leave a consistent trail
64. Look at the Big Picture
• Your tools should be very well integrated
• your user report tools should pop
suspected accounts into review tools
• deleting accounts and messages should
automatically close user report cases
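One way to picture this kind of integration, together with the "consistent trail" idea from the previous slide, is a single batch operation that every tool calls. This is only a sketch: `AbuseDesk`, `ReportCase`, and the shape of the audit log are hypothetical names invented for illustration, not anything from the talk.

```python
from dataclasses import dataclass, field

@dataclass
class ReportCase:
    case_id: int
    reported_user: str
    status: str = "open"

@dataclass
class AbuseDesk:
    """Hypothetical shared backend that every admin tool calls into."""
    cases: list = field(default_factory=list)
    messages: dict = field(default_factory=dict)   # user_id -> list of message ids
    audit_log: list = field(default_factory=list)

    def delete_accounts(self, user_ids, actor):
        """Batch op: drop each user's messages, close their open report
        cases, and record every step in the audit log."""
        for user_id in user_ids:
            removed = self.messages.pop(user_id, [])
            self.audit_log.append((actor, "delete_account", user_id, len(removed)))
            for case in self.cases:
                if case.reported_user == user_id and case.status == "open":
                    case.status = "closed"
                    self.audit_log.append((actor, "close_case", case.case_id, user_id))

desk = AbuseDesk(
    cases=[ReportCase(1, "spam1"), ReportCase(2, "mom")],
    messages={"spam1": [10, 11], "mom": [12]},
)
desk.delete_accounts(["spam1"], actor="oncall")
```

Because the list view and the single-user view would both go through `delete_accounts`, the cleanup and the trail it leaves look the same no matter which tool triggered it.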
65. The goal is to be able to have one person look at
a single tool, make decisions, and go back to sleep
69. Work Smart
• Spam is limited to going from one user on
your site to another user on your site
• That forces certain behavior patterns -
know what those are for your site
70. Work Smart, continued
• If you have some obstacles at signup time
(captcha, mass signup detection), you can pretty
much expect two things:
• a slow trickle of signups (to get around signup-
time mass signup checks)
• a sudden surge of messages
• Constant “under the radar” trickle doesn’t make
sense - if you delete the accounts after a few
user reports, they don’t get their payload sent
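The "sudden surge" half of that pattern is cheap to look for. Below is a minimal sketch of a sliding-window check over send timestamps; the function name, the 10-minute window, and the 20-message threshold are assumptions chosen for illustration and would need tuning for a real site.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def find_bursty_senders(events, window=timedelta(minutes=10), burst_size=20):
    """events: iterable of (sender_id, sent_at) pairs.
    Returns the set of senders with at least `burst_size` messages
    inside any span of length `window`."""
    by_sender = defaultdict(list)
    for sender, sent_at in events:
        by_sender[sender].append(sent_at)

    flagged = set()
    for sender, times in by_sender.items():
        times.sort()
        start = 0
        for end, t in enumerate(times):
            # Advance the left edge until the span fits inside `window`.
            while t - times[start] > window:
                start += 1
            if end - start + 1 >= burst_size:
                flagged.add(sender)
                break
    return flagged

base = datetime(2009, 1, 1)
events = [("spammer", base + timedelta(seconds=i)) for i in range(25)]
events += [("mom", base + timedelta(days=i)) for i in range(25)]
flagged = find_bursty_senders(events)
```

Here `flagged` contains only `"spammer"`: both accounts sent 25 messages, but only one sent them within a few seconds.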
71. Work Smart, continued
You know a LOT about your users by default
• The signup - when, where
• Relationships are key
• You can see what’s happening globally
• patterns are important
• The message contents are less helpful, and
really, less important
72. Examine What You Send
• Separate the act of sending a message from
the actual delivery
• Obviously doesn’t work with all content
• Queue up messages at some reasonable
interval instead of sending them instantly
• Examine what’s in the queue before sending
it out
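A rough sketch of what separating "send" from "deliver" could look like: notifications go into a delay queue, and anything still pending can be inspected or dropped before it leaves. The class names, fields, and 15-minute delay are all hypothetical; a production system would use a persistent queue rather than an in-memory heap.

```python
import heapq
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass(order=True)
class QueuedNotification:
    deliver_at: datetime                       # only field used for ordering
    sender_id: str = field(compare=False)
    recipient_id: str = field(compare=False)
    body: str = field(compare=False)

class NotificationQueue:
    def __init__(self, delay=timedelta(minutes=15)):
        self.delay = delay
        self._heap = []

    def enqueue(self, sender_id, recipient_id, body, now):
        """Accept the message now, but schedule delivery for later."""
        item = QueuedNotification(now + self.delay, sender_id, recipient_id, body)
        heapq.heappush(self._heap, item)

    def pending(self):
        """Everything still waiting, for inspection before delivery."""
        return list(self._heap)

    def drop_sender(self, sender_id):
        """Remove all undelivered notifications from one sender."""
        self._heap = [n for n in self._heap if n.sender_id != sender_id]
        heapq.heapify(self._heap)

    def due(self, now):
        """Pop and return the notifications whose delay has elapsed."""
        ready = []
        while self._heap and self._heap[0].deliver_at <= now:
            ready.append(heapq.heappop(self._heap))
        return ready

queue = NotificationQueue()
t0 = datetime(2009, 1, 1, 12, 0)
queue.enqueue("alice", "bob", "commented on your photo", now=t0)
queue.enqueue("spam1", "carol", "cheap pills", now=t0)
queue.drop_sender("spam1")                     # caught during the delay
delivered = queue.due(t0 + timedelta(minutes=15))
```

The delay is what buys the review time: the spam account's notification is gone before anything reaches a mail server.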
73. Clustering is your friend
• Cluster the messages in the queue using as
many characteristics as possible
• Doing this will make most spam look really
obvious
• Fairly straightforward to implement ( don’t
need a massive cluster or Hadoop, at least
initially )
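At its simplest, "clustering" here can just mean grouping queued messages by shared characteristics, which a dictionary handles fine. The sketch below assumes each queued message is a dict carrying a few sender attributes; the characteristic names (`signup_ip`, `signup_day`, `recipient_is_contact`) are made up for illustration.

```python
from collections import defaultdict

# Hypothetical characteristics: each maps a queued message to a hashable key.
CHARACTERISTICS = {
    "signup_ip": lambda m: m["signup_ip"],
    "signup_day": lambda m: m["signup_day"],
    "recipient_is_contact": lambda m: m["recipient_is_contact"],
}

def cluster_queue(messages, characteristics=CHARACTERISTICS):
    """Return {characteristic: {value: [messages sharing that value]}}."""
    clusters = {name: defaultdict(list) for name in characteristics}
    for message in messages:
        for name, extract in characteristics.items():
            clusters[name][extract(message)].append(message)
    return clusters

queued = [
    {"id": 1, "signup_ip": "10.0.0.1", "signup_day": "2009-01-01", "recipient_is_contact": False},
    {"id": 2, "signup_ip": "10.0.0.1", "signup_day": "2009-01-01", "recipient_is_contact": False},
    {"id": 3, "signup_ip": "10.0.0.9", "signup_day": "2007-06-12", "recipient_is_contact": True},
]
clusters = cluster_queue(queued)
```

Messages 1 and 2 land in the same cluster on all three characteristics, which is the kind of overlap that makes spam stand out.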
74. Clustering Scores
• (I’m sure there’s a more scientific term for this)
• The size of the cluster a particular message
belongs to as a percentage of the total number
of messages
• Example: if you have 200 messages and a
message falls into a cluster of 10, that message’s
cluster score for that particular characteristic is
5 (10/200 = .05 = 5%)
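The score described above is straightforward to compute for one characteristic at a time. A minimal sketch, assuming you already have one value per queued message for that characteristic (the function name is illustrative):

```python
from collections import Counter

def cluster_scores(values):
    """values: one characteristic's value for each queued message.
    Returns, per message, its cluster size as a percentage of all messages."""
    total = len(values)
    sizes = Counter(values)
    return [100 * sizes[v] / total for v in values]

# The slide's example: 200 messages, 10 of which share the same value.
values = ["shared"] * 10 + [f"unique-{i}" for i in range(190)]
scores = cluster_scores(values)
print(scores[0])    # 5.0  (10/200)
print(scores[-1])   # 0.5  (1/200)
```

A message would get one such score per characteristic; how to combine them into a single decision is left open here, as it is on the slide.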
I am living proof that you can work at a photo site, own a really nice camera and still take really crappy photos. Simon couldn’t make it because he got married this past weekend and is off on his honeymoon.
Fighting spam can be very depressing
Whenever you “optimize” an email, you’re optimizing it for the spammers as well
Mom example - not great with computers, only uses Flickr when I send something along. Likely to assume that any mail from Flickr is from me.
Other sites’ spam detection - it all looks like “flickr.com” to them!
Good, you’re popular. But that also means more spam.
You know all that work you did to make sure your emails get delivered? The spammers thank you.
JUST the Storm botnet
Story about Spamhaus
An important point: once the spam leaves your site, it damages your site’s reputation with others trying to combat spam - namely email providers
The amount of work involved in dealing with the aftermath of a large-scale spam attack when operating this way is insane. Engineers, support staff, ops - everyone is just doing manual, tedious work: deleting accounts, going through user reports, etc.
Thank Simon for fighting the good fight
(I started out as a tools guy)
Your tools should be very clear and easy to use
They should allow for easy batch operations
How long can you delay sending a message? In most cases, quite a bit of time. Things like comments have to show up immediately, but you can delay the email notification.