Finding True Love on
        the Internet
With Matthew Rothenberg and Stewart Butterfield
Just kidding. This is much less exciting.
Fighting Spam at Flickr




         Your Spam. Our Balls.
 Mikhail Panchenko and Simon Batistoni
Spammers

• Numerous
• Diverse
• Inventive
• Ubiquitous - if there’s a textbox with an
  implied recipient, they will spam it.
A simpler time
• Sending spam is an incredibly complicated
  scheme these days
• Highly distributed botnets of unsuspecting,
  heterogeneous machines
• The result of a long, long arms race
• That means that combating it is
  complicated as well
Skynet is here

• Bots/scripts are able to sign up for accounts
  (including solving the captcha), log into
  Flickr, upload photos, set their buddy icon,
  and start sending spam.
• You can also buy these accounts in bulk...
The Harsh Truth


 Someone whose time is really cheap is
constantly working to send spam through
               your site
Photo from http://icanhascheezburger.com/2008/01/22/funny-pictures-sisyphus-cat-tries-again/
"The struggle itself...is enough to fill a man's
 heart. One must imagine Sisyphus happy."
                           Albert Camus, The Myth of Sisyphus
... but there’s hope; we’ll get to that later
Social Sites as Gateways
• User-generated content
 • “User” is a broad category that includes
    “spammer asshole”
• Email notifications for said content
• Relationship-based, trust-inducing
 • Mom gets excited any time she gets an
    email from Flickr
What Trust Means
• Something familiar that a user is used to
  opening
• Increases the likelihood that a user will
  open the email and perform whatever it is
  that you want them to
  • Piggybacking on the research and work
    done by the site itself!
More on Trust

• Very easy to lose - other services will
  blackhole mail coming from your domains
• Users stop coming
• Very hard to regain - the burden of proof
  ends up entirely on you
The Answer is Simple


Don’t let users generate content!
The Economics of Spam
  ( an excuse to pretend to use my degree )
The Demand: sites want exposure,
sometimes at any cost
The Supply: trusted message gateways


      the broken part - someone
      else is selling your gateway
Econ 101
Econ Continued

  The more well-known your site gets, the
higher the demand for your message delivery
 mechanism, because a recipient is more likely
         to actually open the message
ANOTHER GRAPH!
In a Perfect World
Some Numbers
"Spamalytics: An Empirical Analysis of Spam
Marketing Conversion"
C. Kanich, C. Kreibich, K. Levchenko, B.
Enright, G. Voelker, V. Paxson, and S. Savage.
15th ACM Conference on Computer and
Communications Security (CCS), 27-31
October 2008, Alexandria, VA.
http://www.icsi.berkeley.edu/pubs/networking/2008-ccs-spamalytics.pdf
Some Numbers


• 0.0000081% overall conversion rate
• 28 conversions for every 347,590,389
  emails attempted
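
A quick check of the arithmetic on this slide, as a minimal sketch. The two input figures are the ones quoted above; the variable names are only illustrative.

```python
# Figures quoted on the slide from the Spamalytics paper.
conversions = 28
emails_attempted = 347_590_389

rate = conversions / emails_attempted                    # fraction of emails that convert
rate_percent = rate * 100                                # the same value as a percentage
emails_per_conversion = emails_attempted / conversions   # emails needed for one sale

print(f"{rate_percent:.7f}% overall conversion rate")
print(f"roughly one conversion per {emails_per_conversion:,.0f} emails attempted")
```

In other words, it takes on the order of twelve million attempted emails to land a single conversion.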
Where we fit
• Only ~25% of the attempted emails sent
  were actually accepted by the mail server
  ( first step in the funnel )
• Using a social site as a gateway almost
  guarantees a higher number
  • A whole lot of effort goes into making
    sure notifications get delivered
Put some $$ on it


• $3.5 million of revenue in a year
• 5% increase in delivery rate = $175,000/yr
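
The $175,000 figure follows directly if revenue is assumed to scale linearly with the number of emails delivered. That linearity is an assumption of this sketch, not something the slide spells out.

```python
annual_revenue = 3_500_000   # dollars per year, as quoted on the slide
delivery_gain = 0.05         # a 5% increase in delivery rate

# Assumption: every extra delivered email is worth as much as the existing ones.
extra_revenue = annual_revenue * delivery_gain

print(f"${extra_revenue:,.0f}/yr")
```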
They figured this out
Back to Trust
• This can’t be ignored
• Remember, once you lose that trust, it’s a
  long way back up
• As you lose your trustworthiness as a
  message gateway, the spammers go away
• ... but so do the users
Fighting Back
Traditional Prevention
• Captchas
• Mass Signup detection using IPs
• Rate Limiting
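
Mass signup detection by IP and rate limiting can both be built on the same sliding-window counter. The sketch below is one possible shape for it, not Flickr's implementation; the class name, the limit of 3 and the one-hour window are all made up for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Counts events per key within the last `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # key -> timestamps of recent events

    def allow(self, key, now):
        """Record an event for `key` at time `now`; return False once over the limit."""
        events = self._events[key]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.limit:
            return False
        events.append(now)
        return True


# Hypothetical threshold: at most 3 signups per IP per hour.
signups_by_ip = SlidingWindowCounter(limit=3, window_seconds=3600)
results = [signups_by_ip.allow("203.0.113.7", now=t) for t in (0, 10, 20, 30, 3700)]
print(results)
```

Keying the same counter on a user id instead of an IP gives a per-account rate limit for messages.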
these are mostly good things, and it certainly
         doesn’t hurt to have them




                ... however ...
A Confession
I almost always have to type a captcha code twice

 Bots consistently evolve to solve incrementally
              complex variations

             Draw your conclusions
Photo from http://www.flickr.com/photos/azkid2dc
The Tension
• Want to be able to allow users to send
  messages and generally enjoy themselves
• Don’t want to make it too easy to send
  spam
 • Traditional prevention techniques like
    captchas result in epic degradation of UX
    and ultimately end up ineffective
Traditional Response
• User reports
• Manual account removal
• Manual message cleanup
 • except you can’t clean up the email once it’s
    sent
• Manually adding patterns to a list of things to
  filter
• Engineers running mass deletion/cleanup scripts
Photo from http://www.flickr.com/photos/mekin/
What a Waste

• Responding to incidents this way is a huge
  drain on resources and morale
• That’s time your team could be spending
  on projects, features, being happy...
The Alternative
A holistic, comprehensive approach



 ( aka “take this shit seriously” )
Make Time
• Product teams might be reluctant to put
  spam on the roadmap and dedicate
  resources to it
• ... until you miss a bunch of deadlines
  because you’re too busy cleaning up spam
• ... and your notifications aren’t being
  delivered because you’re blacklisted
Develop a Strategy

• A spam attack is no different than a typical
  DoS or outage - you need a plan
• Figure out what data you need and whether
  or not you already have it
• Figure out ways to consolidate and
  automate the work
Build your Tools

• Make things reusable
 • a user should look the same in all tools
 • tools that show lists of users should
    reuse the same logic for batch ops
• Leave a consistent trail
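
One way to read "reuse the same logic for batch ops" and "leave a consistent trail": the batch operation is just a loop over the single-account operation, and that single operation is the only place that writes to the audit log. The sketch below assumes an in-memory dict of accounts and a hypothetical `AuditLog` class; a real log would also carry timestamps.

```python
from dataclasses import dataclass, field


@dataclass
class AuditLog:
    """Hypothetical append-only trail shared by every admin tool."""
    entries: list = field(default_factory=list)

    def record(self, actor, action, user_id, reason):
        self.entries.append(
            {"actor": actor, "action": action, "user_id": user_id, "reason": reason}
        )


def delete_account(user_id, accounts, actor, reason, log):
    """Single-account operation; every tool goes through this one function."""
    if accounts.pop(user_id, None) is None:
        return False  # already gone, nothing to record
    log.record(actor, "delete_account", user_id, reason)
    return True


def delete_accounts(user_ids, accounts, actor, reason, log):
    """Batch operation built on the single one, so both leave the same trail."""
    return [uid for uid in user_ids if delete_account(uid, accounts, actor, reason, log)]


accounts = {1: "alice", 2: "spambot-a", 3: "spambot-b"}
log = AuditLog()
deleted = delete_accounts([2, 3, 99], accounts, actor="staff:example", reason="spam", log=log)
print(deleted, len(log.entries))
```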
Look at the Big Picture

• Your tools should be very well integrated
 • your user report tools should pop
    suspected accounts into review tools
 • deleting accounts and messages should
    automatically close user report cases
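
A rough sketch of that integration, assuming in-memory data and hypothetical names (`Report`, `queue_for_review`, `delete_account_and_close_reports`): reports feed a review queue, and deleting an account closes every open report against it in the same step.

```python
from dataclasses import dataclass


@dataclass
class Report:
    report_id: int
    reported_user_id: int
    status: str = "open"


def queue_for_review(reports):
    """User reports feed the review tool: one entry per account, most-reported first."""
    counts = {}
    for r in reports:
        if r.status == "open":
            counts[r.reported_user_id] = counts.get(r.reported_user_id, 0) + 1
    return sorted(counts, key=lambda uid: (-counts[uid], uid))


def delete_account_and_close_reports(user_id, accounts, reports):
    """Deleting an account also closes every open report filed against it."""
    accounts.discard(user_id)
    closed = 0
    for r in reports:
        if r.reported_user_id == user_id and r.status == "open":
            r.status = "closed"
            closed += 1
    return closed


accounts = {10, 11, 12}
reports = [Report(1, 11), Report(2, 12), Report(3, 11)]
review_queue = queue_for_review(reports)
closed = delete_account_and_close_reports(11, accounts, reports)
print(review_queue, closed, queue_for_review(reports))
```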
The goal is to be able to have one person look at
a single tool, make decisions, and go back to sleep
Photo from http://www.flickr.com/photos/dreamcicle
... but we can get close!
Work Smart

• Spam is limited to going from one user on
  your site to another user on your site
• That forces certain behavior patterns -
  know what those are for your site
Work Smart, continued
• If you have some obstacles at signup time
  (captcha, mass signup detection), you can pretty
  much expect two things:
 • a slow trickle of signups (to get around
    signup-time mass signup checks)
 • a sudden surge of messages
  • Constant “under the radar” trickle doesn’t make
    sense - if you delete the accounts after a few
    user reports, they don’t get their payload sent
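
The "sudden surge" half of that pattern is easy to check for. The sketch below compares the latest interval's message count against the average of the earlier ones; `factor` and `min_messages` are made-up thresholds, and the per-hour counts are invented data.

```python
def is_surge(counts_per_interval, factor=5.0, min_messages=20):
    """Return True if the latest interval is far above the average of the earlier ones."""
    if not counts_per_interval:
        return False
    *history, latest = counts_per_interval
    if not history or latest < min_messages:
        return False
    baseline = sum(history) / len(history)
    # Treat any baseline below 1 message per interval as 1, so a near-zero
    # history does not make every small uptick look like a surge.
    return latest >= factor * max(baseline, 1.0)


quiet_account = [0, 1, 0, 2, 1]       # messages sent per hour
bursty_account = [0, 0, 1, 0, 150]    # nothing for hours, then the payload

print(is_surge(quiet_account), is_surge(bursty_account))
```

The same function works on site-wide counts or on counts for a group of accounts that signed up around the same time.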
Work Smart, continued
You know a LOT about your users by default

• The signup - when, where
• Relationships are key
• You can see what’s happening globally
 • patterns are important
• The message contents are less helpful, and
  really, less important
Examine What You Send
• Separate the act of sending a message from
  the actual delivery
 • Obviously doesn’t work with all content
• Queue up messages at some reasonable
  interval instead of sending them instantly
• Examine what’s in the queue before sending
  it out
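
A minimal sketch of separating "send" from "deliver", assuming an in-memory queue and a hypothetical 5-minute hold. `DelayedOutbox` and its methods are illustrative names, not an existing library.

```python
import heapq
import itertools


class DelayedOutbox:
    """Holds outgoing messages for `hold_seconds` so they can be inspected first."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self._heap = []                  # entries are (release_time, seq, message)
        self._seq = itertools.count()    # tie-breaker so messages are never compared

    def enqueue(self, message, now):
        """Accept a message from a user; nothing is delivered yet."""
        heapq.heappush(self._heap, (now + self.hold_seconds, next(self._seq), message))

    def pending(self):
        """Everything still queued, for inspection (for example clustering)."""
        return [m for _, _, m in self._heap]

    def drop(self, predicate):
        """Remove queued messages matching `predicate` and return them."""
        dropped = [m for _, _, m in self._heap if predicate(m)]
        self._heap = [e for e in self._heap if not predicate(e[2])]
        heapq.heapify(self._heap)
        return dropped

    def release_due(self, now):
        """Pop and return messages whose hold period has elapsed."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


outbox = DelayedOutbox(hold_seconds=300)
outbox.enqueue({"sender": "alice", "body": "nice photo!"}, now=0)
outbox.enqueue({"sender": "bot42", "body": "cheap pills"}, now=10)
dropped = outbox.drop(lambda m: m["sender"] == "bot42")
delivered = outbox.release_due(now=300)
print(len(dropped), delivered)
```

The hold interval is the trade-off: longer holds give more messages to compare against each other, shorter holds keep notifications feeling instant.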
Clustering is your friend
• Cluster the messages in the queue using as
  many characteristics as possible
• Doing this will make most spam look really
  obvious
• Fairly straightforward to implement ( don’t
  need a massive cluster or Hadoop, at least
  initially )
Clustering Scores
• (I’m sure there’s a more scientific term for this)
• The size of the cluster a particular message
  belongs to as a percentage of the total number
  of messages
• Example: if you have 200 messages and a
  message falls into a cluster of 10, that message’s
  cluster score for that particular characteristic is
  5 (10/200 = .05 = 5%)
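
The cluster score as defined above can be sketched in a few lines. Grouping by exact value is a simplifying assumption here; a real version might bucket signup dates or group IPs by subnet. The function name and the sample data are made up.

```python
from collections import Counter


def cluster_scores(messages, characteristic):
    """Score each message by the share of the queue that shares its value.

    `characteristic` extracts one value (for example the sender's signup date)
    from a message. The score is the cluster size as a percentage of all messages.
    """
    values = [characteristic(m) for m in messages]
    sizes = Counter(values)
    total = len(messages)
    return [100.0 * sizes[v] / total for v in values]


# 200 queued messages: 10 from accounts that signed up on the same day,
# the other 190 each from a distinct signup day (invented data).
messages = [{"signup_date": "same-day"} for _ in range(10)]
messages += [{"signup_date": f"day-{i}"} for i in range(190)]

scores = cluster_scores(messages, lambda m: m["signup_date"])
print(scores[0], scores[10])
```

The first ten messages score 5, matching the 10/200 example; each of the others sits alone in its cluster and scores 0.5.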
Example
             Signup Date Score   Signup IP Score
Message 1            5                  3
Message 2            4                  8
Message 3            6                 12
Message 4            7                  4
Message 5            6                 10
Message 6           20                 19
Message 7           20                 20
Message 8           20                 19
Message 9           20                 19
Message 10          20                 20
JACKPOT
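
Picking the jackpot rows out of the example table can be sketched as follows. The scores are copied from the table; the rule "every score at or above 15" is an assumption chosen because it happens to separate the two groups here, not a recommended threshold.

```python
# (signup date score, signup IP score) for each message in the example table.
scores = {
    "Message 1": (5, 3),
    "Message 2": (4, 8),
    "Message 3": (6, 12),
    "Message 4": (7, 4),
    "Message 5": (6, 10),
    "Message 6": (20, 19),
    "Message 7": (20, 20),
    "Message 8": (20, 19),
    "Message 9": (20, 19),
    "Message 10": (20, 20),
}


def suspicious(scores, threshold=15):
    """Messages whose every cluster score is at or above `threshold`."""
    return [name for name, s in scores.items() if all(v >= threshold for v in s)]


flagged = suspicious(scores)
print(flagged)
```

Requiring several characteristics to agree is what makes the group stand out: Message 3 has a moderately high IP score on its own, but it is not flagged.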



 Photo from http://www.flickr.com/photos/aresauburnphotos/
The Tough Questions

• What do you do with this information?
• Just how much can you automate?
• We’re still looking for that balance
Further Reading

• http://www.icsi.berkeley.edu/pubs/networking/2008-ccs-spamalytics.pdf
• http://www.slideshare.net/hadoopusergroup/mail-antispam

Fighting Spam at Flickr

  • 1. Finding True Love on the Internet With Matthew Rothenberg and Stewart Butterfield
  • 2. Just kidding. This is much less exciting.
  • 3. Fighting Spam at Flickr Your Spam. Our Balls. Mikhail Panchenko and Simon Batistoni
  • 4.
  • 5. Spammers • Numerous • Diverse • Inventive • Ubiquitous - if there’s a textbox with an implied recipient, they will spam it.
  • 6. A simpler time • Sending spam is an incredibly complicated scheme these days • Highly distributed bot nets of unsuspecting, heterogenous machines • The result of a long long arms race • That means that combatting it is complicated as well
  • 7. Skynet is here • Bots/scripts are able to signup for accounts (including filling out captcha), log into Flickr, upload photos, set their buddy icon, and start sending spam. • You can also buy these accounts in bulk...
  • 8.
  • 9. The Harsh Truth Someone whose time is really cheap is constantly working to send spam through your site
  • 11. "The struggle itself...is enough to fill a man's heart. One must imagine Sisyphus happy." Albert Camus, The Myth of Sisyphus
  • 12. ... but there’s hope; we’ll get to that later
  • 13. Social Sites as Gateways
  • 14. Social Sites as Gateways • User-generated content
  • 15. Social Sites as Gateways • User-generated content • “User” is a broad category that includes “spammer asshole”
  • 16. Social Sites as Gateways • User-generated content • “User” is a broad category that includes “spammer asshole” • Email notifications for said content
  • 17. Social Sites as Gateways • User-generated content • “User” is a broad category that includes “spammer asshole” • Email notifications for said content • Relationship based, trust inducing
  • 18. Social Sites as Gateways • User-generated content • “User” is a broad category that includes “spammer asshole” • Email notifications for said content • Relationship based, trust inducing • Mom gets excited any time she gets an email from Flickr
  • 19. What Trust Means • Something familiar that a user is used to opening • Increases the likelihood that a user will open the email and perform whatever it is that you want them to • Piggybacking on the research and work done by the site itself!
  • 20. More on Trust • Very easy to lose - other services will blackhole mail coming from your domains • Users stop coming • Very hard to regain - the burden of proof ends up entirely on you
  • 21. The Answer is Simple
  • 22. The Answer is Simple Don’t let users generate content!
  • 23. The Economics of Spam ( an excuse to pretend to use my degree )
  • 24. The Demand: sites want exposure, sometimes at any cost The Supply: trusted message gateways
  • 25. The Demand: sites want exposure, sometimes at any cost The Supply: trusted message gateways the broken part - someone else is selling your gateway
  • 27. Econ Continued The more well-known your site gets, the higher the demand for your message delivery mechanism - more likely a recipient will actually open the message
  • 29. In a Perfect World
  • 30. Some Numbers "Spamalytics: An Empirical Analysis of Spam Marketing Conversion" C. Kanich, C. Kreibich, K. Levchenko, B. Enright, G. Voelker,V. Paxson, and S. Savage. 15th ACM Conference on Computer and Communications Security (CCS), 27-31 October 2008, Alexandria,VA. http://www.icsi.berkeley.edu/pubs/ networking/2008-ccs-spamalytics.pdf
  • 31. Some Numbers • 0.0000081% overall conversion rate • 28 conversions for every 347,590,389 emails attempted
  • 32. Where we fit • Only ~25% of the attempted emails sent were actually accepted by the mail server ( first step in the funnel ) • Using a social site as a gateway almost guarantees a higher number • A whole lot of effort goes into making sure notifications get delivered
  • 33. Put some $$ on it • $3.5 million dollars of revenue in a year • 5% increase in delivery rate = $175,000/yr
  • 35. Back to Trust • This can’t be ignored • Remember, once you lose that trust, it’s a long way back up • As you lose your trustworthiness as a message gateway, the spammers go away • ... but so do the users
  • 39. Traditional Prevention • Captchas • Mass Signup detection using IPs
  • 40. Traditional Prevention • Captchas • Mass Signup detection using IPs • Rate Limiting
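As an illustration of the rate-limiting bullet, here is a minimal sliding-window limiter. The class name, the limit of 3 actions per 60 seconds, and the key format are all hypothetical; this is a sketch of the general technique, not Flickr's implementation. Keyed by signup IP instead of user, the same structure could serve as a crude mass-signup check.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` actions per `window` seconds for each key."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.events = defaultdict(deque)  # key -> timestamps of allowed actions

    def allow(self, key, now):
        recent = self.events[key]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


limiter = SlidingWindowLimiter(limit=3, window=60)
# Five attempts by the same (hypothetical) user at these second offsets.
results = [limiter.allow("user-1", t) for t in (0, 10, 20, 30, 61)]
print(results)
```

The fourth attempt is refused because three actions already sit inside the window; the fifth goes through once the first one has aged out.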
  • 41. these are mostly good things, and it certainly doesn’t hurt to have them ... however ...
  • 45. A Confession • I almost always have to type a captcha code twice • Bots consistently evolve to solve increasingly complex variations • Draw your conclusions
  • 50. The Tension • Want to be able to allow users to send messages and generally enjoy themselves • Don’t want to make it too easy to send spam • Traditional prevention techniques like captchas result in epic degradation of UX and ultimately end up ineffective
  • 51. Traditional Response • User reports • Manual account removal • Manual message cleanup • except you can’t clean up the email once it’s sent • Manually adding patterns to a list of things to filter • Engineers running mass deletion/cleanup scripts
  • 54. What a Waste • Responding to incidents this way is a huge drain on resources and morale • That’s time your team could be spending on projects, features, being happy...
  • 55. The Alternative A holistic, comprehensive approach
  • 56. The Alternative A holistic, comprehensive approach ( aka “take this shit seriously” )
  • 60. Make Time • Product teams might be reluctant to put spam on the roadmap and dedicate resources to it • ... until you miss a bunch of deadlines because you’re too busy cleaning up spam • ... and your notifications aren’t being delivered because you’re blacklisted
  • 62. Develop a Strategy • A spam attack is no different than a typical DoS or outage - you need a plan • Figure out what data you need and whether or not you already have it • Figure out ways to consolidate and automate the work
  • 63. Build your Tools • Make things reusable • a user should look the same in all tools • tools that show lists of users should reuse the same logic for batch ops • Leave a consistent trail
  • 64. Look at the Big Picture • Your tools should be very well integrated • your user report tools should pop suspected accounts into review tools • deleting accounts and messages should automatically close user report cases
  • 65. The goal is to be able to have one person look at a single tool, make decisions, and go back to sleep
  • 67. ... but we can get close!
  • 69. Work Smart • Spam is limited to going from one user on your site to another user on your site • That forces certain behavior patterns - know what those are for your site
  • 70. Work Smart, continued • If you have some obstacles at signup time (captcha, mass signup detection), you can pretty much expect two things: • a slow trickle of signups (to get around signup-time mass signup checks) • a sudden surge of messages • Constant “under the radar” trickle doesn’t make sense - if you delete the accounts after a few user reports, they don’t get their payload sent
  • 71. Work Smart, continued You know a LOT about your users by default • The signup - when, where • Relationships are key • You can see what’s happening globally • patterns are important • The message contents are less helpful, and really, less important
  • 72. Examine What You Send • Separate the act of sending a message from the actual delivery • Obviously doesn’t work with all content • Queue up messages at some reasonable interval instead of sending them instantly • Examine what’s in the queue before sending it out
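One way to separate sending from delivery is a holding queue that releases messages only after a delay and an inspection pass. The sketch below is an assumption-laden illustration: `DelayedOutbox`, the 300-second hold, and the `looks_like_spam` predicate are all hypothetical names, and timestamps are expected to arrive in non-decreasing order.

```python
from collections import deque


class DelayedOutbox:
    """Hold outgoing messages for `hold_seconds` before delivery."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self.pending = deque()  # (queued_at, message), oldest first

    def enqueue(self, message, now):
        self.pending.append((now, message))

    def release(self, now, looks_like_spam):
        """Return (to_send, held_back) for messages whose hold time has elapsed."""
        to_send, held_back = [], []
        while self.pending and now - self.pending[0][0] >= self.hold_seconds:
            _, message = self.pending.popleft()
            # Examine each message before it leaves the site.
            (held_back if looks_like_spam(message) else to_send).append(message)
        return to_send, held_back


outbox = DelayedOutbox(hold_seconds=300)
outbox.enqueue({"sender": "alice", "body": "nice photo!"}, now=0)
outbox.enqueue({"sender": "bot42", "body": "cheap pills"}, now=10)
outbox.enqueue({"sender": "bob", "body": "hi"}, now=200)

# Placeholder check; a real one would use signals such as cluster scores.
to_send, held = outbox.release(now=320, looks_like_spam=lambda m: m["sender"] == "bot42")
```

Content that must appear on the site immediately would bypass the hold; only the outbound notification waits in the queue.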
  • 73. Clustering is your friend • Cluster the messages in the queue using as many characteristics as possible • Doing this will make most spam look really obvious • Fairly straightforward to implement (don’t need a massive cluster or Hadoop, at least initially)
  • 74. Clustering Scores • (I’m sure there’s a more scientific term for this) • The size of the cluster a particular message belongs to as a percentage of the total number of messages • Example: if you have 200 messages and a message falls into a cluster of 10, that message’s cluster score for that particular characteristic is 5 (10/200 = .05 = 5%)
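The cluster score defined on this slide can be computed with a simple counter. In this sketch a "cluster" is just the set of messages sharing an identical value for one characteristic, which is a simplifying assumption; the function name, the `signup_ip` field, and the sample data are hypothetical.

```python
from collections import Counter


def cluster_scores(messages, characteristic):
    """Return, for each message, its cluster size as a percentage of all messages."""
    values = [characteristic(m) for m in messages]
    sizes = Counter(values)  # value -> number of messages sharing it
    total = len(messages)
    return [100 * sizes[v] / total for v in values]


# 200 messages: 10 share one signup IP, the other 190 each have their own.
messages = [{"signup_ip": "10.0.0.1"}] * 10
messages += [{"signup_ip": f"192.168.0.{i}"} for i in range(190)]

scores = cluster_scores(messages, lambda m: m["signup_ip"])
print(scores[0])   # a message from the cluster of 10
print(scores[10])  # a message with a unique signup IP
```

The same function can be run once per characteristic (signup date bucket, signup IP, and so on) to produce a row of scores for each message.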
  • 75. Example
        Message       Signup Date Score   Signup IP Score
        Message 1             5                  3
        Message 2             4                  8
        Message 3             6                 12
        Message 4             7                  4
        Message 5             6                 10
        Message 6            20                 19
        Message 7            20                 20
        Message 8            20                 19
        Message 9            20                 19
        Message 10           20                 20
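A sketch of how the example scores above might be turned into a list of suspects: flag any message whose score is high on every characteristic at once. The threshold of 15 is a made-up value chosen for this data; the slides do not specify one.

```python
# (signup date score, signup IP score) per message, copied from the example.
scores = {
    "Message 1": (5, 3),
    "Message 2": (4, 8),
    "Message 3": (6, 12),
    "Message 4": (7, 4),
    "Message 5": (6, 10),
    "Message 6": (20, 19),
    "Message 7": (20, 20),
    "Message 8": (20, 19),
    "Message 9": (20, 19),
    "Message 10": (20, 20),
}

THRESHOLD = 15  # hypothetical cutoff, not from the slides

# Flag messages that score high on both characteristics.
flagged = [
    name
    for name, (date_score, ip_score) in scores.items()
    if date_score >= THRESHOLD and ip_score >= THRESHOLD
]
print(flagged)
```

Requiring agreement across characteristics keeps a single coincidence, such as many legitimate users behind one shared IP, from flagging a message on its own.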
  • 77. JACKPOT Photo from http://www.flickr.com/photos/aresauburnphotos/
  • 78. The Tough Questions • What do you do with this information? • Just how much can you automate? • We’re still looking for that balance
  • 79. Further Reading • http://www.icsi.berkeley.edu/pubs/networking/2008-ccs-spamalytics.pdf • http://www.slideshare.net/hadoopusergroup/mail-antispam

Editor's Notes

  1. I am living proof that you can work at a photo site, own a really nice camera, and still take really crappy photos. Simon couldn’t make it because he got married this past weekend and is off on his honeymoon.
  2. Fighting spam can be very depressing
  3. Whenever you “optimize” an email, you’re optimizing it for the spammers as well Mom example - not great with computers, only uses Flickr when I send something along. Likely to assume that any mail from Flickr is from me.
  4. * other sites’ spam detection - it all looks like “flickr.com” to them!
  5. Good, you’re popular. But that also means more spam
  6. You know all that work you did to make sure your emails get delivered? The spammers thank you.
  7. JUST the Storm botnet
  8. story about spamhaus
  9. An important point: once the spam leaves your site, it damages your site’s reputation on other sites trying to combat spam - namely email providers
  10. The amount of work involved in dealing with the aftermath of a large-scale spam attack when operating this way is insane. Engineers, support staff, ops - everyone is just doing manual, tedious work. Deleting accounts, going through user reports, etc.
  11. Thank Simon for fighting the good fight
  14. (I started out as a tools guy.) Your tools should be very clear, easy to use, and allow for easy batch operations.
  15. How long can you delay sending a message? In most cases, quite a bit of time. Things like comments have to show up immediately, but you can delay the email notification.