Presentation by Alana Glassco, anti-abuse engineer at Smyte, at Quora ML Workshop: Protecting Online Spaces with Applied Machine Learning, on September 27, 2017.
9. For example...
Scenario A:
● Business goals
○ Enforce company values
○ Gain good press
● Nature of the problem
○ Short-term
○ High FP cost
● Is ML a good fit?
○ No
Scenario B:
● Business goals
○ Reduce bad press
○ Recover advertising loss
● Nature of the problem
○ Long-term
○ High FN cost
● Is ML a good fit?
○ Yes
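The FP/FN asymmetry in the two examples above can be made concrete with a small cost calculation. The sketch below is illustrative only and is not from the talk: the function name `pick_threshold`, the per-error costs, and the toy scores are all assumptions. It shows how the same scored items lead to a different operating threshold depending on whether false positives or false negatives cost more.

```python
def pick_threshold(scores, labels, fp_cost, fn_cost, thresholds):
    """Return (threshold, total_cost) with the lowest total error cost.

    scores: model scores in [0, 1]; labels: 1 = abusive, 0 = benign.
    An item is flagged when its score >= threshold.
    fp_cost / fn_cost are hypothetical per-error costs chosen by the business.
    """
    best = None
    for t in thresholds:
        # False positive: benign item that gets flagged.
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        # False negative: abusive item that is missed.
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        cost = fp * fp_cost + fn * fn_cost
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Toy data (made up for illustration).
scores = [0.1, 0.4, 0.6, 0.9]
labels = [0, 1, 0, 1]
thresholds = [0.3, 0.5, 0.7]

# High FP cost: prefer a strict threshold that flags less.
high_fp = pick_threshold(scores, labels, fp_cost=10, fn_cost=1, thresholds=thresholds)
# High FN cost: prefer a lenient threshold that flags more.
high_fn = pick_threshold(scores, labels, fp_cost=1, fn_cost=10, thresholds=thresholds)
```

On this toy data the high-FP-cost setting selects the strictest threshold and the high-FN-cost setting selects the most lenient one, which is one way to read why the cost profile matters before choosing an approach.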
10. Get the right training data
● Understand policies in practice
● “Free” data won’t cut it
● Invest in a human review team
11. Example: building a “spam” classifier
● Repetitive content
○ Behavioral signals: bots / fake accounts
○ Optics: looks fine in isolation
○ Severity: harms reputation
● Keyword stuffing
○ Behavioral signals: real users
○ Optics: easy to identify
○ Severity: harms search results
● Artificial traffic
○ Behavioral signals: bots / fake accounts
○ Optics: invisible without account signals
○ Severity: harms ranking
● Scams / phishing
○ Behavioral signals: bots or real users
○ Optics: looks bad to a trained reviewer
○ Severity: harms users
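One way to keep these distinctions visible when planning labels and features is to write the slide's four "spam" categories down as data. The sketch below is an illustration under assumptions, not the speaker's implementation: the `SpamType` class, its field names, and the `types_involving` helper are hypothetical, while the values are copied from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpamType:
    name: str
    behavioral_signals: str  # who typically produces it
    optics: str              # how it looks to a reviewer
    severity: str            # what it harms


# Values taken from the slide; the structure itself is an assumption.
SPAM_TYPES = [
    SpamType("repetitive content", "bots / fake accounts",
             "looks fine in isolation", "harms reputation"),
    SpamType("keyword stuffing", "real users",
             "easy to identify", "harms search results"),
    SpamType("artificial traffic", "bots / fake accounts",
             "invisible without account signals", "harms ranking"),
    SpamType("scams / phishing", "bots or real users",
             "looks bad to a trained reviewer", "harms users"),
]


def types_involving(actor):
    """Names of spam types whose behavioral signals mention the given actor."""
    return [t.name for t in SPAM_TYPES if actor in t.behavioral_signals]


bot_driven = types_involving("bots")
user_driven = types_involving("real users")
```

Grouping the categories this way suggests that a single "spam" label hides several distinct problems, each of which may need its own review guidelines and signals.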
12. Design a solution
● Model selection
● Implementation
● Maintenance & retraining