The Retail Strategy and Planning Series is designed to provide retail executives with the tactical tips, insights, metrics and trend data needed to guide 2017 strategies. Tune in to “Are Bot Operators Eating Your Lunch?” and learn how to protect your brand image, reputation and SEO rankings from bad bots: rtou.ch/2c5cPmx.
5. #RSP16
About Retail TouchPoints
✓ Launched in 2007
✓ Over 30,000 retail subscribers
✓ To provide executives with relevant,
insightful content across a variety of
digital media
Sign up for our weekly newsletter:
www.retailtouchpoints.com/subscribe
8. Agenda
Bots 101
The growing bot problem
How bots are eating your lunch!
Hayneedle case study
Selection criteria for a bot detection solution
Q & A
9. The Basics of Bots
A “Bot” is an automated program that runs on the internet
Good Bots
Search Engine Crawling
Power APIs
Check system connectivity and
status
Bad Bots
Steal content
Scan for vulnerabilities
Perform fraud
etc.
Traffic Distribution by Type, 2016
10. High Profile Web Scraping in the Ecommerce Industry
QVC is an American television home
shopping network and online ecommerce
site.
Aggressive price and inventory scraping by shopping
aggregator app resulted in the following repercussions for
QVC
● Two day website outage
● Loss of $2M in revenue
● Highly publicized lawsuit
● Damage to QVC Brand
11. Traffic by Size, Ecommerce Sites, 2014 vs 2015
Small and medium ecommerce sites saw about a
100% increase in bad bots between 2014 and 2015
12. Majority of Bots are Advanced Persistent Bots (APBs)
APBs have one or more of the following abilities:
Advanced
Mimic human behavior
Load JavaScript
Load external resources
Support cookies
Browser automation (Selenium, PhantomJS)
Persistent
Dynamic IP rotation
Distribute attacks across IP addresses
Hide behind anonymous and peer-to-peer proxies
2016 Distil Bad Bot Report
13. Why the Massive Increase in APBs?
Online data has increased in value
Pricing information, product availability, product
descriptions, and vendor reviews are changing
daily and highly valuable to competitors
Anyone can get in the game
Cheap or free virtual servers, bandwidth, easy-to-
use tools, and scrapers for hire
Bots no longer tied to IP addresses
Bots cycle through random IP addresses
Bots hide behind anonymous proxies
Consumer IPs now infected with bot traffic too
14. Loading Assets & Bots Mimicking Humans
% of bots able to load external assets
(e.g. JavaScript)
% of bots able to mimic human
behavior
These bots skew marketing tools such as
Google Analytics, A/B testing, and
conversion tracking
These bots fly under the radar of most
security tools
15. The Majority of Bad Bots Now Use Multiple IP Addresses
Bots that dynamically rotate IP addresses or distribute attacks are significantly harder
to detect and mitigate
16. Bad Bots Cause the Majority of Website Problems
19% of Traffic Causes the Following Problems
17. How Bots Eat Your Lunch
How bots are eating your lunch!
18. How Bots Eat Your Lunch
LOST PROFITS
Decreased Customer Loyalty
Reduced Findability
Lost Cross/Upsell Opportunities
Decreased Customer Satisfaction
Increased Costs
Increased Fraud
19. Bots and Competitive Data Mining
Duplicating your Product Portfolio
Bots can easily gather product and supplier lists
for replication elsewhere
Undermining your Prices
Bots monitor your prices, ensuring competitors
can undercut with lower price listings
Availability Tracking
Identifying when your supply has been exhausted provides competitors a unique
opportunity to raise the price of their goods.
20. Negative SEO Attacks Damage Relevancy
Bots steal content, product lists, and prices for
duplication elsewhere on the Internet
Duplicated content reduces your company’s
uniqueness and thus quality score
SEO damage may result, especially if
○ Your prices are undercut
○ The content is repurposed on a more popular site
Duplicate Content Results in Diminished SEO
21. Vulnerability Scanning and Target Exploitation
Common hacking tools like network
mappers and vulnerability scanners are
automated programs
Once a victim’s network has been mapped,
automated vulnerability scanning can be
used to find security flaws that can be
exploited
These bots let hackers scale their
operations
22. Bots Make Large Scale Account Takeover Possible
Over 1 billion username/password
combinations exist in the
wild
Bot operators create bots to test
millions of username/password
combinations from breaches at
other websites to find the
credentials that work on your site
Newly compromised accounts are
then used for various forms of
fraud/theft
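One way to spot the credential-testing pattern described above is to count how many distinct usernames a single client fails to log in with. The following is a minimal Python sketch under stated assumptions: the event shape (client_id, username, success), the function name flag_credential_stuffing, and the threshold of 3 are all hypothetical and do not come from the presentation.

```python
from collections import defaultdict


def flag_credential_stuffing(events, max_failed_usernames=3):
    """Return the set of client IDs that failed logins for more than
    max_failed_usernames distinct usernames.

    events: iterable of (client_id, username, success) tuples (assumed shape).
    """
    failed = defaultdict(set)
    for client_id, username, success in events:
        if not success:
            failed[client_id].add(username)
    return {c for c, names in failed.items() if len(names) > max_failed_usernames}


events = [
    ("client-a", "alice", False),
    ("client-a", "alice", True),   # a human mistyping a password once
    ("client-b", "alice", False),
    ("client-b", "bob", False),
    ("client-b", "carol", False),
    ("client-b", "dave", False),   # one client cycling through many accounts
]
suspects = flag_credential_stuffing(events)
```

Because the bots described earlier rotate IP addresses, the client ID in a real deployment would likely need to be something more stable than an IP, which is an assumption this sketch leaves open.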
23. Automated Stolen Credit Card Testing Enables Fraud
“Carding” uses micro-transactions on stolen credit cards
against e-commerce sites to test their validity
Carding results in poor user experiences and lots of expensive
chargebacks
24. Bots Plant Malicious Links in Fake Comments
Comment spam is frequently used to redirect users to malicious websites
Malicious Site
26. About Hayneedle
Leading online retailer for indoor
and outdoor home furnishings
and decor
1,000s of top brands - including
Hayneedle exclusive designs -
and millions of products for
every space, style, and budget
27. Hayneedle Bad Bot Challenges
Bad Bot Challenge: Business Impact
Competitive price scraping: Competitors attempt to undercut pricing
Automated CVV guessing games: Fraudsters use stolen credit cards in carding attacks; time investigating and reporting the problem
Bot traffic competing with real customers: Web performance and the user experience
Skewed analytics: Conversion funnel optimization; A/B testing
Inefficient DIY approach: “Battle-of-the-bots” ate up 20% of team resources; only 30% effective (at best); quality of life issues
28. Hayneedle Bad Bot Challenges
Constant game of bad bot “Whack-a-mole”
Log file analysis and performance monitoring
Agent-string analysis
IP blocking
Traffic redirects
Tarpits
...but the bad bots keep changing their identities,
scripts, and IP addresses
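To illustrate why the do-it-yourself techniques listed above turn into “Whack-a-mole”, here is a rough Python sketch of log analysis combined with agent-string checks and IP blocking. It is not Hayneedle's actual tooling; the token list, the request threshold, and the function name naive_blocklist are assumptions made for illustration only.

```python
from collections import Counter

# Illustrative tokens only; real scraper agent strings vary widely.
SUSPECT_AGENT_TOKENS = ("python-requests", "curl", "scrapy")


def naive_blocklist(log_entries, max_requests_per_ip=100):
    """Build an IP blocklist from a list of (ip, user_agent) log entries.

    An IP is blocked if it exceeds max_requests_per_ip or if its user
    agent contains a suspect token. Both checks are easy for a bot to evade.
    """
    counts = Counter(ip for ip, _ in log_entries)
    blocked = {ip for ip, n in counts.items() if n > max_requests_per_ip}
    for ip, agent in log_entries:
        if any(tok in agent.lower() for tok in SUSPECT_AGENT_TOKENS):
            blocked.add(ip)
    return blocked


browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
# A crude scraper: one IP, honest user agent, high volume.
crude = [("203.0.113.5", "python-requests/2.9")] * 500
# A distributed scraper: 500 requests spread over 50 IPs with a spoofed agent.
distributed = [(f"198.51.100.{i % 50}", browser_ua) for i in range(500)]

blocked = naive_blocklist(crude + distributed)
```

In this example only the crude scraper's IP ends up on the blocklist. The distributed scraper sends the same number of requests but stays under the per-IP threshold and presents a browser-like agent string, so it passes untouched.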
29. Hayneedle Bot Selection Criteria
Bot Detection and Mitigation Solution
Requirements
No impact on human visitors
“Self tuning” for defending against emerging
and unknown threats
Crowd-sourced threat intelligence model
Seamlessly co-exist with existing solutions
(CDN, WAF, etc.)
No “black boxes”
33. How to Manage Transactional Traffic
Best Practices and Lessons
Learned
Monitor (don’t CAPTCHA) traffic on
your checkout and account
subdomains
Review Threats by Organization
Understand the rationale of scrapers
Selectively Block nefarious
organizations
34. Blocking Nefarious Organizations
Can probably block
traffic coming from
this data center,
especially when
70% of the traffic is
from Automated
Browsers and/or
Known Violators
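The rule of thumb on this slide can be written down as a simple threshold check. The Python sketch below assumes per-organization request counts are already available, broken down by traffic category; the category keys, the sample numbers, and the function name should_block are hypothetical, and only the 70% figure comes from the slide.

```python
def should_block(traffic_counts, threshold=0.70):
    """Return True if automated browsers plus known violators make up at
    least `threshold` of an organization's requests.

    traffic_counts: dict mapping a category name to a request count.
    """
    total = sum(traffic_counts.values())
    if total == 0:
        return False  # no traffic observed, nothing to decide
    bad = (traffic_counts.get("automated_browser", 0)
           + traffic_counts.get("known_violator", 0))
    return bad / total >= threshold


# Made-up sample data for two organizations.
data_center = {"automated_browser": 550, "known_violator": 200, "human": 250}
residential_isp = {"automated_browser": 30, "known_violator": 10, "human": 960}

block_data_center = should_block(data_center)
block_residential = should_block(residential_isp)
```

A fixed threshold is a starting point for review rather than an automatic decision, in keeping with the slide's wording that such traffic can "probably" be blocked.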
35. Hayneedle Results with Distil Networks
Eliminated competitive data mining
Intercepting bot traffic with negligible false
positives
Clean analytics for funnel optimization and A/B
testing
Distil is a key piece of our fraud detection and
prevention suite of tools
Upstream HTTP Errors Report highlighted an issue
with our CDN provider
Web infrastructure dedicated to serving humans
Boosted team morale!
36. The Only Easy and Accurate Way
to Protect Web Applications from
Bad Bots, API Abuse, and Fraud
37. Advanced Bot Detection Increases Accuracy
Browser Validation
Detects all known browser automation tools, such as Selenium and
PhantomJS
Protects against browser spoofing by validating that each incoming
request matches the browser it self-reports
Behavioral Modeling and Machine Learning
Machine-learning algorithms pinpoint behavioral anomalies specific
to your site’s unique traffic patterns
Self-optimizing algorithms improve bot detection and mitigation
without manual configuration
38. Sticky Bot Tracking With No Impact On Real Users
Device Fingerprinting
Fingerprints stick to the bot even if it
attempts to reconnect from random IP
addresses or hide behind an
anonymous proxy or peer-to-peer
network
Tracks distributed attacks that would
normally fly under the radar
Figure: Without Distil vs. With Distil
Without Impacting Users Sharing the Same IP
Avoids blocking residential users or organizations
that might share the same NAT as the bot or botnet
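The presentation does not describe how Distil computes its fingerprints, so the following Python sketch only illustrates the general idea: derive an identifier from client attributes other than the IP address, then group requests by that identifier. The attribute names, the sample values, and both function names are assumptions.

```python
import hashlib
from collections import defaultdict


def fingerprint(attrs):
    """Hash a dict of client attributes into a short, stable identifier.
    The IP address is deliberately not part of the input."""
    canonical = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def ips_per_fingerprint(requests):
    """Group (ip, attrs) requests by fingerprint; return {fingerprint: set of IPs}."""
    ips = defaultdict(set)
    for ip, attrs in requests:
        ips[fingerprint(attrs)].add(ip)
    return dict(ips)


# Hypothetical attribute sets for a bot and a human visitor.
bot = {"user_agent": "Mozilla/5.0", "screen": "800x600",
       "timezone": "UTC", "fonts": "3"}
human = {"user_agent": "Mozilla/5.0", "screen": "1920x1080",
         "timezone": "America/Chicago", "fonts": "212"}

requests = [
    ("198.51.100.1", bot),
    ("198.51.100.2", bot),
    ("203.0.113.9", bot),
    ("198.51.100.1", human),   # a real user behind the same NAT as the bot
]
groups = ips_per_fingerprint(requests)
```

Here the bot keeps one fingerprint across three IP addresses, while the human sharing 198.51.100.1 gets a different fingerprint and would not be caught by a block aimed at the bot.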
39. Threat Intelligence From All Distil-Protected Sites
Known Violators Database
Real-time updates from the world’s largest Known
Violators Database, which is based on the
collective intelligence of all Distil-protected sites
Distil customers are automatically protected
against new threats discovered anywhere on the
network
40. Automated Attackers Leverage APIs as an Attack Vector
Web Applications
API Endpoints
When blocked from a website, Bots frequently use APIs as an
attack vector
APIs tend to have access to the same content, but without as many
security controls
41. Flexible Deployment Options
Physical or Virtual Appliances
○ Install on virtualized or bare metal appliance(s)
○ High availability configurations with failover
monitoring
○ Heartbeat up to Distil Cloud
○ Deploys in days
Content Delivery Network
○ Automatically compresses and optimizes content for faster
delivery
○ 17 global datacenters automatically fail over when a
primary location goes offline
○ Automatically increases infrastructure and bandwidth to
accommodate spikes
○ Deploys in hours