3. Awards and Analyst Recognition
“Distil’s ability to analyze behavior provides
the best chance of detecting and blocking
bot-driven attacks.”
“5 Stars across the board.
Verdict: For monitoring the impact of bots on
a network this is the tool one needs.”
The only anti-bot solution to be included
in Gartner’s Online Fraud Detection
Market Guide
Ovum puts Distil Networks On The Radar.
“Clear innovation compared to similar
services.”
4. Fortune 500 & Alexa Global 10,000 Customers
Ecommerce
Travel
Publishers
Directories
Traditional Media
Marketplace
Services
8. What Is Web Scraping?
Web Scraping
Also known as screen scraping, web scraping is the act of
copying large amounts of data from a website – either
manually or with an automated program.
Legitimate Scraping
Scraping can sometimes be benevolent and totally
acceptable. For example, search engine bots index
your website.
Malicious Scraping
A systematic theft of intellectual property accessible on a
website, including pricing, content, images, and proprietary
data
9. Who is behind Web Scraping?
Competitors
Content Theft
Competitive Intel
Price Scraping
Aggregators
Start-ups
Unauthorized Middlemen
Hackers
Content for Fake Pages
Search Engines
Google
Bing
Yahoo
Baidu
11. In 2015 the most targeted verticals were digital publishing and real
estate. Real Estate sites saw a 300% increase in bad bot traffic!
Traffic by Type of Site, 2014 vs 2015
12. Bad Web Scraping
Web scraping is the act of taking content from a website with the intent of using
it for purposes outside the direct control of the site owner.
It can be used to
○ Steal intellectual property
○ Gain competitive advantage
○ Create aggregation or meta-sites
○ Perform market research
○ Damage SEO rankings
13. Alexa – monitor traffic levels
SE Ranking – track search rankings
InfiniGraph – watch social media trends
Open Site Explorer – monitor backlinks
SpyFu – view advertising keywords
14. Moat – find where ads are running
iSpionage – organic search keywords
Compete PRO – get demographic info
Quantcast – view audience insights
SpyOnWeb – see behind the curtain
16. Freelancer.com Rates
Scraping three real estate sites
Data Manipulation (de-duping, etc.)
Importing into new software
Average Cost - $130 USD
The Going Rate for Scraping: Less than $130/day
17. Posting Stolen Data is Quick and Easy due to Turnkey Platforms
Real Estate Portal Platforms start at $299
19. Bottom Line
Scrapers scrape because they are
making money with your listings!
And the Real Estate industry is left
with...
Higher Costs
Lost Revenues
Why Bots / Scraping is a Problem in Real Estate
22. Delivering a Clear Picture of Your Web Traffic
Low Resolution Fingerprint
“Unactionable”
Hi-Def Fingerprint
“Actionable”
23. Hi-Def Fingerprinting Eliminates Blind Defense
IP Address
Header & User Agent Information
Cookie Browser
200+ Attributes of data
Navigator, WebGL, Plugins, Audio, Video, etc.
Tamper proofing layer
Hi-Def Fingerprint
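As a rough illustration of the idea on this slide, the sketch below hashes a set of client attributes into a single identifier. It is not Distil's implementation: the `fingerprint` function, the attribute names, and the sample values are all hypothetical, and a real product would collect far more signals and add tamper proofing.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Return a stable hex digest for a set of client attributes (illustrative only)."""
    # Sort keys so the same attributes always serialize to the same string.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical "low resolution" view: IP address and user agent only.
low_res = {"ip": "203.0.113.7", "user_agent": "Mozilla/5.0"}

# Hypothetical "hi-def" view: browser and device attributes, no IP address.
hi_def = {
    "user_agent": "Mozilla/5.0",
    "navigator_platform": "Linux x86_64",
    "webgl_renderer": "Mesa Intel",
    "plugins": ["pdf-viewer"],
    "audio_sample_rate": 44100,
}

low_res_id = fingerprint(low_res)
hi_def_id = fingerprint(hi_def)
```

Because the low-resolution identifier includes the IP address, it changes whenever the client changes address. The hi-def identifier in this sketch does not depend on the IP, so it stays the same across reconnects.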
24. The Majority of Bad Bots Now Use Multiple IP Addresses
Bots that dynamically rotate IP addresses or distribute attacks are
significantly harder to detect and mitigate
25. Sticky Bot Tracking With No Impact On Real Users
Device Fingerprinting
Fingerprints stick to the bot even if it
attempts to reconnect from random IP
addresses or hide behind an anonymous
proxy or peer-to-peer network
Tracks distributed attacks that would
normally fly under the radar
Without Distil / With Distil
Without Impacting Users Sharing the Same IP
Avoids blocking residential users or organizations
that might share the same NAT as the bot or botnet
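The difference between tracking by IP address and tracking by fingerprint can be sketched with a simple request counter. This is a minimal example under stated assumptions, not the product's logic: the `flag_by_key` helper, the request records, and the limit of 2 are invented for illustration.

```python
from collections import Counter


def flag_by_key(requests, key, limit):
    """Return the set of key values that appear in more than `limit` requests."""
    counts = Counter(request[key] for request in requests)
    return {value for value, count in counts.items() if count > limit}


requests = (
    # One bot (fingerprint "bot-1") rotating across six IP addresses.
    [{"ip": f"198.51.100.{i}", "fingerprint": "bot-1"} for i in range(1, 7)]
    # Three real users with distinct fingerprints behind one shared NAT address.
    + [{"ip": "203.0.113.9", "fingerprint": f"user-{i}"} for i in range(1, 4)]
)

flagged_ips = flag_by_key(requests, "ip", limit=2)
flagged_fingerprints = flag_by_key(requests, "fingerprint", limit=2)
```

In this toy data set, counting by IP flags only the shared NAT address and misses the rotating bot, while counting by fingerprint flags only the bot.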
28. In 2015 the most targeted verticals were digital publishing and real
estate. Real Estate sites saw a 300% increase in bad bot traffic!
Traffic by Type of Site, 2014 vs 2015
29. Web scraping hurts your KPIs...
Slowdowns, downtime, and poor user experiences
Increase in costs (infrastructure and people)
Distortion of web analytics
Digital ad fraud, reputation and trust (bad leads)
How Web Scrapers Impact KPIs
30. Majority of Bots are Advanced Persistent Bots (APBs)
APBs have one or more of the following abilities:
Advanced
Mimic human behavior
Load JavaScript
Load external resources
Support cookies
Browser automation (Selenium, PhantomJS)
Persistent
Dynamic IP rotation
Distribute attacks across IP addresses
Hide behind anonymous and peer-to-peer proxies
2016 Distil Bad Bot Report
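The two ability groups above can be written down as a small checklist. The sketch below only encodes the slide's categories. The `BotTraits` fields and the `apb_labels` function are hypothetical names, and how each trait would be observed in practice is outside its scope.

```python
from dataclasses import dataclass


@dataclass
class BotTraits:
    """Observed abilities of a bot; field names are illustrative."""
    mimics_human_behavior: bool = False
    loads_javascript: bool = False
    loads_external_resources: bool = False
    supports_cookies: bool = False
    uses_browser_automation: bool = False
    rotates_ips: bool = False
    distributes_attacks: bool = False
    hides_behind_proxies: bool = False


ADVANCED = (
    "mimics_human_behavior",
    "loads_javascript",
    "loads_external_resources",
    "supports_cookies",
    "uses_browser_automation",
)
PERSISTENT = ("rotates_ips", "distributes_attacks", "hides_behind_proxies")


def apb_labels(traits: BotTraits) -> set:
    """Return which groups ("advanced", "persistent") the bot has abilities in."""
    labels = set()
    if any(getattr(traits, name) for name in ADVANCED):
        labels.add("advanced")
    if any(getattr(traits, name) for name in PERSISTENT):
        labels.add("persistent")
    return labels


# Following the slide, a bot with one or more of the abilities counts as an APB.
selenium_bot = BotTraits(loads_javascript=True, uses_browser_automation=True, rotates_ips=True)
is_apb = bool(apb_labels(selenium_bot))
```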
31. Loading Assets & Bots Mimicking Humans
% of bots able to load external
assets (e.g. JavaScript)
% of bots able to mimic human
behavior
These bots will skew marketing tools such as
Google Analytics, A/B testing, and conversion
tracking
These bots will fly under the radar of most
security tools
33. Impressions and Clicks Remain the Biggest Targets
Pricing models: Impressions (CPM/CPV), Clicks (CPC), Leads (CPL), Sales (CPA)
86% digital spend
Search: $18.8B
Display: $7.9B
Video: $3.5B
Mobile: $6.2B
Lead Gen: $2.0B
Other (classifieds, sponsorship, rich media): $5.0B
Legend: estimated fraud, not at risk
$42.5B $7B
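The 86% figure appears consistent with the dollar amounts on this slide. The check below is a sketch that rests on two assumptions the slide does not state outright: that Search, Display, Video, and Mobile are the impression- and click-priced categories, and that $42.5B is the total digital spend.

```python
# Spend figures from the slide, in billions of US dollars.
impression_and_click_spend = {
    "Search": 18.8,
    "Display": 7.9,
    "Video": 3.5,
    "Mobile": 6.2,
}

# Assumption: the $42.5B on the slide is total digital ad spend.
total_spend = 42.5

subtotal = sum(impression_and_click_spend.values())
share_percent = round(subtotal / total_spend * 100)
```

Under those assumptions, the four categories add up to roughly $36.4B, and `share_percent` rounds to the 86% shown on the slide.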