Jason Woodford was invited to join a panel discussion and presentation today on data scraping by our esteemed legal friends at DMH Stallard, “the business people who happen to be lawyers”.
I joined speakers from Sentor, who introduced the audience to their data scraping monitoring service, Assassin, and their customer www.yell.com, who explained how they have overcome, and now manage, the data scraping issues they face as a business.
I was uncomfortably placed on the side of the “scrapers”, whereas Sentor and Yell were defending the “scrapees”, with Frank Jennings of DMH Stallard adjudicating. It’s clear there are strong arguments for and against scraping...
12. What does SEO rely on? 8 SEO Basics
- Findability: keywords need to be in meta titles, headings, content, URLs AND in hyperlinks linking BACK to a particular URL
- Indexability: e.g. duplicate content, suffixes and sitemaps?
- The other 6 are Accessibility, Usability, Sharability, Linkability, Convertability and Trackability
- Google likes original, high-quality, keyword-rich content from high-authority sites...
13. Scraping - legal or illegal?
Data scraping from public data repositories is very common and in most cases legal. However, if your purpose is to steal site Y's content so you can put it on your site and benefit from it, then that is classed as copyright infringement. Scraping on this basis is illegal and:
- Violates the Digital Millennium Copyright Act
- Can often hurt the search engine rankings of websites (bad for search engine optimisation, SEO)
14. What Action Can I Take?
- Option 1: Report them to Google and their ISP and/or take legal action. Could cost you time and money.
- Option 2: Deal with it. There are some technical mitigations.
- Option 3: Think ahead. Set up your website to take advantage of these scrapers and gain some SEO benefit.
29. The Risks and Rewards of Data Scraping for SEO. SiteVisibility: Search & Digital Marketing Experts. We think beyond the click™
31. Data mining your competitor's website to compare prices, products offered, business partners acquired and other critical data.
32. Reputation Management! What if you were alerted to every good or bad comment made about your company or product on a blog, forum or website, and could respond with corrections or enhancements before misinformation spread around the Internet?
Editor's notes
Environment in Business, a Lexis Nexis specialist B2B publication. The problem = need to drive relevant traffic to drive subscriptions.
- Use absolute URLs in your links. Include the full path (http://www.yoursite.com/page.php) instead of relative URLs (/page.php).
- Use internal linking strategically. In the first couple of paragraphs of your content, try to find a place where it makes sense, and link to another page of your site using relevant anchor text similar to the keywords you want to rank for.
- Make sure each content headline is a link. Turn the headline of the web page into a link to that page.
- Add a copyright notice and a link to your site in the RSS feed. Most website scrapers just use your RSS feed to scrape content. They won’t realise that when they post the contents of your feed to their own site, they will be giving you a link with keywords in the anchor text, along with information saying the copyright for the content belongs to you and your site.
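
As a rough illustration of the absolute-URL and RSS-footer tips above, here is a minimal Python sketch. It assumes you can post-process each item's HTML before it goes into your feed. The function names (`absolutize_links`, `add_feed_footer`), the footer wording and the example titles are hypothetical, not part of the original talk. The regular expression is a simplification that only handles quoted `href` and `src` attributes, so a real site would probably want a proper HTML parser.

```python
import re
from html import escape
from urllib.parse import urljoin

# Matches href="..." or src="..." (single or double quotes).
# Simplification: unquoted attribute values are not handled.
_LINK_ATTR = re.compile(
    r'''(?P<attr>\b(?:href|src))=(?P<quote>["'])(?P<url>.*?)(?P=quote)''',
    re.IGNORECASE,
)


def absolutize_links(html: str, base_url: str) -> str:
    """Rewrite relative href/src values as absolute URLs based on base_url."""

    def _fix(match: re.Match) -> str:
        # urljoin leaves URLs that are already absolute unchanged.
        absolute = urljoin(base_url, match.group("url"))
        quote = match.group("quote")
        return f'{match.group("attr")}={quote}{absolute}{quote}'

    return _LINK_ATTR.sub(_fix, html)


def add_feed_footer(html: str, page_url: str, page_title: str, site_name: str) -> str:
    """Append a copyright notice that links back to the original page."""
    footer = (
        f"<p>&copy; {escape(site_name)}. Originally published as "
        f'<a href="{escape(page_url)}">{escape(page_title)}</a>.</p>'
    )
    return html + footer


if __name__ == "__main__":
    body = '<p>See our <a href="/page.php">widget guide</a>.</p>'
    body = absolutize_links(body, "http://www.yoursite.com/blog/post.php")
    print(add_feed_footer(body, "http://www.yoursite.com/blog/post.php",
                          "Blue Widgets Explained", "Your Site"))
```

If a scraper republishes the feed item unchanged, both the in-content link and the footer link still point at your domain, which is the effect the notes above are after.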