
Best Kept Secrets from Robots.txt - BrightonSEO

Francois Goube at BrightonSEO, September 2019, on the best-kept secrets of robots.txt.
Explore the ways search engines exploit your robots.txt. Francois will help you understand how Google uses the robots.txt file to understand your website. From spam detection to optimizing crawl budget and prioritizing crawl queues, we will dive into the most strategic use cases and show how to leverage robots.txt.



  1. The best kept secrets from Robots.txt. By @FrancoisGoube, CEO @Oncrawl
  2. Founder & CEO @Oncrawl. 17 lucky years in the SEO industry. A passion for search engines and patents. @FrancoisGoube
  3. 1-month free trial: send an email to Hello@Oncrawl.com
  4. Get back to the basics of Robots.txt
  5. This guy invented robots.txt 25 years ago. Since then, the vast majority of search engines have tried to comply with robots.txt rules. Thank you, Martijn Koster!
  6. A robots.txt file tells search engine crawlers which pages or files the crawler can or can't request from your site (see the first parsing sketch after the deck).
  7. « It is not a mechanism for keeping a web page out of Google. » https://support.google.com/webmasters/answer/6062608?hl=en
  8. WHY?
  9. If a page is linked, it can be indexed by Google.
  10. If a page is linked, it can be indexed by Google. Google can build a description for your page even if it is blocked by robots.txt. Very useful for: • pages answering 4XX or 5XX errors • pages blocked by robots.txt
  11. Even if Google does not crawl a page, it can understand its link graph and describe it. A page blocked by robots.txt can be indexed and ranked.
  12. « We do sometimes show robotted pages in the search results just because we've seen that they work really well. » - John Mueller
  13. Given a particular user intent, a page blocked by robots.txt that has great usage metrics (CTR) and links can rank.
  14. Think about the quality of the search experience. Who cares about your robots.txt rules in the age of voice search?
  15. Recent changes. One main goal: make robots.txt an official standard.
  16. Issues right now w/ robots.txt: § syntax is not standard § limitations are unknown § the use of regex is at your own risk. That's good news!
  17. SEO Implications
  18. With a standard, you will really know what you are doing.
  19. And Google is helping us: they open-sourced their robots.txt parser. https://github.com/google/robotstxt
  20. We are going to be able to test the limitations of robots.txt parsing, and test our rules (see the parser sketch after the deck).
  21. You need to update your SEO tricks: using robots.txt to reallocate crawl budget is not an SEO strategy.
  22. Of course you can… you can correct the problem fast… but it's short-term thinking.
  23. Of course you can correct the problem fast, but it's short-term thinking. You need a real SEO strategy for your architecture, because that's what gets you tremendous long-term results.
  24. Remember? « It is not a mechanism for keeping a web page out of Google. » https://support.google.com/webmasters/answer/6062608?hl=en
  25. Working on content and internal links, results can be achieved overnight.
  26. Robots.txt can be your BFF, but it can also be a weapon of mass destruction!
  27. Things can get worse, because sometimes people do stupid things.
  28. Things can get worse because of bots. Let's think outside the box: robots.txt can be used to detect PBNs and black-hat behaviour.
  29. Things can get worse because of bots. Let's think outside the box: there are patents to prove it!
  30. Things can get worse because of bots. Let's think outside the box: it's easy to detect robots.txt files that belong to the same author (see the similarity sketch after the deck).
  31. It's really easy to spot your writing style.
  32. One piece of good news, though… tons of patents show robots.txt is the starting point of a search engine's crawl.
  33. One piece of good news, though… tons of patents show robots.txt is the starting point of a search engine's crawl. Chances are that with data science we will be able to predict which pages will be crawled next. Try running ML algorithms like process mining ;-) (see the crawl-trace sketch after the deck).
  34. SEOs should embrace the data science era.
  35. Thank you! @OnCrawl Francois@oncrawl.com
  36. 1-month free trial: send an email to hello@Oncrawl.com
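
The sketches referenced above follow, in slide order. First, slide 6: a minimal illustration of how a crawler decides which URLs it may request, using Python's standard-library robots.txt parser. The rules and URLs are hypothetical examples, not taken from the talk.

    # Minimal sketch: checking URLs against robots.txt rules with Python's
    # stdlib parser. Rules and URLs below are hypothetical.
    from urllib import robotparser

    rules = [
        "User-agent: *",
        "Disallow: /search/",
        "Allow: /public/",
    ]

    rp = robotparser.RobotFileParser()
    rp.parse(rules)

    print(rp.can_fetch("Googlebot", "https://example.com/search/results"))  # False: blocked
    print(rp.can_fetch("Googlebot", "https://example.com/public/page"))     # True: explicitly allowed
    print(rp.can_fetch("Googlebot", "https://example.com/products/1"))      # True: no matching rule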
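
Next, slides 19-20: Google's open-sourced parser is C++, so as a stand-in here is a sketch with Python's stdlib parser showing why testing your rules against a real parser matters. The precedence difference described in the comments reflects Google's published robots.txt documentation; the rules themselves are hypothetical.

    # Sketch: the same rules can be read differently by different parsers.
    from urllib import robotparser

    rules = [
        "User-agent: *",
        "Disallow: /shop",
        "Allow: /shop/sale",
    ]

    rp = robotparser.RobotFileParser()
    rp.parse(rules)

    # Python's stdlib parser applies the first matching rule in file order,
    # so "Disallow: /shop" wins and this prints False. Google's parser uses
    # longest-path precedence, so it would allow the same URL because
    # "Allow: /shop/sale" is the longer, more specific match.
    print(rp.can_fetch("Googlebot", "https://example.com/shop/sale/item"))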
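
Then, slides 28-31: an illustrative sketch of same-author detection, not the method from the patents mentioned in the talk. It compares the normalized rule sets of two robots.txt files; near-identical files on supposedly unrelated domains are one signal of a common author, e.g. across a PBN. The file contents are hypothetical.

    # Illustrative sketch: fingerprinting robots.txt files and scoring
    # their overlap with Jaccard similarity. Contents are hypothetical.
    def fingerprint(robots_txt: str) -> set:
        # Normalize: drop blank lines and comments, lowercase the rest.
        lines = (line.strip() for line in robots_txt.splitlines())
        return {line.lower() for line in lines if line and not line.startswith("#")}

    def similarity(a: str, b: str) -> float:
        fa, fb = fingerprint(a), fingerprint(b)
        return len(fa & fb) / len(fa | fb) if fa | fb else 0.0

    site_a = "User-agent: *\nDisallow: /wp-admin/\nDisallow: /tmp/\nSitemap: https://a.example/sitemap.xml"
    site_b = "User-agent: *\nDisallow: /wp-admin/\nDisallow: /tmp/\nSitemap: https://b.example/sitemap.xml"

    print(f"{similarity(site_a, site_b):.2f}")  # 0.60: same rules, only the sitemap URL differs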
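
Finally, slide 33: a toy version of the process-mining idea. Treat the sequence of URLs a search engine bot fetched (taken from your access logs) as a process trace and count transitions; that gives a simple first-order model of which page tends to be crawled after a given one. The trace is hypothetical, and a real model would be far richer.

    # Toy sketch: first-order transition counts over a hypothetical
    # Googlebot crawl trace, in the spirit of process mining. Note the
    # trace starts each crawl session at /robots.txt, as the patents say.
    from collections import Counter, defaultdict

    crawl_trace = ["/robots.txt", "/", "/category/a", "/product/1",
                   "/robots.txt", "/", "/category/a", "/product/2"]

    transitions = defaultdict(Counter)
    for src, dst in zip(crawl_trace, crawl_trace[1:]):
        transitions[src][dst] += 1

    # Most likely next fetch after the homepage in this trace:
    print(transitions["/"].most_common(1))  # [('/category/a', 2)]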
