Bot Herding
               presented by Stephan Spencer,
             Founder & President, Netconcepts


© 2008 Stephan M Spencer Netconcepts www.netconcepts.com sspencer@netconcepts.com
Duplicate Content Mitigation
 Duplicate content is rampant on blogs. Herd bots to the permalink
  URL, and lead in everywhere else (Archives by Date
  pages, Category pages, Tag pages, Home page, etc.)
  with a paraphrased “Optional Excerpt”
   – Not just the first couple of paragraphs, i.e. the <!--more--> tag!
   – Requires you to revise your Main Index Template theme file:
     <?php if (empty($post->post_excerpt) || is_single() || is_page()) { the_content(); }
     else { the_excerpt(); echo "<a href='"; the_permalink(); echo "' rel='nofollow'>Continue reading &raquo;</a>"; } ?>



Duplicate Content Mitigation
 Include sig line (& headshot photo!) at bottom of
  post/article. Link to original article/post permalink URL!
   – http://www.naturalsearchblog.com/archives/2008/06/03/syndicating-your-articles/
   – http://www.businessblogconsulting.com/2008/05/brand-yourself-with-photo-sig-line




Duplicate Content Mitigation
 On ecommerce sites, dup content also rampant:
   – Manufacturer-provided product descriptions, inconsistent order
     of query string parameters, “guided navigation”, pagination
     within categories, tracking parameters
 Selectively append tracking codes for humans w/ “white
  hat cloaking” or use JavaScript to append the codes
   – REI.com used to append a “vcat” parameter on all brand links
     on their Shop By Brand page (see
     http://web.archive.org/web/20060823085548/www.rei.com/rei/sales_and_events/brands.html)
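One way to tame the inconsistent parameter order and tracking parameters listed above is to normalize every internal link to one canonical spelling before it is rendered. A minimal Python sketch, assuming a hypothetical set of tracking parameter names (`vcat` comes from the REI example; the `utm_*` names are only illustrative) and a hypothetical helper called `canonical_url`:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameters that exist only for tracking.
TRACKING_PARAMS = {"vcat", "utm_source", "utm_medium", "utm_campaign"}

def canonical_url(url):
    """Drop tracking parameters and sort the rest, so that equivalent
    URLs collapse to a single spelling."""
    parts = urlsplit(url)
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key not in TRACKING_PARAMS]
    params.sort()
    # Rebuild the URL without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(params), ""))
```

Links shown to bots would use `canonical_url(...)`; the tracking code can then be appended for humans only, as the slide suggests.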
Pagination
 Pagination not only creates many pages that share the same
  keyword theme; in very large categories with thousands of
  products, it also leaves hundreds of pages of product listings
  uncrawled, which lowers product page indexation.
 Herd bots through keyword-rich subcat links or “View
  All” link or both? How to display page number links?
  Optimal # of products to display/link per page? Test!

PageRank Leakage?
 If you’re using Robots.txt Disallow, you’re probably
  leaking PageRank
 Robots.txt Disallow & Meta Robots Noindex pages both
  accumulate PageRank, but only the noindexed page can pass it on:
  a disallowed page is never crawled, so its links go unseen
   – Meta Noindex tag on a Master sitemap page will de-index the
     page but still pass PageRank to linked sub-sitemap pages
 Meta Robots Nofollow blocks the flow of PageRank
   – http://www.stonetemple.com/articles/interview-matt-cutts.shtml


Rewriting Spider-Unfriendly URLs
 3 approaches:
   1) Use a “URL rewriting” server module / plugin – such as
      mod_rewrite for Apache, or ISAPI_Rewrite for IIS Server
   2) Recode your scripts to extract variables out of the “path_info”
      part of the URL instead of the “query_string”
   3) Or, if IT department involvement must be minimized, use a
      proxy server based solution (e.g. Netconcepts' GravityStream)
   – With (1) and (2), replace all occurrences of your old URLs in
      links on your site with your new search-friendly URLs. 301
      redirect the old to new URLs too, so no link juice is lost.
mod_rewrite – the Foundation for URL
Rewriting, Remapping & Redirecting
 Works with Apache and IBM HTTP Server
 Place “rules” within .htaccess or your Apache config file
  (e.g. httpd.conf, sites_conf/…)
   –   RewriteEngine on
   –   RewriteBase /
   –   RewriteRule ^products/([0-9]+)/?$ /get_product.php?id=$1 [L]
   –   RewriteRule ^([^/]+)/([^/]+)\.htm$ /webapp/wcs/stores/servlet/ProductDisplay?storeId=10001&catalogId=10001&langId=-1&categoryID=$1&productID=$2 [QSA,P,L]
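To check a rule pattern before deploying it, the same regular expression can be exercised outside Apache. The sketch below mimics the first rule with Python's `re` module. `rewrite` is a hypothetical helper, not part of mod_rewrite, and it ignores flags and everything else the real module does.

```python
import re

# Same pattern as the first RewriteRule. In .htaccess context the
# leading slash has already been stripped from the requested path.
PRODUCT_RULE = re.compile(r"^products/([0-9]+)/?$")

def rewrite(path):
    """Return the internal target for a matching path, else the path unchanged."""
    match = PRODUCT_RULE.search(path)
    if match:
        # $1 in the rule corresponds to the first captured group.
        return "/get_product.php?id=" + match.group(1)
    return path
```

Feeding it `products/42`, `products/42/` and a few paths that should not match is a quick way to catch a missing anchor or an over-broad character class.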
Regular Expressions
 The magic of regular expressions / pattern matching
   –   * means 0 or more of the immediately preceding character
   –   + means 1 or more of the immediately preceding character
   –   ? means 0 or 1 occurrence of the immediately preceding char
   –   ^ means the beginning of the string, $ means the end of it
   –   . means any character (i.e. wildcard)
   –   \ “escapes” the character that follows, e.g. \. means dot
   –   [ ] is for character ranges, e.g. [A-Za-z].
   –   ^ inside [] brackets means “not”, e.g. [^/]
Regular Expressions
   – () puts whatever is wrapped within it into memory
   – Access what’s in memory with $1 (what’s in first set of parens),
     $2 (what’s in second set of parens), and so on
 Gotchas to beware of:
   – “Greedy” expressions. Use a negated character class such as [^/]+ instead of .*
   – .* can match on nothing. Use .+ instead
   – Unintentional substring matches because ^ or $ wasn’t
     specified
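The three gotchas above are easy to reproduce in any regex engine with the same basic syntax. A short Python sketch; the sample paths are made up for illustration:

```python
import re

path = "shoes/red/index.html"

# Greedy: .* grabs as much as it can, so the first group runs to the LAST slash.
greedy = re.match(r"^(.*)/(.*)$", path)

# Negated class: [^/]+ cannot cross a slash, so the first group stops at the FIRST one.
bounded = re.match(r"^([^/]+)/(.*)$", path)

# .* happily matches nothing, while .+ insists on at least one character.
empty_ok = re.match(r"^products/(.*)$", "products/")
empty_rejected = re.match(r"^products/(.+)$", "products/")

# Without ^ and $, a pattern meant for "cat" also matches inside "category".
unanchored = re.search(r"cat", "category/207.htm")
anchored = re.search(r"^cat$", "category/207.htm")
```

Here `greedy.groups()` splits the path at the last slash and `bounded.groups()` at the first, `empty_ok` matches with an empty group while `empty_rejected` is `None`, and only the unanchored search finds a match.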


mod_rewrite Specifics
 Proxy page using [P] flag
   – RewriteRule /blah\.html$ http://www.google.com/ [P]
 [QSA] flag is for when you don’t want query string
  params dropped (like when you want a tracking param
  preserved)
 [L] flag stops rule processing once a rule matches, which saves on server processing
 Got a huge pile of rewrites? Use RewriteMap and have
  a lookup table as a text file

IIS? ISAPI_Rewrite!
 What if your site is running Microsoft IIS Server?
 ISAPI_Rewrite plugin! Not that different from mod_rewrite
 In httpd.ini :
   – [ISAPI_Rewrite]
     RewriteRule ^/category/([0-9]+)\.htm$ /index.asp?PageAction=VIEWCATS&Category=$1 [L]
   – Lets a URL like
     http://www.example.com/index.asp?PageAction=VIEWCATS&Category=207
     be served at something like
     http://www.example.com/category/207.htm
Implementing 301 Redirects Using
Redirect Directives
 In .htaccess (or httpd.conf), you can redirect individual
  URLs, the contents of directories, entire domains… :
   – Redirect 301 /old_url.htm http://www.example.com/new_url.htm
   – Redirect 301 /old_dir/ http://www.example.com/new_dir/
   – Redirect 301 / http://www.example.com/
 Pattern matching can be done with RedirectMatch 301
   – RedirectMatch 301 ^/(.+)/index\.html$ http://www.example.com/$1/

Implementing 301 Redirects Using
Rewrite Rules
 Or use a rewrite rule with the [R=301] flag
   – RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
   – RewriteRule ^(.*)$ http://www.example.com/$1 [L,QSA,R=301]
 [NC] flag makes the rewrite condition case-insensitive
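The logic of that condition and rule pair can be sketched in Python to make the decision explicit. `canonical_redirect` is a hypothetical function, and the sketch assumes the host header carries no port number:

```python
import re

CANONICAL_HOST = "www.example.com"
# Same test as the RewriteCond: case-insensitive, anchored, dots escaped.
HOST_OK = re.compile(r"^www\.example\.com$", re.IGNORECASE)

def canonical_redirect(host, path, query=""):
    """Return (status, location) for a non-canonical host,
    or None when no redirect is needed."""
    if HOST_OK.search(host):
        return None
    location = "http://" + CANONICAL_HOST + path
    if query:
        # Keep the query string, as [QSA] does in the rule.
        location += "?" + query
    return (301, location)
```

Escaping the dots matters: with an unescaped pattern, a host such as `wwwXexampleYcom` would count as canonical.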




Conditional Redirects
 Conditional 301 for bots – great for capturing the link juice
  from inbound affiliate links
 Only works if you manage your own affiliate program
 Most are outsourced and use 302 redirects (e.g. C.J.)
 If you outsource your affiliate marketing, none of your deep
  affiliate links count
 If Amazon’s doing it, why can’t you?
   – (Credit to Brian Klais for hypothesizing Amazon was doing this)
   – http://tinyurl.com/5ubc28
Status Code
 200 for humans
 301 for all bots. Muahaha!!

Implementing Conditional Redirects
Using Rewrite Rules
 Selectively redirect bots that request URLs with session
  IDs to the URL sans session ID:
   – RewriteCond %{QUERY_STRING} PHPSESSID
     RewriteCond %{HTTP_USER_AGENT} Googlebot.* [OR]
     RewriteCond %{HTTP_USER_AGENT} ^msnbot.* [OR]
     RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
     RewriteCond %{HTTP_USER_AGENT} Ask\ Jeeves
     RewriteRule ^/(.*)$ /$1? [R=301,L]
   – The trailing ? drops the whole query string, session ID included;
     without it the redirect points back at the same URL
 Utilize browscap.ini instead of having to keep up with
  each spider’s name and version changes
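The same decision can be sketched in application code. This Python version is an illustration under assumptions: `bot_redirect` is a hypothetical function, the user-agent list is copied from the conditions above, and unlike the rewrite rule it removes only the session parameter and keeps any other query parameters.

```python
import re
from urllib.parse import parse_qsl, urlencode

# User-agent patterns taken from the RewriteCond lines; a real list
# needs ongoing maintenance.
BOT_PATTERN = re.compile(r"Googlebot|^msnbot|Slurp|Ask Jeeves")
SESSION_PARAM = "PHPSESSID"

def bot_redirect(path, query, user_agent):
    """Return (301, location) when a known bot requests a URL carrying a
    session ID; return None to serve the request normally."""
    params = parse_qsl(query, keep_blank_values=True)
    if not any(key == SESSION_PARAM for key, _ in params):
        return None
    if not BOT_PATTERN.search(user_agent):
        return None
    # Rebuild the query string without the session parameter.
    kept = urlencode([(k, v) for k, v in params if k != SESSION_PARAM])
    return (301, path + ("?" + kept if kept else ""))
```

Humans keep their session ID and get a normal response; only matching bots are sent to the clean URL.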




Error Pages
 The traditional approach is to serve up a 404, which drops that error
  page with the obsolete or wrong URL out of the search indexes.
  This squanders the link juice to that page.
 But what if you return a 200 status code instead, so that the
  spiders follow the links? Then include a meta robots noindex so the
  error page itself doesn’t get indexed.
 Or do a 301 redirect to something valuable (e.g. your home page)
  and dynamically include a small error notice
 (Credit to Francois Planque for this clever approach.)
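The 200-plus-noindex idea above could look roughly like this in a Python handler. `error_page_response` and its `(status, headers, body)` return shape are hypothetical, chosen to resemble a WSGI response; `site_links` stands for whatever links the site wants spiders to follow from the error page.

```python
from html import escape

def error_page_response(requested_path, site_links):
    """Build a response for an unknown URL: 200 status, a robots
    noindex meta tag, and links that spiders can still follow."""
    links = "\n".join(
        '<li><a href="{0}">{1}</a></li>'.format(escape(url, quote=True),
                                                 escape(text))
        for url, text in site_links
    )
    body = (
        "<html><head>"
        # noindex keeps this page out of the index; links are still followed.
        '<meta name="robots" content="noindex">'
        "<title>Page not found</title></head><body>"
        "<p>Sorry, {0} could not be found.</p>"
        "<ul>\n{1}\n</ul></body></html>"
    ).format(escape(requested_path), links)
    headers = [("Content-Type", "text/html; charset=utf-8")]
    return ("200 OK", headers, body)
```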


URL Stability
 An annually recurring feature, like a Holiday Gift Buying
  Guide, should have a stable, date-unspecified URL
   – No need for any 301s
   – When the current edition is to be retired and replaced with a
     new edition, assign a new URL to the archived edition
 Otherwise link juice earned over time is not carried over
  to future years’ editions



URL Testing
 URL affects searcher clickthrough rates
 Short URLs get clicked on 2X as often as long URLs
  (Source: MarketingSherpa, used with permission)


URL Testing
 Further, long URLs appear to act as a deterrent to clicking, drawing
  attention away from their own listing and instead directing it to the listing
  below, which then gets clicked 2.5x more frequently.
    – http://searchengineland.com/080515-084124.php
 Don’t be complacent with search-friendly URLs. Test and optimize.
 Make iterative improvements to URLs, but don’t lose link juice to
  previous URLs. 301 previous URLs to the latest. No chains of 301s.
 WordPress handles 301s automatically when renaming post slugs
 Mass editing URLs (post slugs) in WordPress – announcement
  tomorrow in Give It Up session
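Avoiding chains of 301s after several rounds of URL changes comes down to pointing every old URL straight at the final one. A small Python sketch, assuming the redirects are kept as a simple old-to-new mapping; `flatten_redirects` is a hypothetical helper:

```python
def flatten_redirects(redirects):
    """Map every old URL directly to its final destination.

    `redirects` maps old URL -> next URL. Raises ValueError on a loop.
    """
    flat = {}
    for start in redirects:
        seen = {start}
        target = redirects[start]
        # Follow the chain until the target is no longer redirected itself.
        while target in redirects:
            if target in seen:
                raise ValueError("redirect loop at " + target)
            seen.add(target)
            target = redirects[target]
        flat[start] = target
    return flat
```

Running this over the redirect table each time a URL is renamed keeps every old URL exactly one hop from the latest version.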
Yank Competitor’s Grouped Results
from Google page 1 SERPs
 Knock out your competitor’s second indented (grouped)
  listing by directing link juice to other non-competitive
  listings (e.g. on page 2 SERPs, or directly below
  indented result’s true position)
 First, find the true position of their indented result by
  appending &num=9 to the URL and see if the indented
  listing drops off. If not, append &num=8. Rinse and
  repeat until the indented listing falls away. Indented
  listing is more susceptible the worse its true position.
   – This isn’t really #3
   – Nope, not yet
   – Gone! Its true position was #9
   – SEO the title of #12 to bump it up to page 1; it will be grouped
     to #2. Then link to #11 and bump it up to page 1 to knock #4 to
     page 2

More Things I Wish I Had Time to
Cover
 Robots.txt gotchas
 Webmaster Central tools (www vs no www, crawl rate, robots.txt
  builder, Sitemaps, etc.)
 Yahoo's Dynamic URLs tab in Site Explorer
 <div class="robots-nocontent">
 If-Modified-Since
 Status codes 404, 401, 500 etc.
 PageRank transfer from PDFs, RSS feeds, Word docs etc.
 Diagnostic tools (e.g. livehttpheaders, User Agent Switcher)

Thanks!
 This Powerpoint can be downloaded from
  www.netconcepts.com/learn/bot-herding.ppt
 For 180 minute long screencast (including 90 minutes
  of Q&A) on SEO for large dynamic websites (taught
  by myself and Chris Smith) – including transcripts –
  email seo@netconcepts.com
 Questions after the show? Email me at
  stephan@netconcepts.com

Advantages of Hiring UIUX Design Service Providers for Your Business
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Bot Herding and Duplicate Content Mitigation Strategies

  • 1. Bot Herding presented by Stephan Spencer, Founder & President, Netconcepts © 2008 Stephan M Spencer Netconcepts www.netconcepts.com sspencer@netconcepts.com
  • 2. Duplicate Content Mitigation
      • Duplicate content is rampant on blogs. Herd bots to the permalink URL, and lead in everywhere else (Archives by Date pages, Category pages, Tag pages, Home page, etc.) with a paraphrased “Optional Excerpt”
          – Not just the first couple of paragraphs, i.e. the <!--more--> tag!
          – Requires you to revise your Main Index Template theme file:
            if (empty($post->post_excerpt) || is_single() || is_page()) { the_content(); }
            else { the_excerpt(); echo "<a href='"; the_permalink(); echo "' rel='nofollow'>Continue reading &raquo;</a>"; }
  • 3. Duplicate Content Mitigation
      • Include a sig line (& headshot photo!) at the bottom of each post/article. Link to the original article/post permalink URL!
          – http://www.naturalsearchblog.com/archives/2008/06/03/syndicating-your-articles/
          – http://www.businessblogconsulting.com/2008/05/brand-yourself-with-photo-sig-line
  • 4. Duplicate Content Mitigation
      • On ecommerce sites, duplicate content is also rampant:
          – Manufacturer-provided product descriptions, inconsistent order of query string parameters, “guided navigation”, pagination within categories, tracking parameters
      • Selectively append tracking codes for humans with “white hat cloaking”, or use JavaScript to append the codes
          – REI.com used to append a "vcat" parameter on all brand links on their Shop By Brand page (see http://web.archive.org/web/20060823085548/www.rei.com/rei/sales_and_events/brands.html)
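Two of the duplicate-content sources named on slide 4 (inconsistent query string parameter order and tracking parameters) can be collapsed by normalizing URLs before they are linked or compared. The following is a minimal Python sketch of that idea, not anything from the deck: the function name `canonicalize`, the example URLs, and every tracking parameter other than `vcat` are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of tracking parameters; only "vcat" comes from the slide.
TRACKING_PARAMS = {"vcat", "utm_source", "utm_medium", "utm_campaign"}


def canonicalize(url, tracking_params=TRACKING_PARAMS):
    """Return url with tracking params removed and the remaining params sorted by name."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = sorted((k, v) for k, v in pairs if k not in tracking_params)
    # The fragment is dropped on purpose: it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Two variants of the same (made-up) page collapse to a single URL.
a = canonicalize("http://www.example.com/shop?brand=acme&color=red&vcat=123")
b = canonicalize("http://www.example.com/shop?color=red&brand=acme")
```

Both `a` and `b` come out as `http://www.example.com/shop?brand=acme&color=red`, so a crawler that only ever sees canonicalized links has one URL to index instead of several.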
  • 5. Pagination
      • Pagination not only creates many pages that share the same keyword theme; in very large categories with thousands of products, it also leaves hundreds of pages of product listings uncrawled, which lowers product page indexation.
      • Herd bots through keyword-rich subcategory links, a “View All” link, or both? How should page number links be displayed? What is the optimal number of products to display/link per page? Test!
  • 6. PageRank Leakage?
      • If you’re using Robots.txt Disallow, you’re probably leaking PageRank
      • Robots.txt Disallow & Meta Robots Noindex both accumulate and pass PageRank
          – A Meta Noindex tag on a Master sitemap page will de-index the page but still pass PageRank to linked sub-sitemap pages
      • Meta Robots Nofollow blocks the flow of PageRank
          – http://www.stonetemple.com/articles/interview-matt-cutts.shtml
  • 7. Rewriting Spider-Unfriendly URLs
      • 3 approaches:
          1) Use a “URL rewriting” server module / plugin, such as mod_rewrite for Apache or ISAPI_Rewrite for IIS Server
          2) Recode your scripts to extract variables out of the “path_info” part of the URL instead of the “query_string”
          3) Or, if IT department involvement must be minimized, use a proxy server based solution (e.g. Netconcepts' GravityStream)
          – With (1) and (2), replace all occurrences of your old URLs in links on your site with your new search-friendly URLs. 301 redirect the old URLs to the new ones too, so no link juice is lost.
  • 8. mod_rewrite – the Foundation for URL Rewriting, Remapping & Redirecting
      • Works with Apache and IBM HTTP Server
      • Place “rules” within .htaccess or your Apache config file (e.g. httpd.conf, sites_conf/…)
          – RewriteEngine on
          – RewriteBase /
          – RewriteRule ^products/([0-9]+)/?$ /get_product.php?id=$1 [L]
          – RewriteRule ^([^/]+)/([^/]+)\.htm$ /webapp/wcs/stores/servlet/ProductDisplay?storeId=10001&catalogId=10001&langId=-1&categoryID=$1&productID=$2 [QSA,P,L]
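To see what the two rules on slide 8 do to an incoming path, it can help to replay them outside the web server. This is a rough Python simulation, not mod_rewrite itself: the `rewrite` function and `RULES` table are hypothetical names, the dot before `htm` is assumed to be escaped, and Python writes backreferences as `\1`, `\2` where mod_rewrite uses `$1`, `$2`.

```python
import re

# (pattern, target) pairs mirroring the slide's two rules.
RULES = [
    (re.compile(r"^products/([0-9]+)/?$"), r"/get_product.php?id=\1"),
    (re.compile(r"^([^/]+)/([^/]+)\.htm$"),
     r"/webapp/wcs/stores/servlet/ProductDisplay?storeId=10001&catalogId=10001"
     r"&langId=-1&categoryID=\1&productID=\2"),
]


def rewrite(path):
    """Return the internal target for path, or None if no rule matches."""
    path = path.lstrip("/")  # per-directory (.htaccess) rules see no leading slash
    for pattern, target in RULES:
        m = pattern.match(path)
        if m:
            return m.expand(target)  # stop at the first match, like the [L] flag
    return None


product = rewrite("/products/42/")
catalog = rewrite("/shoes/1234.htm")
unmatched = rewrite("/about")
```

Here `product` is `/get_product.php?id=42`, `catalog` carries `categoryID=shoes&productID=1234`, and `unmatched` is `None` because neither pattern applies.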
  • 9. Regular Expressions
      • The magic of regular expressions / pattern matching
          – * means 0 or more of the immediately preceding character
          – + means 1 or more of the immediately preceding character
          – ? means 0 or 1 occurrence of the immediately preceding character
          – ^ means the beginning of the string, $ means the end of it
          – . means any character (i.e. wildcard)
          – \ “escapes” the character that follows, e.g. \. means dot
          – [ ] is for character ranges, e.g. [A-Za-z]
          – ^ inside [ ] brackets means “not”, e.g. [^/]
  • 10. Regular Expressions
          – ( ) puts whatever is wrapped within it into memory
          – Access what’s in memory with $1 (what’s in the first set of parens), $2 (what’s in the second set of parens), and so on
      • Gotchas to beware of:
          – “Greedy” expressions. Use a negated character class such as [^/]+ instead of .*
          – .* can match on nothing. Use .+ instead
          – Unintentional substring matches because ^ or $ wasn’t specified
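The gotchas on slides 9 and 10 are easy to check interactively. Below is a small Python sketch using the standard `re` module; the sample paths are made up, and Python exposes captured groups through `groups()` rather than `$1`/`$2`, but the matching behavior shown is the same idea.

```python
import re

url = "shoes/running/1234.htm"

# Greedy: .* grabs as much as it can, so the first group swallows a slash.
greedy = re.match(r"^(.*)/(.*)\.htm$", url)

# Negated class: [^/]+ cannot cross a slash, so this pattern accepts exactly two segments.
strict = re.match(r"^([^/]+)/([^/]+)\.htm$", url)            # no match: three segments
two = re.match(r"^([^/]+)/([^/]+)\.htm$", "shoes/1234.htm")  # matches

# .* can match nothing; .+ requires at least one character.
empty_ok = re.match(r"^products/(.*)$", "products/")
empty_rejected = re.match(r"^products/(.+)$", "products/")

# Without ^ and $, the pattern matches a substring anywhere in the input.
unanchored = re.search(r"products/[0-9]+", "old-products/42/reviews")
anchored = re.search(r"^products/[0-9]+$", "old-products/42/reviews")

# An unescaped dot is a wildcard, so "1234xhtm" slips through.
loose = re.search(r"[0-9]+.htm$", "1234xhtm")
tight = re.search(r"[0-9]+\.htm$", "1234xhtm")
```

In this run `greedy.groups()` is `("shoes/running", "1234")`, `strict`, `empty_rejected`, `anchored` and `tight` are all `None`, and `two.groups()` is `("shoes", "1234")`.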
  • 11. mod_rewrite Specifics
      • Proxy a page using the [P] flag
          – RewriteRule /blah\.html$ http://www.google.com/ [P]
      • The [QSA] flag is for when you don’t want query string params dropped (like when you want a tracking param preserved)
      • The [L] flag saves on server processing (no further rules are processed once a rule matches)
      • Got a huge pile of rewrites? Use RewriteMap and keep a lookup table as a text file
  • 12. IIS? ISAPI_Rewrite!
      • What if your site is running Microsoft IIS Server?
      • ISAPI_Rewrite plugin! Not that different from mod_rewrite
      • In httpd.ini:
          – [ISAPI_Rewrite]
            RewriteRule ^/category/([0-9]+)\.htm$ /index.asp?PageAction=VIEWCATS&Category=$1 [L]
          – Lets a URL like http://www.example.com/index.asp?PageAction=VIEWCATS&Category=207 be served as something like http://www.example.com/category/207.htm
  • 13. Implementing 301 Redirects Using Redirect Directives
      • In .htaccess (or httpd.conf), you can redirect individual URLs, the contents of directories, entire domains…:
          – Redirect 301 /old_url.htm http://www.example.com/new_url.htm
          – Redirect 301 /old_dir/ http://www.example.com/new_dir/
          – Redirect 301 / http://www.example.com/
      • Pattern matching can be done with RedirectMatch 301
          – RedirectMatch 301 ^/(.+)/index\.html$ http://www.example.com/$1/
  • 14. Implementing 301 Redirects Using Rewrite Rules
      • Or use a rewrite rule with the [R=301] flag
          – RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
          – RewriteRule ^(.*)$ http://www.example.com/$1 [L,QSA,R=301]
      • The [NC] flag makes the rewrite condition case-insensitive
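The decision that the host-canonicalization rule on slide 14 encodes can be written out as plain logic. The sketch below is an illustrative Python rendering under stated assumptions, not a replacement for the Apache rule: `canonical_redirect` is a hypothetical helper, it ignores any port in the Host header, and it always redirects to plain `http://` as the slide does.

```python
import re

CANONICAL_HOST = "www.example.com"
# re.IGNORECASE plays the role of the [NC] flag.
_HOST_RE = re.compile(r"^www\.example\.com$", re.IGNORECASE)


def canonical_redirect(host, path, query=""):
    """Return (301, location) when host is not the canonical one, else None."""
    if _HOST_RE.match(host):
        return None  # already on the canonical host: serve normally
    location = "http://" + CANONICAL_HOST + "/" + path.lstrip("/")
    if query:
        location += "?" + query  # keep the query string on the redirect
    return 301, location


bare = canonical_redirect("example.com", "/products/42", "ref=abc")
mixed_case = canonical_redirect("WWW.Example.COM", "/")
```

`bare` is `(301, "http://www.example.com/products/42?ref=abc")`, while `mixed_case` is `None` because the case-insensitive match treats it as the canonical host.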
  • 15. Conditional Redirects
      • Conditional 301 for bots – great for capturing the link juice from inbound affiliate links
      • Only works if you manage your own affiliate program
      • Most are outsourced and use 302s (e.g. C.J.)
      • If you outsource your affiliate marketing, none of your deep affiliate links count
      • If Amazon’s doing it, why can’t you?
          – (Credit to Brian Klais for hypothesizing Amazon was doing this)
          – http://tinyurl.com/5ubc28
  • 16. Status Code 200 for humans
  • 17. 301 for all bots. Muahaha!!
  • 18. Implementing Conditional Redirects Using Rewrite Rules
      • Selectively redirect bots that request URLs with session IDs to the URL sans session ID:
          – RewriteCond %{QUERY_STRING} PHPSESSID
            RewriteCond %{HTTP_USER_AGENT} Googlebot.* [OR]
            RewriteCond %{HTTP_USER_AGENT} ^msnbot.* [OR]
            RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
            RewriteCond %{HTTP_USER_AGENT} Ask\ Jeeves
            RewriteRule ^/(.*)$ /$1? [R=301,L]
          – The trailing ? on the substitution discards the query string, and with it the session ID
      • Utilize browscap.ini instead of having to keep up with each spider’s name and version changes
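The conditional redirect on slide 18 boils down to two checks: does the URL carry a session ID, and is the requester a known bot? A hedged Python sketch of that logic follows. The function name `bot_session_redirect` and the sample user-agent strings are assumptions, and unlike the slide's rule this version matches the `PHPSESSID` parameter name exactly and keeps the other query parameters.

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# User-agent patterns taken from the slide's RewriteCond lines.
BOT_RE = re.compile(r"Googlebot|^msnbot|Slurp|Ask Jeeves")


def bot_session_redirect(user_agent, url, session_param="PHPSESSID"):
    """Return (301, clean_url) for a known bot requesting a session-ID URL, else None."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if not any(k == session_param for k, _ in pairs):
        return None  # no session ID: nothing to clean up
    if not BOT_RE.search(user_agent):
        return None  # humans keep their session URL
    kept = [(k, v) for k, v in pairs if k != session_param]
    clean = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
    return 301, clean


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
human = "Mozilla/5.0 (Windows NT 10.0; rv:120.0) Gecko/20100101 Firefox/120.0"
page = "http://www.example.com/item?id=7&PHPSESSID=abc123"

for_bot = bot_session_redirect(googlebot, page)
for_human = bot_session_redirect(human, page)
```

`for_bot` is `(301, "http://www.example.com/item?id=7")` and `for_human` is `None`, which mirrors the 301-for-bots, 200-for-humans split described on slides 16 and 17.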
  • 19. Error Pages
  • 20. Error Pages
      • The traditional approach is to serve up a 404, which drops the error page with the obsolete or wrong URL out of the search indexes. This squanders the link juice to that page.
      • But what if you return a 200 status code instead, so that the spiders follow the links? Then include a meta robots noindex so the error page itself doesn’t get indexed.
      • Or do a 301 redirect to something valuable (e.g. your home page) and dynamically include a small error notice
      • (Credit to Francois Planque for this clever approach.)
  • 21. URL Stability
      • An annually recurring feature, like a Holiday Gift Buying Guide, should have a stable, date-unspecified URL
          – No need for any 301s
          – When the current edition is to be retired and replaced with a new edition, assign a new URL to the archived edition
      • Otherwise link juice earned over time is not carried over to future years’ editions
  • 22. URL Testing
      • URL affects searcher clickthrough rates
      • Short URLs get clicked on twice as often as long URLs (Source: MarketingSherpa, used with permission)
  • 23. URL Testing
      • Further, long URLs appear to act as a deterrent to clicking, drawing attention away from their own listing and toward the listing below, which then gets clicked 2.5x more frequently.
          – http://searchengineland.com/080515-084124.php
      • Don’t be complacent with search-friendly URLs. Test and optimize.
      • Make iterative improvements to URLs, but don’t lose link juice to previous URLs. 301 previous URLs to the latest. No chains of 301s.
      • WordPress handles 301s automatically when renaming post slugs
      • Mass editing URLs (post slugs) in WordPress – announcement tomorrow in Give It Up session
  • 24. Yank Competitor’s Grouped Results from Google page 1 SERPs
      • Knock out your competitor’s second indented (grouped) listing by directing link juice to other non-competitive listings (e.g. on page 2 SERPs, or directly below the indented result’s true position)
      • First, find the true position of their indented result by appending &num=9 to the URL and seeing whether the indented listing drops off. If not, append &num=8. Rinse and repeat until the indented listing falls away. The worse its true position, the more susceptible the indented listing is.
  • 25. This isn’t really #3
  • 26. Nope, not yet
  • 27. Gone! Its true position was #9
  • 28. SEO the title of #12 to bump it up to page 1 – it will be grouped to #2. Then link to #11 and bump it up to page 1 to knock #4 to page 2
  • 29. More Things I Wish I Had Time to Cover
      • Robots.txt gotchas
      • Webmaster Central tools (www vs no www, crawl rate, robots.txt builder, Sitemaps, etc.)
      • Yahoo's Dynamic URLs tab in Site Explorer
      • <div class="robots-nocontent">
      • If-Modified-Since
      • Status codes 404, 401, 500 etc.
      • PageRank transfer from PDFs, RSS feeds, Word docs etc.
      • Diagnostic tools (e.g. livehttpheaders, User Agent Switcher)
  • 30. Thanks!
      • This PowerPoint can be downloaded from www.netconcepts.com/learn/bot-herding.ppt
      • For a 180-minute screencast (including 90 minutes of Q&A) on SEO for large dynamic websites (taught by Chris Smith and me), including transcripts, email seo@netconcepts.com
      • Questions after the show? Email me at stephan@netconcepts.com