Blocking Search Engines from Indexing your SharePoint Site
1. BLOCKING SEARCH ENGINES FROM
INDEXING YOUR SHAREPOINT SITE
Ahmed Madany
Senior SharePoint Consultant
http://eg.linkedin.com/pub/ahmed-madany/35/80/2b6
http://ahmedmadany.wordpress.com/
https://twitter.com/ahmed_madany
2. AGENDA
Introduction
What is Robots.txt?
Location of Robots.txt in SharePoint Site.
Creating Robots File.
Integrate Robots.txt with SharePoint Site.
3. INTRODUCTION
There are many occasions where you may want
to exclude a website or portion of a site from
search engine crawling and indexing.
There are several ways to
prevent Google, Yahoo!, Bing or Ask from
indexing a site’s pages.
One of them is a robots.txt file, which asks
search engines not to index the site so that it
does not show up in their results.
4. WHAT IS ROBOTS.TXT
Robots.txt is a text (not html) file placed in the
root of your site to tell search robots which
pages should and should not be
visited/indexed.
Search engines are not obliged to follow the
instructions in robots.txt, but in general they
honor what the file asks them not to do.
5. WHAT IS ROBOTS.TXT
It is important to note that robots.txt does not
actually stop search engines from crawling your
site (i.e. it is not a firewall). Having a robots.txt
file on your site is like putting a note saying
“Please, do not enter” on your unlocked front
door. Put simply, it will not keep thieves out, but
the good guys will not open the door and enter.
It follows that if you have sensitive data, you
cannot rely on robots.txt alone to keep it from
being indexed and displayed in search results.
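To make the "note on the door" idea concrete, here is a minimal sketch in Python of the check a well-behaved crawler runs on its own side before fetching a page. It uses the standard library's urllib.robotparser; the crawler name ExampleBot and the page URL are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rules a site owner might publish to ask every crawler to stay away.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the URL are hypothetical.
    page = "http://www.sitename.com/Pages/default.aspx"
    print(is_allowed(ROBOTS_TXT, "ExampleBot", page))
```

Nothing on the server enforces the answer. A crawler that never performs this check can still request the page, which is why sensitive content needs real access control.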
6. LOCATION OF ROBOTS.TXT IN SHAREPOINT SITE
The location of robots.txt is very important.
It must be in the main (root) directory of the site,
because otherwise user agents (search engines)
will not be able to find it.
Search engines look only in the main directory
(i.e. http://www.sitename.com/robots.txt), and if
they do not find it there, they simply assume that
the site does not have a robots.txt file.
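As a small illustration of the location rule, the sketch below derives the address a crawler would request from any page address on the same site. The function name robots_url and the sample page path are assumptions used only for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Build the root-level robots.txt address for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # The sub-site path below is hypothetical.
    print(robots_url("http://www.sitename.com/sites/hr/Pages/default.aspx"))
```

However deep the page sits, the result points at the root of the host, so a robots.txt stored inside a sub-site or document library would not be found.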
7. CREATING ROBOTS FILE
Creating a Robots.txt
Launch Notepad
Put the following in your robots.txt file to ask search
engines not to index any part of your site:
User-agent: *
Disallow: /
Save the file as: robots.txt
Note: Make sure you understand that the URLs listed in
robots.txt are case sensitive.
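The case-sensitivity note can be demonstrated with a short sketch using Python's standard urllib.robotparser. The /Reports/ folder and the page names are hypothetical, chosen only to show that a rule written with one casing does not cover the same path written with another.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule that blocks a single folder instead of the whole site.
RULES = """\
User-agent: *
Disallow: /Reports/
"""


def is_blocked(rules: str, path: str) -> bool:
    """Return True if the rules ask all crawlers to stay away from path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch("*", path)


if __name__ == "__main__":
    print(is_blocked(RULES, "/Reports/q1.aspx"))  # casing matches the rule
    print(is_blocked(RULES, "/reports/q1.aspx"))  # casing differs from the rule
```

Under this parser only the first path is blocked, so it is safest to write each Disallow path with exactly the casing the site uses in its URLs.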
8. INTEGRATE ROBOTS.TXT WITH SHAREPOINT
SITE
Adding a robots.txt file to the root of your SharePoint
site.
Open up your root site in SharePoint Designer.
Double-click the All Files folder.
Drag and drop the newly created robots.txt to the All
Files folder.
Exit SharePoint Designer.
Alternatively you can create the robots.txt from within
SharePoint Designer itself.
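Once the file is in place, it can be worth confirming that it is really served from the root of the site. The following is a minimal sketch, assuming the site is reachable over HTTP without authentication; the names fetch_text and blocks_whole_site are invented for this example, and the site address is the placeholder used earlier.

```python
from urllib.parse import urljoin
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser


def fetch_text(url: str) -> str:
    """Download a URL and return its body as text (raises on HTTP errors)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def blocks_whole_site(site_root: str, fetch=fetch_text) -> bool:
    """Return True if the site's robots.txt asks all crawlers to avoid every page."""
    text = fetch(urljoin(site_root, "/robots.txt"))
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return not parser.can_fetch("*", urljoin(site_root, "/"))


if __name__ == "__main__":
    # Placeholder address; replace with the root URL of your own site.
    print(blocks_whole_site("http://www.sitename.com"))
```

If the request fails with a "not found" error, the file is probably not in the root site, which matches how a search engine would see it.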