SEO and Robots.txt

The robots.txt file is the first file that a search engine requests when it indexes your site. Use this file to tell search engines which pages on your site not to index. Have your robots.txt file in place before you go live, especially if you use faceted navigation. Faceted navigation can generate a large number of URLs for pages that appear to search engines to have the same content. Because duplicate content has a negative impact on your search engine ranking, use the robots.txt file to control what is indexed and to prevent search engines from indexing pages that appear to be the same. For information about creating the robots.txt file when you use faceted navigation, see Robots.txt with Categories and Facets.

Important: Test the Robots.txt File

Before you take a site live, test the robots.txt file to confirm that each type of URL is allowed or blocked as intended. The best tool available for this test is the Robots Testing Tool in Google Webmaster Tools:

https://www.google.com/webmasters/tools/robots-testing-tool
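
As a complement to the online tool, simple rules can also be sanity-checked locally before go-live. The following is a minimal sketch that uses Python's standard `urllib.robotparser` module. The host `www.example.com`, the `buyid` query parameter, and the helper name `is_allowed` are placeholders chosen for illustration. Note that this module matches rules by plain path prefix and does not interpret `*` wildcards inside paths, so it is suitable only for rules without wildcards in the path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules to check; replace with the contents of your own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /additemtocart.nl
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may crawl url under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URLs; substitute real URLs from your site.
    for url in (
        "https://www.example.com/",
        "https://www.example.com/additemtocart.nl?buyid=123",
    ):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, "Googlebot", url) else "blocked"
        print(f"{verdict}: {url}")
```

A local check like this does not replace a test against the search engine's own tool, because each crawler may interpret rules slightly differently.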

Sample robots.txt file.

How to Create the Robots.txt File

The robots.txt file is a text file. You can use any text editor to create the file.

Robots.txt Common Commands

The following sample robots.txt files show commonly used ways to allow or disallow indexing.

Allow all web crawlers to crawl all content:

    User-agent: *
    Disallow:

Block all web crawlers from all content:

    User-agent: *
    Disallow: /

Block a specific web crawler from all content:

    User-agent: Googlebot
    Disallow: /

Block a specific web crawler from a specific facet and all its values:

    User-agent: Googlebot
    Disallow: /facet/*

Block all crawlers from a specific facet, regardless of where the facet appears in the URL:

    User-agent: *
    Disallow: */facet/*

Allow all crawlers to crawl a specific facet value within a facet, regardless of where the facet appears in the URL:

    User-agent: *
    Disallow: */facet/*
    Allow: */facet/facet-value-1

Allow all crawlers to crawl a specific facet value within a facet only when the facet appears first in the URL:

    User-agent: *
    Disallow: /facet/*
    Allow: /facet/facet-value-1
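
To see why the leading `*` matters in the facet rules above, the following minimal Python sketch evaluates wildcard patterns against URL paths. It assumes the precedence documented by Google and RFC 9309: the longest matching pattern wins, and Allow wins a tie. Other crawlers may behave differently. The function names and the `/other/` path segment are hypothetical and used only for illustration.

```python
import re


def _pattern_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    """
    anchored_end = path_pattern.endswith("$")
    if anchored_end:
        path_pattern = path_pattern[:-1]
    body = ".*".join(re.escape(part) for part in path_pattern.split("*"))
    return re.compile(body + ("$" if anchored_end else ""))


def is_path_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Decide whether path may be crawled.

    rules is a list of ("allow" | "disallow", pattern) pairs for one user agent.
    Assumption: the longest matching pattern wins, and Allow wins a tie.
    """
    best_len = -1
    allowed = True  # a path that matches no rule is crawlable
    for kind, pattern in rules:
        if not pattern:  # an empty Disallow blocks nothing
            continue
        if _pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


if __name__ == "__main__":
    any_position = [("disallow", "*/facet/*"), ("allow", "*/facet/facet-value-1")]
    print(is_path_allowed(any_position, "/other/facet/facet-value-2"))
    print(is_path_allowed(any_position, "/other/facet/facet-value-1"))
```

With the `*/facet/*` rules, the first call reports the path as blocked and the second as allowed. With `/facet/*` instead, a path such as `/other/facet/facet-value-2` matches no rule at all, because the pattern applies only when the facet is the first segment of the path.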

        

Block all web crawlers from adding items to cart by following “Add to Cart” links:

Note: This applies only to Site Builder sites. Commerce web stores do not have “Add to Cart” links available to web crawlers.

    User-agent: *
    Disallow: /additemtocart.nl

Robots.txt File Location

The robots.txt file should reside in the root folder of your website. You can create the robots.txt file on your local drive and upload it to the file cabinet.

To add the robots.txt file to the file cabinet:

  1. Go to Documents > Files > File Cabinet.

  2. In the file cabinet, go to Web Site Hosting files > Live Hosting Files.

  3. Click Add File.

  4. Browse to the location of your robots.txt file and select it.

  5. Click Open. This adds the file to the file cabinet.
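
After the upload, one way to confirm that the file is served from the root of the site is to request it directly. Crawlers look for robots.txt at the root of the host, regardless of which page they start from. The following minimal Python sketch derives that root URL from any page URL and downloads the file. The host `www.example.com` and both function names are placeholders, not part of the product.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots_txt(page_url: str, timeout: float = 10.0) -> str:
    """Download robots.txt from the site root and return its text."""
    with urlopen(robots_txt_url(page_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder URL; substitute any page on your own site.
    print(robots_txt_url("https://www.example.com/shop/item?id=1"))
```

If `fetch_robots_txt` raises an HTTP error for your domain, the file is likely not in the folder that the site serves as its root.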

Related Topics

Robots.txt with Categories and Facets
SEO and Item Reviews
