• SEO Optimization Techniques & WordPress Plugins
        • Robots.txt for WordPress SEO

        • WordPress SEO should begin with optimizing your WordPress robots.txt file. The goal is to prevent as much duplicate content as possible, because WordPress out of the box produces a lot of it. Duplicate content can cause search engines to penalize your site's rankings, PageRank and crawl rate.

          The WordPress robots.txt file gives search engine "robots" instructions to follow when they crawl your WordPress blog. These instructions tell search engines not to crawl non-relevant files, folders, images and duplicate content. Excluding non-relevant folders such as "/wp-admin/", "/wp-content/" and "/wp-includes/" saves bandwidth and speeds up crawling when search engines access your site.

          A robots.txt file is a plain text file that you can create with any text editor, such as Windows Notepad, and then place in the root directory of your domain.

          Example one: a WordPress robots.txt file for when WordPress is installed in the root directory, such as /.

          # robots.txt for http://www.yourdomain.com/

          #  PARTIAL access (Googlebot)
          User-agent: Googlebot
          Disallow: /*?
          Disallow: /*.css$
          Disallow: /*.inc$
          Disallow: /*.js$
          Disallow: /*.php$
          Disallow: /category/*/*
          Disallow: /comment-page/*
          Disallow: /*/feed/$
          Disallow: /*/feed/rss/$
          Disallow: /*/trackback/$
          Disallow: /wp-

          User-agent: *
          Disallow: /cgi-bin/
          Disallow: /archives/
          Disallow: /category/
          Disallow: /comment-page
          Disallow: /feed
          Disallow: /feed/
          Disallow: /page/
          Disallow: /trackback/
          Disallow: /wp-admin/
          Disallow: /wp-content/
          Disallow: /wp-includes/
          Disallow: /wp-login.php
          Disallow: /index.php

          # Edited last, on 04-05-2009

          Example two: a WordPress robots.txt file for when WordPress is installed in a sub-directory, such as yourdomain.com/blog/.

          # robots.txt for http://www.yourdomain.com/

          #  PARTIAL access (Googlebot)
          User-agent: Googlebot
          Disallow: /blog/*?
          Disallow: /blog/*.css$
          Disallow: /blog/*.inc$
          Disallow: /blog/*.js$
          Disallow: /blog/*.php$
          Disallow: /blog/category/*/*
          Disallow: /blog/comment-page/*
          Disallow: /blog/*/feed/$
          Disallow: /blog/*/feed/rss/$
          Disallow: /blog/*/trackback/$
          Disallow: /blog/wp-

          User-agent: *
          Disallow: /blog/archives/
          Disallow: /blog/category/
          Disallow: /blog/comment-page
          Disallow: /blog/feed
          Disallow: /blog/feed/
          Disallow: /blog/page/
          Disallow: /blog/tag/
          Disallow: /blog/trackback/
          Disallow: /blog/wp-admin/
          Disallow: /blog/wp-content/
          Disallow: /blog/wp-includes/
          Disallow: /blog/index.php

          # Edited last, on 04-05-2009
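
Example two is largely example one with /blog added in front of each path. A minimal Python sketch of that transformation follows. The helper name prefix_rules is hypothetical, not a WordPress or library function, and it prefixes every rule blindly, whereas the hand-written example two also drops entries such as /cgi-bin/ that do not live under the blog.

```python
def prefix_rules(robots_txt, subdir):
    """Return robots_txt with subdir prepended to every Disallow/Allow path."""
    prefix = subdir.rstrip("/")  # "/blog/" and "/blog" both become "/blog"
    out = []
    for line in robots_txt.splitlines():
        field, sep, value = line.partition(":")
        path = value.strip()
        is_rule = sep and field.strip().lower() in ("disallow", "allow")
        if is_rule and path.startswith("/"):
            # Rewrite only the path; keep the field name as written.
            out.append(f"{field}: {prefix}{path}")
        else:
            # Comments, User-agent lines and blank lines pass through unchanged.
            out.append(line)
    return "\n".join(out)


root_rules = "User-agent: *\nDisallow: /wp-admin/\nDisallow: /feed"
print(prefix_rules(root_rules, "/blog/"))
```

Review the result by hand before uploading it, since not every path on the domain moves with the blog.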

          The "User-agent: *" line means that the rules below it apply to all search engine bots/spiders that do not have a more specific group of their own, such as the Googlebot group above.
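
How a crawler picks its group can be checked with Python's standard-library urllib.robotparser: a crawler that finds a group naming it follows that group only, and every other crawler falls back to the * group. This is a minimal sketch under a few assumptions: the two-group file is a shortened stand-in for the examples above, "OtherBot" is a made-up crawler name, and only plain prefix rules are used because this parser does not interpret the * and $ wildcards.

```python
from urllib.robotparser import RobotFileParser

# Shortened stand-in for the two-group file (prefix rules only).
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /wp-

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
"""

SITE = "http://www.yourdomain.com"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "OtherBot" is a hypothetical crawler with no group of its own.
for agent in ("Googlebot", "OtherBot"):
    for path in ("/wp-content/themes/", "/cgi-bin/test", "/my-post/"):
        allowed = parser.can_fetch(agent, SITE + path)
        print(f"{agent:10} {path:22} {'allowed' if allowed else 'blocked'}")
```

In this sketch Googlebot is blocked from /wp-content/themes/ by its own "/wp-" rule but may fetch /cgi-bin/test, because the * group does not apply to it. Any rule that should bind every bot has to be repeated in each group.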

          The "Disallow: /wp-" rule stops search engines from crawling all files and folders whose paths start with "/wp-", which helps keep duplicate content out of the index.

          The "Disallow: /*?" rule blocks any URL that contains a ?.

          The "$" character at the end of a rule anchors it to the end of the URL, which lets you exclude file extensions.
          For example, /*.css$ matches all URLs that end with .css.
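
To make the * and $ matching concrete, here is a small Python sketch that turns a rule into a regular expression. The helper names rule_to_regex and is_blocked are illustrative and not part of any library. The sketch follows the wildcard behaviour that Google documents for Googlebot; other crawlers may not support wildcards at all.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule into a compiled regular expression."""
    anchored = rule.endswith("$")  # a trailing $ pins the rule to the URL's end
    if anchored:
        rule = rule[:-1]
    # Each * matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, rules):
    """True if any rule matches the path, starting from its first character."""
    return any(rule_to_regex(rule).match(path) for rule in rules)


rules = ["/*?", "/*.css$", "/wp-"]
for path in ("/?p=12", "/style.css", "/wp-admin/", "/my-post/"):
    print(path, "blocked" if is_blocked(path, rules) else "allowed")
```

Note that /*.css$ on its own does not match /style.css?ver=2, because the URL no longer ends in .css; in the examples above such URLs are caught by the /*? rule instead.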

          A # starts a comment for your own reference; search engine spiders ignore everything after a # on a line.

          Additional notes:
          * The above robots.txt configuration focuses on keeping as much PageRank as possible on your main page and article pages.
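          * You must name the file exactly "robots.txt" and upload it in ASCII mode to your website's root directory. Crawlers only request it from the root of the domain, so even when WordPress is installed in a sub-directory, the file stays in the root and the rules carry the sub-directory prefix, as in example two.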
          * I have split the robots.txt file into two groups, one for User-agent: Googlebot and the other for User-agent: *, which is for all other bots.
          * You can also install the SearchStatus extension if you use the Firefox browser. It has many SEO options, and it will let you look at many sites' robots.txt files if they are not hidden. This way you can find a nice PR6 site in your niche and see what it uses for a WordPress robots.txt configuration.
          * Make sure that your robots.txt file validates. Most people do not do this, but it is a good idea. To check that your robots.txt validates, please visit Robots.txt Checker.
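
Before running the file through a checker, a quick local sanity check can catch the most common slips. The Python sketch below is only a rough lint, not a substitute for a real validator: lint_robots and KNOWN_FIELDS are hypothetical names, and the list of accepted fields is an assumption covering the directives most crawlers recognise.

```python
# Fields this sketch accepts; anything else is reported as unknown.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def lint_robots(text):
    """Return a list of (line_number, message) for suspicious lines."""
    problems = []
    seen_agent = False  # becomes True once any User-agent line has appeared
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, sep, _value = line.partition(":")
        field = field.strip().lower()
        if not sep:
            problems.append((number, "missing ':' separator"))
        elif field not in KNOWN_FIELDS:
            problems.append((number, f"unknown field '{field}'"))
        elif field == "user-agent":
            seen_agent = True
        elif field in ("disallow", "allow") and not seen_agent:
            problems.append((number, "rule appears before any User-agent line"))
    return problems


sample = "Disallow: /x\nUser-agnt: *\nbogus line"
for number, message in lint_robots(sample):
    print(f"line {number}: {message}")
```

An empty result only means the file is well-formed by these three checks; it says nothing about whether the rules block what you intended.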

          I hope this blog post, "Robots.txt for WordPress SEO", was helpful to you.

          Spunky Jones.





        • Posted on 5th April 2009 by Spunky Jones in WordPress
  • Copyright © Spunky Jones - SEO Optimization Techniques