• WordPress SEO Optimization Techniques
        • Robots.txt for WordPress SEO

        • Preventing out-of-the-box duplicate content

        • WordPress SEO should start with optimizing your WordPress robots.txt file. The reason is to prevent as much duplicate content as possible, because WordPress produces a lot of duplicate content out of the box. Duplicate content can cause search engines to penalize your site's rankings, PageRank, and crawl rate.

          The WordPress robots.txt file gives search engine robots instructions to follow when they crawl your WordPress blog. These instructions tell search engines not to crawl non-relevant files, folders, images, and duplicate content. Excluding non-relevant folders such as "/wp-admin/", "/wp-content/", and "/wp-includes/" saves bandwidth and speeds up crawling when search engines access your site.

          A robots.txt file is a simple text file that can be created with any plain-text editor, such as Windows Notepad, and then placed in the root directory of your domain.

          Example one: a WordPress robots.txt file for when WordPress is installed in the root directory (/).

          # robots.txt for http://www.yourdomain.com/

          # PARTIAL access (Googlebot)
          User-agent: Googlebot
          Disallow: /*?
          Disallow: /*.css$
          Disallow: /*.inc$
          Disallow: /*.js$
          Disallow: /*.php$
          Disallow: /category/*/*
          Disallow: /comment-page/*
          Disallow: /*/feed/$
          Disallow: /*/feed/rss/$
          Disallow: /*/trackback/$
          Disallow: /wp-

          User-agent: *
          Disallow: /cgi-bin/
          Disallow: /archives/
          Disallow: /category/
          Disallow: /comment-page
          Disallow: /feed
          Disallow: /feed/
          Disallow: /page/
          Disallow: /trackback/
          Disallow: /wp-admin/
          Disallow: /wp-content/
          Disallow: /wp-includes/
          Disallow: /wp-login.php
          Disallow: /index.php

          # Edited last, on 04-05-2009

          Example two: a WordPress robots.txt file for when WordPress is installed in a sub-directory, such as yourdomain.com/blog/.

          # robots.txt for http://www.yourdomain.com/

          # PARTIAL access (Googlebot)
          User-agent: Googlebot
          Disallow: /blog/*?
          Disallow: /blog/*.css$
          Disallow: /blog/*.inc$
          Disallow: /blog/*.js$
          Disallow: /blog/*.php$
          Disallow: /blog/category/*/*
          Disallow: /blog/comment-page/*
          Disallow: /blog/*/feed/$
          Disallow: /blog/*/feed/rss/$
          Disallow: /blog/*/trackback/$
          Disallow: /blog/wp-

          User-agent: *
          Disallow: /blog/archives/
          Disallow: /blog/category/
          Disallow: /blog/comment-page
          Disallow: /blog/feed
          Disallow: /blog/feed/
          Disallow: /blog/page/
          Disallow: /blog/tag/
          Disallow: /blog/trackback/
          Disallow: /blog/wp-admin/
          Disallow: /blog/wp-content/
          Disallow: /blog/wp-includes/
          Disallow: /blog/index.php

          # Edited last, on 04-05-2009
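The two examples above differ only in the path prefix: nothing for a root install, /blog for a sub-directory install. As a rough sketch, the catch-all group could be generated from a single list of paths. The names `COMMON_PATHS`, `build_rules`, and `prefix` are hypothetical, chosen only for illustration, and the list is a subset of the rules shown above.

```python
# Paths taken from the "User-agent: *" groups above, relative to the WordPress install.
COMMON_PATHS = [
    "/archives/",
    "/category/",
    "/comment-page",
    "/feed",
    "/page/",
    "/trackback/",
    "/wp-admin/",
    "/wp-content/",
    "/wp-includes/",
    "/index.php",
]


def build_rules(prefix: str = "") -> str:
    """Return a "User-agent: *" group with every path prefixed by `prefix`.

    `prefix` is "" for a root install, or e.g. "/blog" for a sub-directory install.
    """
    prefix = prefix.rstrip("/")  # tolerate "/blog/" as well as "/blog"
    lines = ["User-agent: *"]
    lines += [f"Disallow: {prefix}{path}" for path in COMMON_PATHS]
    return "\n".join(lines) + "\n"


print(build_rules("/blog"))
```

Calling `build_rules()` with no argument produces the root-install variant. Either way, the generated text still belongs in a robots.txt file at the root of the domain.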

          "User-agent: *" means that the rules which follow apply to all search engine bots/spiders, which will crawl your site according to your instructions.
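To see how a crawler picks a group, here is a minimal sketch using Python's standard `urllib.robotparser` module. The rules are a shortened, wildcard-free subset of the files above (this parser only matches plain path prefixes), and the bot name "SomeOtherBot" is made up for illustration. In this parser, a bot that has its own group follows that group and ignores the "User-agent: *" group.

```python
from urllib.robotparser import RobotFileParser

# Shortened, wildcard-free subset of the example file (an assumption for this demo).
RULES = """\
User-agent: Googlebot
Disallow: /wp-

User-agent: *
Disallow: /category/
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

BASE = "http://www.yourdomain.com"


def allowed(bot: str, path: str) -> bool:
    """Return True if `bot` may fetch `path` under RULES."""
    return parser.can_fetch(bot, BASE + path)


# Googlebot has its own group, so only "Disallow: /wp-" applies to it.
print(allowed("Googlebot", "/wp-admin/"))         # False
print(allowed("Googlebot", "/category/seo/"))     # True
# A bot without its own group falls back to "User-agent: *".
print(allowed("SomeOtherBot", "/category/seo/"))  # False
```

The practical consequence is that any rule meant for every crawler has to be repeated in the Googlebot group as well.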

          "Disallow: /wp-" excludes search engines from crawling all files and folders whose paths start with "/wp-", which helps prevent the indexing of duplicate content.

          "Disallow: /*?" excludes any URL that contains a ? from being crawled.

          "$" at the end of a rule matches the end of the URL, so it can be used to exclude file extensions. For example, /*.css$ matches all URLs that end with .css.
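The "*" and "$" patterns can be illustrated with a small matcher. This is a sketch of the commonly documented wildcard semantics ("*" matches any run of characters, a trailing "$" anchors the end of the URL), not Googlebot's actual implementation. The function name `rule_matches` is hypothetical.

```python
import re


def rule_matches(pattern: str, url_path: str) -> bool:
    """Return True if a Disallow pattern matches the URL path (plus query string)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then turn each "*" into "match anything".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, because rules always match from the start of the path.
    return re.match(regex, url_path) is not None


print(rule_matches("/*.css$", "/wp-content/themes/style.css"))  # True
print(rule_matches("/*.css$", "/style.css?ver=2"))              # False
print(rule_matches("/*?", "/index.php?p=12"))                   # True
print(rule_matches("/wp-", "/wp-admin/"))                       # True
```

Note that without the "$", a rule is a prefix match, which is why "/wp-" covers "/wp-admin/", "/wp-content/", and "/wp-includes/" at once.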

          "#" marks a comment for reference, because search engine spiders do not read lines beginning with #.

          Additional notes:
          * The above robots.txt configuration focuses on keeping as much PageRank as possible on your main page and article pages.
          * Make sure that you name the file exactly "robots.txt" and upload it as ASCII into your website's root directory. Crawlers only request it from the root of the domain, which is why example two still uses http://www.yourdomain.com/ with /blog/ paths.
          * I have split the robots.txt file into two groups, one for User-agent: Googlebot and the other for User-agent: *, which is for all other bots.
          * If you use the Firefox browser, you can also install the SearchStatus extension. It has many SEO options, and it lets you look at many sites' robots.txt files if they are not hidden. This way you can find a nice PR6 site in your niche and see what it uses for a WordPress robots.txt configuration.
          * Make sure that your robots.txt file validates. Most people do not do this, but it is a good idea. To check that your robots.txt validates, please visit Robots.txt Checker.
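As a rough local stand-in for an online checker, a few basic problems can be caught with a short script. The sketch below is a hypothetical `lint_robots` helper, not a full validator. It only flags lines that are not "field: value" pairs, unknown field names, rules that appear before any User-agent line, and a blank line directly after a User-agent line, which some parsers treat as the end of the group.

```python
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def lint_robots(text: str) -> list:
    """Return a list of human-readable warnings for a robots.txt body."""
    warnings = []
    seen_agent = False      # has any User-agent line appeared yet?
    prev_was_agent = False  # was the previous rule line a User-agent line?
    for number, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            if prev_was_agent:
                warnings.append(f"line {number}: blank line right after User-agent")
            prev_was_agent = False
            continue
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue  # comment-only line
        field, sep, _value = line.partition(":")
        field = field.strip().lower()
        if not sep:
            warnings.append(f"line {number}: expected 'field: value'")
            prev_was_agent = False
            continue
        if field not in KNOWN_FIELDS:
            warnings.append(f"line {number}: unknown field '{field}'")
        if field == "user-agent":
            seen_agent = True
            prev_was_agent = True
        else:
            if field in ("allow", "disallow") and not seen_agent:
                warnings.append(f"line {number}: rule before any User-agent line")
            prev_was_agent = False
    return warnings


print(lint_robots("User-agent: *\nDissallow: /wp-admin/\n"))
```

An empty list means none of these particular problems were found. It does not guarantee that every crawler will interpret the file the way you intended.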

          I hope this blog post, "Robots.txt for WordPress SEO", was helpful to you.

          Spunky Jones.

        • Posted on 5th April 2009 by Spunky Jones in WordPress
  • Copyright © 2008 - 2010 Spunky Jones: WordPress SEO Optimization Techniques
    Designed by: Spunky Jones | Coded by: PSD to HTML