A robots.txt file lets you control how search engine bots crawl your blog.
You can allow or disallow particular bots from crawling certain posts or the complete website.
Duplicate content is a serious problem for WordPress blogs, and it directly affects your search engine results.
Pages such as the admin folder should not be crawled by bots, which you can control by adding entries to your robots.txt.
You can check the status of your robots.txt in Google Webmaster Tools > Tools > Analyze robots.txt.
The basic structure of a robots.txt that avoids duplicate content should be something like this:
User-agent: *
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
Disallow: /comments/feed/
Disallow: /page/
Disallow: /comments/
This prevents robots from crawling your admin folder (along with every other path that starts with /wp-), followed by feeds, trackbacks, comment feeds, paginated archive pages and comments.
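Before uploading rules like these, it can help to test them locally. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper names (build_parser, is_allowed) and the sample path /my-first-post/ are made up for illustration, and real crawlers may interpret some rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post, copied verbatim.
RULES = """\
User-agent: *
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
Disallow: /comments/feed/
Disallow: /page/
Disallow: /comments/
"""


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


def is_allowed(path: str, user_agent: str = "*") -> bool:
    """Return True if user_agent may fetch path under RULES (hypothetical helper)."""
    return build_parser(RULES).can_fetch(user_agent, path)


if __name__ == "__main__":
    # /my-first-post/ is an assumed example permalink, not taken from the post.
    for path in ("/wp-admin/", "/feed/", "/page/2/", "/my-first-post/"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Because each Disallow line is matched as a prefix of the URL path, the check also shows that /wp- covers /wp-content/ as well as the admin folder, so uploaded images under /wp-content/uploads/ would be blocked too.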






very useful post
Thanks Harsh. This post is really handy.
I've made the changes, thanks for the info.
Thanks for the info, working on the robots.txt now.
great stuff thanks for sharing.
Can Blogspot users also edit their robots.txt file?
User-agent: *
Crawl-delay: 2
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /category
Disallow: /tag
Disallow: /author
Disallow: /trackback
Disallow: /*trackback
Disallow: /*trackback*
Disallow: /*/trackback
Disallow: /*?*
Disallow: /*.html/$
Disallow: /*feed*
# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /*
# Google AdSense
User-agent: Mediapartners-Google*
Disallow:
Allow: /*
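Several rules in the file above rely on the * and $ wildcards. As a rough sketch of how such patterns are commonly interpreted (* matches any run of characters, a trailing $ anchors the end of the path), the Python snippet below converts a pattern to a regular expression. The function names and sample paths are assumptions for illustration, and individual crawlers may handle edge cases differently.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex (illustrative helper)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each * match any sequence of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def pattern_matches(pattern: str, path: str) -> bool:
    """Return True if the pattern matches path, starting from the beginning of the path."""
    return robots_pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    # Sample paths are made up; the patterns come from the file above.
    checks = [
        ("/*?*", "/my-post/?replytocom=5"),
        ("/*?*", "/my-post/"),
        ("/*feed*", "/my-post/feed/"),
        ("/*.html/$", "/my-post.html/"),
        ("/*.html/$", "/my-post.html"),
    ]
    for pattern, path in checks:
        print(pattern, path, pattern_matches(pattern, path))
```

Under this reading, Disallow: /*?* blocks any URL with a query string, and Disallow: /*feed* blocks any path containing "feed", including a post whose slug happens to contain that word.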