Whenever we talk about SEO for WordPress blogs, the robots.txt file plays a major role in search engine ranking. It lets you stop search engine bots from crawling and indexing the parts of your blog that should stay out of search results. However, a wrongly configured robots.txt file can make your site disappear from search engines completely. So whenever you change your robots.txt file, make sure it is well optimized and does not block access to important parts of your blog.
There are many misunderstandings about indexing and non-indexing of content with robots.txt, and we will look into that aspect in this article as well.
SEO consists of hundreds of elements, and one of the essential parts of SEO is robots.txt. This small text file sitting at the root of your website can help with serious optimization of your website. Most webmasters tend to avoid editing the robots.txt file, but it’s not as hard as killing a snake. Anyone with basic knowledge can create and edit a robots.txt file, and if you are new to this, this post is perfect for your needs.
If your website doesn’t have a robots.txt file yet, you will learn here how to create one. If your blog or website already has a robots.txt file that is not optimized, follow this post to optimize it.
What is WordPress Robots.txt and why should we use it?
The robots.txt file tells search engine robots which parts of your site to crawl and which parts to avoid. When a search bot or spider comes to your site to index it, it reads the robots.txt file first and follows its directions about which pages it may or may not crawl.
If you are using WordPress, you will find the robots.txt file in the root of your WordPress installation. For static websites, if you or your developer has created one, you will find it in your root folder. If you can’t find one, simply create a new Notepad file, name it robots.txt, and upload it to the root directory of your domain using FTP. Here is the ShoutMeLoud robots.txt file; you can see its content and its location at the root of the domain.
How to make a robots.txt file?
As I mentioned earlier, robots.txt is a plain text file. So, if you don’t have this file on your website, open any text editor you like (for example, Notepad) and create a robots.txt file made up of one or more records. Every record carries important information for a search engine. Example:
User-agent: Googlebot
Disallow: /cgi-bin/
If these lines are written in the robots.txt file, Googlebot is allowed to index every page of your site. The cgi-bin folder in the root directory, however, is closed to it, which means Googlebot won’t index the cgi-bin folder.
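To see how a crawler would read such a record, here is a minimal sketch using Python’s standard urllib.robotparser module. The record (Googlebot kept out of /cgi-bin/), the example.com URLs, and the is_allowed helper are assumptions chosen for illustration; real search engines may apply additional matching rules.

```python
from urllib.robotparser import RobotFileParser

# Assumed two-line record: Googlebot may crawl everything except /cgi-bin/.
RECORD = """\
User-agent: Googlebot
Disallow: /cgi-bin/
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs, used only to exercise the record above.
    for url in (
        "http://example.com/my-post/",
        "http://example.com/cgi-bin/script.cgi",
    ):
        print(url, is_allowed(RECORD, "Googlebot", url))
```

Because the record names only Googlebot, a bot with a different name is not restricted by it in this sketch.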
By using the Disallow directive, you can stop any search bot or spider from indexing a page or folder. Many sites disallow their archive folders or pages so that they don’t create duplicate content.
Where can you get the names of search bots?
You can find them in your website’s log, but if you want lots of visitors from search engines, you should allow every search bot, so that every search bot can index your site. To address every search bot at once, write User-agent: * at the start of a record. The rules in that record then apply to every search bot that visits your website.
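One way to pull bot names out of a log is sketched below. It assumes an access log in the common "combined" format, where the user agent is the last quoted field on each line. The sample lines, the IP addresses, and the bot_names helper are hypothetical.

```python
import re
from collections import Counter

# Assumed sample lines in the "combined" access log format.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.4 - - [10/Oct/2023:13:56:01 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '203.0.113.7 - - [10/Oct/2023:13:57:12 +0000] "GET /about/ HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    '192.0.2.10 - - [10/Oct/2023:13:58:40 +0000] "GET /my-post/ HTTP/1.1" 200 4096 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

# The user agent is the last double-quoted field on the line.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')
# A word that contains "bot" after at least one letter, e.g. Googlebot, bingbot.
BOT_TOKEN = re.compile(r"[A-Za-z][\w-]*bot[\w-]*", re.IGNORECASE)


def bot_names(lines):
    """Count the bot-like tokens found in the user-agent field of each log line."""
    counts = Counter()
    for line in lines:
        field = LAST_QUOTED_FIELD.search(line)
        if not field:
            continue
        token = BOT_TOKEN.search(field.group(1))
        if token:
            counts[token.group(0)] += 1
    return counts


if __name__ == "__main__":
    for name, hits in bot_names(SAMPLE_LOG).most_common():
        print(name, hits)
```

The names found this way are the ones you would put after User-agent: if you ever need a rule for one specific bot. This simple pattern only catches agents whose name contains "bot".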
What shouldn’t you do?
1. Don’t clutter the robots.txt file with comments. If you do need one, put it on its own line starting with #.
2. Don’t put a space at the beginning of any line, and don’t put stray spaces inside a directive. An example of what not to write:
Dis allow: /support
3. Don’t change the syntax of the directives. Write them exactly as they are defined.
4. If you want to block more than one directory or page, don’t put all the names on one line like this:
Disallow: /support /cgi-bin /images/
Write a separate Disallow line for each directory or page instead.
5. Use capital and small letters properly. For example, if you want to block the “Download” directory but write “download” in the robots.txt file, the search bot will not match it.
6. If you want every page and directory of your site to be indexed, write:
User-agent: *
Disallow:
7. But if you want no page or directory of your site to be indexed, write:
User-agent: *
Disallow: /
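Several of the points above can be checked with a small script. The following is a sketch using Python’s standard urllib.robotparser module; the allowed helper and the test paths are assumptions for illustration, and other crawlers may handle malformed lines differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Point 2: this parser does not recognize "Dis allow", so the line has no effect.
SPLIT_WORD = "User-agent: *\nDis allow: /support\n"

# Point 4: several names on one line are read as one odd path, not three rules.
ONE_LINE = "User-agent: *\nDisallow: /support /cgi-bin /images/\n"
SEPARATE = "User-agent: *\nDisallow: /support\nDisallow: /cgi-bin\nDisallow: /images/\n"

# Point 5: paths are matched case-sensitively.
CASE = "User-agent: *\nDisallow: /Download/\n"

# Points 6 and 7: an empty Disallow allows everything, "Disallow: /" blocks everything.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

if __name__ == "__main__":
    print("split word blocks /support/faq:", not allowed(SPLIT_WORD, "/support/faq"))
    print("one line blocks /cgi-bin/test.cgi:", not allowed(ONE_LINE, "/cgi-bin/test.cgi"))
    print("separate lines block /cgi-bin/test.cgi:", not allowed(SEPARATE, "/cgi-bin/test.cgi"))
    print("/Download/ rule blocks /download/file.zip:", not allowed(CASE, "/download/file.zip"))
    print("allow-all permits /any/page/:", allowed(ALLOW_ALL, "/any/page/"))
    print("block-all permits /any/page/:", allowed(BLOCK_ALL, "/any/page/"))
```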
After editing the robots.txt file, upload it with any FTP software to the root or home directory of your site.
Robots.txt for WordPress:
You can edit your WordPress robots.txt file either by logging into the FTP account of your server or by using a plugin like Robots Meta to edit it from the WordPress dashboard. There are a few things you should add to your robots.txt file along with your sitemap URL. Adding the sitemap URL helps search engine bots find your sitemap file, which leads to faster indexing of your pages.
Here is a sample robots.txt file for any domain. In the Sitemap line, replace the sitemap URL with your blog’s URL:
Sitemap: http://www.shoutmeloud.com/sitemap.xml

User-agent: *
# disallow all files in these directories
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /archives/
Disallow: /*?*
Disallow: *?replytocom
Disallow: /comments/feed/

User-agent: Mediapartners-Google*
Allow: /

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Mobile
Allow: /
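Before uploading a file like this, you can read it back with a parser as a quick sanity check. The sketch below uses Python’s standard urllib.robotparser (its site_maps() method needs Python 3.8 or later). It is trimmed to the plain-prefix rules of the sample, on the assumption that this parser does simple prefix matching and does not interpret wildcard patterns such as /*?* the way Google does. The load helper and the post URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The sample above, trimmed to rules without wildcard patterns.
SAMPLE = """\
Sitemap: http://www.shoutmeloud.com/sitemap.xml

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /archives/
Disallow: /comments/feed/

User-agent: Googlebot-Image
Allow: /wp-content/uploads/
"""


def load(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = load(SAMPLE)

if __name__ == "__main__":
    print("Sitemaps:", parser.site_maps())
    # Hypothetical URLs on the sample domain.
    for url in (
        "http://www.shoutmeloud.com/some-post/",
        "http://www.shoutmeloud.com/wp-admin/",
    ):
        print(url, parser.can_fetch("*", url))
```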
How to make sure no content is affected by the new robots.txt file?
So now that you have made some changes to your robots.txt file, it’s time to check whether any of your content is affected by them. You can use the ‘Fetch as Googlebot’ tool in Google Webmaster Tools to see whether your content can still be accessed under the new robots.txt rules. This step is simple: log in to Google Webmaster Tools, go to Diagnostics, and open Fetch as Googlebot. Add your site’s posts and check whether there is any issue accessing them.
You can also check for crawl errors caused by the robots.txt file in the Crawl errors section of GWT. Under Diagnostics > Crawl errors, select Restricted by robots.txt and you will see all the links that have been denied by the robots.txt file.
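A similar "restricted by robots.txt" list can be approximated locally before Google reports it. This is only a sketch under stated assumptions: the rules, the example.com URLs, and the restricted_urls helper are hypothetical, and Python’s urllib.robotparser may not match Googlebot’s behavior exactly, especially for wildcard rules.

```python
from urllib.robotparser import RobotFileParser


def restricted_urls(robots_txt: str, urls, agent: str = "Googlebot"):
    """Return the URLs from `urls` that `agent` may not fetch under the rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


# Assumed rules and URLs, for illustration only.
RULES = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /comments/feed/
"""

URLS = [
    "http://example.com/hello-world/",
    "http://example.com/wp-admin/options.php",
    "http://example.com/comments/feed/",
]

if __name__ == "__main__":
    for url in restricted_urls(RULES, URLS):
        print("Restricted:", url)
```

To run the same check against a live site, RobotFileParser also provides set_url() and read(), which download the robots.txt file instead of taking it as text.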
Here is an example of Robots.txt crawl Error for ShoutMeLoud:
You can clearly see that the replytocom links have been rejected by robots.txt, along with other links that should not be part of Google’s index. FYI, the robots.txt file is an essential element of SEO, and you can avoid many duplicate-post issues by updating it.
Are you using the WordPress robots.txt file to optimize your site? Do you have more insights about robots.txt to add? Let us know in the comments. Don’t forget to subscribe to our e-mail newsletter to keep receiving more SEO tips.