How to Find and Fix Low-Quality Content to Build a Quality Site

When I started blogging in 2008, it was a completely different experience from how we blog now. Back then, no matter how good or bad your post was, you ended up getting some traffic from search engines. In fact, a few low-quality posts would never affect your whole domain. When Google launched Google Panda (first released in February 2011 and rolled out globally in April 2011), it changed the face of the blogosphere.


Panda shifted blogging and SEO from quantity to quality. Having thousands or lakhs of pages indexed in Google won’t give great results; only quality articles will help you survive this SEO battle. The major difference with the Panda algo is that if your blog has even a few low-quality posts, your whole site may lose quality value, and you will notice lower rankings in the SERPs. Now the question is how you can find such low-quality content and work on it to improve your site.

What is low-quality content?

I have a client (a tech blog) who has more than 5000 blog posts, and despite it being a genuine and useful tech blog, his website was affected by these algorithm changes. So, let’s assume you are one of those bloggers who worked day and night to grow your blog, and the Panda updates took away all the search engine love. FYI, the latest Panda 4.0 update happened yesterday, and Google is not going to give your lost traffic back unless you take some constructive action. So, if your site is still missing out on search engine traffic (after a Panda penalty), it’s time for you to take some serious action and do a complete SEO audit of your site.

Before I move ahead with finding low-quality posts, read What Google expects in high-quality content. It will give you an idea of what search engines expect from us and what you can do to improve your content. Low-quality content, or thin content, means posts that don’t add much value and that miss some of the important ingredients of quality content and the SEO requirements for on-page optimization.

Honestly, there is no handbook for identifying thin content, so I’m going to share my experience of what I did here at ShoutMeLoud and what actions I took to improve or get rid of such thin content. You should also analyze and find out which algorithm update hit your site. Here is a complete Google algo change timeline for your reference.

Here are the general methods that we will be following:

  • Finding content with no search engine traffic
  • Finding content with very little search engine traffic
  • Finding duplicate content
  • Finding content with a low word count and taking action
  • Checking for plagiarism
  • Finally, fixing broken links and waiting a few days to see the changes

Also, you need to learn and get used to the following things, which will help us in this tutorial:

I know it’s a lot of reading, but by the end you will be very close to understanding how SEO works now, and you will have a better plan for Panda recovery.

Fixing low-quality content quickly:

Now, let us get rid of the pages that can be removed quickly, so that you can easily lower the amount of thin content in the search index. The first step is to check how Google is indexing your blog. You can log in to Webmaster Tools and check how many pages Google has indexed from your blog. If you have 2000 blog posts and you see more than 10000 indexed pages, something is certainly not right.

If your categories and tags are indexed in Google search, I recommend you get rid of them. Also, do a quick search in Google using the site: operator with your own domain (site:yourdomain.com, where yourdomain.com is a placeholder).

You can then see all the pages Google has indexed and get rid of all the pages that don’t add any value to search or to readers, for example tag, category, author, and paginated pages. You can add noindex tags to these pages using the Meta Robots plugin, and with the URL removal feature in your Google Webmaster Tools dashboard you can get rid of a complete folder in one go. This will help you get rid of tons of pages that are not useful.
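
If you have a long list of indexed URLs (copied from search results or a crawl), sorting them by hand gets tedious. Here is a minimal Python sketch that flags likely archive pages as noindex candidates. The path patterns are my assumption based on the default WordPress permalink bases (/tag/, /category/, /author/, /page/N/), and the function names are hypothetical, so adjust them to your own URL structure.

```python
import re
from urllib.parse import urlparse

# Assumed path patterns (WordPress default bases); change these for your site.
ARCHIVE_PATTERNS = {
    "tag": re.compile(r"^/tag/"),
    "category": re.compile(r"^/category/"),
    "author": re.compile(r"^/author/"),
    "paginated": re.compile(r"/page/\d+/?$"),
}


def classify_url(url):
    """Return the archive type of a URL, or "content" if no pattern matches."""
    path = urlparse(url).path
    for label, pattern in ARCHIVE_PATTERNS.items():
        if pattern.search(path):
            return label
    return "content"


def noindex_candidates(urls):
    """Return the URLs that look like tag, category, author, or paginated pages."""
    return [url for url in urls if classify_url(url) != "content"]
```

Anything the sketch returns is only a candidate; review the list before you noindex or remove a page.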


Also, check whether any URL parameter is indexed in Google. If it’s not important, you can use the URL parameter handling feature in GWT to get rid of such links. For me, the replytocom parameter was an issue, and you can read about it here: How to get rid of replytocom links.
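
To see which URL parameters show up most often in a list of indexed URLs, you can count them with a few lines of Python. This is only a sketch: the function name is hypothetical, and it assumes you already have the URLs as plain strings.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs


def count_query_parameters(urls):
    """Count how many URLs carry each query parameter (e.g. replytocom)."""
    counts = Counter()
    for url in urls:
        # keep_blank_values=True so that "?replytocom=" is still counted.
        params = parse_qs(urlparse(url).query, keep_blank_values=True)
        counts.update(params.keys())
    return counts
```

For example, `count_query_parameters(urls).most_common(5)` lists the five most frequent parameters, which is a reasonable place to start looking.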

Identifying thin content with No traffic/negligible traffic


Here at ShoutMeLoud, I use Google Analytics to monitor site traffic and stats, and it is one free tool that I highly recommend to every webmaster. Log in to your Google Analytics dashboard > Site Content > All Pages > Show rows = 500 and start browsing through the pages. When you reach pages with only 1-10 visits, it’s time to make a list of such content. You might like to increase the time range to 3 months so that you get accurate results. If you have very few pages/posts, you can use Google Analytics javascript to analyze such pages individually.
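
If you export the All Pages report as a CSV file, you can build the low-traffic list with a short script instead of paging through the report. This is a minimal sketch: the column names "Page" and "Pageviews" are assumptions about the export, and the function name is hypothetical, so match them to the headers in your own file.

```python
import csv
import io


def low_traffic_pages(csv_text, max_views=10, page_col="Page", views_col="Pageviews"):
    """Return (page, views) pairs with at most max_views views, lowest first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        # Exports often format numbers as "1,234", so strip the commas first.
        views = int(row[views_col].replace(",", ""))
        if views <= max_views:
            flagged.append((row[page_col], views))
    return sorted(flagged, key=lambda item: item[1])
```

Pages with zero views may not appear in the export at all, so compare the result against your full list of posts as well.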

Duplicate content:

Duplicate content is one of the most common reasons for a site penalty and for your site being marked as low quality. Duplicate content happens for many reasons, and one of them is the poor SEO structure of your site. I have already shared a detailed article on how you can identify duplicate content on your blog.
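
As a rough first pass, you can also compare two posts yourself. The sketch below uses word shingles and Jaccard similarity, which is a generic text-comparison technique and not the method from my duplicate content article or anything Google has published. The function names and the shingle size of 3 are illustrative assumptions.

```python
def shingles(text, size=3):
    """Return the set of overlapping word groups (shingles) in the text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Return shared shingles divided by all shingles (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

A score close to 1.0 means the two texts are nearly identical. Where to draw the line is a judgment call, so treat a high score as a reason to look at the pair, not as proof.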

Blog posts with a low word count:

Let me start by saying that it’s a myth that fewer words means low-quality content; it all depends on how you write the content and which topic you are covering. If you cover an important topic in 200 words with proper keywords and LSI words along with outbound links, it could be a good article and rank pretty well. That said, one recent SEO case study that I read (SEO debunked) clearly showed that pages with more words rank higher. Now I try to write posts with a minimum of 400 words, and if I’m covering a news post, I make sure to add my insights and opinion to make it more meaningful.

Start by finding all the content that has a very low word count, analyze its traffic, and identify the content that is getting no traffic. These posts are your first enemy, and you should start optimizing them and taking proper action. In my case, I found almost 80 such articles with a word count below 100, and they were performing badly. I will share my tips on how I optimized thin content later in this article. You can use the Admin word count plugin to see the word count of posts from the dashboard.
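
If you are not on WordPress, or you have your posts exported as HTML, a small script can do the same count. This is a minimal sketch using only the Python standard library. The function names are hypothetical, and the 100-word threshold simply mirrors the number from my own cleanup.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text between HTML tags (script/style text is not filtered)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def word_count(html):
    """Count whitespace-separated words in the visible text of an HTML string."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    return len(" ".join(extractor.parts).split())


def thin_posts(posts, min_words=100):
    """posts maps a slug to its HTML; return slugs with fewer than min_words words."""
    return sorted(slug for slug, html in posts.items() if word_count(html) < min_words)
```

Combine the result with your traffic list: a short post that still gets visits deserves an update, not removal.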

Content with broken links:

Let’s assume you wrote about a very useful plugin or software a year back, and now that link is outdated or gives a 404 error. Do you think such blog posts will add any value for people? You will notice a higher bounce rate, and the average time on page will go down. Use the Broken Link Checker plugin to quickly identify all broken links on your WordPress blog and fix them. Either update the post with the new link, or add a notice that the link no longer works and manually link to other related posts/software so that readers can find something meaningful. For other platforms, you can use the Xenu link checker desktop tool to identify 404 links on your site.
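
If you prefer a script over a plugin or desktop tool, the idea behind a link checker can be sketched in a few lines of Python. This is not how Broken Link Checker or Xenu work internally; it is just a minimal illustration with hypothetical function names. It sends a HEAD request to each link and reports anything that fails or returns a 4xx/5xx status.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails outright."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # e.g. 404 or 410
    except (urllib.error.URLError, OSError, ValueError):
        return None  # DNS failure, timeout, malformed URL, and so on


def find_broken_links(links, fetch_status=http_status):
    """Return {link: status} for links that fail or answer with status >= 400."""
    broken = {}
    for link in links:
        status = fetch_status(link)
        if status is None or status >= 400:
            broken[link] = status
    return broken
```

Some servers reject HEAD requests even when the page is fine, so double-check a flagged link in your browser before you edit the post.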


Copied content (plagiarism):

This is something Google handles pretty smartly, but again, the search engine is just an algo, and you can’t rely on the smartness of search engine bots all the time. Find all the content copied from your blog and report it to Google. In my case, I reported 2000+ such links, and it helped a lot.

Here are some articles from the past that will be helpful, and here is a video that explains how you can report such content to Google.

Alert! Thin content found, what next? Time for action

So, by following the above methods, you have identified all the thin and low-quality articles on your blog, and now it’s time to take action and improve your blog. Here are some suggested methods that I used and that worked for me. Keep in mind, though, that the best judge of your content is you, and it’s you who has to determine whether your content adds value or not. Go ahead and read How to judge high-quality content; it will help you in the whole process.

Remove content:

This is a quick fix for your traffic drop, and I have seen it work like a charm for me and for my clients. Noindex all the content that is no longer useful or meaningful. Maybe years back those discount coupons or Facebook spam warning messages were useful, but now they are just junk for SEO. Nobody is searching for them, and they are just increasing the number of indexed pages.

I usually take two actions for such content:

  • Delete and noindex: When content is poorly written and doesn’t add any value. I love to keep my database as small as possible and only keep valuable content.
  • Noindex only: At times, I keep the content on my blog but noindex the article. There are many posts that were written in depth about services that no longer work, for example Google Wave. Since such content is your masterpiece, you should keep it on your blog but noindex it.

Update content:

There are many posts that are useful but poorly written and have scope for improvement. Work on such posts, adding more content and more value to them. I follow 2-3 methods to make sure the updated content works for me:

  • Check the current keywords that are driving traffic (use Google Analytics or SEMRUSH)
  • Perform keyword research (List of Keyword research tools)
  • Use EasyWPSEO and start optimizing content with your target keyword, use LSI words, and make sure your readability score is good.

If you are one of those fanatics who believe in on-page SEO, you are doing it right, but make sure you don’t overwhelm Google with keyword stuffing and over-optimization. Try to add slides, videos, and images to a post to make it more engaging. Once you are done updating old content, share the article on your social networking and bookmarking sites so that Google recrawls your content faster.

Following the above steps will fix many of your SEO and low-quality site issues, but there is more you can do after that.

  • Make sure all your posts have a unique meta title and meta description
  • Check the HTML suggestions in Google Webmaster Tools for any warnings, including short/duplicate meta warnings
  • Don’t use too many H2/H3 tags in a post. Follow the heading hierarchy.
  • Make sure your post titles use H1 tags (one of the most common problems I have seen on old blogs)
  • You can use the SEOMOZ tool, which is really awesome for a complete SEO audit of your site. I used it when I was fixing my site, and it helped a lot in finding common SEO issues.
  • Once you have removed all such useless content from your blog, make sure to regenerate your sitemap and resubmit it to Google.
  • Work on improving social signals to your blog posts (very helpful).
  • Get quality links to internal posts and updated posts via a guest posting campaign.
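
For the first point in the list above, checking for duplicate meta titles is easy to script once you have a list of URLs and their titles (from a crawl or an export). The sketch below is only an illustration with a hypothetical function name. It groups pages by title, ignoring case and extra spaces, and returns the titles that are used more than once.

```python
from collections import defaultdict


def duplicate_titles(pages):
    """pages is an iterable of (url, title); return {title: [urls]} for repeated titles."""
    by_title = defaultdict(list)
    for url, title in pages:
        # Normalize case and whitespace so trivial differences don't hide duplicates.
        key = " ".join(title.lower().split())
        by_title[key].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}
```

The same function works for meta descriptions if you pass (url, description) pairs instead.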

Now, if you are a victim of a thin-content blog and are seeing traffic loss after any algo change, it’s THE TIME to take action and bring back your lost traffic. I hope this tutorial helps you, but if you still have any queries about identifying low-quality content, or want to add more value to this post, do let me know via the comments. Also, if you find this tutorial useful, don’t forget to share it on Google Plus and Facebook.


Article By
Harsh Agrawal is a blog scientist and a passionate blogger. He has been blogging since 2008 and writes about blogging, SEO, making money online, and tech. His blog, ShoutMeLoud, receives 1 million pageviews/month and has over 700K subscribers.


COMMENTs ( 17 )

  1. Sanjay Sajeev says

    Hi again harsh.
    According to my knowledge, when we setup google authorship and publisher for our blog, Google will automatically know that we are the owner of this blog’s post. So when someone else copy your content or part of your content, it will not affect your site’s performance.

  2. Taswir Haider says

    Great post Harsh! Some new technique dealing with thin content. Specially I like the idea to find out low traffic pages from Google Analytics.

  3. Manish says

    What about a Q&A platform , where each questions asked by someone is actually just 100 or 200 words , but thats what the actual length has to be , its not a post per se, but in wordpress terms its like a post , what do we do in this situation ?

  4. says

    Hi Harsh , you mention above that Make sure all your posts have Unique meta title and meta description , My blog have post title like ” gadgets name price and features ” , so all the post like this .Is this can produce any issue ?

  5. says

    Thanks for this really useful article, I’m a bit curious about no indexing the already indexed tag and category pages. Will they have an effect since Google has already indexed them ?

  6. says

    Just curious, how many articles did you removed? Currently you have 2000+ articles, how many u had before the ‘spring cleaning’?

  7. rakesh kumar says

    I got your point harsh, Just clear thin content from the list. One question which i want to ask you what about the low quality websites that keep on appearing on our website as a commentators’ signature link, and what to do with those commentators which add deep link in their signatures.

    • says

      Any link going out or on your site..You are responsible..
      Specially after Penguin update..I highly recommend to get rid of such links…or make sure your commentator comment with real name and not something like “Best Pet shop” “Buy Inkjet printer” ….

  8. geoio says

    Right, we should go back, rewrite and complement our posts, in any case it will be useful for the blog. About low quality content – I think they put filters based on certain words, typical for spammers, like “best”, “amazing”, “free”, “money”…
    Duplicate content can kill your website!

  9. Francesco says

    Hi Harsh, thanks for sharing this info! I have a question for you. I’m using blogger and I find out google indexes every archive page of my site… so if I perform site:mysiteurl in google, I find Jen2011 archive page, Feb2011 archive page etc… I have about 40 posts on my site, but google find 127 results, because it indexed every archive page.

    Do you thinking it penalize my blog? Have I to find a way to remove archive page to be indexed by google algo?

    this is the google query for……0.0…1c.2pHE3QLDbks&pbx=1&bav=on.2,or.r_gc.r_pw.r_cp.r_qf.,cf.osb&fp=e5f78c1fd15b6751&biw=1251&bih=662


    • says

      I will put that question back to you:
      Do you believe your archive pages are adding any value in search???
      In simple words: it’s good for your blog structure but no use for search engines… In fact, you will notice all archive pages have similar meta titles, which creates the issue of duplication…!!
      You can use Blogger’s new robots.txt file to get rid of such links…!! If you can’t figure it out, let me know and I can write a quick tutorial or make a quick video for the same.. :)

  10. Gautam Doddamani says

    very useful tut although my site didnt get hit by the recent algo update as it is relatively new, i will follow these rules and lookout if i have any low quality content and remove them!

  11. Sreejesh says

    Harsh, we can also find posts with a low word count using the wordstats plugin. I believe I’ve sent you the link already some days ago.