Duplicate content: the scourge of SEO?

March 28, 2006 – 12:49 pm

Everywhere I go these days I hear “duplicate content” this and “duplicate content” that. It begins to make me wonder: why is duplicate content such a big deal? Couple this with the “URL only” problem people have in Google and you hear tons of complaints. But is the concern about duplicate content really as important as people think?

Let’s take a look at the two different types of duplicate content:

  1. Firstly, we have external duplicate content. An example of this is a free-to-reprint article that you distribute around the internet. You have one copy on your website and there are 100 other exact copies scattered around.
  2. Secondly, we have the internal duplicate content issue. This is when you have identical pages on your domain. You have one page that is pasted in several areas.
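
To make the internal case concrete, here is a minimal sketch of how one might spot exact copies within a single site. It assumes you already have the text of each page in hand; the URLs, page text, and the `find_exact_duplicates` helper are all hypothetical, and this says nothing about how any search engine actually detects duplicates.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(pages):
    """Group URLs whose text is identical after whitespace and case normalization."""
    groups = defaultdict(list)
    for url, text in pages.items():
        # Collapse whitespace and lowercase so trivial formatting
        # differences do not hide an otherwise identical copy.
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Keep only the groups that contain more than one URL.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical pages on one domain (illustrative data only).
pages = {
    "/articles/widgets": "Widgets are great.\nBuy one today.",
    "/archive/2006/widgets": "Widgets are great.  Buy one today.",
    "/about": "About this site.",
}
duplicates = find_exact_duplicates(pages)
print(duplicates)
```

Hashing only catches exact copies; pages that differ by a sentence or a sidebar would need some form of near-duplicate comparison instead.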

The second type, internal duplicate content, appears to be a much bigger deal than external duplicate content. It’s easy to discount external duplicate content: do a quick search in any search engine and you can quickly find examples of article directories that hold rankings with exact copies of web pages found elsewhere on the internet. Evidence concerning the effect of internal duplicate content is a bit harder to pin down.

What does Google say about their handling of duplicate content?

Let’s look at the Google Webmaster Guidelines:

  • Don’t create multiple pages, subdomains, or domains with substantially duplicate content.

They tell us that duplicate content is indeed a bad thing, which we had already figured. Let’s go a step further and consult the Google Webmaster FAQ for further elucidation on the topic:

It’s also important to keep in mind that our crawlers don’t index duplicate content, so creating identical sites at several domains will likely not result in their returning for many country restricts. If you do create duplicate domains, we suggest using a robots.txt file to block our crawler from accessing all but your preferred one.

Here the language is even clearer: the crawlers don’t even index the content! There is no indication of a penalty here; rather, their claim is that a crawler will recognize and ignore any duplicate content. I think it’s safe to say that this does not happen in practice, at least not as explained here, given how easily you can find article directories ranking with exact copies of pages found elsewhere.
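
As for the robots.txt suggestion in the FAQ passage, a minimal sketch of what that might look like follows. The robots.txt body is what you would serve on each non-preferred duplicate domain (the preferred one would simply not carry the blanket `Disallow`); the domain name and the `is_blocked` helper are hypothetical, and the check uses Python's standard `urllib.robotparser` rather than anything Google-specific.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a non-preferred duplicate domain:
# it asks every crawler to stay away from the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt, user_agent, url):
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# duplicate.example is a placeholder domain, not a real site.
blocked = is_blocked(BLOCK_ALL, "Googlebot", "http://duplicate.example/page.html")
print(blocked)
```

Keep in mind that robots.txt is a request that well-behaved crawlers honor, not an enforcement mechanism.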

In summary, according to official statements by Google, you should stay away from creating duplicate content on any one domain. Google doesn’t say anything about redeploying existing content with added value. Perhaps there is such a thing as “quality duplicate content” that Googlebot does crawl and index. Time will shed more light on this fascinating and never-ending saga.