On: 2007 / 06 / 18 Viewed: 109287 times
Shorter URL for this post: http://ozh.in/ey

I've recently seen a lot of WordPress plugins aimed at taking care of the duplicate content issue in search engines. Don't get me wrong, those plugins do well what they were created for: adding a meta tag to some pages so that Googlebot and its friends don't index them. The problem is that doing such a thing is, in my very humble opinion, a bad idea.

Duplicate Content ? Duplicate Content ?

First of all, if you don't know what's wrong with duplicate content, read this nicely illustrated article from SEOmoz. In short, when a search engine bot sees the same content on different pages, let alone different websites, it doesn't like it: it tries to identify the original source and penalizes the others. Duplicate content is a real cross-site issue, often synonymous with spam, splogs and content theft.

Inside your own blog, duplicate content might be a problem too. You have original content on your post page, the very same content in the date-based archives for the same day, month and year, and the same content again in each category archive page under which you filed your post. The same content four, five, six times? This has to be a bit confusing for our friend Googlebot, and it might penalize some or all of these pages.

Note the emphasis on might. The real problem here is that you want to control which page is the most important, and not let search engines decide for you.

What to do, then ?

At this point, you have 3 possibilities :

  1. Do nothing
    Let search engines index the same content 6 times and decide which copy is best. That could work. Or it could painfully decrease your visibility in search engine result pages, since you can't decide which page should be prominent over the others. Not an option for those who want to fine-tune things.

  2. Use a "don't index this" plugin
    Basically, those plugins simply add something like <meta name="robots" content="noindex,follow" /> to some pages, so that search engine bots follow the links but don't remember what they've seen on that page.
    As I said in the intro, this seems a dumb idea to me. Why? Mainly for two reasons. First, it's much better for your site to have 6 pages indexed instead of 1, or 6,000 instead of 1,000. Second, it's up to you not to serve duplicate content in the first place.

  3. The right way : just give different content
    Reason #771 why WordPress is great: you can customize just about everything to suit your needs. For instance, you can have a smart theme with smart archive pages. Instead of displaying each post in its entirety, display an excerpt of it, and get your page indexed.
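
To make option 2 concrete, here is a minimal sketch of what such a "don't index this" plugin boils down to. It is an illustration only, not the code of any particular plugin: the names `robots_meta_tag` and `print_noindex_on_archives` are made up for this example, while `is_archive()`, `add_action()` and the `wp_head` hook are regular WordPress functions. The hook registration is guarded so the snippet also loads outside WordPress.

```php
<?php
// Hypothetical helper: returns the tag a "don't index this" plugin
// would print on an archive page, or an empty string anywhere else.
function robots_meta_tag( $is_archive ) {
    return $is_archive
        ? '<meta name="robots" content="noindex,follow" />' . "\n"
        : '';
}

// Hypothetical callback: prints the tag inside <head>.
// is_archive() is true on date, category, author and tag archives.
function print_noindex_on_archives() {
    echo robots_meta_tag( is_archive() );
}

// Only register the hook when running inside WordPress.
if ( function_exists( 'add_action' ) ) {
    add_action( 'wp_head', 'print_noindex_on_archives' );
}
```

That is all it takes to hide every archive page from search engines, which is exactly why I'd rather fix the archive pages themselves.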

The Right Way™ to do it

"The right way" is purposely emphasized: as usual and as with most things, There's More Than One Way To Do It. But at least that's my way to do it, and judging by my referrer visits, it's not working too badly.

Keeping your site safe from the duplicate content issue, and more generally getting things optimized for search engines, should not be a plugin's job. It must be your theme's job: the theme must be coded and designed for it, just as it must look good, use good markup, be cross-browser friendly, etc.

WordPress themes have a template page specifically made for those pesky archive pages, be they year, month, day, category, author, whatever archives. You need at least one file, located in wp-content/themes/yourtheme/, named archive.php.

If your theme does not have an archive.php, it is incomplete. Create one, for example by duplicating and editing the file you'll find in the default theme directory. If your theme has an archive.php that displays whole posts, hack it. An example of a (rather minimalist) template to display post excerpts would be:

PHP:
  <?php get_header(); ?>

  <div id="content">

  <?php if ( have_posts() ) : ?>
  <?php while ( have_posts() ) : the_post(); ?>
      <div class="post">
      <h2><?php the_title(); ?></h2>

      <div class="storycontent">
      <p><?php
      // Show the excerpt instead of the whole post
      $short = get_the_excerpt();
      // Excerpts without the trailing marker get one appended
      if ( strpos( $short, '[...]' ) === false ) $short .= '[...]';
      echo $short;
      ?>
      &rarr; <strong><a href="<?php the_permalink(); ?>">Read more</a></strong></p>
      </div>

      </div> <!-- post -->

  <?php endwhile; ?>
  <?php endif; ?>

  </div> <!-- content -->

  <?php get_sidebar(); ?>

  <?php get_footer(); ?>
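
A note on the excerpt itself: `get_the_excerpt()` returns the hand-written excerpt when the post has one, and otherwise builds one from the beginning of the post. The `strpos` check in the template above only looks for `[...]`. As far as I know, newer WordPress versions end automatic excerpts with `[&hellip;]` instead, in which case that check would append a second marker. A slightly more tolerant variant could look like the sketch below. The function name `excerpt_with_marker` and the list of markers are assumptions made for this example, not part of WordPress, and unlike the template it puts a space before the marker.

```php
<?php
// Hypothetical helper: make sure an excerpt ends with a "continued" marker.
// Assumption: automatic excerpts already contain '[...]' or '[&hellip;]',
// while hand-written excerpts contain neither.
function excerpt_with_marker( $excerpt, $marker = '[...]' ) {
    $known_markers = array( '[...]', '[&hellip;]' );
    foreach ( $known_markers as $known ) {
        if ( strpos( $excerpt, $known ) !== false ) {
            return $excerpt; // already marked, leave it alone
        }
    }
    // No marker found: append one after trimming trailing whitespace.
    return rtrim( $excerpt ) . ' ' . $marker;
}

// Outside WordPress, with made-up sample strings:
echo excerpt_with_marker( 'A hand-written summary.' ), "\n";
echo excerpt_with_marker( 'An automatic excerpt [&hellip;]' ), "\n";
```

In the template, the three lines between `<p><?php` and `?>` would then become `echo excerpt_with_marker( get_the_excerpt() );`.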

Does that really matter ?

My theme serves archive pages using the example above. While most of my incoming search engine visitors land directly on post pages, I do have visitors landing on category archives. Oh, not many, about 2 or 3% of them. Just checking the past hour's referrers, I've found people coming from Google and looking for:
lolcat + javascript (go figure)
french lolcats (ho hai i'm fwench!)
php bitwise gd function

So they came here, and they probably didn't find what they were looking for: I've never written anything about javascripted lolcats. But I've posted about lolcats, and about javascript, for sure. I've never written anything about bitwise operations in gd, but I've posted about gd, and about bitwise operators. Yet Google showed them results that made these people think they would find what they were looking for on my site, and they came. That would never have happened if I had put this noindex meta tag on my archive pages.

So, why would I want to cut my visitor count by 2 or 3 percent with a noindex directive for bots? Whether that makes 3 visitors a day or 3 visitors an hour, that's still 3 potential readers, 3 potential ad clickers, 3 potential bloggers who will like my site and link to it. It would be silly to tell Google not to send those fine people to my site.

Summary

Do stay away from the duplicate content problem. But it's definitely a theme issue, not a plugin's business.


Metastuff

This entry "WordPress, Duplicate Content, and Wrong SEO Plugins" was posted on 18/06/2007 at 10:37 pm.

25 Blablas


  1. 25
    Adam Arnold Australia »
    thought, on 24/Jan/10 at 6:46 am # :

    Will your archives template above show excerpts when none are provided in the post/page itself? Or does this simply create its own excerpt from the first N characters/words?

  2. 24
    Fred United States »
    wrote, on 27/Dec/09 at 9:10 pm # :

    I was thinking there must be some solution to the WordPress duplicate content problem other than using noindex. Thank you very much for the info, only problem is I'm just starting and hacking pages is still a lot of guesswork for me. Your example will help a lot though. Thanks!

  3. 23
    Rhys Australia »
    replied, on 16/Oct/09 at 7:05 am # :

    Hi -
    I can't get excited about duplicate content for my wordpress blogs - I see Google shows several duplicate contents in the same page of serps: E.G., under the site URL, with the URL with the postname, the same post under Category and again under Tag.

  4. 22
    Comparison of WordPress SEO Plugins &laq... United States »
    pingback on 24/Apr/09 at 8:13 am # :

    [...] in the “I hate SEO plugins” category (why), and tagged as “but some are better than others”, there’s an article on Urban [...]

  5. 21
    cuocthiseo Viet Nam »
    wrote, on 26/Dec/08 at 9:07 am # :

    Great tips, I'll follow some steps now,
    Thanks for share.

