{"id":625,"date":"2007-06-18T22:37:43","date_gmt":"2007-06-18T20:37:43","guid":{"rendered":"http:\/\/planetozh.com\/blog\/2007\/06\/wordpress-duplicate-content-and-wrong-seo-plugins\/"},"modified":"2007-06-27T09:02:32","modified_gmt":"2007-06-27T07:02:32","slug":"wordpress-duplicate-content-and-wrong-seo-plugins","status":"publish","type":"post","link":"https:\/\/planetozh.com\/blog\/2007\/06\/wordpress-duplicate-content-and-wrong-seo-plugins\/","title":{"rendered":"WordPress, Duplicate Content, and Wrong SEO Plugins"},"content":{"rendered":"<p>I&#39;ve recently seen a <a href=\"http:\/\/www.seologs.com\/wordpress-duplicate-content-cure\/\">lot<\/a> <a href=\"http:\/\/www.filination.com\/tech\/2007\/05\/27\/wordpress-seo-using-excerpt-robotstxt-and-noindex-meta-tag-for-duplicate-content-in-index-archives-and-categories\/\">of<\/a> <a href=\"http:\/\/www.utheguru.com\/seo_wordpress-wordpress-seo-plugin\">plugins<\/a> for WordPress aimed at taking care of the duplicate content issue in search engines. Don&#39;t get me wrong, those plugins do well what they were created for: adding a meta tag to some pages so that Googlebot and its friends don&#39;t index them. The problem is that doing such a thing is, in my very humble opinion, a <strong>bad idea<\/strong>.<br \/>\n<!--more--><\/p>\n<h2>Duplicate Content ? Duplicate Content ?<\/h2>\n<p>First of all, if you don&#39;t know what&#39;s wrong with duplicate content, read this <a href=\"http:\/\/www.seomoz.org\/blog\/the-illustrated-guide-to-duplicate-content-in-the-search-engines\">nicely illustrated article<\/a> from SEOmoz. In short, when a search engine bot sees the same content on different pages, let alone different websites, it doesn&#39;t like it: it tries to identify the original source and puts a penalty on the others. Duplicate content is a real cross-site issue, often synonymous with spam, splogs and content theft.<\/p>\n<p>Inside your own blog, duplicate content <em>might<\/em> be a problem. 
You have original content on your post page, you have the very same content in the date-based archives for the same day, month and year, and again the same in each category archive page under which you filed your post. Four, five, six times the same content? This has to be a bit confusing for our friend Googlebot, and it <em>might<\/em> put a penalty on any or all of these pages.<\/p>\n<p>Note the emphasis on <em>might<\/em>. The real problem here is that you want to have control over which page is the most important, and not let search engines decide for you.<\/p>\n<h2>What to do, then ?<\/h2>\n<p>At this point, you have 3 possibilities :<\/p>\n<ol>\n<li><b>Do nothing<\/b><br \/>\nLet search engines index the same content 6 times and decide which copy is best. That <em>could<\/em> work. Or that <em>could<\/em> painfully decrease your visibility in search engine pages, since you can&#39;t decide which page should be prominent over the others. Not an option for those who want to fine-tune things.<\/p>\n<\/li>\n<li><b>Use a &quot;don&#39;t index this&quot; plugin<\/b><br \/>\nBasically, those plugins will simply add something like <tt>&lt;meta name=\"robots\" content=\"noindex,follow\" \/><\/tt> to some pages, so that search engine bots will follow links but won&#39;t remember what they&#39;ve seen on that page.<br \/>\nAs I&#39;ve said in the intro, this seems like a dumb idea to me. Why? Mainly for two reasons. First of all, it&#39;s much better for your site to have 6 pages indexed instead of 1. Or 6,000 instead of one thousand. And second, simply because <b>it&#39;s up to you<\/b> not to serve duplicate content.<\/p>\n<\/li>\n<li><b>The right way : just give <em>different<\/em> content<\/b><br \/>\nReason #771 why WordPress is great is that you can customize just about everything to suit your needs. Like, <em>having a smart theme with smart archive pages<\/em>. 
Instead of displaying each post in its entirety, display an excerpt of it, <b>and<\/b> get your page indexed.<\/li>\n<\/ol>\n<h2>The Right Way&trade; to do it<\/h2>\n<p>&quot;The right way&quot; is purposely emphasized: as usual and as with most things, <a href=\"http:\/\/en.wikipedia.org\/wiki\/There_is_more_than_one_way_to_do_it\">There&#39;s More Than One Way To Do It<\/a>. But at least that&#39;s my way to do it, and judging by my referrer visits, it&#39;s not working too badly.<\/p>\n<p>Keeping your site safe from the duplicate content issue, and more generally getting things optimized for search engines, should not be a plugin&#39;s job: it <b>must be<\/b> your theme&#39;s job, and the theme must be coded and designed for it. Just like looking good, using good markup, being cross-browser friendly, etc.<\/p>\n<p>WordPress themes have a template page specifically made for those pesky archive pages, be they year, month, day, category, author, whatever archives. You need at least one file, located in <em>wp-content\/themes\/yourtheme\/<\/em>, named <strong>archive.php<\/strong>.<\/p>\n<p>If your theme does not have an archive.php, it <del>sucks<\/del> is incomplete. Create one, for example by duplicating and editing the file you&#39;ll find in the default theme directory. If your theme has an archive.php and displays whole posts, hack it. 
An example (and rather minimalist) template to display post excerpts would be:<\/p>\n<div id=\"ig-sh-1\" class=\"syntax_hilite\">\n\n\t\t<div class=\"toolbar\">\n\n\t\t<div class=\"view-different-container\">\n\t\t\t\t\t\t<a href=\"#\" class=\"view-different\">&lt; View <span>plain text<\/span> &gt;<\/a>\n\t\t\t\t\t<\/div>\n\n\t\t<div class=\"language-name\">php<\/div>\n\n\t\t\n\t\t<br clear=\"both\">\n\n\t<\/div>\n\t\n\t<div class=\"code\">\n\t\t<ol class=\"php\" style=\"font-family:monospace\"><li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\"><span style=\"color: #000000;font-weight: bold\">&lt;?php<\/span> get_header<span style=\"color: #009900\">&#040;<\/span><span style=\"color: #009900\">&#041;<\/span><span style=\"color: #339933\">;<\/span> <span style=\"color: #000000;font-weight: bold\">?&gt;<\/span>&nbsp; <\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\"><span style=\"color: #339933\">&lt;<\/span>div id<span style=\"color: #339933\">=<\/span><span style=\"color: #0000ff\">&quot;content&quot;<\/span><span style=\"color: #339933\">&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\"><span style=\"color: #000000;font-weight: bold\">&lt;?php<\/span> <span style=\"color: #b1b100\">if<\/span> <span style=\"color: 
#009900\">&#040;<\/span>have_posts<span style=\"color: #009900\">&#040;<\/span><span style=\"color: #009900\">&#041;<\/span><span style=\"color: #009900\">&#041;<\/span> <span style=\"color: #339933\">:<\/span> <span style=\"color: #000000;font-weight: bold\">?&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\"><span style=\"color: #000000;font-weight: bold\">&lt;?php<\/span> <span style=\"color: #b1b100\">while<\/span> <span style=\"color: #009900\">&#040;<\/span> have_posts<span style=\"color: #009900\">&#040;<\/span><span style=\"color: #009900\">&#041;<\/span> <span style=\"color: #009900\">&#041;<\/span> <span style=\"color: #339933\">:<\/span> the_post<span style=\"color: #009900\">&#040;<\/span><span style=\"color: #009900\">&#041;<\/span> <span style=\"color: #000000;font-weight: bold\">?&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <span style=\"color: #339933\">&lt;<\/span>div <span style=\"color: #000000;font-weight: bold\">class<\/span><span style=\"color: #339933\">=<\/span><span style=\"color: #0000ff\">&quot;post&quot;<\/span><span style=\"color: #339933\">&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <span style=\"color: #339933\">&lt;<\/span>h2<span style=\"color: #339933\">&gt;<\/span><span style=\"color: #000000;font-weight: bold\">&lt;?php<\/span> the_title<span style=\"color: #009900\">&#040;<\/span><span style=\"color: #009900\">&#041;<\/span><span style=\"color: #339933\">;<\/span> <span style=\"color: #000000;font-weight: bold\">?&gt;<\/span><span style=\"color: #339933\">&lt;\/<\/span>h2<span 
style=\"color: #339933\">&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <span style=\"color: #339933\">&lt;<\/span>div <span style=\"color: #000000;font-weight: bold\">class<\/span><span style=\"color: #339933\">=<\/span><span style=\"color: #0000ff\">&quot;storycontent&quot;<\/span><span style=\"color: #339933\">&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <span style=\"color: #339933\">&lt;<\/span>p<span style=\"color: #339933\">&gt;<\/span><span style=\"color: #000000;font-weight: bold\">&lt;?php<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <span style=\"color: #000088\">$short<\/span> <span style=\"color: #339933\">=<\/span> get_the_excerpt<span style=\"color: #009900\">&#040;<\/span><span style=\"color: #009900\">&#041;<\/span><span style=\"color: #339933\">;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <span style=\"color: #b1b100\">if<\/span> <span style=\"color: #009900\">&#040;<\/span><span style=\"color: #990000\">strpos<\/span><span style=\"color: #009900\">&#040;<\/span><span style=\"color: #000088\">$short<\/span><span style=\"color: #339933\">,<\/span><span style=\"color: #0000ff\">'&amp;#91;...&amp;#93;'<\/span><span style=\"color: #009900\">&#041;<\/span> 
<span style=\"color: #339933\">===<\/span> <span style=\"color: #009900;font-weight: bold\">false<\/span><span style=\"color: #009900\">&#041;<\/span> <span style=\"color: #000088\">$short<\/span><span style=\"color: #339933\">.=<\/span><span style=\"color: #0000ff\">'&amp;#91;...&amp;#93;'<\/span><span style=\"color: #339933\">;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <span style=\"color: #b1b100\">echo<\/span> <span style=\"color: #000088\">$short<\/span><span style=\"color: #339933\">;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <span style=\"color: #000000;font-weight: bold\">?&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <span style=\"color: #339933\">&amp;<\/span>rarr<span style=\"color: #339933\">;<\/span> <span style=\"color: #339933\">&lt;<\/span>strong<span style=\"color: #339933\">&gt;&lt;<\/span>a href<span style=\"color: #339933\">=<\/span><span style=\"color: #0000ff\">&quot;&lt;?php the_permalink() ?&gt;&quot;<\/span><span style=\"color: #339933\">&gt;<\/span>Read more<span style=\"color: #339933\">&lt;\/<\/span>a<span style=\"color: #339933\">&gt;&lt;\/<\/span>strong<span style=\"color: #339933\">&gt;&lt;\/<\/span>p<span style=\"color: #339933\">&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <span style=\"color: #339933\">&lt;\/<\/span>div<span style=\"color: #339933\">&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: 
normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp; &nbsp; <span style=\"color: #339933\">&lt;\/<\/span>div<span style=\"color: #339933\">&gt;<\/span> <span style=\"color: #339933\">&lt;!--<\/span> post <span style=\"color: #339933\">--&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\"><span style=\"color: #000000;font-weight: bold\">&lt;?php<\/span> <span style=\"color: #b1b100\">endwhile<\/span><span style=\"color: #339933\">;<\/span> <span style=\"color: #b1b100\">endif<\/span><span style=\"color: #339933\">;<\/span> <span style=\"color: #000000;font-weight: bold\">?&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\"><span style=\"color: #339933\">&lt;\/<\/span>div<span style=\"color: #339933\">&gt;<\/span> <span style=\"color: #339933\">&lt;!--<\/span> content <span style=\"color: #339933\">--&gt;<\/span><\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\"><span style=\"color: 
#000000;font-weight: bold\">&lt;?php<\/span> get_sidebar<span style=\"color: #009900\">&#040;<\/span><span style=\"color: #009900\">&#041;<\/span><span style=\"color: #339933\">;<\/span> <span style=\"color: #000000;font-weight: bold\">?&gt;<\/span> <\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\">&nbsp;<\/div><\/li>\n<li style=\"font-weight: normal;vertical-align:top\"><div style=\"font: normal normal 1em\/1.2em monospace;margin:0;padding:0;background:none;vertical-align:top\"><span style=\"color: #000000;font-weight: bold\">&lt;?php<\/span> get_footer<span style=\"color: #009900\">&#040;<\/span><span style=\"color: #009900\">&#041;<\/span><span style=\"color: #339933\">;<\/span> <span style=\"color: #000000;font-weight: bold\">?&gt;<\/span><\/div><\/li>\n<\/ol>\t<\/div>\n\n<\/div>\n\n<h2>Does that really matter ?<\/h2>\n<p>My theme is serving archive pages using the example above. While most of my incoming search engine visitors land on post pages directly, I do have visitors landing on category archives. Oh, not much, about 2 or 3% of them. Just checking the past hour referrers, I&#39;ve found people coming from Google and looking for :<br \/>\n<a href=\"http:\/\/www.google.ca\/search?q=lolcats+%2B+javascript\">lolcat + javascript<\/a> (go figure)<br \/>\n<a href=\"http:\/\/www.google.com\/search?q=french+lolcats\">french lolcats<\/a> (ho hai i&#39;m fwench!)<br \/>\n<a href=\"http:\/\/www.google.com\/search?q=php+bitwise+gd+function\">php bitwise gd function<\/a><\/p>\n<p>So they came here, and actually they probably didn&#39;t find what they were looking for : I&#39;ve never written anything about javascripted lolcats. But I&#39;ve posted about lolcats, and about javascript, for sure. I&#39;ve never written anything about bitwise operations in gd, but I&#39;ve posted about gd, and about bitwise operators. 
Yet, Google showed them results that made these people think they would find what they were looking for on my site, and they came. And that would never have happened if I had built my site with this <em>noindex<\/em> meta tag on archive pages.<\/p>\n<p>So, why would I want to cut my visitor count by 2 or 3 percent with a <em>noindex<\/em> directive for bots ? Whether that makes 3 visitors a day or 3 visitors an hour, that&#39;s still 3 potential readers, 3 potential ad clickers, 3 potential bloggers who will like and link to my site. It&#39;d be silly to tell Google not to send those fine people to my site.<\/p>\n<h2>Summary<\/h2>\n<p>Do stay away from the duplicate content problem. But it&#39;s definitely a <strong>theme<\/strong> issue, not a plugin&#39;s business.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#39;ve recently seen a lot of plugins for WordPress aimed at taking care of the duplicate content issue in search engines. Don&#39;t get me wrong, those plugins do well what they were created for: adding a meta tag to some pages so that Googlebot and its friends don&#39;t index them. 
The problem is [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[42,216,85,119,245,108],"class_list":["post-625","post","type-post","status-publish","format-standard","hentry","tag-google","tag-noindex","tag-plugins","tag-seo","tag-wordpress","tag-wordpress-theme"],"_links":{"self":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/posts\/625","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/comments?post=625"}],"version-history":[{"count":0,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/posts\/625\/revisions"}],"wp:attachment":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/media?parent=625"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/categories?post=625"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/tags?post=625"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}