The three major search engines (Google, Yahoo and Microsoft) joined forces in February 2009 and introduced the canonical tag. It was meant to be a revolutionary way to resolve the duplicate content issues that so many SEOs and web designers worried about. At the time, the myth of a duplicate content penalty was still alive, although no such thing exists.
According to Google, specifying a canonical URL is “a hint” rather than a directive. The term sounds innocent enough, and one wouldn’t expect the distress that many people have suffered from the catastrophic consequences of implementing the canonical link incorrectly!
What if the rel="canonical" URL returns a 404?
Google’s answer: “We’ll continue to index your content and use a heuristic to find a canonical, but we recommend that you specify existent URLs as canonicals.”
Many users’ experience, though, has been completely different. I personally witnessed the severe negative impact of wrongly implemented canonical tags on a big e-commerce site. Their developers were keen to resolve duplicate content issues within the site in order to improve the rankings of some deeper pages associated with secondary terms. However, they got it wrong, specifying as the canonical a non-existent URL that returned a 404 response. Within just one day they lost all their rankings, even for terms they had ranked first on for years! Getting those pages ranking again took about three months.
Because of my bad experience I decided to dig a bit deeper and see what other SEOs’ and webmasters’ experiences were, as I wasn’t convinced that I was a one-in-a-million case.
Another webmaster’s experience is very interesting:
“We implemented canonicals on product pages across the site to pick up some duplicate content issues (pagination, sort by parameters, etc). But we failed to realise that we’d put the wrong URL format within the canonical tags so that they all pointed to non-existent URLs!
Within a couple of weeks all the product URLs dropped out of Google and interestingly Webmaster Tools reported hundreds of 404 errors, treating the incorrect canonicals as broken links. We fixed the tags quickstyle and rankings came back, together with the additional boost we were originally expecting!”
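Both stories above come down to canonical tags pointing at URLs that did not exist. A check along the following lines, run before a release, could catch that. This is only a minimal sketch: the function names and the shop.example URLs are made up for illustration, and the status lookup is passed in as a function (get_status) so that you can plug in whatever HTTP client you already use.

```python
from html.parser import HTMLParser
from typing import Callable, Optional
from urllib.parse import urljoin


class CanonicalParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical(html: str, page_url: str) -> Optional[str]:
    """Return the absolute canonical URL declared in html, or None."""
    parser = CanonicalParser()
    parser.feed(html)
    if parser.canonical is None:
        return None
    # Resolve relative hrefs against the URL of the page itself
    return urljoin(page_url, parser.canonical)


def canonical_problem(
    html: str, page_url: str, get_status: Callable[[str], int]
) -> Optional[str]:
    """Return a warning if the canonical target does not answer 200.

    get_status is a hypothetical callable that maps a URL to its
    HTTP status code.
    """
    target = find_canonical(html, page_url)
    if target is None:
        return None
    status = get_status(target)
    if status != 200:
        return f"{page_url}: canonical {target} returned {status}"
    return None
```

Running this over every page template, not just one sample page, is the point: in both stories the mistake was in the URL format, so it affected every page at once.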
So, the question is how a “hint” can have such catastrophic consequences. Maybe it’s time for Google to update its outdated guidelines, which strongly encourage people to have a go with canonical tags, or at least to caution people to be 100% confident about what they’re doing.
Google Warnings
Matt Cutts sent out an early warning through his blog, saying “If you’re a power user, exhaust alternatives first”. Why should power users, and not amateurs too, exhaust alternatives first? He then goes on to say “If we see abuse, we reserve the right to react as needed”. This one definitely should have been expanded and stressed a bit more. Any abuse examples, Mr Cutts, so we know exactly what Googlebot perceives as abuse?
Another comment made by a Google engineer in a forum is closer to what many users have experienced:
“When we see a canonical link element like that and follow it (which we mostly do), we’ll treat it similarly to a redirect. So if you play around with the rel=canonical, you have to be very careful because you won’t see the “redirect” that Googlebot will use for indexing.”
What if the rel="canonical" links to the home-page?
This is what a user has experienced:
“I’ve heard two very similar horror stories just in the last month. In both cases, it involved a CMS snafu where someone unintentionally canonicalized 1000s of URLs to the home-page. The effects on their index were quick and disastrous.
Fortunately, in at least one case, the recovery was pretty quick, post-disaster, but it’s yet another example of how any tool can be dangerous in the wrong hands. Major architectural changes shouldn’t be made because someone read one SEO blog post.”
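A CMS mistake like the one described in that quote is easy to spot if you have a list of pages and the canonical each one declares. The sketch below assumes you have already collected that mapping (for example from a crawl); the function names and the shop.example URLs are hypothetical.

```python
from typing import Dict, List
from urllib.parse import urlsplit


def is_home_page(url: str) -> bool:
    """True when the URL has no path (or just "/") and no query string."""
    parts = urlsplit(url)
    return parts.path in ("", "/") and not parts.query


def pages_canonicalized_to_home(canonicals: Dict[str, str]) -> List[str]:
    """List pages, other than the home page, whose canonical is the home page.

    canonicals maps each page URL to the canonical URL it declares.
    """
    return sorted(
        page
        for page, target in canonicals.items()
        if is_home_page(target) and not is_home_page(page)
    )
```

A home page pointing to itself is left out of the result, so anything the function returns is a deeper page that is telling search engines to treat the home page as its canonical version.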
What if a canonical link points to itself?
This is a controversial call, as Google has pointed out that there’s nothing to worry about. Nonetheless, many SEOs and web developers are wary of the practice breaking things in the future or in unpredictable ways.
Where to use the canonical tag
- Affiliate links: where adding the affiliate ID in the URL is obligatory.
- Pages which aren’t exact duplicates in content but are very similar.
- A product page that is available under different categories and thus appears under different URLs. Ideally, those URLs should be 301 redirected, but on many sites that is not always possible due to site architecture constraints.
- Printer-friendly pages that usually are exact duplicates of other pages on the site.
- In any situation where you cannot do a proper 301 redirect but need to indicate which page should be picked up by the search engines.
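For the parameter-based cases in the list above (affiliate IDs, sort orders, printer-friendly flags), the canonical URL is usually just the page URL with those parameters removed. The following is a rough sketch of that idea, not a definitive implementation: the parameter names aff_id, sort and print are assumptions, so substitute whatever your own site uses. It does not cover a product that lives under several category paths, which needs a site-specific mapping.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that never change the page content
IGNORED_PARAMS = frozenset({"aff_id", "sort", "print"})


def canonical_url(url: str, ignored=IGNORED_PARAMS) -> str:
    """Return url without ignored query parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to place in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

Parameters that do change the content, such as a search query, are kept, so a search results page is not collapsed into a bare URL by accident.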
What about plugins that automate the canonical tags?
Magento, WordPress, Joomla and other popular open source packages come with add-ons and plugins that are supposed to automate canonical links. Be very careful with these, as some users have reported problems. This is what one user witnessed:
“We learned the negative effect of the canonical tag in the worst way. We installed an add-on for magento which seemed like a very good idea. It has some benefits to it that make the product worth the price. However, one of the features is the canonical tag. All of the sites were in development and beginning to get indexed. During the last index, 70% of the pages fell from the index causing complete havoc and panic. I believe that I have traced it back to the canonical URL being automatically inserted on product pages. If in fact the canonical tag would prevent Google from indexing a page, having it inserted on a site with thousands of products and pagination is a bad idea.”
Conclusion
Amateurs shouldn’t experiment with the canonical tag; they should seek professional help. Because the effect of badly implemented canonical links is not instantly obvious, it is easy to assume that the job has been done correctly. However, a few days or weeks later pages may disappear from Google’s index, and extra work will be required to get them ranking again.