Google, Yahoo and MSN Agree on the Canonical Link Tag
The latest news from the three major search engines is a significant improvement to how websites are indexed. The idea of the canonical link tag is that a website owner can specify a preferred version of a particular URL.
What does that mean?
If your site has identical or similar content accessible through several different URLs, the canonical link tag helps search engines determine the preferred URL.
How does it operate?
The tag goes in the HTML head of a web page, the same section where you’d find the title tag and meta description tag. In fact, the link tag itself isn’t new; like nofollow, it simply uses a new value for the rel attribute. For example:
<link rel="canonical" href="http://www.yoursite.com/yourpage.php">
Placed on a variant URL such as http://www.yoursite.com/yourpage.php?5473893993, this tells Yahoo!, MSN or Google to treat that page as www.yoursite.com/yourpage.php. All links and content metrics a search engine would apply should therefore tie back to that URL, as though the two were one and the same.
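To check which canonical URL a page actually declares, a short Python sketch like the one below can help. It is only an illustration using the standard library: the CanonicalFinder class, the find_canonical helper and the sample page are hypothetical names made up for this example, not part of any search engine's tooling.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html_text):
    """Return the declared canonical href, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Hypothetical sample page, for illustration only.
page = """<html><head>
<title>Your page</title>
<link rel="canonical" href="http://www.yoursite.com/yourpage.php">
</head><body>Hello</body></html>"""

print(find_canonical(page))
```

Running find_canonical over each URL variant of a page is a quick way to confirm that they all name the same preferred URL.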
The canonical link tag works in a similar way to a 301 redirect. In essence, you’re telling the engines that multiple pages should be considered as one (which a 301 does), without actually redirecting visitors to the new URL. The major difference is that a 301 redirect sends all traffic (bots and human visitors) to a new page, whereas the canonical link tag is just for search engines. This means you can still separately track visitors to the unique URL versions.
Here is the information from the major search engines themselves:
Is rel="canonical" a hint or a directive?
It’s a hint that we honor strongly. We’ll take your preference into account, in conjunction with other signals, when calculating the most relevant page to display in search results.
Can I use a relative path to specify the canonical?
Yes, relative paths are recognized as expected with the <link> tag. Also, if you include a <base> link in your document, relative paths will resolve according to the base URL.
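The relative-path behavior in the answer above follows ordinary URL resolution, which can be sketched with Python's standard urljoin. The resolve_canonical helper and the example.com URLs are assumptions made for this illustration; how each engine resolves paths internally is not documented here.

```python
from urllib.parse import urljoin


def resolve_canonical(page_url, href, base_href=None):
    """Turn a possibly relative canonical href into an absolute URL."""
    # A base href (if present) is itself resolved against the page URL,
    # and then serves as the base for relative URLs in the document.
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, href)


# Hypothetical page URL, for illustration only.
page_url = "http://www.example.com/shop/list.php?sort=price"

# Without a base URL, the href resolves against the page's own URL.
print(resolve_canonical(page_url, "list.php"))

# With a base URL, the same href resolves against the base instead.
print(resolve_canonical(page_url, "list.php",
                        base_href="http://www.example.com/catalog/"))
```

The two calls produce different absolute URLs from the same href, which is one reason absolute paths are the safer choice.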
Is it okay if the canonical is not an exact duplicate of the content?
We allow slight differences, e.g., in the sort order of a table of products. We also recognize that we may crawl the canonical and the duplicate pages at different points in time, so we may occasionally see different versions of your content. All of that is okay with us.
What if the rel="canonical" returns a 404?
We’ll continue to index your content and use a heuristic to find a canonical, but we recommend that you specify existent URLs as canonicals.
What if the rel="canonical" hasn’t yet been indexed?
Like all public content on the web, we strive to discover and crawl a designated canonical URL quickly. As soon as we index it, we’ll immediately reconsider the rel="canonical" hint.
Can rel="canonical" be a redirect?
Yes, you can specify a URL that redirects as a canonical URL. Google will then process the redirect as usual and try to index it.
What if I have contradictory rel="canonical" designations?
Our algorithm is lenient: We can follow canonical chains, but we strongly recommend that you update links to point to a single canonical page to ensure optimal canonicalization results.
Can this link tag be used to suggest a canonical URL on a completely different domain?
No. To migrate to a completely different domain, permanent (301) redirects are more appropriate. Google currently will take canonicalization suggestions into account across subdomains (or within a domain), but not across domains. So site owners can suggest www.example.com vs. example.com vs. help.example.com, but not example.com vs. example-widgets.com.
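The same-domain rule in the answer above can be approximated in a few lines of Python. This is a deliberately naive sketch: it assumes the "domain" is simply the last two labels of the hostname, which is wrong for suffixes such as co.uk, and the registered_domain and same_domain helpers are hypothetical names for this example, not anything the search engines publish.

```python
from urllib.parse import urlsplit


def registered_domain(url):
    """Naive guess at the domain: the last two labels of the hostname.

    ASSUMPTION: this ignores multi-part suffixes such as co.uk, so treat it
    as an illustration rather than a reliable check.
    """
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return None
    return ".".join(host.split(".")[-2:])


def same_domain(page_url, canonical_url):
    """True if both URLs appear to share a domain (subdomains may differ)."""
    page_domain = registered_domain(page_url)
    canonical_domain = registered_domain(canonical_url)
    return page_domain is not None and page_domain == canonical_domain


print(same_domain("http://example.com/a", "http://www.example.com/a"))
print(same_domain("http://help.example.com/a", "http://www.example.com/a"))
print(same_domain("http://example.com/a", "http://example-widgets.com/a"))
```

The first two pairs differ only by subdomain, so they pass the check; the third pair crosses domains and does not.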
Sounds great—can I see a live example?
Yes, wikia.com helped us as a trusted tester. For example, you’ll notice that the source code on the URL http://starwars.wikia.com/wiki/Nelvana_Limited specifies its rel="canonical" as: http://starwars.wikia.com/wiki/Nelvana.
• The URL paths in the tag can be absolute or relative, though we recommend using absolute paths to avoid any chance of errors.
• A tag can only point to a canonical URL form within the same domain and not across domains. For example, a tag on http://test.example.com can point to a URL on http://www.example.com but not on http://yahoo.com or any other domain.
• The tag will be treated similarly to a 301 redirect, in terms of transferring link references and other effects to the canonical form of the page.
• We will use the tag information as provided, but we’ll also use algorithmic mechanisms to avoid situations where we think the tag was not used as intended. For example, if the canonical form is non-existent, returns an error or a 404, or if the content on the source and target was substantially distinct and unique, the canonical link may be considered erroneous and deferred.
• The tag is transitive. That is, if URL A marks B as canonical, and B marks C as canonical, we’ll treat C as canonical for both A and B, though we will break infinite chains and other issues.
• This tag will be interpreted as a hint by Live Search, not as a command. We’ll evaluate this in the context of all the other information we know about the website and try and make the best determination of the canonical URL. This will help us handle any potential implementation errors or abuse of this tag.
• You can use relative or absolute URLs in the “href” attribute of the link tag.
• The page and the URL in the “href” attribute must be on the same domain. For example, if the page is found on “http://mysite.com/default.aspx”, and the “href” attribute in the link tag points to “http://mysite2.com”, the tag will be invalid and ignored. However, the “href” attribute can point to a different subdomain. For example, if the page is found on “http://mysite.com/default.aspx” and the “href” attribute in the link tag points to “http://www.mysite.com”, the tag will be considered valid.
• Live Search expects to implement support for this feature sometime in the near future.
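The transitive behavior described in the bullets above (A marks B as canonical, B marks C, so C is canonical for both) can be sketched in Python as follows. The resolve_chain function, the hop limit and the choice to return None when a loop is found are all assumptions for this illustration; the engines say only that they break infinite chains, not how.

```python
def resolve_chain(url, canonical_of, max_hops=10):
    """Follow canonical declarations from url until they stop.

    canonical_of maps a page URL to the canonical URL it declares.
    Returns the final canonical URL, or None if the declarations loop or
    the chain is longer than max_hops (both limits are assumptions).
    """
    seen = {url}
    current = url
    for _ in range(max_hops):
        target = canonical_of.get(current)
        if target is None or target == current:
            # No further declaration, or the page points at itself.
            return current
        if target in seen:
            # A loop such as A -> B -> A: give up rather than follow forever.
            return None
        seen.add(target)
        current = target
    return None


# Hypothetical declarations, for illustration only.
declared = {
    "http://www.example.com/a": "http://www.example.com/b",
    "http://www.example.com/b": "http://www.example.com/c",
}

print(resolve_chain("http://www.example.com/a", declared))
print(resolve_chain("http://www.example.com/b", declared))
```

Both calls end at the same URL, which is why pointing every variant directly at a single canonical page is the simpler setup.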
Matt Cutts has also written a great article on his blog about the new canonical tag.