Protecting Your Website from Canonical Triggers
Search engine spiders sometimes get confused and fail to locate or index your website properly. Consider the following addresses, all of which can lead to the same home page:
office.com
www.office.com
office.com/index.html
www.office.com/index.html
https://office.com/index.html
https://office.com
Some Apache servers use mod_dir, which can add a further variant by redirecting an address without a trailing slash to the same address with one. Hence office.com is redirected to office.com/. This is a fairly rare occurrence, but it can still become an important issue. Most experts recommend always adding the trailing slash to any link you use for link building. This is also the correct way to link to a website.
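As a rough illustration of the trailing slash rule, here is a minimal Python sketch that adds the slash to a link before you publish it. The function name with_trailing_slash and the rule "a last segment containing a dot is a file" are assumptions made for this example; they are not how mod_dir itself decides.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url: str) -> str:
    """Return url with a trailing slash added when the path looks like a directory.

    Hypothetical helper: a last path segment that contains a dot
    (for example index.html) is assumed to be a file and is left alone.
    """
    parts = urlsplit(url)
    path = parts.path or "/"  # a bare domain has an empty path
    last_segment = path.rsplit("/", 1)[-1]
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    for link in ("http://office.com", "http://office.com/blog"):
        print(link, "->", with_trailing_slash(link))
```

Running every outgoing link through a helper like this keeps your link building consistent, so the server never has to issue the redirect in the first place.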
Canonical means authoritative: the canonical path is the one address that search engines treat as the true location of a page, and it is how they understand that the pages belong to your website. Since most of the time you will be dealing with robots, you have to take extra precautions, because robots only follow rules and cannot work out on their own that two addresses mean the same page.
If a robot notices a page that has anywhere from 2 to 5 different paths, it may treat the extra paths as additional pages rather than as the main page. There is a high probability that the spider, confused by so many paths, will index your pages as duplicates and in the process place importance on an unimportant page. For example, it may end up giving priority to the index.html page instead of your domain. This is exactly why you need a proper home page link such as “http://www.office.com/” or simply “/”, and should never use something like /index.xxx.
Canonical issues take different forms and do create problems, although they are no longer a regular occurrence thanks to sitemap programs and growing awareness.
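To make the idea of one authoritative path concrete, here is a minimal Python sketch that maps the home page variants listed earlier to a single address. The function name canonicalize, the choice of http://www.office.com/ as the preferred form, and the list of index file names are all assumptions for illustration; use whichever single form you have chosen for your own site.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed preferred form for this example site.
CANONICAL_SCHEME = "http"
CANONICAL_HOST = "www.office.com"
# Assumed default document names that should collapse to the directory itself.
INDEX_FILES = ("index.html", "index.htm", "index.php", "index.asp")


def canonicalize(url: str) -> str:
    """Map any address of this one site to its single canonical form."""
    if "://" not in url:
        url = "http://" + url  # bare "office.com/..." needs a scheme to parse
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last.lower() in INDEX_FILES:
        path = head + "/"  # /index.html becomes /
    # Scheme and host are always replaced with the preferred ones.
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST, path, parts.query, ""))


if __name__ == "__main__":
    variants = [
        "office.com",
        "www.office.com",
        "office.com/index.html",
        "www.office.com/index.html",
        "https://office.com/index.html",
        "https://office.com",
    ]
    for variant in variants:
        print(variant, "->", canonicalize(variant))
```

All six variants come out as the same address, which is the single path you want the spiders to see in your internal links and sitemap.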
Duplicate Content Issues
This is another major issue, and it is related to canonical issues. If you use the same contact form with several dynamic variables, duplicate pages will be created. Your form might appear as contact.asp?id=New York or as contact.asp?id=Florida. Google then sees the same page under several different addresses and might end up treating it as spam.
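One way to spot this kind of duplication in your own list of URLs is to ignore the query string and see which addresses collapse onto the same page. The following Python sketch does that; the function name find_duplicate_paths and the sample URLs are assumptions for illustration, and it only compares addresses, not the actual page content.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def find_duplicate_paths(urls):
    """Group URLs by host and path, ignoring the query string.

    Returns only the groups with more than one variant, which are the
    candidates for duplicate content.
    """
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        groups[(parts.netloc.lower(), parts.path)].append(url)
    return {page: variants for page, variants in groups.items() if len(variants) > 1}


if __name__ == "__main__":
    crawl = [
        "http://www.office.com/contact.asp?id=New%20York",
        "http://www.office.com/contact.asp?id=Florida",
        "http://www.office.com/about.asp",
    ]
    for page, variants in find_duplicate_paths(crawl).items():
        print(page, "has", len(variants), "variants")
```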

There is a simple fix for the duplicate content issue: add a rel=”nofollow” attribute to the links that point to these pages, or block the contact.asp variants with a wildcard rule in the robots.txt file. Of late, this has become quite a common task on most dynamic sites. The fix is required because, without it, the duplicated paths can set off a canonical trigger. Another option is to add an SSL (Secure Sockets Layer) certificate to your site and serve every page from the single https address. These are some of the ways of removing the potential threat of canonical issues as well as duplicate content problems.
To understand the different duplicate content issues and to have an error-free website, you can sign up for the SEOi free trial account and take the product tour.
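Before relying on a robots.txt rule, it helps to check that it blocks what you expect. The sketch below uses Python's standard urllib.robotparser module against an assumed rule for the contact.asp example above. The rule text and the helper name is_crawlable are assumptions for illustration. Python's parser does simple prefix matching, so real search engine crawlers, some of which support wildcard patterns, may interpret a rule differently.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content for this example: block the contact form
# whenever it is requested with a query string.
ROBOTS_TXT = """\
User-agent: *
Disallow: /contact.asp?
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules let a generic robot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


if __name__ == "__main__":
    for url in (
        "http://www.office.com/contact.asp?id=Florida",
        "http://www.office.com/about.html",
    ):
        print(url, "crawlable:", is_crawlable(url))
```

A quick check like this catches a mistyped rule before it either leaves the duplicate pages open or blocks pages you wanted indexed.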