Bot Herding: The Ultimate Tool for PageRank Flow

One important rule in SEO is that you cannot rely entirely on search engine spiders and bots when they visit your web site. Bots can create duplicate content issues, treat important pages as junk, or perceive other problems that don't actually exist.

Googlebot has, in the past, mistakenly moved valuable pages into the Google supplemental index, or passed PageRank to pages that do not need to rank, e.g. login, register, or subscribe pages.
Therefore it is necessary to guide these bots if you want to avoid such problems. But how can you guide them?

The noindex HTML meta tag advises search engine bots not to index a web page. The nofollow meta tag, on the other hand, advises bots not to pass rank through any of the links on that page. It does the same thing as the rel="nofollow" link attribute, but at the page level instead of the per-link level. Are the above methods the ultimate PageRank sculpting solution?
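Before answering, here is a minimal sketch of the two forms (the page name and anchor text are hypothetical):

<!-- Page level: advises bots not to pass rank through any link on this page -->
<meta name="robots" content="nofollow" />

<!-- Link level: advises bots not to pass rank through this one link only -->
<a href="/login.php" rel="nofollow">Login</a>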

Let's look at the following scenario:

Page A links to page B, and that link carries the nofollow attribute. In this case Page A will not pass PageRank to Page B, right? But what if page B is linked from another page of your web site, or from an external web site, without that link being protected with the nofollow attribute? Won't PageRank be assigned to page B? Won't a snippet show up in the search results? It will! Is that what you want? Wouldn't you prefer to make sure that the incoming PageRank for page B is passed on to the most important pages of your web site? Wouldn't it be better to rank your most lucrative product or service pages? If your answer is yes, continue reading to find out how you can achieve this.
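To make the scenario concrete, here is a hypothetical sketch (page names and URLs are invented for illustration):

<!-- On Page A: this link does not pass PageRank to Page B -->
<a href="/page-b.html" rel="nofollow">Page B</a>

<!-- On another internal page or an external site: this plain link still passes PageRank to Page B -->
<a href="http://www.example.com/page-b.html">Page B</a>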

Many webmasters or SEOs might advise you to also disallow page B in the robots.txt file, using for example this:

User-agent: *
Disallow: /login.php

But will that solve the problem? Not always. As I said above, if Page B is linked from an external site without being protected by the nofollow attribute, PageRank will still be assigned to it, and it can still show up in the search results. But there is a way, in Google, to overcome this problem.

You can use one of the following methods to overcome the problem:

a. Adding the noindex HTML meta tag to page B, like this:

<meta name="robots" content="noindex" />

or

b. Adding the Noindex directive to your robots.txt, like this:

User-agent: Googlebot
Noindex: /login.php

Did I say a Noindex robots.txt directive? Yes! Google unofficially supports the Noindex robots.txt directive, but at this moment it is not supported by other search engines. Using this directive you can advise Googlebot not to index a page (or to de-index a page that was previously indexed). But it will not prevent Googlebot from following the links on page B and passing the incoming PageRank on to the linked pages which are not protected.

But that is not all!

You SHOULD NOT use the nofollow HTML meta tag on Page B, because that would make Page B a dead end. Links pointing to dead-end pages are called dangling links.

Now you might ask: Are Dangling Links a problem?

The answer may be found in the following extract from the original PageRank paper by Google’s founders, Sergey Brin and Lawrence Page:

“Dangling links are simply links that point to any page with no outgoing links. They affect the model because it is not clear where their weight should be distributed, and there are a large number of them. Often these dangling links are simply pages that we have not downloaded yet… Because dangling links do not affect the ranking of any other page directly, we simply remove them from the system until all the PageRanks are calculated. After all the PageRanks are calculated they can be added back in without affecting things significantly.”
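For context, the paper's simplified PageRank formula shows why such pages are awkward for the model (d is the damping factor and C(T) is the number of outgoing links on page T):

PR(A) = (1 - d) + d * ( PR(T1)/C(T1) + ... + PR(Tn)/C(Tn) )

A page with no followed outgoing links never appears as one of the Ti terms for any other page, so the rank it accumulates is not redistributed anywhere, which is exactly the dangling-link situation described above.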

That said, you can still add the nofollow attribute to outgoing links on Page B, but you must make sure that at least one link can be followed.
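For example, a hypothetical set of outgoing links on Page B might look like this (file names invented for illustration):

<!-- These links do not pass PageRank -->
<a href="/terms.php" rel="nofollow">Terms of Service</a>
<a href="/privacy.php" rel="nofollow">Privacy Policy</a>

<!-- At least one followed link, so Page B is not a dead end -->
<a href="/products.php">Our Products</a>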

By implementing the above techniques, you can achieve the maximum possible control over the PageRank flow within your web site.

Disclaimer: I said maximum possible control, but not 100% control. If you have an alternative idea and you don't mind sharing, please feel free to do so.

John S. Britsios (aka Webnauts) is the Founder & CEO at SeoWorkers.com & Webnauts.net, a Web Architect & Senior SEO Consultant, specializing in Web Content Accessibility, Usability Testing, Search Engine / Social Media Optimization & Marketing.

 
