PageRank Patent Update – How it Impacts SEO

Bill Slawski published an article noting that Google updated a patent related to PageRank. This is an important algorithm because it affects how sites are ranked and explains why some sites rank well while others do not.

How Does this Affect Link Building?

This changes the game for link building. Actually, the game changed a while ago. The algorithms described here are closely tied to what we know about the Penguin algorithm.

This affects link building because the algorithm calculates link distances between an authoritative, spam-free site and the sites it links to. These links are also divided by topic.

For link building, the ideal link is going to be a link from a site that is as close as possible to the most authoritative, high-quality site in that niche. The difference is that the high-quality sites are different for every niche. This changes what is meant by an authority site.

An untrustworthy site will ordinarily not be able to obtain many links from the sites closest to the most trusted and authoritative sites in its topic. Its link distance will be long.
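To make the idea of link distance concrete, here is a minimal sketch in Python. It is not the patent's method. It assumes every link counts as one hop and that the seed set is already chosen, and the site names and the `link_distances` function are made up for illustration.

```python
from collections import deque

def link_distances(graph, seeds):
    """Return the hop count from the nearest seed to every reachable page.

    Assumption: every link has the same length (one hop).
    """
    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in dist:  # first visit is the shortest path in a BFS
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist

# Hypothetical link graph: page -> pages it links to.
graph = {
    "seed.example": ["a.example", "b.example"],
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["d.example"],
    "spam.example": ["d.example"],
}

distances = link_distances(graph, ["seed.example"])
for page, hops in sorted(distances.items(), key=lambda item: item[1]):
    print(page, hops)
```

In this toy graph, spam.example never receives a distance at all, because no path of links leads to it from the seed. A link from a.example (one hop from the seed) puts a page closer than a link from c.example (two hops from the seed).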

If Google still showed PageRank scores, the actual score would be somewhat irrelevant in obscure niches. A small niche doesn’t have many sites linking to one another, so no site in it can build up a large amount of PageRank. Ranking by distance instead gives smaller sites in smaller niches a boost in rankings.

And that is why this algorithm is a breakthrough. It allows sites in small niche topics to outrank larger sites that have more links. And this affects link building because it changes the solution for ranking.

The solution for ranking is no longer about getting the biggest and most authoritative link. Instead, it becomes a problem of identifying the links that are the shortest distance from the most authoritative sites about the topic you want to rank for.

That’s the big takeaway for ranking. What follows is a geek-out on the details. You don’t have to read it. But if you want to know how search engines work then you’re going to like reading it!

What has Changed With PageRank?

Probably not a whole lot. This patent doesn’t seem to introduce much that is different. But what is noteworthy is that the author cited in the patent is connected to a related patent, Scalable System for Determining Short Paths Within Web Link Network, that’s about the methodology for computing link distances.

Here is the abstract. It’s really un-sexy, like plumbing:

“Systems and methods for finding multiple shortest paths. A directed graph representing web resources and links are divided into shards, each shard comprising a portion of the graph representing multiple web resources. Each of the shards is assigned to a server, and a distance table is calculated in parallel for each of the web resources in each shard using a nearest seed computation in the server to which the shard was assigned.”

See what I mean? It’s about dividing the link graph into shards, assigning each shard to a server, and letting those servers do their calculations in a distributed manner so that if one computer fails the others pick up the slack.
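The abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the shard count, the hash-based `shard_of` assignment and the round-by-round message passing are all hypothetical choices, every link counts as one hop, and the "servers" run one after another in a single process instead of in parallel.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical; stands in for the number of servers

def shard_of(page):
    # Stable hash, so a page always lands on the same shard.
    return zlib.crc32(page.encode()) % NUM_SHARDS

def nearest_seed_tables(graph, seeds):
    """Build one distance table per shard: page -> (distance, nearest seed)."""
    tables = [dict() for _ in range(NUM_SHARDS)]
    # Pending updates, grouped by the shard that owns the target page.
    inbox = defaultdict(list)
    for seed in seeds:
        inbox[shard_of(seed)].append((seed, 0, seed))
    while inbox:
        outbox = defaultdict(list)
        # Each shard could work through its inbox on its own server;
        # this sketch simply handles the shards in turn.
        for shard, updates in inbox.items():
            table = tables[shard]
            for page, dist, seed in updates:
                if page in table and table[page][0] <= dist:
                    continue  # an equal or shorter path is already known
                table[page] = (dist, seed)
                for target in graph.get(page, []):
                    outbox[shard_of(target)].append((target, dist + 1, seed))
        inbox = outbox
    return tables

# Hypothetical graph with two seeds.
graph = {
    "seed1.example": ["a.example"],
    "a.example": ["b.example"],
    "seed2.example": ["b.example"],
}
tables = nearest_seed_tables(graph, ["seed1.example", "seed2.example"])
merged = {page: entry for table in tables for page, entry in table.items()}
print(merged)
```

Here b.example is two hops from seed1.example but only one hop from seed2.example, so its table entry records seed2.example as the nearest seed.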

Why Plumbing Affects PageRank

This is important because scaling the computations is a core problem in computing large-scale seed set algorithms, as noted by the patent update cited by Bill Slawski:

“Hence, what is needed is a method and an apparatus for producing a ranking for pages on the web using a large number of diversified seed pages without the problems of the above-described techniques.”

Is Google Introducing a Bias by Using “Trusted” Sites?

This algorithm is about creating a map of links between web pages. The starting point for this map of the web is the set of sites that Google determines are the most authoritative and spam-free.

All this does is try to create a spam-free map of the Internet. Other parts of Google’s core ranking algorithm calculate other factors, said to number over 200.

You can think of this algorithm as the gatekeeper that determines which pages will be considered for ranking. It doesn’t decide who ranks and who doesn’t. Those 200-plus ranking factors help determine that.

Why Use a Trusted Seed Set?

By using a “trusted” seed set, Google is just setting up parameters for what constitutes an authority site for a given topic. This follows on from what was discovered in the Topical Trust Rank paper that’s cited in the patent.

The original Trust Rank algorithm was biased toward big sites. Researchers developed a new form of Trust Rank called Topical Trust Rank. Topical Trust Rank solved the big site bias by dividing the web into shards that represented different topics and niches.

The seed sets were chosen from among those different shards that represented topics and niches. This meant that you can have a tiny niche with relatively few sites on a topic, but still have an authority site that links out to related sites.

The effect of this is that these small sites have a chance at ranking because they are authoritative on those obscure topics.
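A toy sketch shows why choosing seeds per topic helps small niches. The candidate sites, topics and quality scores below are invented for illustration, and the selection rule (highest score wins) is an assumption, not the procedure from the Topical Trust Rank paper or the patent.

```python
from collections import defaultdict

# (page, topic, quality score) - all values are hypothetical.
candidates = [
    ("bignews.example", "news", 0.99),
    ("dailypaper.example", "news", 0.97),
    ("healthportal.example", "health", 0.95),
    ("fernclub.example", "fern gardening", 0.40),
]

def global_seeds(candidates, k):
    """Pick the k best-scoring pages across the whole web."""
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    return [page for page, _, _ in ranked[:k]]

def topical_seeds(candidates, per_topic):
    """Pick the best-scoring pages within each topic separately."""
    by_topic = defaultdict(list)
    for page, topic, score in candidates:
        by_topic[topic].append((score, page))
    return {
        topic: [page for _, page in sorted(pages, reverse=True)[:per_topic]]
        for topic, pages in by_topic.items()
    }

print(global_seeds(candidates, 3))
print(topical_seeds(candidates, 1))
```

With one global list, the three seed slots all go to the big news and health sites. With per-topic selection, the small fern gardening niche gets its own seed, so sites in that niche have something nearby to be measured against.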

Is Google’s Algorithm a Topical Trust Rank Algorithm?

No. The difference between Google’s algorithm and the various trust rank algorithms is that there is no actual THING called Trust that is being assigned to each web page (called a node in the patent).

In trust algorithms there is a thing called Trust that is being propagated the way PageRank is spread from page to page through links. Not so with Google’s algorithm. There is no thing called trust.

The only thing they are doing is starting from “trusted” sites. Don’t let that word “trusted” confuse you into thinking that this is a trust algorithm. It’s not. It is a Distance Ranking Algorithm.

They are only selecting sites that can be trusted, which means sites that are legit and spam-free. Google is not using a thing called Trust to spread from link to link.

Google’s Patent Does Not Use the Word Trust

Google’s patent doesn’t even use the word trust. And they are not using a thing called Trust to calculate PageRank.

They are starting from trusted pages to calculate distances between the seed set and the pages they link to, and then on to the pages further away. Google’s patent is ranking according to distance. So the further away you are (distance), the likelier it is that you are irrelevant or spam.

In a trust ranking algorithm, there is an actual thing called trust that is spread from link to link that is calculated. Not so with Google’s algorithm.
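The contrast can be sketched in code. Both functions below are simplified illustrations, not Google's implementation: `propagate_trust` is a generic trust-propagation loop in the style of PageRank with an assumed damping value of 0.85, and `hop_distances` assumes every link counts as one hop. The site names are hypothetical.

```python
from collections import deque

def propagate_trust(graph, seeds, damping=0.85, rounds=20):
    """Trust-style scoring: a quantity flows along links and is split among them."""
    pages = set(graph) | {t for targets in graph.values() for t in targets}
    base = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(base)
    for _ in range(rounds):
        nxt = {p: (1 - damping) * base[p] for p in pages}
        for page, targets in graph.items():
            for target in targets:
                # Each outgoing link carries an equal share of the page's trust.
                nxt[target] += damping * trust[page] / len(targets)
        trust = nxt
    return trust

def hop_distances(graph, seeds):
    """Distance-style scoring: nothing flows, only path length is measured."""
    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist

# One seed links to a single page, the other seed links to three pages.
graph = {
    "s1.example": ["a.example"],
    "s2.example": ["b.example", "c.example", "d.example"],
}
seeds = ["s1.example", "s2.example"]
trust = propagate_trust(graph, seeds)
dist = hop_distances(graph, seeds)
print(trust["a.example"], trust["b.example"])
print(dist["a.example"], dist["b.example"])
```

In the trust-style version, a.example ends up with more trust than b.example because s2.example splits its trust three ways. In the distance-style version the two pages are treated the same, since each is exactly one hop from a seed.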

Google Uses the Word Trusted

Google is using the word Trusted, not the noun version, trust. Google uses the word Trusted to mean that a site is legit. One could change that word to Spam-Free and not really lose the meaning.

A site that’s legit and trusted makes a good starting point from which to calculate link distances. But that’s a far cry from saying trust is being measured. It’s not trust that is being measured. It is the distance. This is a link distance ranking algorithm called Producing a Ranking for Pages Using Distances in a Web-Link Graph. It is not about trust. It’s about the distance from a trusted site to another web page.

I’m anticipating that others will start calling it a trust algorithm. But they will be wrong. Google’s patent only states that the algorithm is starting with trusted sites then measuring the distances from those sites. They are not saying there is a thing called Trust being used to calculate PageRank.

 

Author: Roger Montti, SEJ Staff, Owner of Martinibuster.com

I have 25 years hands-on experience in SEO, evolving along with the search engines by keeping up with the latest ...