Google has recently been granted a patent by the US government which the company originally filed in 2003. The patent, writes Bill Slawski of SEO by the Sea, explores how Google could use anchor text to rank and classify pages, how it crawls some sites more frequently than others, and how it would treat links that pass through a redirect differently from natural links. (A granted patent does not necessarily describe what a company actually uses.)
What’s quite interesting in the patent is how Google would form a link log for a linked document which takes into account the text surrounding the link itself:
A URL document is a document obtained from a URL by a robot and passed to content filter. Each link record lists the URL fingerprints of all the links (URLs) that are found in the URL document associated with a record, and the text that surrounds the link.
For example, a link pointing to a picture of Mount Everest might read “to see a picture of Mount Everest click here.” The anchor text might be the “click here” but the additional text “to see a picture of Mount Everest” could be included in the link record.
This underscores the importance of making sure your inbound links are part of an editorial piece that is relevant to the information on the target URL’s page. It may also explain why some SEOs in the industry have seen the influence of sitewide links drop, while editorial links can be much more powerful.
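To make the link record idea concrete, here is a minimal Python sketch of how a record with anchor text and surrounding text could be assembled. It is an illustration only, not the patent's implementation: the `url_fingerprint` hash, the `window` size of 40 characters, and the record field names are all assumptions made for this example.

```python
import hashlib
from html.parser import HTMLParser


def url_fingerprint(url: str) -> str:
    """Hypothetical fingerprint: first 16 hex digits of the URL's SHA-256."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]


class LinkRecordParser(HTMLParser):
    """Collects the page's plain text and remembers where each link sits in it."""

    def __init__(self, window: int = 40):
        super().__init__()
        self.window = window    # assumed number of characters kept on each side
        self.text = ""          # plain text seen so far
        self.links = []         # (href, start offset, end offset) into self.text
        self._current = None    # link currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._current = (href, len(self.text))

    def handle_data(self, data):
        self.text += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            href, start = self._current
            self.links.append((href, start, len(self.text)))
            self._current = None

    def records(self):
        """Build one record per link: fingerprint, anchor text, nearby text."""
        out = []
        for href, start, end in self.links:
            before = self.text[max(0, start - self.window):start]
            after = self.text[end:end + self.window]
            out.append({
                "fingerprint": url_fingerprint(href),
                "url": href,
                "anchor_text": self.text[start:end].strip(),
                "text_before": before.strip(),
                "text_after": after.strip(),
            })
        return out


def build_link_records(html: str, window: int = 40):
    parser = LinkRecordParser(window)
    parser.feed(html)
    parser.close()
    return parser.records()
```

Fed the Mount Everest sentence with “click here” as the link, `build_link_records` returns a record whose `anchor_text` is “click here” and whose `text_before` carries the descriptive phrase, which is the part a ranking system could use when the anchor text alone says nothing about the target.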
Another interesting and very timely snippet covers the treatment of URL redirects, especially in the current world where a good piece of linkbait will be around one day and then redirect to a site’s homepage the next.
The patent on Handling Permanent and Temporary Redirects:
Robots do not follow permanent redirects that are found at URLs that they have been requested to crawl, but instead send the source and target (redirected) URLs of the redirect to the content filters.
The content filters take the redirect URLs and place them in link logs where they are passed back to URL managers. It is the URL managers that determine when and if such redirect URLs will be assigned to a robot for crawling. Robots are set to follow temporary redirects, and obtain page information from the temporary redirects.
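The redirect behavior quoted above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Google's crawler: the `web` dictionary stands in for real HTTP fetches, `CrawlResult` and `crawl` are hypothetical names, and mapping "permanent" to status codes 301/308 and "temporary" to 302/303/307 is my reading of the HTTP status codes rather than something the patent excerpt spells out.

```python
from dataclasses import dataclass, field

# Assumed mapping from HTTP status codes to the patent's two redirect kinds.
PERMANENT = {301, 308}
TEMPORARY = {302, 303, 307}


@dataclass
class CrawlResult:
    fetched: list = field(default_factory=list)   # (requested_url, final_url, body)
    link_log: list = field(default_factory=list)  # (source_url, target_url)


def crawl(url, web, result, max_hops=5):
    """Simulate one robot request.

    `web` maps a URL to (status, payload), where payload is the redirect
    target for 3xx responses and the page body otherwise.
    """
    current = url
    for _ in range(max_hops):
        status, payload = web[current]
        if status in PERMANENT:
            # Do not follow: report source and target so a URL manager
            # can decide later whether the target gets crawled.
            result.link_log.append((current, payload))
            return
        if status in TEMPORARY:
            # Follow the temporary redirect and read the page behind it.
            current = payload
            continue
        result.fetched.append((url, current, payload))
        return
    # Too many temporary hops: give up without recording anything.
```

With a `web` in which `/old` returns a 301 to `/new` and `/tmp` returns a 302 to `/new`, crawling `/old` only adds an `(old, new)` pair to `link_log`, while crawling `/tmp` fetches the content of `/new` right away. That difference is what matters for linkbait that is later redirected to a homepage.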
Mr. Slawski has put together an overview of one of the most important patents to be granted since Yahoo’s Paid Search Bid Patent, and every practicing search marketer should dig deeply into both Slawski’s overview and the original patent filing.
- Bill Slawski’s Post : Google Patent on Anchor Text and Different Crawling Rates
- The Patent : Anchor tag indexing in a web crawler system