Although all external Wikipedia links have carried the nofollow attribute since January of this year, links from the encyclopedia are still important for reasons other than ranking in search engines.
In addition to generating traffic from people who read Wikipedia articles, external links from Wikipedia also reinforce a site's authority on the subject of the Wikipedia article.
Both of these benefits can be very significant and should not be underestimated. For competitive intelligence purposes, it is also interesting to know whether a competitor has links to his website within the main article namespace at Wikipedia. Of special importance are links that serve as references for industry-specific subjects, rather than just a link from the article about the website or company itself.
There are several methods and tools available for finding links in Wikipedia, along with other useful information. Most of the methods and tools introduced here are aimed at a more tech-savvy audience (developers), but a few of them require no programming skills at all.
1. Wikipedia External Link Search
The Wikipedia External Link Search feature is built into Wikipedia. Its big advantage is that it searches the live Wikipedia database, which makes it the most up-to-date link search tool available.
The drawback is that the search pattern is case sensitive, which increases the chance of missing URLs because of capitalization differences. Since most domain names in URLs are written in lower case, however, the search does a good job for basic domain searches in most cases.
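For repeated checks, the link search can also be fetched with a short script. The sketch below requests the Special:LinkSearch page for a domain and roughly counts the results; the target and limit parameters and the result markup are assumptions worth verifying against the live page.

```python
import re
import urllib.parse
import urllib.request

def count_linksearch_results(domain, limit=500):
    """Fetch Special:LinkSearch results for a domain and roughly count them.

    Scraping the HTML is a heuristic, not a stable interface; adjust the
    regex if the page markup changes.
    """
    params = urllib.parse.urlencode({"target": domain, "limit": limit})
    url = "https://en.wikipedia.org/wiki/Special:LinkSearch?" + params
    req = urllib.request.Request(url, headers={"User-Agent": "linksearch-sketch/0.1"})
    with urllib.request.urlopen(req) as resp:
        html = resp.read().decode("utf-8")
    # Each result is rendered as a list item on the special page.
    return len(re.findall(r"<li>", html))

print(count_linksearch_results("seobook.com"))
```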
2. Wikipedia Search by DomainTools.com
DomainTools.com is known for their free WhoIs lookup among other great tools for domainers.
One great feature is the ability to search, from the WhoIs results and other pages, for references to a domain within Wikipedia. There is also a search that returns all external links from a specific Wikipedia article.
I have not found a "search form" or anything like that, only specific links that run a search for a specified domain name or Wikipedia page (including pages outside the main article namespace).
The URL for the domain search looks like this: http://www.domaintools.com/enwikipedia/domainname.tld
For example: http://www.domaintools.com/enwikipedia/seobook.com
The URL for the look up of all external links from a specific Wikipedia article looks like this: http://www.domaintools.com/en/article_name
Note: Replace spaces in the article name with underscores.
For example: http://www.domaintools.com/en/Search_engine_optimization
A nice touch is the interlinking between searches on each results page: the article search results link to the domain search, and the domain search results link back to the article search.
The tool uses data from the Wikipedia article dumps, which means the results are not real-time.
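Because both URL patterns are so simple, they are easy to generate in code. The following sketch builds the two lookup URLs exactly as the patterns above describe.

```python
def domaintools_domain_url(domain):
    """DomainTools Wikipedia search for a domain, per the pattern above."""
    return "http://www.domaintools.com/enwikipedia/" + domain

def domaintools_article_url(article_title):
    """DomainTools external-link lookup for a Wikipedia article.

    Spaces in the article title are replaced with underscores,
    as the pattern above requires.
    """
    return "http://www.domaintools.com/en/" + article_title.replace(" ", "_")

print(domaintools_domain_url("seobook.com"))
print(domaintools_article_url("Search engine optimization"))
```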
3. Wikipedia Database Dumps
Wikipedia makes the content of ALL its wiki databases available for download to the public. Guess where the wiki clones and Search.com get their Wikipedia data from? The dump of the English-language Wikipedia main space is the most interesting one for US, UK, and other marketers who serve English-speaking markets.
The name of that dump is "enwiki", and it is not generated every day. The last "enwiki" dump was on June 4, 2007, which is over two weeks ago. However, most links do not change that often. Spam links do not last very long, and if they do, it is only because they sit on pages nobody cares about, ourselves included.
Here are the links that are relevant to the Wikipedia Dump.
- Primary article at Wikipedia to Wikipedia Dumps
- Download location of the English Wikipedia dumps
- List that shows the last date and time of all dumps
- Request for dumps – info page; I doubt that they will perform a special dump if you ask them to, but who knows, you might want to try anyway.
The dumps are mainly for technical experts who can handle the sheer amount of raw data and process it for their own purposes.
There is, however, a program available for searching the database dump that is worth checking out.
It is free, of course, and was developed by a Wikipedian named "Bluemoose". You can find the download link and a description of its features on the special user page for the "DataBaseSearchTool".
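If you would rather script the processing yourself, here is a minimal sketch that streams a pages-articles dump and pulls external links out of the wikitext. The dump file name and the XML namespace are assumptions; check both against the file you actually download.

```python
import re
import xml.etree.ElementTree as ET

DUMP_FILE = "enwiki-latest-pages-articles.xml"  # hypothetical local file name
NS = "{http://www.mediawiki.org/xml/export-0.3/}"  # namespace may differ by dump version

# Very rough pattern for URLs inside wikitext; not an official grammar.
URL_RE = re.compile(r"https?://[^\s\]|]+")

def external_links(dump_path):
    """Stream the dump and yield (article_title, url) pairs."""
    for event, elem in ET.iterparse(dump_path):
        if elem.tag == NS + "page":
            title = elem.findtext(NS + "title")
            text = elem.findtext(NS + "revision/" + NS + "text") or ""
            for url in URL_RE.findall(text):
                yield title, url
            elem.clear()  # free memory as we go; these dumps are huge

for title, url in external_links(DUMP_FILE):
    if "seobook.com" in url:
        print(title, url)
```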
4. Wikipedia Special Export
Like the Wikipedia dumps, this feature is for technical folks rather than the average marketer. You can export the text and editing history of a particular page or set of pages, wrapped in XML.
The page where you initiate special exports is available at:
http://en.wikipedia.org/wiki/Special:Export
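As a quick illustration, an export can also be requested directly by appending the page title to the Special:Export URL. The sketch below fetches the XML for one article and prints the beginning of its wikitext; the exact XML namespace varies between versions, so the code matches on the local tag name.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Appending the page title to Special:Export returns that page's XML;
# spaces become underscores, as with regular Wikipedia URLs.
title = "Search_engine_optimization"
url = "https://en.wikipedia.org/wiki/Special:Export/" + title

req = urllib.request.Request(url, headers={"User-Agent": "export-sketch/0.1"})
with urllib.request.urlopen(req) as resp:
    tree = ET.parse(resp)

# The export namespace version varies, so match on the local tag name.
for elem in tree.iter():
    if elem.tag.endswith("}text"):
        print((elem.text or "")[:500])  # first 500 characters of the wikitext
        break
```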
5. Wikipedia API (alpha)
An API for accessing information at Wikipedia is in development. The project is currently in the alpha stage, but the API is already accessible to the public.
The API is located at: http://en.wikipedia.org/w/api.php
For documentation and examples, check out the API homepage at the Mediawiki website at: http://www.mediawiki.org/wiki/API
There is a limitation built in: the API returns a maximum of 500 results per request. The limit increases to up to 5,000 results if you spend the time to become a sysop or get your bot authorized. The alternative would be to ask an existing sysop for access to the API through him.
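To illustrate what a call looks like, the following sketch asks the API for the external links recorded for one article. The prop=extlinks module used here is an assumption based on the API documentation; since the API is still in alpha, verify the parameter names against the current docs.

```python
import json
import urllib.parse
import urllib.request

# Query the (alpha) API for external links on one article. The
# "prop=extlinks" module is an assumption; verify the exact
# parameter names at mediawiki.org/wiki/API.
params = urllib.parse.urlencode({
    "action": "query",
    "titles": "Search engine optimization",
    "prop": "extlinks",
    "format": "json",
})
url = "https://en.wikipedia.org/w/api.php?" + params
req = urllib.request.Request(url, headers={"User-Agent": "api-sketch/0.1"})
with urllib.request.urlopen(req) as resp:
    data = json.load(resp)

for page in data["query"]["pages"].values():
    for link in page.get("extlinks", []):
        print(link.get("*") or link.get("url"))
```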
6. Wikipedia Query
The Wikipedia Query feature will probably be replaced by the Wikipedia API eventually. The Query is available at: http://en.wikipedia.org/w/query.php
The documentation is provided on the page itself. Calls are made by simply adding URL parameters. The Query can return results in multiple machine-readable formats for programmatic access, such as XML, JSON, PHP, YAML, and WDDX, as well as in a human-readable format.
The Query does not provide an option to search for external links; you can retrieve internal linking and category structures and more. To find external links, it is unfortunately necessary to request an article's full content and parse the external links out yourself.
An example query can request the full content of the article on search engine optimization; the external links can then be parsed out of the returned wikitext, as sketched below.
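Here is a minimal sketch of that parsing step, independent of how the wikitext was retrieved (query.php, the API, Special:Export, or a dump). The regular expressions are my own rough approximations of wikitext link syntax, not an official grammar.

```python
import re

# Bracketed external links look like [http://example.com optional label];
# bare URLs also occur in article text.
BRACKETED = re.compile(r"\[(https?://[^\s\]]+)(?:\s[^\]]*)?\]")
BARE = re.compile(r"(?<!\[)\bhttps?://[^\s\]|<]+")

def parse_external_links(wikitext):
    """Return a de-duplicated, ordered list of external URLs in wikitext."""
    found = BRACKETED.findall(wikitext) + BARE.findall(wikitext)
    seen, ordered = set(), []
    for url in found:
        url = url.rstrip(".,;)")  # strip trailing punctuation from bare URLs
        if url not in seen:
            seen.add(url)
            ordered.append(url)
    return ordered

sample = "See [http://www.seobook.com SEO Book] and http://example.com/tools."
print(parse_external_links(sample))
```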
The source code for the query tool is available for download, and feature suggestions can be made to the authors. They might be willing to implement an external link feature if they are motivated enough to do so :).
7. Search Engine Queries
You can also find Wikipedia links with ordinary search engine queries, for example a Yahoo! search combining the linkdomain:yourdomain.com and site:en.wikipedia.org parameters, such as this query for links to SearchEngineJournal.com.
For more Wikipedia resources, check out my Wiki Resources user page at Wikipedia itself. It provides references to a number of other Wikipedia tools and resources that you might find useful as well.
Carsten Cumbrowski
Internet marketing and web development resources at Cumbrowski.com, such as Web APIs and resources for Web Services development