SEMrush found in 2017 that 75% of websites have hreflang implementation errors, and four years on it doesn’t appear this has improved by much.
In my own 2020 study of 1,926 Salesforce Commerce Cloud websites, I found that of those with hreflang implemented, 63% contained errors.
Localization and user issues, such as an Arabic website not reading from right to left, likely increase these figures, as well.
In this column, you’ll learn about the different facets of a multilingual website that you’ll need to plan for, how to structure your hreflang tags, and how to successfully implement your hreflang strategy.
Selecting Target Countries & Planning Site Architecture
When planning your online international expansion and deciding on target markets, you also need to consider how you’re going to target them.
There are four main ways in which the URL structure can reflect internationalization:
Using Different ccTLDs
Using different ccTLD domains. This is considered the best practice for targeting Russia and China, in particular. An example of this in practice is Hartley Botanic.
International Subdomains
Using a single domain, typically a gTLD, with language-targeted subdomains. An example of this in practice is CNN, which uses a subdomain to differentiate between the U.S. and U.K. English sites.
International Subdirectories
Again using a single domain, typically a gTLD, different language and content zones are targeted through a subdirectory. An example of this in practice is BeatsByDre.
Anecdotally speaking, this is oftentimes a favored approach by developers.
Parameter Appending
I don’t recommend implementing this method, but I do see it a lot. This is where the URL is appended with a ?lang=de parameter or similar.
It’s also worth noting that some third-party tools will flag parameter-based hreflang URLs as errors, as their own bots don’t recognize the ?parameter strings.
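To make the four structures easier to compare side by side, here is a minimal Python sketch that prints one URL per approach for a single market. The brand name "example", the .com gTLD, and the exact subdomain and subdirectory naming are assumptions for illustration only; real sites vary.

```python
from typing import Dict


def localized_urls(brand: str, lang: str, country: str) -> Dict[str, str]:
    """Return one illustrative URL per internationalization structure."""
    locale = f"{lang}-{country}"
    return {
        # Separate country-code domain, e.g. .de for Germany
        "cctld": f"https://{brand}.{country}/",
        # Single gTLD with a locale subdomain
        "subdomain": f"https://{locale}.{brand}.com/",
        # Single gTLD with a locale subdirectory
        "subdirectory": f"https://{brand}.com/{locale}/",
        # Parameter appending (the approach advised against above)
        "parameter": f"https://{brand}.com/?lang={lang}",
    }


for structure, url in localized_urls("example", "de", "de").items():
    print(f"{structure:13}{url}")
```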
Structuring Hreflang Tags & Targeting Users
Hreflang always starts with targeting language but then can consist of further variables such as:
- Language: “en”, “es”, “zh”, or a registered value.
- Script: “Latn”, “Cyrl”, or other ISO 15924 codes.
- Region: ISO 3166 codes, or UN M.49 codes.
- Variant: Such as “1901”, “nedis”, or another registered variant subtag.
- Extension: Single letter followed by additional subtags.
Regardless of how targeted your tags are, they must also follow the below format:
{language}-{extlangtag}-{script}-{region}-{variant}-{extension}
Probably the most common interpretation of the above that the majority of us will be familiar with is {language}-{region}.
However, if you do a lot of work in Chinese-speaking countries you’re more likely to use {language}-{script}-{region}, such as zh-Hans-cn (Simplified Chinese for the Chinese mainland).
The Internet Engineering Task Force (IETF) specifications can be found here.
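As a rough illustration of the {language}-{script}-{region} shape, the Python sketch below splits a value into its parts. It deliberately covers only that common subset (no extended language, variant, or extension subtags, and no x-default), and it checks shape only, not whether a code is actually registered. The names parse_hreflang and HreflangParts are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Shape check only: language, optional 4-letter script, optional region.
HREFLANG_RE = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|\d{3}))?$",
    re.IGNORECASE,
)


class HreflangParts(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


def parse_hreflang(value: str) -> Optional[HreflangParts]:
    """Split a hreflang value into parts, or return None if the shape is off."""
    match = HREFLANG_RE.match(value.strip())
    if match is None:
        return None
    script = match.group("script")
    region = match.group("region")
    return HreflangParts(
        language=match.group("language").lower(),
        script=script.title() if script else None,
        region=region.upper() if region else None,
    )


print(parse_hreflang("zh-Hans-cn"))
print(parse_hreflang("en_GB"))  # underscore is not a valid separator
```

Matching is case-insensitive, and the parts are normalized to the conventional casing (lowercase language, title-case script, uppercase region) purely for readability.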
Language Tag
The supported language code comes from the ISO 639-1 classification list. However, in some instances, the extended language tag {extlangtag} can be used on its own.
Extended Language Tag
{extlangtag} tags are subtags that can be used to specify selected languages that are closely identified with an existing primary language subtag. Examples of these are:
- zh-yue: Cantonese Chinese.
- ar-afb: Gulf Arabic.
The extended language tags come from the ISO 639-3 classification list.
There is also a code within this classification list for English, eng, which is why en-eng works when implemented as English for England (but not as intended).
Script
The script subtag was introduced in RFC 4646, and its values come from the ISO 15924 classification list. Only one script subtag can be used per hreflang tag.
Examples of these include:
- uz-Cyrl: Uzbek in the Cyrillic script.
- uz-Latn: Uzbek in the Latin script.
- zh-Hans: Chinese in the simplified script.
- zh-Hant: Chinese in the traditional script.
Region
Region codes come from the ISO 3166-1 alpha-2 list and are used along with the language tag.
Common mistakes include attempting to target “RW” as the rest of the world when it’s the country code for Rwanda, and “LA” as Latin America, when it’s Laos.
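A small lookup can catch these slips before they ship. The Python sketch below is an assumption-laden illustration: the table holds only the handful of codes mentioned in this column (it is not a full ISO 3166-1 check), and it naively treats the last subtag as the region. Note that RW and LA are valid codes, so a warning there is a prompt to double-check intent, not proof of an error.

```python
from typing import Optional

# Hand-picked codes that are often misused; not a complete ISO 3166-1 list.
REGION_WARNINGS = {
    "UK": "is not an officially assigned ISO 3166-1 code; the United Kingdom is GB",
    "EU": "is not an officially assigned ISO 3166-1 country code",
    "RW": "is Rwanda, not 'rest of world'",
    "LA": "is Laos, not Latin America",
}


def region_warning(hreflang: str) -> Optional[str]:
    """Return a warning if the trailing subtag is a commonly misused region."""
    parts = hreflang.split("-")
    if len(parts) < 2:
        return None  # language-only value, nothing to check
    region = parts[-1].upper()  # simplification: assume last subtag is the region
    note = REGION_WARNINGS.get(region)
    return f"{hreflang}: region {region} {note}" if note else None


for code in ["en-UK", "es-LA", "en-GB"]:
    print(code, "->", region_warning(code))
```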
Variant
The variant subtag can be used to indicate dialects, or script variations, not covered by the language, extended language tag, or region tag.
It’s highly unlikely that you’ll come across variant subtags unless you work in niche and specialized areas. Examples of these variants are:
- sl-SI-nedis: The Nadiza dialect of Slovenian, as spoken in Slovenia.
- de-DE-1901: The variant of German orthography dating from the 1901 reforms, as used in Germany.
Extension
Extension subtags allow for extensions to the language tag, such as the extension tag “u,” which has been registered by the Unicode Consortium to add information about the language or locale behavior.
It’s highly unlikely you will ever need to use these.
When implemented correctly, it should look something like this:
<link rel="alternate" hreflang="en-gb" href="https://website.co.uk" />
<link rel="alternate" hreflang="en-us" href="https://website.com" />
<link rel="alternate" hreflang="es-es" href="https://website.es" />
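If the tags are produced by a template or build step rather than typed by hand, the logic can be as simple as the Python sketch below. The function name hreflang_links is hypothetical, and the mapping reuses the example domains from above.

```python
from html import escape
from typing import Dict


def hreflang_links(alternates: Dict[str, str]) -> str:
    """Render one <link rel="alternate"> tag per hreflang/URL pair."""
    return "\n".join(
        # escape() guards against quotes or ampersands in codes and URLs
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in alternates.items()
    )


alternates = {
    "en-gb": "https://website.co.uk",
    "en-us": "https://website.com",
    "es-es": "https://website.es",
}
print(hreflang_links(alternates))
```

The same full set of tags, including the self-referencing one, would go on every page in the group.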
Other Hreflang Considerations
Targeting Language or Countries
Some of the issues I see with hreflang implementations arise because they aren’t planned in conjunction with the business’s goals and objectives for the markets it’s targeting.
For example, launching a Spanish website with the code hreflang="es" won’t just target Spain; it will also provide a localized version for a number of other Spanish-speaking countries across Latin America and the Caribbean, as well as the Spanish-speaking population of the United States.
Getting the hreflang implementation correct is important in ensuring users are delivered:
- Content in the correct localized language.
- Website templates that cater for a user experience that they’re used to (e.g., in Baidu, items open in new tabs).
- Products, services, and offers relevant to their country (as well as being legal within their country), as failing this can lead to bad customer experience, a lost customer, and negative reviews.
Return Tags
If page A links to page B through hreflang, then page B must link back to page A. If not, your hreflang might not be read correctly.
These errors are highlighted in Google Search Console, so it’s important that profiles are set up to cover each of the localized site versions.
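The return-tag rule is easy to check on crawl data. The Python sketch below assumes you already have, for each page, the hreflang annotations it declares (the pages structure and the function name are hypothetical), and it reports every one-way link.

```python
from typing import Dict, List, Tuple


def missing_return_tags(pages: Dict[str, Dict[str, str]]) -> List[Tuple[str, str]]:
    """pages maps a page URL to the {hreflang: alternate URL} it declares.

    Returns (source, target) pairs where source points at target but
    target does not point back. Targets absent from pages count as missing.
    """
    missing = []
    for page, alternates in pages.items():
        for target in alternates.values():
            if target == page:
                continue  # self-reference needs no return tag
            if page not in pages.get(target, {}).values():
                missing.append((page, target))
    return missing


pages = {
    "https://website.co.uk": {
        "en-gb": "https://website.co.uk",
        "en-us": "https://website.com",
    },
    # The U.S. page forgets to reference the U.K. page.
    "https://website.com": {"en-us": "https://website.com"},
}
print(missing_return_tags(pages))
```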
Absolute URLs
There is a lot of debate about absolute vs. relative URLs. I always defer to Ruth Burr Reedy’s Moz Whiteboard Friday on the topic.
When it comes to hreflang annotations, however, there is no debate – these need to be absolute URLs referenced in the href attribute.
It’s also important to note that when working to target international users:
- Don’t use IP-based redirects, as they can stop Google from crawling and indexing every localized version (remember that Google crawls primarily from the U.S.).
- If you’re using a .com with one of the structures above, don’t redirect your root domain to your “main website”; Google will use the hreflang annotations to point users to the correct site.
- Only use x-default to point to a language selector page/default page for users worldwide. A great example of this in practice is IKEA, which behaves as a language selector, but x-default can also be used to indicate a default fall-back version of the website for global users.
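Two of the points above, absolute URLs and x-default, lend themselves to a quick automated check. The Python sketch below is one possible way to do it. The function names are hypothetical, and treating a missing x-default as a note rather than an error is my assumption, since whether you need one depends on your setup.

```python
from typing import Dict, List
from urllib.parse import urlsplit


def is_absolute(url: str) -> bool:
    """True only for URLs with an http(s) scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def audit_alternates(alternates: Dict[str, str]) -> List[str]:
    """Flag relative hrefs and note a missing x-default entry."""
    problems = []
    for code, url in alternates.items():
        if not is_absolute(url):
            problems.append(f"{code}: href {url!r} is not an absolute URL")
    if "x-default" not in alternates:
        problems.append("note: no x-default for a selector or fallback page")
    return problems


for problem in audit_alternates({"en-gb": "https://website.co.uk", "es-es": "/es/"}):
    print(problem)
```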
Implementing Hreflang for Your Multilingual Website
When implementing hreflang, you must consider the countries that you’re targeting and whether Google is the dominant search engine for users in those countries.
Do All Search Engines Support Hreflang?
There haven’t been many developments in hreflang support over recent years; however, Yandex officially introduced XML sitemap support in August 2020.
Search Engine | Does it support hreflang? |
Google | Yes, through both HTML and XML Sitemap |
Yandex | Yes, through HTML and XML Sitemap |
Baidu | No, need to use the HTML Meta Language Tag |
Naver | No, need to use the HTML Meta Language Tag |
Bing | No, need to use the HTML Meta Language Tag |
Seznam | Yes, through HTML |
The status of other search engines remains the same.
Because of Bing’s market share (which is widely debated) globally, best practice means always including both hreflang and the HTML Meta Language Tag as part of an international technical specification.
<meta http-equiv="content-language" content="en-gb">
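For the engines listed above as reading hreflang from an XML sitemap, the annotations sit inside each url entry as xhtml:link elements. The Python sketch below builds such a sitemap with the standard library. The function name is hypothetical and the domains are the placeholder examples used earlier, so treat it as a starting point rather than a finished generator.

```python
import xml.etree.ElementTree as ET
from typing import Dict

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def hreflang_sitemap(alternates: Dict[str, str]) -> str:
    """Build a sitemap where every URL lists the full set of alternates."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    # dict.fromkeys de-duplicates URLs while keeping their order
    for page_url in dict.fromkeys(alternates.values()):
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for code, href in alternates.items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": code, "href": href},
            )
    return ET.tostring(urlset, encoding="unicode")


print(hreflang_sitemap({
    "en-gb": "https://website.co.uk",
    "en-us": "https://website.com",
}))
```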
Sometimes Google Ignores Errors If It Can Work It Out
In April 2018, Patrick Stox wrote this great article looking at errors in hreflang that Google can work out (sometimes) and display content as though the tag implementation met best practice.
One of these is the use of incorrect targeting codes.
Eoghan Henn sparked the conversation on Twitter in follow-up to SEO pros seeing the hreflang targeting en-UK and {language}-EU working, even though they’re incorrect.
If Google can interpret the implementation, then it has a vested interest in trying to display the right content to users.
However, as Google’s John Mueller goes on to say:
“My guess is it doesn’t actually work (would be interesting to test though)… IMO you don’t want to implement something in the *hope* that search engines can guess what you mean; it should be exact & consistent instead.”
Image Credit
Screenshot taken by author, March 2021