Creating a Multilingual Site

This post is a collection of useful tips for creating a multilingual site. I will keep it updated as I learn more, and I hope it serves as a good reference, or at least a starting point, for a range of site internationalisation concepts.

User Interface

Language Selection

A fundamental aspect of a multilingual site is how the user switches languages.

The link text of a language selector should be written in the language of the target page. This courtesy helps users quickly spot the link to their own language if they land on a page in a language they cannot read.

A title attribute can then be used to specify the language of the target page in the language of the current page. This text will appear as a tooltip when the user hovers over the language option.

Link to English on a Japanese page: English

Link to Japanese on an English page: 日本語

Link to English on a Japanese page: <a title="英語" lang="en">English</a>

Link to Japanese on an English page: <a title="Japanese" lang="ja">日本語</a>
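If the selector is generated by a script or template, the rule above could be sketched in Python roughly as follows. This is only an illustration: the LANGUAGE_NAMES table, the language_link helper and the href values are hypothetical names I have made up for the example, not part of this site.

```python
from html import escape

# Hypothetical lookup table: LANGUAGE_NAMES[target][current] is the name of
# the target language written in the language of the current page.
LANGUAGE_NAMES = {
    "en": {"en": "English", "ja": "英語"},
    "ja": {"en": "Japanese", "ja": "日本語"},
}


def language_link(target: str, current: str, href: str) -> str:
    """Build a language selector link pointing at the `target` translation."""
    text = LANGUAGE_NAMES[target][target]    # link text in the target language
    title = LANGUAGE_NAMES[target][current]  # tooltip in the current language
    return (
        f'<a href="{escape(href)}" title="{escape(title)}" '
        f'lang="{escape(target)}">{escape(text)}</a>'
    )


# Link to Japanese on an English page (the href is a placeholder).
print(language_link("ja", "en", "/ja/"))
# <a href="/ja/" title="Japanese" lang="ja">日本語</a>
```

Keeping the names in one table means adding a language is a matter of adding a row rather than editing every page.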

Third-Party Components

Most websites have share buttons from a variety of social media sites. Make sure these components are localised for the page language rather than defaulting to English on every localisation. I use Twitter, Facebook and LinkedIn buttons on this site, all of which support a wide variety of languages.

Here are some example Twitter buttons:

<a class="twitter-share-button"
href="https://twitter.com/intent/tweet?text=Creating%20a%20Multilingual%20Website&hashtags=web"
data-lang="en">Tweet</a>

<a class="twitter-share-button"
href="https://twitter.com/intent/tweet?text=多言語のサイトを作ること&hashtags=ウエブ"
data-lang="ja">ツイート</a>
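When the share links are generated rather than written by hand, it is safer to let a library encode the query string than to paste raw text into the href. Here is a minimal Python sketch, assuming the text and hashtags parameters used in the buttons above and that several hashtags are comma-separated; tweet_intent_url is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode


def tweet_intent_url(text: str, hashtags: list[str]) -> str:
    """Build a tweet intent URL with a percent-encoded query string."""
    query = urlencode(
        {"text": text, "hashtags": ",".join(hashtags)},
        quote_via=quote,  # encode spaces as %20 rather than +
    )
    return "https://twitter.com/intent/tweet?" + query


print(tweet_intent_url("Creating a Multilingual Website", ["web"]))
# https://twitter.com/intent/tweet?text=Creating%20a%20Multilingual%20Website&hashtags=web

# Non-ASCII text is encoded as UTF-8 and then percent-encoded.
print(tweet_intent_url("多言語のサイトを作ること", ["ウエブ"]))
```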

Accessibility

lang Attribute

The lang attribute specifies the language of an HTML element and is fundamental to creating an accessible multilingual website. Screen readers rely on the attribute to determine which voice to use to read the page.

It should be set on the <html> element of every page. Its value is then inherited by every element on the page.

<html lang="en">
</html>

It should also be set on any element whose language differs from that of its ancestors.

<p lang="en">
  This paragraph contains some Japanese text:
  <span lang="ja">日本語</span>
</p>

For an example of what the user experience may be like on pages without the lang attribute set, see this quick and clear screen reader demo. The attribute is simple to add, but leaving it out can make your pages nearly unusable for those who rely on screen readers.
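Since the attribute is so easy to forget, a small automated check can help. The following is a rough sketch using Python's standard html.parser module; the LangChecker class and page_language function are hypothetical names for illustration, and it only inspects the <html> element, not nested elements.

```python
from html.parser import HTMLParser


class LangChecker(HTMLParser):
    """Records the lang attribute of the first <html> start tag, if any."""

    def __init__(self) -> None:
        super().__init__()
        self.seen_html = False
        self.html_lang = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "html" and not self.seen_html:
            self.seen_html = True
            self.html_lang = dict(attrs).get("lang")


def page_language(markup: str):
    """Return the lang value of the <html> element, or None if it is missing."""
    checker = LangChecker()
    checker.feed(markup)
    return checker.html_lang


print(page_language('<html lang="en"><body></body></html>'))  # en
print(page_language("<html><body></body></html>"))            # None
```

A check like this could run over the generated pages as part of a build step, failing when any page returns None.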

Search Engine Optimisation

hreflang

To help your international pages appear in search engine results pages (SERPs) in the correct regions, search engines need to be able to work out which language each page is in and which pages are translations of one another.

This is achieved by specifying hreflang links in the document <head> or alternatively in your site’s sitemap.xml.

I use the <head> method on this site, which I have documented here.

The hreflang attribute links translations of the same page to each other. Two rules apply:

  • Each translation must link to itself.
  • Linked translations must reference each other. If page A links to page B, then page B must link to page A.

<!-- This English page -->
<link rel="alternate" hreflang="en"
href="https://www.darrenlester.com/blog/geostationary-satellite-animations">
<!-- Japanese translation of this page -->
<link rel="alternate" hreflang="ja"
href="https://www.darrenlester.com/ブログ/静止衛星のアニメ">

<!-- This Japanese page -->
<link rel="alternate" hreflang="ja"
href="https://www.darrenlester.com/ブログ/静止衛星のアニメ">
<!-- English translation of this page -->
<link rel="alternate" hreflang="en"
href="https://www.darrenlester.com/blog/geostationary-satellite-animations">
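One way to satisfy both rules without thinking about them is to generate the same set of links from a single table of translations and emit it on every page in the set. A minimal Python sketch of that idea follows; hreflang_links is a hypothetical helper, not how this site is actually built.

```python
from html import escape


def hreflang_links(translations: dict[str, str]) -> str:
    """Return one <link rel="alternate"> tag per translation.

    Emitting the full set on every translation means each page links to
    itself and every pair of pages references each other.
    """
    return "\n".join(
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}">'
        for lang, url in translations.items()
    )


translations = {
    "en": "https://www.darrenlester.com/blog/geostationary-satellite-animations",
    "ja": "https://www.darrenlester.com/ブログ/静止衛星のアニメ",
}

# The same output goes into the <head> of both the English and Japanese page.
print(hreflang_links(translations))
```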

Specifying the links in the <head> like this is fine for this website, as it only supports English and Japanese. For sites supporting a large number of languages, I feel it makes more sense to use the sitemap.xml method. The link information is for search engines, not users, so moving it into the sitemap reduces the number of bytes sent in each HTTP response and gives a small improvement in page load speed.
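For comparison, here is a rough Python sketch of generating the sitemap form. It assumes the xhtml:link convention that, as far as I know, search engines such as Google document for sitemaps: each translation gets its own url entry listing every alternate, including itself. The build_sitemap function and its input shape are hypothetical, and the URLs are written out exactly as given.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(pages: list[dict[str, str]]) -> str:
    """Build sitemap XML from a list of {language: url} translation sets."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for translations in pages:
        for url in translations.values():
            entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
            # Every entry lists all alternates, itself included.
            for lang, alternate in translations.items():
                ET.SubElement(
                    entry,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": lang, "href": alternate},
                )
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([{
    "en": "https://www.darrenlester.com/blog/geostationary-satellite-animations",
    "ja": "https://www.darrenlester.com/ブログ/静止衛星のアニメ",
}]))
```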

Structured Data

Schema.org defines a number of structured data properties related to language which can be used to help describe your international content. I am not sure how widely these properties are currently supported, but they may be useful in the future.
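As one illustration, Schema.org's inLanguage property can be attached to a CreativeWork such as a BlogPosting. The sketch below builds a JSON-LD snippet in Python; choosing inLanguage and BlogPosting is my own assumption for the example, and blog_posting_json_ld is a hypothetical helper.

```python
import json


def blog_posting_json_ld(headline: str, language: str, url: str) -> str:
    """Return JSON-LD describing a blog post and the language it is written in."""
    data = {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "inLanguage": language,  # a language code such as "en" or "ja"
        "url": url,
    }
    # ensure_ascii=False keeps Japanese text readable in the output.
    return json.dumps(data, ensure_ascii=False, indent=2)


print(blog_posting_json_ld(
    "静止衛星のアニメ",
    "ja",
    "https://www.darrenlester.com/ブログ/静止衛星のアニメ",
))
```

The resulting JSON would go inside a script element of type application/ld+json in the page.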

Style

Styling by Language

Elements can be styled by language using a [lang] attribute selector or the :lang pseudo-class. See this W3C article for the differences between the two methods. Styling by language is especially important for languages which do not use the Roman alphabet, as their scripts often render poorly in the fonts commonly used for English.

Code Snippets

URL Decoding and Encoding

There are plenty of websites for encoding and decoding URLs, but I find it handy to be able to do so from a terminal as well. Here are some aliases that use Python 3 for URL encoding and decoding.

alias urlencode='python3 -c "import sys, urllib.parse; \
print(urllib.parse.quote_plus(sys.argv[1]))"'

alias urldecode='python3 -c "import sys, urllib.parse; \
print(urllib.parse.unquote_plus(sys.argv[1]))"'

urlencode ブログ # %E3%83%96%E3%83%AD%E3%82%B0

urldecode %E3%83%96%E3%83%AD%E3%82%B0 # ブログ

Conclusion

I hope this post is helpful. If you have any questions, thoughts or tips, please do share them in the comments.