Language on websites

Identifying the language of a web page, as well as the language of its individual parts, helps to ensure that screen readers will correctly pronounce the content.

For an overview of this issue, see Language in our IT Accessibility Checklist.

Techniques

Defining language in HTML

In HTML, the language of content is identified using the lang attribute, whose value is a standard BCP 47 language tag. For example, the following tag identifies the entire HTML document as English:

<html lang="en">

If a paragraph, table cell, list item, or any other block of text is in a language other than the default language of the page, that block too must be marked up with a lang attribute. For example, imagine that our English document contains a short paragraph in French:

<p lang="fr">Vaut mieux prévenir que guérir.</p>
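To show how the two snippets above fit together, here is a minimal sketch of a complete page. The title, the surrounding English sentences, and the inline <span> example are illustrative assumptions, not part of the original guidance. The sketch relies on the fact that lang is a global attribute, so it can be placed on any element, and that elements without their own lang inherit the language of their nearest ancestor.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Proverbs (hypothetical example page)</title>
</head>
<body>
  <!-- No lang attribute here: this paragraph inherits "en" from <html> -->
  <p>An old French proverb advises caution:</p>

  <!-- Block-level change: the whole paragraph is French -->
  <p lang="fr">Vaut mieux prévenir que guérir.</p>

  <!-- Inline change: only the word inside the <span> is French -->
  <p>The word <span lang="fr">guérir</span> means "to cure."</p>
</body>
</html>
```

Placing lang on the smallest element that wraps all of the foreign-language text keeps the rest of the page in its default language.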

Defining language in content management systems

WordPress, Drupal, and other content management systems all have a rich content editor for authoring content. Most, if not all, of these products automatically add a lang attribute to the <html> element on all pages within a website. The default language of the website can be specified in the website settings.

Unfortunately, few if any rich content editors provide a mechanism for identifying the language of parts within a page. Therefore, the only way to specify the language of parts of the page is to switch to your editor’s HTML view and add lang attributes to the outer HTML element of any content in another language, as explained in the preceding section. After saving the page, inspect your source code to confirm that your editor preserved the code you added. If your editor is stripping out lang attributes after you’ve added them, talk to your website administrator, as this is likely a configurable setting.
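As a rough sketch, the markup in an editor’s HTML view might look like the following once the lang attribute has been added by hand. The sentences and the choice of a <blockquote> are hypothetical; the exact markup your editor produces will differ.

```html
<!-- Hypothetical content as it might appear in an editor's HTML view -->
<p>Our office also welcomes Spanish-speaking visitors:</p>

<!-- lang is added to the outermost element that wraps the Spanish text -->
<blockquote lang="es">
  <p>Bienvenidos a nuestra oficina.</p>
</blockquote>
```

After saving, view the source of the published page. If the <blockquote> appears without lang="es", the editor has stripped the attribute.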