Translations

''Note: We switched in September 2015 to using Zanata for the Liberty cycle. Please direct questions to the openstack-i18n mailing list and update the documentation below to fully explain Zanata.''

= Translation, Internationalization and Localization in OpenStack =

OpenStack is committed to broad international support, and as such there must be an ongoing concern with making OpenStack usable for all audiences. This includes proper use of internationalization and localization tools by developers, and high-quality translations for both user-facing messages and documentation.

Translation & Management
Let's start with a working definition: translation is the act of taking the written materials in one language and converting them into another language in the most meaningful way possible. In terms of OpenStack, translation happens on both the written documentation and on strings marked for translation in the projects' codebases.

NOTE: For information on how to prepare your code or documentation for translation, see the section on internationalization below.

Zanata
OpenStack uses a Zanata instance running at https://translate.openstack.org/ as its translation management platform.

Downloading translation files
If you wish to download the translation files (.po files) you can do so by selecting the language you're interested in, then clicking on the name of the project resource you wish to download. In the modal dialog which appears, you can use any of the "download" options depending on your use case.

Translating on the site
Translation is most efficiently done right on Zanata's site. You don't need to download any files or applications to get started.

If your language already exists, select a project, select the "master" version, select your language and then select a document to start translating.

Downloading translation files
You can also download translation files and translate locally.

TODO: Explain exactly how this is done with Zanata.

Release cycle
One of the most challenging aspects of managing translations in an Open Source project is handling the interplay between translators and developers during the release cycle. The key piece of this equation is the "string freeze".

String Freeze
NOTE: OpenStack's string freeze happens at the close of the final milestone in the development cycle, giving translators the entire RC period to update translations.

At a predefined time during the release cycle there will be a "string freeze", which means that after this point strings marked for translation in the codebase can no longer be changed except in the case of critical-priority bugs.

Once the string freeze is in effect, the translation files in Zanata can be assumed to be static, and translation efforts should happen in full force. Translation can of course happen at any time, but during the development process strings may change, so earlier translation effort may end up being wasted.

Any changes during the RC period should be carefully vetted to ensure they do not alter or add translation strings, or else coordinated with translators to ensure that changes are handled appropriately.
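One way to vet a change is to compare the set of marked strings before and after it. The following is a minimal sketch of that idea, not part of the OpenStack tooling: it assumes strings are marked with a function named `_`, and the helper names `marked_strings` and `freeze_violations` are hypothetical.

```python
import ast


def marked_strings(source, markers=("_",)):
    """Return the string literals passed as first argument to a marker call.

    Assumption: translatable strings are marked as _("...") with a plain
    string literal, which is the common convention.
    """
    found = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in markers
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            found.add(node.args[0].value)
    return found


def freeze_violations(old_source, new_source):
    """Report marked strings that a change adds or removes."""
    old = marked_strings(old_source)
    new = marked_strings(new_source)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


# Hypothetical before/after versions of a module touched during the RC period.
before = 'print(_("Volume created"))\n'
after = 'print(_("Volume was created"))\nlog("not marked")\n'
report = freeze_violations(before, after)
```

An empty `added` and `removed` list means the change leaves the translatable strings untouched; anything else should be coordinated with translators.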

Check out http://docs.openstack.org/project-team-guide/release-management.html for more details.

Re-incorporating Translations
The OpenStack Infrastructure team has set up automatic generation of reviews for translations so that they can be re-incorporated with minimal effort at any time. For each project where this is set up in our CI infrastructure, a job runs every day. This job regenerates the original pot file, imports all sufficiently translated files, and then proposes them to the project as patches. Only files in which 75 percent or more of the strings are translated are downloaded.
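The 75 percent rule can be illustrated with a small sketch. This is not the actual CI job; it assumes a catalog is represented as a plain mapping from source string to translated string (empty when untranslated), and the function names are hypothetical.

```python
def translated_ratio(catalog):
    """Fraction of entries with a non-empty translation.

    `catalog` maps each source string (msgid) to its translation (msgstr);
    an empty msgstr means the string is not translated yet.
    """
    if not catalog:
        return 0.0
    done = sum(1 for msgstr in catalog.values() if msgstr)
    return done / len(catalog)


def should_import(catalog, threshold=0.75):
    """Apply the "75 percent or more" rule described above."""
    return translated_ratio(catalog) >= threshold


# Illustrative data only: three of four strings are translated.
german = {
    "Volume": "Datenträger",
    "Instance": "Instanz",
    "Network": "Netzwerk",
    "Flavor": "",
}
```

With this data, `should_import(german)` is true because exactly 75 percent of the strings are translated.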

The list of current open proposed imports is available at review.openstack.org.

Most importantly though, immediately prior to the release of each Release Candidate, and before cutting the Final Release for each version, the translation files should be merged back into their respective projects to make sure they are properly distributed with the release.

At present it is the responsibility of each project's PTL or appointed translation manager to make sure this happens, though OpenStack's release managers, translation team coordinators, etc. are also encouraged to help ensure that this happens smoothly.

Stable Releases and Backports
At present, changes to translations will not be backported to stable release branches. Doing so would require maintaining wholly separate copies of each set of translations and would massively increase the burden on translators.

Internationalization (i18n)
The term internationalization is used to broadly describe coding practices that allow software to be adapted to the linguistic and technical differences of various regions. This includes practices such as marking strings for translation, supporting non-ASCII character sets, etc.

Python Projects (General)
For most of the OpenStack core projects (and any that use Python), the preferred tools for internationalization are gettext and babel (Debian/Ubuntu package name python-pybabel). Getting started is pretty easy:

Adopt oslo.i18n
The first step is to adopt oslo.i18n in your project; see "How to Use oslo.i18n in Your Application or Library".

Extract messages
Once you have some messages to translate, extract them using Babel. The easiest way is to run "python setup.py extract_messages" in, say, the py27 venv.

Configure your project to use Babel to easily create your translation files. First, add `Babel` to your requirements.txt file (or wherever you track dependencies). Second, create a `babel.cfg` file in the root of your project; at its simplest it can contain just this line:

    [python: **.py]

Finally, add the following to your `setup.cfg` file:

    [extract_messages]
    keywords = _ gettext ngettext l_ lazy_gettext
    mapping_file = babel.cfg
    output_file = /locale/ .pot

That will allow you to run `python setup.py extract_messages` and have it automatically generate the base translation resource file for your project.
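For illustration, here is a minimal sketch of what marked strings look like in a module, so that the extractor can find them through keywords such as `_` and `ngettext`. To stay self-contained it uses Python's standard `gettext` module as a stand-in for oslo.i18n; the domain name `myproject` and the `locale` directory are assumptions, not values from any real project.

```python
import gettext

# Assumption: "myproject" and "locale" are placeholder names. With
# fallback=True, gettext returns a pass-through NullTranslations object
# when no compiled catalog is found, so this runs without any .mo files.
translation = gettext.translation("myproject", localedir="locale", fallback=True)
_ = translation.gettext
ngettext = translation.ngettext


def greeting(name):
    # A simple marked string; named placeholders let translators reorder words.
    return _("Welcome, %(name)s") % {"name": name}


def describe_instances(count):
    # Plural-aware marking: the extractor records both forms.
    message = ngettext(
        "%(count)d instance is running",
        "%(count)d instances are running",
        count,
    )
    return message % {"count": count}
```

Without a catalog the functions return the English source text; once a translated catalog is installed for the user's language, the same calls return the translation.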

Now you are ready to merge the generated files into your project (see example review). Note that an initial file needs to be imported into your project for the scripts that interact with the translation site.

Set up Zanata server, import and export of translations
Now you are ready to set up Zanata and the CI infrastructure. Read the Infra manual on how to do it.

Horizon (Django)
Django has built-in internationalization tools that go well beyond the basics of `gettext` to ensure proper Unicode support throughout the entire codebase and to make advanced features more accessible. As such, Horizon uses Django's family of `ugettext` functions from `django.utils.translation`. It is preferable to explicitly import the translation function you wish to use:

    from django.utils.translation import ugettext, ugettext_lazy
    # ..., etc.

For more information on the internationalization tools Django makes available, see the Django i18n Docs.

Documentation (DocBook)
While developer documentation for projects can generally be maintained solely in English, user-oriented documentation such as that produced and maintained by OpenStack's Docs team is also a high priority for translation. This includes installation and administration manuals.

''NOTE: For the first release this does not include API documentation. Typically these are sourced in the `openstack-manuals` project.''

For specifics on translation of OpenStack Documentation, please refer to the Documentation/Translation page.

What To Translate
At present the convention is to translate all user-facing strings. This means API messages, CLI responses, documentation, help text, etc.

See LoggingStandards for information about translating log messages.

Exception text should not be marked for translation, because if an exception occurs there is no guarantee that the translation machinery will be functional.

Localization (L10n)
The term localization is used more specifically than internationalization to cover coding practices that allow software's input and output to adjust to regional variations in style, most notably number and date formatting.

Dates, Numbers, and Other Concerns
Beyond what internationalization accomplishes, the most important aspect to consider is regional differences in the formatting of dates and numbers. For example:

    Dates:
        04/01/2012 == April 1st, 2012 (US)
        04/01/2012 == January 4th, 2012 (UK)

    Numbers:
        1,000.42 == One thousand and 42 hundredths (US)
        1.000,42 == One thousand and 42 hundredths (EU)

Accepting any format and naively passing it into our code would horribly break things. Accepting only one format leaves out large chunks of the world. Therefore, we use localization tools to accept these formats and normalize them into data structures Python can handle universally on input, and to convert them back to the user's expected format for display.
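The normalize-on-input, format-on-output idea can be sketched as follows. This is only an illustration: instead of a real localization library it uses a small hard-coded table of conventions (the `CONVENTIONS` table and all function names are hypothetical), so that it runs regardless of which system locales are installed.

```python
from datetime import date, datetime
from decimal import Decimal

# Hypothetical, hard-coded conventions matching the examples above.
# A real project would obtain these from its localization framework.
CONVENTIONS = {
    "US": {"date": "%m/%d/%Y", "group": ",", "decimal": "."},
    "UK": {"date": "%d/%m/%Y", "group": ",", "decimal": "."},
    "EU": {"date": "%d/%m/%Y", "group": ".", "decimal": ","},
}


def parse_date(text, region):
    """Normalize a regional date string into a datetime.date."""
    return datetime.strptime(text, CONVENTIONS[region]["date"]).date()


def parse_number(text, region):
    """Normalize a regional number string into a Decimal."""
    conv = CONVENTIONS[region]
    plain = text.replace(conv["group"], "").replace(conv["decimal"], ".")
    return Decimal(plain)


def format_number(value, region):
    """Convert a Decimal back into the region's display format."""
    conv = CONVENTIONS[region]
    us_style = f"{value:,.2f}"  # grouping with "," and "." as decimal point
    # Swap both separators in a single pass so they cannot clobber each other.
    table = str.maketrans({",": conv["group"], ".": conv["decimal"]})
    return us_style.translate(table)


same_day_us = parse_date("04/01/2012", "US")
same_day_uk = parse_date("04/01/2012", "UK")
amount = parse_number("1.000,42", "EU")
```

The same input string yields two different dates depending on the region, which is why the region must be known before parsing rather than guessed from the text.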

Another less common (for OpenStack) issue related to localization revolves around name formats, which vary culturally. The western style of "first name" and "last name" doesn't fit for many cultural naming conventions. This isn't something a software tool can account for, so for problems such as these the best solution is to simply accept the broadest range of inputs (e.g. a single "name" field).

Horizon (Django)
Horizon has excellent localization tools available since it is built on top of the Django web framework. Most conversions happen automatically when the localization framework is active. Full support for a localized user dashboard experience is a high-priority feature.

Other OpenStack Projects
Python's `locale` and `gettext` modules offer most of the tools necessary to localize a Python project with some effort. More information on this will be added in the future.

Translation infrastructure
The translation infrastructure and workflow are documented on the Translations/Infrastructure page.