Difference between revisions of "I18n/TranslatableStrings"

Revision as of 09:25, 10 June 2014

Horizon translatable strings rules

Comments and suggestions: https://etherpad.openstack.org/p/i18n_translatable_strings

Use Comments for translators as much as possible

Give translators hints about a translatable string. To do so, add a comment prefixed with the Translators: keyword on the line preceding the string, e.g.:

def my_view(request):
    # Translators: This is a famous quote from Arthur C. Clarke
    output = ugettext("Any sufficiently advanced technology is indistinguishable from magic.")

The comments need to be placed just before the first translatable string if you want to see them appear in the po file:

some_message = ugettext(
    # Translators: This line is just before the first string.
    # Note the trailing space: adjacent string literals are joined as-is.
    "This is a huge sentence, it is long and needs to be split on several "
    "lines."
)

You cannot always use comments (see this Django bug), so you might need contextual markers instead.


Use contextual markers on short strings to avoid ambiguity

Strings may be the same in English, but different in other languages. English, for example, has no grammatical gender, and sometimes the noun and verb forms of a word are identical. To make it possible to localize these correctly, we can add “context” (known in gettext as msgctxt) to differentiate two otherwise identical strings. Django provides a pgettext() function for this. In English there is only one form for "None"; other languages can have 2, 3, 6 or more different ways of writing "None", depending on the word or object it refers to:

ugettext("None")  # Never do this.
pgettext("Quantity of Images", u"None")  # Do this instead

Example where the same sentence in English has 2 meanings and needs contextual markers because in another language they would be written differently:

ugettext(u"Shut Off Instance") # Never do this.
# Do this:
pgettext("Action to perform (the instance is currently running)",
    u"Shut Off Instance")
pgettext("Past action (this is the status of the instance)",
    u"Shut Off Instance")

Always ask yourself: "can the same word(s) or sentence mean something else in other cases?" If the answer is "yes", you need contextual markers.

NOTE: Because of a pgettext_lazy bug that is fixed in Django 1.6 but still present in older versions (we support back to version 1.4), it is a good habit to always use unicode strings for translatable strings, as shown in the example.

If the same word is used elsewhere in the code, always try to reuse the same context string ("Quantity of Images") so that the same translation is used. If the context strings differ, translators have to translate the same word several times for similar contexts.
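To illustrate why the context string matters, here is a minimal sketch in plain Python. It does not use Django; toy_pgettext and TOY_CATALOG_FR are hypothetical names, and the French strings are illustrative only, not taken from the Horizon catalog. It only models the idea that a catalog entry is looked up by the pair (context, message) rather than by the message alone:

```python
# Toy catalog keyed on (context, msgid). Hypothetical data for illustration;
# a real catalog is built from the po files, not written by hand like this.
TOY_CATALOG_FR = {
    ("Action to perform (the instance is currently running)",
     "Shut Off Instance"): "Éteindre l'instance",
    ("Past action (this is the status of the instance)",
     "Shut Off Instance"): "Instance éteinte",
}


def toy_pgettext(context, message):
    """Return the translation for (context, message), or the message itself."""
    return TOY_CATALOG_FR.get((context, message), message)


action_label = toy_pgettext(
    "Action to perform (the instance is currently running)",
    "Shut Off Instance")
status_label = toy_pgettext(
    "Past action (this is the status of the instance)",
    "Shut Off Instance")
# A context string that does not match any entry falls back to the English
# message, which is why context strings should be reused verbatim.
unmatched = toy_pgettext("Status of the instance", "Shut Off Instance")
```

The same English string yields two different translations, and a context string that differs even slightly from an existing one produces a new, untranslated entry.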

Contextual markers can also be used in Django templates:

{% trans "May" context "month name" %}
{% blocktrans with name=user.username context "greeting" %}Hi {{ name }}{% endblocktrans %}

Another very common example where contextual markers are needed is the word "Free":

English: "Free Software" -> French: "Logiciel libre" (Free -> Libre)
English: "Free Beer" -> French: "Bière gratuite" (Free -> Gratuite)
English: "Free Trip" -> French: "Voyage gratuit" (Free -> Gratuit)

So if you write the word "Free" on its own somewhere in your code, contextual markers are compulsory. Note that even with contextual markers, the word "Free" can never be used in a sentence where a string formatting variable names the thing that is free, because "Free" must agree with the gender of that thing and the translation cannot work. This is why it is usually better to write the full sentence and avoid dynamic composition, which can make translations fail completely. If you cannot do without dynamic composition, it is compulsory to use string formatting variables as explained next.

Use string formatting variables, never perform string concatenation

Translators do not see the concatenation process and only see a string with a trailing space. If you need a variable part, always code with variables:

_("Image details: ") + variable + "."  # Never do this.
_("Image details: %(variable)s.") % {'variable': variable}  # Do this instead.
# Or the following if the variable's content needs a translation.
# It works only if the content of the variable is already present somewhere
# in the code as a translatable string. See the ugettext_noop or ugettext_lazy
# documentation; they are a way to mark strings held in variables for translation.
_("Image details: %(variable)s.") % {'variable': _(variable)}

Use blocktrans instead of trans in Django templates when you need variables:

{% trans "Image details: " %}{{ variable }}. {# Never do this #}
{% blocktrans %}Image details: {{ variable }}.{% endblocktrans %} {# Do this instead #}

Note that the {{ variable }} variable needs to exist in the template context. In case you need to evaluate template expressions such as filters or accessing object attributes, since you can’t do that within the {% blocktrans %} block, you need to bind the expression to a local variable first:

{% blocktrans with revision.created_date|timesince as timesince %}
{{ revision }} {{ timesince }} ago
{% endblocktrans %}

{% blocktrans with project.name as name %}Delete {{ name }}?{% endblocktrans %}

Also, some languages need to put the variable somewhere other than at the end of the string, which is impossible with string concatenation.

Example: "The %(name)s image is too large for this volume." can be translated to "L'image %(name)s est trop volumineuse pour ce volume.", where the variable moves to a different position in the translation. The variable could not move if the string were built as "The " + image_name + " image is too large", so it would not be properly translatable in many languages. However, avoid dynamic composition as much as possible, because even with string formatting variables it can fail in sentences such as "Here we serve free %(product_name)s": many languages need to adapt "free" to the gender of "%(product_name)s".
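The reordering described above can be sketched in plain Python. This is not Django code: toy_gettext and TOY_MESSAGES_FR are hypothetical stand-ins for the real catalog lookup, and the image name "cirros" is just sample data. The sketch shows that a named placeholder is filled from a dictionary, so the translated string is free to place it anywhere:

```python
# Hypothetical one-entry catalog using the example strings from this page.
TOY_MESSAGES_FR = {
    "The %(name)s image is too large for this volume.":
        "L'image %(name)s est trop volumineuse pour ce volume.",
}


def toy_gettext(message):
    """Return the translated template, or the original if none is known."""
    return TOY_MESSAGES_FR.get(message, message)


# Translate the whole template first, then fill in the variable.
template = toy_gettext("The %(name)s image is too large for this volume.")
message = template % {"name": "cirros"}

# Named placeholders require a mapping; passing a bare value raises TypeError.
try:
    "Image details: %(variable)s." % "cirros"
    bare_value_accepted = True
except TypeError:
    bare_value_accepted = False
```

Because the lookup happens on the full template, the translator sees one complete sentence and decides where %(name)s belongs.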

Use ungettext for pluralisation

Never perform pluralization yourself by deciding that 1 is singular and 0 or >1 is plural, because this fails in many languages other than English. For example, 0 is plural in English but singular in French. Some languages have 2, 3, 4, 5 or 6 plural forms (e.g. there are 5 plural forms in Irish (ga) and 6 in Arabic (ar)). See "Plural Forms" and "Translating plural forms" for more information.
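As a rough illustration of the point about 0, the following plain-Python sketch models a per-language plural rule as a function that maps a number to the index of the form to use. The function names and the message strings are hypothetical; only the English and French rules are shown, and a real application should rely on ungettext and the catalog's own plural rule instead:

```python
def english_plural_index(n):
    """English rule: only 1 is singular (form 0); everything else is form 1."""
    return 0 if n == 1 else 1


def french_plural_index(n):
    """French rule: 0 and 1 are singular (form 0); above 1 is form 1."""
    return 1 if n > 1 else 0


def toy_ungettext(forms, plural_index, n):
    """Pick the form selected by the language's rule and fill in the count."""
    return forms[plural_index(n)] % {"count": n}


# Hypothetical message forms; the French words happen to match the English.
EN_FORMS = ("%(count)d image", "%(count)d images")
FR_FORMS = ("%(count)d image", "%(count)d images")

english_zero = toy_ungettext(EN_FORMS, english_plural_index, 0)
french_zero = toy_ungettext(FR_FORMS, french_plural_index, 0)
french_two = toy_ungettext(FR_FORMS, french_plural_index, 2)
```

A hand-written "if number == 1" test hard-codes the English rule, which is exactly what breaks for French at 0 and for languages with more than two forms.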

When pluralization is required, always use:

ungettext(
    'Singular sentence',
    'Plural sentence',
    number
)

Do not call ungettext(variable_singular_string, variable_plural_string, number) with variable_singular_string and variable_plural_string defined elsewhere in the code. The variable strings would not appear in the pot file as a plural entry; each would instead be an independent string marked for translation. When strings are marked independently for translation and passed to ungettext afterwards, only 2 forms exist in the pot file, and the many languages that require more plural forms cannot be handled properly.

Reuse existing strings when possible

Do not modify existing translatable strings needlessly, because we would lose the work translators have already done. If your code moves a translatable string elsewhere, reuse it as much as possible. The rules on this page apply to new strings or to modified strings that require a new translation. There is a quick way to check whether your string is already translated:

Choose one or two languages (other than English) whose coverage is as close as possible to 100% here: https://www.transifex.com/projects/p/horizon/ The less the language looks like English, the better (even if you do not understand it), as the test below for checking whether a translation exists will show. From your Horizon directory:

. .venv/bin/activate  # the first call to ./run_tests.sh (when you run unittests) installed this virtual environment.
python manage.py shell
>>> from django.utils.translation import activate, ugettext
>>> activate('zh_CN')
>>> print ugettext("None")
wu
>>> print ugettext("Instance")
云主机
>>> activate('fr')
>>> print ugettext("None")
Aucun
>>> print ugettext("Instance")
Instance
>>>

If the returned string differs from the English one, a translation exists. Note that "Instance" is translated in French as well; you cannot tell from the output because the French translation is spelled the same as the English string. This is why it is best to check with two different languages. You can perform the same test with pgettext (contextual markers) and ungettext (plural forms) too.