PO File Entries (GNU gettext utilities)

Next: Workflow flags, Up: The Format of PO Files [Contents][Index]

3.1 What an entry looks like ¶

A PO file is made up of many entries, each entry holding the relation between an original untranslated string and its corresponding translation. All entries in a given PO file usually pertain to a single project, and all translations are expressed in a single target language. One PO file entry has the following schematic structure:

white-space
#  translator-comments
#. extracted-comments
#: reference...
#, flag...
#| msgid previous-untranslated-string
msgid untranslated-string
msgstr translated-string

The general structure of a PO file should be well understood by the translator. When using PO mode, very little has to be known about the format details, as PO mode takes care of them for her.

A simple entry can look like this:

#: lib/error.c:116
msgid "Unknown system error"
msgstr "Error desconegut del sistema"

Entries begin with some optional white space. Usually, when generated through GNU gettext tools, there is exactly one blank line between entries. Then comments follow, on lines all starting with the character #. There are two kinds of comments: those which have some white space immediately following the # - the translator comments -, which comments are created and maintained exclusively by the translator, and those which have some non-white character just after the # - the automatic comments -, which comments are created and maintained automatically by GNU gettext tools. Comment lines starting with #. contain comments given by the programmer, directed at the translator; these comments are called extracted comments because the xgettext program extracts them from the program’s source code. Comment lines starting with #: contain references to the program’s source code. Comment lines starting with #, contain flags; more about these below. Comment lines starting with #| contain the previous untranslated string for which the translator gave a translation.

All comments, of either kind, are optional.

References to the program’s source code, in lines that start with #:, are of the form file_name:line_number or just file_name. If the file_name contains spaces. it is enclosed within Unicode characters U+2068 and U+2069.

After white space and comments, entries show two strings, namely first the untranslated string as it appears in the original program sources, and then, the translation of this string. The original string is introduced by the keyword msgid, and the translation, by msgstr. The two strings, untranslated and translated, are quoted in various ways in the PO file, using " delimiters and \ escapes, but the translator does not really have to pay attention to the precise quoting format, as PO mode fully takes care of quoting for her.

The msgid strings, as well as automatic comments, are produced and managed by other GNU gettext tools, and PO mode does not provide means for the translator to alter these. The most she can do is merely deleting them, and only by deleting the whole entry. On the other hand, the msgstr string, as well as translator comments, are really meant for the translator, and PO mode gives her the full control she needs.

The previous-untranslated-string is optionally inserted by the msgmerge program, at the same time when it marks a message fuzzy. It helps the translator to see which changes were done by the developers on the untranslated-string.

The comment lines beginning with #, are special because they are not completely ignored by the programs as comments generally are. The comma separated list of flags is used by the msgfmt program to give the user some better diagnostic messages. Currently there are two forms of flags defined: workflow flags and sticky flags.