3.6 Further details on the PO file format

It happens that some lines, usually whitespace or comments, follow the very last entry of a PO file. Such lines are not part of any entry, and will be dropped when the PO file is processed by the tools, or may disturb some PO file editors.

The remainder of this section may be safely skipped by those using a PO file editor, yet it may be interesting for everybody to have a better idea of the precise format of a PO file. On the other hand, those wishing to modify PO files by hand should carefully continue reading on.

An empty untranslated-string is reserved to contain the header entry with the meta information (see Filling in the Header Entry). This header entry should be the first entry of the file. The empty untranslated-string is reserved for this purpose and must not be used anywhere else.

Each of untranslated-string and translated-string respects the C syntax for a character string, including the surrounding quotes and embedded backslashed escape sequences, except that universal character escape sequences (\u and \U) are not allowed. When the time comes to write multi-line strings, one should not use escaped newlines. Instead, a closing quote should follow the last character on the line to be continued, and an opening quote should resume the string at the beginning of the following PO file line. For example:

msgid ""
"Here is an example of how one might continue a very long string\n"
"for the common case the string represents multi-line output.\n"

In this example, the empty string is used on the first line, to allow better alignment of the H from the word ‘Here’ over the f from the word ‘for’. In this example, the msgid keyword is followed by three strings, which are meant to be concatenated. Concatenating the empty string does not change the resulting overall string, but it is a way for us to comply with the necessity of msgid to be followed by a string on the same line, while keeping the multi-line presentation left-justified, as we find this to be a cleaner disposition. The empty string could have been omitted, but only if the string starting with ‘Here’ was promoted on the first line, right after msgid.2 It was not really necessary either to switch between the two last quoted strings immediately after the newline ‘\n’, the switch could have occurred after any other character, we just did it this way because it is neater.

One should carefully distinguish between end of lines marked as ‘\ninside quotes, which are part of the represented string, and end of lines in the PO file itself, outside string quotes, which have no incidence on the represented string.

Outside strings, white lines and comments may be used freely. Comments start at the beginning of a line with ‘#’ and extend until the end of the PO file line. Comments written by translators should have the initial ‘#’ immediately followed by some white space. If the ‘#’ is not immediately followed by white space, this comment is most likely generated and managed by specialized GNU tools, and might disappear or be replaced unexpectedly when the PO file is given to msgmerge.

For a PO file to be valid, no two entries without msgctxt may have the same untranslated-string or untranslated-string-singular. Similarly, no two entries may have the same msgctxt and the same untranslated-string or untranslated-string-singular.


Footnotes

(2)

This limitation is not imposed by GNU gettext, but is for compatibility with the msgfmt implementation on Solaris.