

4.6 Special Comments preceding Keywords

In C programs, strings are often passed to functions from the printf family. What is special about these format strings is that they can contain format specifiers introduced with %. Assume we have the code:

printf (gettext ("String `%s' has %d characters\n"), s, strlen (s));

A possible German translation for the above string might be:

"%d Zeichen lang ist die Zeichenkette `%s'"

A C programmer, even one who cannot speak German, will recognize that something is wrong here. The order of the two format specifiers has changed, but the order of the arguments in the printf call has not. This will most probably lead to problems, because the length of the string is now treated as the address of a string.

To prevent errors at runtime caused by translations, the msgfmt tool can check statically whether the arguments in the original and the translation string match in type and number. If this is not the case and the ‘-c’ option has been passed to msgfmt, msgfmt will give an error and refuse to produce a MO file. Thus consistent use of ‘msgfmt -c’ will catch the error, so that it cannot cause problems at runtime.

To keep the word order of the German translation above and still be correct, one would have to write

"%2$d Zeichen lang ist die Zeichenkette `%1$s'"

The routines in msgfmt know about this special notation.
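
The effect of the numbered notation can be tried out directly with a short sketch. The helper name describe_string is hypothetical and not part of gettext, and the sketch assumes a C library that supports the POSIX %n$ argument numbering (ISO C alone does not guarantee it). The length is cast to int so that it matches %d.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats S and its length into BUF using FMT,
   which may be the original msgid or a translation of it.  */
int
describe_string (char *buf, size_t size, const char *fmt, const char *s)
{
  /* Argument 1 is always the string, argument 2 always its length.
     A translation may refer to them in any order via %1$s and %2$d.  */
  return snprintf (buf, size, fmt, s, (int) strlen (s));
}
```

Called with the format "%2$d Zeichen lang ist die Zeichenkette `%1$s'" and the string "abc", this fills the buffer with "3 Zeichen lang ist die Zeichenkette `abc'", although the arguments are still passed in the original order.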

Because not all strings in a program are format strings, it is not useful for msgfmt to test every string in the .po file. Doing so could cause problems, because a string might contain what looks like a format specifier even though it is never passed to printf.

Therefore xgettext adds a special tag to those messages it thinks might be a format string. There is no absolute rule for this, only a heuristic. In the .po file the entry is marked using the c-format flag in the #, comment line (see The Format of PO Files).

The careful reader might now object that this, too, can cause problems: the heuristic might guess wrong. This is true, and therefore xgettext knows about a special kind of comment which lets the programmer take over the decision. If the xgettext program finds a comment containing the words xgettext:c-format on the same line as the gettext keyword or on the line immediately preceding it, it will mark the string with the c-format flag in any case. This kind of comment should be used when xgettext does not recognize a string as a format string although it really is one and should be tested. Please note that when the comment is on the same line as the gettext keyword, it must come before the string to be translated. Also note that a comment such as xgettext:c-format applies only to the first string on the same or the next line, not to multiple strings.

This situation happens quite often. The printf function is often called with strings which do not contain a format specifier. Of course one would normally use fputs, but it does happen. In this case xgettext does not recognize the string as a format string. But what happens if the translation introduces a valid format specifier? The printf function will try to access one of the parameters, but none exists because the original code does not pass any.
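
A minimal sketch of how such calls might be marked follows. The function names print_done and print_ready and the message texts are made up for illustration, and the sketch assumes that <libintl.h> provides gettext, as it does with the GNU C library. The first function places the comment on the line preceding the gettext keyword, the second on the same line, before the string.

```c
#include <libintl.h>
#include <stdio.h>

/* The comment sits on the line immediately preceding the keyword.  */
int
print_done (void)
{
  /* xgettext:c-format */
  return printf (gettext ("Processing finished.\n"));
}

/* The comment sits on the same line as the keyword, before the string.  */
int
print_ready (void)
{
  return printf (gettext (/* xgettext:c-format */ "Ready.\n"));
}
```

With either comment in place, both messages should carry the c-format flag in the .po file, so that ‘msgfmt -c’ can reject a translation which adds a format specifier for which no argument is passed.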

xgettext could of course also make the wrong decision the other way round, i.e. a string marked as a format string is actually not a format string. In this case msgfmt might give too many warnings and would prevent the .po file from being compiled into a MO file. The method to prevent this wrong decision is similar to the one used above, only the comment to use must contain the string xgettext:no-c-format.
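
As an illustration of the opposite case, consider a message in which a percent sign is followed by text that happens to look like a format directive ("% d" is a valid one). The function name discount_label and the message text are hypothetical, and <libintl.h> is again assumed to provide gettext. Whether the heuristic would actually flag this particular string is not guaranteed; the comment simply settles the question.

```c
#include <libintl.h>

/* The string is only ever displayed literally, never passed to printf
   as a format, so it is marked explicitly as not being a format string.  */
const char *
discount_label (void)
{
  /* xgettext:no-c-format */
  return gettext ("Save 50% during the sale");
}
```

Such a string would then be printed with fputs or passed as an argument to a "%s" specifier, not used as a format itself.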

If a string is marked with c-format and this is not correct, the user can find out who is responsible for the decision. See Invoking the xgettext Program to see how the --debug option can be used for solving this problem.

