For a totally multi-lingual distribution, there are many things to translate beyond output messages.
gettext offers a complete toolset for
translating messages output by C programs. Perl scripts and shell
scripts also need to be translated. Although some hooks exist today
by which this can be done, they are not integrated as well as they
should be.
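As an illustration of the C case, the following is a minimal sketch of how a program might route its output messages through gettext. It assumes a system whose C library ships libintl.h (as GNU libc does). The domain name "example-pkg", the catalog directory "/usr/share/locale", and the helper names init_i18n and greeting are placeholders chosen for this example only.

```c
#include <libintl.h>
#include <locale.h>

/* Shorthand commonly used to mark translatable strings. */
#define _(String) gettext(String)

/* Hypothetical helper: sets up message translation for this program.
   "example-pkg" and "/usr/share/locale" are placeholder values.
   Returns the name of the selected text domain, or NULL on failure. */
static const char *init_i18n(void)
{
    setlocale(LC_ALL, "");                              /* adopt the user's locale */
    bindtextdomain("example-pkg", "/usr/share/locale"); /* where catalogs are looked up */
    return textdomain("example-pkg");                   /* make it the default domain */
}

/* Hypothetical helper: returns the translated greeting when a catalog
   provides one, and the original English string otherwise. */
static const char *greeting(void)
{
    return _("Hello, world!");
}
```

A program would call init_i18n() once at startup and then print greeting() with puts() or printf(). When no catalog is installed for the user's language, gettext simply hands back the string it was given.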
Some programs, such as autoconf or bison, are able
to produce other programs (or scripts). Even if the generating
programs themselves are internationalized, the generated programs they
produce may need internationalization of their own, and this indirect
internationalization could be automated right from the generating
program. In practice, generating and generated programs
can usually be internationalized independently, as the efforts involved are
fairly orthogonal.
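Some programs carry textual tables that may themselves need translation.
For example, the recode program holds descriptions, drawn from an RFC,
which it is able to reconstruct at execution.
Since these descriptions are extracted from the RFC by mechanical means,
translating them properly would require a prior translation of the RFC
itself.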
For example, one may want gcc to allow diacriticized characters in
identifiers or to use translated keywords; ‘rm -i’ might accept
something other than ‘y’ or ‘n’ for replies, etc. Even if the program will
eventually produce most of its output in foreign languages, one has
to decide whether the input syntax, option values, etc., are to be
localized or not.
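For localized replies of the ‘y’ or ‘n’ kind, POSIX already exposes the affirmative pattern of the current locale through nl_langinfo. The sketch below shows one way a program might use it; the function name is_affirmative is hypothetical, and the code assumes the caller has already selected a locale with setlocale.

```c
#include <langinfo.h>
#include <locale.h>
#include <regex.h>
#include <stdbool.h>

/* Hypothetical helper: returns true when `reply` matches the affirmative
   pattern of the current locale. The caller is assumed to have called
   setlocale() beforehand. */
static bool is_affirmative(const char *reply)
{
    regex_t re;
    bool yes;

    /* YESEXPR is an extended regular expression, "^[yY]" in the C locale. */
    if (regcomp(&re, nl_langinfo(YESEXPR), REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    yes = (regexec(&re, reply, 0, NULL, 0) == 0);
    regfree(&re);
    return yes;
}
```

Whether a given program should accept localized replies at all remains the design decision described above; this only shows that the mechanism is available.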
As we already stressed, translation is only one aspect of locales.
Other internationalization aspects are system services and are handled
in GNU libc. Many attributes
are needed to define a country's cultural
conventions. Besides the country's native
language, these attributes include the formatting of the date and time,
the representation of numbers, the symbols for currency, etc. These local
rules are termed the country's locale. The locale represents the knowledge
needed to support the country's native attributes.
There are a few major areas which may vary between countries and
hence define what a locale must describe. The following list helps
put multi-lingual messages into the proper context of other tasks
related to locales. See the GNU libc manual for details.
Time of the day may be noted as hh:mm, hh.mm,
or otherwise. Some locales require time to be specified in 24-hour
mode rather than as AM or PM. Further, the nature and yearly extent
of the Daylight Saving correction vary widely between countries.
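As a small illustration of locale-dependent time notation, the standard strftime function can be asked for the locale's preferred time-of-day layout instead of a hard-coded one. This is a sketch only; the wrapper name format_time_of_day is hypothetical.

```c
#include <locale.h>
#include <time.h>

/* Hypothetical helper: formats the time of day held in `when` using the
   preferred notation of the current LC_TIME locale.
   Returns the number of characters written, or 0 if `buf` is too small. */
static size_t format_time_of_day(const struct tm *when, char *buf, size_t size)
{
    /* %X expands to the locale's time representation; a program that
       hard-codes "%H:%M:%S" would ignore the user's conventions. */
    return strftime(buf, size, "%X", when);
}
```

A caller would typically run setlocale(LC_TIME, "") first and pass the result of localtime(). The actual layout then depends on which locales are installed on the system.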
     12,345.67       English
     12.345,67       German
      12345,67       French
     1,2345.67       Asia
Some programs could go further and use different unit systems, like
English units or Metric units, or even take into account variants
about how numbers are spelled in full.
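The separators that distinguish the notations above can be queried at run time with the standard localeconv function. The following sketch reports them for the current locale; the function name describe_number_format is hypothetical.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: writes a short description of the numeric
   conventions of the current LC_NUMERIC locale into `buf`.
   Returns the value of snprintf (length of the full description). */
static int describe_number_format(char *buf, size_t size)
{
    const struct lconv *lc = localeconv();

    return snprintf(buf, size,
                    "decimal point \"%s\", thousands separator \"%s\"",
                    lc->decimal_point, lc->thousands_sep);
}
```

Note that printf-style functions already use the locale's decimal point for conversions such as "%.2f" once setlocale(LC_NUMERIC, "") has been called; digit grouping is not applied by the standard conversions.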
gettext provides the means for developers and users to
easily change the language that the software uses to communicate with
the user.
These areas of cultural conventions are called locale categories. The term is unfortunate; “locale aspects” or “locale feature categories” would be better, because each “locale category” describes an area or task that requires localization. The concrete data that describes the cultural conventions for such an area and for a particular culture is also called a locale category. In this sense, a locale is composed of several locale categories: the locale category describing the codeset, the locale category describing the formatting of numbers, the locale category containing the translated messages, and so on.
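In C, each locale category corresponds to an LC_* constant that can be set or queried separately. The sketch below lists the locale in effect for a handful of categories; the helper names are hypothetical, and LC_MESSAGES is assumed to be available, as it is on POSIX systems.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: returns the name of the locale currently in
   effect for one category, without changing it. */
static const char *category_locale(int category)
{
    const char *name = setlocale(category, NULL);
    return name != NULL ? name : "(unknown)";
}

/* Hypothetical helper: prints one line per category to `out` and
   returns the number of categories listed. */
static int print_locale_categories(FILE *out)
{
    static const struct {
        int id;
        const char *label;
    } categories[] = {
        { LC_CTYPE,    "LC_CTYPE"    },  /* codeset, character classes */
        { LC_NUMERIC,  "LC_NUMERIC"  },  /* formatting of numbers */
        { LC_TIME,     "LC_TIME"     },  /* dates and times */
        { LC_COLLATE,  "LC_COLLATE"  },  /* sorting order */
        { LC_MONETARY, "LC_MONETARY" },  /* currency */
        { LC_MESSAGES, "LC_MESSAGES" },  /* translated messages (POSIX) */
    };
    int count = (int)(sizeof categories / sizeof categories[0]);

    for (int i = 0; i < count; i++)
        fprintf(out, "%-12s %s\n", categories[i].label,
                category_locale(categories[i].id));
    return count;
}
```

Because the categories are independent, a user may, for instance, keep messages in one language while using another country's number and date conventions.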
Components of locale outside of message handling are standardized in
the ISO C standard and the POSIX:2001 standard (also known as the SUSv3
specification). GNU libc
fully implements these, and most other modern systems provide
more or less reasonable support for at least some of the missing components.