5.1.7 Input Encodings

Currently, the following input encodings are available.

cp1047
This input encoding works only on EBCDIC platforms; conversely, the other input encodings don't work with EBCDIC. The file cp1047.tmac is loaded by default at start-up.
latin-1
This is the default input encoding on non-EBCDIC platforms; the file latin1.tmac is loaded at start-up.
latin-2
To use this encoding, either say ‘.mso latin2.tmac’ at the very beginning of your document or use ‘-mlatin2’ as a command-line argument for groff.
latin-5
For Turkish. To use this encoding, either say ‘.mso latin5.tmac’ at the very beginning of your document or use ‘-mlatin5’ as a command-line argument for groff.
latin-9 (latin-0)
This encoding is intended (at least in Europe) to replace the latin-1 encoding. The main difference from latin-1 is that latin-9 contains the Euro character. To use this encoding, either say ‘.mso latin9.tmac’ at the very beginning of your document or use ‘-mlatin9’ as a command-line argument for groff.
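
As a rough illustration of the list above, the following shell sketch maps an encoding name to the corresponding macro option. The function name encoding_flag is hypothetical and not part of groff; the sketch also assumes that latin-5 is loaded through latin5.tmac, following the naming pattern of the other encodings.

```shell
# Hypothetical helper (not part of groff): print the macro option
# that selects a given input encoding.
encoding_flag() {
    case "$1" in
        # Loaded by default on the matching platform, so no option is needed.
        latin-1|cp1047)   printf '%s\n' "" ;;
        latin-2)          printf '%s\n' "-mlatin2" ;;
        # Assumption: latin-5 uses latin5.tmac.
        latin-5)          printf '%s\n' "-mlatin5" ;;
        latin-9|latin-0)  printf '%s\n' "-mlatin9" ;;
        *)  echo "unknown input encoding: $1" >&2
            return 1 ;;
    esac
}
```

It could be used as ‘groff $(encoding_flag latin-9) -Tps ...’; leaving the substitution unquoted lets an empty result disappear from the command line.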

Note that some characters of an input encoding may not be available for a particular output device. For example, saying

     
     groff -Tlatin1 -mlatin9 ...

fails if you use the Euro character in the input. This limitation typically applies only to devices which have a limited set of output glyphs (e.g. -Tascii and -Tlatin1); for other devices it is usually sufficient to install proper fonts which contain the necessary glyphs.
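
One way to anticipate this problem is to check the input for the Euro character before choosing a limited device. The sketch below assumes the file is known to be latin-9 encoded, where the Euro sign is the single byte 0xA4 (in latin-1 the same byte is the generic currency sign, so the check says nothing about latin-1 input). The function name has_latin9_euro is hypothetical.

```shell
# Hypothetical check (not part of groff): succeed if a latin-9 encoded
# file contains the Euro sign, i.e. the byte 0xA4 (octal 244).
has_latin9_euro() {
    # LC_ALL=C makes grep compare raw bytes rather than decoded characters.
    LC_ALL=C grep -q "$(printf '\244')" "$1"
}
```

A wrapper script might call this function and warn before running groff with -Tlatin1 or -Tascii.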

Due to the importance of the Euro glyph in Europe, the groff package now comes with a PostScript font called freeeuro.pfa which provides various glyph shapes for the Euro. In other words, latin-9 encoding is supported for the -Tps device out of the box (latin-2 isn't).

By its very nature, -Tutf8 supports all input encodings; -Tdvi supports both latin-2 and latin-9 if the command-line option -mec is also given to load the file ec.tmac (which switches to the EC fonts).
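
The device-specific rule for -Tdvi can be sketched as a small shell function that assembles the options. The name groff_options is hypothetical. Placing -mec before the encoding macro is an assumption, since the text above does not state an order.

```shell
# Hypothetical helper (not part of groff): print the options for a
# device name (e.g. dvi) and an encoding macro option (e.g. -mlatin2).
groff_options() {
    device=$1
    enc_flag=$2
    opts="-T$device"
    case "$device:$enc_flag" in
        # For -Tdvi, latin-2 and latin-9 need ec.tmac (the EC fonts).
        # Assumption: -mec goes before the encoding macro.
        dvi:-mlatin2|dvi:-mlatin9) opts="$opts -mec" ;;
    esac
    if [ -n "$enc_flag" ]; then
        opts="$opts $enc_flag"
    fi
    printf '%s\n' "$opts"
}
```

For instance, ‘groff_options dvi -mlatin2’ prints ‘-Tdvi -mec -mlatin2’, whereas ‘groff_options ps -mlatin9’ prints ‘-Tps -mlatin9’ with no extra macro file.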