

Recognizing Escape Sequences

Okay, great, so escape-sequence lines help distinguish the control characters and text characters that make up an escape sequence from those that don’t. But what exactly makes up an escape sequence, anyway?

The Ecma-35 / ISO/IEC 2022 standard defines an escape sequence to be a sequence of characters beginning with ESC, with a final byte in the range x30-x7E, and any number (including zero) of intermediate bytes in the range x20-x2F. Table 3.1 has been provided as a reference for finding which characters match which codes.

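      x2X  x3X  x4X  x5X  x6X  x7X
xX0   SPC  0    @    P    `    p
xX1   !    1    A    Q    a    q
xX2   "    2    B    R    b    r
xX3   #    3    C    S    c    s
xX4   $    4    D    T    d    t
xX5   %    5    E    U    e    u
xX6   &    6    F    V    f    v
xX7   '    7    G    W    g    w
xX8   (    8    H    X    h    x
xX9   )    9    I    Y    i    y
xXA   *    :    J    Z    j    z
xXB   +    ;    K    [    k    {
xXC   ,    <    L    \    l    |
xXD   -    =    M    ]    m    }
xXE   .    >    N    ^    n    ~
xXF   /    ?    O    _    o    DEL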

Table 3.1

So, for instance, the following is a valid escape sequence.

: Esc $ ( C

‘$’ and ‘(’ have code values x24 and x28, and so are valid intermediate bytes; ‘C’ has the value x43, and so terminates the escape sequence.
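
The definition above can be sketched in a few lines of Python. This is only an illustration of the byte ranges given by the standard, not Teseq's actual implementation; the function name match_escape_sequence is made up for this example.

```python
ESC = 0x1B

def match_escape_sequence(data: bytes, start: int = 0):
    """Return the index just past an escape sequence at `start`, or None.

    Hypothetical helper: follows the Ecma-35 shape described in the text
    (ESC, zero or more intermediate bytes, exactly one final byte).
    """
    if start >= len(data) or data[start] != ESC:
        return None
    i = start + 1
    # Zero or more intermediate bytes in the range x20-x2F.
    while i < len(data) and 0x20 <= data[i] <= 0x2F:
        i += 1
    # Exactly one final byte in the range x30-x7E.
    if i < len(data) and 0x30 <= data[i] <= 0x7E:
        return i + 1
    return None
```

Applied to the bytes for ‘Esc $ ( C’, the function consumes all four bytes. Applied to ‘Esc [ 3 ; 31 m’, it stops after the ‘[’, since x5B falls in the final-byte range.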

You may have noticed that a lot of the examples of escape sequences in this document don’t actually follow this format. For instance,

: Esc [ 3 ; 31 m

According to the definition we just gave, ‘[’ should be the final byte of an escape sequence. So why does Teseq keep going until it reaches the ‘m’?

The answer is that the escape sequence does end with the ‘[’; but the combination ‘Esc [’ invokes a control named CONTROL SEQUENCE INTRODUCER (CSI). The CSI control marks the beginning of a different kind of sequence, called a “control sequence”. Control sequences are described by the Ecma-48 / ISO/IEC 6429 standard, which treats them as a concept distinct from escape sequences; however, Teseq treats both types of sequences as “escape sequences”.

A control sequence starts with the two-character CSI escape sequence ‘Esc [’, followed by an optional sequence of parameter bytes in the range x30-x3F, an optional sequence of intermediate bytes in the range x20-x2F (the same range as intermediate bytes in a regular escape sequence), and a final byte in the range x40-x7E. The set of standard control sequence functions is defined in Ecma-48 / ISO/IEC 6429.
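
As a rough sketch, the structure just described maps directly onto a regular expression. The names CONTROL_SEQUENCE and split_control_sequence are hypothetical, and the pattern only handles the two-character ‘Esc [’ form of CSI; it is not how Teseq itself parses its input.

```python
import re

CONTROL_SEQUENCE = re.compile(
    rb"\x1b\["          # CSI, written as the escape sequence Esc [
    rb"([\x30-\x3f]*)"  # optional parameter bytes
    rb"([\x20-\x2f]*)"  # optional intermediate bytes
    rb"([\x40-\x7e])"   # one final byte
)

def split_control_sequence(data: bytes):
    """Return (parameters, intermediates, final) or None if no match."""
    m = CONTROL_SEQUENCE.match(data)
    if m is None:
        return None
    return m.groups()
```

For the bytes of ‘Esc [ 3 ; 31 m’, this yields the parameter bytes ‘3;31’, no intermediate bytes, and the final byte ‘m’.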

When used in accordance with the standard, the parameter bytes provide a semicolon-separated list of numeric parameters to the control function being invoked. These affect the details of the control function’s behavior, but not which control function is being invoked:

: Esc [ 1 m
& SGR: SELECT GRAPHIC RENDITION
" Set bold text.
: Esc [ 0 m
& SGR: SELECT GRAPHIC RENDITION
" Clear graphic rendition to defaults.

Both sequences end with the same sequence of intermediate bytes (none) and final byte; both invoke the SGR control function. But the first one indicates that following text should be rendered boldface, while the second indicates that text rendering should be restored to its default settings.
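
Splitting the parameter bytes into a list of numbers might look like the following. This is a simplified sketch: the helper name parse_parameters is invented here, and anything that is not a plain run of digits (including an empty parameter) is reported as None rather than interpreted.

```python
def parse_parameters(param_bytes: bytes):
    """Split parameter bytes on ';' into a list of ints.

    Simplifying assumption: non-numeric or empty parameters become None.
    """
    if not param_bytes:
        return []
    return [int(p) if p.isdigit() else None for p in param_bytes.split(b";")]
```

With this sketch, the parameter bytes of ‘Esc [ 1 m’ and ‘Esc [ 0 m’ parse to the lists [1] and [0] respectively, while the final byte ‘m’ is the same in both.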

Intermediate bytes, however, together with the final byte, do affect the meaning of the function invoked. Currently, Ecma-48 / ISO/IEC 6429 only defines functions for either no intermediate bytes, or a single space character (x20) as the intermediate byte.

: Esc [ A
& CUU: CURSOR UP
: Esc [ Spc A
& SR: SCROLL RIGHT
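
One way to picture this is a lookup keyed on the intermediate bytes and the final byte together. The table below is a hypothetical illustration containing only the functions named in this section; it is far from the complete set in the standard.

```python
# Hypothetical lookup table: (intermediate bytes, final byte) -> label.
CONTROL_FUNCTIONS = {
    (b"", b"m"): "SGR: SELECT GRAPHIC RENDITION",
    (b"", b"A"): "CUU: CURSOR UP",
    (b" ", b"A"): "SR: SCROLL RIGHT",
}

def control_function_name(intermediates: bytes, final: bytes):
    """Return the label for a control function, or None if not listed."""
    return CONTROL_FUNCTIONS.get((intermediates, final))
```

Note that the parameter bytes play no part in the key, which matches the point made earlier: parameters change the details of a function, not which function is selected.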

Ecma-48 / ISO/IEC 6429 describes an alternate representation for CSI: the 8-bit byte value x9B. Teseq does not currently treat that value specially, nor any of the other high-value bytes from the C1 set of control functions. This is because whether or not those bytes indicate control functions depends on which character encoding is in use. Future versions of Teseq may support an option to interpret these forms as well, at which time control sequences using the single-byte CSI control will probably be rendered like:

: CSI [ 1 m
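
The dependence on character encoding can be seen with a short Python experiment. It is only meant to illustrate why the byte x9B is ambiguous; the helper decodes_as_utf8 is a name made up for this example.

```python
def decodes_as_utf8(data: bytes) -> bool:
    """Return True if `data` is a valid UTF-8 byte string."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# In ISO 8859-1 (Latin-1), byte x9B is the code point U+009B, the C1 control CSI.
as_latin1 = b"\x9b".decode("latin-1")

# In UTF-8, x9B is only valid as a continuation byte inside a longer
# character, for example the second byte of U+015B (s with acute accent).
s_acute = "\u015b".encode("utf-8")
```

Here as_latin1 is the single character U+009B, while s_acute is the two-byte string xC5 x9B, and a lone x9B does not decode as UTF-8 at all. A program that sees x9B therefore cannot tell, without knowing the encoding, whether it is looking at CSI or at part of an ordinary character.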

Ecma-48 / ISO/IEC 6429 also describes another kind of sequence called “control strings”. These are not interpreted by Teseq; the control characters involved (for example, ‘SOS/^X’ and ‘ST/^\’) will be printed on control-character lines, and any text characters will be displayed on text lines.

Future versions of Teseq will probably not depart from this display behavior; however, support for some common interpretations for control strings may be added, in which case a label line and/or description line might follow the control string, describing its usual interpretation.

