3.1 Character types

The Guile Ncurses library uses two basic character types: simple characters and complex characters. Simple characters are the native Guile characters, and complex characters are those used to interact with the NCurses library. When using this library, a programmer will often have to convert simple characters to complex characters and vice versa, so it is important to understand how the two differ and where each applies.

Simple characters are the Guile native character type.

In older versions of Guile, such as 1.6.x and 1.8.x, characters were limited to 8 bits. The lower 128 characters were ASCII, and the upper 128 characters had a meaning based on the locale of the system. If the locale were set to ISO-8859-1, for example, the upper 128 characters would be common accented letters used in western European languages. In almost all cases, each character took up the same space on a character cell terminal.

In newer versions of Guile, such as 2.0.x, the native Guile character type is a Unicode code point, a 32-bit number. Unicode includes characters for most of the world's languages, but non-ASCII characters bring challenges. Double-width characters, such as hiragana and katakana, take up two character cells on a character cell terminal because of their complexity. Combining characters, such as accents, aren’t intended to stand alone; they modify the previous character in a string.
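The relationship between characters, code points, and combining characters can be explored with plain Guile, without any ncurses procedures. The following is a minimal sketch that assumes Guile 2.0 or later; the variable names are chosen only for illustration.

```scheme
;; Plain Guile (2.0 or later); nothing from the ncurses library is used.

;; Characters and code points convert back and forth.
(define combining-tilde (integer->char #x303)) ; U+0303 COMBINING TILDE

(char->integer #\x)                       ; => 120

;; Guile reports the Unicode general category of a character.
;; Combining characters fall in the "mark" categories Mn, Mc and Me.
(char-general-category #\n)               ; => Ll (lowercase letter)
(char-general-category combining-tilde)   ; => Mn (nonspacing mark)

;; A string counts code points, not terminal cells: this string has
;; two characters, although a terminal would draw it in a single cell.
(define n-with-tilde (string #\n combining-tilde))
(string-length n-with-tilde)              ; => 2
```

The last expression is the reason string length cannot be used to predict how many cells a string occupies on screen.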

In the terminology of Unicode, the integer that maps to a character is a code point, and documents that refer to Unicode code points typically write them like this: U+XXXX, where XXXX is a hexadecimal number of at least four digits.
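As an illustration of the U+XXXX notation, a Guile character can be formatted this way with a few standard string procedures. The procedure name char->code-point-string is hypothetical and is not part of Guile or of this library; this is only a sketch.

```scheme
;; Hypothetical helper: write a character's code point in U+XXXX form.
(define (char->code-point-string c)
  (let* ((hex (string-upcase (number->string (char->integer c) 16)))
         ;; string-pad truncates when asked for fewer characters than
         ;; the input has, so never request fewer digits than exist.
         (width (max 4 (string-length hex))))
    (string-append "U+" (string-pad hex width #\0))))

(char->code-point-string #\x)                      ; => "U+0078"
(char->code-point-string (integer->char #x3042))   ; => "U+3042" (hiragana A)
(char->code-point-string (integer->char #x1F600))  ; => "U+1F600"
```

Code points below U+1000 are padded with zeros to four digits, and code points above U+FFFF are written with as many digits as they need.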

simple characters

A simple character is a standard Guile character, such as #\x. It is also referred to as an unrendered character. It is unrendered because it has no associated color or attributes.

simple strings

A simple string, also called an unrendered string, is a standard Guile string, such as "hello", which is a sequence of unrendered characters.

complex characters

A rendered, complex character usually consists of a standard character, zero to three combining characters, attribute information, and color information. The attribute information describes whether the character is bold, dim, reverse video, etc. The Guile Ncurses library defines a special record type for complex characters: the xchar.

Each complex character may contain more than one simple character. The first character in the list should be a base character or a control character. A base character is usually a character that can be printed standalone: combining accents and other letter code points intended to modify another letter are not base characters. The remaining characters in the list, if any, are accents or other combining characters that modify the appearance of the base character. If the first character is a control character, no combining characters are allowed.
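The rules above can be sketched as a predicate over a list of simple characters. This is an approximation written in plain Guile, not a procedure provided by the library: the names combining-char?, control-char? and plausible-xchar-chars? are hypothetical, and the sketch assumes that "combining character" corresponds to the Unicode mark categories (Mn, Mc, Me) and "control character" to the category Cc.

```scheme
(use-modules (srfi srfi-1))   ; for `every'

;; Hypothetical helpers, based on Unicode general categories.
(define (combining-char? c)
  (and (memq (char-general-category c) '(Mn Mc Me)) #t))

(define (control-char? c)
  (eq? (char-general-category c) 'Cc))

;; Hypothetical check of a character list against the rules above:
;; a base or control character first, then at most three combining
;; characters, and none at all after a control character.
(define (plausible-xchar-chars? chars)
  (and (pair? chars)
       (let ((base (car chars))
             (marks (cdr chars)))
         (and (not (combining-char? base))
              (<= (length marks) 3)
              (every combining-char? marks)
              (or (not (control-char? base)) (null? marks))))))

(define combining-tilde (integer->char #x303))  ; U+0303 COMBINING TILDE

(plausible-xchar-chars? (list #\n))                        ; => #t
(plausible-xchar-chars? (list #\n combining-tilde))        ; => #t
(plausible-xchar-chars? (list combining-tilde))            ; => #f
(plausible-xchar-chars? (list #\newline combining-tilde))  ; => #f
```

The library itself may apply different or looser checks; the sketch only restates the constraints described in this section.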

Here are examples of the constructors used to make complex characters. These constructors will be described in more detail later on.

;; A constructor for a letter 'x' using the default colors
> (normal #\x)

;; The display format of the resulting complex character
==> #<xchar #\x>

;; A constructor for a bold letter 'L' using default colors
> (bold #\L)

;; The display format of the resulting bold 'L' character
==> #<xchar bold #\L>

;; A bold letter 'x' printed white on a green background
> (init-pair! 2 COLOR_WHITE COLOR_GREEN)
> (bold-on (color 2 #\x))
==> #<xchar bold color-pair #2 [white on green] #\x>

;; A letter 'n' overwritten with a tilde
> (define ntilde (normal #\n))
> (set-xchar-chars! ntilde '(#\n #\~))
> ntilde
==> #<xchar #\n #\~>

rendered, complex strings

Rendered, complex strings are lists of rendered, complex characters.

Here is an example of the constructor for a rendered, complex string, and the display format of that string:

;; The constructor for a complex string: the word 'hello' in
;; reverse video
> (inverse "hello")

;; The display format of the resulting string could be...
==> (#<xchar reverse color-pair #0 [white on black] #\h>
     #<xchar reverse color-pair #0 [white on black] #\e>
     #<xchar reverse color-pair #0 [white on black] #\l>
     #<xchar reverse color-pair #0 [white on black] #\l>
     #<xchar reverse color-pair #0 [white on black] #\o>)
