5.2 Character Arrays

The string representation used by Octave is an array of characters, so internally the string "dddddddddd" is actually a row vector of length 10 containing the value 100 in all places (100 is the ASCII code of "d"). This lends itself to the obvious generalization to character matrices. Using a matrix of characters, it is possible to represent a collection of same-length strings in one variable. The convention used in Octave is that each row in a character matrix is a separate string, but letting each column represent a string is equally possible.
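As a quick check of this representation, the following minimal sketch converts a string to its numeric character codes and back again. The variable names are illustrative only.

```octave
s = "dddddddddd";
codes = double (s);   # numeric character codes, one per character
n = numel (s);        # number of characters in the row vector
t = char (codes);     # convert the codes back into a string
```

Here codes is a 1-by-10 row vector in which every element is 100, and t compares equal to s.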

The easiest way to create a character matrix is to put several strings together into a matrix.

collection = [ "String #1"; "String #2" ];

This creates a 2-by-9 character matrix.
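Because each row is a separate string, individual strings are retrieved by indexing a whole row. The following sketch repeats the definition above so that it is self-contained; the variable names are illustrative only.

```octave
collection = [ "String #1"; "String #2" ];
dims = size (collection);        # [2 9]
second = collection(2,:);        # the second string, "String #2"
first_chars = collection(:,1);   # 2-by-1 column holding the first character of each string
```

Indexing a column, as in the last line, returns one character from every string rather than a string.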

The function ischar can be used to test if an object is a character array, that is, either a single string or a character matrix.

 
: tf = ischar (x)

Return true if x is a character array.

See also: isfloat, isinteger, islogical, isnumeric, isstring, iscellstr, isa.

 
: tf = isstring (s)

Return true if s is a string array.

A string array is a data type that stores a string (a row vector of characters) at each element in the array. It is distinct from a character array, which is an N-dimensional array in which each element is a single 1x1 character. It is also distinct from a cell array of strings, which likewise stores a string at each element but is accessed with cell indexing ‘{}’, whereas a string array uses ordinary array indexing ‘()’.

Programming Note: Octave does not yet implement string arrays so this function will always return false.

See also: ischar, iscellstr, isfloat, isinteger, islogical, isnumeric, isa.

To test if an object is a string (i.e., a 1xN row vector of characters and not a character matrix) you can use the ischar function in combination with the isrow function as in the following example:

ischar (collection)
     ⇒ 1

ischar (collection) && isrow (collection)
     ⇒ 0

ischar ("my string") && isrow ("my string")
     ⇒ 1

A natural question is what happens when a character matrix is created from strings of different lengths. Octave pads every string that is shorter than the longest string with blank characters at the end. A character other than the blank can be selected for padding with the string_fill_char function.
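The default padding can be observed directly. This sketch assumes the fill character has not been changed from its default of a single space. The variable names are illustrative only.

```octave
m = [ "abc"; "de" ];
dims = size (m);              # [2 3]: both rows have the length of the longest string
last = double (m(2,3));       # 32, the character code of the padding blank
trimmed = deblank (m(2,:));   # "de": deblank removes the trailing padding
```
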

 
: val = string_fill_char ()
: old_val = string_fill_char (new_val)
: old_val = string_fill_char (new_val, "local")

Query or set the internal variable used to pad all rows of a character matrix to the same length.

The value must be a single character and the default is " " (a single space). For example:

string_fill_char ("X");
[ "these"; "are"; "strings" ]
    ⇒  "theseXX"
        "areXXXX"
        "strings"

When called from inside a function with the "local" option, the variable is changed locally for the function and any subroutines it calls. The original variable value is restored when exiting the function.

The padding always goes at the end of each string. To control how the text is justified within a padded character matrix, use the strjust function.

 
: str = strjust (s)
: str = strjust (s, pos)

Return the text, s, justified according to pos, which may be "left", "center", or "right".

If pos is omitted it defaults to "right".

Null characters are replaced by spaces. All other character data are treated as non-white space.

Example:

strjust (["a"; "ab"; "abc"; "abcd"])
     ⇒
        "   a"
        "  ab"
        " abc"
        "abcd"

See also: deblank, strrep, strtrim, untabify.
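The three values of pos can be compared on a small padded matrix. This is a minimal sketch with illustrative variable names. The first row contains two blanks, so the centered result is unambiguous.

```octave
m = [ "ab  "; "abcd" ];
l = strjust (m, "left");     # first row stays "ab  "
c = strjust (m, "center");   # first row becomes " ab "
r = strjust (m, "right");    # first row becomes "  ab"
```

In every case the second row is unchanged because it already fills the full width, and the blanks in the first row are moved rather than removed.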

The need for padding shows a limitation of character matrices: every row must have the same length, so strings of different lengths cannot be stored without altering them. The solution is to use a cell array of strings, which is described in Cell Arrays of Strings.
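For contrast, here is a brief sketch of the same three strings held in a cell array, where each element keeps its own length. The functions cellfun, char, and cellstr are standard Octave functions that are not described in this section. The variable names are illustrative only.

```octave
c = { "these", "are", "strings" };
lengths = cellfun (@numel, c);   # [5 3 7]: no padding has been added
second = c{2};                   # "are", retrieved with cell indexing
m = char (c);                    # convert to a padded 3-by-7 character matrix
back = cellstr (m);              # convert back; trailing blanks are removed
```
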