9.1.8 Getting Type Information

gawk provides two functions that let you distinguish the type of a variable. This is necessary for writing code that traverses every element of an array of arrays (see Arrays of Arrays), and in other contexts.

isarray(x)

Return a true value if x is an array. Otherwise, return false.

typeof(x)

Return one of the following strings, depending upon the type of x:

"array"

x is an array.

"regexp"

x is a strongly typed regexp (see Strongly Typed Regexp Constants).

"number"

x is a number.

"number|bool"

x is a Boolean typed value (see Boolean Typed Values).

"string"

x is a string.

"strnum"

x is a number that started life as user input, such as a field or the result of calling split(). (I.e., x has the strnum attribute; see String Type versus Numeric Type.)

"unassigned"

x is a scalar variable that has not been assigned a value yet. For example:

BEGIN {
    # creates a[1] but it has no assigned value
    a[1]
    print typeof(a[1])  # unassigned
}
"untyped"

x has not yet been used yet at all; it can become a scalar or an array. The typing could even conceivably differ from run to run of the same program! For example:

BEGIN {
    print "initially, typeof(v) = ", typeof(v)

    if ("FOO" in ENVIRON)
        make_scalar(v)
    else
        make_array(v)

    print "typeof(v) =", typeof(v)
}

function make_scalar(p,    l) { l = p }

function make_array(p) { p[1] = 1 }

isarray() is meant for use in two circumstances. The first is when traversing a multidimensional array: you can test if an element is itself an array or not. The second is inside the body of a user-defined function (not discussed yet; see User-Defined Functions), to test if a parameter is an array or not.

NOTE: While you can use isarray() at the global level to test variables, doing so makes no sense. Because you are the one writing the program, you are supposed to know if your variables are arrays or not.

The typeof() function is general; it allows you to determine if a variable or function parameter is a scalar (number, string, or strongly typed regexp) or an array.

Normally, passing a variable that has never been used to a built-in function causes it to become a scalar variable (unassigned). However, isarray() and typeof() are different; they do not change their arguments from untyped to unassigned.

This applies to both variables denoted by simple identifiers and array elements that come into existence simply by referencing them. Consider:

$ gawk 'BEGIN { print typeof(x) }'
-| untyped
$ gawk 'BEGIN { print typeof(x["foo"]) }'
-| untyped

Note that prior to version 5.2, array elements that come into existence simply by referencing them were different, they were automatically forced to be scalars:

$ gawk-5.1.1 'BEGIN { print typeof(x) }'
-| untyped
$ gawk-5.1.1 'BEGIN { print typeof(x["foo"]) }'
-| unassigned