6.1.1.1 Numeric and String Constants

A numeric constant stands for a number. This number can be an integer, a decimal fraction, or a number in scientific (exponential) notation.(32) Here are some examples of numeric constants that all have the same value:

105
1.05e+2
1050e-1
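As a quick check of the claim above, the following sketch compares the three spellings inside awk. It assumes an awk program is available on the PATH under the name awk (substitute gawk to match the other examples in this section); the variable names same, a, and b are illustrative only.

```shell
# Assumes an awk on PATH; substitute gawk to match the rest of this section.
# Each comparison yields 1 (true) when both constants denote the same number.
same=$(awk 'BEGIN {
    a = (105 == 1.05e+2)    # integer form against scientific notation
    b = (105 == 1050e-1)    # integer form against a negative exponent
    print a, b
}')
echo "$same"
```

If all three constants are read as the same number, both comparisons are true and the script prints two ones.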

A string constant consists of a sequence of characters enclosed in double quotation marks. For example:

"parrot"

represents the string whose contents are ‘parrot’. Strings in gawk can be of any length, and they can contain any of the possible eight-bit characters, including ASCII NUL (character code zero). Other awk implementations may have difficulty with some character codes.

Some languages allow you to continue long strings across multiple lines by ending the line with a backslash. For example, in C:

#include <stdio.h>

int main()
{
    printf("hello, \
world\n");
    return 0;
}

In such a case, the C compiler removes both the backslash and the newline, producing a string as if it had been typed ‘"hello, world\n"’. This is useful when a single string needs to contain a large amount of text.

The POSIX standard says explicitly that newlines are not allowed inside string constants. Indeed, all awk implementations report an error if you put a literal newline inside a string constant. For example:

$ gawk 'BEGIN { print "hello, 
> world" }'
-| gawk: cmd. line:1: BEGIN { print "hello,
-| gawk: cmd. line:1:               ^ unterminated string
-| gawk: cmd. line:1: BEGIN { print "hello,
-| gawk: cmd. line:1:               ^ syntax error

Although POSIX doesn’t define what happens if you use an escaped newline, as in the previous C example, all known versions of awk allow you to do so. Unfortunately, what each one does with such a string varies. (d.c.) gawk, mawk, and the OpenSolaris POSIX awk (see Other Freely Available awk Implementations) elide the backslash and newline, as in C:

$ gawk 'BEGIN { print "hello, \
> world" }'
-| hello, world

In POSIX mode (see Command-Line Options), gawk does not allow escaped newlines. Otherwise, it behaves as just described.

BWK awk(33) and BusyBox awk remove the backslash but leave the newline intact, as part of the string:

$ nawk 'BEGIN { print "hello, \
> world" }'
-| hello,
-| world
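Because the result of an escaped newline inside a string differs between implementations, one possible way to sidestep the difference is to split the text into two string constants and put the backslash between them, where it acts as an ordinary line continuation. The following is only a sketch of that idea, not something this section prescribes; it assumes an awk program on the PATH under the name awk, and the shell variable greeting is illustrative.

```shell
# Assumes an awk on PATH; substitute gawk, mawk, or nawk as needed.
# The backslash sits between two string constants, not inside one,
# so awk sees a continued line and concatenates the adjacent strings.
greeting=$(awk 'BEGIN { print "hello, " \
"world" }')
echo "$greeting"
```

Written this way, no string constant contains a backslash-newline pair, so the output does not depend on how a given awk handles that case.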

Footnotes

(32)

The internal representation of all numbers, including integers, uses double-precision floating-point numbers. On most modern systems, these are in IEEE 754 standard format. See Arithmetic and Arbitrary-Precision Arithmetic with gawk, for much more information.

(33)

In all examples throughout this Web page, nawk is BWK awk.