Next: Comparison Operators, Up: Typing and Comparison
The 1992 POSIX standard introduced
the concept of a numeric string, which is simply a string that looks
like a number—for example, " +2". This concept is used
for determining the type of a variable.
The type of the variable is important because the types of two variables
determine how they are compared.
In gawk, variable typing follows these rules:
getline input, FILENAME, ARGV elements,
ENVIRON elements, and the
elements of an array created by split that are numeric strings
have the strnum attribute. Otherwise, they have the string
attribute.
Uninitialized variables also have the strnum attribute.
The last rule is particularly important. In the following program,
a has numeric type, even though it is later used in a string
operation:
BEGIN {
a = 12.345
b = a " is a cute number"
print b
}
When two operands are compared, either string comparison or numeric comparison may be used. This depends upon the attributes of the operands, according to the following symmetric matrix:
+——————————————————————–
| STRING NUMERIC STRNUM
———–+——————————————————————–
|
STRING | string string string
|
NUMERIC | string numeric numeric
|
STRNUM | string numeric numeric
———–+——————————————————————–
The basic idea is that user input that looks numeric—and only
user input—should be treated as numeric, even though it is actually
made of characters and is therefore also a string.
Thus, for example, the string constant " +3.14",
when it appears in program source code,
is a string—even though it looks numeric—and
is never treated as number for comparison
purposes.
In short, when one operand is a “pure” string, such as a string constant, then a string comparison is performed. Otherwise, a numeric comparison is performed.1
This point bears additional emphasis: All user input is made of characters,
and so is first and foremost of string type; input strings
that look numeric are additionally given the strnum attribute.
Thus, the six-character input string ‘ +3.14’ receives the
strnum attribute. In contrast, the eight-character literal
" +3.14" appearing in program text is a string constant.
The following examples print ‘1’ when the comparison between
the two different constants is true, ‘0’ otherwise:
$ echo ' +3.14' | gawk '{ print $0 == " +3.14" }' True
-| 1
$ echo ' +3.14' | gawk '{ print $0 == "+3.14" }' False
-| 0
$ echo ' +3.14' | gawk '{ print $0 == "3.14" }' False
-| 0
$ echo ' +3.14' | gawk '{ print $0 == 3.14 }' True
-| 1
$ echo ' +3.14' | gawk '{ print $1 == " +3.14" }' False
-| 0
$ echo ' +3.14' | gawk '{ print $1 == "+3.14" }' True
-| 1
$ echo ' +3.14' | gawk '{ print $1 == "3.14" }' False
-| 0
$ echo ' +3.14' | gawk '{ print $1 == 3.14 }' True
-| 1
[1] The POSIX standard has been revised. The revised standard's rules for typing and comparison are the same as just described for gawk.