6.1.3.1 Using Variables in a Program

Variables let you give names to values and refer to them later. Variables have already been used in many of the examples. The name of a variable must be a sequence of letters, digits, or underscores, and it may not begin with a digit. Here, a letter is any one of the 52 upper- and lowercase English letters. Other characters that may be defined as letters in non-English locales are not valid in variable names. Case is significant in variable names; a and A are distinct variables.
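
As a minimal sketch of the naming rules above, the following shell function runs a short awk program. The function name case_demo and the variable names a, A, and _tmp9 are illustrative assumptions, not names taken from the text.

```shell
# case_demo is a hypothetical wrapper name used only for this illustration.
case_demo() {
    awk 'BEGIN {
        a = 1          # lowercase name
        A = 2          # a different variable, because case is significant
        _tmp9 = a + A  # underscores and non-leading digits are allowed
        print a, A, _tmp9
    }'
}
case_demo
```

A name such as 9tmp would not be accepted as a variable, because it begins with a digit.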

A variable name is a valid expression by itself; it represents the variable’s current value. Variables are given new values with assignment operators, increment operators, and decrement operators (see Assignment Expressions). In addition, the sub() and gsub() functions can change a variable’s value, and the match(), split(), and patsplit() functions can change the contents of their array parameters (see String-Manipulation Functions).
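
The different ways of changing a variable described above might be sketched as follows. The function name modify_demo and the variables n, s, count, and parts are assumptions made for this example only.

```shell
# modify_demo is a hypothetical wrapper name used only for this illustration.
modify_demo() {
    awk 'BEGIN {
        n = 5                               # plain assignment
        n += 2                              # assignment operator: n is now 7
        n++                                 # increment: n is now 8
        n--                                 # decrement: n is back to 7
        s = "water, water everywhere"
        gsub(/water/, "wine", s)            # gsub() changes the value of s
        count = split("a:b:c", parts, ":")  # split() fills the array parts
        print n
        print s
        print count, parts[1], parts[3]
    }'
}
modify_demo
```

The sketch leaves out patsplit(), which is a gawk extension, so that it can run under other awk implementations as well.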

A few variables have special built-in meanings, such as FS (the field separator) and NF (the number of fields in the current input record). See Predefined Variables for a list of the predefined variables. These predefined variables can be used and assigned just like all other variables, but their values are also used or changed automatically by awk. All predefined variables’ names are entirely uppercase.
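
To illustrate how a predefined variable can be both assigned by the program and set automatically by awk, here is a small sketch. The function name fields_demo and the sample input are assumptions for this example.

```shell
# fields_demo is a hypothetical wrapper name used only for this illustration.
fields_demo() {
    # The program assigns FS once; awk then sets NF for every input record.
    printf 'a:b:c\nd:e\n' | awk 'BEGIN { FS = ":" } { print NF, $NF }'
}
fields_demo
```

Here FS is assigned like any other variable, while NF is never assigned by the program: awk updates it for each record that it reads.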

Variables in awk can be assigned either numeric or string values. The kind of value a variable holds can change over the life of a program. By default, variables are initialized to the empty string, which is zero if converted to a number. Unlike in C and most other traditional languages, there is no need to explicitly initialize a variable in awk.
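
The default value of an unset variable, and the way a variable can switch between numeric and string values, can be seen in a sketch like the one below. The function name uninit_demo and the variables x and v are assumptions for this example.

```shell
# uninit_demo is a hypothetical wrapper name used only for this illustration.
uninit_demo() {
    awk 'BEGIN {
        print "[" x "]"   # x was never assigned: as a string it is empty
        print x + 0       # the same x used as a number is zero
        v = 42            # v holds a numeric value
        v = "forty-two"   # now v holds a string value
        print v
    }'
}
uninit_demo
```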