In a program,
you keep track of information and values in things called variables.
A variable is just a name for a given value, such as
address, and so on.
awk has several predefined variables, and it has
special names to refer to the current input record
and the fields of the record.
You may also group multiple
associated values under one name, as an array.
Data, particularly in
awk, consists of either numeric
values, such as 42 or 3.1415927, or string values.
String values are essentially anything that’s not a number, such as a name.
Strings are sometimes referred to as character data, since they
store the individual characters that comprise them.
Individual variables, as well as numeric and string variables, are
referred to as scalar values.
Groups of values, such as arrays, are not scalars.
Computer Arithmetic, provided a basic introduction to numeric types (integer and floating-point) and how they are used in a computer. Please review that information, including a number of caveats that were presented.
While you are probably used to the idea of a number without a value (i.e., zero),
it takes a bit more getting used to the idea of zero-length character data.
Nevertheless, such a thing exists.
It is called the null string.
The null string is character data that has no value.
In other words, it is empty. It is written in
Humans are used to working in decimal; i.e., base 10. In base 10, numbers go from 0 to 9, and then “roll over” into the next column. (Remember grade school? 42 = 4 x 10 + 2.)
There are other number bases though. Computers commonly use base 2 or binary, base 8 or octal, and base 16 or hexadecimal. In binary, each column represents two times the value in the column to its right. Each column may contain either a 0 or a 1. Thus, binary 1010 represents (1 x 8) + (0 x 4) + (1 x 2) + (0 x 1), or decimal 10. Octal and hexadecimal are discussed more in Nondecimal-numbers.
At the very lowest level, computers store values as groups of binary digits,
or bits. Modern computers group bits into groups of eight, called bytes.
Advanced applications sometimes have to manipulate bits directly,
gawk provides functions for doing so.
Programs are written in programming languages.
Hundreds, if not thousands, of programming languages exist.
One of the most popular is the C programming language.
The C language had a very strong influence on the design of
There have been several versions of C. The first is often referred to
as “K&R” C, after the initials of Brian Kernighan and Dennis Ritchie,
the authors of the first book on C. (Dennis Ritchie created the language,
and Brian Kernighan was one of the creators of
In the mid-1980s, an effort began to produce an international standard
for C. This work culminated in 1989, with the production of the ANSI
standard for C. This standard became an ISO standard in 1990.
In 1999, a revised ISO C standard was approved and released.
Where it makes sense, POSIX
awk is compatible with 1999 ISO C.