7.1.5 The BEGINFILE and ENDFILE Special Patterns

This section describes a gawk-specific feature.

Two special kinds of rule, BEGINFILE and ENDFILE, give you “hooks” into gawk’s command-line file processing loop. As with the BEGIN and END rules (see The BEGIN and END Special Patterns), BEGINFILE rules in a program execute in the order they are read by gawk. Likewise, all ENDFILE rules execute in the order they are read.

The bodies of the BEGINFILE rules execute just before gawk reads the first record from a file. FILENAME is set to the name of the current file, and FNR is set to zero.
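As a minimal sketch of this timing, the following program prints a banner before the first record of each input file. The banner format is purely illustrative.

```awk
# Runs once per input file, before its first record is read.
BEGINFILE {
    # FNR is 0 here: no record has been read from this file yet.
    printf "==> %s (FNR = %d) <==\n", FILENAME, FNR
}

# Ordinary rule: FNR counts records within the current file.
{ print FNR ": " $0 }
```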

Prior to version 5.1.1 of gawk, as an accident of the implementation, $0 and the fields retained any previous values they had in BEGINFILE rules. Starting with version 5.1.1, $0 and the fields are cleared, since no record has been read yet from the file that is about to be processed.

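The BEGINFILE rule gives you the opportunity to accomplish two tasks that would otherwise be difficult or impossible to perform:

You can test whether the file is readable. Normally, it is a fatal error if a file named on the command line cannot be opened for reading. However, you can bypass the fatal error and move on to the next file on the command line. To do so, check whether ERRNO is not the empty string; if so, gawk was not able to open the file, and your program can execute the nextfile statement to skip the file entirely. Otherwise, gawk exits with the usual fatal error.

If you have written extensions that modify the record handling (by inserting an “input parser”), you can invoke them at this point, before gawk has started processing the file.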
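One common use of BEGINFILE is skipping files that cannot be opened. The sketch below assumes that a nonempty ERRNO inside BEGINFILE means the open failed; the wording of the diagnostic is illustrative.

```awk
BEGINFILE {
    # A nonempty ERRNO here means gawk could not open FILENAME.
    if (ERRNO != "") {
        printf "skipping %s: %s\n", FILENAME, ERRNO > "/dev/stderr"
        nextfile    # move on to the next command-line file
    }
}

# Records from the readable files are processed as usual.
{ print }
```

Without the nextfile statement, gawk would stop with its usual fatal error on the unreadable file.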

The ENDFILE rule is called when gawk has finished processing the last record in an input file. For the last input file, it will be called before any END rules. The ENDFILE rule is executed even for empty input files.
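For instance, a per-file summary might be sketched as follows. The variable name total is an illustrative choice.

```awk
ENDFILE {
    # FNR still holds the number of records read from this file;
    # it is 0 for an empty file.
    printf "%s: %d records\n", FILENAME, FNR
    total += FNR
}

# Runs after the ENDFILE rule for the last input file.
END { printf "total: %d records\n", total }
```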

Normally, an error that occurs while reading input in the normal input-processing loop is fatal. However, if an ENDFILE rule is present, the error becomes non-fatal, and ERRNO is set instead. This makes it possible to catch and process I/O errors at the level of the awk program.
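A hedged sketch of catching such a read error: the check goes in an ENDFILE rule, which runs once gawk has stopped reading the file. The variable status and the message text are illustrative, not part of gawk.

```awk
ENDFILE {
    # A nonempty ERRNO here indicates that reading FILENAME failed
    # partway through; gawk continues instead of exiting.
    if (ERRNO != "") {
        printf "error reading %s: %s\n", FILENAME, ERRNO > "/dev/stderr"
        status = 1
    }
}

# Report failure through the exit status after all files are done.
END { exit status }
```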

The next statement (see The next Statement) is not allowed inside either a BEGINFILE or an ENDFILE rule. The nextfile statement is allowed only inside a BEGINFILE rule, not inside an ENDFILE rule.

The getline statement (see Explicit Input with getline) is restricted inside both BEGINFILE and ENDFILE: only redirected forms of getline are allowed.
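To illustrate the redirected form, the sketch below counts a file's lines in BEGINFILE so that the main rule can report progress. It assumes that every input is a regular, named file (not standard input) and that records are newline-separated; the variable names line and total are illustrative.

```awk
BEGINFILE {
    if (ERRNO != "")    # file could not be opened; skip it
        nextfile
    total = 0
    # Redirected getline is allowed here; plain "getline" is not.
    while ((getline line < FILENAME) > 0)
        total++
    close(FILENAME)     # close the redirected stream opened above
}

{ printf "%s: record %d of %d\n", FILENAME, FNR, total }
```

Each file is read twice in this approach, which is acceptable for small inputs but costly for large ones.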

BEGINFILE and ENDFILE are gawk extensions. In most other awk implementations, or if gawk is in compatibility mode (see Command-Line Options), they are not special.