

9.1.4 Input/Output Functions

The following functions relate to input/output (I/O). Optional parameters are enclosed in square brackets ([ ]):

close(filename [, how])
Close the file filename for input or output. Alternatively, the argument may be a shell command that was used for creating a coprocess, or for redirecting to or from a pipe; then the coprocess or pipe is closed. See Close Files And Pipes, for more information.

When closing a coprocess, it is occasionally useful to first close one end of the two-way pipe and then to close the other. This is done by providing a second argument to close(). This second argument should be one of the two string values "to" or "from", indicating which end of the pipe to close. Case in the string does not matter. See Two-way I/O, which discusses this feature in more detail and gives an example.
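As a minimal sketch of the simpler, one-way case, the following closes an output pipe so that the command on the other end finishes before the awk program continues. It assumes a POSIX shell and a `sort` utility on the search path; the input data and the variable name `out` are illustrative only. The string passed to close() must match the string used to open the pipe exactly.

```shell
# Sort the input through a pipe, then print a trailer line after the sorted data.
out=$(printf '3\n1\n2\n' | awk '
{ print $1 | "sort -n" }      # send each record down a pipe to sort
END {
    close("sort -n")          # close the pipe and wait for sort to finish
    print "done"              # printed after sort has written its output
}')
printf '%s\n' "$out"
```

Without the close() call, sort would not see end-of-file until awk exits, so the trailer line could appear before the sorted data.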

fflush([filename])
Flush any buffered output associated with filename, which is either a file opened for writing or a shell command for redirecting output to a pipe or coprocess.

Many utility programs buffer their output; i.e., they save information to write to a disk file or the screen in memory until there is enough for it to be worthwhile to send the data to the output device. This is often more efficient than writing every little bit of information as soon as it is ready. However, sometimes it is necessary to force a program to flush its buffers; that is, write the information to its destination, even if a buffer is not full. This is the purpose of the fflush() function: gawk also buffers its output, and fflush() forces gawk to flush its buffers.

fflush() was added to Brian Kernighan's version of awk in 1994. For nearly two decades, it was not part of the POSIX standard. As of December 2012, it was accepted for inclusion into the POSIX standard. See the Austin Group website.

POSIX standardizes fflush() as follows: If there is no argument, or if the argument is the null string (""), then awk flushes the buffers for all open output files and pipes.

NOTE: Prior to version 4.0.2, gawk would flush only the standard output if there was no argument, and flush all output files and pipes if the argument was the null string. This was changed in order to be compatible with Brian Kernighan's awk, in the hope that standardizing this feature in POSIX would then be easier (which indeed proved to be the case).

With gawk, you can use ‘fflush("/dev/stdout")’ if you wish to flush only the standard output.

fflush() returns zero if the buffer is successfully flushed; otherwise, it returns non-zero (gawk returns −1). In the case where all buffers are flushed, the return value is zero only if all buffers were flushed successfully. Otherwise, it is −1, and gawk warns about the problem filename.

gawk also issues a warning message if you attempt to flush a file or pipe that was opened for reading (such as with getline), or if filename is not an open file, pipe, or coprocess. In such a case, fflush() returns −1, as well.
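The return value described above can be checked directly. The following is a minimal sketch, assuming a POSIX shell and an awk that implements fflush(); the variable names `rc` and `out` are illustrative only.

```shell
# Flush all pending output and report the value fflush() returned.
out=$(awk 'BEGIN {
    print "buffered line"
    rc = fflush()               # no argument: flush all open output files and pipes
    print "fflush returned " rc # zero indicates success
}')
printf '%s\n' "$out"
```

A program that must not lose output could test the value and exit with an error message when it is non-zero.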

system(command)
Execute the operating-system command command and then return to the awk program. Return command's exit status.

For example, if the following fragment of code is put in your awk program:

          END {
               system("date | mail -s 'awk run done' root")
          }

the system administrator is sent mail when the awk program finishes processing input and begins its end-of-input processing.
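The value returned by system() can be captured and tested like any other number. The following is a minimal sketch, assuming a POSIX shell; the command `exit 3` and the variable names are chosen only for illustration, and how abnormal termination (such as death by a signal) is reported may differ between awk implementations.

```shell
# Run a command that exits with status 3 and show what system() returns.
out=$(awk 'BEGIN {
    status = system("exit 3")      # the shell runs "exit 3"
    print "exit status: " status
}')
printf '%s\n' "$out"
```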

Note that redirecting print or printf into a pipe is often enough to accomplish your task. If you need to run many commands, it is more efficient to simply print them down a pipeline to the shell:

          while (more stuff to do)
              print command | "/bin/sh"
          close("/bin/sh")
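The outline above can be made concrete as follows. This is a minimal sketch, assuming a POSIX shell at /bin/sh; the input words and the generated `echo` commands are placeholders for whatever commands a real program would build.

```shell
# Build one shell command per input record and run them all in a single shell.
out=$(printf 'alpha\nbeta\n' | awk '
{ print "echo processing " $1 | "/bin/sh" }   # queue a command for the shell
END { close("/bin/sh") }                      # let the shell finish and exit
')
printf '%s\n' "$out"
```

Only one shell process is started, no matter how many commands are printed, whereas calling system() once per record would start a new shell each time.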

However, if your awk program is interactive, system() is useful for running large self-contained programs, such as a shell or an editor. Some operating systems cannot implement the system() function. system() causes a fatal error if it is not supported.

NOTE: When --sandbox is specified, the system() function is disabled (see Options).

Interactive Versus Noninteractive Buffering

As a side point, buffering issues can be even more confusing, depending upon whether your program is interactive, i.e., communicating with a user sitting at a keyboard.[1]

Interactive programs generally line buffer their output; i.e., they write out every line. Noninteractive programs wait until they have a full buffer, which may be many lines of output. Here is an example of the difference:

     $ awk '{ print $1 + $2 }'
     1 1
     -| 2
     2 3
     -| 5
     Ctrl-d

Each line of output is printed immediately. Compare that behavior with this example:

     $ awk '{ print $1 + $2 }' | cat
     1 1
     2 3
     Ctrl-d
     -| 2
     -| 5

Here, no output is printed until after the Ctrl-d is typed, because it is all buffered and sent down the pipe to cat in one shot.
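If line-by-line output is wanted even when writing to a pipe, one option is to call fflush() after each print. The following is a minimal sketch, assuming a POSIX shell and an awk that implements fflush(); the input is supplied by printf here so the example runs unattended, and the variable name `out` is illustrative.

```shell
# Same program as above, but each result is flushed as soon as it is printed.
out=$(printf '1 1\n2 3\n' | awk '{
    print $1 + $2
    fflush()          # push this line down the pipe now, not when the buffer fills
}' | cat)
printf '%s\n' "$out"
```

When the input is typed at a terminal instead, each sum should appear right after its input line, at the cost of one write per line.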

Controlling Output Buffering with system()

The fflush() function provides explicit control over output buffering for individual files and pipes. However, its use is not portable to many older awk implementations. An alternative method to flush output buffers is to call system() with a null string as its argument:

     system("")   # flush output

gawk treats this use of the system() function as a special case and is smart enough not to run a shell (or other command interpreter) with the empty command. Therefore, with gawk, this idiom is not only useful, it is also efficient. While this method should work with other awk implementations, it does not necessarily avoid starting an unnecessary shell. (Other implementations may only flush the buffer associated with the standard output and not necessarily all buffered output.)
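The idiom can be dropped in wherever fflush() would otherwise be called. The following is a minimal sketch, assuming a POSIX shell; the input data and the variable name `out` are illustrative only.

```shell
# Flush after every record using system("") instead of fflush().
out=$(printf '1 1\n2 3\n' | awk '{
    print $1 + $2
    system("")        # flush output
}' | cat)
printf '%s\n' "$out"
```

With an awk other than gawk, this may start a shell for every record, so it is best reserved for programs where portability matters more than speed.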

If you think about what a programmer expects, it makes sense that system() should flush any pending output. The following program:

     BEGIN {
          print "first print"
          system("echo system echo")
          print "second print"
     }

must print:

     first print
     system echo
     second print

and not:

     system echo
     first print
     second print

If awk did not flush its buffers before calling system(), you would see the latter (undesirable) output.


Footnotes

[1] A program is interactive if the standard output is connected to a terminal device. On modern systems, this means your keyboard and screen.