5.6 Redirecting Output of print and printf

So far, the output from print and printf has gone to the standard output, usually the screen. Both print and printf can also send their output to other places. This is called redirection.

NOTE: When --sandbox is specified (see Command-Line Options), redirecting output to files, pipes, and coprocesses is disabled.

A redirection appears after the print or printf statement. Redirections in awk are written just like redirections in shell commands, except that they are written inside the awk program.

There are four forms of output redirection: output to a file, output appended to a file, output through a pipe to another command, and output to a coprocess. We show them all for the print statement, but they work identically for printf:

print items > output-file

This redirection prints the items into the output file named output-file. The file name output-file can be any expression. Its value is changed to a string and then used as a file name (see Expressions).

When this type of redirection is used, the output-file is erased before the first output is written to it. Subsequent writes to the same output-file do not erase output-file, but append to it. (This is different from how you use redirections in shell scripts.) If output-file does not exist, it is created. For example, here is how an awk program can write a list of people’s names to one file named name-list, and a list of phone numbers to another file named phone-list:

$ awk '{ print $2 > "phone-list"
>        print $1 > "name-list" }' mail-list
$ cat phone-list
-| 555-5553
-| 555-3412
...
$ cat name-list
-| Amelia
-| Anthony
...

Each output file contains one name or number per line.
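Because output-file can be any expression, the file name may also be computed from the data itself. The following is a minimal sketch, assuming a made-up two-column input (category, item) and a scratch directory; the file names fruit.txt and veg.txt are purely illustrative. The expression is parenthesized so that the concatenation is unambiguously part of the file name:

```sh
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

# Hypothetical input: a category in field 1, an item in field 2.
printf 'fruit apple\nveg carrot\nfruit pear\n' |
awk '{ print $2 > ($1 ".txt") }'   # one output file per distinct category

cat fruit.txt
cat veg.txt
```

Here fruit.txt receives both fruit items, because the second write to the same file name appends rather than truncates.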

print items >> output-file

This redirection appends the items to the output file named output-file. The difference between this and the single-‘>’ redirection is that the old contents (if any) of output-file are not erased. Instead, the awk output is added to the end of the file. If output-file does not exist, then it is created.
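As an illustration of the append form, the sketch below runs awk twice against a file that already has content. The file name log.txt and the sample text are assumptions made for this example only:

```sh
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

# Pre-existing content that '>>' should preserve.
echo "old line" > log.txt

# Two separate awk runs; each one appends instead of erasing.
echo "first"  | awk '{ print >> "log.txt" }'
echo "second" | awk '{ print >> "log.txt" }'

append_result=$(cat log.txt)
echo "$append_result"
```

With ‘>’ in place of ‘>>’, each run would start by erasing log.txt, and only the output of the last run would remain.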

print items | command

It is possible to send output to another program through a pipe instead of into a file. This redirection opens a pipe to command, and writes the values of items through this pipe to another process created to execute command.

The redirection argument command is actually an awk expression. Its value is converted to a string whose contents give the shell command to be run. For example, the following produces two files, one unsorted list of people’s names, and one list sorted in reverse alphabetical order:

awk '{ print $1 > "names.unsorted"
       command = "sort -r > names.sorted"
       print $1 | command }' mail-list

The unsorted list is written with an ordinary redirection, while the sorted list is written by piping through the sort utility.

The next example uses redirection to mail a message to the mailing list bug-system. This might be useful when trouble is encountered in an awk script run periodically for system maintenance:

report = "mail bug-system"
print("Awk script failed:", $0) | report
print("at record number", FNR, "of", FILENAME) | report
close(report)

The close() function is called here because it’s a good idea to close the pipe as soon as all the intended output has been sent to it. See Closing Input and Output Redirections for more information.

This example also illustrates the use of a variable to represent a file or command; it is not necessary to always use a string constant. Using a variable is generally a good idea, because awk requires that the string value be written identically every time you mean to refer to the same file or command.
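The same pattern can be tried without a mail program. The following sketch, which assumes only that the standard sort utility is available, keeps the command in a variable named sorter (a name chosen for this example) and closes the pipe in an END rule. Because close() waits for the command to finish, the sorted lines appear before the final message:

```sh
# Hypothetical input: three unsorted words, one per line.
sorted_then_done=$(printf 'pear\napple\nfig\n' | awk '
BEGIN { sorter = "sort" }       # one variable names the command everywhere
{ print $1 | sorter }           # every record goes down the same pipe
END {
    close(sorter)               # wait for sort to finish and emit its output
    print "done"                # so this line comes after the sorted list
}')
echo "$sorted_then_done"
```

Had the command been spelled "sort" in one place and "sort " (with a trailing space) in another, awk would have treated them as two different pipes.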

print items |& command

This redirection prints the items to the input of command. The difference between this and the single-‘|’ redirection is that the output from command can be read with getline. Thus, command is a coprocess, which works together with but is subsidiary to the awk program.

This feature is a gawk extension, and is not available in POSIX awk. See Using getline from a Coprocess, for a brief discussion. See Two-Way Communications with Another Process, for a more complete discussion.

Redirecting output using ‘>’, ‘>>’, ‘|’, or ‘|&’ asks the system to open a file, pipe, or coprocess only if the particular file or command you specify has not already been written to by your program or if it has been closed since it was last written to. In other words, files, pipes, and coprocesses remain open until explicitly closed. All further print and printf statements continue to write to the same open file, pipe, or coprocess.
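The effect of closing and reopening can be seen in a short experiment. This is a sketch only; the file name out.txt and its initial contents are assumptions made for the illustration:

```sh
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)" || exit 1
echo "stale" > out.txt

awk 'BEGIN {
    print "one"   > "out.txt"   # first write: file is opened and truncated
    print "two"   > "out.txt"   # file is still open: output is appended
    close("out.txt")            # the file is no longer open
    print "three" > "out.txt"   # reopened with >, so truncated again
    print "four"  > "out.txt"
}'

reopen_result=$(cat out.txt)
echo "$reopen_result"
```

Only the lines written after close() survive, because the second open with ‘>’ erased everything written before it.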

In the shell, when you are building up a file a line at a time, you first use ‘>’ to create the file, and then you use ‘>>’ for subsequent additions to it, like so:

echo Name: Arnold Robbins > data
echo Street Address: 1234 A Pretty Street, NE >> data
echo City and State: MyTown, MyState 12345-6789 >> data

In awk, the ‘>’ and ‘>>’ operators are subtly different. The operator you use the first time you write to a file determines how awk will open (or create) the file. If you use ‘>’, the file is truncated, and then all subsequent output appends data to the file, even if additional print or printf statements continue to use ‘>’. If you use ‘>>’ the first time, then existing data is not truncated, and all subsequent print or printf statements append data to the file.

You should be consistent and always use the same operator for all output to the same file. (You can mix ‘>’ and ‘>>’, and nothing bad will happen, but mixing the operators is considered to be bad style in awk. If invoked with the --lint option, gawk issues a warning when it encounters both operators being used for the same open file.)
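The difference from the shell can be checked side by side. In this sketch (file names shell.txt and awk.txt are arbitrary), the shell loop reopens and truncates its file on every ‘>’, whereas the awk program truncates only on the first write:

```sh
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

# Shell: every > opens the file afresh, so only the last line survives.
for word in one two; do
    echo "$word" > shell.txt
done

# awk: the first > truncates; the second > writes to the already-open file.
awk 'BEGIN {
    print "one" > "awk.txt"
    print "two" > "awk.txt"
}'

shell_result=$(cat shell.txt)
awk_result=$(cat awk.txt)
echo "shell.txt: $shell_result"
echo "awk.txt:"
echo "$awk_result"
```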

As mentioned earlier (see Points to Remember About getline), many older awk implementations limit the number of pipelines that an awk program may have open to just one! In gawk, there is no such limit. gawk allows a program to open as many pipelines as the underlying operating system permits.

Piping into sh

A particularly powerful way to use redirection is to build command lines and pipe them into the shell, sh. For example, suppose you have a list of files brought over from a system where all the file names are stored in uppercase, and you wish to rename them to have names in all lowercase. The following program is both simple and efficient:

{ printf("mv %s %s\n", $0, tolower($0)) | "sh" }

END { close("sh") }

The tolower() function returns its argument string with all uppercase characters converted to lowercase (see String-Manipulation Functions). The program builds up a list of command lines, using the mv utility to rename the files. It then sends the list to the shell for execution.

See Quoting Strings to Pass to the Shell for a function that can help in generating command lines to be fed to the shell.
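The rename program shown earlier misbehaves if a file name contains spaces or shell metacharacters, which is why quoting matters. The sketch below is a simplified illustration, not the library function from the referenced section: quote() is a hypothetical helper written for this example, and the two file names are made up. It prints the generated commands instead of piping them to sh, so they can be inspected first:

```sh
# Hypothetical input: one name with a space, one with a single quote.
rename_cmds=$(printf 'MY FILE.TXT\nIT'\''S.DAT\n' | awk '
# quote(): hypothetical helper that wraps s in single quotes for sh.
# \047 is the octal escape for the single-quote character; each
# embedded single quote is emitted as a backslash-escaped quote
# between two quoted pieces.
function quote(s,    parts, n, i, out) {
    n = split(s, parts, "\047")
    out = "\047" parts[1] "\047"
    for (i = 2; i <= n; i++)
        out = out "\\\047" "\047" parts[i] "\047"
    return out
}
{ printf("mv %s %s\n", quote($0), quote(tolower($0))) }
')
echo "$rename_cmds"
```

Once the printed commands look right, the printf statement can be redirected with | "sh" as in the original program, followed by close("sh") in an END rule.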