17.7.6 Reading Directories

The readdir extension adds an input parser for directories. The usage is as follows:

@load "readdir"

When this extension is in use, gawk no longer skips directories named on the command line (or opened with getline). Instead, it reads them, returning each directory entry as a record.

The record consists of three fields separated by forward slash characters. The first two are the inode number and the file name, and the third field is a single letter indicating the type of the file. The letters and their corresponding file types are shown in Table 17.4.

Letter  File type
b       Block device
c       Character device
d       Directory
f       Regular file
l       Symbolic link
p       Named pipe (FIFO)
s       Socket

Table 17.4: File types returned by the readdir extension
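
The record layout and the letters in Table 17.4 can be put to use as in the following sketch. It assumes that FS is set to "/" and that the directory to read is named on the command line. The type_name() function is a hypothetical helper written for this illustration; it is not part of the extension.

```awk
@load "readdir"

# type_name --- map a type letter from Table 17.4 to a description.
# Hypothetical helper for this example, not provided by readdir.
function type_name(letter)
{
    if (letter == "b") return "block device"
    if (letter == "c") return "character device"
    if (letter == "d") return "directory"
    if (letter == "f") return "regular file"
    if (letter == "l") return "symbolic link"
    if (letter == "p") return "named pipe (FIFO)"
    if (letter == "s") return "socket"
    return "unknown"
}

BEGIN { FS = "/" }

# $1 is the inode number, $2 the file name, $3 the type letter
{ printf "%-12s %-24s %s\n", $1, $2, type_name($3) }
```

If the program is saved in a file (say, types.awk, a name chosen only for illustration), it could be run as gawk -f types.awk /tmp.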

On systems where the directory entry contains the file type, the third field is filled in from that information. On systems without the file type information, the extension falls back to calling stat() in order to provide it. Thus the third field should never be ‘u’ (for “unknown”).

Normally, when reading directories, you should set FS equal to "/". However, you may instead choose to create PROCINFO["readdir_override"] (with any value). If this element exists when the directory is opened, then the extension automatically sets the fields in each record for you.
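
A minimal sketch of the second approach follows. It creates the PROCINFO element in a BEGIN rule so that it exists before any directory is opened, and leaves FS at its default value. The value 1 is arbitrary; per the description above, only the existence of the element should matter.

```awk
@load "readdir"

BEGIN {
    # Any value will do; the element only needs to exist
    # by the time the directory is opened.
    PROCINFO["readdir_override"] = 1
}

# FS is not set here; the extension is expected to fill in $1, $2, and $3.
$3 == "d" { print "directory:", $2 }
```

This prints only the entries whose type letter is "d", which normally includes the "." and ".." entries.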

By default, if a directory cannot be opened (due to permission problems, for example), gawk will exit. As with regular files, this situation can be handled using a BEGINFILE rule that checks ERRNO and prints an error or otherwise handles the problem.
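
The BEGINFILE approach might look like the following sketch. It assumes that a warning on standard error followed by nextfile is an acceptable way to handle an unreadable directory; other programs may prefer different handling.

```awk
@load "readdir"

BEGIN { FS = "/" }

BEGINFILE {
    # A nonempty ERRNO here means the file or directory could not be opened.
    if (ERRNO != "") {
        printf "skipping %s: %s\n", FILENAME, ERRNO > "/dev/stderr"
        nextfile    # move on to the next argument instead of exiting
    }
}

# Reached only for directories that were opened successfully
{ print FILENAME ": " $2 }
```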

Here is an example:

@load "readdir"
...
BEGIN { FS = "/" }
{ print "file name is", $2 }
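
Directories can also be read with getline, as mentioned earlier. The following sketch assumes that PROCINFO["readdir_override"] has not been created, so each record arrives as a single slash-separated string that is split by hand. The list_dir() function is a hypothetical helper written for this illustration.

```awk
@load "readdir"

# list_dir --- print the names found in dir.
# Hypothetical helper: returns the number of entries read,
# or -1 if dir could not be opened or read.
function list_dir(dir,    entry, parts, count, status)
{
    count = 0
    while ((status = (getline entry < dir)) > 0) {
        split(entry, parts, "/")    # parts[1] = inode, [2] = name, [3] = type
        print parts[2]
        count++
    }
    if (status < 0)
        return -1
    close(dir)
    return count
}

BEGIN { list_dir(".") }
```

Because getline var < file does not split the record into fields, the sketch uses split() instead of relying on FS.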