12.5 Using gawk for Network Programming

EMRED:
    A host is a host from coast to coast,
    and nobody talks to a host that’s close,
    unless the host that isn’t close
    is busy, hung, or dead.

Mike O’Brien (aka Mr. Protocol)

In addition to being able to open a two-way pipeline to a coprocess on the same system (see Two-Way Communications with Another Process), it is possible to make a two-way connection to another process on another system across an IP network connection.

You can think of this as just a very long two-way pipeline to a coprocess. gawk decides that you want to use TCP/IP networking when it sees a special file name that begins with one of ‘/inet/’, ‘/inet4/’, or ‘/inet6/’.

The full syntax of the special file name is /net-type/protocol/local-port/remote-host/remote-port. The components are:

net-type

Specifies the kind of Internet connection to make. Use ‘/inet4/’ to force IPv4, and ‘/inet6/’ to force IPv6. Plain ‘/inet/’ (which used to be the only option) uses the system default, most likely IPv4.

protocol

The protocol to use over IP. This must be either ‘tcp’ or ‘udp’, for a TCP or UDP connection, respectively. TCP should be used for most applications.

local-port

The local TCP or UDP port number to use. Use a port number of ‘0’ when you want the system to pick a port. This is what you should do when writing a TCP or UDP client. You may also use a well-known service name, such as ‘smtp’ or ‘http’, in which case gawk attempts to determine the predefined port number using the C getaddrinfo() function.

remote-host

The IP address or fully qualified domain name of the Internet host to which you want to connect.

remote-port

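The TCP or UDP port number to use on the given remote-host. As with local-port, you may give a well-known service name instead of a number. Use ‘0’ if you don’t care which remote port is used, as is typical when writing a server that accepts connections from any client.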
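As a rough illustration of how the five components fit together, the following sketch assembles special file names from their parts. The function inet_name() is a hypothetical helper, not something gawk provides, and port 8080 is an arbitrary choice. The ‘0/0’ form for a server follows the convention used in TCP/IP Internetworking with gawk.

```awk
# inet_name() is a hypothetical helper, not part of gawk. It joins the five
# components into a special file name, and returns "" if the net-type or
# protocol is not one of the values described above.
function inet_name(net_type, protocol, local_port, remote_host, remote_port)
{
    if (net_type !~ /^inet[46]?$/ || protocol !~ /^(tcp|udp)$/)
        return ""
    return "/" net_type "/" protocol "/" local_port "/" remote_host "/" remote_port
}

BEGIN {
    # A TCP client: local port 0 lets the system pick a port.
    print inet_name("inet", "tcp", 0, "localhost", "daytime")
    # prints /inet/tcp/0/localhost/daytime

    # Force IPv6, use UDP, and give the remote port as a number.
    print inet_name("inet6", "udp", 0, "localhost", 13)
    # prints /inet6/udp/0/localhost/13

    # A server on (arbitrarily chosen) port 8080: remote host and port are 0.
    print inet_name("inet", "tcp", 8080, 0, 0)
    # prints /inet/tcp/8080/0/0
}
```

Nothing here opens a connection; the program only prints the names. A name becomes a socket when it is first used with the ‘|&’ operator.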

NOTE: Failure to open a two-way socket results in a nonfatal error being returned to the calling code. The value of ERRNO indicates the error (see Built-in Variables That Convey Information).
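One way to act on such a failure is to test the return value of getline before using the data. The following is only a sketch: it assumes that the local system may or may not be running a daytime server, and the wording of the messages is made up for the example.

```awk
BEGIN {
    Service = "/inet/tcp/0/localhost/daytime"

    # getline returns 1 on success, 0 at end of input, and -1 on error.
    # The first use of Service is what attempts to open the connection.
    status = (Service |& getline line)

    if (status > 0)
        print line
    else if (status < 0)
        print "cannot read from " Service ": " ERRNO > "/dev/stderr"
    else
        print Service " closed the connection without sending data" > "/dev/stderr"

    close(Service)
}
```

Reading into the variable line instead of $0 leaves the current record untouched, which is convenient when the network access happens in the middle of ordinary input processing.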

Consider the following very simple example:

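BEGIN {
    # Client connection: local port 0 lets the system choose a port;
    # "daytime" is a well-known service name (TCP port 13).
    Service = "/inet/tcp/0/localhost/daytime"
    Service |& getline      # first use opens the connection; reads one line into $0
    print $0
    close(Service)
}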

This program reads the current date and time from the local system’s TCP daytime server. It then prints the results and closes the connection.
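The daytime example only reads from the connection. As a sketch of traffic in both directions, the following program sends a minimal HTTP request and prints whatever comes back. It assumes that a plain (non-SSL) web server is listening on the local system’s ‘http’ port; the request shown is the simplest HTTP/1.0 form and is meant as an illustration, not as a complete HTTP client.

```awk
BEGIN {
    # HTTP lines end in CR LF, so use that for both input and output records.
    RS = ORS = "\r\n"

    # Assumption: a plain-HTTP server is listening on localhost.
    Service = "/inet/tcp/0/localhost/http"

    # Send the request line, then an empty line to end the request headers.
    print "GET / HTTP/1.0" |& Service
    print ""               |& Service

    # Read the reply record by record until the server closes the connection.
    while ((Service |& getline line) > 0)
        printf "%s\n", line

    close(Service)
}
```

The same special file name is used for writing (‘print … |& Service’) and for reading (‘Service |& getline’), just as with a coprocess on the local system.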

Because this topic is extensive, the use of gawk for TCP/IP programming is documented separately. See TCP/IP Internetworking with gawk, which comes as part of the gawk distribution, for a much more complete introduction and discussion, as well as extensive examples.

NOTE: gawk can only open direct (unencrypted) sockets. There is currently no way to access services available over Secure Sockets Layer (SSL); this includes any web service whose URL starts with ‘https://’.