B.5 Other Freely Available awk Implementations

It’s kind of fun to put comments like this in your awk code:
      // Do C++ comments work? answer: yes! of course

Michael Brennan

There are a number of other freely available awk implementations. This section briefly describes where to get them:

Unix awk

Brian Kernighan, one of the original designers of Unix awk, has made his implementation of awk freely available. You can retrieve it from GitHub:

git clone https://github.com/onetrueawk/awk bwkawk

This command creates a copy of the Git repository in a directory named bwkawk. If you omit the last argument from the git command line, the repository copy is created in a directory named awk.

This version requires an ISO C (1990 standard) compiler; the C compiler from GCC (the GNU Compiler Collection) works quite nicely.

To build it, review the settings in the makefile, and then just run make. Note that the result of compilation is named a.out; you will have to rename it to something reasonable.
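The fetch, build, and rename steps described above can be sketched as a small shell function. This is only an illustration: the function name build_bwk_awk and the final program name bwk-awk are choices made here, not names used by the BWK awk distribution, and the sketch assumes that git, make, and a C compiler are installed.

```shell
#!/bin/sh
# Sketch: fetch and build BWK awk, then give the result a usable name.
# "build_bwk_awk" and "bwk-awk" are illustrative names (assumptions).
build_bwk_awk() {
    dest=${1:-bwkawk}      # directory that receives the repository copy
    git clone https://github.com/onetrueawk/awk "$dest" || return 1
    (
        cd "$dest" || exit 1
        # Review the settings in the makefile before this step.
        make || exit 1
        # The build leaves its result in a.out; rename it.
        mv a.out bwk-awk
    )
}
```

After a successful run of `build_bwk_awk`, something like `./bwkawk/bwk-awk 'BEGIN { print "hello" }'` should work as a quick check.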

See Common Extensions Summary for a list of extensions in this awk that are not in POSIX awk.

In 2023, Brian Kernighan, along with Al Aho and Peter Weinberger, published a second edition of their book on awk. Professor Kernighan also maintains a companion web site for the book. Copies of all the book’s programs are available there for download.

As a side note, Dan Bornstein has created a Git repository tracking all the versions of BWK awk that he could find. It’s available at https://github.com/danfuzz/one-true-awk.

mawk

Michael Brennan wrote an independent implementation of awk, called mawk. It is available under the GPL (see GNU General Public License), just as gawk is.

The original distribution site for the mawk source code no longer has it. A copy is available at http://www.skeeve.com/gawk/mawk1.3.3.tar.gz.

In 2009, Thomas Dickey took on mawk maintenance. Basic information is available on the project’s web page. The download URL is http://invisible-island.net/datafiles/release/mawk.tar.gz.

Once you have it, gunzip may be used to decompress this file. Installation is similar to gawk’s (see Compiling and Installing gawk on Unix-Like Systems).
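As a minimal sketch of the decompression step, the following shell function pipes the output of gunzip into tar. The function name unpack_tarball is hypothetical, and the sketch assumes the downloaded file is a gzip-compressed tar archive, as its .tar.gz suffix suggests.

```shell
#!/bin/sh
# Sketch: decompress and unpack a .tar.gz archive in the current directory.
# "unpack_tarball" is an illustrative name (an assumption).
unpack_tarball() {
    archive=${1:-mawk.tar.gz}
    # gunzip -c writes the decompressed data to standard output and
    # leaves the original file in place; tar reads it from standard input.
    gunzip -c "$archive" | tar -xf -
}
```

After `unpack_tarball mawk.tar.gz`, change into the directory that the archive creates. Assuming the build follows the same pattern as gawk’s, the next steps would be `./configure` followed by `make`.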

See Common Extensions Summary for a list of extensions in mawk that are not in POSIX awk.

mawk 2.0

In 2016, Michael Brennan resumed mawk development. His development snapshots are available via Git from the project’s GitHub page.

awka

Written by Andrew Sumner, awka translates awk programs into C, compiles them, and links them with a library of functions that provide the core awk functionality. It also has a number of extensions.

Both the awk translator and the library are released under the GPL.

To get awka, go to https://sourceforge.net/projects/awka.

The project seems to be frozen; no new code changes have been made since approximately 2001.

Revive Awka

This project, available at https://github.com/noyesno/awka, intends to fix bugs in awka and add more features.

pawk

Nelson H.F. Beebe at the University of Utah has modified BWK awk to provide timing and profiling information. It is different from gawk with the --profile option (see Profiling Your awk Programs) in that it uses CPU-based profiling, not line-count profiling. You may find it at either ftp://ftp.math.utah.edu/pub/pawk/pawk-20030606.tar.gz or http://www.math.utah.edu/pub/pawk/pawk-20030606.tar.gz.

BusyBox awk

BusyBox is a GPL-licensed program providing small versions of many applications within a single executable. It is aimed at embedded systems. It includes a full implementation of POSIX awk. When building it, be careful not to do ‘make install’ as it will overwrite copies of other applications in your /usr/local/bin. For more information, see the project’s home page.

The OpenSolaris POSIX awk

The versions of awk in /usr/xpg4/bin and /usr/xpg6/bin on Solaris are more or less POSIX-compliant. They are based on the awk from Mortice Kern Systems for PCs. We were able to make this code compile and work under GNU/Linux with 1–2 hours of work. Making it more generally portable (using GNU Autoconf and/or Automake) would take more work, and this has not been done, at least to our knowledge.

The source code used to be available from the OpenSolaris website. However, that project was ended and the website shut down. Fortunately, the Illumos project makes this implementation available. You can view the files one at a time from https://github.com/joyent/illumos-joyent/blob/master/usr/src/cmd/awk_xpg4.

frawk

This is a language for writing short programs. “To a first approximation, it is an implementation of the AWK language; many common awk programs produce equivalent output when passed to frawk.” However, it has a number of important additional features. The code is available at https://github.com/ezrosent/frawk.

goawk

This is an awk interpreter written in the Go programming language. It implements POSIX awk, with a few minor extensions. Source code is available from https://github.com/benhoyt/goawk. The author wrote a nice article describing the implementation.

AWKgo

This is an awk to Go translator. It was written by the author of goawk. (See the previous entry in this list.) Source code is available from https://github.com/benhoyt/goawk/tree/master/awkgo. The author’s article about it is at https://benhoyt.com/writings/awkgo/.

jawk

This is an interpreter for awk written in Java. It claims to be a full interpreter, although because it uses Java facilities for I/O and for regexp matching, the language it supports is different from POSIX awk. More information is available on the project’s home page.

Hoijui’s jawk

This project, available at https://github.com/hoijui/Jawk, is another awk interpreter written in Java. It uses modern Java build tools.

Libmawk

This is an embeddable awk interpreter derived from mawk. For more information, see http://repo.hu/projects/libmawk/.

Mircea Neacsu’s Embeddable awk

Mircea Neacsu has created an embeddable awk interpreter, based on BWK awk. It’s available at https://github.com/neacsum/awk.

pawk

This is a Python module that claims to bring awk-like features to Python. See https://github.com/alecthomas/pawk for more information. (This is not related to Nelson Beebe’s modified version of BWK awk, described earlier.)

awkcc

This is an early adaptation of Unix awk that translates awk into C code. It was done by J. Christopher Ramming at Bell Labs, circa 1988. It’s available at https://github.com/nokia/awkcc. Bringing this up to date would be an interesting software engineering exercise.

QSE awk

This is an embeddable awk interpreter. For more information, see https://code.google.com/p/qse/.

QTawk

This is an independent implementation of awk distributed under the GPL. It has a large number of extensions over standard awk and may not be 100% syntactically compatible with it. See http://www.quiktrim.org/QTawk.html for more information, including the manual. The download link there is out of date; see http://www.quiktrim.org/#AdditionalResources for the latest download link.

The project may also be frozen; no new code changes have been made since approximately 2014.

cppawk

Quoting from the web page, “cppawk is a tiny shell script that is used like awk. It invokes the C preprocessor (GNU cpp) on the Awk code and calls Awk on the result.” This program may be of use if the way gawk’s @include facility works doesn’t suit your needs. For more information, see https://www.kylheku.com/cgit/cppawk/.
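The general idea of preprocessing first and running awk second can be sketched in a few lines of shell. This is not cppawk’s actual code. The helper name run_preprocessed_awk is hypothetical, and the sketch assumes that a cpp command accepting the -P option and an awk command are both on the search path.

```shell
#!/bin/sh
# Sketch of the preprocess-then-run idea; NOT cppawk's implementation.
# "run_preprocessed_awk" is an illustrative name (an assumption).
run_preprocessed_awk() {
    rpa_src=$1
    shift
    rpa_tmp=$(mktemp) || return 1
    # -P keeps cpp from emitting line markers in its output.
    cpp -P < "$rpa_src" > "$rpa_tmp" || { rm -f "$rpa_tmp"; return 1; }
    # Run awk on the preprocessed program; remaining arguments are
    # passed through as awk's own operands (input files, assignments).
    awk -f "$rpa_tmp" "$@"
    rpa_rc=$?
    rm -f "$rpa_tmp"
    return $rpa_rc
}
```

For example, given a file containing the line `#define LIMIT 3` followed by `BEGIN { for (i = 1; i <= LIMIT; i++) print i }`, the function should run the loop with LIMIT replaced by 3. Because awk comments also begin with ‘#’, a bare pipeline like this can misread them as preprocessor directives, which is one reason to use a purpose-built tool instead.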

Other versions

See also the “Versions and implementations” section of the Wikipedia article on awk for information on additional versions.

An interesting collection of library functions is available at https://github.com/e36freak/awk-libs.

An interesting collection of gawk extensions is available at https://github.com/su8/gawk-extensions.