The term awk refers to a particular program as well as to the language you use to tell this program what to do. When we need to be careful, we call the language “the awk language,” and the program “the awk utility.” This Web page explains both how to write programs in the awk language and how to run the awk utility. The term awk program refers to a program written by you in the awk programming language.
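As a small illustration of the distinction, consider the following sketch. The sample input is invented for this example; the text inside the single quotes is the awk program, and the command named awk is the awk utility that runs it.

```shell
# The awk utility (the command "awk") runs an awk program.
# The awk program here is the quoted text: { print $1 }
# It prints the first field of every input line.
# The two input lines are made-up sample data.
out=$(printf 'alpha 1\nbeta 2\n' | awk '{ print $1 }')
echo "$out"
```

The same program text could instead be saved in a file and passed to the utility with the -f option.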
Primarily, this Web page explains the features of awk as defined in the POSIX standard. It does so in the context of the gawk implementation. While doing so, it also attempts to describe important differences between gawk and other awk implementations.(1) Finally, any gawk features that are not in the POSIX standard for awk are noted.
This Web page has the difficult task of being both a tutorial and a reference. If you are a novice, feel free to skip over details that seem too complex. You should also ignore the many cross-references; they are for the expert user and for the online Info and HTML versions of the document.
There are sidebars scattered throughout the Web page. They add a more complete explanation of points that are relevant, but not likely to be of interest on first reading. All appear in the index, under the heading “sidebar.”
Most of the time, the examples use complete awk programs. Some of the more advanced sections show only the part of the awk program that illustrates the concept currently being described.
While this Web page is aimed principally at people who have not been exposed to awk, there is a lot of information here that even the awk expert should find useful. In particular, the description of POSIX awk and the example programs in Library Functions, and in Sample Programs, should be of interest.
This Web page is split into several parts, as follows:
Part I describes the awk language and gawk program in detail. It starts with the basics, and continues through all of the features of awk. It contains the following chapters:
Getting Started, provides the essentials you need to know to begin using awk.
Invoking Gawk, describes how to run gawk, the meaning of its command-line options, and how it finds awk program source files.
Regexp, introduces regular expressions in general, and in particular the flavors supported by POSIX awk and gawk.
The chapter on reading input describes how awk reads your data. It introduces the concepts of records and fields. I/O redirection is first described here. Network I/O is also briefly introduced here.
The chapter on printing describes how awk programs can produce output.
Expressions, describes expressions, which are the basic building blocks for getting most things done in a program.
Patterns and Actions, describes how to write patterns for matching records, actions for doing something when a record is matched, and the built-in variables awk and gawk use.
Arrays, covers awk's one-and-only data structure: associative arrays. Deleting array elements and whole arrays is also described, as well as sorting arrays in gawk. It also describes how gawk provides arrays of arrays.
Functions, describes the built-in functions awk and gawk provide, as well as how to define your own functions.
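The following sketch gives a rough preview of several ideas named in the chapter list above: records and fields, a pattern with an action, and an associative array. It is only an illustration under stated assumptions; the input data and the array name count are invented for this example.

```shell
# Each input line is a record; its whitespace-separated words are fields ($1, $2).
# Pattern: $2 > 0 selects only records whose second field is positive.
# Action: add the second field to an associative array indexed by the first field.
# The END rule runs once, after all input has been read.
# The three input lines are made-up sample data.
totals=$(printf 'apple 3\npear 0\napple 4\n' |
  awk '$2 > 0 { count[$1] += $2 } END { for (k in count) print k, count[k] }')
echo "$totals"
```

The record for pear is skipped because its second field is not greater than zero, so only the total for apple is printed.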
Part II shows how to use awk and gawk for problem solving. There is a lot of code here for you to read and learn from. It contains the following chapters:
Library Functions, provides a number of functions meant to be used from main awk programs.
Sample Programs, provides many sample awk programs.
Reading these two chapters allows you to see awk solving real problems.
Part III focuses on features specific to gawk. It contains the following chapters:
Advanced Features, describes a number of gawk-specific advanced features. Of particular note are the abilities to have two-way communications with another process, perform TCP/IP networking, and profile your awk programs.
Internationalization, describes special features in gawk for translating program messages into different languages at runtime.
Debugger, describes the awk debugger.
Arbitrary Precision Arithmetic, describes advanced arithmetic facilities provided by gawk.
Dynamic Extensions, describes how to add new variables and functions to gawk by writing extensions in C or C++.
Part IV provides the appendices, the Glossary, and two licenses that cover the gawk source code and this Web page, respectively. It contains the following appendices:
Language History, describes how the awk language has evolved from its first release to the present. It also describes how gawk has acquired features over time.
Installation, describes how to get gawk, how to compile it on POSIX-compatible systems, and how to compile and use it on different non-POSIX systems. It also describes how to report bugs in gawk and where to get other freely available awk implementations.
Notes, describes how to disable gawk's extensions, as well as how to contribute new code to gawk, and some possible future directions for gawk development.
Basic Concepts, provides some very cursory background material for those who are completely unfamiliar with computer programming.
The Glossary, defines most, if not all, of the significant terms used throughout this Web page. If you find terms that you aren't familiar with, try looking them up here.
Copying, and GNU Free Documentation License, present the licenses that cover the gawk source code and this Web page, respectively.
(1) All such differences appear in the index under the entry “differences in awk and gawk.”