4.6.4 Field Values With Fixed-Width Data

So far, so good. But what happens if there isn’t as much data as there should be based on the contents of FIELDWIDTHS? Or, what happens if there is more data than expected?

For many years, what happened in these cases was not well defined. Starting with version 4.2, the rules are as follows:

Enough data for some fields

For example, if FIELDWIDTHS is set to "2 3 4" and the input record is ‘aabbb’, then NF is set to two, because the record ends before the third field begins.

Not enough data for a field

For example, if FIELDWIDTHS is set to "2 3 4" and the input record is ‘aab’, then NF is set to two and $2 has the value "b". The idea is that even though there aren’t as many characters as were expected, there are some, so the data should be made available to the program.

Too much data

For example, if FIELDWIDTHS is set to "2 3 4" and the input record is ‘aabbbccccddd’, then NF is set to three and the extra characters (‘ddd’) are ignored. If you want gawk to capture the extra characters, supply a final ‘*’ in the value of FIELDWIDTHS.

Too much data, but with ‘*’ supplied

For example, if FIELDWIDTHS is set to "2 3 4 *" and the input record is ‘aabbbccccddd’, then NF is set to four, and $4 has the value "ddd".
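
The four cases above can be checked with a short shell session. This is only a sketch: the helper name show_fields and the variable fw are made up for illustration, and the output shown in the comments is what the rules above predict for gawk 4.2 or later (older versions may behave differently).

```shell
# show_fields is a hypothetical helper, not part of gawk.
# $1 = value for FIELDWIDTHS, $2 = the input record to split.
show_fields() {
    printf '%s\n' "$2" | gawk -v fw="$1" '
        BEGIN { FIELDWIDTHS = fw }    # select fixed-width splitting
        {
            printf "NF=%d", NF
            for (i = 1; i <= NF; i++)
                printf " $%d=<%s>", i, $i
            print ""                  # end the output line
        }'
}

show_fields "2 3 4"   "aabbb"         # NF=2 $1=<aa> $2=<bbb>
show_fields "2 3 4"   "aab"           # NF=2 $1=<aa> $2=<b>
show_fields "2 3 4"   "aabbbccccddd"  # NF=3 $1=<aa> $2=<bbb> $3=<cccc>
show_fields "2 3 4 *" "aabbbccccddd"  # NF=4 $1=<aa> $2=<bbb> $3=<cccc> $4=<ddd>
```

The angle brackets make it easy to see exactly which characters ended up in each field, including the short "b" in the second call and the trailing "ddd" that appears only when ‘*’ is supplied.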