
2.6 Field Specifiers

There are a number of options that require a list of fields from one of the files. All take an argument consisting of a comma-separated list of individual field specifications.

Each individual field specification has the form ‘s-e.p(instruction)’. By default (with no field delimiter given for the file), ‘s’ is a number giving the starting position of the field within the records of the given file, and ‘e’ is a number giving the ending position. If there is no hyphen, the single number ‘s’ represents a one-byte field at position ‘s’. If the first number and a hyphen appear without the number ‘e’, the field is assumed to start at ‘s’ and extend to the end of the record. Byte numbering within the record starts at 1.
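As a minimal sketch of the fixed-width rules above, the following Python function shows which bytes a specification of the form ‘s’, ‘s-e’, or ‘s-’ would select. The name `slice_field` is hypothetical and is not part of combine; the sketch ignores the optional ‘.p’ and ‘(instruction)’ parts and does no error checking.

```python
def slice_field(record: bytes, spec: str) -> bytes:
    """Return the bytes of `record` selected by a fixed-width spec.

    Illustrative only: handles 's', 's-e', and 's-' with 1-based,
    inclusive positions, as described in the text.
    """
    start_text, hyphen, end_text = spec.partition("-")
    start = int(start_text)
    if not hyphen:
        end = start              # 's' alone: a one-byte field
    elif end_text == "":
        end = len(record)        # 's-': run to the end of the record
    else:
        end = int(end_text)      # 's-e': explicit ending position
    # Convert 1-based inclusive positions to a 0-based Python slice.
    return record[start - 1:end]
```

For a ten-byte record, `slice_field(b"abcdefghij", "3-5")` selects the third through fifth bytes, and `"8-"` selects everything from the eighth byte onward.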

If you do provide a delimiter for identifying the borders between fields in your file, then combine does not need to count bytes, and you just need to provide the field number (starting at 1) of the fields you want to declare. Currently, giving a range of fields results in combine acting as though you listed every field from the lower end to the upper end of the range individually. Any precision or extensibility information you provide will be applied to each field in the range.
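The range behavior described above can be sketched in Python as follows. The function name `expand_fields` is an assumption made for illustration and does not come from combine; the sketch handles only plain field numbers and closed ranges, and leaves out precision and instruction suffixes.

```python
def expand_fields(spec_list: str) -> list[int]:
    """Expand a comma-separated list of field numbers and ranges.

    Illustrative only: a range 'low-high' is treated as though every
    field from low to high had been listed individually.
    """
    fields = []
    for item in spec_list.split(","):
        low, hyphen, high = item.partition("-")
        if hyphen:
            # Inclusive range: list each field number separately.
            fields.extend(range(int(low), int(high) + 1))
        else:
            fields.append(int(low))
    return fields
```

With this sketch, `expand_fields("4-8")` and `expand_fields("4,5,6,7,8")` produce the same list of field numbers.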

The optional ‘.p’ gives the number of digits of precision with which the field should be considered. At present its only practical use is in fields to be summed from data files, where it is used in converting between string and number formats.

The optional ‘(instruction)’ is intended for extensibility. For output fields and key fields it is a scheme command that should be run when the field is read into the system. If the associated field specification is a key for matching reference and data records, this can affect the way the records are matched. If it is a field to be written to an output file, it will affect what the user sees.

To refer to the field actually read from the source data within the scheme command, use the following scheme variable in the appropriate place in the command: ‘input-field’.

Scheme commands attached to individual fields need to return a character string and have available only the current text of the field as described above(1). In delimited output, feel free to alter the length of the return string at will. In fixed-width output, it will be your responsibility to ensure that you return a string of a useful length if you want to maintain the fixed structure.

Consider the following string:

 
1-10,15-20.2,30(HiMom input-field),40-

The first field runs from position 1 to 10 in the record. The second runs from 15 to 20, with a stated (but possibly meaningless) precision of 2. The third is the single byte at position 30, to be adjusted by replacing it with whatever the scheme function HiMom does to it. The fourth field starts at position 40 and extends to the end of the record.
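To make the anatomy of such a string concrete, here is a rough Python sketch that splits a field list into its parts. The names `FieldSpec` and `parse_field_list` are hypothetical, and this is not how combine itself parses its options; in particular, the sketch assumes the instruction contains no nested parentheses.

```python
import re
from typing import NamedTuple, Optional

class FieldSpec(NamedTuple):
    start: int
    end: Optional[int]           # None means "to the end of the record"
    precision: Optional[int]     # the optional '.p' part
    instruction: Optional[str]   # the optional '(instruction)' part

# One specification: s, optional -e, optional .p, optional (instruction),
# followed by a comma or the end of the string.
_SPEC = re.compile(
    r"(?P<start>\d+)"
    r"(?:(?P<hyphen>-)(?P<end>\d+)?)?"
    r"(?:\.(?P<precision>\d+))?"
    r"(?:\((?P<instruction>[^()]*)\))?"
    r"(?:,|$)"
)

def parse_field_list(text: str) -> list[FieldSpec]:
    """Split a comma-separated field list into FieldSpec tuples."""
    specs = []
    pos = 0
    while pos < len(text):
        m = _SPEC.match(text, pos)
        if m is None:
            raise ValueError(f"cannot parse field list at offset {pos}")
        start = int(m["start"])
        if m["hyphen"] is None:
            end = start          # single position: one-byte field
        elif m["end"] is None:
            end = None           # 's-': open-ended field
        else:
            end = int(m["end"])
        precision = int(m["precision"]) if m["precision"] else None
        specs.append(FieldSpec(start, end, precision, m["instruction"]))
        pos = m.end()
    return specs
```

Applied to the example string, the sketch yields four specifications: positions 1 to 10, positions 15 to 20 with precision 2, position 30 with the instruction text `HiMom input-field`, and an open-ended field starting at position 40.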

In a delimited example, the following two sets of options are equivalent.

 
-D ',' -o 4,5,6,7,8
-D ',' -o 4-8

In both cases the fourth, fifth, sixth, seventh, and eighth comma-delimited fields are each treated as an individual field for processing. Note that if you also provide a field order (with the ‘-O’ option), that order refers to these fields as 1, 2, 3, 4, and 5, following the order in which you specified them and regardless of their positions in the original file.
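The renumbering can be pictured with a small Python sketch. The function `select_fields` is a hypothetical helper, not part of combine, and it models only the numbering described above, not the behavior of the ‘-O’ option itself.

```python
def select_fields(record: str, delimiter: str, numbers: list[int]) -> dict[int, str]:
    """Pick delimited fields by 1-based number and renumber them.

    Illustrative only: the selected fields are keyed 1, 2, 3, ... in
    the order they were requested, not by their original positions.
    """
    parts = record.split(delimiter)
    return {new: parts[old - 1] for new, old in enumerate(numbers, start=1)}
```

For the record `a,b,c,d,e,f,g,h,i`, selecting fields 4 through 8 gives a mapping in which the fourth original field (`d`) is field 1 and the eighth (`h`) is field 5.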



This document was generated by Daniel P. Valentine on July 28, 2013 using texi2html 1.82.