GNU Astronomy Utilities


5.3.2 Operation precedence in Table

The Table program can do many operations on the rows and columns of the input tables, and they are not always applied in the order you give them on the command-line. This section describes the order in which the operations are applied. Knowing this precedence is important to avoid confusion when you ask for more than one operation. For a description of each option, please see Invoking Table.

Column information (--information or -i)

When given this option, the column data are not read at all. Table simply reads the column metadata (name, units, numeric data type and comments) and the number of rows, and prints them. Table then terminates and no other operation is done. You can therefore append this option to an arbitrarily long Table command just to recall the column metadata, then delete it and continue writing the command (using the shell’s history to retrieve the previous command with the up-arrow key).
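
As a minimal sketch of this behavior (assuming Gnuastro is installed; table.txt and its column names are hypothetical and written here in Gnuastro's plain-text table format), the command below also asks for sorting and a row limit, but only the metadata should be printed:

```shell
# Hypothetical three-column catalog, made only for this illustration.
cat > table.txt <<'EOF'
# Column 1: ID  [counter, i32] Object identifier
# Column 2: RA  [deg, f64]     Right ascension
# Column 3: MAG [mag, f32]     Magnitude
1  10.5  18.2
2  10.7  21.4
3  10.9  19.6
EOF

# --information takes over: the column data are not read, so the
# --sort and --head requests have no effect on what is printed.
info=$(asttable table.txt --sort=MAG --head=2 --information)
echo "$info"
```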

Column selection (--column)

When this option is given, only the columns given to this option (from the main input) will be used for all future steps. When --column (or -c) is not given, then all the main input’s columns will be used in the next steps.

Column(s) from other file(s) (--catcolumnfile and --catcolumnhdu, --catcolumns)

When column concatenation (addition) is requested, columns from other tables (in other files, or other HDUs of the same FITS file) will be added after the existing columns read from the main input. In one command, you can call these options multiple times to allow addition of columns from many files.

The rest of the operations below are done on the rows, so you can merge the columns of various tables into one table, then start adding/limiting the rows of the output. If any of the row-based operations below are requested in the same asttable command, they will also be applied to the rows of the added columns. However, the conditions to keep/reject rows can only be applied to the columns of the main input table (not the columns that are added with these options).
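
The sketch below illustrates this under some assumptions: Gnuastro is installed, and main.txt and extra.txt are hypothetical plain-text tables with the same number of rows. The selection condition uses MAG (from the main input), but the surviving rows should also carry their value from the added Z column:

```shell
# Hypothetical main input: two columns, three rows.
cat > main.txt <<'EOF'
# Column 1: ID  [counter, i32] Object identifier
# Column 2: MAG [mag, f32]     Magnitude
1 18.2
2 21.4
3 19.6
EOF

# Hypothetical second table: one column, same number of rows.
cat > extra.txt <<'EOF'
# Column 1: Z [none, f32] Redshift
0.5
1.2
0.8
EOF

# The Z column is appended after ID and MAG, then rows are selected
# by MAG: the rows with ID 1 and 3 should remain, each with three columns.
merged=$(asttable main.txt --catcolumnfile=extra.txt --range=MAG,18:20)
echo "$merged"
```

Replacing MAG with Z in the --range call above is expected to fail, because the condition column has to come from the main input.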

Rows from other file(s) (--catrowfile and --catrowhdu)

With this feature, you can import rows from other tables (in other files, or other HDUs of the same FITS file). The same column selection of --column is applied to the tables given here. The column metadata (name, units and comments) will be taken from the main input. Two conditions are mandatory for adding rows:

  • The number of columns used from the new tables must be equal to the number of columns in memory, by the time control reaches here.
  • The data type of each column (see Numeric data types) should be the same as the respective column in memory by the time control reaches here. If the data types are different, you can use the type conversion operators of Table’s column arithmetic on the inputs in a separate command first (see Numerical type conversion operators and Column arithmetic).
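
A minimal sketch of adding rows, assuming Gnuastro is installed and using two hypothetical plain-text tables (a.txt and b.txt) whose columns have the same names and data types. The -cID,MAG selection should be applied to both files, so both conditions above are satisfied:

```shell
# Two hypothetical catalogs with identical column names and types.
cat > a.txt <<'EOF'
# Column 1: ID  [counter, i32] Object identifier
# Column 2: RA  [deg, f64]     Right ascension
# Column 3: MAG [mag, f32]     Magnitude
1 10.5 18.2
2 10.7 21.4
EOF
cat > b.txt <<'EOF'
# Column 1: ID  [counter, i32] Object identifier
# Column 2: RA  [deg, f64]     Right ascension
# Column 3: MAG [mag, f32]     Magnitude
3 10.9 19.6
4 11.1 23.1
EOF

# The rows of b.txt are appended below those of a.txt; only the ID
# and MAG columns are used from each file.
stacked=$(asttable a.txt -cID,MAG --catrowfile=b.txt)
echo "$stacked"
```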

Row selection by value in a column
  • --range: only keep rows within a certain interval in given column.
  • --inpolygon: only keep rows within the polygon of --polygon.
  • --outpolygon: only keep rows outside the polygon of --polygon.
  • --equal: only keep rows with specified value in given column.
  • --notequal: only keep rows without specified value in given column.
  • --noblank: only keep rows that are not blank in the given column(s).

These options take certain column(s) as input and remove some rows from the full table (all columns), based on the given limitations. They can be called any number of times (to limit the final rows based on values in different columns for example). Since these are row-rejection operations, their internal order is irrelevant. In other words, it makes no difference if --equal is called before or after --range for example.

As a side-effect, because NaN/blank values are defined to fail on any condition, these operations will also remove rows with NaN/blank values in the specified column they are checking. Also, the columns that are used for these operations do not necessarily have to be in the final output table (you may not need the column after doing the selection based on it).

Even though these options are applied after merging columns from other tables, currently their condition-columns can only come from the main input table. In other words, even though the rows of the added columns (from another file) will also be selected with these options, the condition to keep/reject rows cannot be taken from the newly added columns.

These options are applied before sorting and the position-based row selection below, because the speed of later operations can be greatly affected by the number of rows. For example, if you also call the --sort option and your row selection leaves only 50 rows (from an input of 1000 rows), sorting the 50 selected rows is much faster than sorting the full input.
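
The points above can be checked with a small sketch (assuming Gnuastro is installed; table.txt and its column names are hypothetical). The two commands differ only in the order of --range and --equal, so they should give the same output. The MAG and FLAG columns are used for the selection but are not part of the output, and the row with a NaN magnitude should be dropped by --range:

```shell
# Hypothetical catalog with one blank (NaN) magnitude.
cat > table.txt <<'EOF'
# Column 1: ID   [counter, i32] Object identifier
# Column 2: MAG  [mag, f32]     Magnitude
# Column 3: FLAG [none, u8]     Quality flag
1 18.2 1
2 21.4 0
3 19.6 1
4 23.1 1
5 nan  1
EOF

# Same selection, written in two different orders. Only the rows
# with ID 1 and 3 satisfy both conditions.
a=$(asttable table.txt -cID --range=MAG,18:22 --equal=FLAG,1)
b=$(asttable table.txt -cID --equal=FLAG,1 --range=MAG,18:22)
echo "$a"
echo "$b"
```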

Sorting (--sort)

Sort the rows based on the values in a certain column. The column to sort by can only come from the main input table's columns (not columns that may have been added with --catcolumnfile).

Row selection (by position)
  • --head: keep only requested number of top rows.
  • --tail: keep only requested number of bottom rows.
  • --rowrandom: keep only the requested number of randomly selected rows.
  • --rowrange: keep only rows within a certain positional interval.

These options limit/select rows based on their position within the table (not their value in any certain column).
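
Because sorting has a higher precedence than selection by position, --head acts on the sorted rows even when it is written before --sort. A minimal sketch, assuming Gnuastro is installed and using a hypothetical table.txt whose two brightest rows are not its first two rows:

```shell
# Hypothetical catalog: the smallest MAG values are in rows 2 and 4.
cat > table.txt <<'EOF'
# Column 1: ID  [counter, i32] Object identifier
# Column 2: MAG [mag, f32]     Magnitude
1 21.4
2 18.2
3 23.1
4 19.6
EOF

# --head is written first in one command and last in the other, but
# in both cases the rows are sorted (ascending MAG) before the top
# two are kept, so both should print the rows with ID 2 and 4.
first=$(asttable table.txt --head=2 --sort=MAG)
second=$(asttable table.txt --sort=MAG --head=2)
echo "$first"
echo "$second"
```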

Column arithmetic

Once the final rows are selected in the requested order, column arithmetic is done (if requested). For more on column arithmetic, see Column arithmetic.

Column metadata (--colmetadata)

Column metadata are updated after column arithmetic and after adding new columns from other tables (both done above), because those steps produce columns whose name, units and comments usually need to be set.

Output row selection (--noblankend)

Only keep the output rows that do not have a blank value in the given column(s). For example, you may need to apply arithmetic operations on the columns (through Column arithmetic) before rejecting the undesired rows. After the arithmetic operation is done, you can use the where operator to set the values of the undesired rows to NaN/blank and use the --noblankend option to remove those rows just before writing the output. In other scenarios, you may want to remove blank values based on columns from another table. Because this step comes after --colmetadata, you can also refer to columns by their updated names. See the example below for applying any generic value-based row selection with --noblankend.

As an example, let’s review how Table interprets the command below. We are assuming that table.fits contains at least three columns: RA, DEC and PARAM and you only want the RA and DEC of the rows where \(p\times 2\le 5\) (\(p\) is the value of each row in the PARAM column).

asttable table.fits -cRA,DEC --noblankend=MULTIP \
         -c'arith PARAM 2 x set-i i i 5 gt nan where' \
         --colmetadata=3,MULTIP,unit,"Description of column"

Due to the precedence described in this section, Table does these operations (which are independent of the order of the operations written on the command-line):

  1. At the start (with -cRA,DEC), Table reads the RA and DEC columns.
  2. Among all the operations in the command above, Column arithmetic (with -c'arith ...') has the highest precedence. So the arithmetic operation is done and stored as a new (third) column. In this arithmetic operation, we multiply all the values of the PARAM column by 2, then set all those with a value larger than 5 to NaN (for more on understanding this operation, see the ‘set-’ and ‘where’ operators in Arithmetic operators).
  3. Updating column metadata (with --colmetadata) is then done to give a name (MULTIP) to the newly calculated (third) column. During the process, besides a name, we also set a unit and description for the new column. These metadata entries are very important, so always be sure to add metadata after doing column arithmetic.
  4. The lowest precedence operation is --noblankend=MULTIP. So only rows that are not blank/NaN in the MULTIP column are kept.
  5. Finally, the output table (with three columns) is printed on the command-line (standard output). If you also want to print the column metadata, you can use the --colinfoinstdout option. Alternatively, if you want the output in a file, you can use the --output option to save the table in FITS or plain-text format.

Out of precedence: It may happen that your desired operation needs a separate precedence. In this case you can pipe the output of Table into another call of Table and use the --colinfoinstdout option to preserve the metadata between the two calls.

For example, let’s assume that you want to sort the output table from the example command above based on the new MULTIP column. Since sorting is done prior to column arithmetic, you cannot do it in one command, but you can circumvent this limitation by simply piping the output (including metadata) to another call to Table:

asttable table.fits -cRA,DEC --noblankend=MULTIP --colinfoinstdout \
         -c'arith PARAM 2 x set-i i i 5 gt nan where' \
         --colmetadata=3,MULTIP,unit,"Description of column" \
         | asttable --sort=MULTIP --output=selected.fits
