GNU Astronomy Utilities



12.3.10 Table input output (table.h)

A table is a collection of one-dimensional datasets that are packed together into one file. Tables are the single most common format for storing high-level (processed) information, hence they play a very important role in Gnuastro. For a more thorough introduction, please see Table. Gnuastro’s Table program, and all the other programs that read from and write into tables, use the functions of this section for reading and writing their input/output tables. For a simple demonstration of the constructs introduced here, see Library demo - reading and writing table columns.

Currently, Gnuastro only supports plain text (see Gnuastro text table format) and FITS (ASCII and binary) tables. However, the low-level table infrastructure is written such that other formats can also be accommodated, and more formats will hopefully be supported in future releases. Please do not hesitate to suggest your favorite format so it can be implemented when possible.

Macro: GAL_TABLE_DEF_WIDTH_STR
Macro: GAL_TABLE_DEF_WIDTH_INT
Macro: GAL_TABLE_DEF_WIDTH_LINT
Macro: GAL_TABLE_DEF_WIDTH_FLT
Macro: GAL_TABLE_DEF_WIDTH_DBL
Macro: GAL_TABLE_DEF_PRECISION_INT
Macro: GAL_TABLE_DEF_PRECISION_FLT
Macro: GAL_TABLE_DEF_PRECISION_DBL

The default width and precision used when writing generic types into a text file (plain text and FITS ASCII tables). When the dataset does not have a pre-set width and precision (see disp_width and disp_precision in Generic data container (gal_data_t)), these values are passed directly to C’s printf to write the number as a string.

Macro: GAL_TABLE_DISPLAY_FMT_STRING
Macro: GAL_TABLE_DISPLAY_FMT_DECIMAL
Macro: GAL_TABLE_DISPLAY_FMT_UDECIMAL
Macro: GAL_TABLE_DISPLAY_FMT_OCTAL
Macro: GAL_TABLE_DISPLAY_FMT_HEX
Macro: GAL_TABLE_DISPLAY_FMT_FIXED
Macro: GAL_TABLE_DISPLAY_FMT_EXP
Macro: GAL_TABLE_DISPLAY_FMT_GENERAL

The display format used in C’s printf to display data of different types. The _STRING and _DECIMAL formats are the only options for printing strings and signed integers; they are mainly here for completeness. However, unsigned integers and floating points can be displayed in multiple formats:

Unsigned integer

For unsigned integers, it is possible to choose from _UDECIMAL (unsigned decimal), _OCTAL (octal notation, for example, 125 in decimal will be displayed as 175), and _HEX (hexadecimal notation, for example, 125 in decimal will be displayed as 7D).

Floating point

For floating point, it is possible to display the number in _FIXED (fixed-point notation, for example, 1500.345), _EXP (exponential notation, for example, 1.500345e+03), or _GENERAL, which is the best of the two for the given number.

Macro: GAL_TABLE_FORMAT_INVALID
Macro: GAL_TABLE_FORMAT_TXT
Macro: GAL_TABLE_FORMAT_AFITS
Macro: GAL_TABLE_FORMAT_BFITS

All the table formats that Gnuastro currently accepts. AFITS and BFITS represent FITS ASCII tables and FITS binary tables, respectively. You can use these anywhere you see the tableformat variable.

Macro: GAL_TABLE_SEARCH_INVALID
Macro: GAL_TABLE_SEARCH_NAME
Macro: GAL_TABLE_SEARCH_UNIT
Macro: GAL_TABLE_SEARCH_COMMENT

When the desired column is not identified by a number, these values determine whether the string to match, or the regular expression to search for, is looked up in the name, units or comments of the column metadata. These values should be used for the searchin arguments of the functions.

Function:
uint8_t
gal_table_displayflt_from_str (char *string)

Convert the input string into either GAL_TABLE_DISPLAY_FMT_FIXED (for fixed-point notation) or GAL_TABLE_DISPLAY_FMT_EXP (for exponential notation).

Function:
char *
gal_table_displayflt_to_str (uint8_t id)

Convert the input identifier (either GAL_TABLE_DISPLAY_FMT_FIXED for fixed-point notation, or GAL_TABLE_DISPLAY_FMT_EXP for exponential notation) into the standard string that is used to identify it.

Function:
gal_data_t *
gal_table_info (char *filename, char *hdu, gal_list_str_t *lines, size_t *numcols, size_t *numrows, int *tableformat)

Store the information of each column of a table into an array of meta-data gal_data_ts. In a meta-data gal_data_t, the size elements are zero (ndim=size=0 and dsize=NULL), but the other relevant elements are filled. See the end of this description for the exact components of each gal_data_t that are filled.

The returned array of gal_data_ts has numcols datasets (one data structure for each column). The number of rows in each dataset is stored in numrows (in a table, all the columns have the same number of rows). The format of the table (e.g., ASCII text file, or FITS binary or ASCII table) will be put in tableformat (macros defined above). If the filename is not a FITS file, then hdu will not be used (can be NULL).

The input must be either a file (specified by filename) or a list of strings (lines). lines is a list of strings with each node representing one line (including the new-line character), see List of strings. It will mostly be the output of gal_txt_stdin_read, which is used to read the program’s input as separate lines from the standard input (see Text files (txt.h)). Note that filename and lines are mutually exclusive and one of them must be NULL.

In the output datasets, only the meta-data strings (column name, units and comments), will be allocated and set as shown below. This function is just for column information (meta-data), not column contents.

*restrict array  ->  Blank value (if present, in col's own type).
           type  ->  Type of column data.
           ndim  ->  0
         *dsize  ->  NULL
           size  ->  0
      quietmmap  ->  ------------
      *mmapname  ->  ------------
     minmapsize  ->  Repeat (length of vector; 1 if not vector).
           nwcs  ->  ------------
           *wcs  ->  ------------
           flag  ->  'GAL_TABLEINTERN_FLAG_*' macros.
         status  ->  ------------
          *name  ->  Column name.
          *unit  ->  Column unit.
       *comment  ->  Column comments.
       disp_fmt  ->  'GAL_TABLE_DISPLAY_FMT' macros.
     disp_width  ->  Width of string columns.
 disp_precision  ->  ------------
          *next  ->  Pointer to next column's metadata
         *block  ->  ------------

Function:
void
gal_table_print_info (gal_data_t *allcols, size_t numcols, size_t numrows, char *hdu_option_name)

Print the column information for all the columns (output of gal_table_info) to standard output. The output has the same format as that of the following command with Gnuastro’s Table program (see Invoking Table):

$ asttable --info table.fits

Function:
gal_data_t *
gal_table_read (char *filename, char *hdu, gal_list_str_t *lines, gal_list_str_t *cols, int searchin, int ignorecase, size_t numthreads, size_t minmapsize, int quietmmap, size_t *colmatch, char *hdu_option_name)

Read the specified columns in a file (named filename), or list of strings (lines) into a linked list of data structures. If the file is FITS, then hdu will also be used, otherwise, hdu is ignored. For more on hdu_option_name see the description of gal_array_read in Array input output.

lines is a list of strings with each node representing one line (including the new-line character), see List of strings. It will mostly be the output of gal_txt_stdin_read, which is used to read the program’s input as separate lines from the standard input (see Text files (txt.h)). Note that filename and lines are mutually exclusive and one of them must be NULL.

The information to search for columns should be specified by the cols list of strings (see List of strings). The string in each node of the list may be a number, an exact match to a column name, or a regular expression (in GNU AWK format) enclosed in / /. The searchin value must be one of the macros defined above. If cols is NULL, then this function will read the full table. Also, the ignorecase value should be 1 if you want to ignore the case of alphabetic characters while matching/searching column meta-data (see Input/Output options).

For FITS tables, each column will be read independently. Therefore they will be read in numthreads CPU threads to greatly speed up the reading when there are many columns and rows. However, this only happens if CFITSIO was configured with --enable-reentrant. This is checked at Gnuastro’s configuration time: when CFITSIO is reentrant, GAL_CONFIG_HAVE_FITS_IS_REENTRANT will have a value of 1; otherwise, it will have a value of 0. For more on this macro, see Configuration information (config.h). Multi-threaded table reading currently applies only to FITS tables, not to other table formats.

The output is an individually allocated list of datasets (see List of gal_data_t) in the same order as the cols list. Note that one node in the cols list might give multiple columns (for example, from a regular expression). In this case, the output columns that correspond to that one input are in the order of the table (which column was read first). So the first requested column is the first popped data structure and so on.

If colmatch!=NULL, it is assumed to be an array that has at least as many elements as there are nodes in the cols list. The number of columns that matched each input column will be stored in the corresponding element.

Function:
gal_list_sizet_t *
gal_table_list_of_indexs (gal_list_str_t *cols, gal_data_t *allcols, size_t numcols, int searchin, int ignorecase, char *filename, char *hdu, size_t *colmatch)

Return a list of indices (starting from 0) of the input columns that match the names/numbers given to cols. This is a low-level operation which is called by gal_table_read (described above); see there for more on each argument’s description. allcols is the array returned by gal_table_info.

Function:
void
gal_table_comments_add_intro (gal_list_str_t **comments, char *program_string, time_t *rawtime)

Add some basic information to the list of comments. This includes the following:

  • If the program is run in a Git version controlled directory, Git’s description is printed (see description under COMMIT in Output FITS files).
  • The calendar time that is stored in rawtime (time_t is C’s calendar time format defined in time.h). You can calculate the time in this format with the following expressions:
    time_t rawtime;
    time(&rawtime);
    
  • The name of your program in program_string. If it is NULL, this line is ignored.

Function:
void
gal_table_write (gal_data_t *cols, struct gal_fits_list_key_t **keylist, gal_list_str_t *comments, int tableformat, char *filename, char *extname, uint8_t colinfoinstdout, int freekeys)

Write cols (a list of datasets, see List of gal_data_t) into a table stored in filename. The format of the table is set with tableformat, which accepts the macros defined above. When filename==NULL, the columns will be printed on the standard output (command-line).

If comments!=NULL, the list of comments (see List of strings) will also be printed into the output table. When the output table is a plain text file, every node of comments will be printed after a # (so it can be considered as a comment) and in FITS table they will follow a COMMENT keyword.

If a file named filename already exists, the operation depends on the type of output. When filename is a FITS file, the table will be added as a new extension after all existing extensions. If filename is a plain text file, this function will abort with an error.

If filename is a FITS file, the table extension will have the name extname.

When colinfoinstdout!=0 and filename==NULL (columns are printed in the standard output), the dataset metadata will also be printed in the standard output. When printing to the standard output, the output may be piped into another program for further processing, in which case the meta-data (lines starting with a #) must be ignored. In such cases, you can print only the column values by passing 0 to colinfoinstdout.

Function:
void
gal_table_write_log (gal_data_t *logll, char *program_string, time_t *rawtime, gal_list_str_t *comments, char *filename, int quiet)

Write the logll list of datasets into a table in filename (see List of gal_data_t). This function is just a wrapper around gal_table_comments_add_intro and gal_table_write (see above). If quiet is zero, this function will print a message saying that filename has been created.

Function:
gal_data_t *
gal_table_col_vector_extract (gal_data_t *vector, gal_list_sizet_t *indexs)

Given the vector column vector (which is assumed to be a 2D dataset), extract the tokens (components) that are identified in the indexs list into a list of one-dimensional datasets. For more on vector columns in tables, see Vector columns.

Function:
gal_data_t *
gal_table_cols_to_vector (gal_data_t *list)

Merge the one-dimensional datasets in the given list into one two-dimensional dataset that can be treated as a vector column. All the input datasets must have the same size and type. For more on vector columns in tables, see Vector columns.