GNU Astronomy Utilities



4.10 Output FITS files

The output of many of Gnuastro’s programs is (or can be) a FITS file. The FITS format has many useful features for storing scientific datasets (cubes, images and tables), along with robust features for archivability. For more on this standard, please see Fits.

As a community convention described in Fits, the first extension of all FITS files produced by Gnuastro’s programs only contains the meta-data that is intended for the file’s extension(s). For a Gnuastro program, this generic meta-data (that is stored as FITS keyword records) is its configuration when it produced this dataset: file name(s) of input(s) and option names, values and comments. You can use the --outfitsnoconfig option to stop the programs from writing these keywords into the first extension of their output.
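To make the idea of a keyword record concrete, the following Python sketch packs one option name, value and comment into a single 80-character record, using the fixed format of the FITS standard. It is not Gnuastro’s own code: the function name fits_card and the sample option names and values are hypothetical. Names longer than 8 characters need a convention such as HIERARCH, which this sketch does not implement, so it simply rejects them.

```python
def fits_card(keyword, value, comment=""):
    """Build one 80-character FITS keyword record (fixed format).

    Illustrative only: this is not how Gnuastro writes its keywords.
    """
    if len(keyword) > 8:
        # Longer names need a convention such as HIERARCH (not shown here).
        raise ValueError("standard FITS keyword names have at most 8 characters")

    if isinstance(value, bool):
        # Logical values are a single T or F, right-justified to column 30.
        text = ("T" if value else "F").rjust(20)
    elif isinstance(value, (int, float)):
        # Numbers are right-justified to column 30; exponents use upper case.
        text = repr(value).upper().rjust(20)
    else:
        # Strings are quoted, with embedded single quotes doubled and the
        # content padded to at least 8 characters.
        text = "'" + str(value).replace("'", "''").ljust(8) + "'"
        text = text.ljust(20)

    # Columns 1-8: name, columns 9-10: "= ", then the value field.
    card = keyword.upper().ljust(8) + "= " + text
    if comment:
        card += " / " + comment
    if len(card) > 80:
        raise ValueError("record does not fit in 80 characters")
    return card.ljust(80)


# Made-up option names and values, only to show the layout.
print(fits_card("quantile", 0.5, "made-up option"))
print(fits_card("input", "a.fits", "made-up input file name"))
```

The fixed 80-character width is why long option names or long values cannot always be stored as they are.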

When the configuration is too trivial (only the input file name, as in the Table program, for example), no meta-data is written in this extension. FITS keywords have the following limitations with regard to generic option names and values, which are described below:

The keywords above are classified (separated by an empty line and title) as a group titled “ProgramName configuration”. This meta-data extension also contains a final group of keywords to keep the basic date and version information of Gnuastro, its dependencies and the pipeline that is using Gnuastro (if it is under version control); they are listed below.

DATE

The creation time of the FITS file. This date is written directly by CFITSIO and is in UT format.

While the date can be useful metadata in most scenarios, it does have a caveat: even when everything else in your output is the same between multiple runs, the date will be different! If exact reproducibility is important to you, this can be a problem. To stop any Gnuastro program from writing the DATE keyword, you can use the --outfitsnodate option (see Input/Output options).

DATEUTC

If the date in the DATE keyword is in UTC, this keyword will have a value of 1; otherwise, it will have a value of 0. If DATE is not written, this is also ignored.
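The reproducibility caveat of DATE can be illustrated with a short Python sketch. It formats a moment in time as a UTC string of the form YYYY-MM-DDThh:mm:ss, the form used for FITS date values. The function name fits_date and the sample moments are hypothetical; this only mimics the kind of value stored, and is not the code CFITSIO uses.

```python
from datetime import datetime, timedelta, timezone


def fits_date(moment=None):
    """Format a timezone-aware moment as a UTC string (YYYY-MM-DDThh:mm:ss).

    With no argument, the current time is used.
    """
    if moment is None:
        moment = datetime.now(timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


# Two made-up runs, one second apart: everything else may be identical,
# but the stored date is not.
first = datetime(2024, 3, 1, 12, 30, 5, tzinfo=timezone.utc)
second = first + timedelta(seconds=1)
print(fits_date(first))
print(fits_date(second))
print(fits_date(first) == fits_date(second))
```

Because the two strings differ, a byte-by-byte comparison of two otherwise identical outputs would fail, which is the situation --outfitsnodate is meant to avoid.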

COMMIT

Git’s commit description from the running directory of Gnuastro’s programs. If the running directory is not version controlled or libgit2 is not installed (see Optional dependencies) then this keyword will not be present. The printed value is equivalent to the output of the following command:

git describe --dirty --always

If the running directory contains non-committed work, then the stored value will have a ‘-dirty’ suffix. This can be very helpful to let you know that the data is not ready to be shared with collaborators or submitted to a journal. You should only share results that are produced after all your work is committed (safely stored in the version controlled history and thus reproducible).
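As a minimal sketch of how this check could be automated before sharing results, the Python functions below run the same git command in a directory and test for the ‘-dirty’ suffix. The function names describe_commit and is_dirty are hypothetical, and the sketch assumes the git command-line program is installed (Gnuastro itself uses libgit2, not this approach).

```python
import subprocess


def describe_commit(directory="."):
    """Return the output of `git describe --dirty --always`, or None.

    None is returned when git is not installed, the directory does not
    exist, or the directory is not under version control.
    """
    try:
        result = subprocess.run(
            ["git", "describe", "--dirty", "--always"],
            cwd=directory,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def is_dirty(description):
    """True when a commit description carries the '-dirty' suffix."""
    return description is not None and description.endswith("-dirty")


description = describe_commit(".")
if description is None:
    print("not under version control (or git is unavailable)")
elif is_dirty(description):
    print(description, "has non-committed work: not ready to share")
else:
    print(description, "is fully committed")
```

The same is_dirty test can be applied to the value of a COMMIT keyword read from an existing FITS file.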

At first sight, version control appears to be mainly a tool for software developers. However, progress in scientific research is almost identical to progress in software development: first you have a rough idea that starts with a handful of easy steps. But as the first results appear promising, you will have to extend, or generalize, it to make it more robust and work in all the situations your research covers, not just your first test samples. Slowly you will find wrong assumptions or bad implementations that need to be fixed (‘bugs’ in software development parlance). Finally, when you submit the research to your collaborators or a journal, many comments and suggestions will come in, and you have to address them.

Software developers have created version control systems precisely for this kind of activity. Each significant moment in the project’s history is called a “commit” (see Version controlled source). A snapshot of the project in each “commit” is safely stored away, so you can revert to it at a later time, or check changes/progress. This way, you can be sure that your work is reproducible and you can track its progress and history. With version control, experimentation in the project’s analysis is greatly facilitated, since you can easily revert if a brainstorm test procedure fails.

One important feature of version control is that the research result (FITS image, table, report or paper) can be stamped with the unique commit information that produced it. This information will enable you to exactly reproduce that same result later, even if you have made changes/progress. For one example of a research paper’s reproduction pipeline, please see the reproduction pipeline of Akhlaghi and Ichikawa 2015 describing NoiseChisel.

In case you don’t want the COMMIT keyword in the first extension of your output FITS file, you can use the --outfitsnocommit option (see Input/Output options).

CFITSIO

The version of CFITSIO used (see CFITSIO). This can be disabled with --outfitsnoversions (see Input/Output options).

WCSLIB

The version of WCSLIB used (see WCSLIB). Note that older versions of WCSLIB do not report the version internally. So this is only available if you are using more recent WCSLIB versions. This can be disabled with --outfitsnoversions (see Input/Output options).

GSL

The version of the GNU Scientific Library used (see GNU Scientific Library). This can be disabled with --outfitsnoversions (see Input/Output options).

GNUASTRO

The version of Gnuastro used (see Version numbering). This can be disabled with --outfitsnoversions (see Input/Output options).
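To inspect the keywords described above without any FITS library, the following Python sketch reads the keyword records of the first header in a file, relying only on the FITS layout of 2880-byte blocks made of 80-character records. The names read_keywords and parse_value are hypothetical, and the demo header and its values are made up for illustration. Records without a value indicator in columns 9 and 10 (comments, titles, and HIERARCH records) are skipped, and all values are returned as text.

```python
import io

BLOCK = 2880  # FITS files are organized in 2880-byte blocks
CARD = 80     # each keyword record is 80 ASCII characters


def parse_value(field):
    """Return the value part of a record as text, dropping any comment."""
    field = field.strip()
    if not field.startswith("'"):
        # Non-string value: everything before the comment separator.
        return field.split("/", 1)[0].strip()
    end = 1
    while True:
        end = field.index("'", end)
        if field[end:end + 2] == "''":
            end += 2  # a doubled quote is an escaped quote, keep looking
            continue
        return field[1:end].replace("''", "'").rstrip()


def read_keywords(stream):
    """Read name/value pairs from the first header of a binary stream."""
    keywords = {}
    while True:
        block = stream.read(BLOCK)
        if len(block) < BLOCK:
            raise ValueError("no END record found in the header")
        for start in range(0, BLOCK, CARD):
            card = block[start:start + CARD].decode("ascii")
            name = card[:8].strip()
            if name == "END":
                return keywords
            if card[8:10] == "= ":
                keywords[name] = parse_value(card[10:])


# Made-up header, only to exercise the functions above.
demo_cards = [
    "SIMPLE  =                    T / made-up demo header",
    "BITPIX  =                    8",
    "NAXIS   =                    0",
    "DATE    = '2024-03-01T12:30:05'",
    "COMMIT  = 'v1-2-gabc1234-dirty'",
    "END",
]
demo_header = "".join(card.ljust(CARD) for card in demo_cards).ljust(BLOCK)
demo_keywords = read_keywords(io.BytesIO(demo_header.encode("ascii")))
print(demo_keywords["DATE"], demo_keywords["COMMIT"])
```

For a real output file, open it in binary mode and pass the file object to read_keywords. Any file name used for that (for example out.fits) is a placeholder.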