The output of many of Gnuastro’s programs is (or can be) a FITS file. The FITS format has many useful features for storing scientific datasets (cubes, images and tables), along with robust features for archivability. For more on this standard, please see Fits.
As a community convention described in Fits, the first extension of all FITS files produced by Gnuastro’s programs only contains the meta-data that is intended for the file’s extension(s). For a Gnuastro program, this generic meta-data (stored as FITS keyword records) is its configuration when it produced this dataset: file name(s) of input(s) and option names, values and comments. Note that when the configuration is too trivial (only the input filename, as in the Table program), no meta-data is written in this extension.
FITS keywords have the following limitations with regard to generic option names and values, which are described below:
If a keyword (option name) is longer than 8 characters, the first word in the record is HIERARCH, which is followed by the keyword name.
$ astfits image_detected.fits -h0 | grep -i snquant
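The reason for the -i option in the command above can be sketched without any FITS file at hand. The two header records below are hypothetical (they are not real astfits output, and DETGROWQUANT is only used as an example of a name longer than 8 characters); the sketch assumes that keyword names are stored in upper case, as is conventional for FITS keywords.

```shell
# Hypothetical header records for illustration; NOT real astfits output.
header='SNQUANT = 0.99 / hypothetical value and comment
HIERARCH DETGROWQUANT = 0.7 / hypothetical value and comment'

# A case-sensitive search for the lower-case option name matches nothing.
# (grep exits non-zero when nothing matches, hence the "|| true".)
sensitive=$(printf '%s\n' "$header" | grep -c snquant || true)

# With -i, the same search matches the upper-case keyword.
insensitive=$(printf '%s\n' "$header" | grep -c -i snquant || true)

echo "case-sensitive: $sensitive, case-insensitive: $insensitive"
```

With these two hypothetical records, the script prints "case-sensitive: 0, case-insensitive: 1".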
The keywords above are classified (separated by an empty line and title) as a group titled “ProgramName configuration”. This meta-data extension, as well as all the other extensions (which contain data), also contains a final group of keywords that keep the basic date and version information of Gnuastro, its dependencies and the pipeline that is using Gnuastro (if it is under version control).
DATE: The creation time of the FITS file. This date is written directly by CFITSIO and is in UT format.
COMMIT: Git’s commit description from the running directory of Gnuastro’s programs. If the running directory is not version controlled or libgit2 isn’t installed (see Optional dependencies) then this keyword will not be present. The printed value is equivalent to the output of the following command:
git describe --dirty --always
If the running directory contains non-committed work, then the stored value will have a ‘-dirty’ suffix. This can be very helpful to let you know that the data is not ready to be shared with collaborators or submitted to a journal. You should only share results that are produced after all your work is committed (safely stored in the version controlled history and thus reproducible).
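As a minimal sketch of how this check could be automated before sharing a result, the hypothetical helper below (commit_is_dirty is not part of Gnuastro or Git) tests whether a commit description ends in the ‘-dirty’ mark that git describe --dirty appends by default. The first description is the example COMMIT value used in this section; the second is the same value with the mark added for illustration. In practice you would read the value from the header of your own file.

```shell
# Hypothetical helper: succeeds when a commit description marks
# uncommitted work (it ends in "-dirty").
commit_is_dirty() {
    case "$1" in
        *-dirty) return 0 ;;
        *)       return 1 ;;
    esac
}

# Illustrative values only; read the real one from your file's header.
for desc in "v0-8-g547f6eb" "v0-8-g547f6eb-dirty"; do
    if commit_is_dirty "$desc"; then
        echo "$desc: uncommitted changes, not ready to share"
    else
        echo "$desc: clean"
    fi
done
```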
At first sight, version control appears to be mainly a tool for software developers. However, progress in scientific research is almost identical to progress in software development: first you have a rough idea that starts with a handful of easy steps. But as the first results appear to be promising, you will have to extend, or generalize, it to make it more robust and work in all the situations your research covers, not just your first test samples. Slowly you will find wrong assumptions or bad implementations that need to be fixed (‘bugs’ in software development parlance). Finally, when you submit the research to your collaborators or a journal, many comments and suggestions will come in, and you have to address them.
Software developers have created version control systems precisely for this kind of activity. Each significant moment in the project’s history is called a “commit” (see Version controlled source). A snapshot of the project in each “commit” is safely stored away, so you can revert to it at a later time, or check changes/progress. This way, you can be sure that your work is reproducible, and you can track its progress and history. With version control, experimentation in the project’s analysis is greatly facilitated, since you can easily revert if a brainstorm test procedure fails.
One important feature of version control is that the research result (FITS image, table, report or paper) can be stamped with the unique commit information that produced it. This information will enable you to exactly reproduce that same result later, even if you have made changes/progress. For one example of a research paper’s reproduction pipeline, please see the reproduction pipeline of the paper describing NoiseChisel.
CFITSIO: The version of CFITSIO used (see CFITSIO).
WCSLIB: The version of WCSLIB used (see WCSLIB). Note that older versions of WCSLIB do not report the version internally, so this keyword is only available if you are using a more recent WCSLIB version.
GSL: The version of the GNU Scientific Library used (see GNU Scientific library).
GNUASTRO: The version of Gnuastro used (see Version numbering).
Here is an example of the last few keywords in one output file.
              / Versions and date
DATE    = '...'                / file creation date
COMMIT  = 'v0-8-g547f6eb'      / Commit description in running dir.
CFITSIO = '3.45 '              / CFITSIO version.
WCSLIB  = '5.19 '              / WCSLIB version.
GSL     = '2.5 '               / GNU Scientific Library version.
GNUASTRO= '0.7 '               / GNU Astronomy Utilities version.
END
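When many files need to be checked, it can help to pull a single version value out of such a header. The following is only a sketch: keyword_value is a hypothetical helper (not a Gnuastro program), the header text is copied from the example above rather than read from a real file, and the sed pattern assumes string values enclosed in single quotes.

```shell
# Header records copied from the example above (not read from a file).
header="COMMIT  = 'v0-8-g547f6eb'      / Commit description in running dir.
CFITSIO = '3.45 '              / CFITSIO version.
GNUASTRO= '0.7 '               / GNU Astronomy Utilities version."

# Hypothetical helper: print the quoted string value of keyword $1,
# reading header records from standard input. Trailing padding spaces
# inside the quotes are removed by the second sed call.
keyword_value() {
    sed -n "s/^$1 *= *'\([^']*\)'.*/\1/p" | sed 's/ *$//'
}

printf '%s\n' "$header" | keyword_value GNUASTRO
printf '%s\n' "$header" | keyword_value COMMIT
```

With a real file, the same helper could be fed from the output of astfits with the -h option, as in the earlier command of this section.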