Coadding operators (GNU Astronomy Utilities)

Next: Filtering (smoothing) operators, Previous: Statistical operators, Up: Arithmetic operators [Contents][Index]

6.2.4.7 Coadding operators ¶

The operators in this section are used when you have multiple datasets that you would like to merge into one. For example, you have taken ten exposures of your scientific target, and you would like to combine them all into one deep stacked image that is deeper. This is commonly known as “stacking” or “coaddition”. We use the latter in Gnuastro because “stack” refers to the intermediate 3D data set (if we are coadding 2D images). However, the final product of this operation has the same number of dimensions as each of the inputs. The astronomical community has traditionally used the term “coadd” to specify that the nature of the output 2D image is different from each of the input 2D images (it is created from them). Furthermore, within Arithmetic, “Stack” is already reserved for the stack of operands that the operators read from (see Reverse polish notation).

Masking outliers (before coadding): Outliers in one of the inputs (for example star ghosts, satellite trails, or cosmic rays, can leave their imprints in the final coadd. One good way to remove them is the madclip-maskfilled operator that can be called before the operators here. It is described in Statistical operators; and a full tutorial on understanding outliers and how best to remove them is available in Clipping outliers.

Hundreds or thousands of images to coadd: It can happen that you need to coadd hundreds or thousands of images. Added with the possibly long file/directory names, this can lead to an extremely long shell command that may cause an “Argument list too long” error in your shell. To avoid this, you should use Arithmetic’s --arguments option, see Invoking Arithmetic.

When calling the coadding operators you should determine how many operands they should take in: unlike the rest of the operators that have a fixed number of input operands, these operators have a variable number of input operators. As described in the first operand below, you do this through their first (or early) popped operands (which should be a single integer number that is larger than one). Below are some important points for all the coadding operators described in this section:

NaN/blank pixels will be ignored, see Blank pixels.
The operation will be multi-threaded, greatly speeding up the process if you have large and numerous data to coadd. You can disable multi-threaded operations with the --numthreads=1 option (see Multi-threaded operations).

min

max

sum

std

mad

mean

median

number

For each pixel, calculate the respective statistic from in all given datasets. For the min and max operators, the output will have the same type as the input. For the number operator, the output will have an unsigned 32-bit integer type and the rest will be 32-bit floating point.

The first popped operand to this operator must be a positive integer number which specifies how many further operands should be popped from the stack. All the subsequently popped operands must have the same type and size. This operator (and all the variable-operand operators similar to it that are discussed below) will work in multi-threaded mode unless Arithmetic is called with the --numthreads=1 option, see Multi-threaded operations.

For example, the following command will produce an image with the same size and type as the three inputs, but each output pixel value will be the minimum of the same pixel’s values in all three input images.

$ astarithmetic a.fits b.fits c.fits 3 min --output=min.fits

Regarding the number operator: some datasets may have blank values (which are also ignored in all similar operators like min, sum, mean or median). Hence, the final pixel values of this operator will not, in general, be equal to the number of inputs. This operator is therefore mostly called in parallel with those operators to know the “weight” of each pixel (in case you want to only keep pixels that had the full exposure for example).

quantile

For each pixel, find the quantile from all given datasets. The output will have the same numeric data type and size as the input datasets. Besides the input datasets, the quantile operator also needs a single parameter (the requested quantile). The parameter should be the first popped operand, with a value between (and including) 0 and 1. The second popped operand must be the number of datasets to use.

In the example below, the first-popped operand (0.7) is the quantile, the second-popped operand (3) is the number of datasets to pop.

astarithmetic a.fits b.fits c.fits 3 0.7 quantile

sigclip-mad ¶

sigclip-std

sigclip-mean

sigclip-median

Return the respective statistic after \(\sigma\)-clipping the values of the same pixel of all the input operands. The respective statistic will be stored in a 32-bit floating point number. The number of inputs used to make the desired measurement for each pixel is also returned as a second output operand; see below for more on how to deal with the second output operand.

For a complete tutorial on clipping outliers when coadding images see Clipping outliers (if you haven’t read it yet, we encourage you to read through it before continuing). In particular, the most robust solution is to first use madclip-maskfilled (described in Statistical operators), then use any of these.

This operator is very similar to min, with the exception that it expects two extra operands (parameters for MAD-clipping) before the total number of inputs. The first popped operand is the termination criteria and the second is the multiple of the median absolute deviation.

For example, in the command below, the first popped operand of sigclip-mean (0.1) is the \(\sigma\)-clipping termination criteria. If the termination criteria is larger than, or equal to 1, it is interpreted as the total number of clips. But if it is between 0 and 1, then it is the tolerance level on the change in the median absolute deviation (see Sigma clipping). The second popped operand (4) is the multiple of sigma (STD) to use. The third popped operand (3) is number of datasets that should be coadded (similar to the first popped operand to min). Two other side-notes should be mentioned here:

As mentioned above, before this operator, we are masking the filled MAD-clipped elements with madclip-maskfilled. As described in Clipping outliers, this is very important for removing the types of outliers that we have in astronomical imaging.
We are using --writeall because this operator places two operands on the coadd: your desired statistics, and the number of inputs that were used in it (after clipping).

$ astarithmetic a.fits b.fits c.fits -g1 --writeall \
                3 5 0.01 madclip-maskfilled \
                3 4 0.1  sigclip-mean

The numbers image has the smallest unsigned integer type that fits the total number of your input datasets (see Numeric data types). For example if you have less than 255 input operands (not pixels!), then it will have an unsigned 8-bit integer type, if you have 1000 input operands (or any number less than 65534 inputs), it will be an unsigned 16-bit integer. Recall that when you have many input files to coadd, it may be necessary to write the arguments into a text file and use --arguments (see Invoking Arithmetic).

The numbers image is included by default because it is usually important in clipping based coadds (where the number of inputs used in the calculation of each pixel can be different from another pixel, and this affects the final output noise). In case you are not interested in the numbers image, you should first swap the two output operands, then free the top operand like below.

$ astarithmetic a.fits b.fits c.fits -g1 \
                3 5 0.01 madclip-maskfilled \
                3 4 0.1  sigclip-mean swap free \
                --output=single-hdu.fits

In case you just want the numbers image, you can use sigclip-median (which is always calculated as part of the clipping process: no extra overhead), and free the top operand (without the swap: the median-coadd image), leaving only the numbers image:

$ astarithmetic a.fits b.fits c.fits -g1 \
                3 5 0.01 madclip-maskfilled \
                3 4 0.1  sigclip-median free \
                --output=single-hdu-only-numbers.fits

madclip-mad

madclip-std

madclip-mean

madclip-median

Similar to the sigclip-* operators, but using Median Absolute Deviation (MAD) as the measure of spread. See MAD clipping and more generally Clipping outliers. See the description of sigclip-* for usage details; just remember that o1 MAD is equivalent to \(0.67449\sigma\).

sigclip-all

madclip-all

Similar to the sigclip-* or madclip-* operators, but instead of returning only a single measured statistic (mean, STD, median or MAD) all of them are returned, and in the following order: mean, STD, median, MAD, number. This will greatly speed up your analysis pipelines when more than one statistic is required (for example you need both the mean and the standard deviation). Just note that as described under the sigma-clip operators, --writeall will be necessary if you want them all to be put in the output.

For example, let’s assume you want the mean and standard deviation after a mad-clipping like the examples before (not just one of them). In that case, the cleanest way is to use a combination of the set- and free operators like the example below. The first (set- pops the top operand from the stack and stores it in a named variable (that you can use any number of times later), while the second (free) simply pops and deletes the top operand on the stack. For more on these operators, see Operand storage in memory or a file. Finally, to help in later usage of the HDUs, we use the --metadata operator to add a name to each output HDU in the same Arithmetic call.

$ astarithmetic a.fits b.fits c.fits -g1 --writeall \
                3 7.5 0.2 madclip-all set-m set-s \
                free free free s m --metaname=MEAN,STD \
                --output=out.fits

Different combinations of the various statistics are commonly necessary for different purposes. The reason we haven’t specified an option for each combination is because that would dramatically increase the operator-count and would confuse the developers and users. On the other hand, the process of mask-filling or even just clipping is much more computationally expensive on large datasets than these measurements combined. We therefore recommend the process above to extract the statistics you want.