Filled re-clipping (GNU Astronomy Utilities)

Previous: MAD clipping, Up: Clipping outliers [Contents][Index]

2.10.4 Filled re-clipping ¶

When source of the outlier covers more than one element, and its flux is close to the noise level, not all of its elements will be clipped: because of noise, some of its elements will remain un-clipped; and thus affecting the output. Examples of this were created and thoroughly discussed in previous sections with \(\sigma\)-clipping and MAD-clipping (see Sigma clipping and MAD clipping).

To avoid this problem, in Gnuastro we have an extra set of clipping operators that will do two rounds of clips and with some basic operations in the middle.

The requested clipping is first applied. This will the return the median and dispersion measure (MAD or STD).
A mask is created for each input image (in stacking) or 1D array (in collapsing). Any pixel that is outside the requested clip range is set to 1; the rest are set to 0.
Isolated regions are found:
- For 2D images (were each pixel has 8 neighbors) the mask pixels are first dilated (so the edges of the regions are closed off from the surrounding noise).
- For 1D arrays (where each element only has two neighbors), the mask is first eroded. This is necessary because the next step (where the holes are filled), two pixels that have been clipped purely due to noise with a large distance between them can wrongly mask a very large range of the input data.
Any 0-valued pixel in the masks that are fully surrounded by 1s (or “holes”) are filled (given a value of 1).
All the pixels that have a value of 1 in the mask are set to NaN in the respective input data (that the mask corresponds to).
The requested clipping is repeated on the newly masked inputs.

Through this process, the less significant outliers (which do not get clipped independently) are clipped based on their surrounding elements. The filled re-clipping operators have an extra -fill in their names. For example the filled MAD-clipped mean is called madclip-fill-mean (while the simple MAD-clipped mean operator was called madclip-mean). Let’s run our script with the filled \(\sigma\)-clipping and \(MAD\)-clipping (before each run, make sure the values shown under the grep command are correct).

With the last command, we’ll view all the outputs generated so far (in this section and the previous ones (Building inputs and analysis without clipping, Sigma clipping and MAD clipping):

$ grep '^clip_\|^profsum' script.sh
profsum=1e5
clip_operator=madclip-fill
clip_multiple=4.48
clip_tolerance=0.01

$ bash ./script

$ $  grep '^clip_\|^profsum' script.sh
profsum=1e5
clip_operator=sigclip-fill
clip_multiple=3
clip_tolerance=0.1

$ bash ./script

$ imgs=""
$ for m in std mean median mad number; do \
   imgs="$imgs $p-$m.fits $p-sigclip-$m.fits $p-sigclip-fill-$m.fits" \
   imgs="$p-madclip-$m.fits $p-madclip-fill-$m.fits"; \
  done
$ astscript-fits-view $imgs --ds9extra="-tile grid layout 5 6" \
                            --ds9extra="-scale limits 1 9" \
                            --ds9extra="-frame move back -scale limits 1 9" \
                            --ds9extra="-frame move back -scale limits 1 9" \
                            --ds9extra="-frame move back -scale limits 1 9" \
                            --ds9extra="-frame move back -scale limits 1 9"

The last column (belonging to the madclip-fill-* operators) is finally free of the outlying circle (that was present in only one of nine inputs). The filling operation did not affect the sigclip-fill-* operators (third column DS9 from the last command above)! The reason is clear from the bottom row (showing the number of images used in each pixel). The weak over density of clipped pixels over the circle is barely visible and was not strong enough for defining “holes” (that will be filled). On the contrary, when comparing the madclip-number.fits and madclip-fill-number.fits, the filled holes within the circle are clearly visible.

But the script also generated collapsed columns of build/in-9.fits (to a 1D table). In this case, for each column, the number of outliers increase as we enter the circle and reach a maximum in the middle of the image. Let’s have a look at those outputs:

$ astscript-fits-view build/collapsed-madclip-9.fits \
                      build/collapsed-madclip-fill-9.fits

When comparing the two in TOPCAT (following the same process described in Building inputs and analysis without clipping) you will notice that the difference is only in the edges of the circle. The 4.48 multiple of MAD-clipping (corresponding to 3 sigma), was not successful in removing the many outlying pixels due to the circle in the central pixels of the image.

This is a relatively high threshold and was used because for the images, we only had 9 elements in each clipping for every pixel. But for the collapsing, we have many more pixels in each vertical direction of the image (201 pixels). Let’s decrease the threshold to 3 and calculate the collapsed mean after MAD-clipping, once with filled re-clipping and once without it:

$ for m in mean number; do \
   for clip in madclip madclip-fill; do \
    astarithmetic build/in-9.fits 3 0.01 2 collapse-$clip-$m \
                  counter --writeall -ocollapse-$clip-$m.fits; \
   done; \
  done

The two loops above created four tables. First, with the command below, let’s look at the two measured mean values (one with filling and the other without it):

$ astscript-fits-view collapse-*-mean.fits

In the table without filled re-clipping, you see a small shift in the center of the image (around 100 in the horizontal axis). Let’s have a look at the final number of pixels used in each clipping:

$ astscript-fits-view collapse-*-number.fits

The difference is now clearly visible when you plot both in one “Plane plot” window. In the filled re-clipping case, we see a clear dip in the number of pixels that very nicely corresponds to the number of pixels associated to the circle. But the dip is much more noisy in the simple MAD-clipping.