NoiseChisel optimization for detection (GNU Astronomy Utilities)

Next: NoiseChisel optimization for storage, Previous: NoiseChisel and Multi-Extension FITS files, Up: General program usage tutorial [Contents][Index]

2.1.11 NoiseChisel optimization for detection ¶

In NoiseChisel and Multi-Extension FITS files, we ran NoiseChisel and reviewed NoiseChisel’s output format. Now that you have a better feeling for multi-extension FITS files, let’s optimize NoiseChisel for this particular dataset.

One good way to see if you have missed any signal (small galaxies, or the wings of brighter galaxies) is to mask all the detected pixels and inspect the noise pixels. For this, you can use Gnuastro’s Arithmetic program (in particular its where operator, see Arithmetic operators). The command below will produce mask-det.fits. In it, all the pixels in the INPUT-NO-SKY extension that are flagged 1 in the DETECTIONS extension (dominated by signal, not noise) will be set to NaN.

Since the various extensions are in the same file, for each dataset we need the file and extension name. To make the command easier to read/write/understand, let’s use shell variables: ‘in’ will be used for the Sky-subtracted input image and ‘det’ will be used for the detection map. Recall that a shell variable’s value can be retrieved by adding a $ before its name, also note that the double quotations are necessary when we have white-space characters in a variable value (like this case).

$ in="xdf-f160w_detected.fits -hINPUT-NO-SKY"
$ det="xdf-f160w_detected.fits -hDETECTIONS"
$ astarithmetic $in $det nan where --output=mask-det.fits

To invert the result (only keep the detected pixels), you can flip the detection map (from 0 to 1 and vice-versa) by adding a ‘not’ after the second $det:

$ astarithmetic $in $det not nan where --output=mask-sky.fits

Look again at the DETECTIONS extension, in particular the long worm-like structure around ³³ pixel 1650 (X) and 1470 (Y). These types of long wiggly structures show that we have dug too deep into the noise, and are a signature of correlated noise. Correlated noise is created when we warp (for example, rotate) individual exposures (that are each slightly offset compared to each other) into the same pixel grid before adding them into one deeper image. During the warping, nearby pixels are mixed and the effect of this mixing on the noise (which is in every pixel) is called “correlated noise”. Correlated noise is a form of convolution and it slightly smooths the image.

In terms of the number of exposures (and thus correlated noise), the XDF dataset is by no means an ordinary dataset. Therefore the default parameters need to be slightly customized. It is the result of warping and adding roughly 80 separate exposures which can create strong correlated noise/smoothing. In common surveys the number of exposures is usually 10 or less. See Figure 2 of Akhlaghi 2019 and the discussion on --detgrowquant there for more on how NoiseChisel “grow”s the detected objects and the patterns caused by correlated noise.

Let’s tweak NoiseChisel’s configuration a little to get a better result on this dataset. Do not forget that “Good statistical analysis is not a purely routine matter, and generally calls for more than one pass through the computer” (Anscombe 1973, see Gnuastro manifesto: Science and its tools). A good scientist must have a good understanding of her tools to make a meaningful analysis. So do not hesitate in playing with the default configuration and reviewing the manual when you have a new dataset (from a new instrument) in front of you. Robust data analysis is an art, therefore a good scientist must first be a good artist. Once you have found the good configuration for that particular noise pattern (instrument) you can safely use it for all new data that have a similar noise pattern.

NoiseChisel can produce “Check images” to help you visualize and inspect how each step is done. You can see all the check images it can produce with this command.

$ astnoisechisel --help | grep check

Let’s check the overall detection process to get a better feeling of what NoiseChisel is doing with the following command. To learn the details of NoiseChisel in more detail, please see NoiseChisel, Akhlaghi and Ichikawa 2015 and Akhlaghi 2019.

$ astnoisechisel flat-ir/xdf-f160w.fits --checkdetection

The check images/tables are also multi-extension FITS files. As you saw from the command above, when check datasets are requested, NoiseChisel will not go to the end. It will abort as soon as all the extensions of the check image are ready. Please list the extensions of the output with astfits and then opening it with ds9 as we done above. If you have read the paper, you will see why there are so many extensions in the check image.

$ astfits xdf-f160w_detcheck.fits
$ astscript-fits-view xdf-f160w_detcheck.fits

In order to understand the parameters and their biases (especially as you are starting to use Gnuastro, or running it a new dataset), it is strongly encouraged to play with the different parameters and use the respective check images to see which step is affected by your changes and how, for example, see Detecting large extended targets.

Let’s focus on one step: the OPENED_AND_LABELED extension shows the initial detection step of NoiseChisel. We see the seeds of that correlated noise structure with many small detections (a relatively early stage in the processing). Such connections at the lowest surface brightness limits usually occur when the dataset is too smoothed, the threshold is too low, or the final “growth” is too much.

As you see from the 2nd (CONVOLVED) extension, the first operation that NoiseChisel does on the data is to slightly smooth it. However, the natural correlated noise of this dataset is already one level of artificial smoothing, so further smoothing it with the default kernel may be the culprit. To see the effect, let’s use a sharper kernel as a first step to convolve/smooth the input.

By default NoiseChisel uses a Gaussian with full-width-half-maximum (FWHM) of 2 pixels. We can use Gnuastro’s MakeProfiles to build a kernel with FWHM of 1.5 pixel (truncated at 5 times the FWHM, like the default) using the following command. MakeProfiles is a powerful tool to build any number of mock profiles on one image or independently, to learn more of its features and capabilities, see MakeProfiles.

$ astmkprof --kernel=gaussian,1.5,5 --oversample=1

Please open the output kernel.fits and have a look (it is very small and sharp). We can now tell NoiseChisel to use this instead of the default kernel with the following command (we will keep the --checkdetection to continue checking the detection steps)

$ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits  \
                 --checkdetection

Open the output xdf-f160w_detcheck.fits as a multi-extension FITS file and go to the last extension (DETECTIONS-FINAL, it is the same pixels as the final NoiseChisel output without --checkdetections). Look again at that position mentioned above (1650,1470), you see that the long wiggly structure is gone. This shows we are making progress :-).

Looking at the new OPENED_AND_LABELED extension, we see that the thin connections between smaller peaks has now significantly decreased. Going two extensions/steps ahead (in the first HOLES-FILLED), you can see that during the process of finding false pseudo-detections, too many holes have been filled: do you see how the many of the brighter galaxies are connected? At this stage all holes are filled, irrespective of their size.

Try looking two extensions ahead (in the first PSEUDOS-FOR-SN), you can see that there are not too many pseudo-detections because of all those extended filled holes. If you look closely, you can see the number of pseudo-detections in the printed outputs of NoiseChisel (around 6400). This is another side-effect of correlated noise. To address it, we should slightly increase the pseudo-detection threshold (before changing --dthresh, run with -P to see the default value):

$ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits \
                 --dthresh=0.1 --checkdetection

Before visually inspecting the check image, you can already see the effect of this small change in NoiseChisel’s command-line output: notice how the number of pseudo-detections has increased to more than 7100! Open the check image now and have a look, you can see how the pseudo-detections are distributed much more evenly in the blank sky regions of the PSEUDOS-FOR-SN extension.

Maximize the number of pseudo-detections: When using NoiseChisel on datasets with a new noise-pattern (for example, going to a Radio astronomy image, or a shallow ground-based image), play with --dthresh until you get a maximal number of pseudo-detections: the total number of pseudo-detections is printed on the command-line when you run NoiseChisel, you do not even need to open a FITS viewer.

In this particular case, try --dthresh=0.2 and you will see that the total printed number decreases to around 6700 (recall that with --dthresh=0.1, it was roughly 7100). So for this type of very deep HST images, we should set --dthresh=0.1.

As discussed in Section 3.1.5 of Akhlaghi and Ichikawa 2015, the signal-to-noise ratio of pseudo-detections are critical to identifying/removing false detections. For an optimal detection they are very important to get right (where you want to detect the faintest and smallest objects in the image successfully). Let’s have a look at their signal-to-noise distribution with --checksn.

$ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits  \
                 --dthresh=0.1 --checkdetection --checksn

The output (xdf-f160w_detsn.fits) contains two extensions for the pseudo-detections containing two-column tables over the undetected (SKY_PSEUDODET_SN) regions and those over detections (DET_PSEUDODET_SN). With the first command below you can see the HDUs of this file, and with the second you can see the information of the table in the first HDU (which is the default when you do not use --hdu):

$ astfits xdf-f160w_detsn.fits
$ asttable xdf-f160w_detsn.fits -i

You can see the table columns with the first command below and get a feeling of the signal-to-noise value distribution with the second command (the two Table and Statistics programs will be discussed later in the tutorial):

$ asttable xdf-f160w_detsn.fits -hSKY_PSEUDODET_SN
$ aststatistics xdf-f160w_detsn.fits -hSKY_PSEUDODET_SN -c2
... [output truncated] ...
Histogram:
 |           *
 |          ***
 |         ******
 |        *********
 |        **********
 |       *************
 |      *****************
 |     ********************
 |    **************************
 |   ********************************
 |*******************************************************   * **       *
 |----------------------------------------------------------------------

The correlated noise is again visible in the signal-to-noise distribution of sky pseudo-detections! Do you see how skewed this distribution is? In an image with less correlated noise, this distribution would be much more symmetric. A small change in the quantile will translate into a big change in the S/N value. For example, see the difference between the three 0.99, 0.95 and 0.90 quantiles with this command:

$ aststatistics xdf-f160w_detsn.fits -hSKY_PSEUDODET_SN -c2      \
                --quantile=0.99 --quantile=0.95 --quantile=0.90

We get a change of almost 2 units (which is very significant). If you run NoiseChisel with -P, you’ll see the default signal-to-noise quantile --snquant is 0.99. In effect with this option you specify the purity level you want (contamination by false detections). With the aststatistics command above, you see that a small number of extra false detections (impurity) in the final result causes a big change in completeness (you can detect more lower signal-to-noise true detections). So let’s loosen-up our desired purity level, remove the check-image options, and then mask the detected pixels like before to see if we have missed anything.

$ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits  \
                 --dthresh=0.1 --snquant=0.95
$ in="xdf-f160w_detected.fits -hINPUT-NO-SKY"
$ det="xdf-f160w_detected.fits -hDETECTIONS"
$ astarithmetic $in $det nan where --output=mask-det.fits

Overall it seems good, but if you play a little with the color-bar and look closer in the noise, you’ll see a few very sharp, but faint, objects that have not been detected. For example, the object around pixel (456, 1662). Despite its high valued pixels, this object was lost because erosion ignores the precise pixel values. Losing small/sharp objects like this only happens for under-sampled datasets like HST (where the pixel size is larger than the point spread function FWHM). So this will not happen on ground-based images.

To address this problem of sharp objects, we can use NoiseChisel’s --noerodequant option. All pixels above this quantile will not be eroded, thus allowing us to preserve small/sharp objects (that cover a small area, but have a lot of signal in it). Check its default value, then run NoiseChisel like below and make the mask again.

$ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits     \
                 --noerodequant=0.95 --dthresh=0.1 --snquant=0.95

This seems to be fine and the object above is now detected. We will stop editing the configuration of NoiseChisel here, but please feel free to keep looking into the data to see if you can improve it even more.

Once you have found the proper configuration for the type of images you will be using you do not need to change them any more. The same configuration can be used for any dataset that has been similarly produced (and has a similar noise pattern). But entering all these options on every call to NoiseChisel is annoying and prone to bugs (mistakenly typing the wrong value for example). To simplify things, we will make a configuration file in a visible config directory. Then we will define the hidden .gnuastro directory (that all Gnuastro’s programs will look into for configuration files) as a symbolic link to the config directory. Finally, we will write the finalized values of the options into NoiseChisel’s standard configuration file within that directory. We will also put the kernel in a separate directory to keep the top directory clean of any files we later need.

$ mkdir kernel config
$ ln -s config/ .gnuastro
$ mv kernel.fits kernel/noisechisel.fits
$ echo "kernel kernel/noisechisel.fits" > config/astnoisechisel.conf
$ echo "noerodequant 0.95"             >> config/astnoisechisel.conf
$ echo "dthresh      0.1"              >> config/astnoisechisel.conf
$ echo "snquant      0.95"             >> config/astnoisechisel.conf

We are now ready to finally run NoiseChisel on the three filters and keep the output in a dedicated directory (which we will call nc for simplicity).

$ rm *.fits
$ mkdir nc
$ for f in f105w f125w f160w; do \
    astnoisechisel flat-ir/xdf-$f.fits --output=nc/xdf-$f.fits; \
  done

Footnotes

(33)

To find a particular coordiante easily in DS9, you can do this: Click on the “Edit” menu, and select “Region”. Then click on any random part of the image to see a circle show up in that location (this is the “region”). Double-click on the region and a “Circle” window will open. If you have celestial coordinates, keep the default “fk5” in the scroll-down menu after the “Center”. But if you have pixel/image coordinates, click on the “fk5” and select “Image”. Now you can set the “Center” coordinates of the region (1650 and 1470 in this case) by manually typing them in the two boxes in front of “Center”. Finally, when everything is ready, click on the “Apply” button and your region will go over your requested coordinates. You can zoom out (to see the whole image) and visually find it.