Segmentation and making a catalog (GNU Astronomy Utilities)

Next: Measuring the dataset limits, Previous: NoiseChisel optimization for storage, Up: General program usage tutorial [Contents][Index]

2.1.13 Segmentation and making a catalog ¶

The main output of NoiseChisel is the binary detection map (DETECTIONS extension, see NoiseChisel optimization for detection). It only has two values: 1 or 0. This is useful when studying the noise or background properties, but hardly of any use when you actually want to study the targets/galaxies in the image, especially in such a deep field where almost everything is connected. To find the galaxies over the detections, we will use Gnuastro’s Segment program:

$ mkdir seg
$ astsegment nc/xdf-f160w.fits -oseg/xdf-f160w.fits
$ astsegment nc/xdf-f125w.fits -oseg/xdf-f125w.fits
$ astsegment nc/xdf-f105w.fits -oseg/xdf-f105w.fits

Segment’s operation is very much like NoiseChisel (in fact, prior to version 0.6, it was part of NoiseChisel). For example, the output is a multi-extension FITS file (previously discussed in NoiseChisel and Multi-Extension FITS files), it has check images and uses the undetected regions as a reference (previously discussed in NoiseChisel optimization for detection). Please have a look at Segment’s multi-extension output to get a good feeling of what it has done. Do not forget to flip through the extensions in the “Cube” window.

$ astscript-fits-view seg/xdf-f160w.fits

Like NoiseChisel, the first extension is the input. The CLUMPS extension shows the true “clumps” with values that are \(\ge1\), and the diffuse regions labeled as \(-1\). Please flip between the first extension and the clumps extension and zoom-in on some of the clumps to get a feeling of what they are. In the OBJECTS extension, we see that the large detections of NoiseChisel (that may have contained many galaxies) are now broken up into separate labels. Play with the color-bar and hover your mouse of the various detections to see their different labels.

The clumps are not affected by the hard-to-deblend and low signal-to-noise diffuse regions, they are more robust for calculating the colors (compared to objects). From this step onward, we will continue with clumps.

Having localized the regions of interest in the dataset, we are ready to do measurements on them with MakeCatalog. MakeCatalog is specialized and optimized for doing measurements over labeled regions of an image. In other words, through MakeCatalog, you can “reduce” an image to a table (catalog of certain properties of objects in the image). Each requested measurement (over each label) will be given a column in the output table. To see the full set of available measurements run it with --help like below (and scroll up), note that measurements are classified by context.

$ astmkcatalog --help

So let’s select the properties we want to measure in this tutorial. First of all, we need to know which measurement belongs to which object or clump, so we will start with the --ids (read as: IDs³⁵). We also want to measure (in this order) the Right Ascension (with --ra), Declination (--dec), magnitude (--magnitude), and signal-to-noise ratio (--sn) of the objects and clumps. Furthermore, as mentioned above, we also want measurements on clumps, so we also need to call --clumpscat. The following command will make these measurements on Segment’s F160W output and write them in a catalog for each object and clump in a FITS table. For more on the zero point, see Brightness, Flux, Magnitude and Surface brightness.

$ mkdir cat
$ astmkcatalog seg/xdf-f160w.fits --ids --ra --dec --magnitude --sn \
               --zeropoint=25.94 --clumpscat --output=cat/xdf-f160w.fits

From the printed statements on the command-line, you see that MakeCatalog read all the extensions in Segment’s output for the various measurements it needed. Let’s look at the output of the command above:

$ astfits cat/xdf-f160w.fits

You will see that the output of the MakeCatalog has two extensions. The first extension shows the measurements over the OBJECTS, and the second extension shows the measurements over the clumps CLUMPS.

To calculate colors, we also need magnitude measurements on the other filters. So let’s repeat the command above on them, just changing the file names and zero point (which we got from the XDF survey web page):

$ astmkcatalog seg/xdf-f125w.fits --ids --ra --dec --magnitude --sn \
               --zeropoint=26.23 --clumpscat --output=cat/xdf-f125w.fits

$ astmkcatalog seg/xdf-f105w.fits --ids --ra --dec --magnitude --sn \
               --zeropoint=26.27 --clumpscat --output=cat/xdf-f105w.fits

However, the galaxy properties might differ between the filters (which is the whole purpose behind observing in different filters!). Also, the noise properties and depth of the datasets differ. You can see the effect of these factors in the resulting clump catalogs, with Gnuastro’s Table program. We will go deep into working with tables in the next section, but in summary: the -i option will print information about the columns and number of rows. To see the column values, just remove the -i option. In the output of each command below, look at the Number of rows:, and note that they are different.

$ asttable cat/xdf-f105w.fits -hCLUMPS -i
$ asttable cat/xdf-f125w.fits -hCLUMPS -i
$ asttable cat/xdf-f160w.fits -hCLUMPS -i

Matching the catalogs is possible (for example, with Match). However, the measurements of each column are also done on different pixels: the clump labels can/will differ from one filter to another for one object. Please open them and focus on one object to see for yourself. This can bias the result, if you match catalogs.

An accurate color calculation can only be done when magnitudes are measured from the same pixels on all images and this can be done easily with MakeCatalog. In fact this is one of the reasons that NoiseChisel or Segment do not generate a catalog like most other detection/segmentation software. This gives you the freedom of selecting the pixels for measurement in any way you like (from other filters, other software, manually, etc.). Fortunately in these images, the Point spread function (PSF) is very similar, allowing us to use a single labeled image output for all filters³⁶.

The F160W image is deeper, thus providing better detection/segmentation, and redder, thus observing smaller/older stars and representing more of the mass in the galaxies. We will thus use the F160W filter as a reference and use its segment labels to identify which pixels to use for which objects/clumps. But we will do the measurements on the sky-subtracted F105W and F125W images (using MakeCatalog’s --valuesfile option) as shown below: Notice that the only difference between these calls and the call to generate the raw F160W catalog (excluding the zero point and the output name) is the --valuesfile.

$ astmkcatalog seg/xdf-f160w.fits --ids --ra --dec --magnitude --sn \
               --valuesfile=nc/xdf-f125w.fits --zeropoint=26.23 \
               --clumpscat --output=cat/xdf-f125w-on-f160w-lab.fits

$ astmkcatalog seg/xdf-f160w.fits --ids --ra --dec --magnitude --sn \
               --valuesfile=nc/xdf-f105w.fits --zeropoint=26.27 \
               --clumpscat --output=cat/xdf-f105w-on-f160w-lab.fits

After running the commands above, look into what MakeCatalog printed on the command-line. You can see that (as requested) the object and clump pixel labels in both were taken from the respective extensions in seg/xdf-f160w.fits. However, the pixel values and pixel Sky standard deviation were respectively taken from nc/xdf-f105w.fits and nc/xdf-f125w.fits. Since we used the same labeled image on all filters, the number of rows in both catalogs are now identical. Let’s have a look:

$ asttable cat/xdf-f105w-on-f160w-lab.fits -hCLUMPS -i
$ asttable cat/xdf-f125w-on-f160w-lab.fits -hCLUMPS -i
$ asttable cat/xdf-f160w.fits -hCLUMPS -i

Finally, MakeCatalog also does basic calculations on the full dataset (independent of each labeled region but related to whole data), for example, pixel area or per-pixel surface brightness limit. They are stored as keywords in the FITS headers (or lines starting with # in plain text). This (and other ways to measure the limits of your dataset) are discussed in the next section: Measuring the dataset limits.

Footnotes

(35)

This option is plural because we need two ID columns for identifying “clumps” in the clumps catalog/table: the first column will be the ID of the host “object”, and the second one will be the ID of the clump within that object. In the “objects” catalog/table, only a single column will be returned for this option.

(36)

When the PSFs between two images differ largely, you would have to PSF-match the images before using the same pixels for measurements.