GNU Astronomy Utilities




7.3 Segment

Once signal is separated from noise (for example with NoiseChisel), you have a binary dataset: each pixel is either signal (1) or noise (0). Signal (for example every galaxy in your image) has been “detected”, but all detections have a label of 1. Therefore, while we know which pixels contain signal, we still can’t tell how many galaxies they contain, or which detected pixels correspond to which galaxy. At the lowest (most generic) level, detection is a kind of segmentation (segmenting the whole dataset into signal and noise, see NoiseChisel). Here, we’ll define segmentation only on signal: to separate and find sub-structure within the detections.

If the targets are clearly separated (their detected regions aren’t touching), a simple connected components algorithm (footnote 128), which is a very basic form of segmentation, is enough to give each connected region its own label. This is such a basic and simple form of segmentation that Gnuastro’s Arithmetic program has an operator for it: see connected-components in Arithmetic operators. Assuming the binary dataset is called binary.fits, you can use it with a command like this:

$ astarithmetic binary.fits 2 connected-components

You can even do a very basic detection (a threshold, say at value 100) and segmentation in Arithmetic with a single command like below:

$ astarithmetic in.fits 100 gt 2 connected-components
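
To make the idea concrete, here is a minimal pure-Python sketch of thresholding followed by connected-components labeling. It is only an illustration of the technique, not Arithmetic's implementation. The function name label_components and the tiny image are hypothetical, and the sketch assumes that the 2 in the commands above sets the connectivity (8 neighbors in a 2D image, while 1 would mean 4 neighbors).

```python
from collections import deque


def label_components(binary, connectivity=2):
    """Label connected regions of non-zero pixels in a 2D list of lists.

    connectivity=1 uses the 4 edge-sharing neighbors, connectivity=2 uses
    all 8 neighbors (assumed to match the convention of the commands above).
    Returns the label image and the number of labels.
    """
    if connectivity == 1:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        steps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                 if (dy, dx) != (0, 0)]
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for y in range(rows):
        for x in range(cols):
            if not binary[y][x] or labels[y][x]:
                continue
            # Unlabeled signal pixel: start a new label and flood-fill it.
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy, dx in steps:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Hypothetical 4x5 image with two bright groups that touch only diagonally.
image = [
    [5, 120, 130,   4, 3],
    [6, 110,   7,   2, 8],
    [3,   4, 150,   9, 1],
    [2,   1, 140, 135, 6],
]

# Very basic detection: a threshold at 100 (the "100 gt" step).
binary = [[1 if value > 100 else 0 for value in row] for row in image]

labels8, n8 = label_components(binary, connectivity=2)
labels4, n4 = label_components(binary, connectivity=1)
print(n8, n4)  # 1 2
```

With 8 neighbors the diagonal contact joins both groups into one label, while with 4 neighbors they stay separate. Either way, the labeling only follows which pixels touch: it knows nothing about the pixel values inside a region.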

However, in most astronomical situations our targets are not nicely separated and do not have a sharp boundary/edge (for a threshold to suffice): they touch (for example merging galaxies), or are simply in the same line-of-sight (which is much more common). This causes their images to overlap.

In particular, when you do your detection with NoiseChisel, you will detect signal to very low surface brightness limits: deep into the faint wings of galaxies or bright stars (which can extend very far and irregularly from their center). Therefore, it often happens that several galaxies are detected as one large detection. Since they are touching, a simple connected components algorithm will not suffice. It is therefore necessary to do a more sophisticated segmentation and break up the detected pixels (even those that are touching) into multiple target objects as accurately as possible.
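
The effect can be seen in a one-dimensional toy example. The profile values and both thresholds below are made up for illustration: two peaks are joined by a faint bridge, so a deep (low) threshold merges them into a single connected run even though two local maxima are present.

```python
# Hypothetical 1D brightness profile: two peaks (30 and 25) joined by a
# faint bridge (6) that is still above the deep detection threshold.
profile = [1, 2, 9, 30, 12, 6, 11, 25, 8, 2, 1]


def count_runs(values, threshold):
    """Count connected runs of pixels above the threshold (1D components)."""
    detected = [v > threshold for v in values]
    return sum(1 for i, d in enumerate(detected)
               if d and (i == 0 or not detected[i - 1]))


def local_maxima(values, threshold):
    """Indices of detected pixels that are brighter than both neighbors."""
    return [i for i in range(1, len(values) - 1)
            if values[i] > threshold
            and values[i] > values[i - 1]
            and values[i] > values[i + 1]]


deep_runs = count_runs(profile, 5)      # low threshold: faint wings included
shallow_runs = count_runs(profile, 10)  # high threshold: only bright cores
peaks = local_maxima(profile, 5)

print(deep_runs, shallow_runs, peaks)  # 1 2 [3, 7]
```

Raising the threshold would separate the two targets, but only by throwing away their faint outer parts. Keeping the deep detection and separating the targets within it is the job of segmentation.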

Segment will use a detection map and its corresponding dataset to find sub-structure over the detected areas and use them for its segmentation. Until Gnuastro version 0.6 (released in 2018), Segment was part of NoiseChisel. Therefore, similar to NoiseChisel, the best place to start reading about Segment and understanding what it does (with many illustrative figures) is Section 3.2 of Akhlaghi and Ichikawa [2015].

As a summary, Segment first finds true clumps over the detections. Clumps are associated with local maxima/minima (footnote 129) and extend over the neighboring pixels until they reach a local minimum/maximum (river/watershed). By default, Segment will use the distribution of clump signal-to-noise ratios over the undetected regions as reference to find “true” clumps over the detections. Using the undetected regions can be disabled by directly giving a signal-to-noise ratio to --clumpsnthresh.
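
The clump-and-river idea can be sketched with a simple watershed-style flood from the local maxima. This is only an illustration under simplifying assumptions, not Segment's implementation: the function find_clumps, the RIVER marker and the image are hypothetical, ties between equal pixel values are not treated specially, and the signal-to-noise test that separates “true” clumps from false ones is left out.

```python
RIVER = -1  # hypothetical marker for pixels that separate two clumps


def find_clumps(image, threshold):
    """Grow clumps from local maxima over the detected pixels.

    Detected pixels (value > threshold) are visited from brightest to
    faintest. A pixel with no labeled neighbor starts a new clump, a pixel
    touching exactly one clump joins it, and a pixel touching two or more
    clumps becomes a river. Rivers never pass their label on.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    detected = [(y, x) for y in range(rows) for x in range(cols)
                if image[y][x] > threshold]
    detected.sort(key=lambda p: -image[p[0]][p[1]])
    count = 0
    for y, x in detected:
        touching = set()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if ((dy or dx) and 0 <= ny < rows and 0 <= nx < cols
                        and labels[ny][nx] > 0):
                    touching.add(labels[ny][nx])
        if not touching:
            count += 1
            labels[y][x] = count
        elif len(touching) == 1:
            labels[y][x] = touching.pop()
        else:
            labels[y][x] = RIVER
    return labels, count


# Hypothetical detection containing two peaks (9 and 7) with a valley
# (4 and 3) between them; pixels with value 1 are undetected.
image = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 8, 9, 4, 7, 6, 1],
    [1, 7, 8, 3, 6, 5, 1],
    [1, 1, 1, 1, 1, 1, 1],
]

clumps, n_clumps = find_clumps(image, threshold=2)
for row in clumps:
    print(row)
```

In this example the two peaks become clumps 1 and 2, and the valley column between them is marked as a river.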

The true clumps are then grown to a certain threshold over the detections. Based on the strength of the connections (rivers/watersheds) between the grown clumps, they are considered parts of one object or as separate objects. See Section 3.2 of Akhlaghi and Ichikawa [2015] for more. Segment’s main outputs are thus two labeled datasets: 1) clumps, and 2) objects. See Segment output for more.
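
The step from grown clumps to objects can be pictured as grouping: clumps joined by a sufficiently strong connection receive the same object label. The sketch below uses a small union-find for this. Everything in it is an assumption made for illustration (the function group_clumps, the connection list, the strength values and the min_strength cut); how Segment actually measures connection strength is described in the paper and in Invoking Segment.

```python
def group_clumps(n_clumps, connections, min_strength):
    """Assign an object label to each clump (clumps are numbered from 1).

    connections is a list of (clump_a, clump_b, strength) tuples. Two
    clumps end up in the same object when a chain of connections with
    strength >= min_strength links them. Both inputs are hypothetical.
    """
    parent = list(range(n_clumps + 1))  # index 0 is unused

    def find(i):
        # Follow parents to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b, strength in connections:
        if strength >= min_strength:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[max(root_a, root_b)] = min(root_a, root_b)

    # Renumber the roots so object labels are 1, 2, 3, ...
    roots = sorted({find(i) for i in range(1, n_clumps + 1)})
    object_id = {root: k for k, root in enumerate(roots, start=1)}
    return [object_id[find(i)] for i in range(1, n_clumps + 1)]


# Four grown clumps: 1-2 and 3-4 are strongly connected, 2-3 only weakly.
connections = [(1, 2, 4.0), (2, 3, 0.8), (3, 4, 3.1)]
objects = group_clumps(4, connections, min_strength=2.0)
print(objects)  # [1, 1, 2, 2]
```

The result mirrors the two labeled outputs described above: each pixel keeps its clump label, and each clump also carries the label of the object it belongs to.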

To start learning about Segment, especially in relation to detection (NoiseChisel) and measurement (MakeCatalog), the recommended references are Akhlaghi and Ichikawa [2015] and Akhlaghi [2016].

Those papers cannot be updated anymore, but the software will evolve. For example, Segment became a separate program (from NoiseChisel) in 2018 (after those papers were published). Therefore this book is the definitive reference. To help in the transition from those papers to the software you are using, see Segment changes after publication. Finally, in Invoking Segment, we’ll discuss Segment’s inputs, outputs and configuration options.


Footnotes

(128)

https://en.wikipedia.org/wiki/Connected-component_labeling

(129)

By default, the maximum is used as the first clump pixel. To define clumps based on local minima, use the --minima option.

