GNU Astronomy Utilities



7.4.7.1 MakeCatalog inputs and basic settings

MakeCatalog works on a labeled dataset (see MakeCatalog). This dataset maps each pixel to a specific target (a row number in the final catalog), so it is the only input dataset needed to produce a minimal catalog. Because it only contains labels/counters, it must have an integer type (see Numeric data types); see below if your labels are stored in a floating point container. When the requested measurements only need this dataset (for example, --geo-x, --geo-y, or --geo-area), MakeCatalog will not read any other dataset.
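As a minimal sketch of such a labels-only run, the command below requests only geometric columns. The file name seg.fits, the OBJECTS HDU name (as written by Segment) and the output name are assumptions for illustration; the guard only skips the call when MakeCatalog or the file is not available.

```shell
# Hypothetical inputs: Segment's default output and its object-labels HDU.
seg=seg.fits
labelhdu=OBJECTS

# Geometric columns only need the labeled image, so no values or
# Sky datasets are requested here.
if command -v astmkcatalog >/dev/null 2>&1 && [ -f "$seg" ]; then
    astmkcatalog "$seg" --hdu="$labelhdu" --ids --geo-x --geo-y \
                 --geo-area --output=geo-cat.fits
else
    echo "Skipped: astmkcatalog or $seg is not available."
fi
```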

Low-level measurements that only use the labeled image are rarely sufficient for any high-level science case. Therefore, the necessary input datasets depend on the requested columns in each run. For example, let’s assume you want the brightness/magnitude and signal-to-noise ratio of your labeled regions. For these columns, you will also need to provide an extra dataset containing values for every pixel of the labeled input (to measure magnitude) and another for the Sky standard deviation (to measure error). All such auxiliary input files must have the same size (number of pixels in each dimension) as the input labeled image. Their numeric data type is irrelevant (they will be converted to 32-bit floating point internally). For the full list of available measurements, see MakeCatalog measurements.
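The magnitude and signal-to-noise case described above could look like the sketch below. It assumes all three datasets are HDUs of one hypothetical Segment output (seg.fits) with Segment's usual HDU names; the zero point value is a placeholder, not a recommendation.

```shell
# Hypothetical inputs; replace the zero point with that of your image.
seg=seg.fits
zeropoint=25.94

# --magnitude needs the values dataset, and --sn also needs the
# Sky standard deviation dataset.
if command -v astmkcatalog >/dev/null 2>&1 && [ -f "$seg" ]; then
    astmkcatalog "$seg" --hdu=OBJECTS --valueshdu=INPUT-NO-SKY \
                 --stdhdu=SKY_STD --ids --magnitude --sn \
                 --zeropoint="$zeropoint" --output=cat.fits
else
    echo "Skipped: astmkcatalog or $seg is not available."
fi
```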

The “values” dataset is used for measurements like brightness/magnitude, or flux-weighted positions. If it is a real image, by default it is assumed to be already Sky-subtracted prior to running MakeCatalog. If it is not, use the --subtractsky option so that MakeCatalog reads and subtracts the Sky dataset before any processing. To obtain the Sky value, you can use the --sky option of Statistics, but the recommended method is NoiseChisel, see Sky value.
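For an image that has not been Sky-subtracted, a run could be sketched as below. The three file names, the use of HDU 1 in each, and the zero point are assumptions for illustration only.

```shell
# Hypothetical inputs: labels, a non-Sky-subtracted image, and a Sky image.
labels=labels.fits
image=image.fits
sky=sky.fits

# --subtractsky asks MakeCatalog to subtract the Sky dataset given to
# --insky from the values dataset before measuring anything.
if command -v astmkcatalog >/dev/null 2>&1 \
   && [ -f "$labels" ] && [ -f "$image" ] && [ -f "$sky" ]; then
    astmkcatalog "$labels" --hdu=1 --valuesfile="$image" --valueshdu=1 \
                 --insky="$sky" --skyhdu=1 --subtractsky \
                 --ids --magnitude --zeropoint=25.94 --output=cat.fits
else
    echo "Skipped: astmkcatalog or an input file is not available."
fi
```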

MakeCatalog can also do measurements on sub-structures of detections. In other words, it can produce two catalogs. Following the nomenclature of Segment (see Segment), the main labeled input dataset is known as “object” labels and the (optional) sub-structure input dataset is known as “clumps”. If MakeCatalog is run with the --clumpscat option, it will also need a labeled image containing clumps, similar to what Segment produces (see Segment output). Since clumps are defined within detected regions (they exist over signal, not noise), MakeCatalog uses their boundaries to subtract the level of signal under them.
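A two-catalog (objects and clumps) run could be sketched as follows, assuming a hypothetical Segment output named seg.fits with its usual OBJECTS, CLUMPS and INPUT-NO-SKY HDUs; the zero point is a placeholder.

```shell
# Hypothetical input: Segment's default output.
seg=seg.fits

# --clumpscat requests the second (clumps) catalog, so the clump
# labels are read in addition to the object labels.
if command -v astmkcatalog >/dev/null 2>&1 && [ -f "$seg" ]; then
    astmkcatalog "$seg" --hdu=OBJECTS --clumpshdu=CLUMPS \
                 --valueshdu=INPUT-NO-SKY --clumpscat \
                 --ids --magnitude --zeropoint=25.94 --output=cat.fits
else
    echo "Skipped: astmkcatalog or $seg is not available."
fi
```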

There are separate options to explicitly request a file name and HDU/extension for each of the required input datasets as fully described below (with the --*file format). When each dataset is in a separate file, these options are necessary. However, one great advantage of the FITS file format (that is heavily used in astronomy) is that it allows the storage of multiple datasets in one file. So in most situations (for example, if you are using the outputs of NoiseChisel or Segment), all the necessary input datasets can be in one file.

When none of the --*file options are given (for example --clumpsfile or --valuesfile), MakeCatalog will assume the necessary input datasets are available as HDUs in the file given as its argument (without any option). When the Sky or Sky standard deviation datasets are necessary and the only --*file option called is --valuesfile, MakeCatalog will search for these datasets (with the default/given HDUs) in the file given to --valuesfile (before looking into the main argument file).
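The search order above can be illustrated with a run where --valuesfile is the only --*file option. In this sketch (file and HDU names are assumptions), no --instd is given, so the Sky standard deviation HDU is looked for in the file given to --valuesfile first.

```shell
# Hypothetical inputs: a separate label image and a NoiseChisel output.
labels=labels.fits
nc=nc.fits

# Only --valuesfile is given: the SKY_STD HDU needed by --sn is first
# searched for in "$nc", then in the main argument file.
if command -v astmkcatalog >/dev/null 2>&1 \
   && [ -f "$labels" ] && [ -f "$nc" ]; then
    astmkcatalog "$labels" --hdu=1 --valuesfile="$nc" \
                 --valueshdu=INPUT-NO-SKY --stdhdu=SKY_STD \
                 --ids --sn --output=cat.fits
else
    echo "Skipped: astmkcatalog or an input file is not available."
fi
```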

It may happen that your labeled objects image was created with a program that only outputs floating point files. However, you know it only has integer valued pixels that are stored in a floating point container. In such cases, you can use Gnuastro’s Arithmetic program (see Arithmetic) to change the numerical data type of the image (float.fits) to an integer type image (int.fits) with a command like below:

$ astarithmetic float.fits int32 --output=int.fits

To summarize: if the input file to MakeCatalog is the default/full output of Segment (see Segment output) you do not have to worry about any of the --*file options below. You can just give Segment’s output file to MakeCatalog as described in Invoking MakeCatalog. To feed NoiseChisel’s output into MakeCatalog, just change the labeled dataset’s HDU (with --hdu=DETECTIONS). The full list of input dataset options and general setting options are described below.
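Feeding NoiseChisel’s output could be sketched as below. The file name nc.fits is hypothetical, and the values and standard deviation HDU names are written explicitly instead of relying on any defaults.

```shell
# Hypothetical input: NoiseChisel's default output.
nc=nc.fits

# The only change from a Segment-based run is the labels HDU.
if command -v astmkcatalog >/dev/null 2>&1 && [ -f "$nc" ]; then
    astmkcatalog "$nc" --hdu=DETECTIONS --valueshdu=INPUT-NO-SKY \
                 --stdhdu=SKY_STD --ids --geo-area --sn \
                 --output=det-cat.fits
else
    echo "Skipped: astmkcatalog or $nc is not available."
fi
```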

-l FITS
--clumpsfile=FITS

The FITS file containing the labeled clumps dataset when --clumpscat is called (see MakeCatalog output). When --clumpscat is called, but this option is not, MakeCatalog will look into the main input file (given as an argument) for the required extension/HDU (value to --clumpshdu).

--clumpshdu=STR

The HDU/extension of the clump labels dataset. The clump labels dataset has to be an integer data type (see Numeric data types) and only pixels with a value larger than zero will be used. See Segment output for a description of the expected format.

-v FITS
--valuesfile=FITS

The file name of the (sky-subtracted) values dataset. When any of the columns need values to associate with the input labels (for example, to measure the sum of pixel values or magnitude of a galaxy, see Brightness, Flux, Magnitude and Surface brightness), MakeCatalog will look into a “values” dataset for the respective pixel values. In most common processing, this is the actual astronomical image over which the labels were defined, or detected. The HDU/extension of this dataset in the given file can be specified with --valueshdu. If this option is not called, MakeCatalog will look for the given extension in the main input file.

--valueshdu=STR/INT

The name or number (counting from zero) of the extension containing the “values” dataset, see the descriptions above and those in --valuesfile for more.

-s FITS/FLT
--insky=FITS/FLT

Sky value as a single number, or the file name containing a dataset (different values per pixel or tile). The Sky dataset is only necessary when --subtractsky is called or when a column directly related to the Sky value is requested (currently --sky). This dataset may be a tessellation, with one element per tile (see --oneelempertile of NoiseChisel’s Processing options).

When the Sky dataset is necessary but this option is not called, MakeCatalog will assume it is an HDU/extension (specified by --skyhdu) in one of the already given files. First it will look for it in the --valuesfile (if it is given) and then the main input file (given as an argument).

By default the values dataset is assumed to be already Sky subtracted, so this dataset is not necessary for many of the columns.

--skyhdu=STR

HDU/extension of the Sky dataset, see --insky.

--subtractsky

Subtract the sky value or dataset from the values file prior to any processing.

-t STR/FLT
--instd=STR/FLT

Sky standard deviation value as a single number, or the file name containing a dataset (different values per pixel or tile). With the --variance option you can tell MakeCatalog to interpret this value/dataset as a variance image, not standard deviation.

Important note: This must only be the SKY standard deviation or variance (not including the signal’s contribution to the error). In other words, the final standard deviation of a pixel depends on how much signal there is in it. MakeCatalog will find the amount of signal within each pixel (while subtracting the Sky, if --subtractsky is called) and account for the extra error due to its value (signal). Therefore, if the input standard deviation (or variance) image also contains the contribution of signal to the error, the final error measurements will be over-estimated.

--stdhdu=STR

The HDU of the Sky value standard deviation image.

--variance

The dataset given to --instd (and --stdhdu) has the Sky variance of every pixel, not the Sky standard deviation.
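When your noise map is a variance image, a run could be sketched as follows. The file names and the use of HDU 1 are assumptions for illustration.

```shell
# Hypothetical inputs: labels, Sky-subtracted image, and Sky variance map.
labels=labels.fits
image=image.fits
var=var.fits

# --variance tells MakeCatalog that the --instd dataset holds
# variances rather than standard deviations.
if command -v astmkcatalog >/dev/null 2>&1 \
   && [ -f "$labels" ] && [ -f "$image" ] && [ -f "$var" ]; then
    astmkcatalog "$labels" --hdu=1 --valuesfile="$image" --valueshdu=1 \
                 --instd="$var" --stdhdu=1 --variance \
                 --ids --sn --output=cat.fits
else
    echo "Skipped: astmkcatalog or an input file is not available."
fi
```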

--forcereadstd

Read the input STD image even if it is not required by any of the requested columns. This is because some of the output catalog’s metadata may need it, for example, to calculate the dataset’s surface brightness limit (see Quantifying measurement limits, configured with --sfmagarea and --sfmagnsigma in MakeCatalog output).

Furthermore, if the input STD image does not have the MEDSTD keyword (that is meant to contain the representative standard deviation of the full image), with this option, the median will be calculated and used for the surface brightness limit.

-z FLT
--zeropoint=FLT

The zero point magnitude for the input image, see Brightness, Flux, Magnitude and Surface brightness.

--sigmaclip FLT,FLT

The sigma-clipping parameters when any of the sigma-clipping related columns are requested (for example, --sigclip-median or --sigclip-number).

This option takes two values: the first is the multiple of \(\sigma\), and the second is the termination criterion. If the latter is larger than 1, it is read as an integer and will be the number of times to clip. If it is smaller than 1, it is interpreted as the tolerance level to stop clipping. See Sigma clipping for a complete explanation.
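The two ways of reading the second value can be sketched as below: 3,0.2 clips at 3 sigma until the tolerance of 0.2 is reached, while 3,5 would clip at 3 sigma exactly 5 times. The input file name, its HDU names and both pairs of numbers are only illustrative.

```shell
# Hypothetical input and clipping parameters.
seg=seg.fits
by_tolerance=3,0.2   # second value < 1: tolerance to stop clipping
by_number=3,5        # second value > 1: number of clips

# Swap "$by_tolerance" with "$by_number" to clip a fixed number of times.
if command -v astmkcatalog >/dev/null 2>&1 && [ -f "$seg" ]; then
    astmkcatalog "$seg" --hdu=OBJECTS --valueshdu=INPUT-NO-SKY \
                 --ids --sigclip-median --sigclip-number \
                 --sigmaclip="$by_tolerance" --output=sigclip-cat.fits
else
    echo "Skipped: astmkcatalog or $seg is not available."
fi
```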

--frac-max=FLT[,FLT]

The fractions (one or two) of maximum value in objects or clumps to be used in the related columns, for example, --frac-max1-area, --frac-max1-sum or --frac-max1-radius, see MakeCatalog measurements. For the maximum value, see the description of --maximum column below. The value(s) of this option must be larger than 0 and smaller than 1 (they are a fraction). When only --frac-max1-area or --frac-max1-sum is requested, one value must be given to this option, but if --frac-max2-area or --frac-max2-sum are also requested, two values must be given to this option. The values can be written as simple floating point numbers, or as fractions, for example, 0.25,0.75 and 0.25,3/4 are the same.
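Requesting both fraction-of-maximum areas could be sketched as below. Two values are given because both the first and second fraction columns are requested; the file name, HDU names and the fractions are assumptions for illustration.

```shell
# Hypothetical input and fractions (0.25,3/4 would be equivalent).
seg=seg.fits
fracs=0.25,0.75

# The first value is used by --frac-max1-area and the second by
# --frac-max2-area.
if command -v astmkcatalog >/dev/null 2>&1 && [ -f "$seg" ]; then
    astmkcatalog "$seg" --hdu=OBJECTS --valueshdu=INPUT-NO-SKY \
                 --ids --frac-max1-area --frac-max2-area \
                 --frac-max="$fracs" --output=fracmax-cat.fits
else
    echo "Skipped: astmkcatalog or $seg is not available."
fi
```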

--spatialresolution=FLT

The error in measuring spatial properties (for example, the area) in units of pixels. You can think of this as the FWHM of the dataset’s PSF; it is used in measurements like the error in surface brightness (--sb-error, see MakeCatalog measurements). Ideally, images are taken at the optimal Nyquist sampling (see Sampling theorem), so the default value for this option is 2. But in practice, real images may be over-sampled (usually ground-based images, where you will need to increase the default value) or under-sampled (some space-based images, where you will need to decrease the default value).

--inbetweenints

Output will contain one row for every integer between 1 and the largest label in the input (irrespective of their existence in the input image). By default, MakeCatalog’s output will only contain rows with integers that actually corresponded to at least one pixel in the input dataset.

For example, if the input’s only labeled pixel values are 11 and 13, MakeCatalog’s default output will only have two rows. If you use this option, it will have 13 rows and all the columns corresponding to integer identifiers that did not correspond to any pixel will be 0 or NaN (depending on context).
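For the example above, the two behaviors could be compared with a sketch like the following, where labels.fits is a hypothetical image whose only labels are 11 and 13.

```shell
# Hypothetical input: a label image containing only labels 11 and 13.
labels=labels.fits

if command -v astmkcatalog >/dev/null 2>&1 && [ -f "$labels" ]; then
    # Default: two rows (labels 11 and 13).
    astmkcatalog "$labels" --hdu=1 --ids --geo-area \
                 --output=default-cat.fits
    # With --inbetweenints: 13 rows (labels 1 to 13).
    astmkcatalog "$labels" --hdu=1 --ids --geo-area --inbetweenints \
                 --output=all-ints-cat.fits
else
    echo "Skipped: astmkcatalog or $labels is not available."
fi
```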