GNU Astronomy Utilities

Next: , Previous: , Up: Tutorials   [Contents][Index]

2.2 General program usage tutorial

Measuring colors of astronomical objects in broad-band or narrow-band images is one of the most basic and common steps in astronomical analysis. Here, we will use Gnuastro’s programs to get a physical scale (area at certain redshifts) of the field we are studying, detect objects in a Hubble Space Telescope (HST) image, measure their colors and identify the ones with the largest colors to visual inspection and their spatial position in the image. After this tutorial, you can also try the Detecting large extended targets tutorial which goes into a little more detail on optimally configuring NoiseChisel (Gnuastro’s detection tool) in special situations.

During the tutorial, we will take many detours to explain, and practically demonstrate, the many capabilities of Gnuastro’s programs. In the end you will see that the things you learned during this tutorial are much more generic than this particular problem and can be used in solving a wide variety of problems involving the analysis of data (images or tables). So please don’t rush, and go through the steps patiently to optimally master Gnuastro.

In this tutorial, we’ll use the HST eXtreme Deep Field dataset. Like almost all astronomical surveys, this dataset is free for download and usable by the public. You will need the following tools in this tutorial: Gnuastro, SAO DS9 24, GNU Wget25, and AWK (most common implementation is GNU AWK26).

This tutorial was first prepared for the “Exploring the Ultra-Low Surface Brightness Universe” workshop (November 2017) at the ISSI in Bern, Switzerland. It was further extended in the “4th Indo-French Astronomy School” (July 2018) organized by LIO, CRAL CNRS UMR5574, UCBL, and IUCAA in Lyon, France. We are very grateful to the organizers of these workshops and the attendees for the very fruitful discussions and suggestions that made this tutorial possible.

Write the example commands manually: Try to type the example commands on your terminal manually and use the history feature of your command-line (by pressing the “up” button to retrieve previous commands). Don’t simply copy and paste the commands shown here. This will help simulate future situations when you are processing your own datasets.

A handy feature of Gnuastro is that all program names start with ast. This will allow your command-line processor to easily list and auto-complete Gnuastro’s programs for you. Try typing the following command (press TAB key when you see <TAB>) to see the list:

$ ast<TAB><TAB>

Any program that starts with ast (including all Gnuastro programs) will be shown. By choosing the subsequent characters of your desired program and pressing <TAB><TAB> again, the list will narrow down and the program name will auto-complete once your input characters are unambiguous. In short, you often don’t need to type the full name of the program you want to run.

Gnuastro contains a large number of programs and it is natural to forget the details of each program’s options or inputs and outputs. Therefore, before starting the analysis, let’s review how you can access this book to refresh your memory any time you want. For example when working on the command-line, without having to take your hands off the keyboard. When you install Gnuastro, this book is also installed on your system along with all the programs and libraries, so you don’t need an internet connection to to access/read it. Also, by accessing this book as described below, you can be sure that it corresponds to your installed version of Gnuastro.

GNU Info27 is the program in charge of displaying the manual on the command-line (for more, see Info). To see this whole book on your command-line, please run the following command and press subsequent keys. Info has its own mini-environment, therefore we’ll show the keys that must be pressed in the mini-environment after a -> sign. You can also ignore anything after the # sign in the middle of the line, they are only for your information.

$ info gnuastro                # Open the top of the manual.
-> <SPACE>                     # All the book chapters.
-> <SPACE>                     # Continue down: show sections.
-> <SPACE> ...                 # Keep pressing space to go down.
-> q                           # Quit Info, return to the command-line.

The thing that greatly simplifies navigation in Info is the links (regions with an underline). You can immediately go to the next link in the page with the <TAB> key and press <ENTER> on it to go into that part of the manual. Try the commands above again, but this time also use <TAB> to go to the links and press <ENTER> on them to go to the respective section of the book. Then follow a few more links and go deeper into the book. To return to the previous page, press l (small L). If you are searching for a specific phrase in the whole book (for example an option name), press s and type your search phrase and end it with an <ENTER>.

You don’t need to start from the top of the manual every time. For example, to get to Invoking NoiseChisel, run the following command. In general, all programs have such an “Invoking ProgramName” section in this book. These sections are specifically for the description of inputs, outputs and configuration options of each program. You can access them directly for each program by giving its executable name to Info.

$ info astnoisechisel

The other sections don’t have such shortcuts. To directly access them from the command-line, you need to tell Info to look into Gnuastro’s manual, then look for the specific section (an unambiguous title is necessary). For example, if you only want to review/remember NoiseChisel’s Detection options), just run the following command. Note how case is irrelevant for Info when calling a title in this manner.

$ info gnuastro "Detection options"

In general, Info is a powerful and convenient way to access this whole book with detailed information about the programs you are running. If you are not already familiar with it, please run the following command and just read along and do what it says to learn it. Don’t stop until you feel sufficiently fluent in it. Please invest the half an hour’s time necessary to start using Info comfortably. It will greatly improve your productivity and you will start reaping the rewards of this investment very soon.

$ info info

As a good scientist you need to feel comfortable to play with the features/options and avoid (be critical to) using default values as much as possible. On the other hand, our human memory is limited, so it is important to be able to easily access any part of this book fast and remember the option names, what they do and their acceptable values.

If you just want the option names and a short description, calling the program with the --help option might also be a good solution like the first example below. If you know a few characters of the option name, you can feed the output to grep like the second or third example commands.

$ astnoisechisel --help
$ astnoisechisel --help | grep quant
$ astnoisechisel --help | grep check

Let’s start the processing. First, to keep things clean, let’s create a gnuastro-tutorial directory and continue all future steps in it:

$ mkdir gnuastro-tutorial
$ cd gnuastro-tutorial

We will be using the near infra-red Wide Field Camera dataset. If you already have them in another directory (for example XDFDIR), you can set the download directory to be a symbolic link to XDFDIR with a command like this:

$ ln -s XDFDIR download

If the following images aren’t already present on your system, you can make a download directory and download them there.

$ mkdir download
$ cd download
$ xdfurl=
$ wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_f105w_v1_sci.fits
$ wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_f160w_v1_sci.fits
$ cd ..

In this tutorial, we’ll just use these two filters. Later, you will probably need to download more filters, you can use the shell’s for loop to download them all in series (one after the other28) with one command like the one below for the WFC3 filters. Put this command instead of the two wget commands above. Recall that all the extra spaces, back-slashes (\), and new lines can be ignored if you are typing on the lines on the terminal.

$ for f in f105w f125w f140w f160w; do                              \
    wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_"$f"_v1_sci.fits;   \

First, let’s visually inspect the dataset. Let’s take F160W image as an example. Do the steps below with the other image(s) too (and later with any dataset that you want to work on). It is very important to understand your dataset visually. Note how ds9 doesn’t follow the GNU style of options where “long” and “short” options are preceded by -- and - respectively (for example --width and -w, see Options).

Ds9’s -zscale option is a good scaling to highlight the low surface brightness regions, and as the name suggests, -zoom to fit will fit the whole dataset in the window. If the window is too small, expand it with your mouse, then press the “zoom” button on the top row of buttons above the image, then in the row below it, press “zoom fit”. You can also zoom in and out by scrolling your mouse or the respective operation on your touch-pad when your cursor/pointer is over the image.

$ ds9 download/hlsp_xdf_hst_wfc3ir-60mas_hudf_f160w_v1_sci.fits     \
      -zscale -zoom to fit

The first thing you might notice is that the regions with no data have a value of zero in this image. The next thing might be that the dataset actually has two “depth”s (see Quantifying measurement limits). The exposure time of the deep inner region is more than 4 times of the outer parts. Fortunately the XDF survey webpage (above) contains the vertices of the deep flat WFC3-IR field. With Gnuastro’s Crop program29, you can use those vertices to cutout this deep infra-red region from the larger image. We’ll make a directory called flat-ir and keep the flat infra-red regions in that directory (with a ‘xdf-’ suffix for a shorter and easier filename).

$ mkdir flat-ir
$ astcrop --mode=wcs -h0 --output=flat-ir/xdf-f105w.fits              \
          --polygon="53.187414,-27.779152 : 53.159507,-27.759633 :    \
                     53.134517,-27.787144 : 53.161906,-27.807208"     \
$ astcrop --mode=wcs -h0 --output=flat-ir/xdf-f160w.fits              \
          --polygon="53.187414,-27.779152 : 53.159507,-27.759633 :    \
                     53.134517,-27.787144 : 53.161906,-27.807208"     \

The only thing varying in the two calls to Gnuastro’s Crop program is the filter name. Therefore, to simplify the command, and later allow work on more filters, we can use the shell’s for loop. Notice how the two places where the filter names (f105w and f160w) are used above have been replaced with $f (the shell variable that for is in charge of setting) below. To generalize this for more filters later, you can simply add the other filter names in the first line before the semi-colon (;).

$ for f in f105w f160w; do                                            \
    astcrop --mode=wcs -h0 --output=flat-ir/xdf-$f.fits               \
            --polygon="53.187414,-27.779152 : 53.159507,-27.759633 :  \
                       53.134517,-27.787144 : 53.161906,-27.807208"   \
            download/hlsp_xdf_hst_wfc3ir-60mas_hudf_"$f"_v1_sci.fits; \

Please open these images and inspect them with the same ds9 command you used above. You will see how it is nicely flat now and doesn’t have varying depths. Another important result of this crop is that regions with no data now have a NaN (Not-a-Number, or a blank value) value, not zero. Zero is a number, and thus a meaningful value, especially when you later want to NoiseChisel30. Generally, when you want to ignore some pixels in a dataset, and avoid higher-level ambiguities or complications, it is always best to give them blank values (not zero, or some other absurdly large or small number).

This is the deepest image we currently have of the sky. The first thing that comes to mind may be this: “How large is this field?”. The FITS world coordinate system (WCS) meta data standard contains the key to answering this question: the CDELT keyword31. With the commands below, we’ll use it (along with the image size) to find the answer. The lines starting with ## are just comments for you to help in following the steps. Don’t type them on the terminal. The commands are intentionally repetitive in some places to better understand each step and also to demonstrate the beauty of command-line features like variables, pipes and loops. Later, if you would like to repeat this process on another dataset, you can just use commands 3, 7, and 9.

Use shell history: Don’t forget to make effective use of your shell’s history. This is especially convenient when you just want to make a small change to your previous command. Press the “up” key on your keyboard (possibly multiple times) to see your previous command(s).

## (1)  See the general statistics of non-blank pixel values.
$ aststatistics flat-ir/xdf-f160w.fits

## (2)  We only want the number of non-blank pixels.
$ aststatistics flat-ir/xdf-f160w.fits --number

## (3)  Keep the result of the command above in the shell variable `n'.
$ n=$(aststatistics flat-ir/xdf-f160w.fits --number)

## (4)  See what is stored the shell variable `n'.
$ echo $n

## (5)  Show all the FITS keywords of this image.
$ astfits flat-ir/xdf-f160w.fits -h1

## (6)  The resolution (in degrees/pixel) is in the `CDELT' keywords.
##      Only show lines that contain these characters, by feeding
##      the output of the previous command to the `grep' program.
$ astfits flat-ir/xdf-f160w.fits -h1 | grep CDELT

## (7)  Save the resolution (same in both dimensions) in the variable
##      `r'. The last part uses AWK to print the third `field' of its
##      input line. The first two fields were `CDELT1' and `='.
$ r=$(astfits flat-ir/xdf-f160w.fits -h1 | grep CDELT1   \
              | awk '{print $3}')

## (8)  Print the values of `n' and `r'.
$ echo $n $r

## (9)  Use the number of pixels (first number passed to AWK) and
##      length of each pixel's edge (second number passed to AWK)
##      to estimate the area of the field in arc-minutes squared.
$ area=$(echo $n $r | awk '{print $1 * ($2^2) * 3600}')

The area of this field is 4.03817 (or 4.04) arc-minutes squared. Just for comparison, this is roughly 175 times smaller than the average moon’s angular area (with a diameter of 30arc-minutes or half a degree).

AWK for table/value processing: AWK is a powerful and simple tool for text processing. Above (and further below) some simple examples are shown. GNU AWK (the most common implementation) comes with a free and wonderful book in the same format as this book which will allow you to master it nicely. Just like this manual, you can also access GNU AWK’s manual on the command-line whenever necessary without taking your hands off the keyboard.

This takes us to the second question that you have probably asked yourself when you saw the field for the first time: “How large is this area at different redshifts?”. To get a feeling of the tangential area that this field covers at redshift 2, you can use CosmicCalculator. In particular, you need the tangential distance covered by 1 arc-second as raw output. Combined with the field’s area, we can then calculate the tangential distance in Mega Parsecs squared (\(Mpc^2\)).

## Print general cosmological properties at redshift 2.
$ astcosmiccal -z2

## When given a "Specific calculation" option, CosmicCalculator
## will just print that particular calculation. See the options
## under this title in the output of `--help' for more.
$ astcosmiccal --help

## Only print the "Tangential dist. covered by 1arcsec at z (kpc)".
## in units of kpc/arc-seconds.
$ astcosmiccal -z2 --arcsectandist

## Convert this distance to kpc^2/arcmin^2 and save in `k'.
$ k=$(astcosmiccal -z2 --arcsectandist | awk '{print ($1*60)^2}')

## Multiply by the area of the field (in arcmin^2) and divide by
## 10^6 to return value in Mpc^2.
$ echo $k $area | awk '{print $1 * $2 / 1e6}'

At redshift 2, this field therefore covers 1.07145 \(Mpc^2\). If you would like to see how this tangential area changes with redshift, you can use a shell loop like below.

$ for z in 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0; do           \
    k=$(astcosmiccal -z$z --arcsectandist);                      \
    echo $z $k $area | awk '{print $1, ($2*60)^2 * $3 / 1e6}';   \

Fortunately, the shell has a useful tool/program to print a sequence of numbers that is nicely called seq. You can use it instead of typing all the different redshifts in this example. For example the loop below will print the same range of redshifts (between 0.5 and 5) but with increments of 0.1.

$ for z in $(seq 0.5 0.1 5); do                                  \
    k=$(astcosmiccal -z$z --arcsectandist);                      \
    echo $z $k $area | awk '{print $1, ($2*60)^2 * $3 / 1e6}';   \

This is a fast and simple way for this repeated calculation when it is only necessary once. However, if you commonly need this calculation and possibly for a larger number of redshifts, the command above can be slow. This is because the CosmicCalculator program has a lot of overhead. To be generic and easy to operate, it has to parse the command-line and all configuration files (see below) which contain human-readable characters and need a lot of processing to be ready for processing by the computer. Afterwards, CosmicCalculator has to check the sanity of its inputs and check which of its many options you have asked for. It has to do all of these for every redshift in the loop above.

To greatly speed up the processing, you can directly access the root work-horse of CosmicCalculator without all that overhead. Using Gnuastro’s library, you can write your own tiny program particularly designed for this exact calculation (and nothing else!). To do that, copy and paste the following C program in a file called myprogram.c.

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <gnuastro/cosmology.h>

  double area=4.03817;          /* Area of field (arcmin^2). */
  double z, adist, tandist;     /* Temporary variables.      */

  /* Constants from Plank 2018 (arXiv:1807.06209, Table 2) */
  double H0=67.66, olambda=0.6889, omatter=0.3111, oradiation=0;

  /* Do the same thing for all redshifts (z) between 0.1 and 5. */
  for(z=0.1; z<5; z+=0.1)
      /* Calculate the angular diameter distance. */
      adist=gal_cosmology_angular_distance(z, H0, olambda,
                                           omatter, oradiation);

      /* Calculate the tangential distance of one arcsecond. */
      tandist = adist * 1000 * M_PI / 3600 / 180;

      /* Print the redshift and area. */
      printf("%-5.2f %g\n", z, pow(tandist * 60,2) * area / 1e6);

  /* Tell the system that everything finished successfully. */
  return EXIT_SUCCESS;

To greatly simplify the compilation, linking and running of simple C programs like this that use Gnuastro’s library, Gnuastro has BuildProgram. This program designed to manage Gnuastro’s dependencies, compile and link the program and then run the new program. To build and run the program above, simply run the following command:

$ astbuildprog myprogram.c

Did you notice how much faster this was compared to the shell loop we wrote above? You might have noticed that a new file called myprogram is also created in the directory. This is the compiled program that was created and run by the command above (its in binary machine code format, not human-readable any more). You can run it again to get the same results with a command like this:

$ ./myprogram

The efficiency of myprogram compared to CosmicCalculator is because the requested processing is faster/comparable to the overheads necessary for each processing. For other programs that take large input datasets (images for example), the overhead is usually negligible compared to the processing. In such cases, the libraries are only useful if you want a different/new processing compared to the functionalities in Gnuastro’s existing programs.

Gnuastro has a large library which is heavily used by all the programs. In other words, the library is like the skeleton of Gnuastro. For the full list of available functions classified by context, please see Gnuastro library. Gnuastro’s library and BuildProgram are created to make it easy for you to use these powerful features as you like. This gives you a high level of creativity, while also providing efficiency and robustness. Several other complete working examples (involving images and tables) of Gnuastro’s libraries can be see in Library demo programs. Let’s stop the discussion on libraries at this point in this tutorial and get back to Gnuastro’s already built programs which were the main purpose of this tutorial.

None of Gnuastro’s programs keep a default value internally within their code. However, when you ran CosmicCalculator with the -z2 option above, it completed its processing and printed results. So where did the “default” cosmological parameter values (like the matter density and etc) come from? The values come from the command-line or a configuration file (see Configuration file precedence).

CosmicCalculator has a limited set of parameters and doesn’t need any particular file inputs. Therefore, we’ll use it to discuss configuration files which are an important part of all Gnuastro’s programs (see Configuration files).

Once you get comfortable with configuration files, you can easily do the same for the options of all Gnuastro programs. For example, NoiseChisel has the largest number of options in the programs. Therefore configuration files will be useful for it when you use different datasets (with different noise properties or in different research contexts). The configuration of each program (besides its version) is vital for the reproducibility of your results, so it is important to manage them properly.

As we saw above, the full list of the options in all Gnuastro programs can be seen with the --help option. Try calling it with CosmicCalculator as shown below. Note how options are grouped by context to make it easier to find your desired option. However, in each group, options are ordered alphabetically.

$ astcosmiccal --help

The options that need a value have an = sign after their long version and FLT, INT or STR for floating point numbers, integer numbers and strings (filenames for example) respectively. All options have a long format and some have a short format (a single character), for more see Options.

When you are using a program, it is often necessary to check the value the option has just before the program starts its processing. In other words, after it has parsed the command-line options and all configuration files. You can see the values of all options that need one with the --printparams or -P option that is common to all programs (see Common options). In the command below, try replacing -P with --printparams to see how both do the same operation.

$ astcosmiccal -P

Let’s say you want a different Hubble constant. Try running the following command to see how the Hubble constant in the output of the command above has changed. Afterwards, delete the -P and add a -z2 to see the results with the new cosmology (or configuration).

$ astcosmiccal -P --H0=70

From the output of the --help option, note how the option for Hubble constant has both short (-H) and long (--H0) formats. One final note is that the equal (=) sign is not mandatory. In the short format, the value can stick to the actual option (the short option name is just one character after-all and thus easily identifiable) and in the long format, a white-space character is also enough.

$ astcosmiccal -H70    -z2
$ astcosmiccal --H0 70 -z2 --arcsectandist

Let’s assume that in one project, you want to only use rounded cosmological parameters (H0 of 70km/s/Mpc and matter density of 0.3). You should therefore run CosmicCalculator like this:

$ astcosmiccal --H0=70 --olambda=0.7 --omatter=0.3 -z2

But having to type these extra options every time you run CosmicCalculator will be prone to errors (typos in particular) and also will be frustrating and slow. Therefore in Gnuastro, you can put all the options and their values in a “Configuration file” and tell the programs to read the option values from there.

Let’s create a configuration file. In your favorite text editor, make a file named my-cosmology.conf (or my-cosmology.txt, the suffix doesn’t matter) which contains the following lines. One space between the option value and name is enough, the values are just under each other to help in readability. Also note that you can only use long option names in configuration files.

H0       70
olambda  0.7
omatter  0.3

You can now tell CosmicCalculator to read this file for option values immediately using the --config option as shown below. Do you see how the output of the following command corresponds to the option values in my-cosmology.conf (previous command)?

$ astcosmiccal --config=my-cosmology.conf -z2

If you need this cosmology every time you are working in a specific directory, you can benefit from Gnuastro’s default configuration files to avoid having to call the --config option. Let’s assume that you want any CosmicCalculator call you make in the my-cosmology directory to use these parameters. You just have to copy the above configuration file into a special directory and file:

$ mkdir my-cosmology
$ mkdir my-cosmology/.gnuastro
$ mv my-cosmology.conf my-cosmology/.gnuastro/astcosmiccal.conf

Once you run CosmicCalculator within my-cosmology as shown below, you will see how your cosmology has been implemented without having to type anything extra on the command-line.

$ cd my-cosmology
$ astcosmiccal -z2
$ cd ..

To further simplify the process, you can use the --setdirconf option. If you are already in your desired directory, calling this option with the others will automatically write the final values (along with descriptions) in .gnuastro/astcosmiccal.conf. For example the commands below will make the same configuration file automatically (with one extra call to CosmicCalculator).

$ mkdir my-cosmology2
$ cd my-cosmology2
$ astcosmiccal --H0 70 --olambda=0.7 --omatter=0.3 --setdirconf
$ astcosmiccal -z2
$ cd ..

Gnuastro’s programs also have default configuration files for a specific user (when run in any directory). This allows you to set a special behavior every time a program is run by a specific user. Only the directory and filename differ from the above, the rest of the process is similar to before. Finally, there are also system-wide configuration files that can be used to define the option values for all users on a system. See Configuration file precedence for a more detailed discussion.

We are now ready to start processing the downloaded images. Since these datasets are already aligned, you don’t need to align them to make sure the pixel grid covers the same region in all inputs. Gnuastro’s Warp program has features for such pixel-grid warping (see Warp). Therefore, just for a demonstration, let’s assume one image needs to be rotated by 20 degrees to correspond to the other. To do that, you can run the following command:

$ astwarp flat-ir/xdf-f160w.fits --rotate=20

Open the output and see it. If your final image is already aligned with RA and Dec, you can simply use the --align option and let Warp calculate the necessary rotation and apply it.

Warp can generally be used for any kind of pixel grid manipulation (warping). For example the outputs of the commands below will respectively have larger pixels (new resolution being one quarter the original resolution), get shifted by 2.8 (by sub-pixel), get a shear of 2, and be tilted (projected). After running each, please open the output file and see the effect.

$ astwarp flat-ir/xdf-f160w.fits --scale=0.25
$ astwarp flat-ir/xdf-f160w.fits --translate=2.8
$ astwarp flat-ir/xdf-f160w.fits --shear=2
$ astwarp flat-ir/xdf-f160w.fits --project=0.001,0.0005

If you need to do multiple warps, you can combine them in one call to Warp. For example to first rotate the image, then scale it, run this command:

$ astwarp flat-ir/xdf-f160w.fits --rotate=20 --scale=0.25

If you have multiple warps, do them all in one command. Don’t warp them in separate commands because the correlated noise will become too strong. As you see in the matrix that is printed when you run Warp, it merges all the warps into a single warping matrix (see Warping basics and Merging multiple warpings) and simply applies that just once. Recall that since this is done through matrix multiplication, order matters in the separate operations. In fact through Warp’s --matrix option, you can directly request your desired final warp and don’t have to break it up into different warps like above (see Invoking Warp).

Fortunately these datasets are already aligned to the same pixel grid, so you don’t actually need the files that were just generated. You can safely delete them all with the following command. Here, you see why we put the processed outputs that we need later into a separate directory. In this way, the top directory can be used for temporary files for testing that you can simply delete with a generic command like below.

$ rm *.fits

To detect the signal in the image (separate interesting pixels from noise), we’ll run NoiseChisel (NoiseChisel):

$ astnoisechisel flat-ir/xdf-f160w.fits

NoiseChisel’s output is a single FITS file containing multiple extensions. In the FITS format, each extension contains a separate dataset (image in this case). You can get basic information about the extensions in a FITS file with Gnuastro’s Fits program (see Fits):

$ astfits xdf-f160w_detected.fits

From the output list, we see that NoiseChisel’s output contains 5 extensions and the first (counting from zero, with name NOISECHISEL-CONFIG) is empty: it has value of 0 in the last column (which shows its size). The first extension in all the outputs of Gnuastro’s programs only contains meta-data: data about/describing the datasets within (all) the output’s extension(s). This allows the first extension to keep meta-data about all the extensions and is recommended by the FITS standard, see Fits for more. This generic meta-data (for the whole file) is very important for being able to reproduce this same result later.

The second extension of NoiseChisel’s output (numbered 1 and named INPUT-NO-SKY) is the Sky-subtracted input that you provided. The third (DETECTIONS) is NoiseChisel’s main output which is a binary image with only two possible values for all pixels: 0 for noise and 1 for signal. Since it only has two values, to avoid taking too much space on your computer, its numeric datatype an unsigned 8-bit integer (or uint8)32. The fourth and fifth (SKY and SKY_STD) extensions, have the Sky and its standard deviation values for the input on a tile grid and were calculated over the undetected regions (for more on the importance of the Sky value, see Sky value).

Reproducing your results later (or checking the configuration of the program that produced the dataset at a later time during your higher-level analysis) is very important in any research. Therefore, Let’s first take a closer look at the NOISECHISEL-CONFIG extension. But first, we’ll run NoiseChisel with -P to see the option values in a format we are already familiar with (to help in the comparison).

$ astnoisechisel -P
$ astfits xdf-f160w_detected.fits -h0

The first group of FITS header keywords are standard keywords (containing the SIMPLE and BITPIX keywords the first empty line). They are required by the FITS standard and must be present in any FITS extension. The second group contain the input file and all the options with their values in that run of NoiseChisel. Finally, the last group contain the date and version information of Gnuastro and its dependencies. The “versions and date” group of keywords are present in all Gnuastro’s FITS extension outputs, for more see Output FITS files.

Note that if a keyword name is larger than 8 characters, it is preceded by a HIERARCH keyword and that all keyword names are in capital letters. Therefore, if you want to see only one keyword’s value by feeding the output to Grep, you should ask Grep to ignore case with its -i option (short name for --ignore-case). For example, below we’ll check the value to the --snminarea option, note how we don’t need Grep’s -i option when it is fed with astnoisechisel -P since it is already in small-caps there. The extra white spaces in the first command are only to help in readability, you can ignore them when typing.

$ astnoisechisel -P                   | grep    snminarea
$ astfits xdf-f160w_detected.fits -h0 | grep -i snminarea

Let’s continue with the extensions in NoiseChisel’s output that contain a dataset by visually inspecting them (here, we’ll use SAO DS9). Since the file contains multiple related extensions, the easiest way to view all of them in DS9 is to open the file as a “Multi-extension data cube” with the -mecube option as shown below33.

$ ds9 -mecube xdf-f160w_detected.fits -zscale -zoom to fit

A “cube” window opens along with DS9’s main window. The buttons and horizontal scroll bar in this small new window can be used to navigate between the extensions. In this mode, all DS9’s settings (for example zoom or color-bar) will be identical between the extensions. Try zooming into to one part and flipping through the extensions to see how the galaxies were detected along with the Sky and Sky standard deviation values for that region. Just have in mind that NoiseChisel’s job is only detection (separating signal from noise), We’ll do segmentation on this result later to find the individual galaxies/peaks over the detected pixels.

One good way to see if you have missed any signal (small galaxies, or the wings of brighter galaxies) is to mask all the detected pixels and inspect the noise pixels. For this, you can use Gnuastro’s Arithmetic program (in particular its where operator, see Arithmetic operators). With the command below, all detected pixels (in the DETECTIONS extension) will be set to NaN in the output (nc-masked.fits). To make the command easier to read/write, let’s just put the file name in a shell variable (img) first. A shell variable’s value can be retrieved by adding a $ before its name.

$ img=xdf-f160w_detected.fits
$ astarithmetic $img $img nan where -hINPUT-NO-SKY -hDETECTIONS      \

To invert the result (only keep the values of detected pixels), you can flip the detected pixel values (from 0 to 1 and vice-versa) by adding a not after the second $img:

$ astarithmetic $img $img not nan where -hINPUT-NO-SKY -hDETECTIONS  \

Looking again at the detected pixels, we see that there are thin connections between many of the smaller objects or extending from larger objects. This shows that we have dug in too deep, and that we are following correlated noise.

Correlated noise is created when we warp datasets from individual exposures (that are each slightly offset compared to each other) into the same pixel grid, then add them to form the final result. Because it mixes nearby pixel values, correlated noise is a form of convolution and it smooths the image. In terms of the number of exposures (and thus correlated noise), the XDF dataset is by no means an ordinary dataset. It is the result of warping and adding roughly 80 separate exposures which can create strong correlated noise/smoothing. In common surveys the number of exposures is usually 10 or less.

Let’s tweak NoiseChisel’s configuration a little to get a better result on this dataset. Don’t forget that “Good statistical analysis is not a purely routine matter, and generally calls for more than one pass through the computer” (Anscombe 1973, see Science and its tools). A good scientist must have a good understanding of her tools to make a meaningful analysis. So don’t hesitate in playing with the default configuration and reviewing the manual when you have a new dataset in front of you. Robust data analysis is an art, therefore a good scientist must first be a good artist.

NoiseChisel can produce “Check images” to help you visualize and inspect how each step is done. You can see all the check images it can produce with this command.

$ astnoisechisel --help | grep check

Let’s check the overall detection process to get a better feeling of what NoiseChisel is doing with the following command. To learn the details of NoiseChisel in more detail, please see Akhlaghi and Ichikawa [2015].

$ astnoisechisel flat-ir/xdf-f160w.fits --checkdetection

The check images/tables are also multi-extension FITS files. As you saw from the command above, when check datasets are requested, NoiseChisel won’t go to the end. It will abort as soon as all the extensions of the check image are ready. Try listing the extensions of the output with astfits and then opening them with ds9 as we done above. In order to understand the parameters and their biases (especially as you are starting to use Gnuastro, or running it a new dataset), it is strongly encouraged to play with the different parameters and use the respective check images to see which step is affected by your changes and how, for example see Detecting large extended targets.

The OPENED_AND_LABELED extension shows the initial detection step of NoiseChisel. We see these thin connections between smaller points are already present here (a relatively early stage in the processing). Such connections at the lowest surface brightness limits usually occur when the dataset is too smoothed. Because of correlated noise, the dataset is already artificially smoothed, therefore further smoothing it with the default kernel may be the problem. Therefore, one solution is to use a sharper kernel (NoiseChisel’s first step in its processing).

By default NoiseChisel uses a Gaussian with full-width-half-maximum (FWHM) of 2 pixels. We can use Gnuastro’s MakeProfiles to build a kernel with FWHM of 1.5 pixel (truncated at 5 times the FWHM, like the default) using the following command. MakeProfiles is a powerful tool to build any number of mock profiles on one image or independently, to learn more of its features and capabilities, see MakeProfiles.

$ astmkprof --kernel=gaussian,1.5,5 --oversample=1

Please open the output kernel.fits and have a look (it is very small and sharp). We can now tell NoiseChisel to use this instead of the default kernel with the following command (we’ll keep checking the detection steps)

$ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits  \

Looking at the OPENED_AND_LABELED extension, we see that the thin connections between smaller peaks has now significantly decreased. Going two extensions/steps ahead (in the first HOLES-FILLED), you can see that during the process of finding false pseudo-detections, too many holes have been filled: see how the many of the brighter galaxies are connected?

Try looking two extensions ahead (in the first PSEUDOS-FOR-SN), you can see that there aren’t too many pseudo-detections because of all those extended filled holes. If you look closely, you can see the number of pseudo-detections in the result NoiseChisel prints (around 4000). This is another side-effect of correlated noise. To address it, we should slightly increase the pseudo-detection threshold (--dthresh, run with -P to see its default value):

$ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits  \
                 --dthresh=0.2 --checkdetection

Before visually inspecting the check image, you can see the effect of this change in NoiseChisel’s command-line output: notice how the number of pseudos has increased to roughly 5500. Open the check image now and have a look, you can see how the pseudo-detections are distributed much more evenly in the image. The signal-to-noise ratio of pseudo-detections define NoiseChisel’s reference for removing false detections, so they are very important to get right. Let’s have a look at their signal-to-noise distribution with --checksn.

$ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits  \
                 --dthresh=0.2 --checkdetection --checksn

The output xdf-f160w_detsn.fits file contains two extensions for the pseudo-detections over the undetected (sky) regions and those over detections. The first column is the pseudo-detection label which you can see in the respective34 PSEUDOS-FOR-SN extension of xdf-f160w_detcheck.fits. You can see the table columns with the first command below and get a feeling for its distribution with the second command. We’ll discuss the two Table and Statistics programs later.

$ asttable xdf-f160w_detsn.fits
$ aststatistics xdf-f160w_detsn.fits -c2

The correlated noise is again visible in this pseudo-detection signal-to-noise distribution: it is highly skewed. A small change in the quantile will translate into a big change in the S/N value. For example see the difference between the three 0.99, 0.95 and 0.90 quantiles with this command:

$ aststatistics xdf-f160w_detsn.fits -c2                        \
                --quantile=0.99 --quantile=0.95 --quantile=0.90

If you run NoiseChisel with -P, you’ll see the default signal-to-noise quantile --snquant is 0.99. In effect with this option you specify the purity level you want (contamination by false detections). With the aststatistics command above, you see that a small number of extra false detections (impurity) in the final result causes a big change in completeness (you can detect more lower signal-to-noise true detections). So let’s loosen-up our desired purity level and then mask the detected pixels like before to see if we have missed anything.

$ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits  \
                 --dthresh=0.2 --snquant=0.95
$ img=xdf-f160w_detected.fits
$ astarithmetic $img $img nan where -h1 -h2 --output=mask-det.fits

Overall it seems good, but if you play a little with the color-bar and look closer in the noise, you’ll see a few very sharp, but faint, objects that have not been detected. This only happens for under-sampled datasets like HST (where the pixel size is larger than the point spread function FWHM). So this won’t happen on ground-based images. Because of this, sharp and faint objects will be very small and eroded too easily during NoiseChisel’s erosion step.

To address this problem of sharp objects, we can use NoiseChisel’s --noerodequant option. All pixels above this quantile will not be eroded, thus allowing us to preserve faint and sharp objects. Check its default value, then run NoiseChisel like below and make the mask again. You will see many of those sharp objects are now detected.

$ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits     \
                 --noerodequant=0.95 --dthresh=0.2 --snquant=0.95

This seems to be fine and we can continue with our analysis. Before finally running NoiseChisel, let’s just see how you can have all the raw outputs of NoiseChisel (Detection map and Sky and Sky Standard deviation) in a highly compressed format for archivability. For example the Sky-subtracted input is a redundant dataset: you can always generate it by subtracting the Sky from the input image. With the commands below you can turn the default NoiseChisel output that is larger than 100 megabytes in this case into about 200 kilobytes by removing all the redundant information in it, then compressing it:

$ astnoisechisel flat-ir/xdf-f160w.fits --oneelempertile --rawoutput
$ gzip --best xdf-f160w_detected.fits

You can open xdf-f160w_detected.fits.gz directly in SAO DS9 or feed it to any of Gnuastro’s programs without having to uncompress it. Higher-level programs that take NoiseChisel’s output as input can also deal with this compressed image where the Sky and its Standard deviation are one pixel-per-tile.

To avoid having to write these options on every call to NoiseChisel, we’ll just make a configuration file in a visible config directory. Then we’ll define the hidden .gnuastro directory (that all Gnuastro’s programs will look into for configuration files) as a symbolic link to the config directory. Finally, we’ll write the finalized values of the options into NoiseChisel’s standard configuration file within that directory. We’ll also put the kernel in a separate directory to keep the top directory clean of any files we later need.

$ mkdir kernel config
$ ln -s config/ .gnuastro
$ mv kernel.fits det-kernel.fits
$ echo "kernel kernel/det-kernel.fits" > config/astnoisechisel.conf
$ echo "noerodequant 0.95"            >> config/astnoisechisel.conf
$ echo "dthresh      0.2"             >> config/astnoisechisel.conf
$ echo "snquant      0.95"            >> config/astnoisechisel.conf

We are now ready to finally run NoiseChisel on the two filters and keep the output in a dedicated directory (nc).

$ rm *.fits
$ mkdir nc
$ astnoisechisel flat-ir/xdf-f160w.fits --output=nc/xdf-f160w.fits
$ astnoisechisel flat-ir/xdf-f105w.fits --output=nc/xdf-f105w.fits

Before continuing with the higher-level processing of this dataset, let’s pause to use NoiseChisel’s multi-extension output as a demonstration for working with FITS extensions using Gnuastro’s Fits program (see Fits.

Let’s say you need to copy a HDU/extension (image or table) from one FITS file to another. After the command below, objects.fits file will contain only one extension: a copy of NoiseChisel’s binary detection map. There are similar options to conveniently cut (--cut, copy, then remove from the input) or delete (--remove) HDUs from a FITS file also.

$ astfits nc/xdf-f160w.fits --copy=DETECTIONS -odetections.fits

NoiseChisel puts some general information on its outputs in the FITS header of the respective extension. To see the full list of keywords in an extension, you can again use the Fits program like above. But instead of HDU manipulation options, give it the HDU you are interested in with -h. You can also give the HDU number (as listed in the output above), for example -h2 instead of -hDETECTIONS.

$ astfits nc/xdf-f160w.fits -hDETECTIONS

The DETSN keyword in NoiseChisel’s DETECTIONS extension contains the true pseudo-detection signal-to-noise ratio that was found by NoiseChisel on the dataset. It is not easy to find it in the middle of all the other keywords printed by the command above (especially in files that have many more keywords). To fix the problem, you can pipe the output of the command above into grep (a program for matching lines which is available on almost all Unix-like operating systems).

$ astfits nc/xdf-f160w.fits -hDETECTIONS | grep DETSN

If you just want the value of the keyword and not the full FITS keyword line, you can use AWK. In the example below, AWK will print the third word (separated by white space characters) in any line that has a first column value of DETSN.

$ astfits nc/xdf-f160w.fits -h2 | awk '$1=="DETSN" {print $3}'

The main output of NoiseChisel is the binary detection map (DETECTIONS extension), which only has two values of 1 or 0. This is useful when studying the noise, but hardly of any use when you actually want to study the targets/galaxies in the image, especially in such a deep field where the detection map of almost everything is connected. To find the galaxies over the detections, we’ll use Gnuastro’s Segment program:

$ rm *.fits
$ mkdir seg
$ astsegment nc/xdf-f160w.fits -oseg/xdf-f160w.fits

Segment’s operation is very much like NoiseChisel (in fact, prior to version 0.6, it was part of NoiseChisel), for example the output is a multi-extension FITS file, it has check images and uses the undetected regions as a reference. Please have a look at Segment’s multi-extension output with ds9 to get a good feeling of what it has done. Like NoiseChisel, the first extension is the input. The CLUMPS extension shows the true “clumps” with values that are \(\ge1\), and the diffuse regions labeled as \(-1\). In the OBJECTS extension, we see that the large detections of NoiseChisel (that may have contained many galaxies) are now broken up into separate labels. see Segment for more.

Having localized the regions of interest in the dataset, we are ready to do measurements on them with MakeCatalog. Besides the IDs, we want to measure (in this order) the Right Ascension (with --ra), Declination (--dec), magnitude (--magnitude), and signal-to-noise ratio (--sn) of the objects and clumps. The following command will make these measurements on Segment’s F160W output:

$ mkdir cat
$ astmkcatalog seg/xdf-f160w.fits --ids --ra --dec --magnitude --sn \
               --zeropoint=25.94                                    \
               --clumpscat --output=cat/xdf-f160w.fits

From the printed statements on the command-line, you see that MakeCatalog read all the extensions in Segment’s output for the various measurements it needed.

The output of the MakeCatalog command above is a FITS table. The two clump and object catalogs are available in the two extensions of the single FITS file35. Let’s inspect the separate extensions with the Fits program like before (as shown below). Later, we’ll inspect the table in each extension with Gnuastro’s Table program (see Table). Note that we could have used -hOBJECTS and -hCLUMPS instead of -h1 and -h2 respectively.

$ astfits  cat/xdf-f160w.fits              # Extension information
$ asttable cat/xdf-f160w.fits -h1 --info   # Objects catalog info.
$ asttable cat/xdf-f160w.fits -h1          # Objects catalog columns.
$ asttable cat/xdf-f160w.fits -h2 -i       # Clumps catalog info.
$ asttable cat/xdf-f160w.fits -h2          # Clumps catalog columns.

As you see above, when given a specific table (file name and extension), Table will print the full contents of all the columns. To see basic information about each column (for example name, units and comments), simply append a --info (or -i).

To print the contents of special column(s), just specify the column number(s) (counting from 1) or the column name(s) (if they have one). For example, if you just want the magnitude and signal-to-noise ratio of the clumps (in -h2), you can get it with any of the following commands

$ asttable cat/xdf-f160w.fits -h2 -c5,6
$ asttable cat/xdf-f160w.fits -h2 -c5,SN
$ asttable cat/xdf-f160w.fits -h2 -c5         -c6
$ asttable cat/xdf-f160w.fits -h2 -cMAGNITUDE -cSN

In the example above, the clumps catalog has two ID columns (one for the over-all clump ID and one for the ID of the clump in its host object), while the objects catalog only has one ID column. Therefore, the location of the magnitude column differs between the object and clumps catalog. So if you want to specify the columns by number, you will need to change the numbers when viewing the clump and objects catalogs. This is a useful advantage of having/using column names36.

$ asttable catalog/xdf-f160w.fits -h1 -c4 -c5
$ asttable catalog/xdf-f160w.fits -h2 -c5 -c6

Finally, the comments in MakeCatalog’s output (COMMENT keywords in the FITS headers, or lines starting with # in plain text) contain some important information about the input dataset that can be useful (for example pixel area or per-pixel surface brightness limit). For example have a look at the output of this command:

$ astfits cat/xdf-f160w.fits -h1 | grep COMMENT

To calculate colors, we also need magnitude measurements on the F105W filter. However, the galaxy properties might differ between the filters (which is the whole purpose behind measuring colors). Also, the noise properties and depth of the datasets differ. Therefore, if we simply follow the same Segment and MakeCatalog calls above for the F105W filter, we are going to get a different number of objects and clumps. Matching the two catalogs is possible (for example with Match), but the fact that the measurements will be done on different pixels, can bias the result. Since the Point spread function (PSF) of both images is very similar, an accurate color calculation can only be done when magnitudes are measured from the same pixels on both images.

The F160W image is deeper, thus providing better detection/segmentation, and redder, thus observing smaller/older stars and representing more of the mass in the galaxies. We will thus use the pixel labels generated on the F160W filter, but do the measurements on the F105W filter (using the --valuesfile option) in the command below. Notice how the only difference between this call to MakeCatalog and the previous one is --valuesfile, the value given to --zeropoint and the output name.

$ astmkcatalog seg/xdf-f160w.fits --ids --ra --dec --magnitude --sn \
               --valuesfile=nc/xdf-f105w.fits --zeropoint=26.27     \
               --clumpscat --output=cat/xdf-f105w.fits

Look into what MakeCatalog printed on the command-line. You can see that (as requested) the object and clump labels were taken from the respective extensions in seg/xdf-f160w.fits, while the values and Sky standard deviation were done on nc/xdf-f105w.fits.

Since we used the same labeled image on both filters, the number of rows in both catalogs are the same. The clumps are not affected by the hard-to-deblend and low signal-to-noise diffuse regions, they are more robust for calculating the colors (compared to objects). Therefore from this step onward, we’ll continue with clumps.

We can finally calculate the colors of the objects from these two datasets. If you inspect the contents of the two catalogs, you’ll notice that because they were both derived from the same segmentation maps, the rows are ordered identically (they correspond to the same object/clump in both filters). But to be generic (usable even when the rows aren’t ordered similarly) and display another useful program in Gnuastro, we’ll use Match.

As the name suggests, Gnuastro’s Match program will match rows based on distance (or aperture in 2D) in one (or two) columns. In the command below, the options relating to each catalog are placed under it for easy understanding. You give Match two catalogs (from the two different filters we derived above) as argument, and the HDUs containing them (if they are FITS files) with the --hdu and --hdu2 options. The --ccol1 and --ccol2 options specify which columns should be matched with which in the two catalogs. With --aperture you specify the acceptable error (radius in 2D), in the same units as the columns (see below for why we have requested an aperture of 0.35 arcseconds, or less than 6 HST pixels).

The --outcols is a very convenient feature in Match: you can use it to specify which columns from the two catalogs you want in the output (merge two input catalogs into one). If the first character is an ‘a’, the respective matched column (number or name, similar to Table above) in the first catalog will be written in the output table. When the first character is a ‘b’, the respective column from the second catalog will be written in the output. Also, if the first character is followed by _all, then all the columns from the respective catalog will be put in the output.

$ astmatch cat/xdf-f160w.fits           cat/xdf-f105w.fits         \
           --hdu=CLUMPS                 --hdu2=CLUMPS              \
           --ccol1=RA,DEC               --ccol2=RA,DEC             \
           --aperture=0.35/3600 --log                              \
           --outcols=a_all,bMAGNITUDE,bSN                          \

By default (when --quiet isn’t called), the Match program will just print the number of matched rows in the standard output. If you have a look at your input catalogs, this should be the same as the number of rows in them. Let’s have a look at the columns in the matched catalog:

$ asttable cat/xdf-f160w-f105w.fits -i

Indeed, its exactly the columns we wanted. There is just one confusion however: there are two MAGNITUDE and SN columns. Right now, you know that the first one was from the F160W filter, and the second was for F105W. But in one hour, you’ll start doubting your self: going through your command history, trying to answer this question: “which magnitude corresponds to which filter?”. You should never torture your future-self (or colleagues) like this! So, let’s rename these confusing columns in the matched catalog. The FITS standard for tables stores the column names in the TTYPE header keywords, so let’s have a look:

$ astfits cat/xdf-f160w-f105w.fits -h1 | grep TTYPE

Changing/updating the column names is as easy as updating the values to these options with the first command below, and with the second, confirm this change:

$ astfits cat/xdf-f160w-f105w.fits -h1                          \
          --update=TTYPE5,MAG_F160W   --update=TTYPE6,SN_F160W  \
          --update=TTYPE7,MAG_F105W   --update=TTYPE8,SN_F105W
$ asttable cat/xdf-f160w-f105w.fits -i

If you noticed, when running Match, the previous command, we also asked for --log. Many Gnuastro programs have this option to provide some detailed information on their operation in case you are curious. Here, we are using it to justify the value we gave to --aperture. Even though you asked for the output to be written in the cat directory, a listing of the contents of your current directory will show you an extra astmatch.fits file. Let’s have a look at what columns it contains.

$ ls
$ asttable astmatch.log -i

The MATCH_DIST column contains the distance of the matched rows, let’s have a look at the distribution of values in this column. You might be asking yourself “why should the positions of the two filters differ when I gave MakeCatalog the same segmentation map?” The reason is that the central positions are flux-weighted. Therefore the --valuesfile dataset you give to MakeCatalog will also affect the center measurements37. Recall that the Spectral Energy Distribution (SED) of galaxies is not flat and they have substructure, therefore, they can have different shapes/morphologies in different filters.

Gnuastro has a simple program for basic statistical analysis. The command below will print some basic information about the distribution (minimum, maximum, median and etc), along with a cute little ASCII histogram to visually help you understand the distribution on the command-line without the need for a graphic user interface (see Invoking Statistics). This ASCII histogram can be useful when you just want some coarse and general information on the input dataset. It is also useful when working on a server (where you may not have graphic user interface), and finally, its fast.

$ aststatistics astmatch.fits -cMATCH_DIST

The units of this column are the same as the columns you gave to Match: in degrees. You see that while almost all the objects matched very nicely, the maximum distance is roughly 0.31 arcseconds. This is why we asked for an aperture of 0.35 arcseconds when doing the match.

We can now use AWK to find the colors. We’ll ask AWK to only use rows that don’t have a NaN magnitude in either filter38. We will also ignore columns which don’t have reliable F105W measurement (with a S/N less than 739).

$ asttable cat/xdf-f160w-f105w.fits -cMAG_F160W,MAG_F105W,SN_F105W  \
           | awk '$1!="nan" && $2!="nan" && $3>7 {print $2-$1}'     \
           > f105w-f160w.txt

You can inspect the distribution of colors with the Statistics program again:

$ aststatistics f105w-f160w.txt -c1

You can later use Gnuastro’s Statistics program with the --histogram option to build a much more fine-grained histogram as a table to feed into your favorite plotting program for a much more accurate/appealing plot (for example with PGFPlots in LaTeX). If you just want a specific measure, for example the mean, median and standard deviation, you can ask for them specifically with this command:

$ aststatistics f105w-f160w.txt -c1 --mean --median --std

Some researchers prefer to have colors in a fixed aperture for all the objects. The colors we calculated above used a different segmentation map for each object. This might not satisfy some science cases. So, let’s make a fixed aperture catalog. To make an catalog from fixed apertures, we should make a labeled image which has a fixed label for each aperture. That labeled image can be given to MakeCatalog instead of Segment’s labeled detection image.

To generate the apertures catalog, we’ll first read the positions from F160W catalog and set the other parameters of each profile to be a fixed circle of radius 5 pixels (we want all apertures to be identical in this scenario).

$ rm *.fits *.txt
$ asttable cat/xdf-f160w.fits -hCLUMPS -cRA,DEC                    \
           | awk '!/^#/{print NR, $1, $2, 5, 5, 0, 0, 1, NR, 1}' \
           > apertures.txt

We can now feed this catalog into MakeProfiles to build the apertures for us. See Invoking MakeProfiles for a description of the options. The most important for this particular job is --mforflatpix, it tells MakeProfiles that the values in the magnitude column should be used for each pixel of a flat profile. Without it, MakeProfiles would build the profiles such that the sum of the pixels of each profile would have a magnitude (in log-scale) of the value given in that column (what you would expect when simulating a galaxy for example).

$ astmkprof apertures.txt --background=flat-ir/xdf-f160w.fits     \
            --clearcanvas --replace --type=int16 --mforflatpix    \

The first thing you might notice in the printed information is that the profiles are not built in order. This is because MakeProfiles works in parallel, and parallel CPU operations are asynchronous. You can try running MakeProfiles with one thread (using --numthreads=1) to see how order is respected in that case.

Open the output apertures.fits file and see the result. Where the apertures overlap, you will notice that one label has replaced the other (because of the --replace option). In the future, MakeCatalog will be able to work with overlapping labels, but currently it doesn’t. If you are interested, please join us in completing Gnuastro with added improvements like this (see task 14750 40).

We can now feed the apertures.fits labeled image into MakeCatalog instead of Segment’s output as shown below. In comparison with the previous MakeCatalog call, you will notice that there is no more --clumpscat option, since each aperture is treated as a separate “object” here.

$ astmkcatalog apertures.fits -h1 --zeropoint=26.27        \
               --valuesfile=nc/xdf-f105w.fits              \
               --ids --ra --dec --magnitude --sn           \

This catalog has the same number of rows as the catalog produced from clumps, therefore similar to how we found colors, you can compare the aperture and clump magnitudes for example. You can also change the filter name and zeropoint magnitudes and run this command again to have the fixed aperture magnitude in the F160W filter and measure colors on apertures.

As a final step, let’s go back to the original clumps-based catalogs we generated before. We’ll find the objects with the strongest color and make a cutout to inspect them visually and finally, we’ll see how they are located on the image.

First, let’s see what the objects with a color more than two magnitudes look like. As you see, this is very much like the command above for selecting the colors, only instead of printing the color, we’ll print the RA and Dec. With the command below, the positions of all lines with a color more than 1.5 will be put in reddest.txt

$ asttable cat/xdf-f160w-f105w.fits                                \
           -cMAG_F160W,MAG_F105W,SN_F105W,RA,DEC                   \
           | awk '$1!="nan" && $2!="nan" && $2-$1>1.5 && $3>7      \
                  {print $4,$5}' > reddest.txt

We can now feed reddest.txt into Gnuastro’s crop to see what these objects look like. To keep things clean, we’ll make a directory called crop-red and ask Crop to save the crops in this directory. We’ll also add a -f160w.fits suffix to the crops (to remind us which image they came from). The width of the crops will be 15 arcseconds.

$ mkdir crop-red
$ astcrop --mode=wcs --coordcol=3 --coordcol=4 flat-ir/xdf-f160w.fits \
          --catalog=reddest.txt --width=15/3600,15/3600               \
          --suffix=-f160w.fits --output=crop-red

Like the MakeProfiles command above, you might notice that the crops aren’t made in order. This is because each crop is independent of the rest, therefore crops are done in parallel, and parallel operations are asynchronous. In the command above, you can change f160w to f105w to make the crops in both filters.

To view the crops more easily (not having to open ds9 for each image), you can convert the FITS crops into the JPEG format with a shell loop like below.

$ cd crop-red
$ for f in *.fits; do                                                  \
    astconvertt $f --fluxlow=-0.001 --fluxhigh=0.005 --invert -ojpg;   \
$ cd ..

You can now use your general graphic user interface image viewer to flip through the images more easily. On GNOME, you can use the “Eye of GNOME” image viewer (with executable name of eog). Run the command below to open the first one (if you aren’t using GNOME, use the command of your image viewer instead of eog):

$ eog 1-f160w.jpg

In Eye of GNOME, you can flip through the images and compare them visually more easily by pressing the <SPACE> key. Of course, the flux ranges have been chosen generically here for seeing the fainter parts. Therefore, brighter objects will be fully black.

The for loop above to convert the images will do the job in series: each file is converted only after the previous one is complete. If you have GNU Parallel, you can greatly speed up this conversion. GNU Parallel will run the separate commands simultaneously on different CPU threads in parallel. For more information on efficiently using your threads, see Multi-threaded operations. Here is a replacement for the shell for loop above using GNU Parallel.

$ cd crop-red
$ parallel astconvertt --fluxlow=-0.001 --fluxhigh=0.005 --invert   \
           -ojpg ::: *.fits
$ cd ..

As the final action, let’s see how these objects are positioned over the dataset. DS9 has the “Region”s concept for this purpose. You just have to convert your catalog into a “region file” to feed into DS9. To do that, you can use AWK again as shown below.

$ awk 'BEGIN{print "# Region file format: DS9 version 4.1";     \
             print "global color=green width=2";                \
             print "fk5";}                                      \
       {printf "circle(%s,%s,1\")\n", $1, $2;}' reddest.txt     \
       > reddest.reg

This region file can be loaded into DS9 with its -regions option to display over any image (that has world coordinate system). In the example below, we’ll open Segment’s output and load the regions over all the extensions (to see the image and the respective clump):

$ ds9 -mecube seg/xdf-f160w.fits -zscale -zoom to fit    \
      -regions load all reddest.reg

In conclusion, we hope this extended tutorial has been a good starting point to help in your exciting research. If this book or any of the programs in Gnuastro have been useful for your research, please cite the respective papers and share your thoughts and suggestions with us (it can be very encouraging). All Gnuastro programs have a --cite option to help you cite the authors’ work more easily. Just note that it may be necessary to cite additional papers for different programs, so please try it out for any program you used.

$ astmkcatalog --cite
$ astnoisechisel --cite



See SAO ds9, available at




GNU Info is already available on almost all Unix-like operating systems.


Note that you only have one port to the internet, so downloading in parallel will actually be slower than downloading in series.


To learn more about the crop program see Crop.


As you will see below, unlike most other detection algorithms, NoiseChisel detects the objects from their faintest parts, it doesn’t start with their high signal-to-noise ratio peaks. Since the Sky is already subtracted in many images and noise fluctuates around zero, zero is commonly higher than the initial threshold applied. Therefore not ignoring zero-valued pixels in this image, will cause them to part of the detections!


In the FITS standard, the CDELT keywords (CDELT1 and CDELT2 in a 2D image) specify the scales of each coordinate. In the case of this image it is in units of degrees-per-pixel. See Section 8 of the FITS standard for more. In short, with the CDELT convention, rotation (PC or CD keywords) and scales (CDELT) are separated. In the FITS standard the CDELT keywords are optional. When CDELT keywords aren’t present, the PC matrix is assumed to contain both the coordinate rotation and scales. Note that not all FITS writers use the CDELT convention. So you might not find the CDELT keywords in the WCS meta data of some FITS files. However, all Gnuastro programs (which use the default FITS keyword writing format of WCSLIB), the CDELT convention is used, even if the input doesn’t have it. So when rotation and scaling are combined and finding the pixel scale isn’t trivial from the raw keyword values, you can feed the dataset to any (simple) Gnuastro program (for example Arithmetic). The output will have the CDELT keyword.


To learn more about numeric data types see Numeric data types.


You can configure your graphic user interface to open DS9 in multi-extension cube mode by default when using the GUI (double clicking on the file). If your graphic user interface is GNOME (another GNU software, it is most common in GNU/Linux operating systems), a full description is given in Viewing multiextension FITS images


The first PSEUDOS-FOR-SN in xdf-f160w_detsn.fits is for the pseudo-detections over the undetected regions and the second is for those over detected regions.


MakeCatalog can also output plain text tables. However, in the plain text format you can only have one table per file. Therefore, if you also request measurements on clumps, two plain text tables will be created (suffixed with _o.txt and _c.txt).


Column meta-data (including a name) can also be specified in plain text tables, see Gnuastro text table format.


To only measure the center based on the labeled pixels (and ignore the pixel values), you can ask for the columns that contain geo (for geometric) in them. For example --geow1 or --geow2 for the RA and Declination (first and second world-coordinates).


This can happen even on the reference image. It is because of the current way clumps are defined in Segment when they are placed on strong gradients. It is because of high “river” values on such gradients. See Segment changes after publication. To avoid this problem, you can currently ask for the --brighntessnoriver output column.


The value of 7 is taken from the clump S/N threshold in F160W (where the clumps were defined).


Next: , Previous: , Up: Tutorials   [Contents][Index]