GNU Astronomy Utilities

Next: , Previous: , Up: Tutorials   [Contents][Index]

2.2 General program usage tutorial

Measuring colors of astronomical objects in broad-band or narrow-band images is one of the most basic and common steps in astronomical analysis. Here, we will use Gnuastro’s programs to get a physical scale (area at certain redshifts) of the field we are studying, detect objects in a Hubble Space Telescope (HST) image, measure their colors and identify the ones with the largest colors to visual inspection and their spatial position in the image. After this tutorial, you can also try the Detecting large extended targets tutorial which goes into a little more detail on optimally configuring NoiseChisel (Gnuastro’s detection tool) in special situations.

During the tutorial, we will take many detours to explain, and practically demonstrate, the many capabilities of Gnuastro’s programs. In the end you will see that the things you learned during this tutorial are much more generic than this particular problem and can be used in solving a wide variety of problems involving the analysis of data (images or tables). So please don’t rush, and go through the steps patiently to optimally master Gnuastro.

In this tutorial, we’ll use the HST eXtreme Deep Field dataset. Like almost all astronomical surveys, this dataset is free for download and usable by the public. You will need the following tools in this tutorial: Gnuastro, SAO DS9 25, GNU Wget26, and AWK (most common implementation is GNU AWK27).

This tutorial was first prepared for the “Exploring the Ultra-Low Surface Brightness Universe” workshop (November 2017) at the ISSI in Bern, Switzerland. It was further extended in the “4th Indo-French Astronomy School” (July 2018) organized by LIO, CRAL CNRS UMR5574, UCBL, and IUCAA in Lyon, France. We are very grateful to the organizers of these workshops and the attendees for the very fruitful discussions and suggestions that made this tutorial possible.

Write the example commands manually: Try to type the example commands on your terminal manually and use the history feature of your command-line (by pressing the “up” button to retrieve previous commands). Don’t simply copy and paste the commands shown here. This will help simulate future situations when you are processing your own datasets.

A handy feature of Gnuastro is that all program names start with ast. This will allow your command-line processor to easily list and auto-complete Gnuastro’s programs for you. Try typing the following command (press TAB key when you see <TAB>) to see the list:

$ ast<TAB><TAB>

Any program that starts with ast (including all Gnuastro programs) will be shown. By choosing the subsequent characters of your desired program and pressing <TAB><TAB> again, the list will narrow down and the program name will auto-complete once your input characters are unambiguous. In short, you often don’t need to type the full name of the program you want to run.

Gnuastro contains a large number of programs and it is natural to forget the details of each program’s options or inputs and outputs. Therefore, before starting the analysis, let’s review how you can access this book to refresh your memory any time you want. For example when working on the command-line, without having to take your hands off the keyboard. When you install Gnuastro, this book is also installed on your system along with all the programs and libraries, so you don’t need an internet connection to to access/read it. Also, by accessing this book as described below, you can be sure that it corresponds to your installed version of Gnuastro.

GNU Info28 is the program in charge of displaying the manual on the command-line (for more, see Info). To see this whole book on your command-line, please run the following command and press subsequent keys. Info has its own mini-environment, therefore we’ll show the keys that must be pressed in the mini-environment after a -> sign. You can also ignore anything after the # sign in the middle of the line, they are only for your information.

$ info gnuastro                # Open the top of the manual.
-> <SPACE>                     # All the book chapters.
-> <SPACE>                     # Continue down: show sections.
-> <SPACE> ...                 # Keep pressing space to go down.
-> q                           # Quit Info, return to the command-line.

The thing that greatly simplifies navigation in Info is the links (regions with an underline). You can immediately go to the next link in the page with the <TAB> key and press <ENTER> on it to go into that part of the manual. Try the commands above again, but this time also use <TAB> to go to the links and press <ENTER> on them to go to the respective section of the book. Then follow a few more links and go deeper into the book. To return to the previous page, press l (small L). If you are searching for a specific phrase in the whole book (for example an option name), press s and type your search phrase and end it with an <ENTER>.

You don’t need to start from the top of the manual every time. For example, to get to Invoking NoiseChisel, run the following command. In general, all programs have such an “Invoking ProgramName” section in this book. These sections are specifically for the description of inputs, outputs and configuration options of each program. You can access them directly for each program by giving its executable name to Info.

$ info astnoisechisel

The other sections don’t have such shortcuts. To directly access them from the command-line, you need to tell Info to look into Gnuastro’s manual, then look for the specific section (an unambiguous title is necessary). For example, if you only want to review/remember NoiseChisel’s Detection options), just run the following command. Note how case is irrelevant for Info when calling a title in this manner.

$ info gnuastro "Detection options"

In general, Info is a powerful and convenient way to access this whole book with detailed information about the programs you are running. If you are not already familiar with it, please run the following command and just read along and do what it says to learn it. Don’t stop until you feel sufficiently fluent in it. Please invest the half an hour’s time necessary to start using Info comfortably. It will greatly improve your productivity and you will start reaping the rewards of this investment very soon.

$ info info

As a good scientist you need to feel comfortable to play with the features/options and avoid (be critical to) using default values as much as possible. On the other hand, our human memory is limited, so it is important to be able to easily access any part of this book fast and remember the option names, what they do and their acceptable values.

If you just want the option names and a short description, calling the program with the --help option might also be a good solution like the first example below. If you know a few characters of the option name, you can feed the output to grep like the second or third example commands.

$ astnoisechisel --help
$ astnoisechisel --help | grep quant
$ astnoisechisel --help | grep check

Let’s start the processing. First, to keep things clean, let’s create a gnuastro-tutorial directory and continue all future steps in it:

$ mkdir gnuastro-tutorial
$ cd gnuastro-tutorial

We will be using the near infra-red Wide Field Camera dataset. If you already have them in another directory (for example XDFDIR), you can set the download directory to be a symbolic link to XDFDIR with a command like this:

$ ln -s XDFDIR download

If the following images aren’t already present on your system, you can make a download directory and download them there.

$ mkdir download
$ cd download
$ xdfurl=
$ wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_f105w_v1_sci.fits
$ wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_f160w_v1_sci.fits
$ cd ..

In this tutorial, we’ll just use these two filters. Later, you will probably need to download more filters, you can use the shell’s for loop to download them all in series (one after the other29) with one command like the one below for the WFC3 filters. Put this command instead of the two wget commands above. Recall that all the extra spaces, back-slashes (\), and new lines can be ignored if you are typing on the lines on the terminal.

$ for f in f105w f125w f140w f160w; do                              \
    wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_"$f"_v1_sci.fits;   \

First, let’s visually inspect the dataset. Let’s take F160W image as an example. Do the steps below with the other image(s) too (and later with any dataset that you want to work on). It is very important to understand your dataset visually. Note how ds9 doesn’t follow the GNU style of options where “long” and “short” options are preceded by -- and - respectively (for example --width and -w, see Options).

Ds9’s -zscale option is a good scaling to highlight the low surface brightness regions, and as the name suggests, -zoom to fit will fit the whole dataset in the window. If the window is too small, expand it with your mouse, then press the “zoom” button on the top row of buttons above the image, then in the row below it, press “zoom fit”. You can also zoom in and out by scrolling your mouse or the respective operation on your touch-pad when your cursor/pointer is over the image.

$ ds9 download/hlsp_xdf_hst_wfc3ir-60mas_hudf_f160w_v1_sci.fits     \
      -zscale -zoom to fit

The first thing you might notice is that the regions with no data have a value of zero in this image. The next thing might be that the dataset actually has two “depth”s (see Quantifying measurement limits). The exposure time of the inner region is more than 4 times of the outer parts. Fortunately the XDF survey webpage (above) contains the vertices of the deep flat WFC3-IR field. You can use those vertices in Crop to cutout this deep infra-red region from the larger image. We’ll make a directory called flat-ir and keep the flat infra-red regions in that directory (with a ‘xdf-’ suffix for a shorter and easier filename).

$ mkdir flat-ir
$ astcrop --mode=wcs -h0 --output=flat-ir/xdf-f105w.fits              \
          --polygon="53.187414,-27.779152 : 53.159507,-27.759633 :    \
                     53.134517,-27.787144 : 53.161906,-27.807208"     \
$ astcrop --mode=wcs -h0 --output=flat-ir/xdf-f160w.fits              \
          --polygon="53.187414,-27.779152 : 53.159507,-27.759633 :    \
                     53.134517,-27.787144 : 53.161906,-27.807208"     \

The only thing varying in the two calls to Gnuastro’s Crop program is the filter name. Therefore, to simplify the command, and later allow work on more filters, we can use the shell’s for loop. Notice how the two places where the filter names (f105w and f160w) are used above have been replaced with $f (the shell variable that for is in charge of setting) below. To generalize this for more filters later, you can simply add the other filter names in the first line before the semi-colon (;).

$ for f in f105w f160w; do                                            \
    astcrop --mode=wcs -h0 --output=flat-ir/xdf-$f.fits               \
            --polygon="53.187414,-27.779152 : 53.159507,-27.759633 :  \
                       53.134517,-27.787144 : 53.161906,-27.807208"   \
            download/hlsp_xdf_hst_wfc3ir-60mas_hudf_"$f"_v1_sci.fits; \

Please open these images and inspect them with the same ds9 commands you used above. You will see how it is completely flat now and doesn’t have varying depths. Another important result of this crop is that regions with no data now have a NaN (blank) value, not zero. Zero is a meaningful value and especially when using NoiseChisel, the input should have NaN values for pixels with no data, not zero.

This is the deepest image we currently have of the sky. The first thing that comes to mind may be this: “How large is this field?”. Let’s find the answer to this question with the commands below. The lines starting with ## are just comments for you to help in following the steps. Don’t type them on the terminal. The commands are intentionally repetitive in some places to better understand each step and also to demonstrate the beauty of command-line features like variables, pipes and loops. Later, if you would like to repeat this process on another dataset, you can just use commands 3, 7, and 9.

Use shell history: Don’t forget to make effective use of your shell’s history. This is especially convenient when you just want to make a small change to your previous command. Press the “up” key on your keyboard (possibly multiple times) to see your previous command(s).

## (1)  See the general statistics of non-blank pixel values.
$ aststatistics flat-ir/xdf-f160w.fits

## (2)  We only want the number of non-blank pixels.
$ aststatistics flat-ir/xdf-f160w.fits --number

## (3)  Keep the result of the command above in the shell variable `n'.
$ n=$(aststatistics flat-ir/xdf-f160w.fits --number)

## (4)  See what is stored the shell variable `n'.
$ echo $n

## (5)  Show all the FITS keywords of this image.
$ astfits flat-ir/xdf-f160w.fits -h1

## (6)  The resolution (in degrees/pixel) is in the `CDELT' keywords.
##      Only show lines that contain these characters, by feeding
##      the output of the previous command to the `grep' program.
$ astfits flat-ir/xdf-f160w.fits -h1 | grep CDELT

## (7)  Save the resolution (same in both dimensions) in the variable
##      `r'. The last part uses AWK to print the third `field' of its
##      input line. The first two fields were `CDELT1' and `='.
$ r=$(astfits flat-ir/xdf-f160w.fits -h1 | grep CDELT1   \
              | awk '{print $3}')

## (8)  Print the values of `n' and `r'.
$ echo $n $r

## (9)  Use the number of pixels (first number passed to AWK) and
##      length of each pixel's edge (second number passed to AWK)
##      to estimate the area of the field in arc-minutes squared.
$ area=$(echo $n $r | awk '{print $1 * ($2^2) * 3600}')

The area of this field is 4.03817 (or 4.04) arc-minutes squared. Just for comparison, this is roughly 175 times smaller than the average moon’s angular area (with a diameter of 30arc-minutes or half a degree).

AWK for table/value processing: AWK is a powerful and simple tool for text processing. Above (and further below) some simple examples are shown. GNU AWK (the most common implementation) comes with a free and wonderful book in the same format as this book which will allow you to master it nicely. Just like this manual, you can also access GNU AWK’s manual on the command-line whenever necessary without taking your hands off the keyboard.

This takes us to the second question that you have probably asked yourself when you saw the field for the first time: “How large is this area at different redshifts?”. To get a feeling of the tangential area that this field covers at redshift 2, you can use CosmicCalculator. In particular, you need the tangential distance covered by 1 arc-second as raw output. Combined with the field’s area, we can then calculate the tangential distance in Mega Parsecs squared (\(Mpc^2\)).

## Print general cosmological properties at redshift 2.
$ astcosmiccal -z2

## When given a "Specific calculation" option, CosmicCalculator
## will just print that particular calculation. See the options
## under this title in the output of `--help' for more.
$ astcosmiccal --help

## Only print the "Tangential dist. covered by 1arcsec at z (kpc)".
## in units of kpc/arc-seconds.
$ astcosmiccal -z2 --arcsectandist

## Convert this distance to kpc^2/arcmin^2 and save in `k'.
$ k=$(astcosmiccal -z2 --arcsectandist | awk '{print ($1*60)^2}')

## Multiply by the area of the field (in arcmin^2) and divide by
## 10^6 to return value in Mpc^2.
$ echo $k $area | awk '{print $1 * $2 / 1e6}'

At redshift 2, this field therefore covers 1.07145 \(Mpc^2\). If you would like to see how this tangential area changes with redshift, you can use a shell loop like below.

$ for z in 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0; do           \
    k=$(astcosmiccal -z$z --arcsectandist);                      \
    echo $z $k $area | awk '{print $1, ($2*60)^2 * $3 / 1e6}';   \

Fortunately, the shell has a useful tool/program to print a sequence of numbers that is nicely called seq. You can use it instead of typing all the different redshifts in this example. For example the loop below will print the same range of redshifts (between 0.5 and 5) but with increments of 0.1.

$ for z in $(seq 0.5 0.1 5); do                                  \
    k=$(astcosmiccal -z$z --arcsectandist);                      \
    echo $z $k $area | awk '{print $1, ($2*60)^2 * $3 / 1e6}';   \

This is a fast and simple way for this repeated calculation when it is only necessary once. However, if you commonly need this calculation and possibly for a larger number of redshifts, the command above can be slow. This is because the CosmicCalculator program has a lot of overhead. To be generic and easy to operate, it has to parse the command-line and all configuration files (see below) which contain human-readable characters and need a lot of processing to be ready for processing by the computer. Afterwards, CosmicCalculator has to check the sanity of its inputs and check which of its many options you have asked for. It has to do all of these for every redshift in the loop above.

To greatly speed up the processing, you can directly access the root work-horse of CosmicCalculator without all that overhead. Using Gnuastro’s library, you can write your own tiny program particularly designed for this exact calculation (and nothing else!). To do that, copy and paste the following C program in a file called myprogram.c.

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <gnuastro/cosmology.h>

  double area=4.03817;          /* Area of field (arcmin^2). */
  double z, adist, tandist;     /* Temporary variables.      */

  /* Constants from Plank 2018 (arXiv:1807.06209, Table 2) */
  double H0=67.66, olambda=0.6889, omatter=0.3111, oradiation=0;

  /* Do the same thing for all redshifts (z) between 0.1 and 5. */
  for(z=0.1; z<5; z+=0.1)
      /* Calculate the angular diameter distance. */
      adist=gal_cosmology_angular_distance(z, H0, olambda,
                                           omatter, oradiation);

      /* Calculate the tangential distance of one arcsecond. */
      tandist = adist * 1000 * M_PI / 3600 / 180;

      /* Print the redshift and area. */
      printf("%-5.2f %g\n", z, pow(tandist * 60,2) * area / 1e6);

  /* Tell the system that everything finished successfully. */
  return EXIT_SUCCESS;

To greatly simplify the compilation, linking and running of simple C programs like this that use Gnuastro’s library, Gnuastro has BuildProgram. This program designed to manage Gnuastro’s dependencies, compile and link the program and then run the new program. To build and run the program above, simply run the following command:

$ astbuildprog myprogram.c

Did you notice how much faster this was compared to the shell loop we wrote above? You might have noticed that a new file called myprogram is also created in the directory. This is the compiled program that was created and run by the command above (its in binary machine code format, not human-readable any more). You can run it again to get the same results with a command like this:

$ ./myprogram

The efficiency of myprogram compared to CosmicCalculator is because the requested processing is faster/comparable to the overheads necessary for each processing. For other programs that take large input datasets (images for example), the overhead is usually negligible compared to the processing. In such cases, the libraries are only useful if you want a different/new processing compared to the functionalities in Gnuastro’s existing programs.

Gnuastro has a large library which is heavily used by all the programs. In other words, the library is like the skeleton of Gnuastro. For the full list of available functions classified by context, please see Gnuastro library. Gnuastro’s library and BuildProgram are created to make it easy for you to use these powerful features as you like. This gives you a high level of creativity, while also providing efficiency and robustness. Several other complete working examples (involving images and tables) of Gnuastro’s libraries can be see in Library demo programs. Let’s stop the discussion on libraries at this point in this tutorial and get back to Gnuastro’s already built programs which were the main purpose of this tutorial.

None of Gnuastro’s programs keep a default value internally within their code. However, when you ran CosmicCalculator with the -z2 option above, it completed its processing and printed results. So where did the “default” cosmological parameter values (like the matter density and etc) come from? The values come from the command-line or a configuration file (see Configuration file precedence).

CosmicCalculator has a limited set of parameters and doesn’t need any particular file inputs. Therefore, we’ll use it to discuss configuration files which are an important part of all Gnuastro’s programs (see Configuration files).

Once you get comfortable with configuration files, you can easily do the same for the options of all Gnuastro programs. For example, NoiseChisel has the largest number of options in the programs. Therefore configuration files will be useful for it when you use different datasets (with different noise properties or in different research contexts). The configuration of each program (besides its version) is vital for the reproducibility of your results, so it is important to manage them properly.

As we saw above, the full list of the options in all Gnuastro programs can be seen with the --help option. Try calling it with CosmicCalculator as shown below. Note how options are grouped by context to make it easier to find your desired option. However, in each group, options are ordered alphabetically.

$ astcosmiccal --help

The options that need a value have an = sign after their long version and FLT, INT or STR for floating point numbers, integer numbers and strings (filenames for example) respectively. All options have a long format and some have a short format (a single character), for more see Options.

When you are using a program, it is often necessary to check the value the option has just before the program starts its processing. In other words, after it has parsed the command-line options and all configuration files. You can see the values of all options that need one with the --printparams or -P option that is common to all programs (see Common options). In the command below, try replacing -P with --printparams to see how both do the same operation.

$ astcosmiccal -P

Let’s say you want a different Hubble constant. Try running the following command to see how the Hubble constant in the output of the command above has changed. Afterwards, delete the -P and add a -z2 to see the results with the new cosmology (or configuration).

$ astcosmiccal -P --H0=70

From the output of the --help option, note how the option for Hubble constant has both short (-H) and long (--H0) formats. One final note is that the equal (=) sign is not mandatory. In the short format, the value can stick to the actual option (the short option name is just one character after-all and thus easily identifiable) and in the long format, a white-space character is also enough.

$ astcosmiccal -H70    -z2
$ astcosmiccal --H0 70 -z2 --arcsectandist

Let’s assume that in one project, you want to only use rounded cosmological parameters (H0 of 70km/s/Mpc and matter density of 0.3). You should therefore run CosmicCalculator like this:

$ astcosmiccal --H0=70 --olambda=0.7 --omatter=0.3 -z2

But having to type these extra options every time you run CosmicCalculator will be prone to errors (typos in particular) and also will be frustrating and slow. Therefore in Gnuastro, you can put all the options and their values in a “Configuration file” and tell the programs to read the option values from there.

Let’s create a configuration file. In your favorite text editor, make a file named my-cosmology.conf (or my-cosmology.txt, the suffix doesn’t matter) which contains the following lines. One space between the option value and name is enough, the values are just under each other to help in readability. Also note that you can only use long option names in configuration files.

H0       70
olambda  0.7
omatter  0.3

You can now tell CosmicCalculator to read this file for option values immediately using the --config option as shown below. Do you see how the output of the following command corresponds to the option values in my-cosmology.txt (previous command)?

$ astcosmiccal --config=my-cosmology.conf -z2

If you need this cosmology every time you are working in a specific directory, you can benefit from Gnuastro’s default configuration files to avoid having to call the --config option. Let’s assume that you want any CosmicCalculator call you make in the my-cosmology directory to use these parameters. You just have to copy the above configuration file into a special directory and file:

$ mkdir my-cosmology
$ mkdir my-cosmology/.gnuastro
$ mv my-cosmology.conf my-cosmology/.gnuastro/astcosmiccal.conf

Once you run CosmicCalculator within my-cosmology as shown below, you will see how your cosmology has been implemented without having to type anything extra on the command-line.

$ cd my-cosmology
$ astcosmiccal -z2
$ cd ..

To further simplify the process, you can use the --setdirconf option. If you are already in your desired directory, calling this option with the others will automatically write the final values (along with descriptions) in .gnuastro/astcosmiccal.conf. For example the commands below will make the same configuration file automatically (with one extra call to CosmicCalculator).

$ mkdir my-cosmology2
$ cd my-cosmology2
$ astcosmiccal --H0 70 --olambda=0.7 --omatter=0.3 --setdirconf
$ astcosmiccal -z2
$ cd ..

Gnuastro’s programs also have default configuration files for a specific user (when run in any directory). This allows you to set a special behavior every time a program is run by a specific user. Only the directory and filename differ from the above, the rest of the process is similar to before. Finally, there are also system-wide configuration files that can be used to define the option values for all users on a system. See Configuration file precedence for a more detailed discussion.

We are now ready to start processing the downloaded images. Since these datasets are already aligned, you don’t need to align them to make sure the pixel grid covers the same region in all inputs. Gnuastro’s Warp program has features for such pixel-grid warping (see Warp). Therefore, just for a demonstration, let’s assume one image needs to be rotated by 20 degrees to correspond to the other. To do that, you can run the following command:

$ astwarp flat-ir/xdf-f160w.fits --rotate=20

Open the output and see the output. If your final image is already aligned with RA and Dec, you can simply use the --align option and let Warp calculate the necessary rotation.

Warp can generally be used for any kind of pixel grid manipulation (warping). For example the outputs of the commands below will respectively have larger pixels (new resolution being one quarter the original resolution), get shifted by 2.8 (by sub-pixel), get a shear of 2, and be tilted (projected). After running each, please open the output file and see the effect.

$ astwarp flat-ir/xdf-f160w.fits --scale=0.25
$ astwarp flat-ir/xdf-f160w.fits --translate=2.8
$ astwarp flat-ir/xdf-f160w.fits --shear=2
$ astwarp flat-ir/xdf-f160w.fits --project=0.001,0.0005

You can also combine multiple warps in one command. For example to first rotate the image, then scale it, run this command:

$ astwarp flat-ir/xdf-f160w.fits --rotate=20 --scale=0.25

If you have multiple warps, do them all in one command. Don’t warp them in separate commands because the correlated noise will become too strong. As you see in the matrix that is printed when you run Warp, it merges all the warps into a single warping matrix (see Warping basics and Merging multiple warpings) and simply applies that just once. Recall that since this is done through matrix multiplication, order matters in the separate operations. In fact through Warp’s --matrix option, you can directly request your desired final warp and don’t have to break it up into different warps like above (see Invoking Warp).

Fortunately these datasets are already aligned to the same pixel grid, so you don’t actually need the files that were just generated. You can safely delete them all with the following command. Here, you see why we put the processed outputs that we need later into a separate directory. In this way, the top directory can be used for temporary files for testing that you can simply delete with a generic command like below.

$ rm *.fits

To detect the signal in the image (separate interesting pixels from noise), we’ll run NoiseChisel (NoiseChisel):

$ astnoisechisel flat-ir/xdf-f160w.fits --output=nc-test.fits

NoiseChisel’s output is a single file containing multiple extensions. You can get basic information about the extensions in a FITS file with Gnuastro’s Fits program (see Fits):

$ astfits nc-test.fits

From the output list, you see NoiseChisel’s output contains 5 extensions and the first (counted as zero) is blank: it has value of 0 in the last/size column, showing that it contains no data, it just contains meta-data. All of Gnuastro’s programs follow this convention of writing no data in the first HDU/extension. This allows the first extension to keep meta-data about all the extensions and is recommended by the FITS standard, see Fits.

The name of each extension describes its contents: the first (INPUT-NO-SKY) is the Sky-subtracted input that you provided. The second (NoiseChisel’s main output, DETECTIONS) has a numeric data type of uint8 with only two possible values for all pixels: 0 for noise and 1 for signal. The third and fourth (called SKY and SKY_STD), have the Sky and its standard deviation values of the input on a tessellation and were calculated over the undetected regions.

To better understand NoiseChisel’s output and the extensions described above, let’s visually inspect it (here, we’ll use SAO DS9). Since the file contains multiple related extensions, the easiest way to view all of them in DS9 is to open it as a “Multi-extension data cube” with the -mecube option as shown below30.

$ ds9 -mecube nc-test.fits -zscale -zoom to fit

The buttons and horizontal scroll bar in the “cube” window can be used to navigate between the extensions. In this mode, all DS9’s settings (for example zoom or color-bar) will be identical between the extensions. Try zooming into to one part and seeing how the galaxies were detected along with the Sky and Sky standard deviation values. Just have in mind that NoiseChisel’s job is only detection (separating signal from noise), We’ll segment this result later.

One good way to see if you have missed any signal is to mask all the detected pixels and inspect the noise pixels. For this, you can use Gnuastro’s Arithmetic program (in particular its where operator, see Arithmetic operators). With the command below, all detected pixels (in the DETECTIONS extension) will be set to NaN in the output (nc-masked.fits). To make the command easier to read/write, let’s just put the file name in a shell variable (img) first. A shell variable’s value can be retrieved by adding a $ before its name.

$ img=nc-test.fits
$ astarithmetic $img $img nan where -hINPUT-NO-SKY -hDETECTIONS      \

To invert the result (only keep the values of detected pixels), you can flip the detected pixel values (from 0 to 1 and vice-versa) by adding a not after the second $img:

$ astarithmetic $img $img not nan where -hINPUT-NO-SKY -hDETECTIONS  \

NoiseChisel can produce “Check images” to help you visualize and inspect how each step is completed. You can see all the check images it can produce with this command.

$ astnoisechisel --help | grep check

Let’s see how the quantile threshold (the first step after convolution) has been found and applied in our previous run of NoiseChisel:

$ astnoisechisel flat-ir/xdf-f160w.fits --checkqthresh

The check images/tables are also multi-extension FITS files. As you see, when check datasets are requested, NoiseChisel won’t go to the end and abort as soon as all its extensions are ready. Try listing the extensions with astfits and then opening them with ds9 as we done above. It is strongly encouraged to play with the different parameters and use the respective check images to see which step is affected by your change.

One major factor determining NoiseChisel’s the quantile threshold is NoiseChisel’s ability to identify signal in a tile (the threshold is only found on those tiles with no major signal). Therefore the larger the tiles are, number-statistics will cause less scatter, therefore helping NoiseChisel find the quantile threshold. However, if the tiles are too large, there might not be enough over the dataset or they won’t be able to deal with possible gradients. Let’s see what the default (small) tile size was with the following command.

$ astnoisechisel -P | grep tilesize

Its a 50 by 50 box. Flip back and forth between the CONVOLVED and QTHRESH_ERODE extensions of the check image to get a feeling of which tiles succeeded (have non-blank values)31. Since this is a relatively large image and we don’t have any gradients, let’s increase the tile size to 100 by 100

$ astnoisechisel flat-ir/xdf-f160w.fits --tilesize=100,100    \

Inspecting the check image, you see that there are now only a handful of useful tiles in the central parts. This shows the field is too crowded, and we should slightly decrease the tile size for a more robust result that also covers more of the dataset. Let’s set it to a 75 by 75 pixel box:

$ astnoisechisel flat-ir/xdf-f160w.fits --tilesize=75,75    \

The result seems reasonable now: we have a larger tile size than the default value, but less scatter, and the tiles cover a sufficiently wide area of the dataset. So, we’ll use this tile size for the next steps. But first, let’s clean all the temporary files and make a directory for the NoiseChisel outputs:

$ rm *.fits
$ mkdir nc
$ astnoisechisel flat-ir/xdf-f160w.fits --tilesize=75,75    \
$ astnoisechisel flat-ir/xdf-f105w.fits --tilesize=75,75    \

Before continuing with the higher-level processing of this dataset, we’ll pause to use NoiseChisel’s multi-extension output for showing how the Fits program can make it easy to work with this wonderful data container (see Fits). Let’s say you need to copy a HDU/extension from one FITS file to another. Try the command below to make an objects.fits file that contains only NoiseChisel’s binary detection map. There are similar options to conveniently cut or delete HDUs from a FITS file also.

$ astfits nc/xdf-f160w.fits --copy=DETECTIONS -odetections.fits

NoiseChisel puts some general information on its outputs in the FITS header of the respective extension. To see the full list of keywords in an extension, you can again use the Fits program like above. But instead of HDU manipulation options, give it the HDU you are interested in with -h. You can also give the HDU number (as listed in the output above), for example -h2 instead of -hDETECTIONS.

$ astfits nc/xdf-f160w.fits -hDETECTIONS

The DETSN keyword in NoiseChisel’s DETECTIONS extension contains the true pseudo-detection signal-to-noise ratio that was found by NoiseChisel on the dataset. It is not easy to find it in the middle of all the other keywords printed by the command above (especially in files that have many more keywords). To fix the problem, you can pipe the output of the command above into grep (a program for matching lines which is available on almost all Unix-like operating systems).

$ astfits nc/xdf-f160w.fits -hDETECTIONS | grep DETSN

If you just want the value of the keyword and not the full FITS keyword line, you can use AWK. In the example below, AWK will print the third word (separated by white space characters) in any line that has a first column value of DETSN.

$ astfits nc/xdf-f160w.fits -h2 | awk '$1=="DETSN" {print $3}'

The main output of NoiseChisel is the binary detection map (DETECTIONS extension), which only has two values of 1 or 0. This is useful when studying the noise, but hardly of any use when you actually want to study the targets/galaxies in the image, especially in such a deep field where the detection map of almost everything is connected. To find the galaxies over the detections, we’ll use Gnuastro’s Segment program:

$ rm *.fits
$ mkdir seg
$ astsegment nc/xdf-f160w.fits -oseg/xdf-f160w.fits

Segment’s operation is very much like NoiseChisel (in fact, prior to version 0.6, it was part of NoiseChisel), for example the output is a multi-extension FITS file, it has check images and uses the undetected regions as a reference. Please have a look at Segment’s multi-extension output with ds9 to get a good feeling of what it has done. Like NoiseChisel, the first extension is the input. The CLUMPS extension shows the true “clumps” with values that are \(\ge1\), and the diffuse regions labeled as \(-1\). In the OBJECTS extension, we see that the large detections of NoiseChisel (that may have contained many galaxies) are now broken up into separate labels. see Segment for more.

Having localized the regions of interest in the dataset, we are ready to do measurements on them with MakeCatalog. Besides the IDs, we want to measure the Right Ascension (with --ra), Declination (--dec, magnitude (--magnitude and signal-to-noise ratio (--sn) of the objects and clumps. The following command will make these measurements on Segment’s F160W output:

$ mkdir cat
$ astmkcatalog seg/xdf-f160w.fits --ids --ra --dec --magnitude --sn \
               --zeropoint=25.94                                    \
               --clumpscat --output=cat/xdf-f160w.fits

From the printed statements on the command-line, you see that MakeCatalog read all the extensions in Segment’s output for the various measurements it needed.

The output of the MakeCatalog command above is a FITS table. The two clump and object catalogs are available in the two extensions of the single FITS file32. Let’s inspect the separate extensions with the Fits program like before (as shown below). Later, we’ll inspect the table in each extension with Gnuastro’s Table program (see Table). Note that we could have used -hOBJECTS and -hCLUMPS instead of -h1 and -h2 respectively.

$ astfits  cat/xdf-f160w.fits              # Extension information
$ asttable cat/xdf-f160w.fits -h1 --info   # Objects catalog info.
$ asttable cat/xdf-f160w.fits -h1          # Objects catalog columns.
$ asttable cat/xdf-f160w.fits -h2 -i       # Clumps catalog info.
$ asttable cat/xdf-f160w.fits -h2          # Clumps catalog columns.

As you see above, when given a specific table (file name and extension), Table will print the full contents of all the columns. To see basic information about each column (for example name, units and comments), simply append a --info (or -i).

To print the contents of special column(s), just specify the column number(s) (counting from 1) or the column name(s) (if they have one). For example, if you just want the magnitude and signal-to-noise ratio of the clumps (in -h2), you can get it with any of the following commands

$ asttable cat/xdf-f160w.fits -h2 -c5,6
$ asttable cat/xdf-f160w.fits -h2 -c5,SN
$ asttable cat/xdf-f160w.fits -h2 -c5         -c6
$ asttable cat/xdf-f160w.fits -h2 -cMAGNITUDE -cSN

In the example above, the clumps catalog has two ID columns (one for the over-all clump ID and one for the ID of the clump in its host object), while the objects catalog only has one ID column. Therefore, the location of the magnitude column differs between the object and clumps catalog. So if you want to specify the columns by number, you will need to change the numbers when viewing the clump and objects catalogs. This is a useful advantage of having/using column names33.

$ asttable catalog/xdf-f160w.fits -h1 -c4 -c5
$ asttable catalog/xdf-f160w.fits -h2 -c5 -c6

Finally, the comments in MakeCatalog’s output (COMMENT keywords in the FITS headers, or lines starting with # in plain text) contain some important information about the input dataset that can be useful (for example pixel area or per-pixel surface brightness limit). For example have a look at the output of this command:

$ astfits cat/xdf-f160w.fits -h1 | grep COMMENT

To calculate colors, we also need magnitude measurements on the F105W filter. However, the galaxy properties might differ between the filters (which is the whole purpose behind measuring colors). Also, the noise properties and depth of the datasets differ. Therefore, if we simply follow the same Segment and MakeCatalog calls above for the F105W filter, we are going to get a different number of objects and clumps. Matching the two catalogs is possible (for example with Match), but the fact that the measurements will be done on different pixels, can bias the result. Since the Point spread function (PSF) of both images is very similar, an accurate color calculation can only be done when magnitudes are measured from the same pixels on both images.

The F160W image is deeper, thus providing better detection/segmentation, and redder, thus observing smaller/older stars and representing more of the mass in the galaxies. We will thus use the pixel labels generated on the F160W filter, but do the measurements on the F105W filter (using the --valuesfile option) in the command below. Notice how the only difference between this call to MakeCatalog and the previous one is --valuesfile, the value given to --zeropoint and the output name.

$ astmkcatalog seg/xdf-f160w.fits --ids --ra --dec --magnitude --sn \
               --valuesfile=nc/xdf-f105w.fits --zeropoint=26.27     \
               --clumpscat --output=cat/xdf-f105w.fits

Look into what MakeCatalog printed on the command-line. You can see that (as requested) the object and clump labels were taken from the respective extensions in seg/xdf-f160w.fits, while the values and Sky standard deviation were done on nc/xdf-f105w.fits.

Since we used the same labeled image on both filters, the number of rows in both catalogs are the same. The clumps are not affected by the hard-to-deblend and low signal-to-noise diffuse regions, they are more robust for calculating the colors (compared to objects). Therefore from this step onward, we’ll continue with clumps.

We can finally calculate the colors of the objects from these two datasets. If you inspect the contents of the two catalogs, you’ll notice that because they were both derived from the same segmentation maps, the rows are ordered identically (they correspond to the same object/clump in both filters). But to be generic (usable even when the rows aren’t ordered similarly) and display another useful program in Gnuastro, we’ll use Match.

As the name suggests, Gnuastro’s Match program will match rows based on distance (or aperture in 2D) in one (or two) columns. In the command below, the options relating to each catalog are placed under it for easy understanding. You give Match two catalogs (from the two different filters we derived above) as argument, and the HDUs containing them (if they are FITS files) with the --hdu and --hdu2 options. The --ccol1 and --ccol2 options specify which columns should be matched with which in the two catalogs. With --aperture you specify the acceptable error (radius in 2D), in the same units as the columns (see below for why we have requested an aperture of 0.35 arcseconds, or less than 6 HST pixels).

The --outcols is a very convenient feature in Match: you can use it to specify which columns from the two catalogs you want in the output. If the first character is an ‘a’, the respective matched column (number or name, similar to Table above) in the first catalog will be written in the output table. When the first character is a ‘b’, the respective column from the second catalog will be written in the output. You can use this to mix the desired matched columns from both catalogs in the output.

$ astmatch cat/xdf-f160w.fits           cat/xdf-f105w.fits         \
           --hdu=CLUMPS                 --hdu2=CLUMPS              \
           --ccol1=RA,DEC               --ccol2=RA,DEC             \
           --aperture=0.35/3600                                    \
           --outcols=a1,a2,aRA,aDEC,aMAGNITUDE,aSN,bMAGNITUDE,bSN  \
           --log --output=cat/xdf-f160w-f105w.fits

By default (when --quiet isn’t called), the Match program will just print the number of matched rows in the standard output. If you have a look at your input catalogs, this should be the same as the number of rows in them. Let’s have a look at the columns in the matched catalog:

$ asttable cat/xdf-f160w-f105w.fits -i

Indeed, its exactly the columns we wanted. There is just one confusion however: there are two MAGNITUDE and SN columns. Right now, you know that the first one was from the F160W filter, and the second was for F105W. But in one hour, you’ll start doubting your self: going through your command history, trying to answer this question: “which magnitude corresponds to which filter?”. You should never torture your future-self (or colleagues) like this! So, let’s rename these confusing columns in the matched catalog. The FITS standard for tables stores the column names in the TTYPE header keywords, so let’s have a look:

$ astfits cat/xdf-f160w-f105w.fits -h1 | grep TTYPE

Changing/updating the column names is as easy as updating the values to these options with the first command below, and with the second, confirm this change:

$ astfits cat/xdf-f160w-f105w.fits -h1                          \
          --update=TTYPE5,MAG_F160W   --update=TTYPE6,SN_F160W  \
          --update=TTYPE7,MAG_F105W   --update=TTYPE8,SN_F105W
$ asttable cat/xdf-f160w-f105w.fits -i

If you noticed, when running Match, the previous command, we also asked for --log. Many Gnuastro programs have this option to provide some detailed information on their operation in case you are curious. Here, we are using it to justify the value we gave to --aperture. Even though you asked for the output to be written in the cat directory, a listing of the contents of your current directory will show you an extra astmatch.fits file. Let’s have a look at what columns it contains.

$ ls
$ asttable astmatch.log -i

The MATCH_DIST column contains the distance of the matched rows, let’s have a look at the distribution of values in this column. You might be asking yourself “why should the positions of the two filters differ when I gave MakeCatalog the same segmentation map?” The reason is that the central positions are flux-weighted. Therefore the --valuesfile dataset you give to MakeCatalog will also affect the center measurements34. Recall that the Spectral Energy Distribution (SED) of galaxies is not flat and they have substructure, therefore, they can have different shapes/morphologies in different filters.

Gnuastro has a simple program for basic statistical analysis. The command below will print some basic information about the distribution (minimum, maximum, median and etc), along with a cute little ASCII histogram to visually help you understand the distribution on the command-line without the need for a graphic user interface (see Invoking Statistics). This ASCII histogram can be useful when you just want some coarse and general information on the input dataset. It is also useful when working on a server (where you may not have graphic user interface), and finally, its fast.

$ aststatistics astmatch.fits -cMATCH_DIST

The units of this column are the same as the columns you gave to Match: in degrees. You see that while almost all the objects matched very nicely, the maximum distance is roughly 0.31 arcseconds. This is why we asked for an aperture of 0.35 arcseconds when doing the match.

We can now use AWK to find the colors. We’ll ask AWK to only use rows that don’t have a NaN magnitude in either filter35. We will also ignore columns which don’t have reliable F105W measurement (with a S/N less than 736).

$ asttable cat/xdf-f160w-f105w.fits -cMAG_F160W,MAG_F105W,SN_F105W  \
           | awk '$1!="nan" && $2!="nan" && $3>7 {print $2-$1}'     \
           > f105w-f160w.txt

You can inspect the distribution of colors with the Statistics program again:

$ aststatistics f105w-f160w.txt -c1

You can later use Gnuastro’s Statistics program with the --histogram option to build a much more fine-grained histogram as a table to feed into your favorite plotting program for a much more accurate/appealing plot (for example with PGFPlots in LaTeX). If you just want a specific measure, for example the mean, median and standard deviation, you can ask for them specifically with this command:

$ aststatistics f105w-f160w.txt -c1 --mean --median --std

Some researchers prefer to have colors in a fixed aperture for all the objects. The colors we calculated above used a different segmentation map for each object. This might not satisfy some science cases. So, let’s make a fixed aperture catalog. To make an catalog from fixed apertures, we should make a labeled image which has a fixed label for each aperture. That labeled image can be given to MakeCatalog instead of Segment’s labeled detection image.

To generate the apertures catalog, we’ll first read the positions from F160W catalog and set the other parameters of each profile to be a fixed circle of radius 5 pixels (we want all apertures to be identical in this scenario).

$ rm *.fits *.txt
$ asttable cat/xdf-f160w.fits -hCLUMPS -cRA,DEC                    \
           | awk '!/^#/{print NR, $1, $2, 5, 5, 0, 0, 1, NR, 1}' \
           > apertures.txt

We can now feed this catalog into MakeProfiles to build the apertures for us. See Invoking MakeProfiles for a description of the options. The most important for this particular job is --mforflatpix, it tells MakeProfiles that the values in the magnitude column should be used for each pixel of a flat profile. Without it, MakeProfiles would build the profiles such that the sum of the pixels of each profile would have a magnitude (in log-scale) of the value given in that column (what you would expect when simulating a galaxy for example).

$ astmkprof apertures.txt --background=flat-ir/xdf-f160w.fits     \
            --clearcanvas --replace --type=int16 --mforflatpix    \

The first thing you might notice in the printed information is that the profiles are not built in order. This is because MakeProfiles works in parallel, and parallel CPU operations are asynchronous. You can try running MakeProfiles with one thread (using --numthreads=1) to see how order is respected in that case.

Open the output apertures.fits file and see the result. Where the apertures overlap, you will notice that one label has replaced the other (because of the --replace option). In the future, MakeCatalog will be able to work with overlapping labels, but currently it doesn’t. If you are interested, please join us in completing Gnuastro with added improvements like this (see task 14750 37).

We can now feed the apertures.fits labeled image into MakeCatalog instead of Segment’s output as shown below. In comparison with the previous MakeCatalog call, you will notice that there is no more --clumpscat option, since each aperture is treated as a separate “object” here.

$ astmkcatalog apertures.fits -h1 --zeropoint=26.27        \
               --valuesfile=nc/xdf-f105w.fits              \
               --ids --ra --dec --magnitude --sn           \

This catalog has the same number of rows as the catalog produced from clumps, therefore similar to how we found colors, you can compare the aperture and clump magnitudes for example. You can also change the filter name and zeropoint magnitudes and run this command again to have the fixed aperture magnitude in the F160W filter and measure colors on apertures.

As a final step, let’s go back to the original clumps-based catalogs we generated before. We’ll find the objects with the strongest color and make a cutout to inspect them visually and finally, we’ll see how they are located on the image.

First, let’s see what the objects with a color more than two magnitudes look like. As you see, this is very much like the command above for selecting the colors, only instead of printing the color, we’ll print the RA and Dec. With the command below, the positions of all lines with a color more than 1.5 will be put in reddest.txt

$ asttable cat/xdf-f160w-f105w.fits                                \
           -cMAG_F160W,MAG_F105W,SN_F105W,RA,DEC                   \
           | awk '$1!="nan" && $2!="nan" && $2-$1>1.5 && $3>7      \
                  {print $4,$5}' > reddest.txt

We can now feed reddest.txt into Gnuastro’s crop to see what these objects look like. To keep things clean, we’ll make a directory called crop-red and ask Crop to save the crops in this directory. We’ll also add a -f160w.fits suffix to the crops (to remind us which image they came from). The width of the crops will be 15 arcseconds.

$ mkdir crop-red
$ astcrop --mode=wcs --coordcol=3 --coordcol=4 flat-ir/xdf-f160w.fits \
          --catalog=reddest.txt --width=15/3600,15/3600               \
          --suffix=-f160w.fits --output=crop-red

Like the MakeProfiles command above, you might notice that the crops aren’t made in order. This is because each crop is independent of the rest, therefore crops are done in parallel, and parallel operations are asynchronous. In the command above, you can change f160w to f105w to make the crops in both filters.

To view the crops more easily (not having to open ds9 for each image), you can convert the FITS crops into the JPEG format with a shell loop like below.

$ cd crop-red
$ for f in *.fits; do                                                  \
    astconvertt $f --fluxlow=-0.001 --fluxhigh=0.005 --invert -ojpg;   \
$ cd ..

You can now use your general graphic user interface image viewer to flip through the images more easily. On GNOME, you can use the “Eye of GNOME” image viewer (with executable name of eog). Run the command below to open the first one (if you aren’t using GNOME, use the command of your image viewer instead of eog):

$ eog 1-f160w.jpg

In Eye of GNOME, you can flip through the images and compare them visually more easily by pressing the <SPACE> key. Of course, the flux ranges have been chosen generically here for seeing the fainter parts. Therefore, brighter objects will be fully black.

The for loop above to convert the images will do the job in series: each file is converted only after the previous one is complete. If you have GNU Parallel, you can greatly speed up this conversion. GNU Parallel will run the separate commands simultaneously on different CPU threads in parallel. For more information on efficiently using your threads, see Multi-threaded operations. Here is a replacement for the shell for loop above using GNU Parallel.

$ cd crop-red
$ parallel astconvertt --fluxlow=-0.001 --fluxhigh=0.005 --invert   \
           -ojpg ::: *.fits
$ cd ..

As the final action, let’s see how these objects are positioned over the dataset. DS9 has the “Region”s concept for this purpose. You just have to convert your catalog into a “region file” to feed into DS9. To do that, you can use AWK again as shown below.

$ awk 'BEGIN{print "# Region file format: DS9 version 4.1";     \
             print "global color=green width=2";                \
             print "fk5";}                                      \
       {printf "circle(%s,%s,1\")\n", $1, $2;}' reddest.txt     \
       > reddest.reg

This region file can be loaded into DS9 with its -regions option to display over any image (that has world coordinate system). In the example below, we’ll open Segment’s output and load the regions over all the extensions (to see the image and the respective clump):

$ ds9 -mecube seg/xdf-f160w.fits -zscale -zoom to fit    \
      -regions load all reddest.reg

In conclusion, we hope this extended tutorial has been a good starting point to help in your exciting research. If this book or any of the programs in Gnuastro have been useful for your research, please cite the respective papers and share your thoughts and suggestions with us (it can be very encouraging). All Gnuastro programs have a --cite option to help you cite the authors’ work more easily. Just note that it may be necessary to cite additional papers for different programs, so please try it out for any program you used.

$ astmkcatalog --cite
$ astnoisechisel --cite



See SAO ds9, available at




GNU Info is already available on almost all Unix-like operating systems.


Note that you only have one port to the internet, so downloading in parallel will actually be slower than downloading in series.


You can configure your graphic user interface to open DS9 in multi-extension cube mode by default when using the GUI (double clicking on the file). If your graphic user interface is GNOME (another GNU software, it is most common in GNU/Linux operating systems), a full description is given in Viewing multiextension FITS images


The other extensions don’t matter much for the check here. They show the --qthresh values for the other thresholds (on the same tiles), followed by the interpolated values for all the thresholds and afterwards the smoothed values that are used for the next steps. The final extension (QTHRESH-APPLIED, shows the result of applying the erode and no-erode thresholds.


MakeCatalog can also output plain text tables. However, in the plain text format you can only have one table per file. Therefore, if you also request measurements on clumps, two plain text tables will be created (suffixed with _o.txt and _c.txt).


Column meta-data (including a name) can also be specified in plain text tables, see Gnuastro text table format.


To only measure the center based on the labeled pixels (and ignore the pixel values), you can ask for the columns that contain geo (for geometric) in them. For example --geow1 or --geow2 for the RA and Declination (first and second world-coordinates).


This can happen even on the reference image. It is because of the current way clumps are defined in Segment when they are placed on strong gradients. It is because of high “river” values on such gradients. See Segment changes after publication. To avoid this problem, you can currently ask for the --brighntessnoriver output column.


The value of 7 is taken from the clump S/N threshold in F160W (where the clumps were defined).


Next: , Previous: , Up: Tutorials   [Contents][Index]