GNU Astronomy Utilities


Previous: , Up: Tutorials   [Contents][Index]


2.3 General program usage tutorial

Measuring colors of astronomical objects in broad-band or narrow-band images is one of the most basic and common steps in astronomical analysis. Here, we will use Gnuastro’s programs to detect objects in a Hubble Space Telescope (HST) image and measure their colors. We will use the eXtreme Deep Field dataset. Like almost all astronomical surveys, this dataset is free for download and usable by the public. This tutorial was first prepared for the “Exploring the Ultra-Low Surface Brightness Universe” workshop (November 2017) at the International Space Science Institute (ISSI) in Bern, Switzerland. We would like to thank them and the attendees for a very fruitful week.

You will need the following tools in this tutorial: Gnuastro, SAO DS9 21, GNU Wget22, and AWK (most common implementation is GNU AWK23).

Type the example commands: Try to type the example commands on your terminal and don’t simply copy and paste them. This will help simulate future situations when you are processing your own datasets.

A handy feature of Gnuastro is that all program names start with ast. This will allow your command-line processor to easily list and auto-complete Gnuastro’s programs for you. Try typing the following command (press TAB key when you see <TAB>) to see the list:

$ ast<TAB><TAB>

Any program that starts with ast (including all Gnuastro programs) will be shown. By choosing the subsequent characters of your desired program and pressing <TAB><TAB> again, the list will narrow down and the program name will auto-complete once your input characters are unambiguous. In short, you often don’t need to type the full name of the program you want to run.

Gnuastro contains a large number of programs and it is natural to forget the details of each program’s options. Therefore, before starting the analysis, let’s review how you can refresh your memory any time you want on the command-line while working (without taking your hands off the keyboard). This book comes with your installation, so it will always correspond to your installed version of Gnuastro. Please see Info for more.

GNU Info24 is the program in charge of displaying the manual on the command-line. To see this whole book on your command-line, please run the following command and subsequent keys. Info has its own mini-environment, therefore we’ll show the keys that must be pressed in the mini-environment after a -> sign. You can also ignore anything after the # sign in the middle of the line, they are only for your information.

$ info gnuastro                # Open the top of the manual.
-> <SPACE>                     # All the book chapters.
-> <SPACE>                     # Continue down: show sections.
-> <SPACE> ...                 # Keep pressing space to go down.
-> q                           # Quit Info, return to the command-line.

The thing that greatly simplifies navigation in Info is the links (regions with an underline). You can immediately go to the next link in the page with the <TAB> key and press <ENTER> on it to go into that part of the manual. Try the commands above again, but this time also use <TAB> to go to the links and press <ENTER> on them to go to the respective section of the book. Then follow a few more links and go deeper into the book. To return to the previous page, press l (small L). If you are searching for a specific phrase in the whole book (for example an option name), press s and type your search phrase and end it with an <ENTER>.

You don’t need to start from the top of the manual every time. For example, to get to Invoking NoiseChisel, run the following command. In general, all programs have such an “Invoking ProgramName” section in this book. These sections are specifically for the description of inputs, outputs and configuration options of each program. You can access them directly for each program by giving its executable name to Info.

$ info astnoisechisel

The other sections don’t have such shortcuts. To directly access them from the command-line, you need to tell Info to look into Gnuastro’s manual, then look for the specific section (an unambiguous title is necessary). For example, if you only want to review/remember NoiseChisel’s Detection options or Segmentation options), just run any of these commands. Note how case is irrelevant for Info when calling a title in this manner.

$ info gnuastro "Detection options"
$ info gnuastro "segmentation options"

In general, Info is a wonderful and powerful way to access this whole book with detailed information about the programs you are running very fast. If you are not already familiar with it, please run the following command and just read along and do what it says to learn it. Don’t stop until you feel have become sufficiently fluent in it. Please invest the half an hour’s time necessary to start using Info comfortably. It will greatly improve your productivity and you will start reaping the rewards of this investment very soon.

$ info info

As a good scientist you need to feel comfortable to play with the features/options and avoid (be critical to) using default values as much as possible. On the other hand, our human memory is very limited, so it is important to be able to easily access any part of this book fast and remember the option names, what they do and their acceptable values.

If you just want the option names and a short description, calling the program with the --help option might also be a good solution like the first example below. If you know a few characters of the option name, you can feed the output to grep like the second or third example commands.

$ astnoisechisel --help
$ astnoisechisel --help | grep quant
$ astnoisechisel --help | grep check

Let’s start the processing. First, to keep things clean, let’s create a gnuastro-tutorial directory and continue all future steps in it:

$ mkdir gnuastro-tutorial
$ cd gnuastro-tutorial

We will be using the near infra-red Wide Field Camera dataset. If you already have them in another directory (for example XDFDIR), you can set the download directory to be a symbolic link to XDFDIR with a command like this:

ln -s XDFDIR download

If the following images aren’t already present on your system, you can make a download directory and download them there.

$ mkdir download
$ cd download
$ xdfurl=http://archive.stsci.edu/pub/hlsp/xdf
$ wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_f105w_v1_sci.fits
$ wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_f160w_v1_sci.fits
$ cd ..

In this tutorial, we’ll just use these two filters. Later, you will probably need to download more filters, you can use the shell’s for loop to download them all in series (one after the other25) with one command like the one below for the WFC3 filters. Put this command instead of the two wget commands above. Recall that all the extra spaces, back-slashes (\), and new lines can be ignored if you are typing on the lines on the terminal.

$ for f in f105w f125w f140w f160w; do                              \
    wget $xdfurl/hlsp_xdf_hst_wfc3ir-60mas_hudf_"$f"_v1_sci.fits;   \
  done

First, let’s visually inspect the dataset. Let’s take F160W image as an example. Do the steps below with the other image(s) too (and later with any dataset that you want to work on). It is very important to understand your dataset visually. Note how ds9 doesn’t follow the GNU style of options where “long” and “short” options are preceded by -- and - respectively (for example --width and -w, see Options).

Ds9’s -zscale option is a good scaling to highlight the low surface brightness regions, and as the name suggests, -zoom to fit will fit the whole dataset in the window. If the window is too small, expand it with your mouse, then press the “zoom” button on the top row of buttons above the image, then in the row below it, press “zoom fit”. You can also zoom in and out by scrolling your mouse or the respective operation on your touch-pad when your cursor/pointer is over the image.

$ ds9 download/hlsp_xdf_hst_wfc3ir-60mas_hudf_f160w_v1_sci.fits     \
      -zscale -zoom to fit

The first thing you might notice is that the regions with no data have a value of zero in this image. The next thing might be that the dataset actually has two “depth”s (see Quantifying measurement limits). The exposure time of the inner region is more than 4 times more than the outer parts. Fortunately the XDF survey webpage (above) contains the vertices of the deep flat WFC3-IR field. You can use those vertices in Crop to cutout this deep infra-red region from the larger image. We’ll make a directory called flat-ir and keep the flat infra-red regions in that directory (with a ‘xdf-’ suffix for a shorter and easier filename).

$ mkdir flat-ir
$ astcrop --mode=wcs -h0 --output=flat-ir/xdf-f105w.fits              \
          --polygon="53.187414,-27.779152 : 53.159507,-27.759633 :    \
                     53.134517,-27.787144 : 53.161906,-27.807208"     \
          download/hlsp_xdf_hst_wfc3ir-60mas_hudf_f105w_v1_sci.fits
$ astcrop --mode=wcs -h0 --output=flat-ir/xdf-f160w.fits              \
          --polygon="53.187414,-27.779152 : 53.159507,-27.759633 :    \
                     53.134517,-27.787144 : 53.161906,-27.807208"     \
          download/hlsp_xdf_hst_wfc3ir-60mas_hudf_f160w_v1_sci.fits

The only thing varying in the two calls to Gnuastro’s Crop program is the filter name. Therefore, to simplify the command, and later allow work on more filters, we can use the shell’s for loop. Notice how the two places where the filter names (f105w and f160w) are used above have been replaced with $f (the shell variable that for is in charge of setting) below. To generalize this for more filters later, you can simply add the other filter names in the first line before the semi-colon (;).

$ for f in f105w f160w; do                                            \
    astcrop --mode=wcs -h0 --output=flat-ir/xdf-$f.fits               \
            --polygon="53.187414,-27.779152 : 53.159507,-27.759633 :  \
                       53.134517,-27.787144 : 53.161906,-27.807208"   \
            download/hlsp_xdf_hst_wfc3ir-60mas_hudf_"$f"_v1_sci.fits; \
  done

Please open these images and inspect them with the same ds9 commands you used above. You will see how it is completely flat now and doesn’t have varying depths. Another important result of this crop is that regions with no data now have a NaN (blank) value, not zero. Zero is a meaningful value and especially when using NoiseChisel, the input should have NaN values for pixels with no data, not zero.

This is the deepest image we currently have of the sky. The first thing that comes to mind may be this: “How large is this field?”. Let’s find the answer to this question with the commands below. The lines starting with ## are just comments for you to help in following the steps. Don’t type them on the terminal. The commands are intentionally repetitive in some places to better understand each step and also to demonstrate the beauty of command-line features like variables, pipes and loops. Later, if you would like to repeat this process on another dataset, you can just use commands 3, 7, and 9.

Use shell history: Don’t forget to make effective use of your shell’s history. This is especially convenient when you just want to make a small change to your previous command. Press the “up” key on your keyboard (possibly multiple times) to see your previous command(s).

## (1)  See the general statistics of non-blank pixel values.
$ aststatistics flat-ir/xdf-f160w.fits

## (2)  We only want the number of non-blank pixels.
$ aststatistics flat-ir/xdf-f160w.fits --number

## (3)  Keep the result of the command above in the shell variable `n'.
$ n=$(aststatistics flat-ir/xdf-f160w.fits --number)

## (4)  See what is stored the shell variable `n'.
$ echo $n

## (5)  Show all the FITS keywords of this image.
$ astfits flat-ir/xdf-f160w.fits -h1

## (6)  The resolution (in degrees/pixel) is in the `CDELT' keywords.
##      Only show lines that contain these characters, by feeding
##      the output of the previous command to the `grep' program.
$ astfits flat-ir/xdf-f160w.fits -h1 | grep CDELT

## (7)  Save the resolution (same in both dimensions) in the variable
##      `r'. The last part uses AWK to print the third `field' of its
##      input line. The first two fields were `CDELT1' and `='.
$ r=$(astfits flat-ir/xdf-f160w.fits -h1 | grep CDELT1   \
              | awk '{print $3}')

## (8)  Print the values of `n' and `r'.
$ echo $n $r

## (9)  Use the number of pixels (first number passed to AWK) and
##      length of each pixel's edge (second number passed to AWK)
##      to estimate the area of the field in arc-minutes squared.
$ area=(echo $n $r | awk '{print $1 * ($2^2) * 3600}')

The area of this field is 4.03817 (or 4.04) arc-minutes squared. Just for comparison, this is roughly 175 times smaller than the average moon’s angular area (with a diameter of 30arc-minutes or half a degree).

AWK for table/value processing: AWK is a powerful and simple tool for text processing. Above (and further below) some simple examples are shown. GNU AWK (the most common implementation) comes with a free and wonderful book in the same format as this book which will allow you to master it nicely. Just like this manual, you can also access GNU AWK’s manual on the command-line whenever necessary without taking your hands off the keyboard as described above.

This takes us to the second question that you have probably asked yourself when you saw the field for the first time: “How large is this area at different redshifts?”. To get a feeling of the tangential area that this field covers at redshift 2, you can use CosmicCalculator. In particular, you need the tangential distance covered by 1 arc-second as raw output. Combined with the field’s area, we can then calculate the tangential distance in Mega Parsecs squared (\(Mpc^2\)).

## Print general cosmological properties at redshift 2.
$ astcosmiccal -z2

## When given a "Specific calculation" option, CosmicCalculator
## will just print that particular calculation. See the options
## under this title in the output of `--help' for more.
$ astcosmiccal --help

## Only print the "Tangential dist. covered by 1arcsec at z (kpc)".
## in units of kpc/arc-seconds.
$ astcosmiccal -z2 --arcsectandist

## Convert this distance to kpc^2/arcmin^2 and save in `k'.
$ k=$(astcosmiccal -z2 --arcsectandist | awk '{print ($1*60)^2}')

## Multiply by the area of the field (in arcmin^2) and divide by
## 10^6 to return value in Mpc^2.
$ echo $k $area | awk '{print $1 * $2 / 1e6}'

At redshift 2, this field therefore covers 1.07145 \(Mpc^2\). If you would like to see how this tangential area changes with redshift, you can use a shell loop like below.

$ for z in 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0; do           \
    k=$(astcosmiccal -z$z --arcsectandist);                      \
    echo $z $k $area | awk '{print $1, ($2*60)^2 * $3 / 1e6}';   \
  done

Fortunately, the shell has a very useful tool/program to print a sequence of numbers that is nicely called seq. You can use it instead of typing all the different redshifts in this example. For example the loop below will print the same range of redshifts (between 0.5 and 5) but with increments of 0.1.

$ for z in $(seq 0.5 0.1 5); do                                  \
    k=$(astcosmiccal -z$z --arcsectandist);                      \
    echo $z $k $area | awk '{print $1, ($2*60)^2 * $3 / 1e6}';   \
  done

If you want a fast result and commonly need such processing for a larger number of redshifts, the command above can be slow. This is because the CosmicCalculator program has a lot of overhead: it has to parse the command-line and all configuration files (see below). These are both human-readable characters, not computer-readable bits. CosmicCalculator then has to check the sanity of its inputs and check which of its many options you have asked for. It has to do all of these for every redshift. To greatly speed up the processing, Gnuastro gives you direct access to the root work-horse of CosmicCalculator without all that overhead: Gnuastro library.

Using Gnuastro’s library, you can write your own tiny little program for this same calculation, without all that extra overhead of CosmicCalculator (or any of Gnuastro’s programs, see Library). For this particular job, you want Gnuastro’s Cosmology library (cosmology.h). Here is one example: put the following small C program in a file called myprogram.c.

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <gnuastro/cosmology.h>

int
main(void)
{
  double area=4.03817;          /* Area of field (arcmin^2). */
  double z, adist, tandist;     /* Temporary variables.      */

  /* Constants from Plank 2015 (Paper XIII, A&A 594, 2016) */
  double H0=67.74, olambda=0.6911, omatter=0.3089, oradiation=0;

  /* Do the same thing for all redshifts (z) between 0.1 and 5. */
  for(z=0.1; z<5; z+=0.1)
    {
      /* Calculate the angular diameter distance. */
      adist=gal_cosmology_angular_distance(z, H0, olambda,
                                           omatter, oradiation);

      /* Calculate the tangential distance of one arcsecond. */
      tandist = adist * 1000 * M_PI / 3600 / 180;

      /* Print the redshift and area. */
      printf("%-5.2f %g\n", z, pow(tandist * 60,2) * area / 1e6);
    }

  /* Tell the system that everything finished successfully. */
  return EXIT_SUCCESS;
}

To build and run this C program, you can use BuildProgram. It is designed to manage Gnuastro’s dependencies, compile the program you give it and then run it. In short, it hides all the complexities of compiling, linking and running C programs based on Gnuastro’s library. The library will later be usable higher level languages like Python or Julia, for now it is only usable by C and C++ programs.

$ astbuildprog myprogram.c

See how much faster this is compared to the shell loop we wrote above? You might have noticed that a new file called myprogram is also created in the directory. This is the compiled program that was created and run by the command above (its in binary machine code, not human-readable any more). You can run it again to get the same results with a command like this:

$ ./myprogram

The efficiency of myprogram compared to CosmicCalculator is because the requested processing is faster/comparable to the overheads. For other programs that take large input datasets (images for example), the overhead is usually negligible compared to the processing. In such cases, the libraries are only useful if you want a different/new processing compared to the functionalities in Gnuastro’s existing programs.

Gnuastro has a large library which all the programs use for various steps their processing. For the full list of available functions classified by context, please see Gnuastro library. Gnuastro’s library and BuildProgram are created to make it easy for you to use these powerful features and get to your scientific results as efficiently (fast) and accurately as possible. Several other complete working examples (involving images and tables) of Gnuastro’s libraries can be see in Library demo programs. We’ll stop with the libraries at this point in this tutorial and get back to Gnuastro’s already built programs which ready to be used on the command-line.

None of Gnuastro’s programs keep a default value internally within their code. However, when you ran CosmicCalculator with the -z2 option above, it completed its processing and printed results. So where did the “default” cosmological parameter values (like the matter density and etc) come from? The values come from the command-line or a configuration file (see Configuration file precedence).

CosmicCalculator has a very limited set of parameters and doesn’t need any particular file inputs. Therefore, we’ll use it to discuss configuration files which are an important part of all Gnuastro’s programs (see Configuration files).

Once you get comfortable with configuration files, you can easily do the same for the options of all Gnuastro programs. For example, NoiseChisel has the largest number of options in the programs. Therefore configuration files will be very useful for it when you use different datasets (with different noise properties or in different research contexts). The configuration of each program (besides its version) is vital for the reproducibility of your results, so it is very important to manage them properly.

As we saw above, the full list of the options in all Gnuastro programs can be seen with the --help option. Try calling it with CosmicCalculator as shown below. Note how options are grouped by context to make it easier to find your desired option. However, in each group, options are ordered alphabetically.

$ astcosmiccal --help

The options that need a value have an = sign after their long version and FLT, INT or STR for floating point numbers, integer numbers and strings (filenames for example) respectively. All options have a long format and some have a short format (a single character), for more see Options.

When you are using a program, it is often necessary to check the value the option has just before the program starts its processing. In other words, after it has parsed the command-line options and all configuration files. You can see the values of all options that need one with the --printparams or -P option that is common to all programs (see Common options). In the command below, try replacing -P with --printparams to see how both do the same operation.

$ astcosmiccal -P

Let’s say you want a different Hubble constant. Try running the following command to see how the Hubble constant in the output of the command above has changed. Afterwards, delete the -P and add a -z2 to see the results with the new cosmology (or configuration).

$ astcosmiccal -P --H0=70

From the output of the --help option, note how the option for Hubble constant has both short (-H) and long (--H0) formats. One final note is that the equal (=) sign is not mandatory. In the short format, the value can stick to the actual option (the short option name is just one character after-all and thus easily identifiable) and in the long format, a white-space character is also enough.

$ astcosmiccal -H70    -z2
$ astcosmiccal --H0 70 -z2 --arcsectandisk

Let’s assume that in one project, you want to only use rounded cosmological parameters (H0 of 70km/s/Mpc and matter density of 0.3). You should therefore run CosmicCalculator like this:

$ astcosmiccal --H0=70 --olambda=0.7 --omatter=0.3 -z2

But having to type these extra options every time you run CosmicCalculator will be prone to errors (typos in particular) and also will be frustrating and slow. Therefore in Gnuastro, you can put all the options and their values in a “Configuration file” and tell the programs to read the option values from there.

Let’s create a configuration file. In your favorite text editor, make a file named my-cosmology.conf (or my-cosmology.txt, the suffix doesn’t matter) which contains the following lines. One space between the option value and name is enough, the values are just under each other to help in readability. Also note that you can only use long option names in configuration files.

H0       70
olambda  0.7
omatter  0.3

You can now tell CosmicCalculator to read this file for option values immediately using the --config option as shown below. Do you see how the output of the following command corresponds to the option values in my-cosmology.txt (previous command)?

$ astcosmiccal --config=my-cosmology.conf -z2

If you need this cosmology every time you are working in a specific directory, you can benefit from Gnuastro’s default configuration files to avoid having to call the --config option. Let’s assume that you want any CosmicCalculator call you make in the my-cosmology directory to use these parameters. You just have to copy the above configuration file into a special directory and file:

$ mkdir my-cosmology
$ mkdir my-cosmology/.gnuastro
$ mv my-cosmology.conf my-cosmology/.gnuastro/astcosmiccal.conf

Once you run CosmicCalculator within my-cosmology as shown below, you will see how your cosmology has been implemented without having to type anything extra on the command-line.

$ cd my-cosmology
$ astcosmiccal -z2
$ cd ..

To further simplify the process, you can use the --setdirconf option. If you are already in your desired directory, calling this option with the others will automatically write the final values (along with descriptions) in .gnuastro/astcosmiccal.conf. For example the commands below will make the same configuration file automatically (with one extra call to CosmicCalculator).

$ mkdir my-cosmology2
$ cd my-cosmology2
$ astcosmiccal --H0 70 --olambda=0.7 --omatter=0.3 --setdirconf
$ astcosmiccal -z2
$ cd ..

Gnuastro’s programs also have default configuration files for a specific user (when run in any directory). This allows you to set a special behavior every time a program is run by a specific user. Only the directory and filename differ from the above, the rest of the process is similar to before. Finally, there are also system-wide configuration files that can be used to define the option values for all users on a system. See Configuration file precedence for a more detailed discussion.

We are now ready to start processing the downloaded images. Since these datasets are already aligned, you don’t need to align them to make sure the pixel grid covers the same region in all inputs. Gnuastro’s Warp program has features for such pixel-grid warping (see Warp). Therefore, just for a demonstration, let’s assume one image needs to be rotated by 20 degrees to correspond to the other. To do that, you can run the following command:

$ astwarp flat-ir/xdf-f160w.fits --rotate=20

Open the output and see the output. If your final image is already aligned with RA and Dec, you can simply use the --align option and let Warp calculate the necessary rotation.

Warp can generally be used for any kind of pixel grid manipulation (warping). For example the outputs of the commands below will respectively have larger pixels (new resolution being one quarter the original resolution), get shifted by 2.8 (by sub-pixel), get a shear of 2, and be tilted (projected). After running each, please open the output file and see the effect.

$ astwarp flat-ir/xdf-f160w.fits --scale=0.25
$ astwarp flat-ir/xdf-f160w.fits --translate=2.8
$ astwarp flat-ir/xdf-f160w.fits --shear=2
$ astwarp flat-ir/xdf-f160w.fits --project=0.001,0.0005

You can also combine multiple warps in one command. For example to first rotate the image, then scale it, run this command:

$ astwarp flat-ir/xdf-f160w.fits --rotate=20 --scale=0.25

If you have multiple warps, do them all in one command. Don’t warp them in separate commands because the correlated noise will become too strong. As you see in the matrix that is printed when you run Warp, it merges all the warps into a single warping matrix (see Warping basics and Merging multiple warpings) and simply applies that just once. Recall that since this is done through matrix multiplication, order matters in the separate operations. In fact through Warp’s --matrix option, you can directly request your desired final warp and don’t have to break it up into different warps like above (see Invoking Warp).

Fortunately these datasets are already aligned to the same pixel grid, so you don’t actually need the files that were just generated. You can safely delete them all with the following command. Here, you see why we put the processed outputs that we need later into a separate directory. In this way, the top directory can be used for temporary files for testing that you can simply delete with a generic command like below.

$ rm *.fits

To detect the objects in the image, we’ll run NoiseChisel:

$ mkdir noisechisel
$ astnoisechisel flat-ir/xdf-f160w.fits -onoisechisel/xdf-f160w.fits

Read what NoiseChisel writes on the command-line. The curious thing you will notice is that while there were more than 3000 pseudo detections to find the pseudo-detection S/N, but there were only slightly more than 100 clumps to find the false clump S/N. We will see what caused this after a short review on the output of NoiseChisel.

NoiseChisel’s output is a single file containing multiple extensions. You can get basic information about the extensions in a FITS file with Gnuastro’s Fits program (see Fits) as shown below. It contains 6 extensions and the first (counted as zero) is blank (has no data).

$ astfits noisechisel/xdf-f160w.fits

NoiseChisel puts some general information on its outputs in the FITS header of the respective extension. To see the full list of keywords, you can again use the Fits program like above, but also give it your desired extension/HDU. You can also give the extension number (as listed in the output above), for example -h2 instead of -hOBJECTS.

$ astfits noisechisel/xdf-f160w.fits -hOBJECTS

The NUMLABS keyword in NoiseChisel’s OBJECTS extension contains the number of objects that was found by NoiseChisel in that run. Try to visually find it in the header keywords you saw above.

To simplify the process, you can pipe the output of the command above into grep (a program for matching lines which is available on almost all Unix-like operating systems).

$ astfits noisechisel/xdf-f160w.fits -hOBJECTS | grep NUMLABS

If you just want the value of the keyword and not the full FITS keyword line, you can use AWK. In the example below, AWK will print the third word (separated by white space characters) in any line that has a first column value of NUMLABS. You can also try this for the third HDU (called CLUMPS) to see the number of clumps.

$ astfits noisechisel/xdf-f160w.fits -h2                             \
          | awk '$1=="NUMLABS" {print $3}'

Grep and AWK are simple, but very powerful command-line software for processing text files. Learning them properly can greatly simplify your processing, while improve your creativity, productivity and speed. When you get time, it is highly recommended to master them. The most common implementation of both is from GNU. Like almost all GNU software, both GNU Grep and GNU AWK have wonderful manuals which come with the program for free. You don’t have to read the whole manual at once, they both start with great generic introductions to get you going fast. As described above, you can read both manuals or refresh your memory on your command-line with these commands:

$ info awk
$ info grep

You can now open NoiseChisel’s output with SAO DS9 and visually inspect it. Just note that since there are multiple extensions, the easiest way to view the whole file is to open it as a “Multi-extension data cube” with the -mecube option as shown below. If you use GNOME (another GNU software, most common graphic user interface in GNU/Linux operating systems), please see Viewing multiextension FITS images to open DS9 in multi-extension cube mode by default when using the GUI (double clicking on the file).

$ ds9 -mecube noisechisel/xdf-f160w.fits -zscale -zoom to fit

Using Gnuastro’s Fits program, you can also copy a HDU from one FITS file to another (for more, see HDU manipulation). So if you want to have a FITS file with only the detected objects, you can run this command:

$ astfits noisechisel/xdf-f160w.fits --copy=OBJECTS -oobjects.fits

One good way to see if you have missed any signal is to mask all the detected pixels and inspect the noise pixels. For this, you can use Gnuastro’s Arithmetic program (in particular its where operator as shown below, see Arithmetic operators). With this command, all input pixels that have a value larger than zero in the OBJECTS extension will be set to NaN in the output (written in det-masked.fits). If you change the gt (for “greater than”) operator to eq (for “equal”), all the un-detected (sky) pixels will be masked and you can see the detections.

$ astarithmetic noisechisel/xdf-f160w.fits                         \
                noisechisel/xdf-f160w.fits 0 gt nan where -h1 -h2  \
                --output=nc-masked.fits

In some cases, you might want to use a different kernel with NoiseChisel (not the default one). To do that, you can use MakeProfiles (see MakeProfiles) in the following manner to build a 2D Gaussian kernel with a FWHM of 3 pixels that extends 5 times the FWHM. This new kernel can later be fed into NoiseChisel with the --kernel option.

$ astmkprof --kernel=gaussian,3,5 --oversample=1 -okernel-g-3-5.fits
$ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel-g-3-5.fits   \
                 --output=nc-my-kernel.fits

NoiseChisel can produce “Check images” to help you visualize how each step is completed. You can see all the check images it can produce with this command.

$ astnoisechisel --help | grep check

The check images are also multi-extension FITS files. After each check image is produced, open it with ds9 like NoiseChisel’s output above and flip through the extensions to see each processing in detail. It is strongly encouraged to play with the different parameters and use the respective check images to see which step is affected by your change. The three most useful check images are --checkqthresh, --checkdetection, and --checksegmentation.

We can now get back to the curious situation we noticed after running NoiseChisel: the number of false clumps to find an S/N limit was very small (given the extent of this image). This is bad, because we use quantiles in NoiseChisel and such a small number will result in a relatively large scatter. Since this is a segmentation issue, let’s see why this happens with --checksegmentation.

$ astnoisechisel flat-ir/xdf-f160w.fits --checksegmentation

To help you get a result faster, when check images are requested, NoiseChisel doesn’t finish its processing (unless you also call --continueaftercheck). NoiseChisel aborts with an explanation of why it stopped without finishing and the file name of the check image that it produced. The first five extensions are: the input image, the convolved image, the initially labeled detected regions, all the sky region (false) clumps, and those that will be used for S/N. The sky clumps are found over NoiseChisel’s “large tiles” independently. When inspecting the fourth extension of the check image, it is interesting how NoiseChisel has ignored most large tiles and only used the few that we see, mostly on the edge.

The reason that almost all internal large tiles are ignored is that galaxies are extended and this is a very deep (and small) image. Thanks to the PSF (see Point Spread Function), no object will have a sharp truncation. We have not seen any abrupt break in the light profile of any galaxy: galaxies are always larger when we get deeper datasets. Therefore, in a noisy image, some light will always be left un-detected. To be less affected by this un-detected light, NoiseChisel has the --minskyfrac option (see General NoiseChisel options): any tile that has a larger fraction of detected pixels will be ignored. So let’s see what the default value of this option is (recall that with -P, you can list all the options with their values in Gnuastro’s programs):

$ astnoisechisel -P | grep minskyfrac

Try decreasing this value and re-running NoiseChisel to see the effect on the fraction of large tiles that will be used for finding false/sky clumps. Play a little with this parameter (give it different values and check the result).

$ astnoisechisel flat-ir/xdf-f160w.fits --minskyfrac=0.5           \
                 --checksegmentation

The smaller the value to --minskyfrac, the more probable that un-detected light in the wings of galaxies will bias/affect the derived false clump S/N. So it is always important to find a good balance between a larger --minskyfrac while having a sufficient number of resulting clumps (to avoid scatter).

NoiseChisel doesn’t just find the labeled pixels, it also finds the Sky value and the Sky standard deviation in the final two extensions of its output. To generate a catalog of the colors, we will be using the NoiseChisel labeled image from the F160W image. But the Sky and Sky standard deviation values for each different filter will also be necessary. So we’ll run NoiseChisel with our finalized parameters value on both filters (you can also put this option’s value in a configuration file to avoid repeating it).

$ astnoisechisel flat-ir/xdf-f105w.fits -onoisechisel/xdf-f105w.fits \
                 --minskyfrac=0.5
$ astnoisechisel flat-ir/xdf-f160w.fits -onoisechisel/xdf-f160w.fits \
                 --minskyfrac=0.5

Now, we are ready to make a catalog. We want the object and clump labels from the F160W image. But the input, Sky and Sky standard deviation images should come from each filter. So, we’ll run MakeCatalog on NoiseChisel’s output differently for each filter. When making the F105W catalog, we’ll use the --objectsfile and --clumpsfile options to tell MakeCatalog to read the object and clump labels from the F160W NoiseChisel output. When these options aren’t given, MakeCatalog will look into the same input file for object and clump labels.

For both filters, we’ll ask for the ID, RA, Dec, Magnitude and signal-to-noise ratio (see Quantifying measurement limits). To see a list of all the parameters that MakeCatalog can measure for you, run it with --help option.

$ mkdir catalog

$ astmkcatalog noisechisel/xdf-f160w.fits --zeropoint=25.94     \
               --ids --ra --dec --magnitude --sn                \
               --output=catalog/xdf-f160w.fits

$ astmkcatalog noisechisel/xdf-f105w.fits --zeropoint=26.27     \
               --objectsfile=noisechisel/xdf-f160w.fits         \
               --clumpsfile=noisechisel/xdf-f160w.fits          \
               --ids --ra --dec --magnitude --sn                \
               --output=catalog/xdf-f105w.fits

MakeCatalog can also produce catalogs in plain text format. Please run the MakeCatalog commands above again and replace the .fits suffix, in the value to --output, with .txt (we will also be using the text file outputs later).

When a clumps image is also given26, like this example, two catalogs will be made. If you asked for a plain text file output, two files will be made with the _c.txt and _o.txt suffixes. If MakeCatalog’s output is requested to be FITS, the two catalogs will be in separate extensions of a single file. You can inspect the separate extensions with the Fits program like before (as shown below). Afterwards, you can inspect the table in each extension with Gnuastro’s Table program (see Table) as shown below.

$ astfits catalog/xdf-f105w.fits             # Extension information
$ asttable catalog/xdf-f105w.fits -h1 --info # Objects catalog info.
$ asttable catalog/xdf-f105w.fits -h1        # Objects catalog columns.
$ asttable catalog/xdf-f105w.fits -h2 -i     # Clumps catalog info.
$ asttable catalog/xdf-f105w.fits -h2        # Clumps catalog columns.

As you see above, to see the column contents of each table, you can just remove the --info (or -i) option. If you want to print the contents of special column(s), just specify the column number(s) (counting from 1, same as output of the command above) or the column name(s) (if they have one). For example, if you just want the objects and clumps magnitude and signal-to-noise ratio in the F160W filter. You can get it with the following commands.

$ asttable catalog/xdf-f160w.fits -h1 -cMAGNITUDE -cSN
$ asttable catalog/xdf-f160w.fits -h2 -cMAGNITUDE -cSN

The clumps catalog has two ID columns (one for the over-all clump ID and one for the ID of the clump in its host object), the magnitude column numbers differ between the object and clumps catalog. So if you want to specify the columns by number, you will need to change the numbers when viewing the clump and objects catalogs. This is a useful advantage of having/using column names.

$ asttable catalog/xdf-f160w.fits -h1 -c4 -c5
$ asttable catalog/xdf-f160w.fits -h2 -c5 -c6

Finally, the comments in MakeCatalog’s output (in FITS headers or lines starting with # in plain text) contain some important information about the dataset that can be useful. Open catalog/xdf-f160w_o.txt in a text editor to see them.

Since we used the same labeled image on both filters, the number of rows in both catalogs are the same. So, let’s measure the colors of the objects in this image. We’ll merge the two clump catalogs together into one using the paste program on the command-line. The output file will have each line of both catalogs merged into a single line.

$ paste catalog/xdf-f160w_c.txt catalog/xdf-f105w_c.txt           \
        > xdf-f160w-f105w_c_p.txt

Open xdf-f160w-f105w_c_p.txt after making it to see how paste has operated. We can now use AWK to find the colors. We’ll ask AWK to only use lines that don’t start with # and don’t have a NaN magnitude in the 9th column (F105W magnitude27). We will also ignore columns which don’t have reliable F105W magnitudes (with a S/N less than 728). For the other lines, AWK will print the ID, positional columns and the difference between the respective magnitude columns.

$ awk '!/^#/ && $11!="nan" && $12>7  {print $1, $2, $3, $4, $11-$5}' \
      xdf-f160w-f105w_c_p.txt > catalog/xdf-f105w-f160w_c.txt

Gnuastro has a simple program for basic statistical analysis. The command below will print some basic information about the distribution (minimum, maximum, median and etc), along with a cute little ASCII histogram to visually help you understand the distribution on the command-line without the need for a graphic user interface (see Invoking Statistics). This ASCII histogram can be useful when you just want some coarse and general information on the input dataset. It is also useful when working on a server (where you may not have graphic user interface), and finally, its fast.

$ aststatistics catalog/xdf-f105w-f160w_c.txt -c5

You can later use Gnuastro’s Statistics program with the --histogram option to build a much more fine-grained histogram as a table to feed into your favorite plotting program for a much more accurate/appealing plot. If you just want a specific measure, for example the mean, median and standard deviation, you can ask for them specifically:

$ aststatistics catalog/xdf-f105w-f160w_c.txt -c5 --mean --median --std

Some researchers prefer to have colors in a fixed aperture for all the objects. The colors we calculated above used a different segmentation map for each object. This might not satisfy some science cases. To make a fixed aperture catalog, we should make a labeled image which has a fixed label for each aperture. That labeled image can be given to MakeCatalog instead of NoiseChisel’s labeled detection image.

We’ll use the objects catalog in the F160W catalog we generated before for the positions and set the other parameters of each profile to be a fixed circle of radius 5 pixels (we want all apertures to be fixed after all). AWK is a wonderful tool for such jobs as the command below shows.

$ awk '!/^#/{print $1, $2, $3, 5, 5, 0, 0, 1, $1, 1}'                \
      catalog/xdf-f160w_c.txt > catalog/apertures.txt

We can now feed this catalog into MakeProfiles to build the apertures for us. See Invoking MakeProfiles for a description of the options. The most important for this particular job is --mforflatpix, it tells MakeProfiles that the values in the magnitude column should be used for each pixel of a flat profile. Without it, MakeProfiles would build the profiles such that the sum of the pixels of each profile would have a magnitude (in log-scale) of the value given in that column (what you would expect when simulating a galaxy for example).

$ astmkprof catalog/apertures.txt --background=flat-ir/xdf-f160w.fits \
            --clearcanvas --replace --type=int16 --mforflatpix        \
            --mode=wcs

The first thing you might notice in the printed information is that the profiles are not built in order. This is because MakeProfiles works in parallel and parallel CPU operations are asynchronous. Without --replace, the output is the same in any case. You can try running MakeProfiles with one thread (using --numthreads=1 to see how order is respected in that case.

Open the output apertures.fits file and see the result. Where the apertures overlap, you will notice that one label has replaced the other (because of the --replace option). In the future, MakeCatalog will be able to work with overlapping labels, but currently it doesn’t. If you are interested, please join us in completing Gnuastro with added improvements like this (see task 14750 29).

apertures.fits labeled image can now be fed input into MakeCatalog. Similar to how we used the F160W labels for the F105W catalog: the labels don’t have to be produced by NoiseChisel. In comparison with the previous MakeCatalog call, notice how 1) we have no more clumps image, 2) that we have set --objectshdu=1 (since apertures.fits is not the output of NoiseChisel where the labeled image is in the third extension), and 3) that we are now using --objid instead of --ids to avoid warnings that some ID columns are only for clumps.

$ astmkcatalog noisechisel/xdf-f105w.fits --zeropoint=26.27        \
               --objectsfile=apertures.fits --objectshdu=1         \
               --objid --ra --dec --magnitude --sn                 \
               --output=catalog/xdf-f105w-aper.fits

Change the filter name and zeropoint magnitudes and run this command again to have the fixed aperture magnitude in the F160W filter also. From this point on, you can follow the previous steps to derive the color in a fixed aperture.

We will now find some of the objects with the strongest color difference and make a cutout to inspect them visually: let’s see what the objects with a color more than two magnitudes look like.

We’ll use the catalog/xdf-f105w-f160w_c.txt file that we produced above. With the command below, all lines with a color value more than 2 will be put in reddest.txt and inspect it using cat.

$ awk '$5>1' catalog/xdf-f105w-f160w_c.txt > reddest.txt
$ cat reddest.txt

We can now feed reddest.txt into Gnuastro’s crop to see what these objects look like. To keep things clean, we’ll make a directory called reddest and ask Crop to save the crops in this directory. We’ll also add a -f160w.fits suffix to the crops.

$ mkdir crop
$ astcrop --mode=wcs --coordcol=3 --coordcol=4 flat-ir/xdf-f160w.fits  \
          --catalog=reddest.txt --width=15/3600,15/3600                \
          --suffix=-f160w.fits --output=crop

Like the MakeProfiles command above, you might notice that the crops aren’t made in order. Since each crop is independent of the rest, the crops are done in parallel and parallel operations are asynchronous. In the command above, change f160w to f105w to make the crops in both filters.

To view the crops more easily (not having to open ds9 for each image), you can convert the FITS crops into the JPEG format.

$ for f in *.fits; do                                                  \
    astconvertt $f --fluxlow=-0.001 --fluxhigh=0.005 --invert -ojpg;   \
  done

The loop above is in series: each file is processed only after the previous ones are complete. If you have GNU Parallel, you can greatly speed up this conversion. GNU Parallel will run the separate commands simultaneously on different CPU threads in parallel. For more information on efficiently using your threads, see Multi-threaded operations.

$ parallel astconvertt --fluxlow=-0.001 --fluxhigh=0.005 --invert      \
           -ojpg ::: *.fits

You can now easily use your general GUI image viewer to flip through the images more easily. On GNOME, you can use the “Eye of GNOME” image viewer (with executable name of eog). Run the command below and by pressing the <SPACE> key, you can flip through the images and compare them visually more easily. Of course, the flux ranges have been chosen generically here for seeing the fainter parts. Therefore, brighter objects will be fully black.

$ eog 1-f105w.jpg

Another thing that is commonly needed is to visually mark these objects on the image. DS9 has “Region”s for this purpose. You just have to convert your catalog into a “region file” to feed into DS9. To do that, you can use AWK again as shown below.

$ cd ..
$ awk 'BEGIN{print "# Region file format: DS9 version 4.1";     \
             print "global color=green width=2";                \
             print "fk5";}                                      \
       {printf "circle(%s,%s,1\")\n", $3, $4;}' reddest.txt     \
       > reddest.reg

This region file can be loaded into DS9 with its -regions option as shown below (see above for the other options):

$ ds9 -mecube noisechisel/xdf-f160w.fits -zscale -zoom to fit    \
      -regions load all reddest.reg

Finally, if this book or any of the programs in Gnuastro have been useful for your research, please cite the respective papers. All Gnuastro programs have a --cite option to help you cite the authors’ work more easily. For example:

$ astmkcatalog --cite

Footnotes

(21)

See SAO ds9, available at http://ds9.si.edu/site/Home.html.

(22)

https://www.gnu.org/software/wget.

(23)

https://www.gnu.org/software/gawk.

(24)

GNU Info is already available on almost all Unix-like operating systems.

(25)

Note that you only have one port to the internet, so downloading in parallel will actually be slower than downloading in series.

(26)

MakeCatalog will look at the WCLUMPS keyword in the objects image to see if it should also use a clumps image or not.

(27)

Recall that the objects and clumps labels were made on the F160W image. On the F105W image, there might not be enough signal, so random scatter may give a negative total brightness and thus a NaN magnitude.

(28)

The value of 7 is taken from the clump S/N threshold in F160W (where the clumps were defined).

(29)

https://savannah.gnu.org/task/index.php?14750


Previous: , Up: Tutorials   [Contents][Index]