## GNU Astronomy Utilities

#### 5.3.1 Printing floating point numbers

Many of the columns containing astronomical data will contain floating point numbers (those that aren’t an integer, like $$1.23$$ or $$4.56\times10^{-7}$$). However, printing floating point numbers in a human-readable form has some intricacies that we will explain in this section. For a basic introduction to different types of integers or floating points, see Numeric data types.

It may be tempting to simply use 64-bit floating points all the time and skip this section altogether. But keep in mind that compared to the 32-bit floating point type, a 64-bit floating point type will consume double the storage and double the RAM, and will take almost double the time for processing. So when the statistical precision of your numbers is less than that offered by 32-bit floating point precision, it is much better to store them in the 32-bit format.

Within almost all commonly used CPUs of today, numbers (including integers or floating points) are stored in binary base-2 format (where the only digits that can be used to represent the number are 0 and 1). However, we (humans) are used to numbers in base-10 (where we have 10 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9). For integers, there is a one-to-one correspondence between a base-2 and base-10 representation. Therefore, converting a base-10 integer (that you will be giving as an option value when running a Gnuastro program, for example) to base-2 (that the computer will store in memory), or vice-versa, will not cause any loss of information.

The problem is that floating point numbers don’t have such a one-to-one correspondence between the two notations. The full discussion on how floating point numbers are stored in binary format is beyond the scope of this book. But please have a look at the corresponding Wikipedia article to get a rough feeling about the complexity. Of course, if you are interested in the details, that Wikipedia article should be a good starting point for further reading.
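The difference between integers and floating points can be seen with a minimal sketch. It is written in Python purely for illustration (it is not part of Gnuastro), and the values used are arbitrary examples, not taken from any Gnuastro program.

```python
from decimal import Decimal
from fractions import Fraction

# An integer survives the base-10 -> base-2 -> base-10 round trip unchanged.
n = 1234567
assert int(bin(n), 2) == n

# 0.1 has no finite base-2 expansion, so the stored 64-bit value is only
# the closest representable number, not exactly one tenth.
stored = Decimal(0.1)                        # exact base-10 value of what is stored
is_exact = Fraction(0.1) == Fraction(1, 10)  # compare with the true fraction 1/10

print(stored)    # starts with 0.1000000000000000055...
print(is_exact)  # False
```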

The most common convention for storing floating point numbers in digital storage is the IEEE Standard for Floating-Point Arithmetic (IEEE 754). In short, the full width (in bits) assigned to that type (for example the 32 bits allocated for 32-bit floating point types) is divided into separate components: the first bit is the “sign” (specifying if the number is negative or positive). In 32-bit floats, the next 8 bits are the “exponent” and finally (again, in 32-bit floats), the “fraction” is stored in the last 23 bits. For example see this image on Wikipedia.
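To see the three components described above, the following sketch splits a 32-bit float into its sign, exponent and fraction fields. It is written in Python for illustration only; the function name `float32_fields` is a hypothetical helper made up for this example.

```python
import struct

def float32_fields(value):
    """Return the (sign, exponent, fraction) bit fields of a 32-bit float."""
    # Pack as a big-endian 32-bit float, then re-read the same 4 bytes
    # as an unsigned 32-bit integer to get at the raw bits.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31               # 1 bit
    exponent = (bits >> 23) & 0xFF  # next 8 bits (stored with a bias of 127)
    fraction = bits & 0x7FFFFF      # last 23 bits
    return sign, exponent, fraction

print(float32_fields(1.0))   # (0, 127, 0)
print(float32_fields(-2.5))  # (1, 128, 2097152)
```

The stored exponent is offset by 127, so the value 127 for $$1.0$$ corresponds to $$2^0$$.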

In IEEE 754, around zero, the base-2 and base-10 representations approximately match. However, as we go away from 0, you will lose precision. The important concept in understanding the precision of floating point numbers is “decimal digits”, or the number of digits in the number, independent of where the decimal point is. For example $$1.23$$ has three decimal digits and $$4.5678\times10^9$$ has 5 decimal digits. According to IEEE 754 (135), 32-bit and 64-bit floating point numbers can accurately (statistically) represent a floating point with 7.22 and 15.95 decimal digits respectively.
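The 7.22 and 15.95 figures can be reproduced from the number of significant bits. The sketch below (Python, for illustration only) assumes the usual IEEE 754 significand widths of 24 bits for 32-bit floats (the 23 stored fraction bits plus one implicit leading bit) and 53 bits for 64-bit floats.

```python
import math

# Each binary digit carries log10(2) ~ 0.301 decimal digits of information.
digits32 = 24 * math.log10(2)  # 23 stored fraction bits + 1 implicit bit
digits64 = 53 * math.log10(2)  # 52 stored fraction bits + 1 implicit bit

print(f"{digits32:.2f}")  # 7.22
print(f"{digits64:.2f}")  # 15.95
```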

 Should I store my columns as 32-bit or 64-bit floating point type? If your floating point numbers have 7 decimal digits or less (for example noisy image pixel values, measured star or galaxy magnitudes, and anything that is derived from them, like galaxy mass), you can safely use 32-bit precision (the statistical error on the measurements usually affects digits well before the 7th!). However, some columns require more digits, and thus 64-bit precision. For example, RA or Dec with better than one arcsecond accuracy: the degrees can have 3 digits, and 1 arcsecond is $$1/3600\sim0.0003$$ of a degree, requiring 4 more digits. You can use the Numerical type conversion operators of Column arithmetic to convert your columns to a certain type for storage.
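The effect on coordinates can be estimated with a short sketch. It is written in Python for illustration only; the RA value is an arbitrary made-up example and `to_float32` is a hypothetical helper that rounds a value to the nearest 32-bit float.

```python
import struct

def to_float32(value):
    """Round a Python (64-bit) float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]

ra = 359.123456789  # hypothetical right ascension, in degrees

# Difference between the 64-bit value and its 32-bit version, in arcseconds.
err_arcsec = abs(to_float32(ra) - ra) * 3600

print(f"32-bit rounding error: {err_arcsec:.3f} arcsec")
```

For this value the 32-bit rounding error is a few hundredths of an arcsecond: harmless when one arcsecond is enough, but significant for sub-arcsecond astrometry.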

The discussion above was for the storage of floating point numbers. When printing floating point numbers in a human-friendly format (for example, in a plain-text file or on standard output in the command-line), the computer has to convert its internal base-2 representation to a base-10 representation. This second conversion may cause a small discrepancy between the stored and printed values.
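As a minimal sketch of this discrepancy (in Python, for illustration only; the number of printed digits is an arbitrary choice and does not reflect the defaults of any Gnuastro program), a 64-bit value printed with too few digits does not read back as the same stored value, while 17 significant digits are always enough for a 64-bit float:

```python
value = 0.1 + 0.2         # a 64-bit floating point value

short = f"{value:.6f}"    # fixed notation, 6 digits after the decimal point
full = f"{value:.17g}"    # 17 significant digits

print(short, float(short) == value)  # 0.300000 False
print(full, float(full) == value)    # 0.30000000000000004 True
```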

 Use FITS tables as output of measurement programs: When you are doing a measurement to produce a catalog (for example with MakeCatalog), set the output to be a FITS table (for example --output=mycatalog.fits). A FITS binary table will store the same base-2 number that was measured by the CPU. However, if you choose to store the output table as a plain-text table, you risk losing information due to the human-friendly base-10 floating point conversion (which is necessary in a plain-text output).

To customize how columns containing floating point values are printed (in a plain-text output file, or in the standard output in your terminal), Table has four options for the two different types: --txtf32format, --txtf32precision, --txtf64format and --txtf64precision. They are fully described in Invoking Table.

 Summary: it is therefore recommended to always store your tables as FITS (binary) tables. To view the contents of the table on the command-line or to feed it to a program that doesn’t recognize FITS tables, you can use the four options above for a custom base-10 conversion that will not cause any loss of data.

#### Footnotes

##### (135)

https://en.wikipedia.org/wiki/IEEE_754#Basic_and_interchange_formats