## GNU Astronomy Utilities

#### 6.2.2 Integer benefits and pitfalls

Integers are the simplest numerical data types (see Numeric data types). Because of this, they need much less storage space and are processed much faster than floating point types. You can confirm this on your computer with the series of commands below. You will make four 5000 by 5000 pixel images filled with random values. Two of them will be saved as signed 8-bit integers, and two as 64-bit floating point types. The last command prints the size of the created images.

    $ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma int8    -oint-1.fits
    $ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma int8    -oint-2.fits
    $ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma float64 -oflt-1.fits
    $ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma float64 -oflt-2.fits
    $ ls -lh int-*.fits flt-*.fits

The 8-bit integer images are only 24 MB, while the 64-bit floating point images are 191 MB! Besides helping in storage (on your disk, or in RAM, while the program is running), the small size of these files also helps in faster reading of the inputs. Furthermore, CPUs can process integer operations much faster than floating point operations. Among the integer types, those with a smaller width (number of bits) can be processed much faster. You can see this with the two commands below, where you will add the integer images with each other and the floats with each other:

    $ astarithmetic flt-1.fits flt-2.fits + -oflt-sum.fits -g1
    $ astarithmetic int-1.fits int-2.fits + -oint-sum.fits -g1

Have a look at the running time of the two commands above (it is printed on their last line). On the system that this paragraph was written on, the floating point and integer image sums were respectively done in 0.481 and 0.089 seconds (the integer operation was almost 5 times faster!).

If your data does not have decimal points, use integer types: integer types are much faster and can take much less space in your storage or RAM (while the program is running).

Select the smallest width that can host the range/precision of values: for example, if the largest possible value in your dataset is 1000 and all numbers are integers, store it as a 16-bit integer. Also, if you know the values can never become negative, store it as an unsigned 16-bit integer. For floating point types, if you know you will not need a precision of more than 6 significant digits, use the 32-bit floating point type. For more on the range (for integers) and precision (for floats), see Numeric data types.

There is a price to be paid for this improved efficiency in integers: your wisdom! If you have not selected your types wisely, strange situations may happen. For example, try the command below:

    $ astarithmetic 125 10 +

You expect the output to be $$135$$, but it will be $$-121$$! The reason is that when Arithmetic (or column-arithmetic in Table) confronts a number on the command-line, it uses the principles above to select the most efficient type for each number. Both $$125$$ and $$10$$ can safely fit within a signed, 8-bit integer type, so Arithmetic will store both as 8-bit integers. However, the sum ($$135$$) is larger than the maximum possible value of an 8-bit signed integer ($$127$$). Therefore an integer overflow will occur and the value will wrap around to the negative end of the range. As a result, the value will be $$135-128=7$$ more than the minimum value of this type ($$-128$$), which is $$-128+7=-121$$.
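The wrap-around described above can be reproduced with plain shell arithmetic. The sketch below is only an illustration: `wrap_int8` is a hypothetical helper (it is not part of Gnuastro), it assumes two's-complement storage, and it mimics the resulting value rather than how Arithmetic works internally.

```shell
# wrap_int8 VALUE: reinterpret an integer as a signed 8-bit value.
# Hypothetical helper for illustration only (not a Gnuastro command).
wrap_int8() {
    # Keep only the lowest 8 bits; the result is in the range 0..255.
    low=$(( $1 & 255 ))
    # Bit patterns 128..255 stand for the negative values -128..-1.
    if [ "$low" -ge 128 ]; then
        echo $(( low - 256 ))
    else
        echo "$low"
    fi
}

wrap_int8 $(( 125 + 10 ))   # prints -121 (overflow)
wrap_int8 $(( 100 + 10 ))   # prints 110  (fits, no overflow)
```

Any sum that stays within $$-128$$ to $$127$$ is returned unchanged, so only results outside that range are affected.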

When you know situations like this may occur, you can simply use the Numerical type conversion operators to set just one of the inputs to a wider data type (choose the smallest type that is wide enough, to avoid wasting resources). In the example above, this would be uint16:

    $ astarithmetic 125 uint16 10 +

This works because $$125$$ is now converted into an unsigned 16-bit integer before the + operator. Since this is wider than an 8-bit integer, the C programming language’s automatic type conversion will treat both as the wider type and store the result of the binary operation (+) in that type.

For such a basic operation like the command above, a quicker hack (less to type) would be either of the two commands below (which are equivalent). This is because 125.0 or 125. are interpreted as floating-point types and they do not suffer from such issues (converting only one input is enough):

    $ astarithmetic 125.  10 +
    $ astarithmetic 125.0 10 +

For this particular command, the fix above will be as fast as the uint16 solution. This is because there are only two numbers, and the overhead of Arithmetic (reading configuration files, etc.) dominates the running time. However, for large datasets, the uint16 solution will be faster (as you saw above), Arithmetic will consume less RAM while running, and the output will consume less storage in your system (all major benefits)!
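To get a feeling for the RAM and storage difference between types, the raw size of the pixel array can be estimated with shell arithmetic. This is a rough sketch: `pixel_bytes` and `to_mib` are hypothetical helpers (not Gnuastro commands), and the estimate ignores the FITS headers and padding, which only add a few kilobytes to an image of this size.

```shell
# pixel_bytes WIDTH HEIGHT BYTES_PER_PIXEL: raw size of the pixel array.
# Hypothetical helper; headers and padding of the file are ignored.
pixel_bytes() {
    echo $(( $1 * $2 * $3 ))
}

# to_mib BYTES: size in MiB (1048576 bytes), rounded up to a whole number.
to_mib() {
    echo $(( ($1 + 1048575) / 1048576 ))
}

# Bytes per pixel of each type: the bit width divided by 8.
for spec in int8:1 uint16:2 float32:4 float64:8; do
    name=${spec%%:*}
    bytes=${spec##*:}
    size=$(pixel_bytes 5000 5000 "$bytes")
    echo "$name: $size bytes (about $(to_mib "$size") MiB)"
done
```

For the 5000 by 5000 images used earlier, this gives about 24 MiB for int8 and about 191 MiB for float64, in agreement with the sizes reported above.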

It is possible to do internal checks in Gnuastro and catch integer overflows and correct them internally. However, we have not opted for this solution because all those checks will consume significant resources and slow down the program (especially with large datasets where RAM, storage and running time become important). To be optimal, we therefore trust that you (the wise Gnuastro user!) make the appropriate type conversion in your commands where necessary (recall that the operators are available in Numerical type conversion operators).
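When deciding which conversion operator to use, it can help to write down the smallest and largest values you expect and look up the narrowest integer type that hosts them. The sketch below does that lookup; `smallest_int_type` is a hypothetical helper (not part of Gnuastro) that only encodes the rule of thumb from this section. It does not describe how Arithmetic itself parses numbers, and because shell arithmetic is itself limited to signed 64-bit values, it cannot test numbers beyond that range.

```shell
# smallest_int_type MIN MAX: name of the narrowest integer type that can
# host every value from MIN to MAX. Hypothetical helper for illustration.
smallest_int_type() {
    min=$1
    max=$2
    if [ "$min" -ge 0 ]; then
        # Values are never negative: prefer the unsigned types.
        if   [ "$max" -le 255 ];        then echo uint8
        elif [ "$max" -le 65535 ];      then echo uint16
        elif [ "$max" -le 4294967295 ]; then echo uint32
        else                                 echo uint64
        fi
    else
        # Negative values are possible: a signed type is necessary.
        if   [ "$min" -ge -128 ] && [ "$max" -le 127 ]; then
            echo int8
        elif [ "$min" -ge -32768 ] && [ "$max" -le 32767 ]; then
            echo int16
        elif [ "$min" -ge -2147483648 ] && [ "$max" -le 2147483647 ]; then
            echo int32
        else
            echo int64
        fi
    fi
}

smallest_int_type 0 1000    # prints uint16
smallest_int_type -5 1000   # prints int16
```

Remember to give the range of the results (for example, the largest possible sum), not only the range of the inputs, since the overflow happens in the output of the operation.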