Next: Gawk and MPFR, Previous: General Arithmetic, Up: Arbitrary Precision Arithmetic [Contents][Index]

Numerical programming is an extensive area; if you need to develop
sophisticated numerical algorithms then `gawk`

may not be
the ideal tool, and this documentation may not be sufficient.
It might require digesting a book or two^{91}
to really internalize how to compute
with ideal accuracy and precision,
and the result often depends on the particular application.

NOTE:A floating-point calculation’saccuracyis how close it comes to the real value. This is as opposed to theprecision, which usually refers to the number of bits used to represent the number (see the Wikipedia article for more information).

There are two options for doing floating-point calculations:
hardware floating-point (as used by standard `awk`

and
the default for `gawk`

), and *arbitrary-precision*
floating-point, which is software based.
From this point forward, this chapter
aims to provide enough information to understand both, and then
will focus on `gawk`

’s facilities for the latter.^{92}

Binary floating-point representations and arithmetic are inexact. Simple values like 0.1 cannot be precisely represented using binary floating-point numbers, and the limited precision of floating-point numbers means that slight changes in the order of operations or the precision of intermediate storage can change the result. To make matters worse, with arbitrary precision floating-point, you can set the precision before starting a computation, but then you cannot be sure of the number of significant decimal places in the final result.

Sometimes, before you start to write any code, you should think more about what you really want and what’s really happening. Consider the two numbers in the following example:

x = 0.875 # 1/2 + 1/4 + 1/8 y = 0.425

Unlike the number in `y`

, the number stored in `x`

is exactly representable
in binary since it can be written as a finite sum of one or
more fractions whose denominators are all powers of two.
When `gawk`

reads a floating-point number from
program source, it automatically rounds that number to whatever
precision your machine supports. If you try to print the numeric
content of a variable using an output format string of `"%.17g"`

,
it may not produce the same number as you assigned to it:

$gawk 'BEGIN { x = 0.875; y = 0.425>printf("%0.17g, %0.17g\n", x, y) }'-| 0.875, 0.42499999999999999

Often the error is so small you do not even notice it, and if you do,
you can always specify how much precision you would like in your output.
Usually this is a format string like `"%.15g"`

, which when
used in the previous example, produces an output identical to the input.

Because the underlying representation can be a little bit off from the exact value, comparing floating-point values to see if they are equal is generally not a good idea. Here is an example where it does not work like you expect:

$gawk 'BEGIN { print (0.1 + 12.2 == 12.3) }'-| 0

The loss of accuracy during a single computation with floating-point numbers
usually isn’t enough to worry about. However, if you compute a value
which is the result of a sequence of floating point operations,
the error can accumulate and greatly affect the computation itself.
Here is an attempt to compute the value of the constant
*pi* using one of its many series representations:

BEGIN { x = 1.0 / sqrt(3.0) n = 6 for (i = 1; i < 30; i++) { n = n * 2.0 x = (sqrt(x * x + 1) - 1) / x printf("%.15f\n", n * x) } }

When run, the early errors propagating through later computations cause the loop to terminate prematurely after an attempt to divide by zero.

$gawk -f pi.awk-| 3.215390309173475 -| 3.159659942097510 -| 3.146086215131467 -| 3.142714599645573 … -| 3.224515243534819 -| 2.791117213058638 -| 0.000000000000000 error→ gawk: pi.awk:6: fatal: division by zero attempted

Here is an additional example where the inaccuracies in internal representations yield an unexpected result:

$gawk 'BEGIN {>for (d = 1.1; d <= 1.5; d += 0.1) # loop five times (?)>i++>print i>}'-| 4

Can computation using arbitrary precision help with the previous examples? If you are impatient to know, see Exact Arithmetic.

Instead of arbitrary precision floating-point arithmetic,
often all you need is an adjustment of your logic
or a different order for the operations in your calculation.
The stability and the accuracy of the computation of the constant *pi*
in the earlier example can be enhanced by using the following
simple algebraic transformation:

(sqrt(x * x + 1) - 1) / x = x / (sqrt(x * x + 1) + 1)

After making this, change the program does converge to
*pi* in under 30 iterations:

$gawk -f pi2.awk-| 3.215390309173473 -| 3.159659942097501 -| 3.146086215131436 -| 3.142714599645370 -| 3.141873049979825 … -| 3.141592653589797 -| 3.141592653589797

There is no need to be unduly suspicious about the results from floating-point arithmetic. The lesson to remember is that floating-point arithmetic is always more complex than arithmetic using pencil and paper. In order to take advantage of the power of computer floating-point, you need to know its limitations and work within them. For most casual use of floating-point arithmetic, you will often get the expected result in the end if you simply round the display of your final results to the correct number of significant decimal digits.

As general advice, avoid presenting numerical data in a manner that implies better precision than is actually the case.

• Floating-point Representation: | Binary floating-point representation. | |

• Floating-point Context: | Floating-point context. | |

• Rounding Mode: | Floating-point rounding mode. |

One recommended title is Numerical Computing with IEEE Floating Point Arithmetic, Michael L. Overton, Society for Industrial and Applied Mathematics, 2004. ISBN: 0-89871-482-6, ISBN-13: 978-0-89871-482-1. See http://www.cs.nyu.edu/cs/faculty/overton/book.

If you
are interested in other tools that perform arbitrary precision arithmetic,
you may want to investigate the POSIX `bc`

tool. See
the POSIX specification for it, for more information.