4.3 Floats

A floating-point number or float is a number stored in scientific notation. The number of significant digits in the fractional part is governed by the current floating precision (see Precision). The range of acceptable values is from ‘10^-3999999’ (inclusive) to ‘10^4000000’ (exclusive), plus the corresponding negative values and zero.

Calculations that would exceed the allowable range of values (such as ‘exp(exp(20))’) are left in symbolic form by Calc. The messages “floating-point overflow” or “floating-point underflow” indicate that during the calculation a number would have been produced that was too large or too close to zero, respectively, to be represented by Calc. This does not necessarily mean the final result would have overflowed, just that an overflow occurred while computing the result. (In fact, it could report an underflow even though the final result would have overflowed!)

If a rational number and a float are mixed in a calculation, the result will in general be expressed as a float. Commands that require an integer value (such as k g [gcd]) will also accept integer-valued floats, i.e., floating-point numbers with nothing after the decimal point.

Floats are identified by the presence of a decimal point and/or an exponent. In general a float consists of an optional sign, digits including an optional decimal point, and an optional exponent consisting of an ‘e’, an optional sign, and up to seven exponent digits. For example, ‘23.5e-2’ is 23.5 times ten to the minus-second power, or 0.235.
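The syntax rule above can be sketched as a regular expression. This is only an illustration in Python, not Calc's actual reader: the names `FLOAT_RE` and `looks_like_float` are hypothetical, and the pattern covers only the plain decimal form described in this paragraph.

```python
import re
from decimal import Decimal

# Hypothetical pattern for the syntax described above: optional sign,
# digits with an optional decimal point, and an optional exponent of
# up to seven digits. Not taken from Calc's source.
FLOAT_RE = re.compile(r"""
    [+-]?                    # optional sign
    (?:\d+\.?\d*|\.\d+)      # digits, optionally containing a decimal point
    (?:e[+-]?\d{1,7})?       # optional exponent: 'e', sign, 1 to 7 digits
""", re.VERBOSE)

def looks_like_float(text):
    """Return True if text fits the pattern and has a point or an exponent."""
    if FLOAT_RE.fullmatch(text) is None:
        return False
    # A bare run of digits is an integer; a float needs '.' and/or 'e'.
    return "." in text or "e" in text

# The example from the text: 23.5 times ten to the power -2 is 0.235.
print(looks_like_float("23.5e-2"))                 # True
print(Decimal("23.5e-2") == Decimal("0.235"))      # True
```

Python's `decimal` module is used here only to check the arithmetic of the example exactly; it is not how Calc represents floats internally.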

Floating-point numbers are normally displayed in decimal notation with all significant figures shown. Exceedingly large or small numbers are displayed in scientific notation. Various other display options are available. See Float Formats.

Floating-point numbers are stored in decimal, not binary. The result of each operation is rounded to the nearest value representable in the number of significant digits specified by the current precision, rounding away from zero in the case of a tie. Thus (in the default display mode) what you see is exactly what you get. Some operations such as square roots and transcendental functions are performed with several digits of extra precision and then rounded back to the current precision, in an effort to make the final result accurate to the full requested precision. However, accuracy is not rigorously guaranteed. If you suspect the validity of a result, try doing the same calculation in a higher precision. The Calculator’s arithmetic is not intended to be IEEE-conformant in any way.
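The rounding rule described above (nearest representable decimal value, ties away from zero) can be imitated with Python's `decimal` module, whose `ROUND_HALF_UP` mode rounds ties away from zero. This is a sketch of the rule only, not of Calc's implementation; the helper name `round_to_precision` is hypothetical, and the default of 12 significant digits is an assumption made for the example.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

def round_to_precision(value, precision=12):
    """Round value to `precision` significant digits, ties away from zero.

    The default of 12 digits is an assumption made for this sketch.
    """
    ctx = Context(prec=precision, rounding=ROUND_HALF_UP)
    # Unary plus applies the context's precision and rounding mode.
    return ctx.plus(Decimal(value))

# Ties go away from zero for both signs.
print(round_to_precision("1.235", 3))    # 1.24
print(round_to_precision("-1.235", 3))   # -1.24
print(round_to_precision("1.234", 3))    # 1.23

# Each arithmetic result is rounded to the working precision.
ctx = Context(prec=12, rounding=ROUND_HALF_UP)
print(ctx.divide(Decimal(2), Decimal(3)))  # 0.666666666667
```

Because the digits are decimal, the printed value is the stored value; there is no hidden binary fraction behind it.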

While floats are always stored in decimal, they can be entered and displayed in any radix just like integers and fractions. Since a float that is entered in a radix other than 10 will be converted to decimal, the number that Calc stores may not be exactly the number that was entered; it will be the closest decimal approximation given the current precision. The notation ‘radix#ddd.ddd’ is a floating-point number whose digits are in the specified radix. Note that the ‘.’ is more aptly referred to as a “radix point” than as a decimal point in this case. The number ‘8#123.4567’ is defined as ‘8#1234567 * 8^-4’. If the radix is 14 or less, you can use ‘e’ notation to write a non-decimal number in scientific notation. The exponent is written in decimal, and is considered to be a power of the radix: ‘8#1234567e-4’. If the radix is 15 or above, the letter ‘e’ is a digit, so scientific notation must be written out, e.g., ‘16#123.4567*16^2’. The first two exercises of the Modes Tutorial explore some of the properties of non-decimal floats.
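The conversion described above can be worked through in a short Python sketch. It is an illustration of the definition ‘8#123.4567’ = ‘8#1234567 * 8^-4’, not Calc's own parser: the function name `radix_float_to_decimal` is hypothetical, a precision of 12 digits is assumed, and only the simple unsigned forms shown in this paragraph are handled.

```python
from decimal import Context, Decimal, ROUND_HALF_UP
from fractions import Fraction

def radix_float_to_decimal(text, precision=12):
    """Convert 'radix#ddd.ddd' (optionally with 'e' exponent) to a Decimal.

    Hypothetical helper; the 12-digit default precision is an assumption.
    """
    radix_str, digits = text.split("#", 1)
    radix = int(radix_str)
    exponent = 0
    # 'e' marks an exponent only when it cannot be a digit (radix <= 14).
    if radix <= 14 and "e" in digits:
        digits, exp_str = digits.split("e", 1)
        exponent = int(exp_str)  # written in decimal, a power of the radix
    whole, _, frac = digits.partition(".")
    # Drop the radix point and scale by radix^-(number of fraction digits).
    mantissa = int(whole + frac, radix)
    exact = Fraction(mantissa) * Fraction(radix) ** (exponent - len(frac))
    # Round the exact rational value once, to the requested precision.
    ctx = Context(prec=precision, rounding=ROUND_HALF_UP)
    return ctx.divide(Decimal(exact.numerator), Decimal(exact.denominator))

print(radix_float_to_decimal("8#123.4567"))        # 83.5915527344
print(radix_float_to_decimal("8#1234567e-4"))      # 83.5915527344
print(radix_float_to_decimal("8#123.4567", 20))    # 83.591552734375
```

The exact value of ‘8#123.4567’ is 83.591552734375, which has 14 significant digits, so at a precision of 12 the stored decimal is the nearest approximation rather than the number as entered.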