Scheme defines a “numerical tower” of numerical types:
number, complex, real, rational, and integer.
Java has primitive “unboxed” number types (such as
int
), just like C, and also has some
“wrapper” classes that are basically boxed versions
of the unboxed number types. Specifically, the standard Java
number classes are not organized in any particularly useful hierarchy,
except that they all inherit from Number
.
Kawa implements the full “tower” of Scheme number types,
using its own set of sub-classes of the abstract
class Quantity
, a sub-class of
Number
we will discuss later.
public class Complex extends Quantity
{ ...;
public abstract RealNum re();
public abstract RealNum im();
}
Complex
is the class of abstract complex numbers.
It has three subclasses: the abstract class RealNum
of real numbers; the general class CComplex
where the
components are arbitrary RealNum
fields; and the
optimized DComplex
where the components
are represented by double
fields.
public class RealNum extends Complex
{ ...;
public final RealNum re()
{ return this; }
public final RealNum im()
{ return IntNum.zero(); }
public abstract boolean isNegative();
}
public class DFloNum extends RealNum
{ ...;
double value;
}
Concrete class for double-precision (64-bit) floating-point real numbers.
public class RatNum extends RealNum
{ ...;
public abstract IntNum numerator();
public abstract IntNum denominator();
}
RatNum
, the abstract class for exact rational numbers,
has two sub-classes: IntFraction
and IntNum
.
public class IntFraction extends RatNum
{ ...;
IntNum num;
IntNum den;
}
The IntFraction
class implements fractions in the
obvious way. Exact real infinities are identified with the
fractions 1/0
and -1/0
.
public class IntNum extends RatNum
{ ...;
int ival;
int[] words;
}
The IntNum
concrete class implements infinite-precision integers.
The value is stored in the first ival
elements of words
,
in 2's complement form (with the low-order bits in word[0]
).
There are already many bignum packages, including one that Sun added for JDK 1.1. What are the advantages of this one?
A complete set of operations, including gcd and lcm; logical, bit,
and shift operations; power by repeated squaring; all of the
division modes from Common Lisp (floor, ceiling, truncate, and round);
and exact conversion to double
.
Consistency and integration with a complete “numerical tower.” Specifically, consistency and integration with “fixnum” (see below).
Most bignum packages use a signed-magnitude representation, while Kawa uses 2's complement. This makes for easier integration with fixnums, and also makes it cheap to implement logical and bit-fiddling operations.
Use of all 32 bits of each “big-digit” word, which
is the “expected” space-efficient representation.
More importantly, it is compatible with the mpn
routines
from the Gnu Multi-Precision library [gmp].
The mpn
routines are low-level algorithms that work
on unsigned pre-allocated bignums; they have been transcribed
into Java in the MPN
class. If better efficiency is
desired, it is straight-forward to replace the MPN
methods with native ones that call the highly-optimized mpn
functions.
If the integer value fits within a signed 32-bit int
,
then it is stored in ival
and words
is null.
This avoids the need for extra memory allocation for the words
array, and also allows us to special-case the common case.
As a further optimization, the integers in the range -100 to 1024 are pre-allocated.
Many operations are overloaded to have different
definitions depending on the argument types.
The classic examples are the functions of arithmetic
such as “+
”, which needs to use different
algorithms depending on the argument types.
If there is a fixed and reasonably small set of number types
(as is the case with standard Scheme), then we can just
enumerate each possibility. However, the Kawa system is
meant to be more extensible and support adding new
number types.
The solution is straight-forward in the case of a one-operand
function such as “negate”, since we can use method
overriding and virtual method calls to dynamically
select the correct method. However, it is more difficult
in the case of a binary method like “+
,” since
classic object-oriented languages (including Java) only
support dynamic method selection using the type of the
first argument (“this
”).
Common Lisp and some Scheme dialects support dynamic method
selection using all the arguments, and in fact the
problem of binary arithmetic operations is probably the
most obvious example where “multi-dispatch” is useful.
Since Java does not have multi-dispatch, we have to solve the problem in other ways. Smalltalk has the same problems, and solved it using “coercive generality”: Each number class has a generality number, and operands of lower generality are converted to the class with the higher generality. This is inefficient because of all the conversions and temporary objects (see [Budd91Arith]), and it is limited to what extent you can add new kinds of number types.
In “double dispatch” [Ingalls86]
the expression x-y
is implemented as x.sub(y)
. Assuming the (run-time)
class of x
is Tx
and that of y
is Ty
,
this causes the sub
method defined in Tx
to be
invoked, which just does y.subTx(x)
. That invokes
the subTx
method defined in Ty
which can without
further testing do the subtraction for types Tx
and Ty
.
The problem with this approach is that it is difficult to add
a new Tz
class, since you have to also add subTz
methods in all the existing number classes, not to mention
addTz
and all the other operations.
In Kawa, x-y
is also implemented by x.sub(y)
.
The sub
method of Tx
checks if Ty
is one
of the types it knows how to handle. If so, it does the
subtraction and returns the result itself. Otherwise,
Tx.sub
does y.subReversed(x)
.
This invokes Ty.subReversed
(or
subReversed
as defined in
a super-class of Ty
).
Now Ty
(or one of its
super-classes) gets a chance to see if it knows how to
subtract itself from a Tx
object.
The advantage of this scheme is flexibility. The knowledge of
how to handle a binary operation for types Tx
and
Ty
can be in either of Tx
or
Ty
or either of their
super-classes. This makes is easier to add new classes without
having to modify existing ones.
The DSSSL language [DSSSL] is a dialect of Scheme used
to process SGML documents. DSSSL has “quantities” in addition
to real and integer numbers. Since DSSSL is used to format documents,
it provides length values that are a multiple of a meter
(e.g. 0.2m
),
as well as derived units like cm
and
pt
(point).
A DSSSL quantity is a product of a dimension-less number with an integral
power of a length unit (the meter). A (pure) number is a quantity where
the length power is zero.
For Kawa, I wanted to merge the Scheme number types with the DSSSL
number types, and also generalize the DSSSL quantities to support
other dimensions (such as mass and time) and units (such as kg
and seconds). Quantities are implemented by the abstract class
Quantity
. A quantity is a
product of a Unit
and a pure number.
The number can be an arbitrary complex number.
public class Quantity extends Number
{ ...;
public Unit unit()
{ return Unit.Empty; }
public abstract Complex number();
}
public class CQuantity extends Quantity
{ ...;
Complex num;
Unit unt;
public Complex number()
{ return num; }
public Unit unit()
{ return unt; }
}
A CQuantity
is a concrete class that implements general
Quantities
. But usually we don't need that much generality,
and instead use DQuanity
.
public class DQuantity extends Quantity
{ ...;
double factor;
Unit unt;
public final Unit unit()
{ return unt; }
public final Complex number()
{ return new DFloNum(factor); }
}
public class Unit
{ ...;
String name; // Optional.
Dimensions dims;
double factor;
}
A Unit
is a product of a floating-point factor
and one or more primitive units, combined into a Dimensions
object.
The Unit
may have a name (such as “kg
”), which is
used for printing, and when parsing literals.
public class BaseUnit extends Unit
{ ...;
int index;
}
A BaseUnit
is a primitive unit that is not
defined in terms of any other Unit
, for example the meter.
Each BaseUnit
has a different index
, which is used
for identification and comparison purposes. Two BaseUnit
s
have the same index
if and only if they are the same BaseUnit
.
public class Dimensions
{
BaseUnit[] bases;
short[] powers;
}
A Dimensions
object is a product and/or ratio of BaseUnit
s.
You can think of it as a data structure that maps every BaseUnit
to an integer power. The bases
array is a list of the BaseUnit
s
that have a non-zero power, in order of the index
of the BaseUnit
.
The powers
array gives the power (exponent) of the BaseUnit
that has the same index in the bases
array.
Two Dimensions
objects are equal if they have the same list of
bases
and powers
. Dimensions
objects are “interned”
(using a global hash table) so that they are equal only if they are
the same object. This makes it easy to implement addition and
subtraction:
public static DQuantity add (DQuantity x, DQuantity y) { if (x.unit().dims != y.unit().dims) throw new ArithmeticException ("units mis-match"); double r = y.unit().factor / x.unit().factor; double s = x.factor + r * y.factor; return new DQuantity (s, x.unit()); }
The Unit
of the result of an addition or subtraction
is the Unit
of the first operand.
This makes it easy to convert units:
(+ 0cm 2.5m) ==> 250cm
Because Kawa represents quantities relative to user-specified units,
instead of representing them relative to primitive base units,
it can display quantities using the user's preferred units,
rather than having to use prmitive units.
However, this does make multiplication and division a problem
The actual calculation (finding the right Dimensions
and
multiplying the constant factors) is straight-forward.
The difficulty is that we have to generate a new compound
Unit
, and print it out in a reasonable fashion.
Exactly how this should best be done is not obvious.