Most of jAgg's Aggregators perform calculations on numeric values. These values are generally represented as the Java built-in type double, which covers a wide range of possible values. However, because floating-point numbers are stored in binary, many numbers cannot be represented exactly. Such a number can only be approximated by the representable binary value that is closest to it.
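To make the approximation concrete, the following minimal sketch prints the exact value that a double actually stores. It uses only the standard java.math.BigDecimal class; the class name ExactValueDemo and the helper exactValue are illustrative names, not part of jAgg.

```java
import java.math.BigDecimal;

class ExactValueDemo {
    // Returns the exact decimal expansion of the binary double passed in.
    // new BigDecimal(double) converts the stored binary value without rounding.
    static String exactValue(double d) {
        return new BigDecimal(d).toString();
    }

    public static void main(String[] args) {
        // 0.1 has no finite binary expansion, so the nearest double is stored instead.
        System.out.println(exactValue(0.1));
        // prints 0.1000000000000000055511151231257827021181583404541015625

        // 0.5 is a power of two, so it is stored exactly.
        System.out.println(exactValue(0.5)); // prints 0.5

        // The approximations surface in ordinary arithmetic.
        System.out.println(0.1 + 0.2 == 0.3); // prints false
    }
}
```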
These approximations are usually insignificant, but they can become problematic in two ways. First, when an intermediate result is used to calculate the next result, typically in iterations over a large set of figures, the approximation errors compound and become more significant. Second, numerical instabilities in otherwise sound mathematical algorithms can destroy floating-point accuracy, such as when two nearly equal quantities are subtracted.
Because jAgg supports aggregating arbitrarily large datasets, these approximation errors can easily become significant. This can manifest itself in an answer that is "off" by a small amount. For example, the answer should be 2.18, but the printed answer is 2.180000000000001.
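Both failure modes can be reproduced in a few lines. The sketch below is only an illustration in plain Java; ErrorGrowthDemo and naiveSum are hypothetical names chosen for this example and do not come from jAgg.

```java
class ErrorGrowthDemo {
    // Adds the same value repeatedly, so each step inherits the rounding
    // error of the previous intermediate result.
    static double naiveSum(double value, int count) {
        double sum = 0.0;
        for (int i = 0; i < count; i++) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Compounded error: ten additions of 0.1 do not reach 1.0.
        System.out.println(naiveSum(0.1, 10)); // prints 0.9999999999999999

        // Lost accuracy when subtracting nearly equal quantities:
        // 1e16 + 1.0 cannot be stored, so the 1.0 vanishes before the subtraction.
        double big = 1e16;
        System.out.println((big + 1.0) - big); // prints 0.0, not 1.0
    }
}
```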
Such errors can be minimized by using in-memory representations of numbers that have a higher precision. Java has two levels of precision for floating-point numbers, float and double, with double having more precision and thus a far smaller approximation error. Arbitrary-precision libraries are available that can virtually eliminate these kinds of errors. However, they can suffer from large performance penalties, especially when a high degree of precision is desired.
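As a rough illustration of these precision levels, the following sketch compares float, double, and the standard library's arbitrary-precision java.math.BigDecimal. The class PrecisionLevelsDemo and its helper exactSum are hypothetical names for this example only.

```java
import java.math.BigDecimal;

class PrecisionLevelsDemo {
    // Sums a decimal literal 'count' times using arbitrary-precision arithmetic.
    static BigDecimal exactSum(String decimal, int count) {
        BigDecimal term = new BigDecimal(decimal); // parsed as an exact decimal
        BigDecimal sum = BigDecimal.ZERO;
        for (int i = 0; i < count; i++) {
            sum = sum.add(term); // BigDecimal is immutable, so add returns a new object
        }
        return sum;
    }

    public static void main(String[] args) {
        float f = 0.1f;
        double d = 0.1;
        // Widening the float exposes its larger approximation error (roughly 1.5e-9).
        System.out.println((double) f);
        System.out.println(d);

        // Arbitrary precision avoids the error, at the cost of object allocation
        // and slower arithmetic on every operation.
        System.out.println(exactSum("0.1", 10)); // prints 1.0
    }
}
```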
The solution that jAgg employs is to use "Double-Double" precision. A DoubleDouble is an object that consists of two double instance variables: one of "high" significance and one of "low" significance. Normally, one double contains 52 bits in its "significand", plus the implicit "one", yielding 53 bits of precision. In a DoubleDouble, the "high" double represents the best double approximation to the desired number. To improve precision, the "low" double represents an adjustment to the value of the "high" number, with 54 additional bits of precision (the 52 bits of the significand, its implicit "one", and the sign bit, which is also used here), yielding a total of 107 bits of precision. This greatly reduces, but does not eliminate, the problem of binary approximation of real numbers. However, in testing, this appears to be sufficient to yield highly accurate and precise double results for aggregation tasks.
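The high/low idea can be sketched with the well-known TwoSum error-free transformation, which recovers the rounding error of a double addition as a second double. The code below is a minimal illustration under that assumption, not jAgg's actual DoubleDouble implementation; the class DoubleDoubleSketch, its fields hi and lo, and its methods are hypothetical names.

```java
class DoubleDoubleSketch {
    double hi; // best double approximation of the running value
    double lo; // small correction holding what hi could not represent

    // Knuth's TwoSum: returns {s, e} where s is the rounded sum a + b and
    // e is the rounding error, so that s + e equals a + b (barring overflow).
    static double[] twoSum(double a, double b) {
        double s = a + b;
        double bb = s - a;
        double e = (a - (s - bb)) + (b - bb);
        return new double[] { s, e };
    }

    // Adds a plain double to this value in place, keeping the error in lo.
    void add(double x) {
        double[] se = twoSum(hi, x);
        double e = se[1] + lo;       // fold the old correction into the new error
        double newHi = se[0] + e;    // renormalize so hi is again the best double
        lo = e - (newHi - se[0]);
        hi = newHi;
    }

    public static void main(String[] args) {
        DoubleDoubleSketch sum = new DoubleDoubleSketch();
        for (int i = 0; i < 10; i++) {
            sum.add(0.1);
        }
        // hi rounds to the expected answer; lo carries the tiny remainder that
        // comes from 0.1 itself being stored slightly too large.
        System.out.println(sum.hi); // prints 1.0
        System.out.println(sum.lo);
    }
}
```

Compared with a plain double accumulator, which yields 0.9999999999999999 for the same ten additions, the correction term keeps the rounding errors from compounding.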
All Aggregators that use numeric calculations employ DoubleDoubles behind the scenes to maintain a high level of precision and to minimize floating-point errors in calculations. Also, numerically stable algorithms are utilized wherever needed to maintain precision.
Additionally, some numeric Aggregators use other Aggregator results to calculate their own results. Normally, the result of a numeric Aggregator is a double, as a result of calling terminate. For some Aggregators to use other Aggregators in their calculations, they call terminateDoubleDouble, an internally used method that provides intermediate results at DoubleDouble precision.
The DoubleDouble class supplies many methods that perform mathematical operations on the represented value. Such operations can typically, but not always, accept either a double or another DoubleDouble. DoubleDoubles are mutable: operations do not return separate DoubleDoubles; they modify the object on which they are called. However, two constants defined in the class, NaN and ZERO, are immutable. (Any operations that would modify those constants' contents throw UnsupportedOperationExceptions.)
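The combination of in-place operations and read-only constants can be pictured with the following sketch. It is a simplified, hypothetical class that holds a single double; the names MutableNumberSketch, addToSelf, and doubleValue are invented for illustration and are not the jAgg DoubleDouble API.

```java
class MutableNumberSketch {
    private double value;
    private final boolean frozen; // true only for shared constants

    // A shared constant that must never change (hypothetical analogue of ZERO).
    static final MutableNumberSketch ZERO = new MutableNumberSketch(0.0, true);

    MutableNumberSketch(double value) {
        this(value, false);
    }

    private MutableNumberSketch(double value, boolean frozen) {
        this.value = value;
        this.frozen = frozen;
    }

    // Modifies this object instead of returning a new one.
    void addToSelf(double addend) {
        if (frozen) {
            throw new UnsupportedOperationException("constant cannot be modified");
        }
        value += addend;
    }

    double doubleValue() {
        return value;
    }

    public static void main(String[] args) {
        MutableNumberSketch total = new MutableNumberSketch(1.5);
        total.addToSelf(2.0); // no new object is created
        System.out.println(total.doubleValue()); // prints 3.5

        try {
            ZERO.addToSelf(1.0);
        } catch (UnsupportedOperationException e) {
            System.out.println("ZERO is read-only");
        }
    }
}
```

Mutating in place avoids allocating a new object per operation, which matters when an aggregation touches every row of a large dataset; the trade-off is that callers must copy a value before modifying it if they still need the original.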
Computational algorithms for DoubleDouble precision are based on "Algorithms for Quad-Double Precision Floating Point Arithmetic" by Hida, Li, and Bailey, 2000, Berkeley.
Here are the methods defined on the DoubleDouble class: