jAgg Aggregations Algorithm
The Aggregation class uses the same general algorithm no matter which method is
called. Knowing the steps may help in understanding what the AggregateFunction
interface's abstract methods are meant to accomplish.
- Create a shallow copy of the given List of Ts.
- Sort the list copy either by T's “compareTo” method if T is Comparable,
using an internal adapter Comparator class ("ComparableComparator"), or by
the supplied “group by” properties, using another internal Comparator class
(“PropertiesComparator”). Call Collections.sort(List, Comparator).
Alternatively, if so specified, instead of sorting, perform
Multiset Discrimination based on the supplied "group by"
properties.
- If parallel processing is indicated and desired, then break down the list of
Ts into multiple sections that different Threads will process. Different
Threads will get different Aggregators, even for the same property, so that
Aggregators do not need to be thread-safe.
- For each run of Ts that compare equal, do the following:
  - Get an Aggregator for each caller-specified Aggregator using “getAggregator”.
  - Initialize each Aggregator by calling “init”.
  - For each T in the run, call “iterate” on each Aggregator, passing in the T
    object.
  - If parallel processing is indicated and desired, call “merge” on some
    Aggregators to merge other Aggregators’ states into one. This occurs
    when a run of T objects that compare the same is split between threads.
  - Call “terminate” on each Aggregator that’s left, to get the final aggregate
    results.
  - Create an object of type AggregateValue<T> and pass it the first T of
    the run. Add aggregate results into the AggregateValue’s internal HashMap,
    keying on the Aggregator object.
- If indicated, perform "super aggregation" by merging the results from previous
Aggregators to create new AggregateValues. This is accomplished by calling "merge"
on the Aggregators used in the previous step.
- Return a List of AggregateValues back to the caller.
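The steps above can be sketched in a few lines of plain Java. This is a minimal illustration under stated assumptions, not jAgg's actual code: SumAggregator, Result, and sumByGroup are hypothetical names invented here, rows are simple {group, amount} int arrays, and the split of a run between threads is only simulated sequentially by giving each half of a run its own aggregator and then merging them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToLongFunction;

class AggregationSketch {

    /** Hypothetical aggregator with the four life-cycle steps named in the text. */
    static final class SumAggregator<T> {
        private final ToLongFunction<T> property;
        private long sum;

        SumAggregator(ToLongFunction<T> property) { this.property = property; }

        void init() { sum = 0; }                                    // reset state for a new run
        void iterate(T value) { sum += property.applyAsLong(value); }
        void merge(SumAggregator<T> other) { sum += other.sum; }    // absorb another aggregator's state
        long terminate() { return sum; }                            // final result for the run
    }

    /** Hypothetical result row: the first item of a run plus its aggregate value. */
    static final class Result<T> {
        final T firstOfRun;
        final long sum;

        Result(T firstOfRun, long sum) { this.firstOfRun = firstOfRun; this.sum = sum; }
    }

    static <T> List<Result<T>> sumByGroup(List<T> input,
                                          Comparator<? super T> groupBy,
                                          ToLongFunction<T> property) {
        List<T> copy = new ArrayList<>(input);   // shallow copy; the caller's list is not reordered
        copy.sort(groupBy);                      // items that compare equal become adjacent runs

        List<Result<T>> results = new ArrayList<>();
        int start = 0;
        while (start < copy.size()) {
            // Find the exclusive end of the run of items equal to copy.get(start).
            int end = start + 1;
            while (end < copy.size() && groupBy.compare(copy.get(start), copy.get(end)) == 0) {
                end++;
            }

            // Simulate a run split between two workers: each half gets its own aggregator.
            int mid = start + (end - start) / 2;
            SumAggregator<T> left = new SumAggregator<>(property);
            SumAggregator<T> right = new SumAggregator<>(property);
            left.init();
            right.init();
            for (int i = start; i < mid; i++) left.iterate(copy.get(i));
            for (int i = mid; i < end; i++) right.iterate(copy.get(i));
            left.merge(right);                   // combine the two partial states into one

            results.add(new Result<>(copy.get(start), left.terminate()));
            start = end;
        }
        return results;
    }

    public static void main(String[] args) {
        List<int[]> rows = Arrays.asList(
                new int[] {2, 10}, new int[] {1, 5}, new int[] {2, 7});
        Comparator<int[]> byGroup = Comparator.comparingInt(row -> row[0]);
        for (Result<int[]> r : sumByGroup(rows, byGroup, row -> row[1])) {
            System.out.println("group " + r.firstOfRun[0] + " sum " + r.sum);
        }
    }
}
```

Because each partial aggregator is touched by only one worker before the merge, the aggregator itself needs no synchronization, which is the design point the parallel step relies on.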
jAgg Analytics Algorithm
The Analytic class uses the same general algorithm no matter which method is
called. Knowing the steps may help in understanding what the AnalyticFunction
interface's abstract methods are meant to accomplish.
- For every analytic function, create a shallow copy of the given list of Ts.
- Sort each of the list copies according to the specified partition and/or
order by clauses, using an internal Comparator class
("PartitionAndOrderByComparator"). List copies may be re-used if one sort is
good enough for another sort.
- For each item to be processed, and each analytic function:
  - At the start, determine the next end of the partition for this particular
    analytic function.
  - Call "init" on the analytic function.
  - While the next item to be iterated is not in the window for the
    next item to get a value, call "terminate" on the analytic function and
    use an AnalyticValue to associate the value with the terminated row.
  - If a value was terminated, then while the start of the current
    window is not in the window of the next value to be terminated, call
    "delete" on the analytic function to remove the item at the start of the
    window, sliding the start of the window forward.
  - Call "iterate" on the analytic function, passing in the current value,
    sliding the end of the window forward.
  - If the end of the partition has been reached, then do the following
    repeatedly until all rows from the completed partition have values:
    - If the start of the window is within the window of the next row to be
      terminated, then call "terminate" on the analytic function and use an
      AnalyticValue to associate the value with the terminated row.
    - Else, call "delete" on the analytic function to remove the item at the
      start of the window, sliding the start of the window forward.
  - If there is another partition to process, then call "init" on the analytic
    function and determine the end of the next partition.
- Return a List of AnalyticValues back to the caller.
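The sliding-window steps above can be sketched as a moving sum over a window of a fixed number of preceding and following rows. This is a simplified illustration under stated assumptions, not jAgg's actual code: WindowSum, movingSum, and withValue are hypothetical names, rows are {partitionKey, value} long arrays, only one analytic function is handled, and the sort orders by partition key and then by value.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class AnalyticSketch {

    /** Hypothetical windowed sum exposing the four steps named in the text. */
    static final class WindowSum {
        private long sum;

        void init() { sum = 0; }              // start of a new partition
        void iterate(long v) { sum += v; }    // a row enters at the end of the window
        void delete(long v) { sum -= v; }     // a row leaves from the start of the window
        long terminate() { return sum; }      // value for the window as it currently stands
    }

    /** Returns {partitionKey, value, windowSum} rows in sorted order. */
    static List<long[]> movingSum(List<long[]> rows, int preceding, int following) {
        List<long[]> sorted = new ArrayList<>(rows);   // shallow copy of the input
        sorted.sort(Comparator.<long[]>comparingLong(r -> r[0]).thenComparingLong(r -> r[1]));

        List<long[]> out = new ArrayList<>();
        WindowSum fn = new WindowSum();
        int n = sorted.size();
        int partStart = 0;
        while (partStart < n) {
            // Determine the exclusive end of the current partition.
            int partEnd = partStart + 1;
            while (partEnd < n && sorted.get(partEnd)[0] == sorted.get(partStart)[0]) {
                partEnd++;
            }

            fn.init();
            int winStart = partStart;   // first row currently inside the window
            int next = partStart;       // next row waiting for its value
            for (int i = partStart; i < partEnd; i++) {
                // Row i lies beyond the window of row 'next', so that window is complete.
                while (i > next + following) {
                    out.add(withValue(sorted.get(next), fn.terminate()));
                    next++;
                    // Drop rows that precede the window of the new 'next' row.
                    while (winStart < next - preceding) {
                        fn.delete(sorted.get(winStart)[1]);
                        winStart++;
                    }
                }
                fn.iterate(sorted.get(i)[1]);   // slide the end of the window forward
            }

            // End of partition: give values to the rows that are still waiting.
            while (next < partEnd) {
                if (winStart >= next - preceding) {
                    out.add(withValue(sorted.get(next), fn.terminate()));
                    next++;
                } else {
                    fn.delete(sorted.get(winStart)[1]);
                    winStart++;
                }
            }
            partStart = partEnd;
        }
        return out;
    }

    private static long[] withValue(long[] row, long value) {
        return new long[] { row[0], row[1], value };
    }
}
```

For example, calling movingSum with preceding and following both set to 1 on a single partition holding the values 1, 2, 3, 4 yields the window sums 3, 6, 9, 7. Each row is iterated once and deleted at most once, so the window is never recomputed from scratch.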