jAgg Aggregations Algorithm

The Aggregation class uses the same general algorithm no matter which method is called. Understanding this algorithm may help clarify what the AggregateFunction interface's abstract methods are meant to accomplish.

Create a shallow copy of the given List of Ts.
Sort the list copy by calling Collections.sort(List, Comparator). The ordering comes either from T's “compareTo” method if T is Comparable, using an internal adapter Comparator class (“ComparableComparator”), or from another internal Comparator class (“PropertiesComparator”) based on the supplied “group by” properties. Alternatively, if so specified, instead of sorting, perform Multiset Discrimination based on the supplied “group by” properties.
If parallel processing is indicated and desired, then break down the list of Ts into multiple sections that different Threads will process. Different Threads will get different Aggregators, even for the same property, so that Aggregators do not need to be thread-safe.
For each run of Ts that compare equal, do the following:
1. Get an Aggregator for each caller-specified Aggregator using “getAggregator”.
2. Initialize each Aggregator by calling “init”.
3. For each T in the run, call “iterate” on each Aggregator, passing in the T object.
4. If parallel processing is indicated and desired, call “merge” on some Aggregators to merge other Aggregators’ states into one. This occurs when a run of T objects that compare the same is split between threads.
5. Call “terminate” on each Aggregator that’s left, to get the final aggregate results.
6. Create an object of type AggregateValue<T> and pass it the first T of the run. Add aggregate results into the AggregateValue’s internal HashMap, keying on the Aggregator object.
If indicated, perform "super aggregation" by merging the results from previous Aggregators to create new AggregateValues. This is accomplished by calling "merge" on the Aggregators used in the previous step.
Return a List of AggregateValues back to the caller.
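
The sequential steps above can be sketched roughly as follows. This is a minimal illustration, not jAgg's actual code: the SketchAggregator interface, the LengthSum aggregator, the AggregationSketch class, and the BY_FIRST_CHAR comparator are all hypothetical names invented for this example, and a plain Map keyed on the first item of each run stands in for the List of AggregateValues. The aggregateSplitRun method imitates the "merge" step by handling the two halves of one run with separate aggregators, as two threads might.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the lifecycle described above; not jAgg's real interface.
interface SketchAggregator<T> {
    void init();                            // reset internal state
    void iterate(T value);                  // fold one item into the state
    void merge(SketchAggregator<T> other);  // absorb another aggregator's state
    Object terminate();                     // produce the final result
}

// Hypothetical aggregator: sums the lengths of the strings it sees.
class LengthSum implements SketchAggregator<String> {
    private long sum;
    public void init() { sum = 0; }
    public void iterate(String value) { sum += value.length(); }
    public void merge(SketchAggregator<String> other) { sum += ((LengthSum) other).sum; }
    public Object terminate() { return sum; }
}

class AggregationSketch {
    // Hypothetical "group by": two strings compare equal if their first characters match.
    static final Comparator<String> BY_FIRST_CHAR =
            (a, b) -> Character.compare(a.charAt(0), b.charAt(0));

    // Copy, sort, then aggregate each run of items that compare equal.
    static Map<String, Object> aggregate(List<String> values, Comparator<String> groupBy) {
        List<String> copy = new ArrayList<>(values);  // shallow copy; caller's list is untouched
        copy.sort(groupBy);                           // stable sort makes equal items adjacent
        Map<String, Object> results = new LinkedHashMap<>();
        int i = 0;
        while (i < copy.size()) {
            String first = copy.get(i);               // first item of the run identifies the group
            SketchAggregator<String> agg = new LengthSum();
            agg.init();
            while (i < copy.size() && groupBy.compare(first, copy.get(i)) == 0) {
                agg.iterate(copy.get(i));
                i++;
            }
            results.put(first, agg.terminate());
        }
        return results;
    }

    // Imitates a run split between two threads: one aggregator per half, then merge.
    static Object aggregateSplitRun(List<String> run) {
        int mid = run.size() / 2;
        SketchAggregator<String> left = new LengthSum();
        SketchAggregator<String> right = new LengthSum();
        left.init();
        right.init();
        for (String s : run.subList(0, mid)) left.iterate(s);
        for (String s : run.subList(mid, run.size())) right.iterate(s);
        left.merge(right);                            // fold the right half's state into the left
        return left.terminate();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("apple", "bob", "avocado", "banana");
        System.out.println(aggregate(input, BY_FIRST_CHAR));                    // prints {apple=12, bob=9}
        System.out.println(aggregateSplitRun(Arrays.asList("apple", "avocado"))); // prints 12
    }
}
```

Because each half of a split run gets its own aggregator instance, no instance is shared between the simulated threads, which matches the point above that aggregators do not need to be thread-safe.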

jAgg Analytics Algorithm

The Analytic class uses the same general algorithm no matter which method is called. Understanding this algorithm may help clarify what the AnalyticFunction interface's abstract methods are meant to accomplish.

For every analytic function, create a shallow copy of the given list of Ts.
Sort each of the list copies according to the specified partition and/or order by clauses, using an internal Comparator class ("PartitionAndOrderByComparator"). A list copy may be reused when its sort order also satisfies the partition and order by clauses of another analytic function.
For each item to be processed, and each analytic function:

At the start, determine the next end of the partition for this particular analytic function.
Call "init" on the analytic function.
While the next item to be iterated is not in the window of the next row that needs a value, call "terminate" on the analytic function and use an AnalyticValue to associate the result with the row that was terminated.

If a value was terminated, then while the start of the current window is not in the window of the next row to be terminated, call "delete" on the analytic function to remove the first item from the window, sliding the start of the window forward.

Call "iterate" on the analytic function, passing in the current value, sliding the end of the window forward.
If the end of the partition has been reached, then do the following repeatedly until all rows from the completed partition have values:

If the start of the window is within the window of the next row to be terminated, then call "terminate" on the analytic function and use an AnalyticValue to associate the result with the row that was terminated.
Else, call "delete" on the analytic function to remove the first item from the window, sliding the start of the window forward.

If there is another partition to process, then call "init" on the analytic function and determine the end of the next partition.

Return a List of AnalyticValues back to the caller.
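
The sliding-window idea behind these steps can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not jAgg's actual implementation: the SketchAnalytic interface, the MovingSum function, and the AnalyticSketch class are hypothetical names, the rows are assumed to be already sorted by partition and order, and the window is assumed to be a fixed, non-negative number of rows before and after the current row, clipped to the partition. For each row the loop first calls "iterate" until the end of the window is in place, then calls "delete" until the start of the window is in place, and only then calls "terminate".

```java
import java.util.Arrays;

// Hypothetical stand-in for the analytic lifecycle; not jAgg's real interface.
interface SketchAnalytic {
    void init();                 // reset state at the start of a partition
    void iterate(double value);  // add a value at the end of the window
    void delete(double value);   // remove a value from the start of the window
    double terminate();          // result for the current window
}

// Hypothetical analytic function: sum of the values currently in the window.
class MovingSum implements SketchAnalytic {
    private double sum;
    public void init() { sum = 0; }
    public void iterate(double value) { sum += value; }
    public void delete(double value) { sum -= value; }
    public double terminate() { return sum; }
}

class AnalyticSketch {
    // Rows are assumed sorted so that equal partition keys are adjacent.
    // The window for a row spans [row - preceding, row + following], clipped to its partition.
    static double[] windowed(String[] partitionKeys, double[] values,
                             int preceding, int following, SketchAnalytic fn) {
        double[] out = new double[values.length];
        int pStart = 0;
        while (pStart < values.length) {
            // Determine the end (exclusive) of the current partition.
            int pEnd = pStart;
            while (pEnd < values.length && partitionKeys[pEnd].equals(partitionKeys[pStart])) {
                pEnd++;
            }
            fn.init();
            int windowStart = pStart;  // first row currently inside the window
            int windowEnd = pStart;    // first row not yet iterated
            for (int row = pStart; row < pEnd; row++) {
                int wantEnd = Math.min(pEnd, row + following + 1);
                while (windowEnd < wantEnd) {
                    fn.iterate(values[windowEnd++]);   // slide the end of the window forward
                }
                int wantStart = Math.max(pStart, row - preceding);
                while (windowStart < wantStart) {
                    fn.delete(values[windowStart++]);  // slide the start of the window forward
                }
                out[row] = fn.terminate();             // value for this row
            }
            pStart = pEnd;  // move on to the next partition, if any
        }
        return out;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "a", "a", "b", "b"};
        double[] vals = {1, 2, 3, 4, 5};
        // Window: one preceding row through the current row.
        System.out.println(Arrays.toString(windowed(keys, vals, 1, 0, new MovingSum())));
        // prints [1.0, 3.0, 5.0, 4.0, 9.0]
    }
}
```

Calling "delete" lets each value enter and leave the window once, so the function does not recompute every window from scratch.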
