Cube

One of the ways that jAgg supports super aggregation is by supporting data cubes. Normally, jAgg calculates aggregate values based on all "group by" properties specified in the Builder object. However, sometimes different levels of aggregation are desired. For example, different "slices" of the data can be desirable.

One can simply call jAgg again, with fewer properties, to obtain the desired slices, but that would mean a separate pass through the data for each level of super aggregation. With cubes, and super aggregation in general, jAgg reuses the Aggregators that calculated the original, normal aggregate values to calculate the new super aggregate values.

Understanding Cube

When a cube is specified, jAgg needs to know which properties to use when creating the different slices. jAgg expects 0-based indices to identify these properties. These indices refer to the original list of property names that was supplied to the Builder object. jAgg will create subtotals for every possible combination of the cube properties. Unlike rollups, a cube of n properties produces 2^n - 1 extra grouping sets, one for each non-empty subset of the cube properties.
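To make the count concrete, here is a minimal sketch in plain Java, not jAgg code, that enumerates the extra grouping sets for a list of cube indices. The class name CubeGroupingSets and the method extraGroupingSets are hypothetical names chosen for illustration, and jAgg's internal enumeration may work differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CubeGroupingSets {
    /**
     * Lists the extra grouping sets that a cube adds. Each returned list
     * holds the property indices that are collapsed to "all values" in
     * that grouping set. (Illustrative helper, not part of jAgg.)
     */
    static List<List<Integer>> extraGroupingSets(List<Integer> cube) {
        List<List<Integer>> result = new ArrayList<>();
        int n = cube.size();
        // Masks 1 .. 2^n - 1 cover every non-empty subset of the cube.
        for (int mask = 1; mask < (1 << n); mask++) {
            List<Integer> collapsed = new ArrayList<>();
            for (int bit = 0; bit < n; bit++) {
                if ((mask & (1 << bit)) != 0) {
                    collapsed.add(cube.get(bit));
                }
            }
            result.add(collapsed);
        }
        return result;
    }

    public static void main(String[] args) {
        // A cube over indices 2 and 3 yields 2^2 - 1 = 3 extra grouping sets.
        System.out.println(extraGroupingSets(Arrays.asList(2, 3)));
        // prints [[2], [3], [2, 3]]
    }
}
```

A cube over three properties would yield seven extra grouping sets by the same enumeration.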

In this example, four property names were originally specified to jAgg. Without a cube, normal aggregation proceeds and here are the results:

List<String> properties = Arrays.asList("property1", "property2", "property3", "property4");
List<Aggregator> aggs = Arrays.asList(Aggregator.getAggregator("Sum(value)"));
Aggregation agg = new Aggregation.Builder()
   .setProperties(properties)
   .setAggregators(aggs)
   .build();
List<AggregateValue<TestRec>> aggValues = agg.groupBy(testRecords);
            
property1  property2  property3  property4  Sum(value)
A          1          red        true       2
A          1          red        false      3
A          1          green      true       6
A          1          green      false      10
A          2          red        true       102
A          2          red        false      103
A          2          green      true       106
A          2          green      false      110
B          1          red        true       1002
B          1          red        false      1003
B          1          green      true       1006
B          1          green      false      1010
B          2          red        true       1102
B          2          red        false      1103
B          2          green      true       1106
B          2          green      false      1110

Here are the new results when the following cube is specified: {2, 3}, corresponding to "property3" and "property4", respectively.

List<String> properties = Arrays.asList("property1", "property2", "property3", "property4");
List<Aggregator> aggs = Arrays.asList(Aggregator.getAggregator("Sum(value)"));
List<Integer> cube = Arrays.asList(2, 3);
Aggregation agg = new Aggregation.Builder()
   .setProperties(properties)
   .setAggregators(aggs)
   .setCube(cube)
   .build();
List<AggregateValue<TestRec>> aggValues = agg.groupBy(testRecords);
            
property1  property2  property3  property4  Sum(value)
A          1          red        true       2
A          1          red        false      3
A          1          green      true       6
A          1          green      false      10
A          2          red        true       102
A          2          red        false      103
A          2          green      true       106
A          2          green      false      110
B          1          red        true       1002
B          1          red        false      1003
B          1          green      true       1006
B          1          green      false      1010
B          2          red        true       1102
B          2          red        false      1103
B          2          green      true       1106
B          2          green      false      1110
A          1          red                   5
A          1          green                 16
A          2          red                   205
A          2          green                 216
B          1          red                   2005
B          1          green                 2016
B          2          red                   2205
B          2          green                 2216
A          1                     true       8
A          1                     false      13
A          2                     true       208
A          2                     false      213
B          1                     true       2008
B          1                     false      2013
B          2                     true       2208
B          2                     false      2213
A          1                                21
A          2                                421
B          1                                4021
B          2                                4421

Subtotals are calculated for every possible combination of the cube properties. Note that the order in which the cube properties are specified is unimportant, because every combination is produced no matter what the order. Also, whenever a property is subtotalled, its value is null, meaning that the particular aggregate value represents "all values" for this property.
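The subtotals in the table above can be reproduced by hand from the base rows. The sketch below is plain Java rather than jAgg code: it takes already-computed sums, replaces the collapsed properties with null, and adds together the sums whose remaining keys match. It only illustrates the idea of deriving subtotals from earlier results without another pass over the raw data; how jAgg's Aggregators do this internally is not shown here, and the names CubeSubtotals, sampleBase and subtotal are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CubeSubtotals {
    /** The four base rows for property1 = A, property2 = 1 from the table. */
    static Map<List<Object>, Integer> sampleBase() {
        Map<List<Object>, Integer> base = new LinkedHashMap<>();
        base.put(Arrays.<Object>asList("A", 1, "red", true), 2);
        base.put(Arrays.<Object>asList("A", 1, "red", false), 3);
        base.put(Arrays.<Object>asList("A", 1, "green", true), 6);
        base.put(Arrays.<Object>asList("A", 1, "green", false), 10);
        return base;
    }

    /**
     * Folds base sums into the subtotals of one grouping set. The indices in
     * "collapsed" are set to null, which stands for "all values".
     */
    static Map<List<Object>, Integer> subtotal(Map<List<Object>, Integer> baseSums,
                                               List<Integer> collapsed) {
        Map<List<Object>, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<List<Object>, Integer> entry : baseSums.entrySet()) {
            List<Object> key = new ArrayList<>(entry.getKey());
            for (int index : collapsed) {
                key.set(index, null);
            }
            // Rows that now share a key are added together.
            result.merge(key, entry.getValue(), Integer::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<List<Object>, Integer> base = sampleBase();
        System.out.println(subtotal(base, Arrays.asList(3)));
        // prints {[A, 1, red, null]=5, [A, 1, green, null]=16}
        System.out.println(subtotal(base, Arrays.asList(2, 3)));
        // prints {[A, 1, null, null]=21}
    }
}
```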

It is not possible to specify multiple cubes at once, because the results coming from cube(list1), cube(list2) are the same as those in cube(list1, list2).

Identifying Grouping Sets

If a certain property represents "all values", then the result from getPropertyValue for that property will be null. But what if null is an actual value of the property being grouped? jAgg tells these cases apart with the help of the methods isGrouping and getGroupingId.

  • isGrouping(int field) - Determines whether the property referenced by the given 0-based index represents "all values". If true, then getPropertyValue(field) returns null and this is a super aggregate value. If false, then getPropertyValue(field) can return any value, including null, and this aggregate value does not represent "all values" for this property.
  • isGrouping(String propertyName) - Determines whether the given property represents "all values". If true, then getPropertyValue(propertyName) returns null and this is a super aggregate value. If false, then getPropertyValue(propertyName) can return any value, including null, and this aggregate value does not represent "all values" for this property.
  • getGroupingId(List<?> fields) - Creates a distinct integer grouping set ID based on the referenced fields, which may be 0-based integer references or property name strings, or both. Every aggregate value that has the same properties representing "all values" has the same integer ID.

Here is the same example as above, but with the above method call results included.

property1  property2  property3  property4  Sum(value)  isGrouping(0)  isGrouping(1)  isGrouping(2)  isGrouping(3)  getGroupingId({0, 1})  getGroupingId({0, 1, 2, 3})
A          1          red        true       2           false          false          false          false          0                      0
A          1          red        false      3           false          false          false          false          0                      0
A          1          green      true       6           false          false          false          false          0                      0
A          1          green      false      10          false          false          false          false          0                      0
A          2          red        true       102         false          false          false          false          0                      0
A          2          red        false      103         false          false          false          false          0                      0
A          2          green      true       106         false          false          false          false          0                      0
A          2          green      false      110         false          false          false          false          0                      0
B          1          red        true       1002        false          false          false          false          0                      0
B          1          red        false      1003        false          false          false          false          0                      0
B          1          green      true       1006        false          false          false          false          0                      0
B          1          green      false      1010        false          false          false          false          0                      0
B          2          red        true       1102        false          false          false          false          0                      0
B          2          red        false      1103        false          false          false          false          0                      0
B          2          green      true       1106        false          false          false          false          0                      0
B          2          green      false      1110        false          false          false          false          0                      0
A          1          red                   5           false          false          false          true           0                      1
A          1          green                 16          false          false          false          true           0                      1
A          2          red                   205         false          false          false          true           0                      1
A          2          green                 216         false          false          false          true           0                      1
B          1          red                   2005        false          false          false          true           0                      1
B          1          green                 2016        false          false          false          true           0                      1
B          2          red                   2205        false          false          false          true           0                      1
B          2          green                 2216        false          false          false          true           0                      1
A          1                     true       8           false          false          true           false          0                      2
A          1                     false      13          false          false          true           false          0                      2
A          2                     true       208         false          false          true           false          0                      2
A          2                     false      213         false          false          true           false          0                      2
B          1                     true       2008        false          false          true           false          0                      2
B          1                     false      2013        false          false          true           false          0                      2
B          2                     true       2208        false          false          true           false          0                      2
B          2                     false      2213        false          false          true           false          0                      2
A          1                                21          false          false          true           true           0                      3
A          2                                421         false          false          true           true           0                      3
B          1                                4021        false          false          true           true           0                      3
B          2                                4421        false          false          true           true           0                      3
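
The getGroupingId values in the table are consistent with treating the isGrouping results as a bit vector, with the first referenced field as the most significant bit. The sketch below assumes that layout. It is inferred from the table rather than taken from jAgg's source, and the class GroupingIdSketch and its groupingId method are hypothetical.

```java
class GroupingIdSketch {
    /**
     * Builds an ID from "all values" flags, one bit per referenced field.
     * The first referenced field becomes the most significant bit.
     * (Assumed layout, chosen to match the table; not jAgg's own code.)
     */
    static int groupingId(boolean[] isGrouping, int... fields) {
        int id = 0;
        for (int field : fields) {
            id = (id << 1) | (isGrouping[field] ? 1 : 0);
        }
        return id;
    }

    public static void main(String[] args) {
        // Flags for a row where property3 and property4 are both "all values".
        boolean[] both = {false, false, true, true};
        System.out.println(groupingId(both, 0, 1));        // prints 0
        System.out.println(groupingId(both, 0, 1, 2, 3));  // prints 3

        // Flags for a row where only property3 is "all values".
        boolean[] onlyProperty3 = {false, false, true, false};
        System.out.println(groupingId(onlyProperty3, 0, 1, 2, 3)); // prints 2
    }
}
```

Because rows in the same grouping set share the same flags, they share the same ID, which makes the ID a convenient key for separating normal aggregate values (ID 0) from each kind of subtotal.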