CumeDistAnalytic

A CumeDistAnalytic is an AnalyticFunction that determines the cumulative distribution over any values and returns it as a Double. Results range from 0.0 (exclusive) through 1.0 (inclusive). When all order by values are distinct, the results are evenly spaced, e.g. 0.2, 0.4, 0.6, 0.8, 1.0. When some order by values are equivalent, all rows with equivalent values receive the same result, e.g. 0.2, 0.6, 0.6, 0.8, 1.0.

Usage

Create a CumeDistAnalytic using one of the following methods:

  • AnalyticAggregator ana = new AnalyticAggregator.Builder()
        .setAnalyticFunction(new CumeDistAnalytic())
        .setPartition(new PartitionClause(Arrays.asList("category1")))
        .setOrderBy(new OrderByClause(Arrays.asList(new OrderByElement("value2", OrderByElement.SortDir.DESC))))
        .build();
    
  • AnalyticAggregator agg = AnalyticAggregator.getAnalytic("CumeDist() partitionBy(category1) orderBy(value2 DESC)");
    

CumeDistAnalytic ignores any properties passed to it.

No row receives a value of 0.0, and every row that compares equal to the last row according to the order by clause receives 1.0.

CumeDistAnalytic is a DependentAnalyticFunction that computes Count(*) range(, 0) / Count(*) range(). It does not take a user-given window clause.
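
The ratio above can be illustrated with a minimal standalone sketch. It does not use the library. The class CumeDistSketch and its cumeDist method are hypothetical names introduced only for this example. It assumes that range(, 0) covers every row from the start of the partition through the last row that ties with the current row, and that range() covers the whole partition. The input array stands for the order by values of a single partition, already sorted.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of the library's API.
public class CumeDistSketch {

    // sortedKeys: order by values of one partition, already in order by order.
    public static double[] cumeDist(double[] sortedKeys) {
        int n = sortedKeys.length;           // assumed meaning of Count(*) range()
        double[] result = new double[n];
        int i = 0;
        while (i < n) {
            // Advance j to the last row that ties with row i.
            int j = i;
            while (j + 1 < n && sortedKeys[j + 1] == sortedKeys[i]) {
                j++;
            }
            // Assumed meaning of Count(*) range(, 0): rows 0..j inclusive.
            double value = (double) (j + 1) / n;
            for (int k = i; k <= j; k++) {
                result[k] = value;           // all tied rows share one result
            }
            i = j + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        // Distinct values: evenly spaced results.
        System.out.println(Arrays.toString(cumeDist(new double[] {10, 20, 30, 40, 50})));
        // Second and third rows tie: both receive 3/5.
        System.out.println(Arrays.toString(cumeDist(new double[] {10, 20, 20, 40, 50})));
    }
}
```

Under these assumptions the two calls reproduce the sequences 0.2, 0.4, 0.6, 0.8, 1.0 and 0.2, 0.6, 0.6, 0.8, 1.0 given in the description. Because the numerator is at least 1 and the last group always counts the whole partition, no row receives 0.0 and the last group always receives 1.0.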