T
- The type of the aggregated value.@PublicEvolving public interface Aggregator<T extends Value> extends Serializable
ConvergenceCriterion
. In contrast to the
Accumulator
(whose result is available at the
end of a job, the aggregators are computed once per iteration superstep. Their value can be used
to check for convergence (at the end of the iteration superstep) and it can be accessed in the
next iteration superstep.
Aggregators must be registered at the iteration inside which they are used via the function. In the Java API, the method is "IterativeDataSet.registerAggregator(...)" or "IterativeDataSet.registerAggregationConvergenceCriterion(...)" when using the aggregator together with a convergence criterion. Aggregators are always registered under a name. That name can be used to access the aggregator at runtime from within a function. The following code snippet shows a typical case. Here, it count across all parallel instances how many elements are filtered out by a function.
// the user-defined function public class MyFilter extends FilterFunction<Double> { private LongSumAggregator agg; public void open(Configuration parameters) { agg = getIterationRuntimeContext().getIterationAggregator("numFiltered"); } public boolean filter (Double value) { if (value > 1000000.0) { agg.aggregate(1); return false } return true; } } // the iteration where the aggregator is registered IterativeDataSet<Double> iteration = input.iterate(100).registerAggregator("numFiltered", LongSumAggregator.class); ... DataSet<Double> filtered = someIntermediateResult.filter(new MyFilter); ... DataSet<Double> result = iteration.closeWith(filtered); ...
Aggregators must be distributive: An aggregator must be able to pre-aggregate values and it must be able to aggregate these pre-aggregated values to form the final aggregate. Many aggregation functions fulfill this condition (sum, min, max) and others can be brought into that form: One can expressing count as a sum over values of one, and one can express average through a sum and a count.
Modifier and Type | Method and Description |
---|---|
void |
aggregate(T element)
Aggregates the given element.
|
T |
getAggregate()
Gets the aggregator's current aggregate.
|
void |
reset()
Resets the internal state of the aggregator.
|
T getAggregate()
void aggregate(T element)
element
- The element to aggregate.void reset()
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.