T
- The data type of set that is the input and feedback of the iteration.@Deprecated @Public public class IterativeDataSet<T> extends SingleInputOperator<T,T,IterativeDataSet<T>>
DataSet.iterate(int)
method.minResources, name, parallelism, preferredResources
Constructor and Description |
---|
IterativeDataSet(ExecutionEnvironment context,
TypeInformation<T> type,
DataSet<T> input,
int maxIterations)
Deprecated.
|
Modifier and Type | Method and Description |
---|---|
DataSet<T> |
closeWith(DataSet<T> iterationResult)
Deprecated.
Closes the iteration.
|
DataSet<T> |
closeWith(DataSet<T> iterationResult,
DataSet<?> terminationCriterion)
Deprecated.
Closes the iteration and specifies a termination criterion.
|
AggregatorRegistry |
getAggregators()
Deprecated.
Gets the registry for aggregators.
|
int |
getMaxIterations()
Deprecated.
Gets the maximum number of iterations.
|
<X extends Value> |
registerAggregationConvergenceCriterion(String name,
Aggregator<X> aggregator,
ConvergenceCriterion<X> convergenceCheck)
Deprecated.
Registers an
Aggregator for the iteration together with a ConvergenceCriterion . |
IterativeDataSet<T> |
registerAggregator(String name,
Aggregator<?> aggregator)
Deprecated.
Registers an
Aggregator for the iteration. |
protected SingleInputOperator<T,T,?> |
translateToDataFlow(Operator<T> input)
Deprecated.
Translates this operation to a data flow operator of the common data flow API.
|
getInput, getInputType
getMinResources, getName, getParallelism, getPreferredResources, getResultType, name, setParallelism
aggregate, checkSameExecutionContext, clean, coGroup, collect, combineGroup, count, cross, crossWithHuge, crossWithTiny, distinct, distinct, distinct, distinct, fillInType, filter, first, flatMap, fullOuterJoin, fullOuterJoin, getExecutionEnvironment, getType, groupBy, groupBy, groupBy, iterate, iterateDelta, join, join, joinWithHuge, joinWithTiny, leftOuterJoin, leftOuterJoin, map, mapPartition, max, maxBy, min, minBy, output, partitionByHash, partitionByHash, partitionByHash, partitionByRange, partitionByRange, partitionByRange, partitionCustom, partitionCustom, partitionCustom, print, print, printOnTaskManager, printToErr, printToErr, project, rebalance, reduce, reduceGroup, rightOuterJoin, rightOuterJoin, runOperation, sortPartition, sortPartition, sortPartition, sum, union, write, write, writeAsCsv, writeAsCsv, writeAsCsv, writeAsCsv, writeAsFormattedText, writeAsFormattedText, writeAsText, writeAsText
public IterativeDataSet(ExecutionEnvironment context, TypeInformation<T> type, DataSet<T> input, int maxIterations)
public DataSet<T> closeWith(DataSet<T> iterationResult)
iterationResult
- The data set that will be fed back to the next iteration.DataSet.iterate(int)
public DataSet<T> closeWith(DataSet<T> iterationResult, DataSet<?> terminationCriterion)
The termination criterion is a means of dynamically signaling the iteration to halt. It is expressed via a data set that will trigger to halt the loop as soon as the data set is empty. A typical way of using the termination criterion is to have a filter that filters out all elements that are considered non-converged. As soon as no more such elements exist, the iteration finishes.
iterationResult
- The data set that will be fed back to the next iteration.terminationCriterion
- The data set that being used to trigger halt on operation once it
is empty.DataSet.iterate(int)
public int getMaxIterations()
@PublicEvolving public IterativeDataSet<T> registerAggregator(String name, Aggregator<?> aggregator)
Aggregator
for the iteration. Aggregators can be used to maintain simple
statistics during the iteration, such as number of elements processed. The aggregators
compute global aggregates: After each iteration step, the values are globally aggregated to
produce one aggregate that represents statistics across all parallel instances. The value of
an aggregator can be accessed in the next iteration.
Aggregators can be accessed inside a function via the AbstractRichFunction.getIterationRuntimeContext()
method.
name
- The name under which the aggregator is registered.aggregator
- The aggregator class.@PublicEvolving public <X extends Value> IterativeDataSet<T> registerAggregationConvergenceCriterion(String name, Aggregator<X> aggregator, ConvergenceCriterion<X> convergenceCheck)
Aggregator
for the iteration together with a ConvergenceCriterion
. For a general description of aggregators, see registerAggregator(String, Aggregator)
and Aggregator
. At the end of each
iteration, the convergence criterion takes the aggregator's global aggregate value and
decided whether the iteration should terminate. A typical use case is to have an aggregator
that sums up the total error of change in an iteration step and have to have a convergence
criterion that signals termination as soon as the aggregate value is below a certain
threshold.name
- The name under which the aggregator is registered.aggregator
- The aggregator class.convergenceCheck
- The convergence criterion.@PublicEvolving public AggregatorRegistry getAggregators()
Aggregator
s and an
aggregator-based ConvergenceCriterion
. This method offers an alternative way to
registering the aggregators via registerAggregator(String, Aggregator)
and registerAggregationConvergenceCriterion(String, Aggregator, ConvergenceCriterion)
.protected SingleInputOperator<T,T,?> translateToDataFlow(Operator<T> input)
SingleInputOperator
translateToDataFlow
in class SingleInputOperator<T,T,IterativeDataSet<T>>
input
- The data flow operator that produces this operation's input data.Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.