customPartitioner, inputDataSet, keys
Constructor and Description |
---|
UnsortedGrouping(DataSet<T> set, Keys<T> keys) |
Modifier and Type | Method and Description |
---|---|
AggregateOperator<T> | aggregate(Aggregations agg, int field) - Applies an Aggregate transformation on a grouped Tuple DataSet. |
<R> GroupCombineOperator<T,R> | combineGroup(GroupCombineFunction<T,R> combiner) - Applies a GroupCombineFunction on a grouped DataSet. |
GroupReduceOperator<T,T> | first(int n) - Returns a new set containing the first n elements in this grouped DataSet. |
AggregateOperator<T> | max(int field) - Syntactic sugar for aggregate(MAX, field). |
ReduceOperator<T> | maxBy(int... fields) - Applies a special case of a reduce transformation (maxBy) on a grouped DataSet. |
AggregateOperator<T> | min(int field) - Syntactic sugar for aggregate(MIN, field). |
ReduceOperator<T> | minBy(int... fields) - Applies a special case of a reduce transformation (minBy) on a grouped DataSet. |
ReduceOperator<T> | reduce(ReduceFunction<T> reducer) - Applies a Reduce transformation on a grouped DataSet. |
<R> GroupReduceOperator<T,R> | reduceGroup(GroupReduceFunction<T,R> reducer) - Applies a GroupReduce transformation on a grouped DataSet. |
SortedGrouping<T> | sortGroup(int field, Order order) - Sorts Tuple elements within a group on the specified field in the specified Order. |
<K> SortedGrouping<T> | sortGroup(KeySelector<T,K> keySelector, Order order) - Sorts elements within a group on a key extracted by the specified KeySelector in the specified Order. |
SortedGrouping<T> | sortGroup(String field, Order order) - Sorts Pojos within a group on the specified field in the specified Order. |
AggregateOperator<T> | sum(int field) - Syntactic sugar for aggregate(SUM, field). |
UnsortedGrouping<T> | withPartitioner(Partitioner<?> partitioner) - Uses a custom partitioner for the grouping. |
getCustomPartitioner, getInputDataSet, getKeys
public UnsortedGrouping<T> withPartitioner(Partitioner<?> partitioner)
Uses a custom partitioner for the grouping.
Parameters:
partitioner - The custom partitioner.

public AggregateOperator<T> aggregate(Aggregations agg, int field)
Applies an Aggregate transformation on a grouped Tuple DataSet.
Note: Only Tuple DataSets can be aggregated. The transformation applies a built-in Aggregation on a specified field of a Tuple group. Additional aggregation functions can be added to the resulting AggregateOperator by calling AggregateOperator.and(Aggregations, int).
Parameters:
agg - The built-in aggregation function that is computed.
field - The index of the Tuple field on which the aggregation function is applied.
See Also: Tuple, Aggregations, AggregateOperator, DataSet
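
To make the grouped aggregation and the effect of chaining a second aggregation concrete, here is a minimal plain-Java sketch of the semantics. It is a model only and does not use Flink: tuples are represented as `int[]`, and the class `GroupedAggregateSketch`, the enum `Agg`, and the method names are hypothetical. How fields that are not aggregated end up in the output is an assumption of this sketch (the first tuple seen in each group seeds them).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupedAggregateSketch {

    /** Hypothetical stand-in for the built-in aggregation functions. */
    enum Agg { SUM, MIN, MAX }

    /** Combines two values of one field with the chosen aggregation. */
    static int apply(Agg agg, int a, int b) {
        switch (agg) {
            case SUM: return a + b;
            case MIN: return Math.min(a, b);
            default:  return Math.max(a, b);
        }
    }

    /**
     * Groups tuples on keyField and applies aggs[i] to fields[i] within each group.
     * Produces one output tuple per group.
     */
    static Map<Integer, int[]> aggregate(List<int[]> tuples, int keyField, Agg[] aggs, int[] fields) {
        Map<Integer, int[]> result = new LinkedHashMap<>();
        for (int[] tuple : tuples) {
            int[] acc = result.get(tuple[keyField]);
            if (acc == null) {
                // Assumption of this model: the first tuple of a group seeds the accumulator.
                result.put(tuple[keyField], tuple.clone());
                continue;
            }
            for (int i = 0; i < aggs.length; i++) {
                acc[fields[i]] = apply(aggs[i], acc[fields[i]], tuple[fields[i]]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<int[]> tuples = Arrays.asList(
                new int[] {1, 10, 7},
                new int[] {2, 5, 3},
                new int[] {1, 20, 4});
        // SUM on field 1 plus MIN on field 2, as if a second aggregation had been chained.
        Map<Integer, int[]> out =
                aggregate(tuples, 0, new Agg[] {Agg.SUM, Agg.MIN}, new int[] {1, 2});
        for (int[] tuple : out.values()) {
            System.out.println(Arrays.toString(tuple));
        }
        // prints [1, 30, 4] then [2, 5, 3]
    }
}
```

In this model, sum(field), min(field), and max(field) correspond to calling `aggregate` with a single `Agg` value.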
public AggregateOperator<T> sum(int field)
Syntactic sugar for aggregate(SUM, field).
Parameters:
field - The index of the Tuple field on which the aggregation function is applied.
See Also: AggregateOperator

public AggregateOperator<T> max(int field)
Syntactic sugar for aggregate(MAX, field).
Parameters:
field - The index of the Tuple field on which the aggregation function is applied.
See Also: AggregateOperator

public AggregateOperator<T> min(int field)
Syntactic sugar for aggregate(MIN, field).
Parameters:
field - The index of the Tuple field on which the aggregation function is applied.
See Also: AggregateOperator
public ReduceOperator<T> reduce(ReduceFunction<T> reducer)
Applies a Reduce transformation on a grouped DataSet.
For each group, the transformation consecutively calls a RichReduceFunction until only a single element for each group remains. A ReduceFunction combines two elements into one new element of the same type.
Parameters:
reducer - The ReduceFunction that is applied on each group of the DataSet.
See Also: RichReduceFunction, ReduceOperator, DataSet
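
The pairwise reduction described above can be modelled in plain Java as follows. This is a sketch of the semantics only, not Flink code: `GroupedReduceSketch` and `reducePerGroup` are hypothetical names, and a `BinaryOperator<T>` stands in for the two-elements-to-one reduce function.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

public class GroupedReduceSketch {

    /** Folds every group down to a single element by repeatedly combining two elements. */
    static <T, K> Map<K, T> reducePerGroup(
            List<T> elements, Function<T, K> keyOf, BinaryOperator<T> reducer) {
        Map<K, T> result = new LinkedHashMap<>();
        for (T element : elements) {
            // merge() stores the first element of a group as-is; for later elements it
            // replaces the stored value with reducer(stored, element).
            result.merge(keyOf.apply(element), element, reducer);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "banana", "avocado", "blueberry", "cherry");
        // Group on the first letter and keep the longer of two words.
        Map<Character, String> longest = reducePerGroup(
                words,
                w -> w.charAt(0),
                (a, b) -> a.length() >= b.length() ? a : b);
        System.out.println(longest); // prints {a=avocado, b=blueberry, c=cherry}
    }
}
```

Because the reducer returns the same type it consumes, a group with a single element passes through unchanged, as "cherry" does here.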
public <R> GroupReduceOperator<T,R> reduceGroup(GroupReduceFunction<T,R> reducer)
Applies a GroupReduce transformation on a grouped DataSet.
The transformation calls a RichGroupReduceFunction for each group of the DataSet. A GroupReduceFunction can iterate over all elements of a group and emit any number of output elements, including none.
Parameters:
reducer - The GroupReduceFunction that is applied on each group of the DataSet.
See Also: RichGroupReduceFunction, GroupReduceOperator, DataSet
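
Unlike a pairwise reduce, a group-wise function sees the whole group at once and decides how many elements to emit. The following plain-Java sketch models that behaviour. It is not Flink code: the interface `GroupFn`, the class `GroupReduceSketch`, and the use of a `Consumer` as the output collector are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

public class GroupReduceSketch {

    /** Hypothetical stand-in for a group-wise function: reads a group, emits through out. */
    interface GroupFn<T, R> {
        void reduce(Iterable<T> group, Consumer<R> out);
    }

    /** Builds the groups, then calls fn exactly once per group. */
    static <T, K, R> List<R> reduceGroups(
            List<T> elements, Function<T, K> keyOf, GroupFn<T, R> fn) {
        Map<K, List<T>> groups = new LinkedHashMap<>();
        for (T element : elements) {
            groups.computeIfAbsent(keyOf.apply(element), k -> new ArrayList<>()).add(element);
        }
        List<R> output = new ArrayList<>();
        for (List<T> group : groups.values()) {
            fn.reduce(group, output::add);
        }
        return output;
    }

    /** Emits nothing for groups with fewer than three elements, otherwise min and max. */
    static void minAndMaxOfLargeGroups(Iterable<Integer> group, Consumer<Integer> out) {
        List<Integer> buffered = new ArrayList<>();
        for (Integer n : group) {
            buffered.add(n);
        }
        if (buffered.size() < 3) {
            return; // zero output elements for this group
        }
        out.accept(Collections.min(buffered));
        out.accept(Collections.max(buffered));
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        // Key is n % 3, so the groups are {1, 4, 7}, {2, 5} and {3, 6}.
        List<Integer> result =
                reduceGroups(numbers, n -> n % 3, GroupReduceSketch::minAndMaxOfLargeGroups);
        System.out.println(result); // prints [1, 7]
    }
}
```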
public <R> GroupCombineOperator<T,R> combineGroup(GroupCombineFunction<T,R> combiner)
Applies a GroupCombineFunction on a grouped DataSet. A GroupCombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, its combine method is called once per partition to combine a group of results. This operator is suitable for combining values into an intermediate format before doing a proper groupReduce, where the data is shuffled across the nodes for further reduction. The GroupReduce operator can also be supplied with a combiner by implementing the RichGroupReduceFunction. The combine method of the RichGroupReduceFunction requires the input and output types to be the same. The GroupCombineFunction, on the other hand, can have an arbitrary output type.
Parameters:
combiner - The GroupCombineFunction that is applied on the DataSet.

public GroupReduceOperator<T,T> first(int n)
Returns a new set containing the first n elements in this grouped DataSet.
Parameters:
n - The desired number of elements for each group.

public ReduceOperator<T> minBy(int... fields)
Applies a special case of a reduce transformation (minBy) on a grouped DataSet.
The transformation consecutively calls a ReduceFunction until only a single element remains, which is the result of the transformation. A ReduceFunction combines two elements into one new element of the same type.
Parameters:
fields - Keys taken into account for finding the minimum.
Returns:
A ReduceOperator representing the minimum.

public ReduceOperator<T> maxBy(int... fields)
Applies a special case of a reduce transformation (maxBy) on a grouped DataSet.
The transformation consecutively calls a ReduceFunction until only a single element remains, which is the result of the transformation. A ReduceFunction combines two elements into one new element of the same type.
Parameters:
fields - Keys taken into account for finding the maximum.
Returns:
A ReduceOperator representing the maximum.

public SortedGrouping<T> sortGroup(int field, Order order)
Sorts Tuple elements within a group on the specified field in the specified Order.
Note: Only groups of Tuple elements and Pojos can be sorted.
Groups can be sorted by multiple fields by chaining sortGroup(int, Order) calls.
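
Sorting a group on several fields amounts to a primary sort key with tie-breakers. The plain-Java sketch below models two chained sort calls with a composed `Comparator`. It is an illustration of the ordering only, not Flink code: tuples are `int[]`, and `SortGroupSketch`, `byField`, and `sortWithinGroups` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SortGroupSketch {

    /** Orders tuples on one field, ascending or descending. */
    static Comparator<int[]> byField(int field, boolean ascending) {
        Comparator<int[]> comparator = Comparator.comparingInt(t -> t[field]);
        return ascending ? comparator : comparator.reversed();
    }

    /** Groups tuples on keyField and sorts each group independently with the given order. */
    static List<int[]> sortWithinGroups(List<int[]> tuples, int keyField, Comparator<int[]> order) {
        Map<Integer, List<int[]>> groups = new LinkedHashMap<>();
        for (int[] tuple : tuples) {
            groups.computeIfAbsent(tuple[keyField], k -> new ArrayList<>()).add(tuple);
        }
        List<int[]> output = new ArrayList<>();
        for (List<int[]> group : groups.values()) {
            group.sort(order);
            output.addAll(group);
        }
        return output;
    }

    public static void main(String[] args) {
        List<int[]> tuples = Arrays.asList(
                new int[] {1, 5, 1},
                new int[] {1, 3, 2},
                new int[] {2, 9, 0},
                new int[] {1, 3, 8},
                new int[] {2, 4, 4});
        // Field 1 ascending first, then field 2 descending as the tie-breaker.
        Comparator<int[]> order = byField(1, true).thenComparing(byField(2, false));
        for (int[] tuple : sortWithinGroups(tuples, 0, order)) {
            System.out.println(Arrays.toString(tuple));
        }
        // prints [1, 3, 8], [1, 3, 2], [1, 5, 1], [2, 4, 4], [2, 9, 0] (one per line)
    }
}
```

The second comparator only matters when the first one reports a tie, which is why the two tuples with field 1 equal to 3 are the only ones reordered by field 2.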

public SortedGrouping<T> sortGroup(String field, Order order)
Sorts Pojos within a group on the specified field in the specified Order.
Note: Only groups of Tuple elements and Pojos can be sorted.
Groups can be sorted by multiple fields by chaining sortGroup(String, Order) calls.
Parameters:
field - The Tuple or Pojo field on which the group is sorted.
order - The Order in which the specified field is sorted.
See Also: Order

public <K> SortedGrouping<T> sortGroup(KeySelector<T,K> keySelector, Order order)
Sorts elements within a group on a key extracted by the specified KeySelector in the specified Order.
Chaining sortGroup(KeySelector, Order) calls is not supported.
Parameters:
keySelector - The KeySelector with which the group is sorted.
order - The Order in which the extracted key is sorted.
See Also: Order
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.