Fields inherited from class Grouping: customPartitioner, inputDataSet, keys
Constructor and Description |
---|
SortedGrouping(DataSet<T> set, Keys<T> keys, int field, Order order) |
SortedGrouping(DataSet<T> set, Keys<T> keys, Keys.SelectorFunctionKeys<T,K> keySelector, Order order) |
SortedGrouping(DataSet<T> set, Keys<T> keys, String field, Order order) |
Modifier and Type | Method and Description |
---|---|
<R> GroupCombineOperator<T,R> | combineGroup(GroupCombineFunction<T,R> combiner) Applies a GroupCombineFunction on a grouped DataSet. |
GroupReduceOperator<T,T> | first(int n) Returns a new set containing the first n elements in this grouped and sorted DataSet. |
protected Ordering | getGroupOrdering() |
protected int[] | getGroupSortKeyPositions() |
protected Order[] | getGroupSortOrders() |
protected Keys.SelectorFunctionKeys<T,?> | getSortSelectionFunctionKey() |
<R> GroupReduceOperator<T,R> | reduceGroup(GroupReduceFunction<T,R> reducer) Applies a GroupReduce transformation on a grouped and sorted DataSet. |
SortedGrouping<T> | sortGroup(int field, Order order) Sorts Tuple elements within a group on the specified field in the specified Order. |
SortedGrouping<T> | sortGroup(String field, Order order) Sorts Tuple or POJO elements within a group on the specified field in the specified Order. |
SortedGrouping<T> | withPartitioner(Partitioner<?> partitioner) Uses a custom partitioner for the grouping. |
Methods inherited from class Grouping: getCustomPartitioner, getInputDataSet, getKeys
protected int[] getGroupSortKeyPositions()
protected Order[] getGroupSortOrders()
protected Ordering getGroupOrdering()
public SortedGrouping<T> withPartitioner(Partitioner<?> partitioner)
Uses a custom partitioner for the grouping.
Parameters:
partitioner - The custom partitioner.
protected Keys.SelectorFunctionKeys<T,?> getSortSelectionFunctionKey()
public <R> GroupReduceOperator<T,R> reduceGroup(GroupReduceFunction<T,R> reducer)
Applies a GroupReduce transformation on a grouped and sorted DataSet.
The transformation calls a RichGroupReduceFunction for each group of the DataSet.
A GroupReduceFunction can iterate over all elements of a group and emit any number of output elements, including none.
Parameters:
reducer - The GroupReduceFunction that is applied on each group of the DataSet.
See Also:
RichGroupReduceFunction, GroupReduceOperator, DataSet
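The per-group contract of reduceGroup can be illustrated without the Flink runtime. The following is a minimal plain-Java sketch, not Flink code: the class ReduceGroupSketch, its helper method, and the sample data are hypothetical, a BiConsumer stands in for GroupReduceFunction, and a Consumer stands in for Flink's Collector. It assumes elements are int[] pairs of (key, value) and uses a TreeMap only to make the output order deterministic; the actual operator makes no such promise about the order of groups.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;
import java.util.function.Consumer;
import java.util.function.Function;

public class ReduceGroupSketch {

    /**
     * Single-machine approximation of
     * data.groupBy(0).sortGroup(1, Order.ASCENDING).reduceGroup(reducer):
     * group by key, sort each group, call the reducer once per group.
     * The reducer may emit zero, one, or many results through the Consumer.
     */
    static <T, R> List<R> reduceGroup(
            List<T> input,
            Function<T, Integer> keyOf,
            Comparator<T> groupOrder,
            BiConsumer<Iterable<T>, Consumer<R>> reducer) {
        Map<Integer, List<T>> groups = new TreeMap<>();
        for (T element : input) {
            groups.computeIfAbsent(keyOf.apply(element), k -> new ArrayList<>()).add(element);
        }
        List<R> out = new ArrayList<>();
        for (List<T> group : groups.values()) {
            group.sort(groupOrder);
            reducer.accept(group, out::add);
        }
        return out;
    }

    /** Hypothetical reducer: one summary string per group, nothing for single-element groups. */
    static List<String> demo() {
        List<int[]> input = List.of(
                new int[] {1, 30}, new int[] {2, 5}, new int[] {1, 10}, new int[] {1, 20});
        return reduceGroup(
                input,
                (int[] pair) -> pair[0],
                Comparator.comparingInt((int[] pair) -> pair[1]),
                (Iterable<int[]> group, Consumer<String> collector) -> {
                    StringBuilder values = new StringBuilder();
                    int key = 0;
                    int count = 0;
                    for (int[] pair : group) {
                        key = pair[0];
                        if (count++ > 0) {
                            values.append(',');
                        }
                        values.append(pair[1]);
                    }
                    if (count >= 2) {
                        collector.accept(key + ":" + values);
                    }
                });
    }
}
```

In this sketch, group 1 yields the single string "1:10,20,30" because its values arrive already sorted, while group 2 emits nothing, which mirrors the "any number of output elements, including none" wording.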
public <R> GroupCombineOperator<T,R> combineGroup(GroupCombineFunction<T,R> combiner)
Applies a GroupCombineFunction on a grouped DataSet.
A GroupCombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, its combine method is called once per partition to combine a group of results. This operator is suitable for combining values into an intermediate format before a full groupReduce, in which the data is shuffled across the nodes for further reduction. The GroupReduce operator can also be supplied with a combiner by implementing the RichGroupReduceFunction. The combine method of the RichGroupReduceFunction requires the input and output types to be the same. The GroupCombineFunction, on the other hand, can have an arbitrary output type.
Parameters:
combiner - The GroupCombineFunction that is applied on the DataSet.
public GroupReduceOperator<T,T> first(int n)
Returns a new set containing the first n elements in this grouped and sorted DataSet.
Parameters:
n - The desired number of elements for each group.
public SortedGrouping<T> sortGroup(int field, Order order)
Sorts Tuple elements within a group on the specified field in the specified Order.
Note: Only groups of Tuple or POJO elements can be sorted.
Groups can be sorted by multiple fields by chaining sortGroup(int, Order) calls.
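To make the effect of chained sort keys concrete, here is a minimal plain-Java sketch of what a chain such as data.groupBy(0).sortGroup(1, Order.ASCENDING).sortGroup(2, Order.DESCENDING).first(2) is expected to compute on one machine. It is an illustration under assumptions, not Flink code: the class SortGroupFirstSketch, its methods, and the sample tuples (modeled as int[] of length 3) are hypothetical, and groups are emitted in key order only to keep the output deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortGroupFirstSketch {

    /** Comparator on one tuple position; a descending order reverses it. */
    static Comparator<int[]> byField(int field, boolean ascending) {
        Comparator<int[]> c = Comparator.comparingInt((int[] t) -> t[field]);
        return ascending ? c : c.reversed();
    }

    /** Groups by position 0, sorts each group, and keeps at most n elements per group. */
    static List<int[]> firstNPerGroup(List<int[]> input, Comparator<int[]> groupOrder, int n) {
        Map<Integer, List<int[]>> groups = new TreeMap<>();
        for (int[] t : input) {
            groups.computeIfAbsent(t[0], k -> new ArrayList<>()).add(t);
        }
        List<int[]> out = new ArrayList<>();
        for (List<int[]> group : groups.values()) {
            group.sort(groupOrder);
            out.addAll(group.subList(0, Math.min(n, group.size())));
        }
        return out;
    }

    static String demo() {
        List<int[]> input = List.of(
                new int[] {1, 7, 100}, new int[] {1, 3, 200},
                new int[] {1, 3, 300}, new int[] {2, 9, 400});
        // Each chained sort key acts as a tie-breaker for the previous one:
        // position 1 ascending first, then position 2 descending.
        Comparator<int[]> order = byField(1, true).thenComparing(byField(2, false));
        StringBuilder sb = new StringBuilder();
        for (int[] t : firstNPerGroup(input, order, 2)) {
            sb.append(Arrays.toString(t));
        }
        return sb.toString();
    }
}
```

With the sample data, demo() returns "[1, 3, 300][1, 3, 200][2, 9, 400]": the two tuples that tie on position 1 are ordered by position 2 descending, and the group with a single element contributes only that element.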
public SortedGrouping<T> sortGroup(String field, Order order)
Sorts Tuple or POJO elements within a group on the specified field in the specified Order.
Note: Only groups of Tuple or POJO elements can be sorted.
Groups can be sorted by multiple fields by chaining sortGroup(String, Order) calls.
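Selecting the sort key by field name can be pictured with a small plain-Java sketch. It approximates what visits.groupBy("user").sortGroup("timestamp", Order.DESCENDING) would order within each group, but it is not how Flink resolves field expressions: the Visit POJO, the class SortByFieldNameSketch, and the reflection-based comparator are all hypothetical and assume a public, Comparable-valued field.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortByFieldNameSketch {

    /** Hypothetical POJO with public fields and a no-argument constructor. */
    public static class Visit {
        public String user;
        public long timestamp;

        public Visit() {}

        public Visit(String user, long timestamp) {
            this.user = user;
            this.timestamp = timestamp;
        }
    }

    /** Builds a comparator from a public field name (assumes the field value is Comparable). */
    @SuppressWarnings({"unchecked", "rawtypes"})
    static <T> Comparator<T> byFieldName(Class<T> type, String fieldName, boolean ascending) {
        final Field field;
        try {
            field = type.getField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No public field named " + fieldName, e);
        }
        Comparator<T> c = (a, b) -> {
            try {
                Comparable left = (Comparable) field.get(a);
                return left.compareTo(field.get(b));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        };
        return ascending ? c : c.reversed();
    }

    /** Groups visits by user and sorts each group with the given order. */
    static List<Visit> sortWithinGroups(List<Visit> input, Comparator<Visit> order) {
        Map<String, List<Visit>> groups = new TreeMap<>();
        for (Visit v : input) {
            groups.computeIfAbsent(v.user, k -> new ArrayList<>()).add(v);
        }
        List<Visit> out = new ArrayList<>();
        for (List<Visit> group : groups.values()) {
            group.sort(order);
            out.addAll(group);
        }
        return out;
    }

    static String demo() {
        List<Visit> input = List.of(
                new Visit("ann", 30L), new Visit("bob", 10L),
                new Visit("ann", 10L), new Visit("ann", 20L));
        StringBuilder sb = new StringBuilder();
        for (Visit v : sortWithinGroups(input, byFieldName(Visit.class, "timestamp", false))) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(v.user).append(':').append(v.timestamp);
        }
        return sb.toString();
    }
}
```

For the sample visits, demo() returns "ann:30 ann:20 ann:10 bob:10", that is, each user's visits ordered by timestamp descending.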
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.