GroupedDataSet (flink 1.0-SNAPSHOT API)

java.lang.Object
- org.apache.flink.api.scala.GroupedDataSet<T>

```
public class GroupedDataSet<T>
extends Object
```
A DataSet to which a grouping key was added. Operations (aggregate, reduce, and reduceGroup) work on groups of elements with the same key.
A secondary sort order can be added with sortGroup, but it only takes effect with group-at-a-time operations, e.g. reduceGroup.

Constructor Summary

Constructors
Constructor and Description
`GroupedDataSet(DataSet<T> set, Keys<T> keys, scala.reflect.ClassTag<T> evidence$1)`

Method Summary

Modifier and Type	Method and Description
`AggregateDataSet<T>`	`aggregate(Aggregations agg, int field)` Creates a new `DataSet` by aggregating the specified tuple field using the given aggregation function.
`AggregateDataSet<T>`	`aggregate(Aggregations agg, String field)` Creates a new `DataSet` by aggregating the specified field using the given aggregation function.
`<R> DataSet<R>`	`combineGroup(scala.Function2<scala.collection.Iterator<T>,Collector<R>,scala.runtime.BoxedUnit> fun, TypeInformation<R> evidence$10, scala.reflect.ClassTag<R> evidence$11)` Applies a CombineFunction on a grouped `DataSet`.
`<R> DataSet<R>`	`combineGroup(GroupCombineFunction<T,R> combiner, TypeInformation<R> evidence$12, scala.reflect.ClassTag<R> evidence$13)` Applies a CombineFunction on a grouped `DataSet`.
`DataSet<T>`	`first(int n)` Creates a new DataSet containing the first `n` elements of each group of this DataSet.
`<K> Partitioner<K>`	`getCustomPartitioner()` Gets the custom partitioner to be used for this grouping, or null, if none was defined.
`AggregateDataSet<T>`	`max(int field)` Syntactic sugar for `aggregate` with `MAX`
`AggregateDataSet<T>`	`max(String field)` Syntactic sugar for `aggregate` with `MAX`
`AggregateDataSet<T>`	`min(int field)` Syntactic sugar for `aggregate` with `MIN`
`AggregateDataSet<T>`	`min(String field)` Syntactic sugar for `aggregate` with `MIN`
`DataSet<T>`	`reduce(scala.Function2<T,T,T> fun)` Creates a new `DataSet` by merging the elements of each group (elements with the same key) using an associative reduce function.
`DataSet<T>`	`reduce(ReduceFunction<T> reducer)` Creates a new `DataSet` by merging the elements of each group (elements with the same key) using an associative reduce function.
`<R> DataSet<R>`	`reduceGroup(scala.Function1<scala.collection.Iterator<T>,R> fun, TypeInformation<R> evidence$4, scala.reflect.ClassTag<R> evidence$5)` Creates a new `DataSet` by passing for each group (elements with the same key) the list of elements to the group reduce function.
`<R> DataSet<R>`	`reduceGroup(scala.Function2<scala.collection.Iterator<T>,Collector<R>,scala.runtime.BoxedUnit> fun, TypeInformation<R> evidence$6, scala.reflect.ClassTag<R> evidence$7)` Creates a new `DataSet` by passing for each group (elements with the same key) the list of elements to the group reduce function.
`<R> DataSet<R>`	`reduceGroup(GroupReduceFunction<T,R> reducer, TypeInformation<R> evidence$8, scala.reflect.ClassTag<R> evidence$9)` Creates a new `DataSet` by passing for each group (elements with the same key) the list of elements to the `GroupReduceFunction`.
`<K> GroupedDataSet<T>`	`sortGroup(scala.Function1<T,K> fun, Order order, TypeInformation<K> evidence$2)` Adds a secondary sort key to this `GroupedDataSet`.
`GroupedDataSet<T>`	`sortGroup(int field, Order order)` Adds a secondary sort key to this `GroupedDataSet`.
`GroupedDataSet<T>`	`sortGroup(String field, Order order)` Adds a secondary sort key to this `GroupedDataSet`.
`AggregateDataSet<T>`	`sum(int field)` Syntactic sugar for `aggregate` with `SUM`
`AggregateDataSet<T>`	`sum(String field)` Syntactic sugar for `aggregate` with `SUM`
`<K> GroupedDataSet<T>`	`withPartitioner(Partitioner<K> partitioner, TypeInformation<K> evidence$3)` Sets a custom partitioner for the grouping.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - GroupedDataSet
```
public GroupedDataSet(DataSet<T> set,
                      Keys<T> keys,
                      scala.reflect.ClassTag<T> evidence$1)
```
- Method Detail
  - sortGroup
```
public GroupedDataSet<T> sortGroup(int field,
                                   Order order)
```
    Adds a secondary sort key to this GroupedDataSet. This only has an effect if you use one of the group-at-a-time operations, e.g. reduceGroup.
    This only works on Tuple DataSets.
  - sortGroup
```
public GroupedDataSet<T> sortGroup(String field,
                                   Order order)
```
    Adds a secondary sort key to this GroupedDataSet. This only has an effect if you use one of the group-at-a-time operations, e.g. reduceGroup.
    This only works on CaseClass DataSets.
  - sortGroup
```
public <K> GroupedDataSet<T> sortGroup(scala.Function1<T,K> fun,
                                       Order order,
                                       TypeInformation<K> evidence$2)
```
    Adds a secondary sort key to this GroupedDataSet. This only has an effect if you use one of the group-at-a-time operations, e.g. reduceGroup.
    This works on any data type.
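    The effect of a secondary sort can be pictured with a plain-Java model: elements are grouped by key first, and the sort order is then applied inside each group, so a group-at-a-time function sees its elements in that order. The class `SortGroupSketch`, its helpers, and the sample data are hypothetical and use only the Java standard library, not the Flink API.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of "group by key, then sort inside each group"; not Flink code.
final class SortGroupSketch {

    /** Hypothetical sample input: (key, value) pairs in arbitrary arrival order. */
    static List<Map.Entry<String, Integer>> sample() {
        return Arrays.asList(pair("a", 3), pair("a", 1), pair("b", 2), pair("a", 2));
    }

    static Map.Entry<String, Integer> pair(String key, int value) {
        return new AbstractMap.SimpleEntry<String, Integer>(key, value);
    }

    /** Groups values by key, then applies the secondary sort to each group. */
    static Map<String, List<Integer>> groupAndSort(
            List<Map.Entry<String, Integer>> data, boolean ascending) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (Map.Entry<String, Integer> e : data) {
            groups.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        Comparator<Integer> order = ascending
                ? Comparator.<Integer>naturalOrder()
                : Comparator.<Integer>reverseOrder();
        for (List<Integer> group : groups.values()) {
            group.sort(order);
        }
        return groups;
    }

    public static void main(String[] args) {
        // A group-at-a-time consumer would iterate each list in this sorted order.
        System.out.println(groupAndSort(sample(), true));
        System.out.println(groupAndSort(sample(), false));
    }
}
```

    In this model the sort changes only the order in which a group's elements are visited, which is why it matters for operations that iterate a whole group and not for pairwise merging.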
  - withPartitioner
```
public <K> GroupedDataSet<T> withPartitioner(Partitioner<K> partitioner,
                                             TypeInformation<K> evidence$3)
```
    Sets a custom partitioner for the grouping.
  - getCustomPartitioner
```
public <K> Partitioner<K> getCustomPartitioner()
```
    Gets the custom partitioner to be used for this grouping, or null, if none was defined.
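    As a rough illustration of what a custom partitioner decides, the sketch below maps each key to a partition index. `KeyPartitioner` is a local stand-in with a `partition(key, numPartitions)` method, assumed here for illustration; the routing rule and the sample keys are hypothetical, and nothing in the block calls the Flink API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of key-based partition assignment; not Flink code.
final class PartitionerSketch {

    /** Local stand-in for a partitioner: returns the target partition of a key. */
    interface KeyPartitioner<K> {
        int partition(K key, int numPartitions);
    }

    /** Places every key into the partition chosen by the partitioner. */
    static <K> List<List<K>> assign(List<K> keys, KeyPartitioner<K> partitioner, int numPartitions) {
        List<List<K>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<K>());
        }
        for (K key : keys) {
            partitions.get(partitioner.partition(key, numPartitions)).add(key);
        }
        return partitions;
    }

    /** Hypothetical rule: "vip" keys go to partition 0, all others are spread by hash. */
    static List<List<String>> demo() {
        KeyPartitioner<String> partitioner = (key, n) ->
                key.startsWith("vip") ? 0 : 1 + Math.floorMod(key.hashCode(), n - 1);
        return assign(Arrays.asList("vip-1", "x", "vip-2", "y"), partitioner, 3);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    All elements with the same key land in the same partition, which is the property a grouping relies on regardless of how the partitioner distributes distinct keys.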
  - aggregate
```
public AggregateDataSet<T> aggregate(Aggregations agg,
                                     String field)
```
    Creates a new DataSet by aggregating the specified field using the given aggregation function. Since this is a keyed DataSet the aggregation will be performed on groups of elements with the same key.
    This only works on CaseClass DataSets.
  - aggregate
```
public AggregateDataSet<T> aggregate(Aggregations agg,
                                     int field)
```
    Creates a new DataSet by aggregating the specified tuple field using the given aggregation function. Since this is a keyed DataSet the aggregation will be performed on groups of tuples with the same key.
    This only works on Tuple DataSets.
  - sum
```
public AggregateDataSet<T> sum(int field)
```
    Syntactic sugar for aggregate with SUM
  - max
```
public AggregateDataSet<T> max(int field)
```
    Syntactic sugar for aggregate with MAX
  - min
```
public AggregateDataSet<T> min(int field)
```
    Syntactic sugar for aggregate with MIN
  - sum
```
public AggregateDataSet<T> sum(String field)
```
    Syntactic sugar for aggregate with SUM
  - max
```
public AggregateDataSet<T> max(String field)
```
    Syntactic sugar for aggregate with MAX
  - min
```
public AggregateDataSet<T> min(String field)
```
    Syntactic sugar for aggregate with MIN
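    The per-group behaviour of aggregate and its sum, min, and max shorthands can be sketched in plain Java. Tuples are modelled as `int[]` arrays, and the `Agg` enum, the `aggregate` helper, and the sample data are hypothetical stand-ins, not the Flink types of the same purpose.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

// Plain-Java model of a keyed aggregation over one tuple field; not Flink code.
final class AggregateSketch {

    /** Hypothetical stand-in for the supported aggregation functions. */
    enum Agg { SUM, MIN, MAX }

    /** Aggregates tuple position `field` separately for each value of `keyField`. */
    static Map<Integer, Integer> aggregate(List<int[]> tuples, int keyField, Agg agg, int field) {
        BinaryOperator<Integer> op;
        switch (agg) {
            case SUM: op = Integer::sum; break;
            case MIN: op = Integer::min; break;
            default:  op = Integer::max;
        }
        Map<Integer, Integer> result = new TreeMap<>();
        for (int[] tuple : tuples) {
            // The first value of a key is stored as-is; later values are folded in with op.
            result.merge(tuple[keyField], tuple[field], op);
        }
        return result;
    }

    /** Hypothetical sample input: (key, value) tuples. */
    static List<int[]> sample() {
        return Arrays.asList(new int[] {1, 10}, new int[] {2, 5}, new int[] {1, 7}, new int[] {2, 9});
    }

    public static void main(String[] args) {
        System.out.println(aggregate(sample(), 0, Agg.SUM, 1));
        System.out.println(aggregate(sample(), 0, Agg.MIN, 1));
        System.out.println(aggregate(sample(), 0, Agg.MAX, 1));
    }
}
```

    In this model the shorthands correspond to fixing the `Agg` argument, so `sum(1)` would behave like `aggregate(..., Agg.SUM, 1)`.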
  - reduce
```
public DataSet<T> reduce(scala.Function2<T,T,T> fun)
```
    Creates a new DataSet by merging the elements of each group (elements with the same key) using an associative reduce function.
  - reduce
```
public DataSet<T> reduce(ReduceFunction<T> reducer)
```
    Creates a new DataSet by merging the elements of each group (elements with the same key) using an associative reduce function.
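    A minimal plain-Java sketch of the grouped reduce follows: elements sharing a key are merged pairwise until one element per key remains. The `reduceByKey` helper, the key function, and the sample data are hypothetical; the block models the semantics only and does not use the Flink API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Plain-Java model of a per-key pairwise reduce; not Flink code.
final class GroupedReduceSketch {

    /** Merges all elements that share a key into one element using `fun`. */
    static <K, T> Map<K, T> reduceByKey(List<T> data, Function<T, K> keyOf, BinaryOperator<T> fun) {
        Map<K, T> result = new TreeMap<>();
        for (T element : data) {
            // First element of a key is stored; each further one is merged into it.
            result.merge(keyOf.apply(element), element, fun);
        }
        return result;
    }

    /** Hypothetical example: sum the numbers 1..6 grouped by their remainder modulo 3. */
    static Map<Integer, Integer> demo() {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6);
        return reduceByKey(data, n -> n % 3, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    Because the function is associative, the elements of a group can be merged in any grouping of pairs without changing the result, which is presumably why the method requires that property.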
  - reduceGroup
```
public <R> DataSet<R> reduceGroup(scala.Function1<scala.collection.Iterator<T>,R> fun,
                                  TypeInformation<R> evidence$4,
                                  scala.reflect.ClassTag<R> evidence$5)
```
    Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function. The function must return exactly one element per group. The concatenation of those elements will form the resulting DataSet.
  - reduceGroup
```
public <R> DataSet<R> reduceGroup(scala.Function2<scala.collection.Iterator<T>,Collector<R>,scala.runtime.BoxedUnit> fun,
                                  TypeInformation<R> evidence$6,
                                  scala.reflect.ClassTag<R> evidence$7)
```
    Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function. The function can output zero or more elements using the Collector. The concatenation of the emitted values will form the resulting DataSet.
  - reduceGroup
```
public <R> DataSet<R> reduceGroup(GroupReduceFunction<T,R> reducer,
                                  TypeInformation<R> evidence$8,
                                  scala.reflect.ClassTag<R> evidence$9)
```
    Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the GroupReduceFunction. The function can output zero or more elements. The concatenation of the emitted values will form the resulting DataSet.
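    The group-at-a-time contract can be modelled in plain Java as follows: the function receives an iterator over one group and an output sink, and may emit zero or more results. Here `Consumer<R>` stands in for the Collector, and `reduceGroup`, `demo`, and the sample data are hypothetical; the order in which groups are visited is an artifact of the model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java model of a group-at-a-time reduce with a collector; not Flink code.
final class ReduceGroupSketch {

    /** Calls `fun` once per group and concatenates everything it emits. */
    static <K, T, R> List<R> reduceGroup(
            List<T> data, Function<T, K> keyOf, BiConsumer<Iterator<T>, Consumer<R>> fun) {
        Map<K, List<T>> groups = new LinkedHashMap<>();
        for (T element : data) {
            groups.computeIfAbsent(keyOf.apply(element), k -> new ArrayList<>()).add(element);
        }
        List<R> out = new ArrayList<>();
        for (List<T> group : groups.values()) {
            fun.accept(group.iterator(), out::add);
        }
        return out;
    }

    /** Hypothetical example: emit a sum only for groups with at least three elements. */
    static List<String> demo() {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        return reduceGroup(data, n -> n % 3, (elements, out) -> {
            int sum = 0;
            int count = 0;
            while (elements.hasNext()) {
                sum += elements.next();
                count++;
            }
            if (count >= 3) {
                out.accept("sum=" + sum); // smaller groups emit nothing
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    Only the group with remainder 1 (elements 1, 4, 7) reaches three elements, so the other two groups contribute no output at all.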
  - combineGroup
```
public <R> DataSet<R> combineGroup(scala.Function2<scala.collection.Iterator<T>,Collector<R>,scala.runtime.BoxedUnit> fun,
                                   TypeInformation<R> evidence$10,
                                   scala.reflect.ClassTag<R> evidence$11)
```
    Applies a CombineFunction on a grouped DataSet. A CombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, the CombineFunction calls the combine method once per partition for combining a group of results. This operator is suitable for combining values into an intermediate format before doing a proper groupReduce where the data is shuffled across the nodes for further reduction. The GroupReduce operator can also be supplied with a combiner by implementing the RichGroupReduce function. The combine method of the RichGroupReduce function demands input and output type to be the same. The CombineFunction, on the other hand, can have an arbitrary output type.
  - combineGroup
```
public <R> DataSet<R> combineGroup(GroupCombineFunction<T,R> combiner,
                                   TypeInformation<R> evidence$12,
                                   scala.reflect.ClassTag<R> evidence$13)
```
    Applies a CombineFunction on a grouped DataSet. A CombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, the CombineFunction calls the combine method once per partition for combining a group of results. This operator is suitable for combining values into an intermediate format before doing a proper groupReduce where the data is shuffled across the nodes for further reduction. The GroupReduce operator can also be supplied with a combiner by implementing the RichGroupReduce function. The combine method of the RichGroupReduce function demands input and output type to be the same. The CombineFunction, on the other hand, can have an arbitrary output type.
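    The two-phase pattern described above can be sketched in plain Java: each partition is combined locally into an intermediate format whose type differs from the input, and a final reduce merges the partial results per key. The partitions, the `{sum, count}` intermediate format, and all names are hypothetical; the block models the idea and does not call the Flink API.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of "combine per partition, then reduce"; not Flink code.
final class CombineGroupSketch {

    static Map.Entry<String, Integer> pair(String key, int value) {
        return new AbstractMap.SimpleEntry<String, Integer>(key, value);
    }

    /** Local combine: turns one partition's values into a partial {sum, count} per key. */
    static Map<String, long[]> combine(List<Map.Entry<String, Integer>> partition) {
        Map<String, long[]> partial = new TreeMap<>();
        for (Map.Entry<String, Integer> e : partition) {
            long[] acc = partial.computeIfAbsent(e.getKey(), k -> new long[2]);
            acc[0] += e.getValue();
            acc[1]++;
        }
        return partial;
    }

    /** Final reduce after the simulated shuffle: merges partials and computes the mean per key. */
    static Map<String, Double> reduce(List<Map<String, long[]>> partials) {
        Map<String, long[]> merged = new TreeMap<>();
        for (Map<String, long[]> partial : partials) {
            for (Map.Entry<String, long[]> e : partial.entrySet()) {
                long[] acc = merged.computeIfAbsent(e.getKey(), k -> new long[2]);
                acc[0] += e.getValue()[0];
                acc[1] += e.getValue()[1];
            }
        }
        Map<String, Double> means = new TreeMap<>();
        for (Map.Entry<String, long[]> e : merged.entrySet()) {
            means.put(e.getKey(), (double) e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }

    /** Hypothetical input split across two partitions. */
    static Map<String, Double> demo() {
        List<Map.Entry<String, Integer>> p1 = Arrays.asList(pair("a", 1), pair("b", 10), pair("a", 3));
        List<Map.Entry<String, Integer>> p2 = Arrays.asList(pair("a", 5), pair("b", 20));
        return reduce(Arrays.asList(combine(p1), combine(p2)));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    A mean cannot be merged from partial means alone, so the combine step emits `{sum, count}` pairs; this is the kind of case where an output type different from the input type is useful.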
  - first
```
public DataSet<T> first(int n)
```
    Creates a new DataSet containing the first n elements of each group of this DataSet.
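    A plain-Java model of taking the first n elements per group is sketched below. The `firstPerGroup` helper and the sample data are hypothetical. In this model "first" means arrival order in the input list; which elements count as first in a distributed run is an assumption that likely depends on whether a sort order was set with sortGroup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of "first n elements of each group"; not Flink code.
final class FirstNSketch {

    /** Keeps at most n elements per key, in the order they appear in `data`. */
    static <K, T> List<T> firstPerGroup(List<T> data, Function<T, K> keyOf, int n) {
        Map<K, Integer> seen = new HashMap<>();
        List<T> out = new ArrayList<>();
        for (T element : data) {
            // merge returns the updated count of elements seen for this key
            int count = seen.merge(keyOf.apply(element), 1, Integer::sum);
            if (count <= n) {
                out.add(element);
            }
        }
        return out;
    }

    /** Hypothetical example: keep two elements per leading letter. */
    static List<String> demo() {
        List<String> data = Arrays.asList("a1", "b1", "a2", "a3", "b2", "c1");
        return firstPerGroup(data, s -> s.substring(0, 1), 2);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    Groups with fewer than n elements are returned whole, as group "c" shows in the example.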
