class GroupedDataSet[T] extends AnyRef
A DataSet to which a grouping key was added. Operations work on groups of elements with the
same key (aggregate
, reduce
, and reduceGroup
).
A secondary sort order can be added with sortGroup, but this is only used when using one of the
group-at-a-time operations, i.e. reduceGroup
.
- Annotations
- @deprecated @Public()
- Deprecated
(Since version 1.18.0)
- See also
- Alphabetic
- By Inheritance
- GroupedDataSet
- AnyRef
- Any
- Hide All
- Show All
- Public
- All
Instance Constructors
-
new
GroupedDataSet(set: DataSet[T], keys: Keys[T])(implicit arg0: ClassTag[T])
- Deprecated
All Flink Scala APIs are deprecated and will be removed in a future Flink major version. You can still build your application in Scala, but you should move to the Java version of either the DataStream and/or Table API.
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
aggregate(agg: Aggregations, field: Int): AggregateDataSet[T]
Creates a new DataSet by aggregating the specified field using the given aggregation function.
Creates a new DataSet by aggregating the specified field using the given aggregation function. Since this is a keyed DataSet the aggregation will be performed on groups of elements with the same key.
This only works on CaseClass DataSets.
-
def
aggregate(agg: Aggregations, field: String): AggregateDataSet[T]
Creates a new DataSet by aggregating the specified tuple field using the given aggregation function.
Creates a new DataSet by aggregating the specified tuple field using the given aggregation function. Since this is a keyed DataSet the aggregation will be performed on groups of tuples with the same key.
This only works on Tuple DataSets.
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
-
def
combineGroup[R](combiner: GroupCombineFunction[T, R])(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]
Applies a CombineFunction on a grouped DataSet.
Applies a CombineFunction on a grouped DataSet. A CombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, the CombineFunction calls the combine method once per partition for combining a group of results. This operator is suitable for combining values into an intermediate format before doing a proper groupReduce where the data is shuffled across the node for further reduction. The GroupReduce operator can also be supplied with a combiner by implementing the RichGroupReduce function. The combine method of the RichGroupReduce function demands input and output type to be the same. The CombineFunction, on the other side, can have an arbitrary output type.
-
def
combineGroup[R](fun: (Iterator[T], Collector[R]) ⇒ Unit)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]
Applies a CombineFunction on a grouped DataSet.
Applies a CombineFunction on a grouped DataSet. A CombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, the CombineFunction calls the combine method once per partition for combining a group of results. This operator is suitable for combining values into an intermediate format before doing a proper groupReduce where the data is shuffled across the node for further reduction. The GroupReduce operator can also be supplied with a combiner by implementing the RichGroupReduce function. The combine method of the RichGroupReduce function demands input and output type to be the same. The CombineFunction, on the other side, can have an arbitrary output type.
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
first(n: Int): DataSet[T]
Creates a new DataSet containing the first
n
elements of each group of this DataSet. -
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getCustomPartitioner[K](): Partitioner[K]
Gets the custom partitioner to be used for this grouping, or null, if none was defined.
Gets the custom partitioner to be used for this grouping, or null, if none was defined.
- Annotations
- @Internal()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
max(field: String): AggregateDataSet[T]
Syntactic sugar for aggregate with
MAX
-
def
max(field: Int): AggregateDataSet[T]
Syntactic sugar for aggregate with
MAX
-
def
maxBy(fields: Int*): DataSet[T]
Applies a special case of a reduce transformation
maxBy
on a grouped DataSet The transformation consecutively calls a ReduceFunction until only a single element remains which is the result of the transformation.Applies a special case of a reduce transformation
maxBy
on a grouped DataSet The transformation consecutively calls a ReduceFunction until only a single element remains which is the result of the transformation. A ReduceFunction combines two elements into one new element of the same type. -
def
min(field: String): AggregateDataSet[T]
Syntactic sugar for aggregate with
MIN
-
def
min(field: Int): AggregateDataSet[T]
Syntactic sugar for aggregate with
MIN
-
def
minBy(fields: Int*): DataSet[T]
Applies a special case of a reduce transformation
minBy
on a grouped DataSet.Applies a special case of a reduce transformation
minBy
on a grouped DataSet. The transformation consecutively calls a ReduceFunction until only a single element remains which is the result of the transformation. A ReduceFunction combines two elements into one new element of the same type. -
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
reduce(reducer: ReduceFunction[T], strategy: CombineHint): DataSet[T]
Special reduce operation for explicitly telling the system what strategy to use for the combine phase.
Special reduce operation for explicitly telling the system what strategy to use for the combine phase. If null is given as the strategy, then the optimizer will pick the strategy.
- Annotations
- @PublicEvolving()
-
def
reduce(reducer: ReduceFunction[T]): DataSet[T]
Creates a new DataSet by merging the elements of each group (elements with the same key) using an associative reduce function.
-
def
reduce(fun: (T, T) ⇒ T, strategy: CombineHint): DataSet[T]
Special reduce operation for explicitly telling the system what strategy to use for the combine phase.
Special reduce operation for explicitly telling the system what strategy to use for the combine phase. If null is given as the strategy, then the optimizer will pick the strategy.
- Annotations
- @PublicEvolving()
-
def
reduce(fun: (T, T) ⇒ T): DataSet[T]
Creates a new DataSet by merging the elements of each group (elements with the same key) using an associative reduce function.
-
def
reduceGroup[R](reducer: GroupReduceFunction[T, R])(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]
Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the GroupReduceFunction.
-
def
reduceGroup[R](fun: (Iterator[T], Collector[R]) ⇒ Unit)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]
Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function.
-
def
reduceGroup[R](fun: (Iterator[T]) ⇒ R)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]
Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function.
-
def
sortGroup[K](fun: (T) ⇒ K, order: Order)(implicit arg0: TypeInformation[K]): GroupedDataSet[T]
Adds a secondary sort key to this GroupedDataSet.
Adds a secondary sort key to this GroupedDataSet. This will only have an effect if you use one of the group-at-a-time, i.e.
reduceGroup
.This works on any data type.
-
def
sortGroup(field: String, order: Order): GroupedDataSet[T]
Adds a secondary sort key to this GroupedDataSet.
Adds a secondary sort key to this GroupedDataSet. This will only have an effect if you use one of the group-at-a-time, i.e.
reduceGroup
.This only works on CaseClass DataSets.
-
def
sortGroup(field: Int, order: Order): GroupedDataSet[T]
Adds a secondary sort key to this GroupedDataSet.
Adds a secondary sort key to this GroupedDataSet. This will only have an effect if you use one of the group-at-a-time, i.e.
reduceGroup
.This only works on Tuple DataSets.
-
def
sum(field: String): AggregateDataSet[T]
Syntactic sugar for aggregate with
SUM
-
def
sum(field: Int): AggregateDataSet[T]
Syntactic sugar for aggregate with
SUM
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
-
def
withPartitioner[K](partitioner: Partitioner[K])(implicit arg0: TypeInformation[K]): GroupedDataSet[T]
Sets a custom partitioner for the grouping.