Class KeyedStream<T,KEY>
- java.lang.Object
-
- org.apache.flink.streaming.api.datastream.DataStream<T>
-
- org.apache.flink.streaming.api.datastream.KeyedStream<T,KEY>
-
- Type Parameters:
T
- The type of the elements in the Keyed Stream.KEY
- The type of the key in the Keyed Stream.
@Public public class KeyedStream<T,KEY> extends DataStream<T>
AKeyedStream
represents aDataStream
on which operator state is partitioned by key using a providedKeySelector
. Typical operations supported by aDataStream
are also possible on aKeyedStream
, with the exception of partitioning methods such as shuffle, forward and keyBy.Reduce-style operations, such as
reduce(org.apache.flink.api.common.functions.ReduceFunction<T>)
, andsum(int)
work on elements that have the same key.
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
KeyedStream.IntervalJoin<T1,T2,KEY>
Perform a join over a time interval.static class
KeyedStream.IntervalJoined<IN1,IN2,KEY>
IntervalJoined is a container for two streams that have keys for both sides as well as the time boundaries over which elements should be joined.-
Nested classes/interfaces inherited from class org.apache.flink.streaming.api.datastream.DataStream
DataStream.Collector<T>
-
-
Field Summary
-
Fields inherited from class org.apache.flink.streaming.api.datastream.DataStream
environment, transformation
-
-
Constructor Summary
Constructors Constructor Description KeyedStream(DataStream<T> dataStream, KeySelector<T,KEY> keySelector)
Creates a newKeyedStream
using the givenKeySelector
to partition operator state by key.KeyedStream(DataStream<T> dataStream, KeySelector<T,KEY> keySelector, TypeInformation<KEY> keyType)
Creates a newKeyedStream
using the givenKeySelector
to partition operator state by key.
-
Method Summary
All Methods Instance Methods Concrete Methods Deprecated Methods Modifier and Type Method Description DataStreamSink<T>
addSink(SinkFunction<T> sinkFunction)
Adds the given sink to this DataStream.protected SingleOutputStreamOperator<T>
aggregate(AggregationFunction<T> aggregate)
QueryableStateStream<KEY,T>
asQueryableState(String queryableStateName)
Deprecated.The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.QueryableStateStream<KEY,T>
asQueryableState(String queryableStateName, ReducingStateDescriptor<T> stateDescriptor)
Deprecated.The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.QueryableStateStream<KEY,T>
asQueryableState(String queryableStateName, ValueStateDescriptor<T> stateDescriptor)
Deprecated.The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.WindowedStream<T,KEY,GlobalWindow>
countWindow(long size)
Windows thisKeyedStream
into tumbling count windows.WindowedStream<T,KEY,GlobalWindow>
countWindow(long size, long slide)
Windows thisKeyedStream
into sliding count windows.protected <R> SingleOutputStreamOperator<R>
doTransform(String operatorName, TypeInformation<R> outTypeInfo, StreamOperatorFactory<R> operatorFactory)
PartitionWindowedStream<T>
fullWindowPartition()
Collect records from each partition into a separate full window.KeySelector<T,KEY>
getKeySelector()
Gets the key selector that can get the key by which the stream if partitioned from the elements.TypeInformation<KEY>
getKeyType()
Gets the type of the key by which the stream is partitioned.<T1> KeyedStream.IntervalJoin<T,T1,KEY>
intervalJoin(KeyedStream<T1,KEY> otherStream)
Join elements of thisKeyedStream
with elements of anotherKeyedStream
over a time interval that can be specified withKeyedStream.IntervalJoin.between(Duration, Duration)
.SingleOutputStreamOperator<T>
max(int positionToMax)
Applies an aggregation that gives the current maximum of the data stream at the given position by the given key.SingleOutputStreamOperator<T>
max(String field)
Applies an aggregation that gives the current maximum of the data stream at the given field expression by the given key.SingleOutputStreamOperator<T>
maxBy(int positionToMaxBy)
Applies an aggregation that gives the current element with the maximum value at the given position by the given key.SingleOutputStreamOperator<T>
maxBy(int positionToMaxBy, boolean first)
Applies an aggregation that gives the current element with the maximum value at the given position by the given key.SingleOutputStreamOperator<T>
maxBy(String positionToMaxBy)
Applies an aggregation that gives the current element with the maximum value at the given position by the given key.SingleOutputStreamOperator<T>
maxBy(String field, boolean first)
Applies an aggregation that gives the current maximum element of the data stream by the given field expression by the given key.SingleOutputStreamOperator<T>
min(int positionToMin)
Applies an aggregation that gives the current minimum of the data stream at the given position by the given key.SingleOutputStreamOperator<T>
min(String field)
Applies an aggregation that gives the current minimum of the data stream at the given field expression by the given key.SingleOutputStreamOperator<T>
minBy(int positionToMinBy)
Applies an aggregation that gives the current element with the minimum value at the given position by the given key.SingleOutputStreamOperator<T>
minBy(int positionToMinBy, boolean first)
Applies an aggregation that gives the current element with the minimum value at the given position by the given key.SingleOutputStreamOperator<T>
minBy(String positionToMinBy)
Applies an aggregation that gives the current element with the minimum value at the given position by the given key.SingleOutputStreamOperator<T>
minBy(String field, boolean first)
Applies an aggregation that gives the current minimum element of the data stream by the given field expression by the given key.<R> SingleOutputStreamOperator<R>
process(KeyedProcessFunction<KEY,T,R> keyedProcessFunction)
Applies the givenKeyedProcessFunction
on the input stream, thereby creating a transformed output stream.<R> SingleOutputStreamOperator<R>
process(KeyedProcessFunction<KEY,T,R> keyedProcessFunction, TypeInformation<R> outputType)
Applies the givenKeyedProcessFunction
on the input stream, thereby creating a transformed output stream.SingleOutputStreamOperator<T>
reduce(ReduceFunction<T> reducer)
Applies a reduce transformation on the grouped data stream grouped on by the given key position.protected DataStream<T>
setConnectionType(StreamPartitioner<T> partitioner)
Internal function for setting the partitioner for the DataStream.SingleOutputStreamOperator<T>
sum(int positionToSum)
Applies an aggregation that gives a rolling sum of the data stream at the given position grouped by the given key.SingleOutputStreamOperator<T>
sum(String field)
Applies an aggregation that gives the current sum of the data stream at the given field by the given key.<W extends Window>
WindowedStream<T,KEY,W>window(WindowAssigner<? super T,W> assigner)
Windows this data stream to aWindowedStream
, which evaluates windows over a key grouped stream.-
Methods inherited from class org.apache.flink.streaming.api.datastream.DataStream
assignTimestampsAndWatermarks, broadcast, broadcast, clean, coGroup, collectAsync, collectAsync, connect, connect, countWindowAll, countWindowAll, executeAndCollect, executeAndCollect, executeAndCollect, executeAndCollect, filter, flatMap, flatMap, forward, getExecutionConfig, getExecutionEnvironment, getId, getMinResources, getParallelism, getPreferredResources, getTransformation, getType, global, join, keyBy, keyBy, keyBy, map, map, partitionCustom, print, print, printToErr, printToErr, process, process, project, rebalance, rescale, shuffle, sinkTo, sinkTo, transform, transform, union, windowAll, writeToSocket, writeUsingOutputFormat
-
-
-
-
Constructor Detail
-
KeyedStream
public KeyedStream(DataStream<T> dataStream, KeySelector<T,KEY> keySelector)
Creates a newKeyedStream
using the givenKeySelector
to partition operator state by key.- Parameters:
dataStream
- Base stream of datakeySelector
- Function for determining state partitions
-
KeyedStream
public KeyedStream(DataStream<T> dataStream, KeySelector<T,KEY> keySelector, TypeInformation<KEY> keyType)
Creates a newKeyedStream
using the givenKeySelector
to partition operator state by key.- Parameters:
dataStream
- Base stream of datakeySelector
- Function for determining state partitions
-
-
Method Detail
-
getKeySelector
@Internal public KeySelector<T,KEY> getKeySelector()
Gets the key selector that can get the key by which the stream if partitioned from the elements.- Returns:
- The key selector for the key.
-
getKeyType
@Internal public TypeInformation<KEY> getKeyType()
Gets the type of the key by which the stream is partitioned.- Returns:
- The type of the key by which the stream is partitioned.
-
setConnectionType
protected DataStream<T> setConnectionType(StreamPartitioner<T> partitioner)
Description copied from class:DataStream
Internal function for setting the partitioner for the DataStream.- Overrides:
setConnectionType
in classDataStream<T>
- Parameters:
partitioner
- Partitioner to set.- Returns:
- The modified DataStream.
-
doTransform
protected <R> SingleOutputStreamOperator<R> doTransform(String operatorName, TypeInformation<R> outTypeInfo, StreamOperatorFactory<R> operatorFactory)
- Overrides:
doTransform
in classDataStream<T>
-
addSink
public DataStreamSink<T> addSink(SinkFunction<T> sinkFunction)
Description copied from class:DataStream
Adds the given sink to this DataStream. Only streams with sinks added will be executed once theStreamExecutionEnvironment.execute()
method is called.- Overrides:
addSink
in classDataStream<T>
- Parameters:
sinkFunction
- The object containing the sink's invoke function.- Returns:
- The closed DataStream.
-
process
@PublicEvolving public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY,T,R> keyedProcessFunction)
Applies the givenKeyedProcessFunction
on the input stream, thereby creating a transformed output stream.The function will be called for every element in the input streams and can produce zero or more output elements. Contrary to the
DataStream.flatMap(FlatMapFunction)
function, this function can also query the time and set timers. When reacting to the firing of set timers the function can directly emit elements and/or register yet more timers.- Type Parameters:
R
- The type of elements emitted by theKeyedProcessFunction
.- Parameters:
keyedProcessFunction
- TheKeyedProcessFunction
that is called for each element in the stream.- Returns:
- The transformed
DataStream
.
-
process
@Internal public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY,T,R> keyedProcessFunction, TypeInformation<R> outputType)
Applies the givenKeyedProcessFunction
on the input stream, thereby creating a transformed output stream.The function will be called for every element in the input streams and can produce zero or more output elements. Contrary to the
DataStream.flatMap(FlatMapFunction)
function, this function can also query the time and set timers. When reacting to the firing of set timers the function can directly emit elements and/or register yet more timers.- Type Parameters:
R
- The type of elements emitted by theKeyedProcessFunction
.- Parameters:
keyedProcessFunction
- TheKeyedProcessFunction
that is called for each element in the stream.outputType
-TypeInformation
for the result type of the function.- Returns:
- The transformed
DataStream
.
-
intervalJoin
@PublicEvolving public <T1> KeyedStream.IntervalJoin<T,T1,KEY> intervalJoin(KeyedStream<T1,KEY> otherStream)
Join elements of thisKeyedStream
with elements of anotherKeyedStream
over a time interval that can be specified withKeyedStream.IntervalJoin.between(Duration, Duration)
.- Type Parameters:
T1
- Type parameter of elements in the other stream- Parameters:
otherStream
- The other keyed stream to join this keyed stream with- Returns:
- An instance of
KeyedStream.IntervalJoin
with this keyed stream and the other keyed stream
-
countWindow
public WindowedStream<T,KEY,GlobalWindow> countWindow(long size)
Windows thisKeyedStream
into tumbling count windows.- Parameters:
size
- The size of the windows in number of elements.
-
countWindow
public WindowedStream<T,KEY,GlobalWindow> countWindow(long size, long slide)
Windows thisKeyedStream
into sliding count windows.- Parameters:
size
- The size of the windows in number of elements.slide
- The slide interval in number of elements.
-
window
@PublicEvolving public <W extends Window> WindowedStream<T,KEY,W> window(WindowAssigner<? super T,W> assigner)
Windows this data stream to aWindowedStream
, which evaluates windows over a key grouped stream. Elements are put into windows by aWindowAssigner
. The grouping of elements is done both by key and by window.A
Trigger
can be defined to specify when windows are evaluated. However,WindowAssigners
have a defaultTrigger
that is used if aTrigger
is not specified.- Parameters:
assigner
- TheWindowAssigner
that assigns elements to windows.- Returns:
- The trigger windows data stream.
-
reduce
public SingleOutputStreamOperator<T> reduce(ReduceFunction<T> reducer)
Applies a reduce transformation on the grouped data stream grouped on by the given key position. TheReduceFunction
will receive input values based on the key value. Only input values with the same key will go to the same reducer.- Parameters:
reducer
- TheReduceFunction
that will be called for every element of the input values with the same key.- Returns:
- The transformed DataStream.
-
sum
public SingleOutputStreamOperator<T> sum(int positionToSum)
Applies an aggregation that gives a rolling sum of the data stream at the given position grouped by the given key. An independent aggregate is kept per key.- Parameters:
positionToSum
- The field position in the data points to sum. This is applicable to Tuple types, basic and primitive array types, Scala case classes, and primitive types (which is considered as having one field).- Returns:
- The transformed DataStream.
-
sum
public SingleOutputStreamOperator<T> sum(String field)
Applies an aggregation that gives the current sum of the data stream at the given field by the given key. An independent aggregate is kept per key.- Parameters:
field
- In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in"field1.fieldxy"
. Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).- Returns:
- The transformed DataStream.
-
min
public SingleOutputStreamOperator<T> min(int positionToMin)
Applies an aggregation that gives the current minimum of the data stream at the given position by the given key. An independent aggregate is kept per key.- Parameters:
positionToMin
- The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).- Returns:
- The transformed DataStream.
-
min
public SingleOutputStreamOperator<T> min(String field)
Applies an aggregation that gives the current minimum of the data stream at the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of theDataStream
's underlying type. A dot can be used to drill down into objects, as in"field1.fieldxy"
.- Parameters:
field
- In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in"field1.fieldxy"
. Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).- Returns:
- The transformed DataStream.
-
max
public SingleOutputStreamOperator<T> max(int positionToMax)
Applies an aggregation that gives the current maximum of the data stream at the given position by the given key. An independent aggregate is kept per key.- Parameters:
positionToMax
- The field position in the data points to maximize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).- Returns:
- The transformed DataStream.
-
max
public SingleOutputStreamOperator<T> max(String field)
Applies an aggregation that gives the current maximum of the data stream at the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of theDataStream
's underlying type. A dot can be used to drill down into objects, as in"field1.fieldxy"
.- Parameters:
field
- In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in"field1.fieldxy"
. Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).- Returns:
- The transformed DataStream.
-
minBy
public SingleOutputStreamOperator<T> minBy(String field, boolean first)
Applies an aggregation that gives the current minimum element of the data stream by the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of theDataStream
's underlying type. A dot can be used to drill down into objects, as in"field1.fieldxy"
.- Parameters:
field
- In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in"field1.fieldxy"
. Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).first
- If True then in case of field equality the first object will be returned- Returns:
- The transformed DataStream.
-
maxBy
public SingleOutputStreamOperator<T> maxBy(String field, boolean first)
Applies an aggregation that gives the current maximum element of the data stream by the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of theDataStream
's underlying type. A dot can be used to drill down into objects, as in"field1.fieldxy"
.- Parameters:
field
- In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in"field1.fieldxy"
. Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).first
- If True then in case of field equality the first object will be returned- Returns:
- The transformed DataStream.
-
minBy
public SingleOutputStreamOperator<T> minBy(int positionToMinBy)
Applies an aggregation that gives the current element with the minimum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the minimum value at the given position, the operator returns the first one by default.- Parameters:
positionToMinBy
- The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).- Returns:
- The transformed DataStream.
-
minBy
public SingleOutputStreamOperator<T> minBy(String positionToMinBy)
Applies an aggregation that gives the current element with the minimum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the minimum value at the given position, the operator returns the first one by default.- Parameters:
positionToMinBy
- In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in"field1.fieldxy"
. Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).- Returns:
- The transformed DataStream.
-
minBy
public SingleOutputStreamOperator<T> minBy(int positionToMinBy, boolean first)
Applies an aggregation that gives the current element with the minimum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the minimum value at the given position, the operator returns either the first or last one, depending on the parameter set.- Parameters:
positionToMinBy
- The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).first
- If true, then the operator return the first element with the minimal value, otherwise returns the last- Returns:
- The transformed DataStream.
-
maxBy
public SingleOutputStreamOperator<T> maxBy(int positionToMaxBy)
Applies an aggregation that gives the current element with the maximum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the maximum value at the given position, the operator returns the first one by default.- Parameters:
positionToMaxBy
- The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).- Returns:
- The transformed DataStream.
-
maxBy
public SingleOutputStreamOperator<T> maxBy(String positionToMaxBy)
Applies an aggregation that gives the current element with the maximum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the maximum value at the given position, the operator returns the first one by default.- Parameters:
positionToMaxBy
- In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in"field1.fieldxy"
. Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).- Returns:
- The transformed DataStream.
-
maxBy
public SingleOutputStreamOperator<T> maxBy(int positionToMaxBy, boolean first)
Applies an aggregation that gives the current element with the maximum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the maximum value at the given position, the operator returns either the first or last one, depending on the parameter set.- Parameters:
positionToMaxBy
- The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).first
- If true, then the operator return the first element with the maximum value, otherwise returns the last- Returns:
- The transformed DataStream.
-
aggregate
protected SingleOutputStreamOperator<T> aggregate(AggregationFunction<T> aggregate)
-
fullWindowPartition
@PublicEvolving public PartitionWindowedStream<T> fullWindowPartition()
Collect records from each partition into a separate full window. The window emission will be triggered at the end of inputs. For this keyed data stream(each record has a key), a partition only contains all records with the same key.- Overrides:
fullWindowPartition
in classDataStream<T>
- Returns:
- The full windowed data stream on partition.
-
asQueryableState
@PublicEvolving @Deprecated public QueryableStateStream<KEY,T> asQueryableState(String queryableStateName)
Deprecated.The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.Publishes the keyed stream as queryable ValueState instance.- Parameters:
queryableStateName
- Name under which to the publish the queryable state instance- Returns:
- Queryable state instance
-
asQueryableState
@PublicEvolving @Deprecated public QueryableStateStream<KEY,T> asQueryableState(String queryableStateName, ValueStateDescriptor<T> stateDescriptor)
Deprecated.The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.Publishes the keyed stream as a queryable ValueState instance.- Parameters:
queryableStateName
- Name under which to the publish the queryable state instancestateDescriptor
- State descriptor to create state instance from- Returns:
- Queryable state instance
-
asQueryableState
@PublicEvolving @Deprecated public QueryableStateStream<KEY,T> asQueryableState(String queryableStateName, ReducingStateDescriptor<T> stateDescriptor)
Deprecated.The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.Publishes the keyed stream as a queryable ReducingState instance.- Parameters:
queryableStateName
- Name under which to the publish the queryable state instancestateDescriptor
- State descriptor to create state instance from- Returns:
- Queryable state instance
-
-