Class KeyedStream<T,​KEY>

  • Type Parameters:
    T - The type of the elements in the Keyed Stream.
    KEY - The type of the key in the Keyed Stream.

    @Public
    public class KeyedStream<T,​KEY>
    extends DataStream<T>
    A KeyedStream represents a DataStream on which operator state is partitioned by key using a provided KeySelector. Typical operations supported by a DataStream are also possible on a KeyedStream, with the exception of partitioning methods such as shuffle, forward and keyBy.

    Reduce-style operations, such as reduce(org.apache.flink.api.common.functions.ReduceFunction<T>), and sum(int) work on elements that have the same key.

    • Constructor Detail

      • KeyedStream

        public KeyedStream​(DataStream<T> dataStream,
                           KeySelector<T,​KEY> keySelector)
        Creates a new KeyedStream using the given KeySelector to partition operator state by key.
        Parameters:
        dataStream - Base stream of data
        keySelector - Function for determining state partitions
    • Method Detail

      • getKeySelector

        @Internal
        public KeySelector<T,​KEY> getKeySelector()
        Gets the key selector that can get the key by which the stream if partitioned from the elements.
        Returns:
        The key selector for the key.
      • getKeyType

        @Internal
        public TypeInformation<KEY> getKeyType()
        Gets the type of the key by which the stream is partitioned.
        Returns:
        The type of the key by which the stream is partitioned.
      • setConnectionType

        protected DataStream<T> setConnectionType​(StreamPartitioner<T> partitioner)
        Description copied from class: DataStream
        Internal function for setting the partitioner for the DataStream.
        Overrides:
        setConnectionType in class DataStream<T>
        Parameters:
        partitioner - Partitioner to set.
        Returns:
        The modified DataStream.
      • process

        @PublicEvolving
        public <R> SingleOutputStreamOperator<R> process​(KeyedProcessFunction<KEY,​T,​R> keyedProcessFunction)
        Applies the given KeyedProcessFunction on the input stream, thereby creating a transformed output stream.

        The function will be called for every element in the input streams and can produce zero or more output elements. Contrary to the DataStream.flatMap(FlatMapFunction) function, this function can also query the time and set timers. When reacting to the firing of set timers the function can directly emit elements and/or register yet more timers.

        Type Parameters:
        R - The type of elements emitted by the KeyedProcessFunction.
        Parameters:
        keyedProcessFunction - The KeyedProcessFunction that is called for each element in the stream.
        Returns:
        The transformed DataStream.
      • process

        @Internal
        public <R> SingleOutputStreamOperator<R> process​(KeyedProcessFunction<KEY,​T,​R> keyedProcessFunction,
                                                         TypeInformation<R> outputType)
        Applies the given KeyedProcessFunction on the input stream, thereby creating a transformed output stream.

        The function will be called for every element in the input streams and can produce zero or more output elements. Contrary to the DataStream.flatMap(FlatMapFunction) function, this function can also query the time and set timers. When reacting to the firing of set timers the function can directly emit elements and/or register yet more timers.

        Type Parameters:
        R - The type of elements emitted by the KeyedProcessFunction.
        Parameters:
        keyedProcessFunction - The KeyedProcessFunction that is called for each element in the stream.
        outputType - TypeInformation for the result type of the function.
        Returns:
        The transformed DataStream.
      • countWindow

        public WindowedStream<T,​KEY,​GlobalWindow> countWindow​(long size)
        Windows this KeyedStream into tumbling count windows.
        Parameters:
        size - The size of the windows in number of elements.
      • countWindow

        public WindowedStream<T,​KEY,​GlobalWindow> countWindow​(long size,
                                                                          long slide)
        Windows this KeyedStream into sliding count windows.
        Parameters:
        size - The size of the windows in number of elements.
        slide - The slide interval in number of elements.
      • window

        @PublicEvolving
        public <W extends WindowWindowedStream<T,​KEY,​W> window​(WindowAssigner<? super T,​W> assigner)
        Windows this data stream to a WindowedStream, which evaluates windows over a key grouped stream. Elements are put into windows by a WindowAssigner. The grouping of elements is done both by key and by window.

        A Trigger can be defined to specify when windows are evaluated. However, WindowAssigners have a default Trigger that is used if a Trigger is not specified.

        Parameters:
        assigner - The WindowAssigner that assigns elements to windows.
        Returns:
        The trigger windows data stream.
      • reduce

        public SingleOutputStreamOperator<T> reduce​(ReduceFunction<T> reducer)
        Applies a reduce transformation on the grouped data stream grouped on by the given key position. The ReduceFunction will receive input values based on the key value. Only input values with the same key will go to the same reducer.
        Parameters:
        reducer - The ReduceFunction that will be called for every element of the input values with the same key.
        Returns:
        The transformed DataStream.
      • sum

        public SingleOutputStreamOperator<T> sum​(int positionToSum)
        Applies an aggregation that gives a rolling sum of the data stream at the given position grouped by the given key. An independent aggregate is kept per key.
        Parameters:
        positionToSum - The field position in the data points to sum. This is applicable to Tuple types, basic and primitive array types, Scala case classes, and primitive types (which is considered as having one field).
        Returns:
        The transformed DataStream.
      • sum

        public SingleOutputStreamOperator<T> sum​(String field)
        Applies an aggregation that gives the current sum of the data stream at the given field by the given key. An independent aggregate is kept per key.
        Parameters:
        field - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy" . Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
        Returns:
        The transformed DataStream.
      • min

        public SingleOutputStreamOperator<T> min​(int positionToMin)
        Applies an aggregation that gives the current minimum of the data stream at the given position by the given key. An independent aggregate is kept per key.
        Parameters:
        positionToMin - The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).
        Returns:
        The transformed DataStream.
      • min

        public SingleOutputStreamOperator<T> min​(String field)
        Applies an aggregation that gives the current minimum of the data stream at the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of the DataStream's underlying type. A dot can be used to drill down into objects, as in "field1.fieldxy" .
        Parameters:
        field - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy" . Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
        Returns:
        The transformed DataStream.
      • max

        public SingleOutputStreamOperator<T> max​(int positionToMax)
        Applies an aggregation that gives the current maximum of the data stream at the given position by the given key. An independent aggregate is kept per key.
        Parameters:
        positionToMax - The field position in the data points to maximize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).
        Returns:
        The transformed DataStream.
      • max

        public SingleOutputStreamOperator<T> max​(String field)
        Applies an aggregation that gives the current maximum of the data stream at the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of the DataStream's underlying type. A dot can be used to drill down into objects, as in "field1.fieldxy" .
        Parameters:
        field - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy" . Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
        Returns:
        The transformed DataStream.
      • minBy

        public SingleOutputStreamOperator<T> minBy​(String field,
                                                   boolean first)
        Applies an aggregation that gives the current minimum element of the data stream by the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of the DataStream's underlying type. A dot can be used to drill down into objects, as in "field1.fieldxy" .
        Parameters:
        field - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy" . Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
        first - If True then in case of field equality the first object will be returned
        Returns:
        The transformed DataStream.
      • maxBy

        public SingleOutputStreamOperator<T> maxBy​(String field,
                                                   boolean first)
        Applies an aggregation that gives the current maximum element of the data stream by the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of the DataStream's underlying type. A dot can be used to drill down into objects, as in "field1.fieldxy" .
        Parameters:
        field - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy" . Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
        first - If True then in case of field equality the first object will be returned
        Returns:
        The transformed DataStream.
      • minBy

        public SingleOutputStreamOperator<T> minBy​(int positionToMinBy)
        Applies an aggregation that gives the current element with the minimum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the minimum value at the given position, the operator returns the first one by default.
        Parameters:
        positionToMinBy - The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).
        Returns:
        The transformed DataStream.
      • minBy

        public SingleOutputStreamOperator<T> minBy​(String positionToMinBy)
        Applies an aggregation that gives the current element with the minimum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the minimum value at the given position, the operator returns the first one by default.
        Parameters:
        positionToMinBy - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy" . Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
        Returns:
        The transformed DataStream.
      • minBy

        public SingleOutputStreamOperator<T> minBy​(int positionToMinBy,
                                                   boolean first)
        Applies an aggregation that gives the current element with the minimum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the minimum value at the given position, the operator returns either the first or last one, depending on the parameter set.
        Parameters:
        positionToMinBy - The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).
        first - If true, then the operator return the first element with the minimal value, otherwise returns the last
        Returns:
        The transformed DataStream.
      • maxBy

        public SingleOutputStreamOperator<T> maxBy​(int positionToMaxBy)
        Applies an aggregation that gives the current element with the maximum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the maximum value at the given position, the operator returns the first one by default.
        Parameters:
        positionToMaxBy - The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).
        Returns:
        The transformed DataStream.
      • maxBy

        public SingleOutputStreamOperator<T> maxBy​(String positionToMaxBy)
        Applies an aggregation that gives the current element with the maximum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the maximum value at the given position, the operator returns the first one by default.
        Parameters:
        positionToMaxBy - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy" . Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
        Returns:
        The transformed DataStream.
      • maxBy

        public SingleOutputStreamOperator<T> maxBy​(int positionToMaxBy,
                                                   boolean first)
        Applies an aggregation that gives the current element with the maximum value at the given position by the given key. An independent aggregate is kept per key. If more elements have the maximum value at the given position, the operator returns either the first or last one, depending on the parameter set.
        Parameters:
        positionToMaxBy - The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which is considered as having one field).
        first - If true, then the operator return the first element with the maximum value, otherwise returns the last
        Returns:
        The transformed DataStream.
      • fullWindowPartition

        @PublicEvolving
        public PartitionWindowedStream<T> fullWindowPartition()
        Collect records from each partition into a separate full window. The window emission will be triggered at the end of inputs. For this keyed data stream(each record has a key), a partition only contains all records with the same key.
        Overrides:
        fullWindowPartition in class DataStream<T>
        Returns:
        The full windowed data stream on partition.
      • asQueryableState

        @PublicEvolving
        @Deprecated
        public QueryableStateStream<KEY,​T> asQueryableState​(String queryableStateName)
        Deprecated.
        The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.
        Publishes the keyed stream as queryable ValueState instance.
        Parameters:
        queryableStateName - Name under which to the publish the queryable state instance
        Returns:
        Queryable state instance
      • asQueryableState

        @PublicEvolving
        @Deprecated
        public QueryableStateStream<KEY,​T> asQueryableState​(String queryableStateName,
                                                                  ValueStateDescriptor<T> stateDescriptor)
        Deprecated.
        The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.
        Publishes the keyed stream as a queryable ValueState instance.
        Parameters:
        queryableStateName - Name under which to the publish the queryable state instance
        stateDescriptor - State descriptor to create state instance from
        Returns:
        Queryable state instance
      • asQueryableState

        @PublicEvolving
        @Deprecated
        public QueryableStateStream<KEY,​T> asQueryableState​(String queryableStateName,
                                                                  ReducingStateDescriptor<T> stateDescriptor)
        Deprecated.
        The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.
        Publishes the keyed stream as a queryable ReducingState instance.
        Parameters:
        queryableStateName - Name under which to the publish the queryable state instance
        stateDescriptor - State descriptor to create state instance from
        Returns:
        Queryable state instance