Interface KeyedPartitionStream<K,T>
-
- All Superinterfaces:
DataStream
- All Known Subinterfaces:
KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,T>
- All Known Implementing Classes:
KeyedPartitionStreamImpl
,ProcessConfigurableAndKeyedPartitionStreamImpl
@Experimental public interface KeyedPartitionStream<K,T> extends DataStream
This interface represents a kind of partitioned data stream. For this stream, each key is a partition, and the partition to which the data belongs is deterministic.
-
-
Nested Class Summary
Nested Classes Modifier and Type Interface Description static interface
KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,T>
This interface represents a configurableKeyedPartitionStream
.static interface
KeyedPartitionStream.ProcessConfigurableAndTwoKeyedPartitionStreams<K,T1,T2>
This class represents a combination of twoKeyedPartitionStream
.
-
Method Summary
-
-
-
Method Detail
-
process
<OUT> KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,OUT> process(OneInputStreamProcessFunction<T,OUT> processFunction, KeySelector<OUT,K> newKeySelector)
Apply an operation to thisKeyedPartitionStream
.This method is used to avoid shuffle after applying the process function. It is required that for the same record, the new
KeySelector
must extract the same key as the originalKeySelector
on thisKeyedPartitionStream
. Otherwise, the partition of data will be messy.- Parameters:
processFunction
- to perform operation.newKeySelector
- to select the key after process.- Returns:
- new
KeyedPartitionStream
with this operation.
-
process
<OUT> NonKeyedPartitionStream.ProcessConfigurableAndNonKeyedPartitionStream<OUT> process(OneInputStreamProcessFunction<T,OUT> processFunction)
Apply an operation to thisKeyedPartitionStream
.Generally, apply an operation to a
KeyedPartitionStream
will result in aNonKeyedPartitionStream
, and you can manually generate aKeyedPartitionStream
via keyBy partitioning. In some cases, you can guarantee that the partition on which the data is processed will not change, then you can useprocess(OneInputStreamProcessFunction, KeySelector)
to avoid shuffling.- Parameters:
processFunction
- to perform operation.- Returns:
- new
NonKeyedPartitionStream
with this operation.
-
process
<OUT1,OUT2> KeyedPartitionStream.ProcessConfigurableAndTwoKeyedPartitionStreams<K,OUT1,OUT2> process(TwoOutputStreamProcessFunction<T,OUT1,OUT2> processFunction, KeySelector<OUT1,K> keySelector1, KeySelector<OUT2,K> keySelector2)
Apply a two output operation to thisKeyedPartitionStream
.This method is used to avoid shuffle after applying the process function. It is required that for the same record, these new two
KeySelector
s must extract the same key as the originalKeySelector
s on thisKeyedPartitionStream
. Otherwise, the partition of data will be messy.- Parameters:
processFunction
- to perform two output operation.keySelector1
- to select the key of first output.keySelector2
- to select the key of second output.- Returns:
- new
KeyedPartitionStream.ProcessConfigurableAndTwoKeyedPartitionStreams
with this operation.
-
process
<OUT1,OUT2> NonKeyedPartitionStream.ProcessConfigurableAndTwoNonKeyedPartitionStream<OUT1,OUT2> process(TwoOutputStreamProcessFunction<T,OUT1,OUT2> processFunction)
Apply a two output operation to thisKeyedPartitionStream
.- Parameters:
processFunction
- to perform two output operation.- Returns:
- new
NonKeyedPartitionStream.ProcessConfigurableAndTwoNonKeyedPartitionStream
with this operation.
-
connectAndProcess
<T_OTHER,OUT> NonKeyedPartitionStream.ProcessConfigurableAndNonKeyedPartitionStream<OUT> connectAndProcess(KeyedPartitionStream<K,T_OTHER> other, TwoInputNonBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction)
Apply a two input operation to this and otherKeyedPartitionStream
. The two keyed streams must have the same partitions, otherwise it makes no sense to connect them.Generally, concatenating two
KeyedPartitionStream
will result in aNonKeyedPartitionStream
, and you can manually generate aKeyedPartitionStream
via keyBy partitioning. In some cases, you can guarantee that the partition on which the data is processed will not change, then you can useconnectAndProcess(KeyedPartitionStream, TwoInputNonBroadcastStreamProcessFunction, KeySelector)
to avoid shuffling.- Parameters:
other
-KeyedPartitionStream
to perform operation with two input.processFunction
- to perform operation.- Returns:
- new
NonKeyedPartitionStream
with this operation.
-
connectAndProcess
<T_OTHER,OUT> KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,OUT> connectAndProcess(KeyedPartitionStream<K,T_OTHER> other, TwoInputNonBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction, KeySelector<OUT,K> newKeySelector)
Apply a two input operation to this and otherKeyedPartitionStream
.The two keyed streams must have the same partitions, otherwise it makes no sense to connect them.This method is used to avoid shuffle after applying the process function. It is required that for the same record, the new
KeySelector
must extract the same key as the originalKeySelector
s on these twoKeyedPartitionStream
s. Otherwise, the partition of data will be messy.- Parameters:
other
-KeyedPartitionStream
to perform operation with two input.processFunction
- to perform operation.newKeySelector
- to select the key after process.- Returns:
- new
KeyedPartitionStream
with this operation.
-
connectAndProcess
<T_OTHER,OUT> NonKeyedPartitionStream.ProcessConfigurableAndNonKeyedPartitionStream<OUT> connectAndProcess(BroadcastStream<T_OTHER> other, TwoInputBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction)
Apply a two input operation to this and otherBroadcastStream
.Generally, concatenating
KeyedPartitionStream
andBroadcastStream
will result in aNonKeyedPartitionStream
, and you can manually generate aKeyedPartitionStream
via keyBy partitioning. In some cases, you can guarantee that the partition on which the data is processed will not change, then you can useconnectAndProcess(BroadcastStream, TwoInputBroadcastStreamProcessFunction, KeySelector)
to avoid shuffling.- Parameters:
processFunction
- to perform operation.- Returns:
- new stream with this operation.
-
connectAndProcess
<T_OTHER,OUT> KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,OUT> connectAndProcess(BroadcastStream<T_OTHER> other, TwoInputBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction, KeySelector<OUT,K> newKeySelector)
Apply a two input operation to this and otherBroadcastStream
.This method is used to avoid shuffle after applying the process function. It is required that for the record from non-broadcast input, the new
KeySelector
must extract the same key as the originalKeySelector
s on theKeyedPartitionStream
. Otherwise, the partition of data will be messy. As for the record from broadcast input, the output key from keyed partition itself instead of the new key selector, so the data it outputs will not affect the partition.- Parameters:
other
-BroadcastStream
to perform operation with two input.processFunction
- to perform operation.newKeySelector
- to select the key after process.- Returns:
- new
KeyedPartitionStream
with this operation.
-
global
GlobalStream<T> global()
Coalesce this stream to aGlobalStream
.- Returns:
- the coalesced global stream.
-
keyBy
<NEW_KEY> KeyedPartitionStream<NEW_KEY,T> keyBy(KeySelector<T,NEW_KEY> keySelector)
Transform this stream to a newKeyedPartitionStream
.- Parameters:
keySelector
- to decide how to map data to partition.- Returns:
- the transformed stream partitioned by key.
-
shuffle
NonKeyedPartitionStream<T> shuffle()
Transform this stream to a newNonKeyedPartitionStream
, data will be shuffled between these two streams.- Returns:
- the transformed stream after shuffle.
-
broadcast
BroadcastStream<T> broadcast()
Transform this stream to a newBroadcastStream
.- Returns:
- the transformed
BroadcastStream
.
-
toSink
ProcessConfigurable<?> toSink(Sink<T> sink)
-
-