@Experimental public interface BroadcastStream<T> extends DataStream
Modifier and Type | Method and Description |
---|---|
<K,T_OTHER,OUT> |
connectAndProcess(KeyedPartitionStream<K,T_OTHER> other,
TwoInputBroadcastStreamProcessFunction<T_OTHER,T,OUT> processFunction)
Apply a two input operation to this and other
KeyedPartitionStream . |
<K,T_OTHER,OUT> |
connectAndProcess(KeyedPartitionStream<K,T_OTHER> other,
TwoInputBroadcastStreamProcessFunction<T_OTHER,T,OUT> processFunction,
KeySelector<OUT,K> newKeySelector)
Apply a two input operation to this and other
KeyedPartitionStream . |
<T_OTHER,OUT> |
connectAndProcess(NonKeyedPartitionStream<T_OTHER> other,
TwoInputBroadcastStreamProcessFunction<T_OTHER,T,OUT> processFunction)
Apply a two input operation to this and other
NonKeyedPartitionStream . |
<K,T_OTHER,OUT> NonKeyedPartitionStream.ProcessConfigurableAndNonKeyedPartitionStream<OUT> connectAndProcess(KeyedPartitionStream<K,T_OTHER> other, TwoInputBroadcastStreamProcessFunction<T_OTHER,T,OUT> processFunction)
KeyedPartitionStream
.
Generally, concatenating BroadcastStream
and KeyedPartitionStream
will
result in a NonKeyedPartitionStream
, and you can manually generate a KeyedPartitionStream
via keyBy partitioning. In some cases, you can guarantee that the
partition on which the data is processed will not change, then you can use connectAndProcess(KeyedPartitionStream, TwoInputBroadcastStreamProcessFunction,
KeySelector)
to avoid shuffling.
other
- KeyedPartitionStream
to perform operation with two input.processFunction
- to perform operation.<T_OTHER,OUT> NonKeyedPartitionStream.ProcessConfigurableAndNonKeyedPartitionStream<OUT> connectAndProcess(NonKeyedPartitionStream<T_OTHER> other, TwoInputBroadcastStreamProcessFunction<T_OTHER,T,OUT> processFunction)
NonKeyedPartitionStream
.other
- NonKeyedPartitionStream
to perform operation with two input.processFunction
- to perform operation.<K,T_OTHER,OUT> KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,OUT> connectAndProcess(KeyedPartitionStream<K,T_OTHER> other, TwoInputBroadcastStreamProcessFunction<T_OTHER,T,OUT> processFunction, KeySelector<OUT,K> newKeySelector)
KeyedPartitionStream
.
This method is used to avoid shuffle after applying the process function. It is required
that for the record from non-broadcast input, the new KeySelector
must extract the
same key as the original KeySelector
s on the KeyedPartitionStream
. Otherwise,
the partition of data will be messy. As for the record from broadcast input, the output key
from keyed partition itself instead of the new key selector, so the data it outputs will not
affect the partition.
other
- KeyedPartitionStream
to perform operation with two input.processFunction
- to perform operation.newKeySelector
- to select the key after process.KeyedPartitionStream
with this operation.Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.