ForwardForConsecutiveHashPartitioner (Flink : 1.20-SNAPSHOT API)

java.lang.Object
- org.apache.flink.streaming.runtime.partitioner.StreamPartitioner<T>
  - org.apache.flink.streaming.runtime.partitioner.ForwardPartitioner<T>
    - org.apache.flink.streaming.runtime.partitioner.ForwardForConsecutiveHashPartitioner<T>

Type Parameters:

T - Type of the elements in the Stream

All Implemented Interfaces:

Serializable, ChannelSelector<SerializationDelegate<StreamRecord<T>>>
```
@Internal
public class ForwardForConsecutiveHashPartitioner<T>
extends ForwardPartitioner<T>
```
If a plan contains multiple consecutive, identical hash shuffles, the SQL planner changes all of them except the first one to use a forward partitioner, so that the operators can be chained and the unnecessary shuffles are avoided.
```
 A --[hash]--> B --[hash]--> C
            |
            V
 A --[hash]--> B --[forward]--> C
```
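The rewrite above can be illustrated with a minimal sketch. It is a toy model, not Flink code: `ConsecutiveHashSketch`, `hashChannel`, and `forwardChannel` are hypothetical names, string keys stand in for records, and B and C are assumed to run with the same parallelism. Under those assumptions, a second identical hash shuffle picks the subtask index the record is already on, which is exactly what a forward edge does.

```java
import java.util.List;

/** Toy model, not Flink code: shows why a second identical hash shuffle is redundant. */
final class ConsecutiveHashSketch {

    /** Hypothetical stand-in for a hash partitioner: maps a key to one of `parallelism` channels. */
    static int hashChannel(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    /** A forward edge keeps the record on the subtask index it is already on. */
    static int forwardChannel(int upstreamSubtask) {
        return upstreamSubtask;
    }

    public static void main(String[] args) {
        int parallelism = 4; // assumption: B and C share this parallelism
        for (String key : List.of("a", "b", "user-42", "order-7")) {
            int subtaskOfB = hashChannel(key, parallelism);    // A --[hash]--> B
            int viaSecondHash = hashChannel(key, parallelism); // B --[hash]--> C
            int viaForward = forwardChannel(subtaskOfB);       // B --[forward]--> C
            System.out.println(key + ": hash=" + viaSecondHash + ", forward=" + viaForward);
        }
    }
}
```

For every key, both routes select the same downstream subtask, so replacing the second hash edge with a forward edge does not change where records end up.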
However, consecutive hash operators are sometimes not chained (for example, when an operator has multiple inputs), and such forward partitioners then become forward job edges. These forward edges still carry the consecutive-hash assumption, so they cannot be changed into rescale or rebalance edges; doing so can lead to incorrect results. This in turn prevents the adaptive batch scheduler from determining the parallelism of other job vertices downstream of forward edges (see FLINK-25046).
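One way such an edge can go wrong is sketched below. This is a simplified model rather than Flink's actual rescale implementation: `RescaleBreaksHashSketch` and `downstreamSubtasksForKey` are hypothetical names, and the downstream parallelism is assumed to be a multiple of the upstream parallelism. Each upstream subtask round-robins over its own slice of downstream subtasks, so records that share a key can be split across several downstream subtasks.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model, not Flink code: a rescale-style edge can scatter records that share a key. */
final class RescaleBreaksHashSketch {

    /**
     * Returns the downstream subtasks that receive records of one key.
     * Assumption: downstreamParallelism is a multiple of upstreamParallelism.
     */
    static Set<Integer> downstreamSubtasksForKey(
            String key, int upstreamParallelism, int downstreamParallelism, int records) {
        // The upstream data is already hash-partitioned, so one key lives on one upstream subtask.
        int upstream = Math.floorMod(key.hashCode(), upstreamParallelism);
        int fanOut = downstreamParallelism / upstreamParallelism;
        Set<Integer> targets = new HashSet<>();
        for (int n = 0; n < records; n++) {
            // Round-robin over the slice of downstream subtasks owned by this upstream subtask.
            targets.add(upstream * fanOut + (n % fanOut));
        }
        return targets;
    }

    public static void main(String[] args) {
        // Equal parallelism: the key stays on a single downstream subtask.
        System.out.println(downstreamSubtasksForKey("a", 2, 2, 10));
        // Downstream parallelism doubled: the same key is spread over two subtasks.
        System.out.println(downstreamSubtasksForKey("a", 2, 4, 10));
    }
}
```

A downstream operator that relies on all records of a key arriving at one subtask would compute wrong results in the second case, which is why these edges must stay forward or fall back to hash.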
To address this, ForwardForConsecutiveHashPartitioner was introduced. When the SQL planner optimizes multiple consecutive, identical hash shuffles, it should use this partitioner; the runtime framework then changes it to a forward or hash partitioner after the operator chains have been created.
```
 A --[hash]--> B --[hash]--> C
            |
            V
 A --[hash]--> B --[ForwardForConsecutiveHash]--> C
```
After operator chain creation, this partitioner is converted to one of the following partitioners:
1. ForwardPartitioner, if this partitioner is intra-chain.
2. The wrapped hashPartitioner, if this partitioner is inter-chain.
This partitioner should only be used for SQL batch jobs that run with the AdaptiveBatchScheduler.
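The conversion rule above can be sketched as follows. This is a self-contained toy model and does not use Flink classes: `PartitionerConversionSketch`, `Partitioner`, `ForwardForConsecutiveHash`, and `resolve` are hypothetical stand-ins, and the real conversion happens inside the runtime framework.

```java
/** Toy model, not Flink code: resolves the placeholder partitioner once chaining is known. */
final class PartitionerConversionSketch {

    /** Hypothetical stand-in for StreamPartitioner. */
    static class Partitioner {
        final String name;

        Partitioner(String name) {
            this.name = name;
        }
    }

    /** Hypothetical stand-in for ForwardForConsecutiveHashPartitioner: wraps the hash partitioner. */
    static final class ForwardForConsecutiveHash extends Partitioner {
        private final Partitioner hashPartitioner;

        ForwardForConsecutiveHash(Partitioner hashPartitioner) {
            super("ForwardForConsecutiveHash");
            this.hashPartitioner = hashPartitioner;
        }

        Partitioner getHashPartitioner() {
            return hashPartitioner;
        }
    }

    /** Intra-chain edges become forward; inter-chain edges fall back to the wrapped hash partitioner. */
    static Partitioner resolve(Partitioner partitioner, boolean intraChain) {
        if (!(partitioner instanceof ForwardForConsecutiveHash)) {
            return partitioner; // other partitioners are left untouched in this sketch
        }
        return intraChain
                ? new Partitioner("Forward")
                : ((ForwardForConsecutiveHash) partitioner).getHashPartitioner();
    }

    public static void main(String[] args) {
        Partitioner placeholder = new ForwardForConsecutiveHash(new Partitioner("Hash"));
        System.out.println(resolve(placeholder, true).name);
        System.out.println(resolve(placeholder, false).name);
    }
}
```

Keeping the original hash partitioner inside the placeholder is what makes the inter-chain fallback possible without consulting the planner again.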
See Also:

Serialized Form

Field Summary
- Fields inherited from class org.apache.flink.streaming.runtime.partitioner.StreamPartitioner
  numberOfChannels

Constructor Summary

Constructors
Constructor and Description

ForwardForConsecutiveHashPartitioner(StreamPartitioner<T> hashPartitioner)
Create a new ForwardForConsecutiveHashPartitioner.


Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`StreamPartitioner<T>`	`copy()`
`SubtaskStateMapper`	`getDownstreamSubtaskStateMapper()` Defines the behavior of this partitioner when the downstream is rescaled during recovery of in-flight data.
`StreamPartitioner<T>`	`getHashPartitioner()`
`boolean`	`isPointwise()`
`int`	`selectChannel(SerializationDelegate<StreamRecord<T>> record)` Returns the logical channel index to which the given record should be written.

Methods inherited from class org.apache.flink.streaming.runtime.partitioner.ForwardPartitioner
getUpstreamSubtaskStateMapper, toString

Methods inherited from class org.apache.flink.streaming.runtime.partitioner.StreamPartitioner
equals, hashCode, isBroadcast, setup

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - ForwardForConsecutiveHashPartitioner
```
public ForwardForConsecutiveHashPartitioner(StreamPartitioner<T> hashPartitioner)
```
    Create a new ForwardForConsecutiveHashPartitioner.
    
    Parameters:
    
    hashPartitioner - the HashPartitioner
- Method Detail
  - copy
```
public StreamPartitioner<T> copy()
```
    Overrides:
    
    copy in class ForwardPartitioner<T>
  - getDownstreamSubtaskStateMapper
```
public SubtaskStateMapper getDownstreamSubtaskStateMapper()
```
    Description copied from class: StreamPartitioner
    
    Defines the behavior of this partitioner when the downstream is rescaled during recovery of in-flight data.
    
    Overrides:
    
    getDownstreamSubtaskStateMapper in class ForwardPartitioner<T>
  - isPointwise
```
public boolean isPointwise()
```
    Overrides:
    
    isPointwise in class ForwardPartitioner<T>
  - selectChannel
```
public int selectChannel(SerializationDelegate<StreamRecord<T>> record)
```
    Description copied from interface: ChannelSelector
    
    Returns the logical channel index to which the given record should be written. Calling this method on a broadcast channel selector is illegal, so such selectors may leave it unimplemented (for example, by throwing UnsupportedOperationException).
    
    Specified by:
    
    selectChannel in interface ChannelSelector<SerializationDelegate<StreamRecord<T>>>
    
    Overrides:
    
    selectChannel in class ForwardPartitioner<T>
    
    Parameters:
    
    record - the record to determine the output channels for.
    
    Returns:
    
    an integer number which indicates the index of the output channel through which the record shall be forwarded.
  - getHashPartitioner
```
public StreamPartitioner<T> getHashPartitioner()
```
