DataStreamSinkProvider (Flink : 1.19-SNAPSHOT API)

All Superinterfaces:

DynamicTableSink.SinkRuntimeProvider, ParallelismProvider
```
@PublicEvolving
public interface DataStreamSinkProvider
extends DynamicTableSink.SinkRuntimeProvider, ParallelismProvider
```
Provider that consumes a Java DataStream as a runtime implementation for DynamicTableSink.
Note: This provider is only meant for advanced connector developers. Usually, a sink should consist of a single entity expressed via SinkProvider, SinkFunctionProvider, or OutputFormatProvider. When using a DataStream an implementer needs to pay attention to how changes are shuffled to not mess up the changelog per parallel subtask.

Method Summary

All Methods Instance Methods Default Methods Deprecated Methods
Modifier and Type	Method and Description
`default DataStreamSink<?>`	`consumeDataStream(DataStream<RowData> dataStream)` Deprecated. Use `consumeDataStream(ProviderContext, DataStream)` and correctly set a unique identifier for each data stream transformation.
`default DataStreamSink<?>`	`consumeDataStream(ProviderContext providerContext, DataStream<RowData> dataStream)` Consumes the given Java `DataStream` and returns the sink transformation `DataStreamSink`.
`default Optional<Integer>`	`getParallelism()` Returns the parallelism for this instance.

- Method Detail
  - consumeDataStream
```
default DataStreamSink<?> consumeDataStream(ProviderContext providerContext,
                                            DataStream<RowData> dataStream)
```
    Consumes the given Java DataStream and returns the sink transformation DataStreamSink.
    Note: If the CompiledPlan feature should be supported, this method MUST set a unique identifier for each transformation/operator in the data stream. This enables stateful Flink version upgrades for streaming jobs. The identifier is used to map state back from a savepoint to an actual operator in the topology. The framework can generate topology-wide unique identifiers with ProviderContext.generateUid(String).
    
    See Also:
    
    SingleOutputStreamOperator.uid(String)
  - consumeDataStream
```
@Deprecated
default DataStreamSink<?> consumeDataStream(DataStream<RowData> dataStream)
```
    Deprecated. Use consumeDataStream(ProviderContext, DataStream) and correctly set a unique identifier for each data stream transformation.
    
    Consumes the given Java DataStream and returns the sink transformation DataStreamSink.
  - getParallelism
```
default Optional<Integer> getParallelism()
```
    Returns the parallelism for this instance.
    The parallelism denotes how many parallel instances of a source or sink will be spawned during the execution.
    Enforcing a different parallelism for sources/sinks might mess up the changelog if the output/input is not ChangelogMode.insertOnly(). Therefore, a primary key is required by which the output/input will be shuffled after/before records leave/enter the ScanTableSource.ScanRuntimeProvider/DynamicTableSink.SinkRuntimeProvider implementation.
    Note: If a custom parallelism is returned and consumeDataStream(ProviderContext, DataStream) applies multiple transformations, make sure to set the same custom parallelism to each operator to not mess up the changelog.
    
    Specified by:
    
    getParallelism in interface ParallelismProvider
    
    Returns:
    
    empty if the connector does not provide a custom parallelism, then the planner will decide the number of parallel instances by itself.

Back to Flink Website

Interface DataStreamSinkProvider

Method Summary

Method Detail

consumeDataStream

consumeDataStream

getParallelism

Back to Flink Website