Interface DataStreamSinkProvider
-
- All Superinterfaces:
DynamicTableSink.SinkRuntimeProvider
,ParallelismProvider
@PublicEvolving public interface DataStreamSinkProvider extends DynamicTableSink.SinkRuntimeProvider, ParallelismProvider
Provider that consumes a JavaDataStream
as a runtime implementation forDynamicTableSink
.Note: This provider is only meant for advanced connector developers. Usually, a sink should consist of a single entity expressed via
SinkFunctionProvider
, orOutputFormatProvider
. When using aDataStream
an implementer needs to pay attention to how changes are shuffled to not mess up the changelog per parallel subtask.
-
-
Method Summary
All Methods Instance Methods Default Methods Deprecated Methods Modifier and Type Method Description default DataStreamSink<?>
consumeDataStream(DataStream<RowData> dataStream)
Deprecated.UseconsumeDataStream(ProviderContext, DataStream)
and correctly set a unique identifier for each data stream transformation.default DataStreamSink<?>
consumeDataStream(ProviderContext providerContext, DataStream<RowData> dataStream)
Consumes the given JavaDataStream
and returns the sink transformationDataStreamSink
.default Optional<Integer>
getParallelism()
Returns the parallelism for this instance.
-
-
-
Method Detail
-
consumeDataStream
default DataStreamSink<?> consumeDataStream(ProviderContext providerContext, DataStream<RowData> dataStream)
Consumes the given JavaDataStream
and returns the sink transformationDataStreamSink
.Note: If the
CompiledPlan
feature should be supported, this method MUST set a unique identifier for each transformation/operator in the data stream. This enables stateful Flink version upgrades for streaming jobs. The identifier is used to map state back from a savepoint to an actual operator in the topology. The framework can generate topology-wide unique identifiers withProviderContext.generateUid(String)
.- See Also:
SingleOutputStreamOperator.uid(String)
-
consumeDataStream
@Deprecated default DataStreamSink<?> consumeDataStream(DataStream<RowData> dataStream)
Deprecated.UseconsumeDataStream(ProviderContext, DataStream)
and correctly set a unique identifier for each data stream transformation.Consumes the given JavaDataStream
and returns the sink transformationDataStreamSink
.
-
getParallelism
default Optional<Integer> getParallelism()
Returns the parallelism for this instance.The parallelism denotes how many parallel instances of a source or sink will be spawned during the execution.
Enforcing a different parallelism for sources/sinks might mess up the changelog if the output/input is not
ChangelogMode.insertOnly()
. Therefore, a primary key is required by which the output/input will be shuffled after/before records leave/enter theScanTableSource.ScanRuntimeProvider
/DynamicTableSink.SinkRuntimeProvider
implementation.Note: If a custom parallelism is returned and
consumeDataStream(ProviderContext, DataStream)
applies multiple transformations, make sure to set the same custom parallelism to each operator to not mess up the changelog.- Specified by:
getParallelism
in interfaceParallelismProvider
- Returns:
- empty if the connector does not provide a custom parallelism, then the planner will decide the number of parallel instances by itself.
-
-