@Public public interface WatermarkStrategy<T> extends TimestampAssignerSupplier<T>, WatermarkGeneratorSupplier<T>
Watermark
s in the stream sources. The
WatermarkStrategy is a builder/factory for the WatermarkGenerator
that generates the
watermarks and the TimestampAssigner
which assigns the internal timestamp of a record.
This interface is split into three parts: 1) methods that an implementor of this interface
needs to implement, 2) builder methods for building a WatermarkStrategy
on a base
strategy, 3) convenience methods for constructing a WatermarkStrategy
for common built-in
strategies or based on a WatermarkGeneratorSupplier
Implementors of this interface need only implement createWatermarkGenerator(WatermarkGeneratorSupplier.Context)
. Optionally, you can implement
createTimestampAssigner(TimestampAssignerSupplier.Context)
.
The builder methods, like withIdleness(Duration)
or createTimestampAssigner(TimestampAssignerSupplier.Context)
create a new WatermarkStrategy
that wraps and enriches a base strategy. The strategy on which the method is
called is the base strategy.
The convenience methods, for example forBoundedOutOfOrderness(Duration)
, create a
WatermarkStrategy
for common built in strategies.
This interface is Serializable
because watermark strategies may be shipped to workers
during distributed execution.
TimestampAssignerSupplier.Context, TimestampAssignerSupplier.SupplierFromSerializableTimestampAssigner<T>
WatermarkGeneratorSupplier.Context
Modifier and Type | Method and Description |
---|---|
default TimestampAssigner<T> |
createTimestampAssigner(TimestampAssignerSupplier.Context context)
Instantiates a
TimestampAssigner for assigning timestamps according to this strategy. |
WatermarkGenerator<T> |
createWatermarkGenerator(WatermarkGeneratorSupplier.Context context)
Instantiates a WatermarkGenerator that generates watermarks according to this strategy.
|
static <T> WatermarkStrategy<T> |
forBoundedOutOfOrderness(Duration maxOutOfOrderness)
Creates a watermark strategy for situations where records are out of order, but you can place
an upper bound on how far the events are out of order.
|
static <T> WatermarkStrategy<T> |
forGenerator(WatermarkGeneratorSupplier<T> generatorSupplier)
Creates a watermark strategy based on an existing
WatermarkGeneratorSupplier . |
static <T> WatermarkStrategy<T> |
forMonotonousTimestamps()
Creates a watermark strategy for situations with monotonously ascending timestamps.
|
default WatermarkAlignmentParams |
getAlignmentParameters()
Provides configuration for watermark alignment of a maximum watermark of multiple
sources/tasks/partitions in the same watermark group.
|
static <T> WatermarkStrategy<T> |
noWatermarks()
Creates a watermark strategy that generates no watermarks at all.
|
default WatermarkStrategy<T> |
withIdleness(Duration idleTimeout)
Creates a new enriched
WatermarkStrategy that also does idleness detection in the
created WatermarkGenerator . |
default WatermarkStrategy<T> |
withTimestampAssigner(SerializableTimestampAssigner<T> timestampAssigner)
Creates a new
WatermarkStrategy that wraps this strategy but instead uses the given
SerializableTimestampAssigner . |
default WatermarkStrategy<T> |
withTimestampAssigner(TimestampAssignerSupplier<T> timestampAssigner)
Creates a new
WatermarkStrategy that wraps this strategy but instead uses the given
TimestampAssigner (via a TimestampAssignerSupplier ). |
default WatermarkStrategy<T> |
withWatermarkAlignment(String watermarkGroup,
Duration maxAllowedWatermarkDrift)
Creates a new
WatermarkStrategy that configures the maximum watermark drift from
other sources/tasks/partitions in the same watermark group. |
default WatermarkStrategy<T> |
withWatermarkAlignment(String watermarkGroup,
Duration maxAllowedWatermarkDrift,
Duration updateInterval)
Creates a new
WatermarkStrategy that configures the maximum watermark drift from
other sources/tasks/partitions in the same watermark group. |
of
WatermarkGenerator<T> createWatermarkGenerator(WatermarkGeneratorSupplier.Context context)
createWatermarkGenerator
in interface WatermarkGeneratorSupplier<T>
default TimestampAssigner<T> createTimestampAssigner(TimestampAssignerSupplier.Context context)
TimestampAssigner
for assigning timestamps according to this strategy.createTimestampAssigner
in interface TimestampAssignerSupplier<T>
@PublicEvolving default WatermarkAlignmentParams getAlignmentParameters()
Once configured Flink will "pause" consuming from a source/task/partition that is ahead of the emitted watermark in the group by more than the maxAllowedWatermarkDrift.
default WatermarkStrategy<T> withTimestampAssigner(TimestampAssignerSupplier<T> timestampAssigner)
WatermarkStrategy
that wraps this strategy but instead uses the given
TimestampAssigner
(via a TimestampAssignerSupplier
).
You can use this when a TimestampAssigner
needs additional context, for example
access to the metrics system.
WatermarkStrategy<Object> wmStrategy = WatermarkStrategy
.forMonotonousTimestamps()
.withTimestampAssigner((ctx) -> new MetricsReportingAssigner(ctx));
default WatermarkStrategy<T> withTimestampAssigner(SerializableTimestampAssigner<T> timestampAssigner)
WatermarkStrategy
that wraps this strategy but instead uses the given
SerializableTimestampAssigner
.
You can use this in case you want to specify a TimestampAssigner
via a lambda
function.
WatermarkStrategy<CustomObject> wmStrategy = WatermarkStrategy
.<CustomObject>forMonotonousTimestamps()
.withTimestampAssigner((event, timestamp) -> event.getTimestamp());
default WatermarkStrategy<T> withIdleness(Duration idleTimeout)
WatermarkStrategy
that also does idleness detection in the
created WatermarkGenerator
.
Add an idle timeout to the watermark strategy. If no records flow in a partition of a stream for that amount of time, then that partition is considered "idle" and will not hold back the progress of watermarks in downstream operators.
Idleness can be important if some partitions have little data and might not have events during some periods. Without idleness, these streams can stall the overall event time progress of the application.
@PublicEvolving default WatermarkStrategy<T> withWatermarkAlignment(String watermarkGroup, Duration maxAllowedWatermarkDrift)
WatermarkStrategy
that configures the maximum watermark drift from
other sources/tasks/partitions in the same watermark group. The group may contain completely
independent sources (e.g. File and Kafka).
Once configured Flink will "pause" consuming from a source/task/partition that is ahead of the emitted watermark in the group by more than the maxAllowedWatermarkDrift.
watermarkGroup
- A group of sources to align watermarksmaxAllowedWatermarkDrift
- Maximal drift, before we pause consuming from the
source/task/partition@PublicEvolving default WatermarkStrategy<T> withWatermarkAlignment(String watermarkGroup, Duration maxAllowedWatermarkDrift, Duration updateInterval)
WatermarkStrategy
that configures the maximum watermark drift from
other sources/tasks/partitions in the same watermark group. The group may contain completely
independent sources (e.g. File and Kafka).
Once configured Flink will "pause" consuming from a source/task/partition that is ahead of the emitted watermark in the group by more than the maxAllowedWatermarkDrift.
watermarkGroup
- A group of sources to align watermarksmaxAllowedWatermarkDrift
- Maximal drift, before we pause consuming from the
source/task/partitionupdateInterval
- How often tasks should notify coordinator about the current watermark
and how often the coordinator should announce the maximal aligned watermark.static <T> WatermarkStrategy<T> forMonotonousTimestamps()
The watermarks are generated periodically and tightly follow the latest timestamp in the data. The delay introduced by this strategy is mainly the periodic interval in which the watermarks are generated.
AscendingTimestampsWatermarks
static <T> WatermarkStrategy<T> forBoundedOutOfOrderness(Duration maxOutOfOrderness)
T - B
will
follow any more.
The watermarks are generated periodically. The delay introduced by this watermark strategy is the periodic interval length, plus the out of orderness bound.
BoundedOutOfOrdernessWatermarks
static <T> WatermarkStrategy<T> forGenerator(WatermarkGeneratorSupplier<T> generatorSupplier)
WatermarkGeneratorSupplier
.static <T> WatermarkStrategy<T> noWatermarks()
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.