Class WatermarkStatus


  • @Internal
    public final class WatermarkStatus
    extends StreamElement
    A Watermark Status element informs stream tasks whether or not they should continue to expect watermarks from the input stream that sent it. There are two kinds of status, IDLE and ACTIVE. Watermark Status elements are generated at the sources and may be propagated through the tasks of the topology. They directly reflect the current status of the emitting task: a SourceStreamTask or StreamTask emits an IDLE if it will temporarily stop emitting watermarks (i.e. is idle), and emits an ACTIVE once it resumes doing so (i.e. is active). Tasks are responsible for propagating their status further downstream whenever they toggle between idle and active. The conditions under which source tasks and downstream tasks are considered idle or active are explained below:
    • Source tasks: A source task is considered to be idle if its head operator, i.e. a StreamSource, will not emit watermarks for an indefinite amount of time. This is the case, for example, for Flink's Kafka Consumer, where sources might initially have no assigned partitions to read from, or no records can be read from the assigned partitions. Once the head StreamSource operator detects that it will resume emitting data, the source task is considered to be active. StreamSources are responsible for toggling the status of the containing source task and ensuring that no watermarks will be emitted while the task is idle. This guarantee should be enforced on sources through SourceFunction.SourceContext implementations.
    • Downstream tasks: A downstream task is considered to be idle if all of its input streams are idle, i.e. the last Watermark Status element received from every input stream is an IDLE. As long as at least one of its input streams is active, i.e. the last Watermark Status element received from that input stream is an ACTIVE, the task is active.
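
    The downstream rule above can be sketched in a few lines. This is only an illustration: InputStatus and DownstreamStatusSketch are hypothetical names introduced here, not Flink classes, and treating a task with no inputs as idle is an assumption of the sketch.

```java
import java.util.List;

// Hypothetical stand-in for the two status kinds; not part of Flink.
enum InputStatus { IDLE, ACTIVE }

final class DownstreamStatusSketch {

    // Derives the task status from the last status received on each input:
    // the task is idle only if every input is idle, and active otherwise.
    static InputStatus aggregate(List<InputStatus> lastStatusPerInput) {
        for (InputStatus status : lastStatusPerInput) {
            if (status == InputStatus.ACTIVE) {
                return InputStatus.ACTIVE; // one active input is enough
            }
        }
        // Assumption of this sketch: an empty input list is reported as idle.
        return InputStatus.IDLE;
    }
}
```

    A task would re-evaluate this whenever a Watermark Status element arrives and forward a new status downstream only when the aggregated result changes.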

    Watermark Status elements received at downstream tasks also affect and control how their operators process and advance their watermarks. The effects are described below (the logic is implemented as a StatusWatermarkValve, which downstream tasks should use for this purpose):

    • Since watermark generators may produce watermarks anywhere in the middle of a topology, regardless of whether there is input data at the operator, the current status of the task must be checked before forwarding watermarks emitted from an operator. If the task is idle, the watermark must be blocked.
    • For downstream tasks with multiple input streams, the watermarks of input streams that are temporarily idle, or that have become active again but whose watermark is still behind the overall min watermark of the operator, should not be taken into account when deciding whether to advance the watermark and propagate it through the operator chain.
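
    The multi-input rule can be illustrated with a simplified valve. The sketch below is not the StatusWatermarkValve implementation: ValveSketch and all of its members are hypothetical, and it covers only the "ignore idle or lagging inputs when computing the minimum" behavior described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified valve; names and structure are assumptions.
final class ValveSketch {
    private final long[] watermarks; // last watermark seen per input
    private final boolean[] idle;    // last status seen per input
    private final boolean[] aligned; // whether the input counts toward the min
    private long lastOutput = Long.MIN_VALUE;
    private final List<Long> emitted = new ArrayList<>();

    ValveSketch(int numInputs) {
        watermarks = new long[numInputs];
        idle = new boolean[numInputs];
        aligned = new boolean[numInputs];
        Arrays.fill(watermarks, Long.MIN_VALUE);
        Arrays.fill(aligned, true);
    }

    void onWatermark(int input, long timestamp) {
        if (idle[input] || timestamp <= watermarks[input]) {
            return; // ignore watermarks from idle inputs and non-advancing ones
        }
        watermarks[input] = timestamp;
        // A resumed input counts again once it has caught up with the output.
        if (!aligned[input] && timestamp >= lastOutput) {
            aligned[input] = true;
        }
        advance();
    }

    void onStatus(int input, boolean nowIdle) {
        if (nowIdle && !idle[input]) {
            idle[input] = true;
            aligned[input] = false;
            advance(); // the remaining inputs may allow the min to move forward
        } else if (!nowIdle && idle[input]) {
            idle[input] = false;
            aligned[input] = watermarks[input] >= lastOutput;
        }
    }

    // Emits the min over aligned inputs if it is ahead of the last output.
    private void advance() {
        long min = Long.MAX_VALUE;
        boolean anyAligned = false;
        for (int i = 0; i < watermarks.length; i++) {
            if (aligned[i]) {
                anyAligned = true;
                min = Math.min(min, watermarks[i]);
            }
        }
        if (anyAligned && min > lastOutput) {
            lastOutput = min;
            emitted.add(min);
        }
    }

    List<Long> emitted() {
        return emitted;
    }
}
```

    In this sketch an input that becomes active again stays excluded until its own watermark reaches the last emitted value, so a lagging input cannot hold back the operator's watermark.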

    Note that to notify downstream tasks that a source task is permanently closed and will no longer send any more elements, the source should still send a Watermark.MAX_WATERMARK instead of IDLE. Watermark Status elements only serve as markers for temporary status.

    • Constructor Detail

      • WatermarkStatus

        public WatermarkStatus​(int status)
    • Method Detail

      • isIdle

        public boolean isIdle()
      • isActive

        public boolean isActive()
      • getStatus

        public int getStatus()
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object
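
    The members listed above suggest an int-backed status value. The following stand-in shows how such a class might behave. StatusSketch is hypothetical, and the constant values and the constructor validation are assumptions made for illustration rather than documented behavior of WatermarkStatus.

```java
// Hypothetical stand-in for an int-backed status element; not Flink's class.
final class StatusSketch {
    // Assumed values, chosen only for this example.
    static final int IDLE_STATUS = -1;
    static final int ACTIVE_STATUS = 0;

    private final int status;

    StatusSketch(int status) {
        // Assumption: any value other than the two known constants is rejected.
        if (status != IDLE_STATUS && status != ACTIVE_STATUS) {
            throw new IllegalArgumentException("Invalid status value: " + status);
        }
        this.status = status;
    }

    boolean isIdle() {
        return status == IDLE_STATUS;
    }

    boolean isActive() {
        return !isIdle();
    }

    int getStatus() {
        return status;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof StatusSketch && ((StatusSketch) other).status == status;
    }

    @Override
    public int hashCode() {
        return status; // consistent with equals: equal statuses hash alike
    }
}
```

    With this shape, exactly one of isIdle() and isActive() is true for any valid instance, and two elements carrying the same status compare equal.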