T- The type of emitted records.
@Internal public class SourceOutputWithWatermarks<T> extends Object implements SourceOutput<T>
PushingAsyncDataInput.DataOutput. The watermarks are pushed into the same output, or into a separate
WatermarkOutput, if one is provided.
This output does not implement automatic periodic watermark emission. The method
emitPeriodicWatermark() needs to be called periodically.
long) are highly performance-critical (part of the hot loop). To make the code as JIT friendly
as possible, we want to have only a single implementation of these two methods, across all
classes. That way, the JIT compiler can de-virtualize (and inline) them better.
Currently, we have one implementation of these methods for the case where we don't need
watermarks (see class
NoOpTimestampsAndWatermarks) and one for the case where we do (this
class). When the JVM is dedicated to a single job (or type of job) only one of these classes will
be loaded. In mixed job setups, we still have a bimorphic method (rather than a
|Modifier||Constructor and Description|
Creates a new SourceOutputWithWatermarks that emits records to the given DataOutput and watermarks to the (possibly different) WatermarkOutput.
|Modifier and Type||Method and Description|
Emit a record without a timestamp.
Emit a record with a timestamp.
Creates a new SourceOutputWithWatermarks that emits records to the given DataOutput and watermarks to the different WatermarkOutputs.
Emits the given watermark.
Marks this output as active, meaning that downstream operations should wait for watermarks from this output.
Marks this output as idle, meaning that downstream operations do not wait for watermarks from this output.
protected SourceOutputWithWatermarks(PushingAsyncDataInput.DataOutput<T> recordsOutput, WatermarkOutput onEventWatermarkOutput, WatermarkOutput periodicWatermarkOutput, TimestampAssigner<T> timestampAssigner, WatermarkGenerator<T> watermarkGenerator)
public final void collect(T record)
Use this method if the source system does not have a notion of records with timestamps.
The events later pass through a
TimestampAssigner, which attaches a timestamp to
the event based on the event's contents. For example a file source with JSON records would
not have a generic timestamp from the file reading and JSON parsing process, and thus use
this method to produce initially a record without a timestamp. The
in the next step would be used to extract timestamp from a field of the JSON object.
public final void collect(T record, long timestamp)
Use this method if the source system has timestamps attached to records. Typical examples would be Logs, PubSubs, or Message Queues, like Kafka or Kinesis, which store a timestamp with each event.
The events typically still pass through a
TimestampAssigner, which may decide to
either use this source-provided timestamp, or replace it with a timestamp stored within the
event (for example if the event was a JSON object one could configure aTimestampAssigner that
extracts one of the object's fields and uses that as a timestamp).
public final void emitWatermark(Watermark watermark)
Emitting a watermark also implicitly marks the stream as active, ending previously marked idleness.
public final void markIdle()
An output becomes active again as soon as the next watermark is emitted or
WatermarkOutput.markActive() is explicitly called.
public void markActive()
public final void emitPeriodicWatermark()
public static <E> SourceOutputWithWatermarks<E> createWithSeparateOutputs(PushingAsyncDataInput.DataOutput<E> recordsOutput, WatermarkOutput onEventWatermarkOutput, WatermarkOutput periodicWatermarkOutput, TimestampAssigner<E> timestampAssigner, WatermarkGenerator<E> watermarkGenerator)
Copyright © 2014–2023 The Apache Software Foundation. All rights reserved.