Class SourceOutputWithWatermarks<T>
- java.lang.Object
-
- org.apache.flink.streaming.api.operators.source.SourceOutputWithWatermarks<T>
-
- Type Parameters:
T
- The type of emitted records.
- All Implemented Interfaces:
WatermarkOutput
,SourceOutput<T>
@Internal public class SourceOutputWithWatermarks<T> extends Object implements SourceOutput<T>
Implementation of the SourceOutput. The records emitted to this output are pushed into a givenPushingAsyncDataInput.DataOutput
. The watermarks are pushed into the same output, or into a separateWatermarkOutput
, if one is provided.Periodic Watermarks
This output does not implement automatic periodic watermark emission. The method
emitPeriodicWatermark()
needs to be called periodically.Note on Performance Considerations
The methods
SourceOutput.collect(Object)
andSourceOutput.collect(Object, long)
are highly performance-critical (part of the hot loop). To make the code as JIT friendly as possible, we want to have only a single implementation of these two methods, across all classes. That way, the JIT compiler can de-virtualize (and inline) them better.Currently, we have one implementation of these methods for the case where we don't need watermarks (see class
NoOpTimestampsAndWatermarks
) and one for the case where we do (this class). When the JVM is dedicated to a single job (or type of job) only one of these classes will be loaded. In mixed job setups, we still have a bimorphic method (rather than a poly/-/mega-morphic method).
-
-
Constructor Summary
Constructors Modifier Constructor Description protected
SourceOutputWithWatermarks(PushingAsyncDataInput.DataOutput<T> recordsOutput, WatermarkOutput onEventWatermarkOutput, WatermarkOutput periodicWatermarkOutput, TimestampAssigner<T> timestampAssigner, WatermarkGenerator<T> watermarkGenerator)
Creates a new SourceOutputWithWatermarks that emits records to the given DataOutput and watermarks to the (possibly different) WatermarkOutput.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description void
collect(T record)
Emit a record without a timestamp.void
collect(T record, long timestamp)
Emit a record with a timestamp.static <E> SourceOutputWithWatermarks<E>
createWithSeparateOutputs(PushingAsyncDataInput.DataOutput<E> recordsOutput, WatermarkOutput onEventWatermarkOutput, WatermarkOutput periodicWatermarkOutput, TimestampAssigner<E> timestampAssigner, WatermarkGenerator<E> watermarkGenerator)
Creates a new SourceOutputWithWatermarks that emits records to the given DataOutput and watermarks to the different WatermarkOutputs.void
emitPeriodicWatermark()
void
emitWatermark(Watermark watermark)
Emits the given watermark.void
markActive()
Marks this output as active, meaning that downstream operations should wait for watermarks from this output.void
markIdle()
Marks this output as idle, meaning that downstream operations do not wait for watermarks from this output.
-
-
-
Constructor Detail
-
SourceOutputWithWatermarks
protected SourceOutputWithWatermarks(PushingAsyncDataInput.DataOutput<T> recordsOutput, WatermarkOutput onEventWatermarkOutput, WatermarkOutput periodicWatermarkOutput, TimestampAssigner<T> timestampAssigner, WatermarkGenerator<T> watermarkGenerator)
Creates a new SourceOutputWithWatermarks that emits records to the given DataOutput and watermarks to the (possibly different) WatermarkOutput.
-
-
Method Detail
-
collect
public final void collect(T record)
Description copied from interface:SourceOutput
Emit a record without a timestamp.Use this method if the source system does not have a notion of records with timestamps.
The events later pass through a
TimestampAssigner
, which attaches a timestamp to the event based on the event's contents. For example a file source with JSON records would not have a generic timestamp from the file reading and JSON parsing process, and thus use this method to produce initially a record without a timestamp. TheTimestampAssigner
in the next step would be used to extract timestamp from a field of the JSON object.- Specified by:
collect
in interfaceSourceOutput<T>
- Parameters:
record
- the record to emit.
-
collect
public final void collect(T record, long timestamp)
Description copied from interface:SourceOutput
Emit a record with a timestamp.Use this method if the source system has timestamps attached to records. Typical examples would be Logs, PubSubs, or Message Queues, like Kafka or Kinesis, which store a timestamp with each event.
The events typically still pass through a
TimestampAssigner
, which may decide to either use this source-provided timestamp, or replace it with a timestamp stored within the event (for example if the event was a JSON object one could configure aTimestampAssigner that extracts one of the object's fields and uses that as a timestamp).- Specified by:
collect
in interfaceSourceOutput<T>
- Parameters:
record
- the record to emit.timestamp
- the timestamp of the record.
-
emitWatermark
public final void emitWatermark(Watermark watermark)
Description copied from interface:WatermarkOutput
Emits the given watermark.Emitting a watermark also implicitly marks the stream as active, ending previously marked idleness.
- Specified by:
emitWatermark
in interfaceWatermarkOutput
-
markIdle
public final void markIdle()
Description copied from interface:WatermarkOutput
Marks this output as idle, meaning that downstream operations do not wait for watermarks from this output.An output becomes active again as soon as the next watermark is emitted or
WatermarkOutput.markActive()
is explicitly called.- Specified by:
markIdle
in interfaceWatermarkOutput
-
markActive
public void markActive()
Description copied from interface:WatermarkOutput
Marks this output as active, meaning that downstream operations should wait for watermarks from this output.- Specified by:
markActive
in interfaceWatermarkOutput
-
emitPeriodicWatermark
public final void emitPeriodicWatermark()
-
createWithSeparateOutputs
public static <E> SourceOutputWithWatermarks<E> createWithSeparateOutputs(PushingAsyncDataInput.DataOutput<E> recordsOutput, WatermarkOutput onEventWatermarkOutput, WatermarkOutput periodicWatermarkOutput, TimestampAssigner<E> timestampAssigner, WatermarkGenerator<E> watermarkGenerator)
Creates a new SourceOutputWithWatermarks that emits records to the given DataOutput and watermarks to the different WatermarkOutputs.
-
-