@Public public interface SourceOutput<T> extends WatermarkOutput
SourceOutput
is the gateway for a SourceReader
) to emit the produced records
and watermarks.
A SourceReader
may have multiple SourceOutputs, scoped to individual Source
Splits. That way, streams of events from different splits can be identified and treated
separately, for example for watermark generation, or event-time skew handling.
Modifier and Type | Method and Description |
---|---|
void |
collect(T record)
Emit a record without a timestamp.
|
void |
collect(T record,
long timestamp)
Emit a record with a timestamp.
|
emitWatermark, markActive, markIdle
void collect(T record)
Use this method if the source system does not have a notion of records with timestamps.
The events later pass through a TimestampAssigner
, which attaches a timestamp to
the event based on the event's contents. For example a file source with JSON records would
not have a generic timestamp from the file reading and JSON parsing process, and thus use
this method to produce initially a record without a timestamp. The TimestampAssigner
in the next step would be used to extract timestamp from a field of the JSON object.
record
- the record to emit.void collect(T record, long timestamp)
Use this method if the source system has timestamps attached to records. Typical examples would be Logs, PubSubs, or Message Queues, like Kafka or Kinesis, which store a timestamp with each event.
The events typically still pass through a TimestampAssigner
, which may decide to
either use this source-provided timestamp, or replace it with a timestamp stored within the
event (for example if the event was a JSON object one could configure aTimestampAssigner that
extracts one of the object's fields and uses that as a timestamp).
record
- the record to emit.timestamp
- the timestamp of the record.Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.