public class TemporalRowTimeJoinOperator extends BaseTwoInputStreamOperatorWithStateRetention
Cleaning up the state drops all of the "old" values from the probe side, where "old" is defined as older than the current watermark. The build side is cleaned up in a similar fashion; however, we always keep at least one record - the latest one - even if it's past the last watermark.
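The cleanup rule can be illustrated with a minimal sketch in plain Java collections. This is not the operator's actual code: the class `TemporalStateCleanupSketch`, its method names, and the use of a `TreeMap` keyed by event time are all hypothetical stand-ins for the real keyed state. Two assumptions are made about boundaries: a probe record whose event time equals the watermark is treated as old, and "keep the latest record" on the build side is read as keeping the newest record at or before the watermark together with everything newer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical illustration of the cleanup rule; not the operator's real state layout. */
final class TemporalStateCleanupSketch {

    /** Probe side: keep only event times strictly newer than the watermark (assumed boundary). */
    static List<Long> cleanProbeSide(List<Long> probeEventTimes, long watermark) {
        List<Long> kept = new ArrayList<>();
        for (long eventTime : probeEventTimes) {
            if (eventTime > watermark) {
                kept.add(eventTime);
            }
        }
        return kept;
    }

    /**
     * Build side: keep the newest record at or before the watermark plus all newer records,
     * so a non-empty build side never becomes empty.
     */
    static TreeMap<Long, String> cleanBuildSide(TreeMap<Long, String> build, long watermark) {
        Long latestAtOrBefore = build.floorKey(watermark);
        if (latestAtOrBefore == null) {
            // Nothing is old yet (or the map is empty): keep everything.
            return new TreeMap<>(build);
        }
        // Drop every version older than the one that is still valid at the watermark.
        return new TreeMap<>(build.tailMap(latestAtOrBefore, true));
    }

    public static void main(String[] args) {
        TreeMap<Long, String> build = new TreeMap<>();
        build.put(3L, "v1");
        build.put(7L, "v2");
        build.put(12L, "v3");
        System.out.println(cleanProbeSide(Arrays.asList(1L, 5L, 11L), 10L));
        System.out.println(cleanBuildSide(build, 10L).keySet());
    }
}
```

Retaining the newest build-side record, even when it is older than the watermark, means a later probe record still has a version to join against.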
One more trick is how emitting results and cleaning up are triggered. This is achieved by registering timers for the keys. We could register a timer for every probe and build side element's event time (when the watermark exceeds this timer, that's when we emit and/or clean up the state). However, this would cause a huge number of registered timers. For example, with the following event times of accumulated probe records: {1, 2, 5, 8, 9}, if we received Watermark(10), it would trigger 5 separate timers for the same key. To avoid that, we always keep only one registered timer for any given key, registered for the minimal value. Upon triggering it, we process all records with event times older than or equal to currentWatermark.
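The single-timer idea can be sketched for one key without any Flink dependencies. Everything here is a hypothetical simplification: the class `SingleTimerSketch`, its fields, and the `advanceWatermark` method merely simulate a timer service, and the real operator works against Flink's keyed state and timer service instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical single-key simulation of "one timer, registered for the minimal event time". */
final class SingleTimerSketch {
    private final PriorityQueue<Long> bufferedEventTimes = new PriorityQueue<>();
    private Long registeredTimer = null; // at most one timer per key
    int timerFirings = 0;                // counts how often the timer fired

    /** Buffers a record and keeps the single timer pointed at the smallest event time. */
    void addRecord(long eventTime) {
        bufferedEventTimes.add(eventTime);
        if (registeredTimer == null || eventTime < registeredTimer) {
            registeredTimer = eventTime;
        }
    }

    /** Simulates a watermark: fires the timer at most once and drains all due records. */
    List<Long> advanceWatermark(long watermark) {
        List<Long> processed = new ArrayList<>();
        if (registeredTimer != null && registeredTimer <= watermark) {
            timerFirings++;
            while (!bufferedEventTimes.isEmpty() && bufferedEventTimes.peek() <= watermark) {
                processed.add(bufferedEventTimes.poll());
            }
            // Re-register for the new minimum, or clear the timer if nothing is buffered.
            registeredTimer = bufferedEventTimes.peek();
        }
        return processed;
    }

    Long registeredTimer() {
        return registeredTimer;
    }

    public static void main(String[] args) {
        SingleTimerSketch key = new SingleTimerSketch();
        for (long t : new long[] {1, 2, 5, 8, 9}) {
            key.addRecord(t);
        }
        System.out.println(key.advanceWatermark(10) + " processed in " + key.timerFirings + " firing(s)");
    }
}
```

With the event times from the example above, a single firing handles all five records. The cost is that the timer has to be re-registered for the next minimum after each firing.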
AbstractStreamOperator.CountingOutput<OUT>
stateCleaningEnabled
chainingStrategy, config, latencyStats, LOG, metrics, output, timeServiceManager
Constructor and Description |
---|
TemporalRowTimeJoinOperator(BaseRowTypeInfo leftType,
BaseRowTypeInfo rightType,
GeneratedJoinCondition generatedJoinCondition,
int leftTimeAttribute,
int rightTimeAttribute,
long minRetentionTime,
long maxRetentionTime) |
Modifier and Type | Method and Description |
---|---|
void |
cleanupState(long time)
The method to be called when a cleanup timer fires.
|
void |
close()
This method is called after all records have been added to the operators via the methods
OneInputStreamOperator.processElement(StreamRecord), or
TwoInputStreamOperator.processElement1(StreamRecord) and
TwoInputStreamOperator.processElement2(StreamRecord). |
void |
onEventTime(InternalTimer<Object,VoidNamespace> timer)
Invoked when an event-time timer fires.
|
void |
open()
This method is called immediately before any elements are processed; it should contain the
operator's initialization logic, e.g.
|
void |
processElement1(StreamRecord<BaseRow> element)
Processes one element that arrived on the first input of this two-input operator.
|
void |
processElement2(StreamRecord<BaseRow> element)
Processes one element that arrived on the second input of this two-input operator.
|
cleanupLastTimer, onProcessingTime, registerProcessingCleanupTimer
dispose, getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getUserCodeClassloader, initializeState, initializeState, notifyCheckpointComplete, numEventTimeTimers, numProcessingTimeTimers, prepareSnapshotPreBarrier, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark, processWatermark1, processWatermark2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setup, snapshotState, snapshotState
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2
dispose, getChainingStrategy, getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setChainingStrategy, setKeyContextElement1, setKeyContextElement2, snapshotState
notifyCheckpointComplete
getCurrentKey, setCurrentKey
public TemporalRowTimeJoinOperator(BaseRowTypeInfo leftType, BaseRowTypeInfo rightType, GeneratedJoinCondition generatedJoinCondition, int leftTimeAttribute, int rightTimeAttribute, long minRetentionTime, long maxRetentionTime)
public void open() throws Exception
AbstractStreamOperator
The default implementation does nothing.
open
in interface StreamOperator<BaseRow>
open
in class BaseTwoInputStreamOperatorWithStateRetention
Exception - An exception in this method causes the operator to fail.
public void processElement1(StreamRecord<BaseRow> element) throws Exception
TwoInputStreamOperator
Exception
public void processElement2(StreamRecord<BaseRow> element) throws Exception
TwoInputStreamOperator
Exception
public void onEventTime(InternalTimer<Object,VoidNamespace> timer) throws Exception
Triggerable
Exception
public void close() throws Exception
AbstractStreamOperator
This method is called after all records have been added to the operators via the methods OneInputStreamOperator.processElement(StreamRecord), or TwoInputStreamOperator.processElement1(StreamRecord) and TwoInputStreamOperator.processElement2(StreamRecord).
The method is expected to flush all remaining buffered data. Exceptions during this flushing of buffered data should be propagated, in order to cause the operation to be recognized as failed, because the last data items are not processed properly.
close
in interface StreamOperator<BaseRow>
close
in class AbstractStreamOperator<BaseRow>
Exception - An exception in this method causes the operator to fail.
public void cleanupState(long time)
cleanupState
in class BaseTwoInputStreamOperatorWithStateRetention
time - The timestamp of the fired timer.
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.