@Internal public class CompactorOperator extends AbstractStreamOperator<CommittableMessage<FileSinkCommittable>> implements OneInputStreamOperator<CompactorRequest,CommittableMessage<FileSinkCommittable>>, BoundedOneInput, CheckpointListener
Requests received from the CompactCoordinator are first held in memory and snapshotted into the state of a checkpoint. When the checkpoint completes successfully, all requests received before it can be submitted. The results are emitted at the next invocation of prepareSnapshotPreBarrier(long) after the compaction has finished, which ensures that committers receive only one CommittableSummary and the corresponding number of Committable for a single checkpoint.
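The lifecycle described above can be sketched in plain Java. This is a simplified model under stated assumptions, not the operator's actual implementation: the class `CompactionLifecycleSketch`, its method names, and the string stand-ins for requests, summaries and committables are all hypothetical, and compaction is simulated synchronously.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical plain-Java model of the lifecycle described above; not Flink code. */
class CompactionLifecycleSketch {
    // Requests received since the last snapshot, held only in memory.
    private final List<String> collecting = new ArrayList<>();
    // Requests already written to a checkpoint, keyed by that checkpoint's ID.
    private final TreeMap<Long, List<String>> snapshotted = new TreeMap<>();
    // Results of finished compactions that have not been emitted yet.
    private final List<String> finished = new ArrayList<>();

    /** Models processElement: remember the request, do not compact yet. */
    void onRequest(String request) {
        collecting.add(request);
    }

    /** Models prepareSnapshotPreBarrier: emit one summary followed by the finished results. */
    List<String> onPreBarrier(long checkpointId) {
        List<String> out = new ArrayList<>();
        out.add("summary(checkpoint=" + checkpointId + ", count=" + finished.size() + ")");
        out.addAll(finished);
        finished.clear();
        return out;
    }

    /** Models snapshotState: returns every request that has not been submitted yet. */
    List<String> snapshot(long checkpointId) {
        snapshotted.put(checkpointId, new ArrayList<>(collecting));
        collecting.clear();
        List<String> state = new ArrayList<>();
        snapshotted.values().forEach(state::addAll);
        return state;
    }

    /** Models notifyCheckpointComplete: submit the requests covered by the completed checkpoint. */
    void onCheckpointComplete(long checkpointId) {
        NavigableMap<Long, List<String>> ready = snapshotted.headMap(checkpointId, true);
        for (List<String> batch : ready.values()) {
            for (String request : batch) {
                finished.add("compacted:" + request); // stand-in for the real compaction work
            }
        }
        ready.clear(); // the view is backed by the map, so this drops the submitted batches
    }
}
```

In this model a request is never compacted before the checkpoint that recorded it has completed, and its result only leaves the operator at the following pre-barrier call.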
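|Modifier and Type|Method and Description|
|---|---|
|void|close(): This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.|
|void|endInput(): It is notified that no more data will arrive from the input.|
|void|initializeState(StateInitializationContext context): Stream operators with state which can be restored need to override this hook method.|
|void|notifyCheckpointComplete(long checkpointId): Notifies the listener that the checkpoint with the given checkpointId completed and was committed.|
|void|open(): This method is called immediately before any elements are processed; it should contain the operator's initialization logic.|
|void|prepareSnapshotPreBarrier(long checkpointId): This method is called when the operator should do a snapshot, before it emits its own checkpoint barrier.|
|void|processElement(StreamRecord<CompactorRequest> element): Processes one element that arrived on this input of the MultipleInputStreamOperator.|
|void|snapshotState(StateSnapshotContext context): Stream operators with state, which want to participate in a snapshot need to override this hook method.|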
Methods inherited from class AbstractStreamOperator: finish, getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface StreamOperator: finish, getMetricGroup, getOperatorID, initializeState, setKeyContextElement1, setKeyContextElement2, snapshotState
public void open() throws Exception
The default implementation does nothing.
public void processElement(StreamRecord<CompactorRequest> element) throws Exception
Processes one element that arrived on this input of the MultipleInputStreamOperator. This method is guaranteed to not be called concurrently with other methods of the operator.
public void endInput() throws Exception
WARNING: It is not safe to use this method to commit any transactions or other side
effects! You can use this method to flush any buffered data that can later on be committed
e.g. in a
NOTE: Given it is semantically very similar to the
method. It might be dropped in favour of the other method at some point in time.
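The distinction the warning draws between flushing and committing can be illustrated with a small plain-Java model. It is only a sketch: `FlushOnEndInputSketch` and its members are hypothetical names, and it assumes the commit happens in a later checkpoint-complete notification, which the text here does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model: endInput() flushes buffered records but never commits them. */
class FlushOnEndInputSketch {
    // Records buffered while input is still arriving.
    private final List<String> buffer = new ArrayList<>();
    // Records that were flushed but whose side effects are not yet final.
    private final List<String> flushedNotCommitted = new ArrayList<>();
    // Records whose side effects have been made final.
    private final List<String> committed = new ArrayList<>();

    void processElement(String record) {
        buffer.add(record);
    }

    /** Safe use of endInput: hand buffered data over, without committing anything. */
    void endInput() {
        flushedNotCommitted.addAll(buffer);
        buffer.clear();
    }

    /** Assumed commit point: only a completed checkpoint makes the flushed data final. */
    void notifyCheckpointComplete(long checkpointId) {
        committed.addAll(flushedNotCommitted);
        flushedNotCommitted.clear();
    }

    List<String> flushedNotCommitted() {
        return flushedNotCommitted;
    }

    List<String> committed() {
        return committed;
    }
}
```

If the task fails between `endInput()` and the notification, nothing has been committed, so recovery can replay the input without duplicating side effects.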
public void notifyCheckpointComplete(long checkpointId) throws Exception
Notifies the listener that the checkpoint with the given checkpointId completed and was committed.
These notifications are "best effort", meaning they can sometimes be skipped. To behave
properly, implementers need to follow the "Checkpoint Subsuming Contract". Please see the
class-level JavaDocs for details.
Please note that checkpoints may generally overlap, so you cannot assume that the
notifyCheckpointComplete() call is always for the latest prior checkpoint (or snapshot) that
was taken on the function/operator implementing this interface. It might be for a checkpoint
that was triggered earlier. Implementing the "Checkpoint Subsuming Contract" (see above)
properly handles this situation as well.
Please note that throwing exceptions from this method will not cause the completed checkpoint to be revoked. Throwing exceptions will typically cause task/job failure and trigger recovery.
checkpointId - The ID of the checkpoint that has been completed.
Exception - This method can propagate exceptions, which leads to a failure/recovery for the task. Note that this will NOT lead to the checkpoint being revoked.
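One common way to cope with skipped or out-of-order notifications is to let a notification for a given checkpoint ID also cover all pending work from lower IDs. The plain-Java sketch below illustrates that idea; `SubsumingCommitTracker` and its methods are hypothetical and not part of Flink's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical tracker: a completed checkpoint subsumes all pending lower IDs. */
class SubsumingCommitTracker {
    // Work that is waiting for a checkpoint to complete, keyed by checkpoint ID.
    private final TreeMap<Long, List<String>> pendingByCheckpoint = new TreeMap<>();
    private final List<String> committed = new ArrayList<>();

    void addPending(long checkpointId, String item) {
        pendingByCheckpoint.computeIfAbsent(checkpointId, id -> new ArrayList<>()).add(item);
    }

    /** Commits everything registered for this checkpoint ID or any lower one. */
    void notifyCheckpointComplete(long checkpointId) {
        NavigableMap<Long, List<String>> subsumed = pendingByCheckpoint.headMap(checkpointId, true);
        for (List<String> items : subsumed.values()) {
            committed.addAll(items);
        }
        subsumed.clear(); // view is backed by the map, so the entries are removed from it
    }

    List<String> committed() {
        return committed;
    }

    int pendingCheckpoints() {
        return pendingByCheckpoint.size();
    }
}
```

With this structure a missing notification for checkpoint 1 is harmless once checkpoint 2 is confirmed, and a late notification for checkpoint 1 finds nothing left to do.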
public void prepareSnapshotPreBarrier(long checkpointId) throws Exception
This method is not intended for any actual state persistence, but only for emitting some data before emitting the checkpoint barrier. It is useful for operators that maintain small transient state that is inefficient to checkpoint (especially when it would need to be checkpointed in a re-scalable way) but can simply be sent downstream before the checkpoint. An example is opportunistic pre-aggregation operators, which keep small pre-aggregation state that is frequently flushed downstream.
Important: This method should not be used for any actual state snapshot logic, because it will inherently be within the synchronous part of the operator's checkpoint. If heavy work is done within this method, it will affect latency and downstream checkpoint alignments.
checkpointId - The ID of the checkpoint.
Exception - Throwing an exception here causes the operator to fail and go into recovery.
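The pre-aggregation case mentioned above can be sketched as follows. This is a plain-Java illustration rather than a real stream operator: `PreAggregatorSketch`, its `downstream` list, and the `key=count` output format are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-aggregator that flushes its transient state before the barrier. */
class PreAggregatorSketch {
    // Small transient state: partial counts per key (insertion-ordered for readable output).
    private final Map<String, Long> partialCounts = new LinkedHashMap<>();
    // Stand-in for the operator's output to downstream operators.
    private final List<String> downstream = new ArrayList<>();

    void processElement(String key) {
        partialCounts.merge(key, 1L, Long::sum);
    }

    /** Sends the partial counts downstream so that nothing is left to checkpoint. */
    void prepareSnapshotPreBarrier(long checkpointId) {
        partialCounts.forEach((key, count) -> downstream.add(key + "=" + count));
        partialCounts.clear();
    }

    boolean hasStateToCheckpoint() {
        return !partialCounts.isEmpty();
    }

    List<String> downstream() {
        return downstream;
    }
}
```

The flush is cheap because the map is small; heavier work at this point would run in the synchronous part of the checkpoint.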
public void snapshotState(StateSnapshotContext context) throws Exception
context - context that provides information and means required for taking a snapshot
public void initializeState(StateInitializationContext context) throws Exception
public void close() throws Exception
This method is expected to make a thorough effort to release all resources that the operator has acquired.
NOTE: It cannot emit any records! If you need to emit records at the end of
processing, do so in the
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.