Class CompactorOperator
- java.lang.Object
-
- org.apache.flink.streaming.api.operators.AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
-
- org.apache.flink.connector.file.sink.compactor.operator.CompactorOperator
-
- All Implemented Interfaces:
Serializable, CheckpointListener, BoundedOneInput, Input<CompactorRequest>, KeyContext, KeyContextHandler, OneInputStreamOperator<CompactorRequest,CommittableMessage<FileSinkCommittable>>, StreamOperator<CommittableMessage<FileSinkCommittable>>, StreamOperatorStateHandler.CheckpointedStreamOperator, YieldingOperator<CommittableMessage<FileSinkCommittable>>
@Internal public class CompactorOperator extends AbstractStreamOperator<CommittableMessage<FileSinkCommittable>> implements OneInputStreamOperator<CompactorRequest,CommittableMessage<FileSinkCommittable>>, BoundedOneInput, CheckpointListener
An operator that performs compaction for the FileSink.
Requests received from the CompactCoordinator are first held in memory and snapshotted into the state of a checkpoint. Once that checkpoint completes successfully, all requests received before it can be submitted. The results can be emitted at the next invocation of prepareSnapshotPreBarrier(long) after the compaction has finished, which ensures that committers receive only one CommittableSummary and the corresponding number of Committables for a single checkpoint.
- See Also:
- Serialized Form
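The lifecycle in the class description can be modeled roughly as follows. This is a minimal plain-Java sketch with no Flink dependency, not the operator's actual implementation: the class name CompactionLifecycleSketch, the use of String for requests and results, and the synchronous "compaction" are all assumptions made for illustration. It only shows the ordering: buffer, snapshot, submit on checkpoint completion, emit at the next pre-barrier call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the buffering lifecycle; not Flink code. */
class CompactionLifecycleSketch {
    // Requests received since the last completed checkpoint (held in memory).
    private final List<String> pendingRequests = new ArrayList<>();
    // Results of compactions that have finished but were not emitted yet.
    private final List<String> finishedResults = new ArrayList<>();
    // Stand-in for the operator output seen by the committer.
    private final List<String> emitted = new ArrayList<>();

    void processElement(String request) {
        pendingRequests.add(request);
    }

    /** Returns a copy of the pending requests, standing in for checkpointed state. */
    List<String> snapshotState() {
        return new ArrayList<>(pendingRequests);
    }

    /** Once the checkpoint is complete, the buffered requests may be submitted. */
    void notifyCheckpointComplete(long checkpointId) {
        for (String request : pendingRequests) {
            // Assumption: compaction runs synchronously here to keep the sketch short.
            finishedResults.add("compacted(" + request + ")");
        }
        pendingRequests.clear();
    }

    /** Emits one summary plus the finished results, or nothing if none finished. */
    void prepareSnapshotPreBarrier(long checkpointId) {
        if (finishedResults.isEmpty()) {
            return;
        }
        emitted.add("summary(checkpoint=" + checkpointId
                + ", committables=" + finishedResults.size() + ")");
        emitted.addAll(finishedResults);
        finishedResults.clear();
    }

    List<String> emitted() {
        return emitted;
    }
}
```

In this model nothing reaches the committer at the checkpoint during which the requests arrive; the summary and its results appear together at the following pre-barrier call.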
-
-
Field Summary
-
Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
combinedWatermark, config, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
-
-
Constructor Summary
Constructors
Constructor Description
CompactorOperator(StreamOperatorParameters<CommittableMessage<FileSinkCommittable>> parameters, FileCompactStrategy strategy, SimpleVersionedSerializer<FileSinkCommittable> committableSerializer, FileCompactor fileCompactor, BucketWriter<?,String> bucketWriter)
-
Method Summary
Modifier and Type Method Description
void close()
This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.
void endInput()
It is notified that no more data will arrive from the input.
CompletableFuture<?> getAllTasksFuture()
void initializeState(StateInitializationContext context)
Stream operators with state which can be restored need to override this hook method.
void notifyCheckpointComplete(long checkpointId)
Notifies the listener that the checkpoint with the given checkpointId completed and was committed.
void open()
This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
void prepareSnapshotPreBarrier(long checkpointId)
This method is called when the operator should do a snapshot, before it emits its own checkpoint barrier.
void processElement(StreamRecord<CompactorRequest> element)
Processes one element that arrived on this input of the MultipleInputStreamOperator.
void snapshotState(StateSnapshotContext context)
Stream operators with state, which want to participate in a snapshot need to override this hook method.
-
Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
finish, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, isAsyncStateProcessingEnabled, isUsingCustomRawKeyedState, notifyCheckpointAborted, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, setup, snapshotState, useSplittableTimers
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted
-
Methods inherited from interface org.apache.flink.streaming.api.operators.Input
processLatencyMarker, processRecordAttributes, processWatermark, processWatermarkStatus
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext
getCurrentKey, setCurrentKey
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext
-
Methods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator
setKeyContextElement
-
Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
finish, getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, setKeyContextElement1, setKeyContextElement2, snapshotState
-
-
-
-
Constructor Detail
-
CompactorOperator
public CompactorOperator(StreamOperatorParameters<CommittableMessage<FileSinkCommittable>> parameters, FileCompactStrategy strategy, SimpleVersionedSerializer<FileSinkCommittable> committableSerializer, FileCompactor fileCompactor, BucketWriter<?,String> bucketWriter)
-
-
Method Detail
-
open
public void open() throws Exception
Description copied from class: AbstractStreamOperator
This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
The default implementation does nothing.
- Specified by:
open in interface StreamOperator<CommittableMessage<FileSinkCommittable>>
- Overrides:
open in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
- Throws:
Exception - An exception in this method causes the operator to fail.
-
processElement
public void processElement(StreamRecord<CompactorRequest> element) throws Exception
Description copied from interface: Input
Processes one element that arrived on this input of the MultipleInputStreamOperator. This method is guaranteed to not be called concurrently with other methods of the operator.
- Specified by:
processElement in interface Input<CompactorRequest>
- Throws:
Exception
-
endInput
public void endInput() throws Exception
Description copied from interface: BoundedOneInput
It is notified that no more data will arrive from the input.
WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later on be committed, e.g. in CheckpointListener.notifyCheckpointComplete(long).
NOTE: Because this method is semantically very similar to the StreamOperator.finish() method, it might be dropped in favour of that method at some point in time.
- Specified by:
endInput in interface BoundedOneInput
- Throws:
Exception
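The warning above separates flushing from committing. A minimal sketch of that split is shown below in plain Java, without Flink types. The class EndInputFlushSketch, its three lists, and the String records are hypothetical names chosen for illustration; this is not how CompactorOperator itself is written.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model: flush at end of input, commit only after a completed checkpoint. */
class EndInputFlushSketch {
    // Records buffered in memory while input is still arriving.
    private final List<String> buffer = new ArrayList<>();
    // Records flushed out of the buffer but not yet committed.
    private final List<String> staged = new ArrayList<>();
    // Records whose side effects are final.
    private final List<String> committed = new ArrayList<>();

    void processElement(String record) {
        buffer.add(record);
    }

    /** Flushes buffered data; deliberately performs no commit. */
    void endInput() {
        staged.addAll(buffer);
        buffer.clear();
    }

    /** Commits whatever has been staged once a checkpoint has completed. */
    void notifyCheckpointComplete(long checkpointId) {
        committed.addAll(staged);
        staged.clear();
    }

    List<String> staged() {
        return staged;
    }

    List<String> committed() {
        return committed;
    }
}
```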
-
notifyCheckpointComplete
public void notifyCheckpointComplete(long checkpointId) throws Exception
Description copied from interface: CheckpointListener
Notifies the listener that the checkpoint with the given checkpointId completed and was committed.
These notifications are "best effort", meaning they can sometimes be skipped. To behave properly, implementers need to follow the "Checkpoint Subsuming Contract". Please see the class-level JavaDocs for details.
Please note that checkpoints may generally overlap, so you cannot assume that the notifyCheckpointComplete() call is always for the latest prior checkpoint (or snapshot) that was taken on the function/operator implementing this interface. It might be for a checkpoint that was triggered earlier. Properly implementing the "Checkpoint Subsuming Contract" (see above) handles this situation as well.
Please note that throwing exceptions from this method will not cause the completed checkpoint to be revoked. Throwing exceptions will typically cause task/job failure and trigger recovery.
- Specified by:
notifyCheckpointComplete in interface CheckpointListener
- Overrides:
notifyCheckpointComplete in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
- Parameters:
checkpointId - The ID of the checkpoint that has been completed.
- Throws:
Exception - This method can propagate exceptions, which leads to a failure/recovery for the task. Note that this will NOT lead to the checkpoint being revoked.
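Because notifications can be skipped or arrive for an earlier checkpoint, one common way to tolerate this is to key pending work by checkpoint ID and, on each notification, handle everything up to and including the notified ID. The sketch below illustrates that idea in plain Java. SubsumingListenerSketch and its methods are hypothetical, and the sketch is an assumption about how such a listener could be written, not a description of CompactorOperator internals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical listener that treats a notification for N as covering all IDs <= N. */
class SubsumingListenerSketch {
    // Pending work grouped by the checkpoint that captured it, ordered by ID.
    private final NavigableMap<Long, List<String>> pendingByCheckpoint = new TreeMap<>();
    private final List<String> committed = new ArrayList<>();

    void addPending(long checkpointId, String item) {
        pendingByCheckpoint.computeIfAbsent(checkpointId, id -> new ArrayList<>()).add(item);
    }

    void notifyCheckpointComplete(long checkpointId) {
        // View of all entries with ID <= checkpointId; clearing it removes them
        // from the backing map, so a repeated or late notification is a no-op.
        NavigableMap<Long, List<String>> covered =
                pendingByCheckpoint.headMap(checkpointId, true);
        for (List<String> items : covered.values()) {
            committed.addAll(items);
        }
        covered.clear();
    }

    List<String> committed() {
        return committed;
    }

    int pendingCheckpoints() {
        return pendingByCheckpoint.size();
    }
}
```

With this structure, a missed notification for checkpoint 1 is recovered when the notification for checkpoint 2 arrives, and a notification that is older than work already handled changes nothing.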
-
prepareSnapshotPreBarrier
public void prepareSnapshotPreBarrier(long checkpointId) throws Exception
Description copied from interface: StreamOperator
This method is called when the operator should do a snapshot, before it emits its own checkpoint barrier.
This method is intended not for any actual state persistence, but only for emitting some data before emitting the checkpoint barrier. It is useful for operators that maintain some small transient state that is inefficient to checkpoint (especially when it would need to be checkpointed in a re-scalable way) but can simply be sent downstream before the checkpoint. An example is opportunistic pre-aggregation operators, which have a small pre-aggregation state that is frequently flushed downstream.
Important: This method should not be used for any actual state snapshot logic, because it will inherently be within the synchronous part of the operator's checkpoint. If heavy work is done within this method, it will affect latency and downstream checkpoint alignments.
- Specified by:
prepareSnapshotPreBarrier in interface StreamOperator<CommittableMessage<FileSinkCommittable>>
- Overrides:
prepareSnapshotPreBarrier in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
- Parameters:
checkpointId - The ID of the checkpoint.
- Throws:
Exception - Throwing an exception here causes the operator to fail and go into recovery.
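The pre-aggregation case mentioned in the description can be sketched as follows. This is a plain-Java model under stated assumptions: PreAggregationSketch, the per-key counts, and the downstream list standing in for the operator output are all hypothetical, and no Flink classes are used. The point is only that the small transient state is sent downstream and cleared before the barrier, so nothing has to be checkpointed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-aggregating operator that flushes before the barrier. */
class PreAggregationSketch {
    // Small transient state: partial counts per key, in insertion order.
    private final Map<String, Long> partialCounts = new LinkedHashMap<>();
    // Stand-in for records sent to the downstream operator.
    private final List<String> downstream = new ArrayList<>();

    void processElement(String key) {
        partialCounts.merge(key, 1L, Long::sum);
    }

    /** Cheap flush of the partial aggregates; no state is persisted here. */
    void prepareSnapshotPreBarrier(long checkpointId) {
        partialCounts.forEach((key, count) -> downstream.add(key + "=" + count));
        partialCounts.clear();
    }

    List<String> downstream() {
        return downstream;
    }

    int bufferedKeys() {
        return partialCounts.size();
    }
}
```

The flush is kept to a simple loop because, as the description notes, this call sits in the synchronous part of the checkpoint.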
-
snapshotState
public void snapshotState(StateSnapshotContext context) throws Exception
Description copied from class: AbstractStreamOperator
Stream operators with state, which want to participate in a snapshot need to override this hook method.
- Specified by:
snapshotState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
- Overrides:
snapshotState in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
- Parameters:
context - context that provides information and means required for taking a snapshot
- Throws:
Exception
-
initializeState
public void initializeState(StateInitializationContext context) throws Exception
Description copied from class: AbstractStreamOperator
Stream operators with state which can be restored need to override this hook method.
- Specified by:
initializeState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
- Overrides:
initializeState in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
- Parameters:
context - context that allows registering different states.
- Throws:
Exception
-
close
public void close() throws Exception
Description copied from interface: StreamOperator
This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.
This method is expected to make a thorough effort to release all resources that the operator has acquired.
NOTE: It cannot emit any records! If you need to emit records at the end of processing, do so in the StreamOperator.finish() method.
- Specified by:
close in interface StreamOperator<CommittableMessage<FileSinkCommittable>>
- Overrides:
close in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
- Throws:
Exception
-
getAllTasksFuture
@VisibleForTesting public CompletableFuture<?> getAllTasksFuture()
-
-