CompactorOperator (Flink : 2.0-SNAPSHOT API)

java.lang.Object
- org.apache.flink.streaming.api.operators.AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
- - org.apache.flink.connector.file.sink.compactor.operator.CompactorOperator

All Implemented Interfaces:

Serializable, CheckpointListener, BoundedOneInput, Input<CompactorRequest>, KeyContext, KeyContextHandler, OneInputStreamOperator<CompactorRequest,CommittableMessage<FileSinkCommittable>>, SetupableStreamOperator<CommittableMessage<FileSinkCommittable>>, StreamOperator<CommittableMessage<FileSinkCommittable>>, StreamOperatorStateHandler.CheckpointedStreamOperator, YieldingOperator<CommittableMessage<FileSinkCommittable>>
```
@Internal
public class CompactorOperator
extends AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
implements OneInputStreamOperator<CompactorRequest,CommittableMessage<FileSinkCommittable>>, BoundedOneInput, CheckpointListener
```
An operator that performs compaction for the FileSink.
Requests received from the CompactCoordinator are first held in memory and snapshotted into the state of a checkpoint. Once that checkpoint completes successfully, all requests received before it can be submitted. The results are emitted at the next invocation of prepareSnapshotPreBarrier(long) after the compaction has finished, which ensures that committers receive only one CommittableSummary and the corresponding number of Committable messages for a single checkpoint.
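
The ordering described above can be illustrated with a small, Flink-free model. This is only a sketch under stated assumptions: the class name, the String stand-ins for requests and results, and the synchronous "compaction" are all hypothetical and are not taken from the operator's source, where compaction is submitted as asynchronous work.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the lifecycle described above. All names are
// hypothetical, and compaction is done synchronously for brevity.
class CompactorLifecycleSketch {
    private final List<String> collecting = new ArrayList<>();  // received, not yet snapshotted
    private final List<String> snapshotted = new ArrayList<>(); // covered by a snapshot
    private final List<String> results = new ArrayList<>();     // compacted, not yet emitted
    private final List<String> emitted = new ArrayList<>();     // sent downstream

    void processElement(String request) {
        collecting.add(request); // hold the request in memory
    }

    void snapshotState() {
        snapshotted.addAll(collecting); // the request is now part of a checkpoint
        collecting.clear();
    }

    void notifyCheckpointComplete() {
        // Only requests covered by a completed checkpoint are compacted.
        for (String request : snapshotted) {
            results.add("compacted(" + request + ")");
        }
        snapshotted.clear();
    }

    void prepareSnapshotPreBarrier() {
        emitted.addAll(results); // emit finished results before the next barrier
        results.clear();
    }

    List<String> emitted() {
        return new ArrayList<>(emitted);
    }

    public static void main(String[] args) {
        CompactorLifecycleSketch op = new CompactorLifecycleSketch();
        op.processElement("req-1");
        op.processElement("req-2");
        op.prepareSnapshotPreBarrier(); // checkpoint 1: nothing to emit yet
        op.snapshotState();
        op.notifyCheckpointComplete();  // checkpoint 1 completed: requests are compacted
        System.out.println(op.emitted()); // prints []
        op.prepareSnapshotPreBarrier(); // checkpoint 2: results go downstream
        System.out.println(op.emitted()); // prints [compacted(req-1), compacted(req-2)]
    }
}
```

In this model, results never reach the downstream list in the same checkpoint cycle in which their requests were received, which mirrors the "emit at the next prepareSnapshotPreBarrier(long)" rule.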

See Also:

Serialized Form

Field Summary
- Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
  chainingStrategy, config, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager

Constructor Summary

Constructors
Constructor and Description
`CompactorOperator(FileCompactStrategy strategy, SimpleVersionedSerializer<FileSinkCommittable> committableSerializer, FileCompactor fileCompactor, BucketWriter<?,String> bucketWriter)`

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`void`	`close()` This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.
`void`	`endInput()` It is notified that no more data will arrive from the input.
`CompletableFuture<?>`	`getAllTasksFuture()`
`void`	`initializeState(StateInitializationContext context)` Stream operators with state which can be restored need to override this hook method.
`void`	`notifyCheckpointComplete(long checkpointId)` Notifies the listener that the checkpoint with the given `checkpointId` completed and was committed.
`void`	`open()` This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
`void`	`prepareSnapshotPreBarrier(long checkpointId)` This method is called when the operator should do a snapshot, before it emits its own checkpoint barrier.
`void`	`processElement(StreamRecord<CompactorRequest> element)` Processes one element that arrived on this input of the `MultipleInputStreamOperator`.
`void`	`snapshotState(StateSnapshotContext context)` Stream operators with state, which want to participate in a snapshot need to override this hook method.

Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
finish, getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, setup, snapshotState, useSplittableTimers

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator
setKeyContextElement

Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
finish, getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, setKeyContextElement1, setKeyContextElement2, snapshotState

Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted

Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext
getCurrentKey, setCurrentKey

Methods inherited from interface org.apache.flink.streaming.api.operators.Input
processLatencyMarker, processRecordAttributes, processWatermark, processWatermarkStatus

Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext

- Constructor Detail
  - CompactorOperator
```
public CompactorOperator(FileCompactStrategy strategy,
                         SimpleVersionedSerializer<FileSinkCommittable> committableSerializer,
                         FileCompactor fileCompactor,
                         BucketWriter<?,String> bucketWriter)
```
- Method Detail
  - open
```
public void open()
          throws Exception
```
    Description copied from class: AbstractStreamOperator
    
    This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
    The default implementation does nothing.
    
    Specified by:
    
    open in interface StreamOperator<CommittableMessage<FileSinkCommittable>>
    
    Overrides:
    
    open in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
    
    Throws:
    
    Exception - An exception in this method causes the operator to fail.
  - processElement
```
public void processElement(StreamRecord<CompactorRequest> element)
                    throws Exception
```
    Description copied from interface: Input
    
    Processes one element that arrived on this input of the MultipleInputStreamOperator. This method is guaranteed to not be called concurrently with other methods of the operator.
    
    Specified by:
    
    processElement in interface Input<CompactorRequest>
    
    Throws:
    
    Exception
  - endInput
```
public void endInput()
              throws Exception
```
    Description copied from interface: BoundedOneInput
    
    It is notified that no more data will arrive from the input.
    WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later on be committed e.g. in a CheckpointListener.notifyCheckpointComplete(long).
    NOTE: This method is semantically very similar to the StreamOperator.finish() method, so it might be dropped in favour of that method at some point in time.
    
    Specified by:
    
    endInput in interface BoundedOneInput
    
    Throws:
    
    Exception
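
    The warning above separates flushing from committing. A minimal sketch of that split, using a hypothetical Flink-free class (none of these names come from Flink), might look like this: endInput() only moves buffered records into a staging area, and the side effect happens later in notifyCheckpointComplete(long).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: flush at end of input, commit only after a checkpoint completes.
class EndInputSketch {
    private final List<String> buffer = new ArrayList<>();    // records still in memory
    private final List<String> staged = new ArrayList<>();    // flushed, awaiting commit
    private final List<String> committed = new ArrayList<>(); // externally visible

    void processElement(String record) {
        buffer.add(record);
    }

    // Flush only: nothing becomes externally visible here.
    void endInput() {
        staged.addAll(buffer);
        buffer.clear();
    }

    // The commit (the side effect) is tied to a completed checkpoint.
    void notifyCheckpointComplete(long checkpointId) {
        committed.addAll(staged);
        staged.clear();
    }

    List<String> committed() {
        return new ArrayList<>(committed);
    }

    public static void main(String[] args) {
        EndInputSketch op = new EndInputSketch();
        op.processElement("r1");
        op.endInput();
        System.out.println(op.committed()); // prints []
        op.notifyCheckpointComplete(1L);
        System.out.println(op.committed()); // prints [r1]
    }
}
```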
  - notifyCheckpointComplete
```
public void notifyCheckpointComplete(long checkpointId)
                              throws Exception
```
    Description copied from interface: CheckpointListener
    
    Notifies the listener that the checkpoint with the given checkpointId completed and was committed.
    These notifications are "best effort", meaning they can sometimes be skipped. To behave properly, implementers need to follow the "Checkpoint Subsuming Contract". Please see the class-level JavaDocs for details.
    Please note that checkpoints may generally overlap, so you cannot assume that the notifyCheckpointComplete() call is always for the latest prior checkpoint (or snapshot) that was taken on the function/operator implementing this interface. It might be for a checkpoint that was triggered earlier. Properly implementing the "Checkpoint Subsuming Contract" (see above) handles this situation as well.
    Please note that throwing exceptions from this method will not cause the completed checkpoint to be revoked. Throwing exceptions will typically cause task/job failure and trigger recovery.
    
    Specified by:
    
    notifyCheckpointComplete in interface CheckpointListener
    
    Overrides:
    
    notifyCheckpointComplete in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
    
    Parameters:
    
    checkpointId - The ID of the checkpoint that has been completed.
    
    Throws:
    
    Exception - This method can propagate exceptions, which leads to a failure/recovery for the task. Note that this will NOT lead to the checkpoint being revoked.
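
    Because notifications are best effort and may arrive late, one common way to honour the subsuming idea is to treat a notification for checkpoint N as covering all pending work staged for checkpoints up to and including N. The sketch below is a hypothetical, Flink-free model of that bookkeeping (the class and method names are assumptions, not part of any Flink API).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model: a notification for checkpoint N subsumes all earlier ones.
class SubsumingListenerSketch {
    private final NavigableMap<Long, List<String>> pending = new TreeMap<>();
    private final List<String> committed = new ArrayList<>();

    // Remember work that belongs to the given checkpoint.
    void stage(long checkpointId, String item) {
        pending.computeIfAbsent(checkpointId, id -> new ArrayList<>()).add(item);
    }

    void notifyCheckpointComplete(long checkpointId) {
        // View of every entry with an id <= checkpointId, including skipped ones.
        NavigableMap<Long, List<String>> covered = pending.headMap(checkpointId, true);
        for (List<String> items : covered.values()) {
            committed.addAll(items);
        }
        covered.clear(); // clearing the view removes the entries from 'pending'
    }

    List<String> committed() {
        return new ArrayList<>(committed);
    }

    int pendingCheckpoints() {
        return pending.size();
    }

    public static void main(String[] args) {
        SubsumingListenerSketch listener = new SubsumingListenerSketch();
        listener.stage(1L, "a");
        listener.stage(2L, "b");
        listener.stage(3L, "c");
        listener.stage(4L, "d");
        listener.notifyCheckpointComplete(3L); // notifications for 1 and 2 were skipped
        listener.notifyCheckpointComplete(2L); // a late notification finds nothing left to do
        System.out.println(listener.committed());          // prints [a, b, c]
        System.out.println(listener.pendingCheckpoints()); // prints 1
    }
}
```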
  - prepareSnapshotPreBarrier
```
public void prepareSnapshotPreBarrier(long checkpointId)
                               throws Exception
```
    Description copied from interface: StreamOperator
    
    This method is called when the operator should do a snapshot, before it emits its own checkpoint barrier.
    This method is not intended for any actual state persistence, but only for emitting some data before emitting the checkpoint barrier. It is meant for operators that maintain some small transient state that is inefficient to checkpoint (especially when it would need to be checkpointed in a re-scalable way) but can simply be sent downstream before the checkpoint. An example is opportunistic pre-aggregation operators, which have small pre-aggregation state that is frequently flushed downstream.
    Important: This method should not be used for any actual state snapshot logic, because it will inherently be within the synchronous part of the operator's checkpoint. If heavy work is done within this method, it will affect latency and downstream checkpoint alignments.
    
    Specified by:
    
    prepareSnapshotPreBarrier in interface StreamOperator<CommittableMessage<FileSinkCommittable>>
    
    Overrides:
    
    prepareSnapshotPreBarrier in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
    
    Parameters:
    
    checkpointId - The ID of the checkpoint.
    
    Throws:
    
    Exception - Throwing an exception here causes the operator to fail and go into recovery.
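
    The pre-aggregation case mentioned above can be sketched without Flink as follows. The class, the String-encoded output, and the checkpoint(long) helper that stands in for the runtime are all hypothetical; the point is only that the partial sums are sent downstream before the barrier, so no aggregation state is left to snapshot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of an opportunistic pre-aggregation operator.
class PreAggregationSketch {
    private final Map<String, Integer> partialSums = new TreeMap<>(); // small transient state
    private final List<String> downstream = new ArrayList<>();        // stands in for the output

    void processElement(String key, int value) {
        partialSums.merge(key, value, Integer::sum);
    }

    // Cheap, synchronous work only: flush the partial sums and forget them.
    void prepareSnapshotPreBarrier(long checkpointId) {
        partialSums.forEach((key, sum) -> downstream.add(key + "=" + sum));
        partialSums.clear();
    }

    // Stand-in for the runtime: pre-barrier hook first, then the barrier itself.
    void checkpoint(long checkpointId) {
        prepareSnapshotPreBarrier(checkpointId);
        downstream.add("barrier-" + checkpointId);
    }

    List<String> downstream() {
        return new ArrayList<>(downstream);
    }

    int bufferedKeys() {
        return partialSums.size();
    }

    public static void main(String[] args) {
        PreAggregationSketch op = new PreAggregationSketch();
        op.processElement("a", 1);
        op.processElement("b", 2);
        op.processElement("a", 3);
        op.checkpoint(7L);
        System.out.println(op.downstream());   // prints [a=4, b=2, barrier-7]
        System.out.println(op.bufferedKeys()); // prints 0
    }
}
```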
  - snapshotState
```
public void snapshotState(StateSnapshotContext context)
                   throws Exception
```
    Description copied from class: AbstractStreamOperator
    
    Stream operators with state, which want to participate in a snapshot need to override this hook method.
    
    Specified by:
    
    snapshotState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
    
    Overrides:
    
    snapshotState in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
    
    Parameters:
    
    context - context that provides information and means required for taking a snapshot
    
    Throws:
    
    Exception
  - initializeState
```
public void initializeState(StateInitializationContext context)
                     throws Exception
```
    Description copied from class: AbstractStreamOperator
    
    Stream operators with state which can be restored need to override this hook method.
    
    Specified by:
    
    initializeState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
    
    Overrides:
    
    initializeState in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
    
    Parameters:
    
    context - context that allows registering different states.
    
    Throws:
    
    Exception
  - close
```
public void close()
           throws Exception
```
    Description copied from interface: StreamOperator
    
    This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.
    This method is expected to make a thorough effort to release all resources that the operator has acquired.
    NOTE: It cannot emit any records! If you need to emit records at the end of processing, do so in the StreamOperator.finish() method.
    
    Specified by:
    
    close in interface StreamOperator<CommittableMessage<FileSinkCommittable>>
    
    Overrides:
    
    close in class AbstractStreamOperator<CommittableMessage<FileSinkCommittable>>
    
    Throws:
    
    Exception
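
    As an illustration of "a thorough effort to release all resources", the sketch below shuts down a worker pool in close() and tolerates being called more than once. It is a hypothetical, Flink-free model; whether and how the real operator manages an executor is not stated on this page.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical model: close() releases the worker pool on success, failure, or cancellation.
class CloseSketch {
    private final ExecutorService workers = Executors.newSingleThreadExecutor();

    Future<?> submit(Runnable task) {
        return workers.submit(task);
    }

    // Releases resources only; it does not emit anything. Safe to call repeatedly.
    void close() {
        workers.shutdownNow(); // interrupt running work and drop queued work
        try {
            if (!workers.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker threads did not stop in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }

    boolean isReleased() {
        return workers.isTerminated();
    }

    public static void main(String[] args) {
        CloseSketch op = new CloseSketch();
        op.submit(() -> { });
        op.close();
        op.close(); // a second call is harmless
        System.out.println(op.isReleased()); // prints true
    }
}
```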
  - getAllTasksFuture
```
@VisibleForTesting
public CompletableFuture<?> getAllTasksFuture()
```
