public class BatchCompactOperator<T> extends AbstractStreamOperator<CompactMessages.CompactOutput> implements OneInputStreamOperator<CompactMessages.CoordinatorOutput,CompactMessages.CompactOutput>, BoundedOneInput
Similar to CompactOperator, but skips some unnecessary operations in batch mode.
Note: if there is only one file to compact, this operator does nothing and simply emits that file downstream. Also, the files to be compacted are not named as hidden files; they are expected to reside in a hidden or temporary directory. Please make sure this holds. This assumption lets the operator skip renaming hidden files.
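The single-file shortcut described in the note can be pictured with a small, framework-free sketch. It deliberately uses no Flink types: `CompactionSketch`, `compactOrPassThrough`, and the `merge` function are hypothetical stand-ins for the operator's reader and writer machinery, not the actual implementation.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical, framework-free model of the pass-through rule; not Flink's code.
final class CompactionSketch {

    /**
     * Returns the path to emit downstream for one compaction unit.
     * A single input file is forwarded untouched; several files are handed
     * to the (assumed) merge function, which returns the target path.
     */
    static String compactOrPassThrough(List<String> files,
                                       Function<List<String>, String> merge) {
        if (files.isEmpty()) {
            throw new IllegalArgumentException("nothing to compact");
        }
        if (files.size() == 1) {
            // Only one file: no read/write work, just forward it.
            return files.get(0);
        }
        return merge.apply(files);
    }
}
```

Because the single file is forwarded as-is, its name has to be usable downstream already, which is presumably why the note asks for non-hidden file names inside a hidden or temporary directory.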
Modifier and Type | Field and Description |
---|---|
static String | ATTEMPT_PREFIX |
static String | COMPACTED_PREFIX |
static String | UNCOMPACTED_PREFIX |
Fields inherited from class AbstractStreamOperator: chainingStrategy, config, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
Constructor and Description |
---|
BatchCompactOperator(SupplierWithException<FileSystem,IOException> fsFactory, CompactReader.Factory<T> readerFactory, CompactWriter.Factory<T> writerFactory) |
Modifier and Type | Method and Description |
---|---|
void | close() This method is called at the very end of the operator's life, both in the case of a successful completion of the operation and in the case of a failure or cancellation. |
static Path | convertFromUncompacted(Path path) |
void | endInput() It is notified that no more data will arrive from the input. |
void | open() This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization. |
void | processElement(StreamRecord<CompactMessages.CoordinatorOutput> element) Processes one element that arrived on this input of the MultipleInputStreamOperator. |
Methods inherited from class AbstractStreamOperator: finish, getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, prepareSnapshotPreBarrier, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, setup, snapshotState, snapshotState, useSplittableTimers
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface OneInputStreamOperator: setKeyContextElement
Methods inherited from interface StreamOperator: finish, getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
Methods inherited from interface CheckpointListener: notifyCheckpointAborted, notifyCheckpointComplete
Methods inherited from interface KeyContext: getCurrentKey, setCurrentKey
Methods inherited from interface Input: processLatencyMarker, processRecordAttributes, processWatermark, processWatermarkStatus
Methods inherited from interface KeyContextHandler: hasKeyContext
public static final String UNCOMPACTED_PREFIX
public static final String COMPACTED_PREFIX
public static final String ATTEMPT_PREFIX
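This page shows neither the values of the prefix constants nor the body of `convertFromUncompacted(Path)`. As a rough illustration of how a prefix-based file-name conversion could work, the sketch below strips a prefix from the last path element. Everything in it is an assumption made for illustration: the prefix value `"uncompacted-"`, the class and method names, and the use of `java.nio.file.Path` in place of Flink's own `Path` type.

```java
import java.nio.file.Path;

// Hypothetical sketch: the prefix value and the method body are assumptions,
// and java.nio.file.Path stands in for Flink's own Path type.
final class PrefixSketch {
    static final String ASSUMED_UNCOMPACTED_PREFIX = "uncompacted-";

    /** Strips the assumed prefix from the file name, keeping the parent directory. */
    static Path stripUncompactedPrefix(Path path) {
        String name = path.getFileName().toString();
        if (!name.startsWith(ASSUMED_UNCOMPACTED_PREFIX)) {
            throw new IllegalArgumentException("not an uncompacted file: " + path);
        }
        String stripped = name.substring(ASSUMED_UNCOMPACTED_PREFIX.length());
        // resolveSibling replaces only the last path element.
        return path.resolveSibling(stripped);
    }
}
```

Rejecting names that lack the prefix, rather than returning them unchanged, is a design choice of this sketch; it makes a mislabelled input fail loudly instead of silently passing through.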
public BatchCompactOperator(SupplierWithException<FileSystem,IOException> fsFactory, CompactReader.Factory<T> readerFactory, CompactWriter.Factory<T> writerFactory)
public void open() throws Exception
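Description copied from class: AbstractStreamOperator
This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
The default implementation does nothing.
Specified by: open in interface StreamOperator<CompactMessages.CompactOutput>
Overrides: open in class AbstractStreamOperator<CompactMessages.CompactOutput>
Throws: Exception - An exception in this method causes the operator to fail.
public void processElement(StreamRecord<CompactMessages.CoordinatorOutput> element) throws Exception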
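Description copied from interface: Input
Processes one element that arrived on this input of the MultipleInputStreamOperator. This method is guaranteed to not be called concurrently with other methods of the operator.
Specified by: processElement in interface Input<CompactMessages.CoordinatorOutput>
Throws: Exception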
public void endInput() throws Exception
Description copied from interface: BoundedOneInput
It is notified that no more data will arrive from the input.
WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later be committed, e.g. in CheckpointListener.notifyCheckpointComplete(long).
NOTE: Because it is semantically very similar to the StreamOperator.finish() method, it might be dropped in favour of that method at some point.
Specified by: endInput in interface BoundedOneInput
Throws: Exception
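The warning above separates flushing from committing. A minimal model of that split, assuming nothing about Flink's internals, might look like the following. `FlushThenCommitSketch` and its three lists are hypothetical; only the method names `processElement`, `endInput`, and `notifyCheckpointComplete` echo the interfaces mentioned on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of "flush in endInput, commit on checkpoint completion".
// None of these types are Flink classes.
final class FlushThenCommitSketch {
    private final List<String> buffered = new ArrayList<>();   // not yet flushed
    private final List<String> pending = new ArrayList<>();    // flushed, awaiting commit
    private final List<String> committed = new ArrayList<>();  // externally visible

    void processElement(String record) {
        buffered.add(record);
    }

    /** Mirrors endInput(): flush buffered data, but do not commit it. */
    void endInput() {
        pending.addAll(buffered);
        buffered.clear();
    }

    /** Mirrors a checkpoint-complete callback: the only place data becomes visible. */
    void notifyCheckpointComplete(long checkpointId) {
        committed.addAll(pending);
        pending.clear();
    }

    List<String> pending() {
        return List.copyOf(pending);
    }

    List<String> committed() {
        return List.copyOf(committed);
    }
}
```

In this model, calling `endInput()` moves records into the pending list only; they become visible solely through the later callback.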
public void close() throws Exception
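Description copied from interface: StreamOperator
This method is called at the very end of the operator's life, both in the case of a successful completion of the operation and in the case of a failure or cancellation.
This method is expected to make a thorough effort to release all resources that the operator has acquired.
NOTE: It cannot emit any records! If you need to emit records at the end of processing, do so in the StreamOperator.finish() method.
Specified by: close in interface StreamOperator<CompactMessages.CompactOutput>
Overrides: close in class AbstractStreamOperator<CompactMessages.CompactOutput>
Throws: Exception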
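One way to read "a thorough effort to release all resources" is that a failure while closing one resource should not prevent the others from being closed. The helper below sketches that idea in plain Java. `CloseAllSketch` and `closeAll` are hypothetical names, not part of Flink, and the sketch says nothing about which resources this operator actually holds.

```java
import java.util.List;

// Hypothetical helper, not part of Flink: close every resource even if some fail.
final class CloseAllSketch {

    /**
     * Attempts to close each resource in order. The first failure is rethrown
     * after all resources were attempted; later failures are attached to it
     * as suppressed exceptions.
     */
    static void closeAll(List<? extends AutoCloseable> resources) throws Exception {
        Exception first = null;
        for (AutoCloseable resource : resources) {
            try {
                resource.close();
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

Note that the helper only releases resources and reports failures; consistent with the NOTE above, nothing in it emits records.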
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.