@Internal public class ContinuousFileReaderOperator<OUT,T extends TimestampedInputSplit> extends AbstractStreamOperator<OUT> implements OneInputStreamOperator<T,OUT>, OutputTypeConfigurable<OUT>
The operator that reads the splits received from the preceding
ContinuousFileMonitoringFunction. Contrary to the
ContinuousFileMonitoringFunction, which has a parallelism of 1, this operator can have a degree of parallelism (DOP) > 1.
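This implementation uses a MailboxExecutor to execute each action and follows a state machine
approach. The workflow is the following:
1. Start in IDLE.
2. Upon receiving a split, add it to the queue, switch to OPENING and enqueue a mail to process it.
3. Open the file, switch to READING, read one record, re-enqueue self.
4. If no more records or splits are available, switch back to IDLE.
On close:
1. If IDLE then close immediately.
2. Otherwise switch to CLOSING and call yield in a loop until state is CLOSED.
3. yield() causes remaining records (and splits) to be processed in the same way as above.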
Using MailboxExecutor avoids the need for explicit synchronization. At most one mail
should be enqueued at any given time.
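The mailbox workflow described above can be modelled without any Flink dependency. The following is a minimal single-threaded sketch, not Flink's implementation: a plain queue of Runnable stands in for MailboxExecutor, a split is modelled as a list of strings, and every class, field and method name (MailboxReaderSketch, addSplit, yieldOnce, maxQueuedMails) is hypothetical. The state names follow the class description as far as it gives them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

// Simplified, single-threaded model of the mailbox pattern; NOT Flink's implementation.
// All names here are hypothetical and chosen for illustration only.
class MailboxReaderSketch {
    enum State { IDLE, OPENING, READING, CLOSING, CLOSED }

    private final Queue<Runnable> mailbox = new ArrayDeque<>();    // stands in for MailboxExecutor
    private final Queue<List<String>> splits = new ArrayDeque<>(); // pending splits
    private Iterator<String> current;                               // records of the open split
    final List<String> output = new ArrayList<>();                  // emitted records
    State state = State.IDLE;
    int maxQueuedMails = 0;                                         // tracked to show the invariant

    // Upon receiving a split: queue it and, if idle, enqueue one mail to process it.
    void addSplit(List<String> split) {
        splits.add(split);
        if (state == State.IDLE) {
            state = State.OPENING;
            enqueue();
        }
    }

    private void enqueue() {
        mailbox.add(this::processOneAction);
        maxQueuedMails = Math.max(maxQueuedMails, mailbox.size());
    }

    // One mail performs one small action: open the next split or read a single record.
    private void processOneAction() {
        if (current == null || !current.hasNext()) {
            List<String> next = splits.poll();
            if (next == null) { // nothing left to read
                current = null;
                state = (state == State.CLOSING) ? State.CLOSED : State.IDLE;
                return;
            }
            current = next.iterator(); // "open file"
            if (state != State.CLOSING) {
                state = State.READING;
            }
        } else {
            output.add(current.next()); // read one record
        }
        enqueue(); // re-enqueue self
    }

    // Runs one queued mail, like a single yield step; returns false if the mailbox is empty.
    boolean yieldOnce() {
        Runnable mail = mailbox.poll();
        if (mail == null) {
            return false;
        }
        mail.run();
        return true;
    }

    // On close: finish at once if idle, otherwise yield until the remaining work is done.
    void close() {
        if (state == State.IDLE) {
            state = State.CLOSED;
            return;
        }
        state = State.CLOSING;
        while (state != State.CLOSED) {
            if (!yieldOnce()) {
                throw new IllegalStateException("no mail queued while closing");
            }
        }
    }

    public static void main(String[] args) {
        MailboxReaderSketch reader = new MailboxReaderSketch();
        reader.addSplit(Arrays.asList("a", "b"));
        reader.addSplit(Arrays.asList("c"));
        reader.yieldOnce(); // opens the first split
        reader.yieldOnce(); // reads one record
        reader.close();     // drains the remaining records and splits
        System.out.println(reader.output + " " + reader.state + " " + reader.maxQueuedMails);
    }
}
```

Because each mail re-enqueues itself only after it has finished its own action, the queue in this model never holds more than one mail, and all state is touched from a single thread, which is why no locks appear.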
Using an FSM approach makes it possible to explicitly define states and enforce
the transitions between them.
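One common way to enforce transitions is an explicit table of allowed successor states that is checked on every switch. The sketch below illustrates that idea only. The state names follow the class description as far as it gives them, while the transition table itself and the names ReaderFsmSketch, switchTo and current are assumptions made for illustration, not the operator's actual code.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an explicit transition table; the table contents are an
// assumption for illustration and are not taken from Flink's source.
class ReaderFsmSketch {
    enum State { IDLE, OPENING, READING, CLOSING, CLOSED }

    private static final Map<State, Set<State>> VALID = new EnumMap<>(State.class);

    static {
        VALID.put(State.IDLE, EnumSet.of(State.OPENING, State.CLOSED));
        VALID.put(State.OPENING, EnumSet.of(State.READING, State.CLOSING));
        VALID.put(State.READING, EnumSet.of(State.IDLE, State.OPENING, State.CLOSING));
        VALID.put(State.CLOSING, EnumSet.of(State.CLOSED));
        VALID.put(State.CLOSED, EnumSet.noneOf(State.class));
    }

    private State state = State.IDLE;

    State current() {
        return state;
    }

    // Moves to the target state, rejecting any transition not listed in the table.
    void switchTo(State target) {
        if (!VALID.get(state).contains(target)) {
            throw new IllegalStateException("invalid transition " + state + " -> " + target);
        }
        state = target;
    }

    public static void main(String[] args) {
        ReaderFsmSketch fsm = new ReaderFsmSketch();
        fsm.switchTo(State.OPENING);
        fsm.switchTo(State.READING);
        try {
            fsm.switchTo(State.CLOSED); // not listed for READING in this table
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keeping the table in one place means an unexpected state change fails fast at the point of the switch instead of surfacing later as corrupted reader state.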
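|Modifier and Type|Method and Description|
|---|---|
|void|close(): This method is called after all records have been added to the operator.|
|void|dispose(): This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.|
|void|initializeState(StateInitializationContext context): Stream operators with state which can be restored need to override this hook method.|
|void|open(): This method is called immediately before any elements are processed; it should contain the operator's initialization logic.|
|void|processElement(StreamRecord<T> element): Processes one element that arrived on this input of the MultipleInputStreamOperator.|
|void|setOutputType(TypeInformation<OUT> outTypeInfo, ExecutionConfig executionConfig): Is called by the StreamGraph#addOperator(Integer, String, StreamOperator, TypeInformation, TypeInformation, String) method when the StreamGraph is generated.|
|void|snapshotState(StateSnapshotContext context): Stream operators with state which want to participate in a snapshot need to override this hook method.|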
Methods inherited from class AbstractStreamOperator:
getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, prepareSnapshotPreBarrier, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface StreamOperator:
getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
public void initializeState(StateInitializationContext context) throws Exception
public void open() throws Exception
The default implementation does nothing.
public void processElement(StreamRecord<T> element) throws Exception
Processes one element that arrived on this input of the MultipleInputStreamOperator. This method is guaranteed to not be called concurrently with other methods of the operator.
public void processWatermark(Watermark mark) throws Exception
Processes a Watermark that arrived on the first input of this two-input operator. This method is guaranteed to not be called concurrently with other methods of the operator.
public void dispose() throws Exception
This method is expected to make a thorough effort to release all resources that the operator has acquired.
public void close() throws Exception
The method is expected to flush all remaining buffered data. Exceptions during this flushing of buffered data should be propagated, in order to cause the operation to be recognized as failed, because the last data items are not processed properly.
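The contract between close() and dispose() described above can be illustrated with a small stand-alone model. This is a sketch under stated assumptions rather than Flink code: BufferingOperator, its failOnFlush switch and the run driver are hypothetical, and the driver only mimics how a caller might react to an exception thrown while flushing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical operator and driver; a sketch of the close()/dispose() contract, not Flink code.
class LifecycleSketch {
    static class BufferingOperator {
        final List<String> buffer = new ArrayList<>();
        final List<String> sink = new ArrayList<>();
        boolean failOnFlush; // test switch to simulate a flush failure
        boolean disposed;

        void processElement(String element) {
            buffer.add(element);
        }

        // Flushes buffered data; a failure here is propagated to the caller.
        void close() throws Exception {
            if (failOnFlush) {
                throw new Exception("flush failed");
            }
            sink.addAll(buffer);
            buffer.clear();
        }

        // Releases everything the operator holds; called on success and on failure alike.
        void dispose() {
            buffer.clear();
            disposed = true;
        }
    }

    // Drives one operator through its life; returns true if the run succeeded.
    static boolean run(BufferingOperator op, List<String> input) {
        try {
            for (String element : input) {
                op.processElement(element);
            }
            op.close(); // an exception here marks the whole run as failed
            return true;
        } catch (Exception e) {
            return false;
        } finally {
            op.dispose(); // always reached
        }
    }

    public static void main(String[] args) {
        BufferingOperator op = new BufferingOperator();
        boolean succeeded = run(op, Arrays.asList("x", "y"));
        System.out.println(succeeded + " " + op.sink + " " + op.disposed);
    }
}
```

Swallowing the exception inside close() would let the run report success even though the last buffered items never reached the sink, which is the situation the description warns about.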
public void snapshotState(StateSnapshotContext context) throws Exception
public void setOutputType(TypeInformation<OUT> outTypeInfo, ExecutionConfig executionConfig)
Is called by the org.apache.flink.streaming.api.graph.StreamGraph#addOperator(Integer, String, StreamOperator, TypeInformation, TypeInformation, String) method when the
StreamGraph is generated. The method is called with the output
TypeInformation, which is also used for the StreamTask output serializer.
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.