Class ContinuousFileReaderOperator<OUT,T extends TimestampedInputSplit>
- java.lang.Object
  - org.apache.flink.streaming.api.operators.AbstractStreamOperator<OUT>
    - org.apache.flink.streaming.api.functions.source.ContinuousFileReaderOperator<OUT,T>
- All Implemented Interfaces:
Serializable, CheckpointListener, Input<T>, KeyContext, KeyContextHandler, OneInputStreamOperator<T,OUT>, OutputTypeConfigurable<OUT>, StreamOperator<OUT>, StreamOperatorStateHandler.CheckpointedStreamOperator, YieldingOperator<OUT>
@Internal public class ContinuousFileReaderOperator<OUT,T extends TimestampedInputSplit> extends AbstractStreamOperator<OUT> implements OneInputStreamOperator<T,OUT>, OutputTypeConfigurable<OUT>
The operator that reads the splits received from the preceding ContinuousFileMonitoringFunction. Contrary to the ContinuousFileMonitoringFunction, which has a parallelism of 1, this operator can have a degree of parallelism (DOP) greater than 1.
This implementation uses a MailboxExecutor to execute each action and follows a state machine approach. The workflow is the following:
- start in IDLE
- upon receiving a split, add it to the queue, switch to OPENING, and enqueue a mail to process it
- open the file, switch to READING, read one record, and re-enqueue self
- if no more records or splits are available, switch back to IDLE
On close:
- if IDLE, then close immediately
- otherwise switch to CLOSING and call yield in a loop until the state is CLOSED
- yield() causes remaining records (and splits) to be processed in the same way as above
Using MailboxExecutor avoids explicit synchronization. At most one mail should be enqueued at any given time.
Using the FSM approach makes it possible to explicitly define states and enforce transitions between them.
- See Also:
- Serialized Form
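The workflow described above can be sketched without Flink. The following is a minimal model, not the operator's actual code: the class ReaderStateMachineSketch, its methods, and the use of a plain queue of Runnable as a stand-in for MailboxExecutor are all assumptions made for illustration. Splits are modeled as lists of strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

/** Hypothetical, Flink-free model of the IDLE/OPENING/READING/CLOSING/CLOSED workflow. */
class ReaderStateMachineSketch {
    enum State { IDLE, OPENING, READING, CLOSING, CLOSED }

    private final Queue<Runnable> mailbox = new ArrayDeque<>();    // stand-in for MailboxExecutor
    private final Queue<List<String>> splits = new ArrayDeque<>(); // pending splits
    private final List<String> emitted = new ArrayList<>();        // stand-in for the output
    private Iterator<String> current;                              // records of the open split
    private State state = State.IDLE;

    /** Models processElement: queue the split and, if idle, enqueue one mail. */
    void addSplit(List<String> split) {
        splits.add(split);
        if (state == State.IDLE) {
            state = State.OPENING;
            mailbox.add(this::step);
        }
    }

    /** One mail: open the next split if needed, read at most one record, re-enqueue self. */
    private void step() {
        if (current == null || !current.hasNext()) {
            List<String> next = splits.poll();
            if (next == null) {
                // Nothing left: finish closing, or go back to waiting for splits.
                current = null;
                state = (state == State.CLOSING) ? State.CLOSED : State.IDLE;
                return;
            }
            current = next.iterator();
            if (state != State.CLOSING) {
                state = State.READING;
            }
        }
        if (current.hasNext()) {
            emitted.add(current.next());
        }
        mailbox.add(this::step); // keeps exactly one mail pending while work remains
    }

    /** Models a single yield: run one pending mail, if any. */
    boolean yieldOnce() {
        Runnable mail = mailbox.poll();
        if (mail == null) {
            return false;
        }
        mail.run();
        return true;
    }

    /** Models close: close immediately when idle, otherwise drain remaining work via yield. */
    void close() {
        if (state == State.CLOSED) {
            return;
        }
        if (state == State.IDLE) {
            state = State.CLOSED;
            return;
        }
        state = State.CLOSING;
        while (state != State.CLOSED) {
            if (!yieldOnce()) {
                throw new IllegalStateException("no mail pending while closing");
            }
        }
    }

    State state() { return state; }
    List<String> emitted() { return emitted; }
    int pendingMails() { return mailbox.size(); }
}
```

Because every transition runs inside a mail and the mailbox is drained by a single thread, the sketch needs no locks, which mirrors the motivation given above for using MailboxExecutor.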
-
Field Summary
-
Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
config, lastRecordAttributes1, lastRecordAttributes2, latencyStats, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
-
Method Summary
Modifier and Type, Method, Description
- void close()
This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.
- void finish()
This method is called at the end of data processing.
- void initializeState(StateInitializationContext context)
Stream operators with state which can be restored need to override this hook method.
- void open()
This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
- void processElement(StreamRecord<T> element)
Processes one element that arrived on this input of the MultipleInputStreamOperator.
- void processWatermark(Watermark mark)
Processes a Watermark that arrived on the first input of this two-input operator.
- void setOutputType(TypeInformation<OUT> outTypeInfo, ExecutionConfig executionConfig)
Is called by the org.apache.flink.streaming.api.graph.StreamGraph#addOperator(Integer, String, StreamOperator, TypeInformation, TypeInformation, String) method when the org.apache.flink.streaming.api.graph.StreamGraph is generated.
- void snapshotState(StateSnapshotContext context)
Stream operators with state, which want to participate in a snapshot need to override this hook method.
-
Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, prepareSnapshotPreBarrier, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, setup, snapshotState, useSplittableTimers
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted, notifyCheckpointComplete
-
Methods inherited from interface org.apache.flink.streaming.api.operators.Input
processLatencyMarker, processRecordAttributes, processWatermarkStatus
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext
getCurrentKey, setCurrentKey
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext
-
Methods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator
setKeyContextElement
-
Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
-
Method Detail
-
initializeState
public void initializeState(StateInitializationContext context) throws Exception
Description copied from class: AbstractStreamOperator
Stream operators with state which can be restored need to override this hook method.
- Specified by:
initializeState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
- Overrides:
initializeState in class AbstractStreamOperator<OUT>
- Parameters:
context - context that allows registering different states.
- Throws:
Exception
-
open
public void open() throws Exception
Description copied from class: AbstractStreamOperator
This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
The default implementation does nothing.
- Specified by:
open in interface StreamOperator<OUT>
- Overrides:
open in class AbstractStreamOperator<OUT>
- Throws:
Exception - An exception in this method causes the operator to fail.
-
processElement
public void processElement(StreamRecord<T> element) throws Exception
Description copied from interface: Input
Processes one element that arrived on this input of the MultipleInputStreamOperator. This method is guaranteed to not be called concurrently with other methods of the operator.
- Specified by:
processElement in interface Input<T>
- Throws:
Exception
-
processWatermark
public void processWatermark(Watermark mark) throws Exception
Description copied from interface: Input
Processes a Watermark that arrived on the first input of this two-input operator. This method is guaranteed to not be called concurrently with other methods of the operator.
- Specified by:
processWatermark in interface Input<T>
- Overrides:
processWatermark in class AbstractStreamOperator<OUT>
- Throws:
Exception
- See Also:
Watermark
-
finish
public void finish() throws Exception
Description copied from interface: StreamOperator
This method is called at the end of data processing.
The method is expected to flush all remaining buffered data. Exceptions during this flushing of buffered data should be propagated, in order to cause the operation to be recognized as failed, because the last data items are not processed properly.
After this method is called, no more records can be produced for the downstream operators.
WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later on be committed, e.g. in a CheckpointListener.notifyCheckpointComplete(long).
NOTE: This method does not need to close any resources. You should release external resources in the StreamOperator.close() method.
- Specified by:
finish in interface StreamOperator<OUT>
- Overrides:
finish in class AbstractStreamOperator<OUT>
- Throws:
Exception - An exception in this method causes the operator to fail.
-
close
public void close() throws Exception
Description copied from interface: StreamOperator
This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.
This method is expected to make a thorough effort to release all resources that the operator has acquired.
NOTE: It cannot emit any records! If you need to emit records at the end of processing, do so in the StreamOperator.finish() method.
- Specified by:
close in interface StreamOperator<OUT>
- Overrides:
close in class AbstractStreamOperator<OUT>
- Throws:
Exception
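The division of labor between finish() and close() described above can be illustrated with a small Flink-free model. This is a sketch under stated assumptions: the class LifecycleSketch, its buffer, and the boolean standing in for an external resource are hypothetical and do not reflect how this operator is implemented.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, Flink-free operator model: finish() flushes, close() releases. */
class LifecycleSketch implements AutoCloseable {
    private final List<String> buffer = new ArrayList<>();     // records not yet emitted
    private final List<String> downstream = new ArrayList<>(); // stand-in for the operator output
    private boolean finished;
    private boolean resourceOpen = true;                       // stand-in for an external resource

    void process(String record) {
        if (finished) {
            throw new IllegalStateException("no records can be produced after finish()");
        }
        buffer.add(record);
    }

    /** End of data: flush everything that is buffered; resources stay open. */
    void finish() {
        downstream.addAll(buffer);
        buffer.clear();
        finished = true;
    }

    /** End of life (success, failure, or cancellation): release resources, emit nothing. */
    @Override
    public void close() {
        buffer.clear();       // unflushed records are dropped, never emitted here
        resourceOpen = false;
    }

    List<String> downstream() { return downstream; }
    boolean isResourceOpen() { return resourceOpen; }
}
```

On the successful path the caller invokes finish() and then close(). On failure or cancellation only close() runs, so anything still buffered is never emitted.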
-
snapshotState
public void snapshotState(StateSnapshotContext context) throws Exception
Description copied from class: AbstractStreamOperator
Stream operators with state, which want to participate in a snapshot need to override this hook method.
- Specified by:
snapshotState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
- Overrides:
snapshotState in class AbstractStreamOperator<OUT>
- Parameters:
context - context that provides information and means required for taking a snapshot
- Throws:
Exception
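The snapshotState and initializeState hooks form a pair: one records state at a checkpoint, the other restores it. This page does not say which state this operator stores, so the sketch below assumes, purely for illustration, that the state is the queue of splits not yet read. The class SnapshotSketch and its methods are hypothetical and use no Flink types.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical, Flink-free model of the snapshotState / initializeState pairing. */
class SnapshotSketch {
    // Assumed state: splits that were received but not yet processed.
    private final Queue<String> pendingSplits = new ArrayDeque<>();

    void addSplit(String split) {
        pendingSplits.add(split);
    }

    /** Returns the next split to read, or null when none is pending. */
    String pollSplit() {
        return pendingSplits.poll();
    }

    /** Models snapshotState: copy the work that has not been done yet. */
    List<String> snapshot() {
        return new ArrayList<>(pendingSplits);
    }

    /** Models initializeState: rebuild the queue from a snapshot (empty on a fresh start). */
    static SnapshotSketch restore(List<String> snapshot) {
        SnapshotSketch restored = new SnapshotSketch();
        restored.pendingSplits.addAll(snapshot);
        return restored;
    }
}
```

Work completed before the snapshot is not part of it, so a restored instance resumes with only the splits that were still pending.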
-
setOutputType
public void setOutputType(TypeInformation<OUT> outTypeInfo, ExecutionConfig executionConfig)
Description copied from interface: OutputTypeConfigurable
Is called by the org.apache.flink.streaming.api.graph.StreamGraph#addOperator(Integer, String, StreamOperator, TypeInformation, TypeInformation, String) method when the org.apache.flink.streaming.api.graph.StreamGraph is generated. The method is called with the output TypeInformation which is also used for the org.apache.flink.streaming.runtime.tasks.StreamTask output serializer.
- Specified by:
setOutputType in interface OutputTypeConfigurable<OUT>
- Parameters:
outTypeInfo - Output type information of the org.apache.flink.streaming.runtime.tasks.StreamTask
executionConfig - Execution configuration