Class ExternalPythonKeyedProcessOperator<OUT>
- java.lang.Object
  - org.apache.flink.streaming.api.operators.AbstractStreamOperator<OUT>
    - org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator<OUT>
      - org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator<OUT>
        - org.apache.flink.streaming.api.operators.python.process.AbstractExternalDataStreamPythonFunctionOperator<OUT>
          - org.apache.flink.streaming.api.operators.python.process.AbstractExternalOneInputPythonFunctionOperator<Row,OUT>
            - org.apache.flink.streaming.api.operators.python.process.ExternalPythonKeyedProcessOperator<OUT>
- All Implemented Interfaces:
Serializable, CheckpointListener, ResultTypeQueryable<OUT>, BoundedOneInput, Input<Row>, KeyContext, KeyContextHandler, OneInputStreamOperator<Row,OUT>, DataStreamPythonFunctionOperator<OUT>, StreamOperator<OUT>, StreamOperatorStateHandler.CheckpointedStreamOperator, Triggerable<Row,Object>, YieldingOperator<OUT>
@Internal
public class ExternalPythonKeyedProcessOperator<OUT>
extends AbstractExternalOneInputPythonFunctionOperator<Row,OUT>
implements Triggerable<Row,Object>
ExternalPythonKeyedProcessOperator is responsible for launching the Beam runner, which starts a Python harness to execute the user-defined Python function. It also handles the timer and state requests issued by stateful Python user-defined functions.
- See Also:
- Serialized Form
-
Field Summary
-
Fields inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalOneInputPythonFunctionOperator
baos, baosWrapper
-
Fields inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator
pythonFunctionRunner
-
Fields inherited from class org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator
bundleFinishedCallback, config, elementCount, lastFinishBundleTime, maxBundleSize, systemEnvEnabled
-
Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
combinedWatermark, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
-
Constructor Summary
Constructor and Description:
- ExternalPythonKeyedProcessOperator(Configuration config, DataStreamPythonFunctionInfo pythonFunctionInfo, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo)
- ExternalPythonKeyedProcessOperator(Configuration config, DataStreamPythonFunctionInfo pythonFunctionInfo, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo, TypeSerializer namespaceSerializer)
-
Method Summary
Modifier and Type, Method, and Description:
- <T> AbstractExternalDataStreamPythonFunctionOperator<T> copy(DataStreamPythonFunctionInfo pythonFunctionInfo, TypeInformation<T> outputTypeInfo)
  Make a copy of the DataStreamPythonFunctionOperator with the given pythonFunctionInfo and outputTypeInfo.
- PythonFunctionRunner createPythonFunctionRunner()
  Creates the PythonFunctionRunner which is responsible for Python user-defined function execution.
- Object getCurrentKey()
- void onEventTime(InternalTimer<Row,Object> timer)
  Invoked when an event-time timer fires.
- void onProcessingTime(InternalTimer<Row,Object> timer)
  Invoked when a processing-time timer fires.
- void open()
  This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
- void processElement(StreamRecord<Row> element)
  Processes one element that arrived on this input of the MultipleInputStreamOperator.
- void setCurrentKey(Object key)
  Because the Beam state gRPC service accesses the KeyedStateBackend in parallel with this operator, this method is overridden to prevent the current key of the KeyedStateBackend from changing while the Beam service is handling requests.
-
Methods inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalOneInputPythonFunctionOperator
createInputCoderInfoDescriptor, createOutputCoderInfoDescriptor, emitResult, endInput, getInputTypeInfo, processElement
-
Methods inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalDataStreamPythonFunctionOperator
addSideOutputTags, createSideOutputCoderDescriptors, getInternalParameters, getOutputTagById, getProducedType, getPythonEnv, getPythonFunctionInfo, getSideOutputTags, getSideOutputTypeSerializerById, setNumPartitions
-
Methods inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator
close, createPythonEnvironmentManager, emitResults, invokeFinishBundle
-
Methods inherited from class org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator
checkInvokeFinishBundleByCount, finish, getConfiguration, getFlinkMetricContainer, isBundleFinished, prepareSnapshotPreBarrier, processWatermark
-
Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
getContainingTask, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, initializeState, isAsyncStateProcessingEnabled, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, setup, snapshotState, snapshotState, useSplittableTimers
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted, notifyCheckpointComplete
-
Methods inherited from interface org.apache.flink.streaming.api.operators.Input
processLatencyMarker, processRecordAttributes, processWatermark, processWatermarkStatus
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext
-
Methods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator
setKeyContextElement
-
Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
close, finish, getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
-
Constructor Detail
-
ExternalPythonKeyedProcessOperator
public ExternalPythonKeyedProcessOperator(Configuration config, DataStreamPythonFunctionInfo pythonFunctionInfo, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo)
-
ExternalPythonKeyedProcessOperator
public ExternalPythonKeyedProcessOperator(Configuration config, DataStreamPythonFunctionInfo pythonFunctionInfo, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo, TypeSerializer namespaceSerializer)
-
Method Detail
-
open
public void open() throws Exception
Description copied from class: AbstractStreamOperator
This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization. The default implementation does nothing.
- Specified by: open in interface StreamOperator<OUT>
- Overrides: open in class AbstractExternalOneInputPythonFunctionOperator<Row,OUT>
- Throws: Exception - An exception in this method causes the operator to fail.
-
onEventTime
public void onEventTime(InternalTimer<Row,Object> timer) throws Exception
Description copied from interface: Triggerable
Invoked when an event-time timer fires.
- Specified by: onEventTime in interface Triggerable<Row,Object>
- Throws: Exception
-
onProcessingTime
public void onProcessingTime(InternalTimer<Row,Object> timer) throws Exception
Description copied from interface: Triggerable
Invoked when a processing-time timer fires.
- Specified by: onProcessingTime in interface Triggerable<Row,Object>
- Throws: Exception
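The two timer callbacks above are invoked when a registered timer becomes due. As a rough illustration only, event-time timers can be pictured as a priority queue that is drained as the watermark advances. This is not Flink's implementation: `Timer`, `Callback`, and `TimerQueue` are hypothetical names invented for this sketch, and it uses no Flink classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TimerDispatchSketch {

    /** Hypothetical timer: the timestamp it is due at and the key it was registered for. */
    static final class Timer {
        final long timestamp;
        final String key;

        Timer(long timestamp, String key) {
            this.timestamp = timestamp;
            this.key = key;
        }
    }

    /** Hypothetical callback, loosely modelled on the onEventTime method above. */
    interface Callback {
        void onEventTime(Timer timer);
    }

    /** Hypothetical queue that fires timers in timestamp order. */
    static final class TimerQueue {
        private final PriorityQueue<Timer> timers =
                new PriorityQueue<>(Comparator.comparingLong((Timer t) -> t.timestamp));

        void register(long timestamp, String key) {
            timers.add(new Timer(timestamp, key));
        }

        /** Fires every timer whose timestamp is at or before the given watermark. */
        void advanceWatermark(long watermark, Callback callback) {
            while (!timers.isEmpty() && timers.peek().timestamp <= watermark) {
                callback.onEventTime(timers.poll());
            }
        }
    }

    public static void main(String[] args) {
        TimerQueue queue = new TimerQueue();
        queue.register(10L, "a");
        queue.register(5L, "b");
        queue.register(20L, "c");

        List<String> fired = new ArrayList<>();
        queue.advanceWatermark(10L, timer -> fired.add(timer.key));
        System.out.println(fired); // prints [b, a]; the timer at 20 is still pending
    }
}
```

In the real operator the callback receives an InternalTimer<Row,Object>, which carries the key and namespace of the timer in addition to its timestamp.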
-
createPythonFunctionRunner
public PythonFunctionRunner createPythonFunctionRunner() throws Exception
Description copied from class: AbstractExternalPythonFunctionOperator
Creates the PythonFunctionRunner which is responsible for Python user-defined function execution.
- Specified by: createPythonFunctionRunner in class AbstractExternalPythonFunctionOperator<OUT>
- Throws: Exception
-
processElement
public void processElement(StreamRecord<Row> element) throws Exception
Description copied from interface: Input
Processes one element that arrived on this input of the MultipleInputStreamOperator. This method is guaranteed to not be called concurrently with other methods of the operator.
- Specified by: processElement in interface Input<OUT>
- Throws: Exception
-
setCurrentKey
public void setCurrentKey(Object key)
Because the Beam state gRPC service accesses the KeyedStateBackend in parallel with this operator, this method is overridden to prevent the current key of the KeyedStateBackend from changing while the Beam service is handling requests.
- Specified by: setCurrentKey in interface KeyContext
- Overrides: setCurrentKey in class AbstractPythonFunctionOperator<OUT>
-
getCurrentKey
public Object getCurrentKey()
- Specified by: getCurrentKey in interface KeyContext
- Overrides: getCurrentKey in class AbstractStreamOperator<OUT>
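One way an override pair like setCurrentKey and getCurrentKey can avoid disturbing a backend that another thread is using is to hold the operator's key in a field of its own instead of forwarding it. The sketch below illustrates that idea under stated assumptions: `SharedBackend` and `SketchOperator` are hypothetical stand-ins, and nothing here is taken from the Flink source or claims to match it.

```java
import java.util.concurrent.atomic.AtomicReference;

public class KeyIsolationSketch {

    /** Hypothetical stand-in for a keyed state backend that a state service thread also uses. */
    static final class SharedBackend {
        private final AtomicReference<Object> currentKey = new AtomicReference<>();

        void setCurrentKey(Object key) {
            currentKey.set(key);
        }

        Object getCurrentKey() {
            return currentKey.get();
        }
    }

    /** Hypothetical operator that tracks its own key and leaves the backend's key alone. */
    static final class SketchOperator {
        private final SharedBackend backend;
        private Object operatorKey;

        SketchOperator(SharedBackend backend) {
            this.backend = backend;
        }

        /** Records the key locally; deliberately does not call backend.setCurrentKey. */
        void setCurrentKey(Object key) {
            this.operatorKey = key;
        }

        /** Returns the locally recorded key, not the backend's key. */
        Object getCurrentKey() {
            return operatorKey;
        }

        SharedBackend getBackend() {
            return backend;
        }
    }

    public static void main(String[] args) {
        SharedBackend backend = new SharedBackend();
        backend.setCurrentKey("key-used-by-state-service");

        SketchOperator operator = new SketchOperator(backend);
        operator.setCurrentKey("key-used-by-operator");

        System.out.println(operator.getCurrentKey());             // prints key-used-by-operator
        System.out.println(operator.getBackend().getCurrentKey()); // prints key-used-by-state-service
    }
}
```

The trade-off in this design is that the operator and the backend can disagree about the current key, so any code that needs the operator's key has to read it through the operator's getter rather than from the backend.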
-
copy
public <T> AbstractExternalDataStreamPythonFunctionOperator<T> copy(DataStreamPythonFunctionInfo pythonFunctionInfo, TypeInformation<T> outputTypeInfo)
Description copied from interface: DataStreamPythonFunctionOperator
Make a copy of the DataStreamPythonFunctionOperator with the given pythonFunctionInfo and outputTypeInfo. This is used for chaining optimization which may need to update the underlying pythonFunctionInfo and outputTypeInfo with the other fields not changed.
- Specified by: copy in interface DataStreamPythonFunctionOperator<OUT>
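The copy contract described above (replace two fields, carry the rest over unchanged, and allow the output type parameter to change) can be sketched with a small generic class. `SketchOperator` and its String and Class fields are hypothetical simplifications chosen for this example; the real method takes a DataStreamPythonFunctionInfo and a TypeInformation<T>.

```java
public class CopySketch {

    /** Hypothetical operator with one field that copy() keeps and two that it replaces. */
    static final class SketchOperator<T> {
        final String config;        // carried over unchanged by copy()
        final String functionInfo;  // replaced by copy()
        final Class<T> outputType;  // replaced by copy(), may change the type parameter

        SketchOperator(String config, String functionInfo, Class<T> outputType) {
            this.config = config;
            this.functionInfo = functionInfo;
            this.outputType = outputType;
        }

        /** Returns a new operator with the given function info and output type. */
        <R> SketchOperator<R> copy(String newFunctionInfo, Class<R> newOutputType) {
            return new SketchOperator<>(config, newFunctionInfo, newOutputType);
        }
    }

    public static void main(String[] args) {
        SketchOperator<String> original =
                new SketchOperator<>("shared-config", "map_fn", String.class);
        SketchOperator<Integer> chained = original.copy("map_fn+len_fn", Integer.class);

        System.out.println(chained.config);                  // prints shared-config
        System.out.println(chained.functionInfo);            // prints map_fn+len_fn
        System.out.println(chained.outputType.getSimpleName()); // prints Integer
    }
}
```

Returning a new instance rather than mutating the original keeps the original operator usable if the chaining step is not applied.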
-