@Internal public class PythonKeyedProcessOperator<OUT> extends AbstractOneInputPythonFunctionOperator<Row,OUT> implements ResultTypeQueryable<OUT>, Triggerable<Row,Object>
PythonKeyedProcessOperator is responsible for launching the Beam runner, which starts a
Python harness to execute the user-defined Python function. It also handles the timer and
state requests from the stateful Python user-defined function.
Inherited fields:
elementCount, maxBundleSize, pythonFunctionRunner
chainingStrategy, latencyStats, LOG, metrics, output, processingTimeService
Constructor and Description |
---|
PythonKeyedProcessOperator(Configuration config, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo, DataStreamPythonFunctionInfo pythonFunctionInfo) |
PythonKeyedProcessOperator(Configuration config, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo, DataStreamPythonFunctionInfo pythonFunctionInfo, TypeSerializer namespaceSerializer) |
Modifier and Type | Method and Description |
---|---|
PythonFunctionRunner | createPythonFunctionRunner(): Creates the PythonFunctionRunner which is responsible for Python user-defined function execution. |
void | emitResult(Tuple2<byte[],Integer> resultTuple): Sends the execution result to the downstream operator. |
Object | getCurrentKey() |
TypeInformation<OUT> | getProducedType(): Gets the data type (as a TypeInformation) produced by this function or input format. |
PythonEnv | getPythonEnv(): Returns the PythonEnv used to create PythonEnvironmentManager. |
void | onEventTime(InternalTimer<Row,Object> timer): Invoked when an event-time timer fires. |
void | onProcessingTime(InternalTimer<Row,Object> timer): Invoked when a processing-time timer fires. |
void | open(): This method is called immediately before any elements are processed; it should contain the operator's initialization logic. |
void | processElement(StreamRecord<Row> element): Processes one element that arrived on this input of the MultipleInputStreamOperator. |
void | setCurrentKey(Object key): As the Beam state gRPC service accesses the KeyedStateBackend in parallel with this operator, this method is overridden to prevent changing the current key of the KeyedStateBackend while the Beam service is handling requests. |
Methods inherited from class AbstractOneInputPythonFunctionOperator:
endInput
Methods inherited from class AbstractPythonFunctionOperator:
checkInvokeFinishBundleByCount, close, createPythonEnvironmentManager, dispose, emitResults, getConfig, getFlinkMetricContainer, getPythonConfig, invokeFinishBundle, isBundleFinished, prepareSnapshotPreBarrier, processWatermark, setPythonConfig
Methods inherited from class AbstractStreamOperator:
getChainingStrategy, getContainingTask, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, reportOrForwardLatencyMarker, setChainingStrategy, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState
Methods inherited from class Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface OneInputStreamOperator:
setKeyContextElement
Methods inherited from interface StreamOperator:
close, dispose, getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
Methods inherited from interface CheckpointListener:
notifyCheckpointAborted, notifyCheckpointComplete
Methods inherited from interface Input:
processLatencyMarker, processWatermark
public PythonKeyedProcessOperator(Configuration config, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo, DataStreamPythonFunctionInfo pythonFunctionInfo)
public PythonKeyedProcessOperator(Configuration config, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo, DataStreamPythonFunctionInfo pythonFunctionInfo, TypeSerializer namespaceSerializer)
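public void open() throws Exception
Description copied from class: AbstractStreamOperator
This method is called immediately before any elements are processed; it should contain the operator's initialization logic.
The default implementation does nothing.
Specified by: open in interface StreamOperator<OUT>
Overrides: open in class AbstractPythonFunctionOperator<OUT>
Throws: Exception - An exception in this method causes the operator to fail.
public TypeInformation<OUT> getProducedType()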
Description copied from interface: ResultTypeQueryable
Gets the data type (as a TypeInformation) produced by this function or input format.
Specified by: getProducedType in interface ResultTypeQueryable<OUT>
public void onEventTime(InternalTimer<Row,Object> timer) throws Exception
Description copied from interface: Triggerable
Invoked when an event-time timer fires.
Specified by: onEventTime in interface Triggerable<Row,Object>
Throws: Exception
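public void onProcessingTime(InternalTimer<Row,Object> timer) throws Exception
Description copied from interface: Triggerable
Invoked when a processing-time timer fires.
Specified by: onProcessingTime in interface Triggerable<Row,Object>
Throws: Exception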
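How a timer callback such as onEventTime gets driven can be sketched without Flink on the classpath. The following is a minimal illustration only: SketchTriggerable and SketchTimerService are hypothetical names, not Flink's Triggerable or InternalTimerService, and the firing rule (fire every registered timer whose timestamp is at or below the advancing watermark) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class TimerSketch {

    /** Hypothetical callback interface, shaped like the event-time half of a timer listener. */
    interface SketchTriggerable {
        void onEventTime(long timestamp);
    }

    /** Hypothetical timer service: fires timers once the watermark reaches them. */
    static class SketchTimerService {
        private final TreeSet<Long> timers = new TreeSet<>();
        private final SketchTriggerable target;

        SketchTimerService(SketchTriggerable target) {
            this.target = target;
        }

        void registerEventTimeTimer(long timestamp) {
            timers.add(timestamp);
        }

        /** Fires, in timestamp order, every timer at or below the given watermark. */
        void advanceWatermark(long watermark) {
            while (!timers.isEmpty() && timers.first() <= watermark) {
                target.onEventTime(timers.pollFirst());
            }
        }
    }

    /** Registers three timers, advances the watermark to 20, and returns the fired timestamps. */
    static List<Long> demo() {
        List<Long> fired = new ArrayList<>();
        SketchTimerService service = new SketchTimerService(ts -> fired.add(ts));
        service.registerEventTimeTimer(30L);
        service.registerEventTimeTimer(10L);
        service.registerEventTimeTimer(20L);
        service.advanceWatermark(20L);
        return fired;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [10, 20]
    }
}
```

The timer registered at 30 stays pending because the watermark has only reached 20.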
public PythonFunctionRunner createPythonFunctionRunner() throws Exception
Description copied from class: AbstractPythonFunctionOperator
Creates the PythonFunctionRunner which is responsible for Python user-defined function execution.
Specified by: createPythonFunctionRunner in class AbstractPythonFunctionOperator<OUT>
Throws: Exception
public PythonEnv getPythonEnv()
Description copied from class: AbstractPythonFunctionOperator
Returns the PythonEnv used to create PythonEnvironmentManager.
Specified by: getPythonEnv in class AbstractPythonFunctionOperator<OUT>
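public void emitResult(Tuple2<byte[],Integer> resultTuple) throws Exception
Description copied from class: AbstractPythonFunctionOperator
Sends the execution result to the downstream operator.
Specified by: emitResult in class AbstractPythonFunctionOperator<OUT>
Throws: Exception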
public void processElement(StreamRecord<Row> element) throws Exception
Description copied from interface: Input
Processes one element that arrived on this input of the MultipleInputStreamOperator.
This method is guaranteed to not be called concurrently with other methods of the operator.
Specified by: processElement in interface Input<Row>
Throws: Exception
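public void setCurrentKey(Object key)
As the Beam state gRPC service accesses the KeyedStateBackend in parallel with this operator, this method is overridden to prevent changing the current key of the KeyedStateBackend while the Beam service is handling requests.
Specified by: setCurrentKey in interface KeyContext
Overrides: setCurrentKey in class AbstractPythonFunctionOperator<OUT>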
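The reason given for overriding setCurrentKey (the Beam state gRPC service reads the KeyedStateBackend in parallel with the operator) can be illustrated with a small, Flink-free sketch. SketchBackend and SketchOperator are hypothetical stand-ins. The approach shown, where the operator records the key locally instead of forwarding it to the backend, is one assumed way to keep the backend's key stable during a state request, not necessarily what this operator does internally.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyGuardSketch {

    /** Hypothetical stand-in for a keyed state backend: writes are scoped to the current key. */
    static class SketchBackend {
        private final Map<Object, Integer> state = new HashMap<>();
        private Object currentKey;

        synchronized void setCurrentKey(Object key) {
            currentKey = key;
        }

        synchronized void put(int value) {
            state.put(currentKey, value);
        }

        synchronized Integer get(Object key) {
            return state.get(key);
        }
    }

    /** Hypothetical operator that keeps its own copy of the key. */
    static class SketchOperator {
        private Object operatorKey;

        // Assumption: the key is only recorded locally; the backend's key is left untouched.
        void setCurrentKey(Object key) {
            operatorKey = key;
        }

        Object getCurrentKey() {
            return operatorKey;
        }
    }

    /** Simulates a state request for key "a" while the operator moves on to key "b". */
    static String demo() {
        SketchBackend backend = new SketchBackend();
        SketchOperator operator = new SketchOperator();

        backend.setCurrentKey("a");   // the state service selects key "a" for its request
        operator.setCurrentKey("b");  // meanwhile the operator starts processing key "b"
        backend.put(1);               // the state write still lands under "a"

        return backend.get("a") + "," + backend.get("b") + "," + operator.getCurrentKey();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 1,null,b
    }
}
```

Had SketchOperator forwarded the key to the backend, the write would have landed under "b", which is the kind of interference the method description warns about.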
public Object getCurrentKey()
Specified by: getCurrentKey in interface KeyContext
Overrides: getCurrentKey in class AbstractStreamOperator<OUT>
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.