@Internal public abstract class AbstractPythonScalarFunctionOperator extends AbstractStatelessFunctionOperator<RowData,RowData,RowData>
ScalarFunction
s. It executes the
Python ScalarFunction
s in separate Python execution environment.
The inputs are assumed as the following format: {{{ +------------------+--------------+ | forwarded fields | extra fields | +------------------+--------------+ }}}.
The Python UDFs may take input columns directly from the input row or the execution result of Java UDFs: 1) The input columns from the input row can be referred from the 'forwarded fields'; 2) The Java UDFs will be computed and the execution results can be referred from the 'extra fields'.
The outputs will be as the following format: {{{ +------------------+-------------------------+ | forwarded fields | scalar function results | +------------------+-------------------------+ }}}.
Modifier and Type | Field and Description |
---|---|
protected JoinedRowData |
reuseJoinedRow
The JoinedRowData reused holding the execution result.
|
protected StreamRecordRowDataWrappingCollector |
rowDataWrapper
The collector used to collect records.
|
protected PythonFunctionInfo[] |
scalarFunctions
The Python
ScalarFunction s to be executed. |
bais, baisWrapper, baos, baosWrapper, forwardedInputQueue, inputType, udfInputType, udfOutputType
pythonFunctionRunner
bundleFinishedCallback, config, elementCount, lastFinishBundleTime, maxBundleSize, systemEnvEnabled
chainingStrategy, latencyStats, LOG, metrics, output, processingTimeService
Constructor and Description |
---|
AbstractPythonScalarFunctionOperator(Configuration config,
PythonFunctionInfo[] scalarFunctions,
RowType inputType,
RowType udfInputType,
RowType udfOutputType,
GeneratedProjection udfInputGeneratedProjection,
GeneratedProjection forwardedFieldGeneratedProjection) |
Modifier and Type | Method and Description |
---|---|
void |
bufferInput(RowData input)
Buffers the specified input, it will be used to construct the operator result together with
the user-defined function execution result.
|
RowData |
getFunctionInput(RowData element) |
String |
getFunctionUrn() |
PythonEnv |
getPythonEnv()
Returns the
PythonEnv used to create PythonEnvironmentManager.. |
FlinkFnApi.UserDefinedFunctions |
getUserDefinedFunctionsProto()
Gets the proto representation of the Python user-defined functions to be executed.
|
void |
open()
This method is called immediately before any elements are processed, it should contain the
operator's initialization logic, e.g.
|
createInputCoderInfoDescriptor, createOutputCoderInfoDescriptor, createPythonFunctionRunner, processElement, processElementInternal
endInput
close, createPythonEnvironmentManager, emitResult, emitResults, invokeFinishBundle
checkInvokeFinishBundleByCount, finish, getConfiguration, getFlinkMetricContainer, isBundleFinished, prepareSnapshotPreBarrier, processWatermark, setCurrentKey
getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, registerCounterOnOutput, reportOrForwardLatencyMarker, setChainingStrategy, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
setKeyContextElement
close, finish, getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
notifyCheckpointAborted, notifyCheckpointComplete
getCurrentKey, setCurrentKey
processLatencyMarker, processWatermark, processWatermarkStatus
protected final PythonFunctionInfo[] scalarFunctions
ScalarFunction
s to be executed.protected transient StreamRecordRowDataWrappingCollector rowDataWrapper
protected transient JoinedRowData reuseJoinedRow
public AbstractPythonScalarFunctionOperator(Configuration config, PythonFunctionInfo[] scalarFunctions, RowType inputType, RowType udfInputType, RowType udfOutputType, GeneratedProjection udfInputGeneratedProjection, GeneratedProjection forwardedFieldGeneratedProjection)
public void open() throws Exception
AbstractStreamOperator
The default implementation does nothing.
open
in interface StreamOperator<RowData>
open
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Exception
- An exception in this method causes the operator to fail.public PythonEnv getPythonEnv()
AbstractPythonFunctionOperator
PythonEnv
used to create PythonEnvironmentManager..getPythonEnv
in class AbstractPythonFunctionOperator<RowData>
public FlinkFnApi.UserDefinedFunctions getUserDefinedFunctionsProto()
getUserDefinedFunctionsProto
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
public String getFunctionUrn()
getFunctionUrn
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
public void bufferInput(RowData input)
AbstractStatelessFunctionOperator
bufferInput
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
public RowData getFunctionInput(RowData element)
getFunctionInput
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Copyright © 2014–2023 The Apache Software Foundation. All rights reserved.