Class AbstractArrowPythonAggregateFunctionOperator
- java.lang.Object
-
- org.apache.flink.streaming.api.operators.AbstractStreamOperator<OUT>
-
- org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator<OUT>
-
- org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator<OUT>
-
- org.apache.flink.table.runtime.operators.python.AbstractOneInputPythonFunctionOperator<IN,OUT>
-
- org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator<RowData,RowData,RowData>
-
- org.apache.flink.table.runtime.operators.python.aggregate.arrow.AbstractArrowPythonAggregateFunctionOperator
-
- All Implemented Interfaces:
Serializable, CheckpointListener, BoundedOneInput, Input<RowData>, KeyContext, KeyContextHandler, OneInputStreamOperator<RowData,RowData>, StreamOperator<RowData>, StreamOperatorStateHandler.CheckpointedStreamOperator, YieldingOperator<RowData>
- Direct Known Subclasses:
AbstractStreamArrowPythonOverWindowAggregateFunctionOperator, BatchArrowPythonGroupAggregateFunctionOperator, BatchArrowPythonGroupWindowAggregateFunctionOperator, BatchArrowPythonOverWindowAggregateFunctionOperator, StreamArrowPythonGroupWindowAggregateFunctionOperator
@Internal public abstract class AbstractArrowPythonAggregateFunctionOperator extends AbstractStatelessFunctionOperator<RowData,RowData,RowData>
The abstract class of Arrow aggregate operators for PandasAggregateFunction.
- See Also:
- Serialized Form
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- protected ArrowSerializer arrowSerializer
- protected int currentBatchCount: The current number of elements to be included in an arrow batch.
- protected PythonFunctionInfo[] pandasAggFunctions: The PandasAggregateFunctions to be executed.
- protected JoinedRowData reuseJoinedRow: The reused JoinedRowData that holds the execution result.
- protected StreamRecordRowDataWrappingCollector rowDataWrapper: The collector used to collect records.
-
Fields inherited from class org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator
bais, baisWrapper, baos, baosWrapper, forwardedInputQueue, inputType, udfInputType, udfOutputType
-
Fields inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator
pythonFunctionRunner
-
Fields inherited from class org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator
bundleFinishedCallback, config, elementCount, lastFinishBundleTime, maxBundleSize, systemEnvEnabled
-
Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
combinedWatermark, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
-
-
Constructor Summary
Constructors (Constructor, Description):
- AbstractArrowPythonAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType udfInputType, RowType udfOutputType, GeneratedProjection udafInputGeneratedProjection)
-
Method Summary
All Methods, Instance Methods, Concrete Methods (Modifier and Type, Method, Description):
- void close(): This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.
- FlinkFnApi.CoderInfoDescriptor createInputCoderInfoDescriptor(RowType runnerInputType)
- FlinkFnApi.CoderInfoDescriptor createOutputCoderInfoDescriptor(RowType runnerOutType)
- FlinkFnApi.UserDefinedFunctions createUserDefinedFunctionsProto(): Gets the proto representation of the Python user-defined functions to be executed.
- RowData getFunctionInput(RowData element)
- String getFunctionUrn()
- PythonEnv getPythonEnv(): Returns the PythonEnv used to create PythonEnvironmentManager.
- boolean isBundleFinished(): Returns whether the bundle is finished.
- void open(): This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
- void processElement(StreamRecord<RowData> element): Processes one element that arrived on this input of the MultipleInputStreamOperator.
-
Methods inherited from class org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator
bufferInput, createPythonFunctionRunner, processElementInternal
-
Methods inherited from class org.apache.flink.table.runtime.operators.python.AbstractOneInputPythonFunctionOperator
endInput
-
Methods inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator
createPythonEnvironmentManager, emitResult, emitResults, invokeFinishBundle
-
Methods inherited from class org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator
checkInvokeFinishBundleByCount, finish, getConfiguration, getFlinkMetricContainer, prepareSnapshotPreBarrier, processWatermark, setCurrentKey
-
Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, initializeState, isAsyncStateProcessingEnabled, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, setup, snapshotState, snapshotState, useSplittableTimers
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted, notifyCheckpointComplete
-
Methods inherited from interface org.apache.flink.streaming.api.operators.Input
processLatencyMarker, processRecordAttributes, processWatermark, processWatermarkStatus
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext
getCurrentKey, setCurrentKey
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext
-
Methods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator
setKeyContextElement
-
Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
finish, getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
-
-
-
-
Field Detail
-
pandasAggFunctions
protected final PythonFunctionInfo[] pandasAggFunctions
The PandasAggregateFunction
s to be executed.
-
arrowSerializer
protected transient ArrowSerializer arrowSerializer
-
rowDataWrapper
protected transient StreamRecordRowDataWrappingCollector rowDataWrapper
The collector used to collect records.
-
reuseJoinedRow
protected transient JoinedRowData reuseJoinedRow
The JoinedRowData reused holding the execution result.
-
currentBatchCount
protected transient int currentBatchCount
The current number of elements to be included in an arrow batch.
-
-
Constructor Detail
-
AbstractArrowPythonAggregateFunctionOperator
public AbstractArrowPythonAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType udfInputType, RowType udfOutputType, GeneratedProjection udafInputGeneratedProjection)
-
-
Method Detail
-
open
public void open() throws Exception
Description copied from class: AbstractStreamOperator
This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
The default implementation does nothing.
- Specified by:
open in interface StreamOperator<RowData>
- Overrides:
open in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
- Throws:
Exception - An exception in this method causes the operator to fail.
-
close
public void close() throws Exception
Description copied from interface: StreamOperator
This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.
This method is expected to make a thorough effort to release all resources that the operator has acquired.
NOTE: It cannot emit any records! If you need to emit records at the end of processing, do so in the StreamOperator.finish() method.
- Specified by:
close in interface StreamOperator<RowData>
- Overrides:
close in class AbstractExternalPythonFunctionOperator<RowData>
- Throws:
Exception
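The rule in the note above (emit remaining records in finish(), release resources in close()) can be illustrated with a small stand-alone sketch. It does not use any Flink classes: LifecycleSketch, its buffer, and runToCompletion are hypothetical names invented for this illustration, and the call order shown (open, processElement, finish, close) is a simplified model of the lifecycle described on this page, not the runtime's actual scheduling code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an operator lifecycle; this is NOT a Flink class. */
class LifecycleSketch {
    private final List<String> emitted = new ArrayList<>();
    private final List<String> buffered = new ArrayList<>();
    private boolean resourcesOpen;

    /** Initialization logic runs before any element is processed. */
    void open() {
        resourcesOpen = true;
    }

    /** Buffers the element instead of emitting it immediately. */
    void processElement(String element) {
        buffered.add(element);
    }

    /** Last point at which records may be emitted: flush the buffer here. */
    void finish() {
        emitted.addAll(buffered);
        buffered.clear();
    }

    /** Releases resources only; it must not emit anything. */
    void close() {
        buffered.clear();
        resourcesOpen = false;
    }

    List<String> emitted() {
        return emitted;
    }

    boolean resourcesOpen() {
        return resourcesOpen;
    }

    /** Drives the assumed call order; close() runs even if processing fails. */
    static List<String> runToCompletion(List<String> input) {
        LifecycleSketch op = new LifecycleSketch();
        op.open();
        try {
            for (String element : input) {
                op.processElement(element);
            }
            op.finish(); // reached only on successful completion
        } finally {
            op.close();
        }
        return op.emitted();
    }
}
```

In this model, anything still buffered when close() runs without a preceding finish() is discarded, which is why end-of-input output belongs in finish().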
-
processElement
public void processElement(StreamRecord<RowData> element) throws Exception
Description copied from interface: Input
Processes one element that arrived on this input of the MultipleInputStreamOperator. This method is guaranteed to not be called concurrently with other methods of the operator.
- Specified by:
processElement in interface Input<RowData>
- Overrides:
processElement in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
- Throws:
Exception
-
isBundleFinished
public boolean isBundleFinished()
Description copied from class: AbstractPythonFunctionOperator
Returns whether the bundle is finished.
- Overrides:
isBundleFinished in class AbstractPythonFunctionOperator<RowData>
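As a rough illustration of how a "bundle finished" check might relate to the elementCount, maxBundleSize, and currentBatchCount fields listed on this page, here is a minimal stand-alone sketch. The condition used (no buffered elements and no pending Arrow batch rows) is an assumption made for illustration and is not taken from this page; BundleTrackerSketch, onElement, finishBundle, and flushes are hypothetical names, not Flink API.

```java
/** Hypothetical counter model; this is NOT the Flink implementation. */
class BundleTrackerSketch {
    private final int maxBundleSize;
    private int elementCount;      // elements seen in the current bundle
    private int currentBatchCount; // elements pending in the current Arrow batch
    private int flushes;           // number of completed bundles

    BundleTrackerSketch(int maxBundleSize) {
        if (maxBundleSize <= 0) {
            throw new IllegalArgumentException("maxBundleSize must be positive");
        }
        this.maxBundleSize = maxBundleSize;
    }

    /** Counts one element and finishes the bundle once the size limit is hit. */
    void onElement() {
        elementCount++;
        currentBatchCount++;
        if (elementCount >= maxBundleSize) {
            finishBundle();
        }
    }

    /** Resets both counters; a no-op when nothing is pending. */
    void finishBundle() {
        if (elementCount > 0 || currentBatchCount > 0) {
            flushes++;
            elementCount = 0;
            currentBatchCount = 0;
        }
    }

    /** Assumed condition: nothing buffered and no partially filled batch. */
    boolean isBundleFinished() {
        return elementCount == 0 && currentBatchCount == 0;
    }

    int flushes() {
        return flushes;
    }
}
```

Under this assumption, a check of this kind lets a caller confirm that no input is still in flight before proceeding, for example before a checkpoint.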
-
getPythonEnv
public PythonEnv getPythonEnv()
Description copied from class: AbstractExternalPythonFunctionOperator
Returns the PythonEnv used to create PythonEnvironmentManager.
- Specified by:
getPythonEnv in class AbstractExternalPythonFunctionOperator<RowData>
-
getFunctionUrn
public String getFunctionUrn()
- Specified by:
getFunctionUrn in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
-
createInputCoderInfoDescriptor
public FlinkFnApi.CoderInfoDescriptor createInputCoderInfoDescriptor(RowType runnerInputType)
- Specified by:
createInputCoderInfoDescriptor in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
-
createOutputCoderInfoDescriptor
public FlinkFnApi.CoderInfoDescriptor createOutputCoderInfoDescriptor(RowType runnerOutType)
- Specified by:
createOutputCoderInfoDescriptor in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
-
getFunctionInput
public RowData getFunctionInput(RowData element)
- Specified by:
getFunctionInput in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
-
createUserDefinedFunctionsProto
public FlinkFnApi.UserDefinedFunctions createUserDefinedFunctionsProto()
Description copied from class: AbstractStatelessFunctionOperator
Gets the proto representation of the Python user-defined functions to be executed.
- Specified by:
createUserDefinedFunctionsProto in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
-
-