Class BatchArrowPythonGroupAggregateFunctionOperator
- java.lang.Object
- org.apache.flink.streaming.api.operators.AbstractStreamOperator<OUT>
- org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator<OUT>
- org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator<OUT>
- org.apache.flink.table.runtime.operators.python.AbstractOneInputPythonFunctionOperator<IN,OUT>
- org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator<RowData,RowData,RowData>
- org.apache.flink.table.runtime.operators.python.aggregate.arrow.AbstractArrowPythonAggregateFunctionOperator
- org.apache.flink.table.runtime.operators.python.aggregate.arrow.batch.BatchArrowPythonGroupAggregateFunctionOperator
- All Implemented Interfaces:
Serializable, CheckpointListener, BoundedOneInput, Input<RowData>, KeyContext, KeyContextHandler, OneInputStreamOperator<RowData,RowData>, StreamOperator<RowData>, StreamOperatorStateHandler.CheckpointedStreamOperator, YieldingOperator<RowData>
@Internal public class BatchArrowPythonGroupAggregateFunctionOperator extends AbstractArrowPythonAggregateFunctionOperator
The Batch Arrow Python AggregateFunction Operator for Group Aggregation.
- See Also:
- Serialized Form
Field Summary
-
Fields inherited from class org.apache.flink.table.runtime.operators.python.aggregate.arrow.AbstractArrowPythonAggregateFunctionOperator
arrowSerializer, currentBatchCount, pandasAggFunctions, reuseJoinedRow, rowDataWrapper
-
Fields inherited from class org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator
bais, baisWrapper, baos, baosWrapper, forwardedInputQueue, inputType, udfInputType, udfOutputType
-
Fields inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator
pythonFunctionRunner
-
Fields inherited from class org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator
bundleFinishedCallback, config, elementCount, lastFinishBundleTime, maxBundleSize, systemEnvEnabled
-
Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
-
-
Constructor Summary
Constructors
BatchArrowPythonGroupAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType udfInputType, RowType udfOutputType, GeneratedProjection inputGeneratedProjection, GeneratedProjection groupKeyGeneratedProjection, GeneratedProjection groupSetGeneratedProjection)
-
Method Summary
void bufferInput(RowData input)
Buffers the specified input, which is used together with the user-defined function's execution result to construct the operator result.
void emitResult(Tuple3<String,byte[],Integer> resultTuple)
Sends the execution result to the downstream operator.
void endInput()
Notifies the operator that no more data will arrive from the input.
void finish()
This method is called at the end of data processing.
protected void invokeCurrentBatch()
void open()
This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
void processElementInternal(RowData value)
-
Methods inherited from class org.apache.flink.table.runtime.operators.python.aggregate.arrow.AbstractArrowPythonAggregateFunctionOperator
close, createInputCoderInfoDescriptor, createOutputCoderInfoDescriptor, createUserDefinedFunctionsProto, getFunctionInput, getFunctionUrn, getPythonEnv, isBundleFinished, processElement
-
Methods inherited from class org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator
createPythonFunctionRunner
-
Methods inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator
createPythonEnvironmentManager, emitResults, invokeFinishBundle
-
Methods inherited from class org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator
checkInvokeFinishBundleByCount, getConfiguration, getFlinkMetricContainer, prepareSnapshotPreBarrier, processWatermark, setCurrentKey
-
Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, setup, snapshotState, snapshotState, useSplittableTimers
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted, notifyCheckpointComplete
-
Methods inherited from interface org.apache.flink.streaming.api.operators.Input
processLatencyMarker, processRecordAttributes, processWatermark, processWatermarkStatus
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext
getCurrentKey, setCurrentKey
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext
-
Methods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator
setKeyContextElement
-
Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
-
-
-
-
Constructor Detail
-
BatchArrowPythonGroupAggregateFunctionOperator
public BatchArrowPythonGroupAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType udfInputType, RowType udfOutputType, GeneratedProjection inputGeneratedProjection, GeneratedProjection groupKeyGeneratedProjection, GeneratedProjection groupSetGeneratedProjection)
-
-
Method Detail
-
bufferInput
public void bufferInput(RowData input) throws Exception
Description copied from class: AbstractStatelessFunctionOperator
Buffers the specified input, which is used together with the user-defined function's execution result to construct the operator result.
- Specified by:
bufferInput in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
- Throws:
Exception
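The buffering described above follows a common pattern: inputs wait in a FIFO queue while the user-defined function is evaluated elsewhere, and each result is later paired with the buffered input it belongs to. The sketch below illustrates only that pattern. It does not depend on Flink and is not the operator's actual implementation; the class `BufferAndJoinSketch`, the use of `String` in place of `RowData`, and the `|` join are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical, Flink-free illustration of "buffer the input, then join it with the UDF result". */
public class BufferAndJoinSketch {
    // Inputs waiting for their UDF result, in arrival order (assumed FIFO pairing).
    private final Queue<String> forwardedInputQueue = new ArrayDeque<>();
    // Stand-in for the downstream collector.
    private final List<String> emitted = new ArrayList<>();

    /** Remembers the input so it can be combined with the UDF result later. */
    public void bufferInput(String input) {
        forwardedInputQueue.add(input);
    }

    /** Pairs the oldest buffered input with the given result and emits the joined record. */
    public void emitResult(String udfResult) {
        String input = forwardedInputQueue.poll();
        if (input == null) {
            throw new IllegalStateException("Result arrived without a buffered input");
        }
        emitted.add(input + "|" + udfResult);
    }

    public List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        BufferAndJoinSketch op = new BufferAndJoinSketch();
        op.bufferInput("k1");
        op.bufferInput("k2");
        op.emitResult("sum=3");
        op.emitResult("sum=7");
        System.out.println(op.emitted()); // prints [k1|sum=3, k2|sum=7]
    }
}
```

The queue keeps the pairing order-dependent, so this sketch assumes results come back in the same order as the inputs were buffered.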
-
processElementInternal
public void processElementInternal(RowData value)
- Specified by:
processElementInternal in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
-
emitResult
public void emitResult(Tuple3<String,byte[],Integer> resultTuple) throws Exception
Description copied from class: AbstractExternalPythonFunctionOperator
Sends the execution result to the downstream operator.
- Specified by:
emitResult in class AbstractExternalPythonFunctionOperator<RowData>
- Throws:
Exception
-
open
public void open() throws Exception
Description copied from class: AbstractStreamOperator
This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization.
The default implementation does nothing.
- Specified by:
open in interface StreamOperator<RowData>
- Overrides:
open in class AbstractArrowPythonAggregateFunctionOperator
- Throws:
Exception
- An exception in this method causes the operator to fail.
-
endInput
public void endInput() throws Exception
Description copied from interface: BoundedOneInput
Notifies the operator that no more data will arrive from the input.
WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later be committed, e.g. in CheckpointListener.notifyCheckpointComplete(long).
NOTE: Because this method is semantically very similar to StreamOperator.finish(), it might be dropped in favour of that method at some point.
- Specified by:
endInput in interface BoundedOneInput
- Overrides:
endInput in class AbstractOneInputPythonFunctionOperator<RowData,RowData>
- Throws:
Exception
-
finish
public void finish() throws Exception
Description copied from interface: StreamOperator
This method is called at the end of data processing.
The method is expected to flush all remaining buffered data. Exceptions during this flushing of buffered data should be propagated so that the operation is recognized as failed, because the last data items were not processed properly.
After this method is called, no more records can be produced for the downstream operators.
WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later be committed, e.g. in CheckpointListener.notifyCheckpointComplete(long).
NOTE: This method does not need to close any resources. You should release external resources in the StreamOperator.close() method.
- Specified by:
finish in interface StreamOperator<RowData>
- Overrides:
finish in class AbstractPythonFunctionOperator<RowData>
- Throws:
Exception
- An exception in this method causes the operator to fail.
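The warning above separates two steps: flushing buffered data when processing ends, and committing side effects only once a checkpoint has completed. The following is a minimal sketch of that separation. It is deliberately independent of Flink and is not how this operator is implemented; the class `FlushThenCommitSketch`, its three lists, and the use of `String` records are hypothetical stand-ins chosen for illustration, and only the method names mirror the ones mentioned in the text.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, Flink-free illustration of "flush in finish(), commit on checkpoint completion". */
public class FlushThenCommitSketch {
    private final List<String> buffered = new ArrayList<>();   // received but not yet flushed
    private final List<String> pending = new ArrayList<>();    // flushed, waiting for a completed checkpoint
    private final List<String> committed = new ArrayList<>();  // externally visible side effects

    public void processElement(String value) {
        buffered.add(value);
    }

    /** End of data: flush everything that is still buffered, but commit nothing. */
    public void finish() {
        pending.addAll(buffered);
        buffered.clear();
    }

    /** Only a completed checkpoint makes the flushed data safe to commit. */
    public void notifyCheckpointComplete(long checkpointId) {
        committed.addAll(pending);
        pending.clear();
    }

    public List<String> pending() {
        return pending;
    }

    public List<String> committed() {
        return committed;
    }

    public static void main(String[] args) {
        FlushThenCommitSketch op = new FlushThenCommitSketch();
        op.processElement("a");
        op.processElement("b");
        op.finish();
        System.out.println(op.committed()); // prints []
        op.notifyCheckpointComplete(1L);
        System.out.println(op.committed()); // prints [a, b]
    }
}
```

Keeping the commit out of `finish()` means a failure after the flush but before the checkpoint leaves nothing externally visible that would need to be undone.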
-
-