Class BatchArrowPythonGroupWindowAggregateFunctionOperator
- java.lang.Object
-
- org.apache.flink.streaming.api.operators.AbstractStreamOperator<OUT>
-
- org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator<OUT>
-
- org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator<OUT>
-
- org.apache.flink.table.runtime.operators.python.AbstractOneInputPythonFunctionOperator<IN,OUT>
-
- org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator<RowData,RowData,RowData>
-
- org.apache.flink.table.runtime.operators.python.aggregate.arrow.AbstractArrowPythonAggregateFunctionOperator
-
- org.apache.flink.table.runtime.operators.python.aggregate.arrow.batch.BatchArrowPythonGroupWindowAggregateFunctionOperator
-
- All Implemented Interfaces:
Serializable
,CheckpointListener
,BoundedOneInput
,Input<RowData>
,KeyContext
,KeyContextHandler
,OneInputStreamOperator<RowData,RowData>
,StreamOperator<RowData>
,StreamOperatorStateHandler.CheckpointedStreamOperator
,YieldingOperator<RowData>
@Internal public class BatchArrowPythonGroupWindowAggregateFunctionOperator extends AbstractArrowPythonAggregateFunctionOperator
The Batch Arrow PythonAggregateFunction
Operator for Group Window Aggregation.- See Also:
- Serialized Form
-
-
Field Summary
-
Fields inherited from class org.apache.flink.table.runtime.operators.python.aggregate.arrow.AbstractArrowPythonAggregateFunctionOperator
arrowSerializer, currentBatchCount, pandasAggFunctions, reuseJoinedRow, rowDataWrapper
-
Fields inherited from class org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator
bais, baisWrapper, baos, baosWrapper, forwardedInputQueue, inputType, udfInputType, udfOutputType
-
Fields inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator
pythonFunctionRunner
-
Fields inherited from class org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator
bundleFinishedCallback, config, elementCount, lastFinishBundleTime, maxBundleSize, systemEnvEnabled
-
Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
combinedWatermark, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
-
-
Constructor Summary
Constructors Constructor Description BatchArrowPythonGroupWindowAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType udfInputType, RowType udfOutputType, int inputTimeFieldIndex, int maxLimitSize, long windowSize, long slideSize, int[] namedProperties, GeneratedProjection inputGeneratedProjection, GeneratedProjection groupKeyGeneratedProjection, GeneratedProjection groupSetGeneratedProjection)
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description void
bufferInput(RowData input)
Buffers the specified input, it will be used to construct the operator result together with the user-defined function execution result.void
close()
This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.void
emitResult(Tuple3<String,byte[],Integer> resultTuple)
Sends the execution result to the downstream operator.void
endInput()
It is notified that no more data will arrive from the input.void
finish()
This method is called at the end of data processing.protected void
invokeCurrentBatch()
void
open()
This method is called immediately before any elements are processed, it should contain the operator's initialization logic, e.g. state initialization.void
processElementInternal(RowData value)
-
Methods inherited from class org.apache.flink.table.runtime.operators.python.aggregate.arrow.AbstractArrowPythonAggregateFunctionOperator
createInputCoderInfoDescriptor, createOutputCoderInfoDescriptor, createUserDefinedFunctionsProto, getFunctionInput, getFunctionUrn, getPythonEnv, isBundleFinished, processElement
-
Methods inherited from class org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator
createPythonFunctionRunner
-
Methods inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator
createPythonEnvironmentManager, emitResults, invokeFinishBundle
-
Methods inherited from class org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator
checkInvokeFinishBundleByCount, getConfiguration, getFlinkMetricContainer, prepareSnapshotPreBarrier, processWatermark, setCurrentKey
-
Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, initializeState, isAsyncStateProcessingEnabled, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, setup, snapshotState, snapshotState, useSplittableTimers
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted, notifyCheckpointComplete
-
Methods inherited from interface org.apache.flink.streaming.api.operators.Input
processLatencyMarker, processRecordAttributes, processWatermark, processWatermarkStatus
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext
getCurrentKey, setCurrentKey
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext
-
Methods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator
setKeyContextElement
-
Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
-
-
-
-
Constructor Detail
-
BatchArrowPythonGroupWindowAggregateFunctionOperator
public BatchArrowPythonGroupWindowAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType udfInputType, RowType udfOutputType, int inputTimeFieldIndex, int maxLimitSize, long windowSize, long slideSize, int[] namedProperties, GeneratedProjection inputGeneratedProjection, GeneratedProjection groupKeyGeneratedProjection, GeneratedProjection groupSetGeneratedProjection)
-
-
Method Detail
-
open
public void open() throws Exception
Description copied from class:AbstractStreamOperator
This method is called immediately before any elements are processed, it should contain the operator's initialization logic, e.g. state initialization.The default implementation does nothing.
- Specified by:
open
in interfaceStreamOperator<RowData>
- Throws:
Exception
- An exception in this method causes the operator to fail.
-
close
public void close() throws Exception
Description copied from interface:StreamOperator
This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.This method is expected to make a thorough effort to release all resources that the operator has acquired.
NOTE:It can not emit any records! If you need to emit records at the end of processing, do so in the
StreamOperator.finish()
method.- Specified by:
close
in interfaceStreamOperator<RowData>
- Overrides:
close
in classAbstractArrowPythonAggregateFunctionOperator
- Throws:
Exception
-
bufferInput
public void bufferInput(RowData input) throws Exception
Description copied from class:AbstractStatelessFunctionOperator
Buffers the specified input, it will be used to construct the operator result together with the user-defined function execution result.- Specified by:
bufferInput
in classAbstractStatelessFunctionOperator<RowData,RowData,RowData>
- Throws:
Exception
-
processElementInternal
public void processElementInternal(RowData value) throws Exception
- Specified by:
processElementInternal
in classAbstractStatelessFunctionOperator<RowData,RowData,RowData>
- Throws:
Exception
-
emitResult
public void emitResult(Tuple3<String,byte[],Integer> resultTuple) throws Exception
Description copied from class:AbstractExternalPythonFunctionOperator
Sends the execution result to the downstream operator.- Specified by:
emitResult
in classAbstractExternalPythonFunctionOperator<RowData>
- Throws:
Exception
-
endInput
public void endInput() throws Exception
Description copied from interface:BoundedOneInput
It is notified that no more data will arrive from the input.WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later on be committed e.g. in a
CheckpointListener.notifyCheckpointComplete(long)
.NOTE: Given it is semantically very similar to the
StreamOperator.finish()
method. It might be dropped in favour of the other method at some point in time.- Specified by:
endInput
in interfaceBoundedOneInput
- Overrides:
endInput
in classAbstractOneInputPythonFunctionOperator<RowData,RowData>
- Throws:
Exception
-
finish
public void finish() throws Exception
Description copied from interface:StreamOperator
This method is called at the end of data processing.The method is expected to flush all remaining buffered data. Exceptions during this flushing of buffered data should be propagated, in order to cause the operation to be recognized as failed, because the last data items are not processed properly.
After this method is called, no more records can be produced for the downstream operators.
WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later on be committed e.g. in a
CheckpointListener.notifyCheckpointComplete(long)
.NOTE:This method does not need to close any resources. You should release external resources in the
StreamOperator.close()
method.- Specified by:
finish
in interfaceStreamOperator<RowData>
- Overrides:
finish
in classAbstractPythonFunctionOperator<RowData>
- Throws:
Exception
- An exception in this method causes the operator to fail.
-
-