@Internal public class BatchArrowPythonGroupWindowAggregateFunctionOperator extends AbstractArrowPythonAggregateFunctionOperator
The Batch Arrow Python AggregateFunction Operator for Group Window Aggregation.
arrowSerializer, currentBatchCount, pandasAggFunctions, reuseJoinedRow, rowDataWrapper
bais, baisWrapper, baos, baosWrapper, forwardedInputQueue, inputType, udfInputType, udfOutputType
pythonFunctionRunner
bundleFinishedCallback, config, elementCount, lastFinishBundleTime, maxBundleSize, systemEnvEnabled
chainingStrategy, latencyStats, LOG, metrics, output, processingTimeService
Constructor and Description |
---|
BatchArrowPythonGroupWindowAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType udfInputType, RowType udfOutputType, int inputTimeFieldIndex, int maxLimitSize, long windowSize, long slideSize, int[] namedProperties, GeneratedProjection inputGeneratedProjection, GeneratedProjection groupKeyGeneratedProjection, GeneratedProjection groupSetGeneratedProjection) |
Modifier and Type | Method and Description |
---|---|
void | bufferInput(RowData input) Buffers the specified input; it will be used to construct the operator result together with the user-defined function execution result. |
void | close() This method is called at the very end of the operator's life, both after successful completion of the operation and after failure or cancellation. |
void | emitResult(Tuple3<String,byte[],Integer> resultTuple) Sends the execution result to the downstream operator. |
void | endInput() It is notified that no more data will arrive from the input. |
void | finish() This method is called at the end of data processing. |
protected void | invokeCurrentBatch() |
void | open() This method is called immediately before any elements are processed; it should contain the operator's initialization logic, e.g. state initialization. |
void | processElementInternal(RowData value) |
createInputCoderInfoDescriptor, createOutputCoderInfoDescriptor, createUserDefinedFunctionsProto, getFunctionInput, getFunctionUrn, getPythonEnv, isBundleFinished, processElement
createPythonFunctionRunner
createPythonEnvironmentManager, emitResults, invokeFinishBundle
checkInvokeFinishBundleByCount, getConfiguration, getFlinkMetricContainer, prepareSnapshotPreBarrier, processWatermark, setCurrentKey
getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setChainingStrategy, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
setKeyContextElement
getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
notifyCheckpointAborted, notifyCheckpointComplete
getCurrentKey, setCurrentKey
processLatencyMarker, processWatermark, processWatermarkStatus
hasKeyContext
public BatchArrowPythonGroupWindowAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType udfInputType, RowType udfOutputType, int inputTimeFieldIndex, int maxLimitSize, long windowSize, long slideSize, int[] namedProperties, GeneratedProjection inputGeneratedProjection, GeneratedProjection groupKeyGeneratedProjection, GeneratedProjection groupSetGeneratedProjection)
public void open() throws Exception
AbstractStreamOperator
The default implementation does nothing.
open
in interface StreamOperator<RowData>
Exception
- An exception in this method causes the operator to fail.
public void close() throws Exception
StreamOperator
This method is expected to make a thorough effort to release all resources that the operator has acquired.
NOTE: It cannot emit any records! If you need to emit records at the end of processing, do so in the StreamOperator.finish() method.
close
in interface StreamOperator<RowData>
close
in class AbstractArrowPythonAggregateFunctionOperator
Exception
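The division of labour described above (emit remaining records in finish(), release resources in close(), and never emit from close()) can be illustrated with a minimal sketch. It uses plain Java only; `LifecycleSketch` and `BufferingOperator` are hypothetical stand-ins invented for this illustration, not Flink classes, and the real operator's internals may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration of the open/finish/close contract; not Flink code.
class LifecycleSketch {

    /** Buffers values and emits them only when finish() is called. */
    static final class BufferingOperator {
        private final Consumer<String> output;
        private List<String> buffer; // acquired in open(), released in close()
        private boolean finished;

        BufferingOperator(Consumer<String> output) {
            this.output = output;
        }

        void open() {
            buffer = new ArrayList<>();
        }

        void processElement(String value) {
            if (finished) {
                throw new IllegalStateException("no records may be processed after finish()");
            }
            buffer.add(value);
        }

        void finish() {
            // Flush everything that is still buffered; exceptions propagate to the caller.
            for (String value : buffer) {
                output.accept(value);
            }
            buffer.clear();
            finished = true;
        }

        void close() {
            // Release resources only. Nothing is emitted here.
            buffer = null;
        }
    }

    /** Drives the operator through its life cycle and returns what reached downstream. */
    static List<String> run(List<String> inputs) {
        List<String> downstream = new ArrayList<>();
        BufferingOperator op = new BufferingOperator(downstream::add);
        try {
            op.open();
            for (String input : inputs) {
                op.processElement(input);
            }
            op.finish();
        } finally {
            op.close(); // runs after success and after failure alike
        }
        return downstream;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b", "c")));
    }
}
```

In this sketch the `finally` block mirrors the statement that close() is called at the very end of the operator's life regardless of outcome, while records only reach the downstream consumer through finish().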
public void bufferInput(RowData input) throws Exception
AbstractStatelessFunctionOperator
bufferInput
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Exception
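As a rough illustration of the buffering idea behind bufferInput and emitResult, the sketch below keeps forwarded inputs in a FIFO queue and joins each one with the next function result. It assumes exactly one result per buffered entry, arriving in order; the class `BufferAndJoinSketch`, the string-based rows, and the `|` join format are all hypothetical and chosen only for illustration. The actual operator works on RowData and may pair inputs and results differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration of "buffer the input, join it with the UDF result later".
class BufferAndJoinSketch {
    // Inputs waiting for their corresponding function result (FIFO order assumed).
    private final Deque<String> forwardedInputs = new ArrayDeque<>();
    // Stand-in for the downstream operator.
    private final List<String> emitted = new ArrayList<>();

    void bufferInput(String input) {
        forwardedInputs.addLast(input);
    }

    void emitResult(String udfResult) {
        String input = forwardedInputs.pollFirst();
        if (input == null) {
            throw new IllegalStateException("received a result without a buffered input");
        }
        // Join the buffered input with the function result and send it downstream.
        emitted.add(input + "|" + udfResult);
    }

    static List<String> demo() {
        BufferAndJoinSketch op = new BufferAndJoinSketch();
        op.bufferInput("k1");
        op.bufferInput("k2");
        op.emitResult("10");
        op.emitResult("20");
        return op.emitted;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```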
public void processElementInternal(RowData value) throws Exception
processElementInternal
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Exception
public void emitResult(Tuple3<String,byte[],Integer> resultTuple) throws Exception
AbstractExternalPythonFunctionOperator
emitResult
in class AbstractExternalPythonFunctionOperator<RowData>
Exception
public void endInput() throws Exception
BoundedOneInput
WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later be committed, e.g. in CheckpointListener.notifyCheckpointComplete(long).
NOTE: Because it is semantically very similar to the StreamOperator.finish() method, it might be dropped in favour of that method at some point.
endInput
in interface BoundedOneInput
endInput
in class AbstractOneInputPythonFunctionOperator<RowData,RowData>
Exception
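The warning above separates two steps: flushing buffered data when input ends, and committing it only once a checkpoint has completed. A minimal sketch of that separation follows. `FlushThenCommitSketch` and `Sink` are hypothetical names in plain Java, the two lists stand in for an external system, and only the method names endInput and notifyCheckpointComplete are taken from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "flush in endInput(), commit on checkpoint completion".
class FlushThenCommitSketch {

    static final class Sink {
        // Records still held by the operator.
        private final List<String> buffer = new ArrayList<>();
        // Flushed but not yet committed (e.g. written to a pending transaction).
        final List<String> pending = new ArrayList<>();
        // Visible to the outside world.
        final List<String> committed = new ArrayList<>();

        void process(String value) {
            buffer.add(value);
        }

        void endInput() {
            // Flush only: the data becomes pending, not committed.
            pending.addAll(buffer);
            buffer.clear();
        }

        void notifyCheckpointComplete(long checkpointId) {
            // The side effect happens here, after the checkpoint is known to be complete.
            committed.addAll(pending);
            pending.clear();
        }
    }

    public static void main(String[] args) {
        Sink sink = new Sink();
        sink.process("x");
        sink.endInput();
        System.out.println("committed after endInput: " + sink.committed);
        sink.notifyCheckpointComplete(1L);
        System.out.println("committed after checkpoint: " + sink.committed);
    }
}
```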
public void finish() throws Exception
StreamOperator
The method is expected to flush all remaining buffered data. Exceptions during this flushing of buffered data should be propagated, in order to cause the operation to be recognized as failed, because the last data items are not processed properly.
After this method is called, no more records can be produced for the downstream operators.
WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later be committed, e.g. in CheckpointListener.notifyCheckpointComplete(long).
NOTE: This method does not need to close any resources. You should release external resources in the StreamOperator.close() method.
finish
in interface StreamOperator<RowData>
finish
in class AbstractPythonFunctionOperator<RowData>
Exception
- An exception in this method causes the operator to fail.
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.