@Internal public class BatchArrowPythonGroupWindowAggregateFunctionOperator extends AbstractArrowPythonAggregateFunctionOperator
AggregateFunction
Operator for Group Window Aggregation.arrowSerializer, currentBatchCount, groupingSet, pandasAggFunctions, reuseJoinedRow, rowDataWrapper
bais, baisWrapper, baos, baosWrapper, forwardedInputQueue, inputType, outputType, userDefinedFunctionInputOffsets, userDefinedFunctionInputType, userDefinedFunctionOutputType
elementCount, maxBundleSize, pythonFunctionRunner
chainingStrategy, latencyStats, LOG, metrics, output, processingTimeService
Constructor and Description |
---|
BatchArrowPythonGroupWindowAggregateFunctionOperator(Configuration config,
PythonFunctionInfo[] pandasAggFunctions,
RowType inputType,
RowType outputType,
int inputTimeFieldIndex,
int maxLimitSize,
long windowSize,
long slideSize,
int[] namedProperties,
int[] groupKey,
int[] groupingSet,
int[] udafInputOffsets) |
Modifier and Type | Method and Description |
---|---|
void |
bufferInput(RowData input)
Buffers the specified input, it will be used to construct the operator result together with
the user-defined function execution result.
|
void |
close()
This method is called after all records have been added to the operators via the methods
Input.processElement(StreamRecord) , or TwoInputStreamOperator.processElement1(StreamRecord) and TwoInputStreamOperator.processElement2(StreamRecord) . |
void |
emitResult(Tuple2<byte[],Integer> resultTuple)
Sends the execution result to the downstream operator.
|
void |
endInput()
It is notified that no more data will arrive from the input.
|
protected void |
invokeCurrentBatch() |
void |
open()
This method is called immediately before any elements are processed, it should contain the
operator's initialization logic, e.g.
|
void |
processElementInternal(RowData value) |
dispose, getFunctionInput, getFunctionUrn, getInputOutputCoderUrn, getPythonEnv, getUserDefinedFunctionsProto, isBundleFinished, processElement
createPythonFunctionRunner
checkInvokeFinishBundleByCount, createPythonEnvironmentManager, emitResults, getConfig, getFlinkMetricContainer, getPythonConfig, invokeFinishBundle, prepareSnapshotPreBarrier, processWatermark, setCurrentKey, setPythonConfig
getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, reportOrForwardLatencyMarker, setChainingStrategy, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
setKeyContextElement
getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
notifyCheckpointAborted, notifyCheckpointComplete
getCurrentKey, setCurrentKey
processLatencyMarker, processWatermark
public BatchArrowPythonGroupWindowAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType outputType, int inputTimeFieldIndex, int maxLimitSize, long windowSize, long slideSize, int[] namedProperties, int[] groupKey, int[] groupingSet, int[] udafInputOffsets)
public void open() throws Exception
AbstractStreamOperator
The default implementation does nothing.
open
in interface StreamOperator<RowData>
Exception
- An exception in this method causes the operator to fail.public void close() throws Exception
AbstractStreamOperator
Input.processElement(StreamRecord)
, or TwoInputStreamOperator.processElement1(StreamRecord)
and TwoInputStreamOperator.processElement2(StreamRecord)
.
The method is expected to flush all remaining buffered data. Exceptions during this flushing of buffered should be propagated, in order to cause the operation to be recognized asa failed, because the last data items are not processed properly.
close
in interface StreamOperator<RowData>
Exception
- An exception in this method causes the operator to fail.public void bufferInput(RowData input) throws Exception
AbstractStatelessFunctionOperator
bufferInput
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Exception
public void processElementInternal(RowData value) throws Exception
processElementInternal
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Exception
public void emitResult(Tuple2<byte[],Integer> resultTuple) throws Exception
AbstractPythonFunctionOperator
emitResult
in class AbstractPythonFunctionOperator<RowData>
Exception
public void endInput() throws Exception
BoundedOneInput
WARNING: It is not safe to use this method to commit any transactions or other side
effects! You can use this method to flush any buffered data that can later on be committed
e.g. in a CheckpointListener.notifyCheckpointComplete(long)
.
endInput
in interface BoundedOneInput
endInput
in class AbstractOneInputPythonFunctionOperator<RowData,RowData>
Exception
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.