@Internal public class BatchArrowPythonGroupAggregateFunctionOperator extends AbstractArrowPythonAggregateFunctionOperator
AggregateFunction Operator for Group Aggregation.
arrowSerializer, currentBatchCount, groupingSet, pandasAggFunctions, reuseJoinedRow, rowDataWrapper
bais, baisWrapper, baos, baosWrapper, forwardedInputQueue, inputType, outputType, userDefinedFunctionInputOffsets, userDefinedFunctionInputType, userDefinedFunctionOutputType
elementCount, maxBundleSize, pythonFunctionRunner
chainingStrategy, latencyStats, LOG, metrics, output, processingTimeService
Constructor and Description |
---|
BatchArrowPythonGroupAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType outputType, int[] groupKey, int[] groupingSet, int[] udafInputOffsets) |
Modifier and Type | Method and Description |
---|---|
void | bufferInput(RowData input) Buffers the specified input, which is used together with the user-defined function execution result to construct the operator result. |
void | close() This method is called after all records have been added to the operators via the methods Input.processElement(StreamRecord), or TwoInputStreamOperator.processElement1(StreamRecord) and TwoInputStreamOperator.processElement2(StreamRecord). |
void | emitResult(Tuple2<byte[],Integer> resultTuple) Sends the execution result to the downstream operator. |
void | endInput() Notifies the operator that no more data will arrive on the input. |
protected void | invokeCurrentBatch() |
void | open() This method is called immediately before any elements are processed; it should contain the operator's initialization logic. |
void | processElementInternal(RowData value) |
dispose, getFunctionInput, getFunctionUrn, getInputOutputCoderUrn, getPythonEnv, getUserDefinedFunctionsProto, isBundleFinished, processElement
createPythonFunctionRunner
checkInvokeFinishBundleByCount, createPythonEnvironmentManager, emitResults, getConfig, getFlinkMetricContainer, getPythonConfig, invokeFinishBundle, prepareSnapshotPreBarrier, processWatermark, setPythonConfig
getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
setKeyContextElement
getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
notifyCheckpointAborted, notifyCheckpointComplete
getCurrentKey, setCurrentKey
processLatencyMarker, processWatermark
public BatchArrowPythonGroupAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType outputType, int[] groupKey, int[] groupingSet, int[] udafInputOffsets)
public void open() throws Exception
AbstractStreamOperator
The default implementation does nothing.
open in interface StreamOperator<RowData>
Exception - An exception in this method causes the operator to fail.
public void bufferInput(RowData input) throws Exception
AbstractStatelessFunctionOperator
bufferInput in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Exception
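The buffering contract of bufferInput can be illustrated with a minimal, Flink-free sketch. It assumes, as the inherited field name forwardedInputQueue suggests but this page does not confirm, that buffered inputs are held in a FIFO queue and paired in arrival order with function results. ForwardingSketch, its methods, and the string rows are hypothetical stand-ins and not part of the Flink API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for illustration only; not a Flink class.
class ForwardingSketch {
    // Inputs waiting to be paired with a function result (FIFO order is an assumption).
    private final Queue<String> forwardedInputs = new ArrayDeque<>();
    private final List<String> emitted = new ArrayList<>();

    // Remembers the input so it can later be joined with a result.
    void bufferInput(String input) {
        forwardedInputs.add(input);
    }

    // Joins each result with the oldest buffered input and records the pair.
    void emitResults(List<String> results) {
        for (String result : results) {
            // remove() throws NoSuchElementException if nothing was buffered.
            String input = forwardedInputs.remove();
            emitted.add(input + "|" + result);
        }
    }

    List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        ForwardingSketch op = new ForwardingSketch();
        op.bufferInput("key=a");
        op.bufferInput("key=b");
        op.emitResults(Arrays.asList("sum=3", "sum=7"));
        System.out.println(op.emitted()); // prints [key=a|sum=3, key=b|sum=7]
    }
}
```

The queue keeps the forwarded part of each row out of the function call itself, so only the function inputs need to be serialized; whether the real operator forwards whole rows or only key fields is not stated here.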
public void processElementInternal(RowData value)
processElementInternal in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
public void emitResult(Tuple2<byte[],Integer> resultTuple) throws Exception
AbstractPythonFunctionOperator
emitResult in class AbstractPythonFunctionOperator<RowData>
Exception
public void endInput() throws Exception
BoundedOneInput
endInput in interface BoundedOneInput
endInput in class AbstractOneInputPythonFunctionOperator<RowData,RowData>
Exception
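Why a group-aggregating operator needs an endInput hook can be shown with a small sketch. It assumes that rows arrive already grouped by key, so a key change closes the previous group, and that the last group can only be completed once no more input will arrive. GroupBatchSketch and its members are hypothetical, and an integer sum stands in for the Python aggregate function; none of this is taken from the operator's source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for illustration only; not a Flink class.
class GroupBatchSketch {
    private String currentKey;                                  // key of the group being collected
    private final List<Integer> currentBatch = new ArrayList<>();
    private final List<String> results = new ArrayList<>();

    // Assumes input is grouped by key: a different key means the previous group is complete.
    void processElement(String key, int value) {
        if (!currentBatch.isEmpty() && !Objects.equals(key, currentKey)) {
            invokeCurrentBatch();
        }
        currentKey = key;
        currentBatch.add(value);
    }

    // Aggregates the collected group; a sum stands in for the user-defined function.
    void invokeCurrentBatch() {
        if (currentBatch.isEmpty()) {
            return;
        }
        int sum = 0;
        for (int v : currentBatch) {
            sum += v;
        }
        results.add(currentKey + "=" + sum);
        currentBatch.clear();
    }

    // No further rows will arrive, so the pending group is flushed here.
    void endInput() {
        invokeCurrentBatch();
    }

    List<String> results() {
        return results;
    }
}
```

Without the call in endInput, the final group would never be aggregated, because no later key change arrives to trigger it.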
public void close() throws Exception
AbstractStreamOperator
This method is called after all records have been added to the operators via the methods Input.processElement(StreamRecord), or TwoInputStreamOperator.processElement1(StreamRecord) and TwoInputStreamOperator.processElement2(StreamRecord).
The method is expected to flush all remaining buffered data. Exceptions during this flushing of buffered data should be propagated, in order to cause the operation to be recognized as failed, because the last data items are not processed properly.
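The flush-and-propagate contract described for close can be sketched without any Flink dependency. FlushOnCloseSketch and its nested Sink interface are hypothetical names introduced for illustration; the sketch only shows the general pattern of letting a flush failure escape from close rather than swallowing it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not a Flink class.
class FlushOnCloseSketch {
    // Hypothetical downstream target that may fail while writing.
    interface Sink {
        void write(String record) throws Exception;
    }

    private final List<String> buffer = new ArrayList<>();
    private final Sink sink;

    FlushOnCloseSketch(Sink sink) {
        this.sink = sink;
    }

    // Holds the record until close() flushes it.
    void add(String record) {
        buffer.add(record);
    }

    // Flushes what is still buffered. A failure is not caught here, so it
    // reaches the caller, which can then treat the whole operation as failed.
    void close() throws Exception {
        for (String record : buffer) {
            sink.write(record);
        }
        buffer.clear();
    }
}
```

Catching and logging the exception inside close would let the job appear successful even though the last buffered records were never delivered, which is the situation the description warns against.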
close in interface StreamOperator<RowData>
close in class AbstractPythonFunctionOperator<RowData>
Exception - An exception in this method causes the operator to fail.
Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.