@Internal public abstract class AbstractArrowPythonAggregateFunctionOperator extends AbstractStatelessFunctionOperator<RowData,RowData,RowData>
AggregateFunction
.Modifier and Type | Field and Description |
---|---|
protected ArrowSerializer<RowData> |
arrowSerializer |
protected int |
currentBatchCount
The current number of elements to be included in an arrow batch.
|
protected int[] |
groupingSet |
protected PythonFunctionInfo[] |
pandasAggFunctions
The Pandas
AggregateFunction s to be executed. |
protected JoinedRowData |
reuseJoinedRow
The JoinedRowData reused holding the execution result.
|
protected StreamRecordRowDataWrappingCollector |
rowDataWrapper
The collector used to collect records.
|
bais, baisWrapper, baos, baosWrapper, forwardedInputQueue, inputType, outputType, userDefinedFunctionInputOffsets, userDefinedFunctionInputType, userDefinedFunctionOutputType
elementCount, maxBundleSize, pythonFunctionRunner
chainingStrategy, latencyStats, LOG, metrics, output, processingTimeService
Constructor and Description |
---|
AbstractArrowPythonAggregateFunctionOperator(Configuration config,
PythonFunctionInfo[] pandasAggFunctions,
RowType inputType,
RowType outputType,
int[] groupingSet,
int[] udafInputOffsets) |
Modifier and Type | Method and Description |
---|---|
void |
dispose()
This method is called at the very end of the operator's life, both in the case of a
successful completion of the operation, and in the case of a failure and canceling.
|
RowData |
getFunctionInput(RowData element) |
String |
getFunctionUrn() |
String |
getInputOutputCoderUrn() |
PythonEnv |
getPythonEnv()
Returns the
PythonEnv used to create PythonEnvironmentManager.. |
FlinkFnApi.UserDefinedFunctions |
getUserDefinedFunctionsProto()
Gets the proto representation of the Python user-defined functions to be executed.
|
boolean |
isBundleFinished()
Returns whether the bundle is finished.
|
void |
open()
This method is called immediately before any elements are processed, it should contain the
operator's initialization logic, e.g.
|
void |
processElement(StreamRecord<RowData> element)
Processes one element that arrived on this input of the
MultipleInputStreamOperator . |
bufferInput, createPythonFunctionRunner, processElementInternal
endInput
checkInvokeFinishBundleByCount, close, createPythonEnvironmentManager, emitResult, emitResults, getConfig, getFlinkMetricContainer, getPythonConfig, invokeFinishBundle, prepareSnapshotPreBarrier, processWatermark, setCurrentKey, setPythonConfig
getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, reportOrForwardLatencyMarker, setChainingStrategy, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
setKeyContextElement
close, getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
notifyCheckpointAborted, notifyCheckpointComplete
getCurrentKey, setCurrentKey
processLatencyMarker, processWatermark
protected final PythonFunctionInfo[] pandasAggFunctions
AggregateFunction
s to be executed.protected final int[] groupingSet
protected transient ArrowSerializer<RowData> arrowSerializer
protected transient StreamRecordRowDataWrappingCollector rowDataWrapper
protected transient JoinedRowData reuseJoinedRow
protected transient int currentBatchCount
public AbstractArrowPythonAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType outputType, int[] groupingSet, int[] udafInputOffsets)
public void open() throws Exception
AbstractStreamOperator
The default implementation does nothing.
open
in interface StreamOperator<RowData>
open
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Exception
- An exception in this method causes the operator to fail.public void dispose() throws Exception
AbstractStreamOperator
This method is expected to make a thorough effort to release all resources that the operator has acquired.
dispose
in interface StreamOperator<RowData>
dispose
in interface Disposable
dispose
in class AbstractPythonFunctionOperator<RowData>
Exception
- if something goes wrong during disposal.public void processElement(StreamRecord<RowData> element) throws Exception
Input
MultipleInputStreamOperator
.
This method is guaranteed to not be called concurrently with other methods of the operator.processElement
in interface Input<RowData>
processElement
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Exception
public boolean isBundleFinished()
AbstractPythonFunctionOperator
isBundleFinished
in class AbstractPythonFunctionOperator<RowData>
public PythonEnv getPythonEnv()
AbstractPythonFunctionOperator
PythonEnv
used to create PythonEnvironmentManager..getPythonEnv
in class AbstractPythonFunctionOperator<RowData>
public String getFunctionUrn()
getFunctionUrn
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
public String getInputOutputCoderUrn()
getInputOutputCoderUrn
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
public RowData getFunctionInput(RowData element)
getFunctionInput
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
public FlinkFnApi.UserDefinedFunctions getUserDefinedFunctionsProto()
AbstractStatelessFunctionOperator
getUserDefinedFunctionsProto
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.