@Internal public abstract class AbstractArrowPythonAggregateFunctionOperator extends AbstractStatelessFunctionOperator<RowData,RowData,RowData>
AggregateFunction
.Modifier and Type | Field and Description |
---|---|
protected ArrowSerializer |
arrowSerializer |
protected int |
currentBatchCount
The current number of elements to be included in an arrow batch.
|
protected PythonFunctionInfo[] |
pandasAggFunctions
The Pandas
AggregateFunction s to be executed. |
protected JoinedRowData |
reuseJoinedRow
The JoinedRowData reused holding the execution result.
|
protected StreamRecordRowDataWrappingCollector |
rowDataWrapper
The collector used to collect records.
|
bais, baisWrapper, baos, baosWrapper, forwardedInputQueue, inputType, udfInputType, udfOutputType
pythonFunctionRunner
bundleFinishedCallback, config, elementCount, lastFinishBundleTime, maxBundleSize, systemEnvEnabled
chainingStrategy, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
Constructor and Description |
---|
AbstractArrowPythonAggregateFunctionOperator(Configuration config,
PythonFunctionInfo[] pandasAggFunctions,
RowType inputType,
RowType udfInputType,
RowType udfOutputType,
GeneratedProjection udafInputGeneratedProjection) |
Modifier and Type | Method and Description |
---|---|
void |
close()
This method is called at the very end of the operator's life, both in the case of a
successful completion of the operation, and in the case of a failure and canceling.
|
FlinkFnApi.CoderInfoDescriptor |
createInputCoderInfoDescriptor(RowType runnerInputType) |
FlinkFnApi.CoderInfoDescriptor |
createOutputCoderInfoDescriptor(RowType runnerOutType) |
FlinkFnApi.UserDefinedFunctions |
createUserDefinedFunctionsProto()
Gets the proto representation of the Python user-defined functions to be executed.
|
RowData |
getFunctionInput(RowData element) |
String |
getFunctionUrn() |
PythonEnv |
getPythonEnv()
Returns the
PythonEnv used to create PythonEnvironmentManager.. |
boolean |
isBundleFinished()
Returns whether the bundle is finished.
|
void |
open()
This method is called immediately before any elements are processed, it should contain the
operator's initialization logic, e.g. state initialization.
|
void |
processElement(StreamRecord<RowData> element)
Processes one element that arrived on this input of the
MultipleInputStreamOperator . |
bufferInput, createPythonFunctionRunner, processElementInternal
endInput
createPythonEnvironmentManager, emitResult, emitResults, invokeFinishBundle
checkInvokeFinishBundleByCount, finish, getConfiguration, getFlinkMetricContainer, prepareSnapshotPreBarrier, processWatermark, setCurrentKey
getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setChainingStrategy, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, setup, snapshotState, snapshotState, useSplittableTimers
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
setKeyContextElement
finish, getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
notifyCheckpointAborted, notifyCheckpointComplete
getCurrentKey, setCurrentKey
processLatencyMarker, processRecordAttributes, processWatermark, processWatermarkStatus
hasKeyContext
protected final PythonFunctionInfo[] pandasAggFunctions
AggregateFunction
s to be executed.protected transient ArrowSerializer arrowSerializer
protected transient StreamRecordRowDataWrappingCollector rowDataWrapper
protected transient JoinedRowData reuseJoinedRow
protected transient int currentBatchCount
public AbstractArrowPythonAggregateFunctionOperator(Configuration config, PythonFunctionInfo[] pandasAggFunctions, RowType inputType, RowType udfInputType, RowType udfOutputType, GeneratedProjection udafInputGeneratedProjection)
public void open() throws Exception
AbstractStreamOperator
The default implementation does nothing.
open
in interface StreamOperator<RowData>
open
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Exception
- An exception in this method causes the operator to fail.public void close() throws Exception
StreamOperator
This method is expected to make a thorough effort to release all resources that the operator has acquired.
NOTE:It can not emit any records! If you need to emit records at the end of
processing, do so in the StreamOperator.finish()
method.
close
in interface StreamOperator<RowData>
close
in class AbstractExternalPythonFunctionOperator<RowData>
Exception
public void processElement(StreamRecord<RowData> element) throws Exception
Input
MultipleInputStreamOperator
.
This method is guaranteed to not be called concurrently with other methods of the operator.processElement
in interface Input<RowData>
processElement
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Exception
public boolean isBundleFinished()
AbstractPythonFunctionOperator
isBundleFinished
in class AbstractPythonFunctionOperator<RowData>
public PythonEnv getPythonEnv()
AbstractExternalPythonFunctionOperator
PythonEnv
used to create PythonEnvironmentManager..getPythonEnv
in class AbstractExternalPythonFunctionOperator<RowData>
public String getFunctionUrn()
getFunctionUrn
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
public FlinkFnApi.CoderInfoDescriptor createInputCoderInfoDescriptor(RowType runnerInputType)
createInputCoderInfoDescriptor
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
public FlinkFnApi.CoderInfoDescriptor createOutputCoderInfoDescriptor(RowType runnerOutType)
createOutputCoderInfoDescriptor
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
public RowData getFunctionInput(RowData element)
getFunctionInput
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
public FlinkFnApi.UserDefinedFunctions createUserDefinedFunctionsProto()
AbstractStatelessFunctionOperator
createUserDefinedFunctionsProto
in class AbstractStatelessFunctionOperator<RowData,RowData,RowData>
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.