@Internal public abstract class AbstractPythonFunctionOperator<OUT> extends AbstractStreamOperator<OUT>
Modifier and Type | Field and Description |
---|---|
protected Configuration | config |
protected int | elementCount - Number of processed elements in the current bundle. |
protected Map<String,String> | jobOptions - The options used to configure the Python worker process. |
protected int | maxBundleSize - Max number of elements to include in a bundle. |
protected PythonConfig | pythonConfig - The Python config. |
protected PythonFunctionRunner | pythonFunctionRunner - The PythonFunctionRunner which is responsible for Python user-defined function execution. |
Fields inherited from class AbstractStreamOperator: chainingStrategy, latencyStats, LOG, metrics, output, processingTimeService
Constructor and Description |
---|
AbstractPythonFunctionOperator(Configuration config) |
Modifier and Type | Method and Description |
---|---|
protected void | checkInvokeFinishBundleByCount() - Checks whether to invoke finishBundle based on the element count. |
void | close() - This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure or cancellation. |
protected PythonEnvironmentManager | createPythonEnvironmentManager() |
abstract PythonFunctionRunner | createPythonFunctionRunner() - Creates the PythonFunctionRunner which is responsible for Python user-defined function execution. |
abstract void | emitResult(Tuple2<byte[],Integer> resultTuple) - Sends the execution result to the downstream operator. |
protected void | emitResults() |
void | finish() - This method is called at the end of data processing. |
Configuration | getConfiguration() - Returns the Configuration. |
protected FlinkMetricContainer | getFlinkMetricContainer() |
abstract PythonEnv | getPythonEnv() - Returns the PythonEnv used to create PythonEnvironmentManager. |
protected void | invokeFinishBundle() |
boolean | isBundleFinished() - Returns whether the bundle is finished. |
void | open() - This method is called immediately before any elements are processed; it should contain the operator's initialization logic. |
void | prepareSnapshotPreBarrier(long checkpointId) - This method is called when the operator should do a snapshot, before it emits its own checkpoint barrier. |
void | processWatermark(Watermark mark) |
void | setConfiguration(Configuration config) - Resets the Configuration if needed. |
void | setCurrentKey(Object key) |
Methods inherited from class AbstractStreamOperator: getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setChainingStrategy, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState
protected Configuration config
protected transient PythonFunctionRunner pythonFunctionRunner
The PythonFunctionRunner which is responsible for Python user-defined function execution.
protected transient int maxBundleSize
protected transient int elementCount
protected transient PythonConfig pythonConfig
public AbstractPythonFunctionOperator(Configuration config)
public void open() throws Exception
Description copied from class: AbstractStreamOperator
The default implementation does nothing.
Specified by: open in interface StreamOperator<OUT>
Overrides: open in class AbstractStreamOperator<OUT>
Throws: Exception - An exception in this method causes the operator to fail.
public void finish() throws Exception
Description copied from interface: StreamOperator
The method is expected to flush all remaining buffered data. Exceptions during this flushing of buffered data should be propagated, in order to cause the operation to be recognized as failed, because the last data items are not processed properly.
After this method is called, no more records can be produced for the downstream operators.
WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later on be committed, e.g. in a CheckpointListener.notifyCheckpointComplete(long).
NOTE: This method does not need to close any resources. You should release external resources in the StreamOperator.close() method.
Specified by: finish in interface StreamOperator<OUT>
Overrides: finish in class AbstractStreamOperator<OUT>
Throws: Exception - An exception in this method causes the operator to fail.
public void close() throws Exception
Description copied from interface: StreamOperator
This method is expected to make a thorough effort to release all resources that the operator has acquired.
NOTE: It cannot emit any records! If you need to emit records at the end of processing, do so in the StreamOperator.finish() method.
Specified by: close in interface StreamOperator<OUT>
Overrides: close in class AbstractStreamOperator<OUT>
Throws: Exception
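The lifecycle contract described for open(), finish() and close() can be illustrated with a minimal standalone sketch. It deliberately does not use any Flink types: BufferingOperatorSketch, its downstream list and processElement(String) are hypothetical names introduced only for illustration, and the sketch models the documented contract (initialize in open, flush in finish, release without emitting in close) rather than the actual operator implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an operator; it does NOT extend any Flink class.
class BufferingOperatorSketch {
    private final List<String> downstream;            // stands in for the operator output
    private final List<String> buffer = new ArrayList<>();
    private boolean open;
    private boolean finished;

    BufferingOperatorSketch(List<String> downstream) {
        this.downstream = downstream;
    }

    // Initialization logic runs before any element is processed.
    void open() {
        open = true;
    }

    // Elements are only buffered here; nothing is emitted yet.
    void processElement(String element) {
        if (!open || finished) {
            throw new IllegalStateException("operator is not accepting elements");
        }
        buffer.add(element);
    }

    // Flush everything that is still buffered; no records are produced afterwards.
    void finish() {
        downstream.addAll(buffer);
        buffer.clear();
        finished = true;
    }

    // Release resources only; close() never writes to downstream.
    void close() {
        buffer.clear();
        open = false;
    }
}

class LifecycleDemo {
    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        BufferingOperatorSketch op = new BufferingOperatorSketch(out);
        op.open();
        op.processElement("a");
        op.processElement("b");
        op.finish();   // buffered elements reach downstream here
        op.close();    // resources released, nothing emitted
        System.out.println(out);
    }
}
```

Keeping the flush in finish() and the cleanup in close() means a failure path that skips finish() still releases resources without emitting partial results.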
public void prepareSnapshotPreBarrier(long checkpointId) throws Exception
Description copied from interface: StreamOperator
This method is intended not for any actual state persistence, but only for emitting some data before emitting the checkpoint barrier. It is useful for operators that maintain some small transient state that is inefficient to checkpoint (especially when it would need to be checkpointed in a re-scalable way) but can simply be sent downstream before the checkpoint. Examples are opportunistic pre-aggregation operators, which have small pre-aggregation state that is frequently flushed downstream.
Important: This method should not be used for any actual state snapshot logic, because it will inherently be within the synchronous part of the operator's checkpoint. If heavy work is done within this method, it will affect latency and downstream checkpoint alignments.
Specified by: prepareSnapshotPreBarrier in interface StreamOperator<OUT>
Overrides: prepareSnapshotPreBarrier in class AbstractStreamOperator<OUT>
Parameters: checkpointId - The ID of the checkpoint.
Throws: Exception - Throwing an exception here causes the operator to fail and go into recovery.
public void processWatermark(Watermark mark) throws Exception
Overrides: processWatermark in class AbstractStreamOperator<OUT>
Throws: Exception
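The pre-barrier flush that prepareSnapshotPreBarrier(long) is meant for can be sketched as follows. This is a standalone model under stated assumptions, not Flink code: PreAggregatorSketch, triggerCheckpoint(long) and the string-encoded barrier marker are hypothetical names used only to show that the transient pre-aggregation state is sent downstream before the barrier, so nothing remains to be checkpointed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pre-aggregating operator; a standalone model, not a Flink class.
class PreAggregatorSketch {
    private final Map<String, Integer> partialCounts = new LinkedHashMap<>();
    private final List<String> downstream = new ArrayList<>();

    // Accumulate a small, transient per-key count.
    void processElement(String key) {
        partialCounts.merge(key, 1, Integer::sum);
    }

    // Mirrors the role of prepareSnapshotPreBarrier: emit the transient state
    // so that it does not have to be persisted. Only cheap work happens here.
    void prepareSnapshotPreBarrier(long checkpointId) {
        for (Map.Entry<String, Integer> e : partialCounts.entrySet()) {
            downstream.add(e.getKey() + "=" + e.getValue());
        }
        partialCounts.clear();
    }

    // Stand-in for the framework: flush first, then send the barrier.
    void triggerCheckpoint(long checkpointId) {
        prepareSnapshotPreBarrier(checkpointId);
        downstream.add("barrier-" + checkpointId);
    }

    List<String> downstream() {
        return downstream;
    }

    boolean hasPendingState() {
        return !partialCounts.isEmpty();
    }
}

class PreBarrierDemo {
    public static void main(String[] args) {
        PreAggregatorSketch op = new PreAggregatorSketch();
        op.processElement("a");
        op.processElement("b");
        op.processElement("a");
        op.triggerCheckpoint(1L);
        System.out.println(op.downstream());
    }
}
```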
public void setCurrentKey(Object key)
Specified by: setCurrentKey in interface KeyContext
Overrides: setCurrentKey in class AbstractStreamOperator<OUT>
public boolean isBundleFinished()
public void setConfiguration(Configuration config)
Resets the Configuration if needed.
public Configuration getConfiguration()
Returns the Configuration.
public abstract PythonFunctionRunner createPythonFunctionRunner() throws Exception
Creates the PythonFunctionRunner which is responsible for Python user-defined function execution.
Throws: Exception
public abstract PythonEnv getPythonEnv()
Returns the PythonEnv used to create PythonEnvironmentManager.
public abstract void emitResult(Tuple2<byte[],Integer> resultTuple) throws Exception
Throws: Exception
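This page does not say what the two fields of the Tuple2<byte[],Integer> passed to emitResult contain. The sketch below shows one plausible way a subclass might consume such a pair, under the explicit assumption that the first field is a (possibly reused) byte buffer and the second is the number of valid bytes in it. That interpretation, the class ResultDecodingSketch, and the use of Map.Entry as a stand-in for Tuple2 are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper; Map.Entry stands in for Flink's Tuple2 so the sketch is self-contained.
class ResultDecodingSketch {

    // ASSUMPTION: the key is a buffer that may be longer than the payload,
    // and the value is the count of valid bytes at the start of that buffer.
    static String decode(Map.Entry<byte[], Integer> resultTuple) {
        byte[] buffer = resultTuple.getKey();
        int length = resultTuple.getValue();
        if (length < 0 || length > buffer.length) {
            throw new IllegalArgumentException("invalid length: " + length);
        }
        // Read only the valid prefix; trailing bytes are ignored.
        return new String(buffer, 0, length, StandardCharsets.UTF_8);
    }
}

class ResultDecodingDemo {
    public static void main(String[] args) {
        byte[] buffer = "hello-stale-bytes".getBytes(StandardCharsets.UTF_8);
        Map.Entry<byte[], Integer> resultTuple = new SimpleEntry<>(buffer, 5);
        System.out.println(ResultDecodingSketch.decode(resultTuple));
    }
}
```

A real emitResult implementation would deserialize the payload with the operator's own serializer and forward it to the downstream operator instead of building a string.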
protected void checkInvokeFinishBundleByCount() throws Exception
Throws: Exception
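The relationship between elementCount, maxBundleSize, checkInvokeFinishBundleByCount(), invokeFinishBundle() and isBundleFinished() can be sketched with a standalone counter. This is a model built from the field and method descriptions on this page, not the operator's actual code: BundleCounterSketch, the finishedBundles counter, and the exact comparison used are assumptions.

```java
// Hypothetical model of count-based bundling; it does not use any Flink types.
class BundleCounterSketch {
    private final int maxBundleSize;   // max number of elements to include in a bundle
    private int elementCount;          // number of processed elements in the current bundle
    private int finishedBundles;       // bookkeeping for this sketch only

    BundleCounterSketch(int maxBundleSize) {
        if (maxBundleSize <= 0) {
            throw new IllegalArgumentException("maxBundleSize must be positive");
        }
        this.maxBundleSize = maxBundleSize;
    }

    // Count the element, then check whether the bundle is full.
    void processElement() {
        elementCount++;
        checkInvokeFinishBundleByCount();
    }

    // ASSUMPTION: the bundle is finished once the count reaches the maximum.
    private void checkInvokeFinishBundleByCount() {
        if (elementCount >= maxBundleSize) {
            invokeFinishBundle();
        }
    }

    // Finish the current bundle if it contains anything and reset the counter.
    void invokeFinishBundle() {
        if (elementCount > 0) {
            finishedBundles++;
            elementCount = 0;
        }
    }

    // ASSUMPTION: a bundle counts as finished when no elements are pending.
    boolean isBundleFinished() {
        return elementCount == 0;
    }

    int elementCount() {
        return elementCount;
    }

    int finishedBundles() {
        return finishedBundles;
    }
}

class BundleCounterDemo {
    public static void main(String[] args) {
        BundleCounterSketch counter = new BundleCounterSketch(3);
        for (int i = 0; i < 7; i++) {
            counter.processElement();
        }
        System.out.println(counter.finishedBundles() + " bundles finished, "
                + counter.elementCount() + " element(s) pending");
        counter.invokeFinishBundle();  // e.g. forced at the end of input or before a checkpoint
        System.out.println("bundle finished: " + counter.isBundleFinished());
    }
}
```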
protected PythonEnvironmentManager createPythonEnvironmentManager()
protected FlinkMetricContainer getFlinkMetricContainer()
Copyright © 2014–2023 The Apache Software Foundation. All rights reserved.