public class Task extends Object implements Runnable, TaskSlotPayload, TaskActions, PartitionProducerStateProvider, CheckpointListener
The Flink operators (implemented as subclasses of AbstractInvokable) have only data readers, writers, and certain event callbacks. The task connects those to the network stack and actor messages, tracks the state of the execution, and handles exceptions.
Tasks have no knowledge about how they relate to other tasks, or whether they are the first attempt to execute the task, or a repeated attempt. All of that is only known to the JobManager. All the task knows are its own runnable code, the task's configuration, and the IDs of the intermediate results to consume and produce (if any).
Each Task is run by one dedicated thread.
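The following is a minimal sketch of the "one dedicated thread per task" model described above. It does not use Flink classes: MiniTask, its State enum, and the Runnable passed in are hypothetical stand-ins, and only the method names (startTaskThread, run, getExecutionState, getFailureCause, getTerminationFuture) mirror this page.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for illustration only; not Flink's Task implementation.
class MiniTask implements Runnable {
    enum State { CREATED, RUNNING, FINISHED, FAILED }

    private final Runnable userCode;        // the only "work" the task knows about
    private final Thread executingThread;   // one dedicated thread per task
    private final AtomicReference<State> state = new AtomicReference<>(State.CREATED);
    private final CompletableFuture<State> terminationFuture = new CompletableFuture<>();
    private volatile Throwable failureCause;

    MiniTask(String name, Runnable userCode) {
        this.userCode = userCode;
        // The constructor only wires things up; it starts no work.
        this.executingThread = new Thread(this, name);
    }

    void startTaskThread() {
        executingThread.start();
    }

    @Override
    public void run() {
        if (!state.compareAndSet(State.CREATED, State.RUNNING)) {
            return; // already started or already terminated
        }
        try {
            userCode.run();
            state.set(State.FINISHED);
        } catch (Throwable t) {
            failureCause = t;
            state.set(State.FAILED);
        } finally {
            // Completed exactly once, with the terminal state.
            terminationFuture.complete(state.get());
        }
    }

    State getExecutionState() { return state.get(); }
    Throwable getFailureCause() { return failureCause; }
    CompletableFuture<State> getTerminationFuture() { return terminationFuture; }

    public static void main(String[] args) {
        MiniTask ok = new MiniTask("ok", () -> { });
        ok.startTaskThread();
        System.out.println(ok.getTerminationFuture().join());  // FINISHED

        MiniTask bad = new MiniTask("bad", () -> { throw new IllegalStateException("boom"); });
        bad.startTaskThread();
        System.out.println(bad.getTerminationFuture().join()); // FAILED
    }
}
```

Callers observe the outcome through the termination future rather than by joining the thread, which keeps the thread itself an internal detail of the task.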
Nested classes/interfaces inherited from interface PartitionProducerStateProvider: PartitionProducerStateProvider.ResponseHandle
Constructor and Description |
---|
Task(JobInformation jobInformation,
TaskInformation taskInformation,
ExecutionAttemptID executionAttemptID,
AllocationID slotAllocationId,
int subtaskIndex,
int attemptNumber,
List<ResultPartitionDeploymentDescriptor> resultPartitionDeploymentDescriptors,
List<InputGateDeploymentDescriptor> inputGateDeploymentDescriptors,
MemoryManager memManager,
IOManager ioManager,
ShuffleEnvironment<?,?> shuffleEnvironment,
KvStateService kvStateService,
BroadcastVariableManager bcVarManager,
TaskEventDispatcher taskEventDispatcher,
ExternalResourceInfoProvider externalResourceInfoProvider,
TaskStateManager taskStateManager,
TaskManagerActions taskManagerActions,
InputSplitProvider inputSplitProvider,
CheckpointResponder checkpointResponder,
TaskOperatorEventGateway operatorCoordinatorEventGateway,
GlobalAggregateManager aggregateManager,
LibraryCacheManager.ClassLoaderHandle classLoaderHandle,
FileCache fileCache,
TaskManagerRuntimeInfo taskManagerConfig,
TaskMetricGroup metricGroup,
ResultPartitionConsumableNotifier resultPartitionConsumableNotifier,
PartitionProducerStateChecker partitionProducerStateChecker,
Executor executor)
IMPORTANT: This constructor may not start any work that would need to be undone in the
case of a failing task deployment.
|
Modifier and Type | Method and Description |
---|---|
void |
cancelExecution()
Cancels the task execution.
|
void |
deliverOperatorEvent(OperatorID operator,
SerializedValue<OperatorEvent> evt)
Dispatches an operator event to the invokable task.
|
void |
failExternally(Throwable cause)
Marks task execution failed for an external reason (a reason other than the task code itself
throwing an exception).
|
AccumulatorRegistry |
getAccumulatorRegistry() |
AllocationID |
getAllocationId() |
Thread |
getExecutingThread() |
ExecutionAttemptID |
getExecutionId() |
ExecutionState |
getExecutionState()
Returns the current execution state of the task.
|
Throwable |
getFailureCause()
If the task has failed, this method gets the exception that caused this task to fail.
|
Configuration |
getJobConfiguration() |
JobID |
getJobID() |
JobVertexID |
getJobVertexId() |
TaskMetricGroup |
getMetricGroup() |
Configuration |
getTaskConfiguration() |
TaskInfo |
getTaskInfo() |
CompletableFuture<ExecutionState> |
getTerminationFuture() |
boolean |
isBackPressured() |
boolean |
isCanceledOrFailed()
Checks whether the task has failed, is canceled, or is being canceled at the moment.
|
void |
notifyCheckpointAborted(long checkpointID)
This method is called as a notification once a distributed checkpoint has been aborted.
|
void |
notifyCheckpointComplete(long checkpointID)
Notifies the listener that the checkpoint with the given checkpointId completed and was committed.
|
void |
requestPartitionProducerState(IntermediateDataSetID intermediateDataSetId,
ResultPartitionID resultPartitionId,
java.util.function.Consumer<? super PartitionProducerStateProvider.ResponseHandle> responseConsumer)
Trigger the producer execution state request.
|
void |
run()
The core work method that bootstraps the task and executes its code.
|
static void |
setupPartitionsAndGates(ResultPartitionWriter[] producedPartitions,
InputGate[] inputGates) |
void |
startTaskThread()
Starts the task's thread.
|
String |
toString() |
void |
triggerCheckpointBarrier(long checkpointID,
long checkpointTimestamp,
CheckpointOptions checkpointOptions)
Calls the invokable to trigger a checkpoint.
|
public Task(JobInformation jobInformation, TaskInformation taskInformation, ExecutionAttemptID executionAttemptID, AllocationID slotAllocationId, int subtaskIndex, int attemptNumber, List<ResultPartitionDeploymentDescriptor> resultPartitionDeploymentDescriptors, List<InputGateDeploymentDescriptor> inputGateDeploymentDescriptors, MemoryManager memManager, IOManager ioManager, ShuffleEnvironment<?,?> shuffleEnvironment, KvStateService kvStateService, BroadcastVariableManager bcVarManager, TaskEventDispatcher taskEventDispatcher, ExternalResourceInfoProvider externalResourceInfoProvider, TaskStateManager taskStateManager, TaskManagerActions taskManagerActions, InputSplitProvider inputSplitProvider, CheckpointResponder checkpointResponder, TaskOperatorEventGateway operatorCoordinatorEventGateway, GlobalAggregateManager aggregateManager, LibraryCacheManager.ClassLoaderHandle classLoaderHandle, FileCache fileCache, TaskManagerRuntimeInfo taskManagerConfig, @Nonnull TaskMetricGroup metricGroup, ResultPartitionConsumableNotifier resultPartitionConsumableNotifier, PartitionProducerStateChecker partitionProducerStateChecker, Executor executor)
public JobID getJobID()
Specified by:
getJobID in interface TaskSlotPayload
public JobVertexID getJobVertexId()
public ExecutionAttemptID getExecutionId()
Specified by:
getExecutionId in interface TaskSlotPayload
public AllocationID getAllocationId()
Specified by:
getAllocationId in interface TaskSlotPayload
public TaskInfo getTaskInfo()
public Configuration getJobConfiguration()
public Configuration getTaskConfiguration()
public AccumulatorRegistry getAccumulatorRegistry()
public TaskMetricGroup getMetricGroup()
public Thread getExecutingThread()
public CompletableFuture<ExecutionState> getTerminationFuture()
Specified by:
getTerminationFuture in interface TaskSlotPayload
public boolean isBackPressured()
public ExecutionState getExecutionState()
public boolean isCanceledOrFailed()
public Throwable getFailureCause()
public void startTaskThread()
public void run()
@VisibleForTesting public static void setupPartitionsAndGates(ResultPartitionWriter[] producedPartitions, InputGate[] inputGates) throws IOException
Throws:
IOException
public void cancelExecution()
This method never blocks.
public void failExternally(Throwable cause)
This method never blocks.
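One way a "never blocks" cancel or external failure can be structured is a single compare-and-set on the execution state followed by an interrupt, which only signals the worker thread and does not wait for it. The sketch below is an assumption about the general technique, not Flink's actual code: NonBlockingCancel, its reduced State enum, and the sleeping worker are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; class name and states are assumptions, not Flink's actual code.
class NonBlockingCancel {
    enum State { RUNNING, CANCELING, FAILED }

    private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
    private volatile Throwable failureCause;
    private final Thread worker;

    NonBlockingCancel(Thread worker) {
        this.worker = worker;
    }

    /** Returns true if this call moved the task out of RUNNING. */
    boolean cancelExecution() {
        return leaveRunning(State.CANCELING, null);
    }

    /** Records an external failure reason; returns false if the task already left RUNNING. */
    boolean failExternally(Throwable cause) {
        return leaveRunning(State.FAILED, cause);
    }

    private boolean leaveRunning(State target, Throwable cause) {
        if (!state.compareAndSet(State.RUNNING, target)) {
            return false; // already canceling or failed: nothing to do
        }
        failureCause = cause;   // published right after the winning state change
        worker.interrupt();     // signals the worker; does not wait for it to stop
        return true;
    }

    State getExecutionState() { return state.get(); }
    Throwable getFailureCause() { return failureCause; }
    boolean isCanceledOrFailed() { return state.get() != State.RUNNING; }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // stands in for long-running user code
            } catch (InterruptedException e) {
                // aborted by cancelExecution() or failExternally()
            }
        });
        worker.start();

        NonBlockingCancel task = new NonBlockingCancel(worker);
        System.out.println(task.failExternally(new RuntimeException("external reason"))); // true
        System.out.println(task.cancelExecution());                                       // false
        worker.join();
        System.out.println(task.getExecutionState());                                     // FAILED
    }
}
```

Because only the first caller wins the compare-and-set, repeated or concurrent calls are harmless and return immediately.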
Specified by:
failExternally in interface TaskSlotPayload
Specified by:
failExternally in interface TaskActions
Parameters:
cause - of the failure
public void requestPartitionProducerState(IntermediateDataSetID intermediateDataSetId, ResultPartitionID resultPartitionId, java.util.function.Consumer<? super PartitionProducerStateProvider.ResponseHandle> responseConsumer)
Description copied from interface: PartitionProducerStateProvider
Specified by:
requestPartitionProducerState in interface PartitionProducerStateProvider
Parameters:
intermediateDataSetId - ID of the parent intermediate data set.
resultPartitionId - ID of the result partition to check. This identifies the producing execution and partition.
responseConsumer - consumer for the response handle.
public void triggerCheckpointBarrier(long checkpointID, long checkpointTimestamp, CheckpointOptions checkpointOptions)
Parameters:
checkpointID - The ID identifying the checkpoint.
checkpointTimestamp - The timestamp associated with the checkpoint.
checkpointOptions - Options for performing this checkpoint.
public void notifyCheckpointComplete(long checkpointID)
Description copied from interface: CheckpointListener
Notifies the listener that the checkpoint with the given checkpointId completed and was committed.
These notifications are "best effort", meaning they can sometimes be skipped. To behave properly, implementers need to follow the "Checkpoint Subsuming Contract". Please see the class-level JavaDocs for details.
Please note that checkpoints may generally overlap, so you cannot assume that the notifyCheckpointComplete() call is always for the latest prior checkpoint (or snapshot) that was taken on the function/operator implementing this interface. It might be for a checkpoint that was triggered earlier. Implementing the "Checkpoint Subsuming Contract" (see above) properly handles this situation as well.
Please note that throwing exceptions from this method will not cause the completed checkpoint to be revoked. Throwing exceptions will typically cause task/job failure and trigger recovery.
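The notes above (notifications can be skipped, and may arrive for an earlier checkpoint) can be illustrated with a listener that treats a completion notice for checkpoint N as covering every pending checkpoint up to and including N. This is a sketch of one common reading of the contract, not Flink code: SubsumingCommitter, snapshot, and the string "artifacts" are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical listener; the pending/committed bookkeeping is illustrative, not Flink's API.
class SubsumingCommitter {
    // Artifacts produced per checkpoint, not yet known to be committed.
    private final NavigableMap<Long, List<String>> pending = new TreeMap<>();
    private final List<String> committed = new ArrayList<>();

    void snapshot(long checkpointId, List<String> artifacts) {
        pending.put(checkpointId, new ArrayList<>(artifacts));
    }

    void notifyCheckpointComplete(long checkpointId) {
        // A completed checkpoint subsumes all earlier ones, so commit everything
        // up to and including checkpointId, even if earlier notices were skipped.
        NavigableMap<Long, List<String>> covered = pending.headMap(checkpointId, true);
        for (List<String> artifacts : covered.values()) {
            committed.addAll(artifacts);
        }
        covered.clear(); // headMap is a view, so this removes the entries from pending
    }

    List<String> committed() { return committed; }
    int pendingCheckpoints() { return pending.size(); }

    public static void main(String[] args) {
        SubsumingCommitter c = new SubsumingCommitter();
        c.snapshot(1, Arrays.asList("a"));
        c.snapshot(2, Arrays.asList("b"));
        c.snapshot(3, Arrays.asList("c"));

        c.notifyCheckpointComplete(2);     // the notice for checkpoint 1 was skipped
        System.out.println(c.committed()); // [a, b]

        c.notifyCheckpointComplete(1);     // late notice for an older checkpoint: no effect
        System.out.println(c.committed()); // [a, b]

        c.notifyCheckpointComplete(3);
        System.out.println(c.committed()); // [a, b, c]
    }
}
```

Keeping the pending entries in a sorted map makes both cases cheap: a skipped notice is absorbed by the next one, and a late notice for an older checkpoint finds nothing left to commit.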
Specified by:
notifyCheckpointComplete in interface CheckpointListener
Parameters:
checkpointID - The ID of the checkpoint that has been completed.
public void notifyCheckpointAborted(long checkpointID)
Description copied from interface: CheckpointListener
Important: The fact that a checkpoint has been aborted does NOT mean that the data and artifacts produced between the previous checkpoint and the aborted checkpoint are to be discarded. The expected behavior is as if this checkpoint was never triggered in the first place, and the next successful checkpoint simply covers a longer time span. See the "Checkpoint Subsuming Contract" in the class-level JavaDocs for details.
These notifications are "best effort", meaning they can sometimes be skipped.
This method is very rarely necessary to implement. The "best effort" guarantee, together with the fact that this method should not result in discarding any data (per the "Checkpoint Subsuming Contract"), means it is mainly useful for earlier cleanup of auxiliary resources. One example is to proactively clear a local per-checkpoint state cache upon checkpoint failure.
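As a sketch of the cache-clearing example just mentioned, the class below drops only the auxiliary per-checkpoint cache entry on abort and keeps the buffered data, so the next snapshot covers the longer time span. AbortAwareBuffer and its methods are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example; not part of Flink's API.
class AbortAwareBuffer {
    // Records produced since the last successful checkpoint. Never dropped on abort.
    private final List<String> uncommitted = new ArrayList<>();
    // Auxiliary per-checkpoint cache of what each snapshot captured.
    private final Map<Long, String> snapshotCache = new HashMap<>();

    void add(String record) {
        uncommitted.add(record);
    }

    void snapshot(long checkpointId) {
        snapshotCache.put(checkpointId, String.join(",", uncommitted));
    }

    void notifyCheckpointAborted(long checkpointId) {
        // Early cleanup of the auxiliary cache only; the buffered data stays,
        // as if this checkpoint had never been triggered.
        snapshotCache.remove(checkpointId);
    }

    String cachedSnapshot(long checkpointId) {
        return snapshotCache.get(checkpointId);
    }

    public static void main(String[] args) {
        AbortAwareBuffer buffer = new AbortAwareBuffer();
        buffer.add("a");
        buffer.snapshot(1);
        buffer.notifyCheckpointAborted(1);            // cache entry for checkpoint 1 is gone
        buffer.add("b");
        buffer.snapshot(2);                           // covers the longer span
        System.out.println(buffer.cachedSnapshot(1)); // null
        System.out.println(buffer.cachedSnapshot(2)); // a,b
    }
}
```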
Specified by:
notifyCheckpointAborted in interface CheckpointListener
Parameters:
checkpointID - The ID of the checkpoint that has been aborted.
public void deliverOperatorEvent(OperatorID operator, SerializedValue<OperatorEvent> evt) throws FlinkException
If the event delivery did not succeed, this method throws an exception. Callers can use that exception for error reporting, but need not react by failing this task (this method takes care of that).
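The "fail the task, then rethrow for reporting" behavior described above can be sketched as follows. Nothing here is Flink code: DeliveryException stands in for a checked exception such as FlinkException, and EventDeliveringTask, EventHandler, and the String event type are hypothetical.

```java
// Hypothetical stand-in for a checked exception such as FlinkException.
class DeliveryException extends Exception {
    DeliveryException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical task; only the method names echo this page.
class EventDeliveringTask {
    interface EventHandler {
        void handle(String event) throws Exception;
    }

    private final EventHandler handler;
    private volatile Throwable failureCause;

    EventDeliveringTask(EventHandler handler) {
        this.handler = handler;
    }

    void deliverOperatorEvent(String event) throws DeliveryException {
        try {
            handler.handle(event);
        } catch (Exception e) {
            DeliveryException wrapped =
                    new DeliveryException("Error while handling operator event", e);
            failExternally(wrapped); // the task fails itself ...
            throw wrapped;           // ... and still reports the reason to the caller
        }
    }

    void failExternally(Throwable cause) {
        if (failureCause == null) {
            failureCause = cause; // keep the first recorded cause
        }
    }

    Throwable getFailureCause() { return failureCause; }

    public static void main(String[] args) {
        EventDeliveringTask task = new EventDeliveringTask(event -> {
            throw new IllegalArgumentException("cannot handle " + event);
        });
        try {
            task.deliverOperatorEvent("evt");
        } catch (DeliveryException e) {
            // The caller only reports; the task has already been marked as failed.
            System.out.println("delivery failed: " + e.getCause().getMessage());
        }
    }
}
```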
Throws:
FlinkException - This method throws exceptions indicating the reason why delivery did not succeed.
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.