Interface ExecutionGraph

  • All Superinterfaces:
    AccessExecutionGraph, JobStatusProvider
    All Known Implementing Classes:
    DefaultExecutionGraph

    public interface ExecutionGraph
    extends AccessExecutionGraph
    The execution graph is the central data structure that coordinates the distributed execution of a data flow. It keeps representations of each parallel task, each intermediate stream, and the communication between them.

    The execution graph consists of the following constructs:

    • The ExecutionJobVertex represents one vertex from the JobGraph (usually one operation like "map" or "join") during execution. It holds the aggregated state of all parallel subtasks. The ExecutionJobVertex is identified inside the graph by the JobVertexID, which it takes from the JobGraph's corresponding JobVertex.
    • The ExecutionVertex represents one parallel subtask. For each ExecutionJobVertex, there are as many ExecutionVertices as the parallelism. The ExecutionVertex is identified by the ExecutionJobVertex and the index of the parallel subtask
    • The Execution is one attempt to execute a ExecutionVertex. There may be multiple Executions for the ExecutionVertex, in case of a failure, or in the case where some data needs to be recomputed because it is no longer available when requested by later operations. An Execution is always identified by an ExecutionAttemptID. All messages between the JobManager and the TaskManager about deployment of tasks and updates in the task status always use the ExecutionAttemptID to address the message receiver.