Class JobGraph

  • All Implemented Interfaces:
    Serializable

    public class JobGraph
    extends Object
    implements Serializable
    The JobGraph represents a Flink dataflow program, at the low level that the JobManager accepts. All programs from higher level APIs are transformed into JobGraphs.

    The JobGraph is a graph of vertices and intermediate results that are connected together to form a DAG. Note that iterations (feedback edges) are currently not encoded inside the JobGraph but inside certain special vertices that establish the feedback channel amongst themselves.

    The JobGraph defines the job-wide configuration settings, while each vertex and intermediate result define the characteristics of the concrete operation and intermediate data.

    See Also:
    Serialized Form
    • Constructor Detail

      • JobGraph

        public JobGraph​(String jobName)
        Constructs a new job graph with the given name, the given ExecutionConfig, and a random job ID. The ExecutionConfig will be serialized and can't be modified afterwards.
        Parameters:
        jobName - The name of the job.
      • JobGraph

        public JobGraph​(@Nullable
                        JobID jobId,
                        String jobName)
        Constructs a new job graph with the given job ID (or a random ID, if null is passed), the given name and the given execution configuration (see ExecutionConfig). The ExecutionConfig will be serialized and can't be modified afterwards.
        Parameters:
        jobId - The id of the job. A random ID is generated, if null is passed.
        jobName - The name of the job.
      • JobGraph

        public JobGraph​(@Nullable
                        JobID jobId,
                        String jobName,
                        JobVertex... vertices)
        Constructs a new job graph with the given name, the given ExecutionConfig, the given jobId or a random one if null supplied, and the given job vertices. The ExecutionConfig will be serialized and can't be modified afterwards.
        Parameters:
        jobId - The id of the job. A random ID is generated, if null is passed.
        jobName - The name of the job.
        vertices - The vertices to add to the graph.
    • Method Detail

      • getJobID

        public JobID getJobID()
        Returns the ID of the job.
        Returns:
        the ID of the job
      • setJobID

        public void setJobID​(JobID jobID)
        Sets the ID of the job.
      • getName

        public String getName()
        Returns the name assigned to the job graph.
        Returns:
        the name assigned to the job graph
      • setJobConfiguration

        public void setJobConfiguration​(Configuration jobConfiguration)
      • getJobConfiguration

        public Configuration getJobConfiguration()
        Returns the configuration object for this job. Job-wide parameters should be set into that configuration object.
        Returns:
        The configuration object for this job.
      • setJobType

        public void setJobType​(JobType type)
      • getJobType

        public JobType getJobType()
      • setDynamic

        public void setDynamic​(boolean dynamic)
      • isDynamic

        public boolean isDynamic()
      • enableApproximateLocalRecovery

        public void enableApproximateLocalRecovery​(boolean enabled)
      • isApproximateLocalRecoveryEnabled

        public boolean isApproximateLocalRecoveryEnabled()
      • setSavepointRestoreSettings

        public void setSavepointRestoreSettings​(SavepointRestoreSettings settings)
        Sets the savepoint restore settings.
        Parameters:
        settings - The savepoint restore settings.
      • getSavepointRestoreSettings

        public SavepointRestoreSettings getSavepointRestoreSettings()
        Returns the configured savepoint restore setting.
        Returns:
        The configured savepoint restore settings.
      • setExecutionConfig

        public void setExecutionConfig​(ExecutionConfig executionConfig)
                                throws IOException
        Sets the execution config. This method eagerly serialized the ExecutionConfig for future RPC transport. Further modification of the referenced ExecutionConfig object will not affect this serialized copy.
        Parameters:
        executionConfig - The ExecutionConfig to be serialized.
        Throws:
        IOException - Thrown if the serialization of the ExecutionConfig fails
      • addVertex

        public void addVertex​(JobVertex vertex)
        Adds a new task vertex to the job graph if it is not already included.
        Parameters:
        vertex - the new task vertex to be added
      • getVertices

        public Iterable<JobVertex> getVertices()
        Returns an Iterable to iterate all vertices registered with the job graph.
        Returns:
        an Iterable to iterate all vertices registered with the job graph
      • getVerticesAsArray

        public JobVertex[] getVerticesAsArray()
        Returns an array of all job vertices that are registered with the job graph. The order in which the vertices appear in the list is not defined.
        Returns:
        an array of all job vertices that are registered with the job graph
      • getNumberOfVertices

        public int getNumberOfVertices()
        Returns the number of all vertices.
        Returns:
        The number of all vertices.
      • getCoLocationGroups

        public Set<CoLocationGroup> getCoLocationGroups()
        Returns all CoLocationGroup instances associated with this JobGraph.
        Returns:
        The associated CoLocationGroup instances.
      • setSnapshotSettings

        public void setSnapshotSettings​(JobCheckpointingSettings settings)
        Sets the settings for asynchronous snapshots. A value of null means that snapshotting is not enabled.
        Parameters:
        settings - The snapshot settings
      • getCheckpointingSettings

        public JobCheckpointingSettings getCheckpointingSettings()
        Gets the settings for asynchronous snapshots. This method returns null, when checkpointing is not enabled.
        Returns:
        The snapshot settings
      • isCheckpointingEnabled

        public boolean isCheckpointingEnabled()
        Checks if the checkpointing was enabled for this job graph.
        Returns:
        true if checkpointing enabled
      • findVertexByID

        public JobVertex findVertexByID​(JobVertexID id)
        Searches for a vertex with a matching ID and returns it.
        Parameters:
        id - the ID of the vertex to search for
        Returns:
        the vertex with the matching ID or null if no vertex with such ID could be found
      • setClasspaths

        public void setClasspaths​(List<URL> paths)
        Sets the classpaths required to run the job on a task manager.
        Parameters:
        paths - paths of the directories/JAR files required to run the job on a task manager
      • getClasspaths

        public List<URL> getClasspaths()
      • getMaximumParallelism

        public int getMaximumParallelism()
        Gets the maximum parallelism of all operations in this job graph.
        Returns:
        The maximum parallelism of this job graph
      • addJar

        public void addJar​(Path jar)
        Adds the path of a JAR file required to run the job on a task manager.
        Parameters:
        jar - path of the JAR file required to run the job on a task manager
      • getUserJars

        public List<Path> getUserJars()
        Gets the list of assigned user jar paths.
        Returns:
        The list of assigned user jar paths
      • addUserArtifact

        public void addUserArtifact​(String name,
                                    DistributedCache.DistributedCacheEntry file)
        Adds the path of a custom file required to run the job on a task manager.
        Parameters:
        name - a name under which this artifact will be accessible through DistributedCache
        file - path of a custom file required to run the job on a task manager
      • addUserJarBlobKey

        public void addUserJarBlobKey​(PermanentBlobKey key)
        Adds the BLOB referenced by the key to the JobGraph's dependencies.
        Parameters:
        key - path of the JAR file required to run the job on a task manager
      • hasUsercodeJarFiles

        public boolean hasUsercodeJarFiles()
        Checks whether the JobGraph has user code JAR files attached.
        Returns:
        True, if the JobGraph has user code JAR files attached, false otherwise.
      • getUserJarBlobKeys

        public List<PermanentBlobKey> getUserJarBlobKeys()
        Returns a set of BLOB keys referring to the JAR files required to run this job.
        Returns:
        set of BLOB keys referring to the JAR files required to run this job
      • setUserArtifactRemotePath

        public void setUserArtifactRemotePath​(String entryName,
                                              String remotePath)
      • writeUserArtifactEntriesToConfiguration

        public void writeUserArtifactEntriesToConfiguration()
      • setInitialClientHeartbeatTimeout

        public void setInitialClientHeartbeatTimeout​(long initialClientHeartbeatTimeout)
      • getInitialClientHeartbeatTimeout

        public long getInitialClientHeartbeatTimeout()