Class JobGraph
- java.lang.Object
-
- org.apache.flink.runtime.jobgraph.JobGraph
-
- All Implemented Interfaces:
Serializable
public class JobGraph extends Object implements Serializable
The JobGraph represents a Flink dataflow program, at the low level that the JobManager accepts. All programs from higher-level APIs are transformed into JobGraphs.
The JobGraph is a graph of vertices and intermediate results that are connected together to form a DAG. Note that iterations (feedback edges) are currently not encoded inside the JobGraph but inside certain special vertices that establish the feedback channel amongst themselves.
The JobGraph defines the job-wide configuration settings, while each vertex and intermediate result define the characteristics of the concrete operation and intermediate data.
- See Also:
- Serialized Form
-
-
Constructor Summary
JobGraph(String jobName)
Constructs a new job graph with the given name, the given ExecutionConfig, and a random job ID.
JobGraph(JobID jobId, String jobName)
Constructs a new job graph with the given job ID (or a random ID, if null is passed), the given name and the given execution configuration (see ExecutionConfig).
JobGraph(JobID jobId, String jobName, JobVertex... vertices)
Constructs a new job graph with the given name, the given ExecutionConfig, the given jobId or a random one if null supplied, and the given job vertices.
-
Method Summary
void addJar(Path jar)
Adds the path of a JAR file required to run the job on a task manager.
void addJars(List<URL> jarFilesToAttach)
Adds the given jar files to the JobGraph via addJar(org.apache.flink.core.fs.Path).
void addUserArtifact(String name, DistributedCache.DistributedCacheEntry file)
Adds the path of a custom file required to run the job on a task manager.
void addUserJarBlobKey(PermanentBlobKey key)
Adds the BLOB referenced by the key to the JobGraph's dependencies.
void addVertex(JobVertex vertex)
Adds a new task vertex to the job graph if it is not already included.
void enableApproximateLocalRecovery(boolean enabled)
JobVertex findVertexByID(JobVertexID id)
Searches for a vertex with a matching ID and returns it.
JobCheckpointingSettings getCheckpointingSettings()
Gets the settings for asynchronous snapshots.
List<URL> getClasspaths()
Set<CoLocationGroup> getCoLocationGroups()
Returns all CoLocationGroup instances associated with this JobGraph.
long getInitialClientHeartbeatTimeout()
Configuration getJobConfiguration()
Returns the configuration object for this job.
JobID getJobID()
Returns the ID of the job.
List<JobStatusHook> getJobStatusHooks()
JobType getJobType()
int getMaximumParallelism()
Gets the maximum parallelism of all operations in this job graph.
String getName()
Returns the name assigned to the job graph.
int getNumberOfVertices()
Returns the number of all vertices.
SavepointRestoreSettings getSavepointRestoreSettings()
Returns the configured savepoint restore settings.
SerializedValue<ExecutionConfig> getSerializedExecutionConfig()
Returns the ExecutionConfig.
Set<SlotSharingGroup> getSlotSharingGroups()
Map<String,DistributedCache.DistributedCacheEntry> getUserArtifacts()
Gets the map of assigned user artifacts, keyed by artifact name.
List<PermanentBlobKey> getUserJarBlobKeys()
Returns a list of BLOB keys referring to the JAR files required to run this job.
List<Path> getUserJars()
Gets the list of assigned user jar paths.
Iterable<JobVertex> getVertices()
Returns an Iterable to iterate all vertices registered with the job graph.
JobVertex[] getVerticesAsArray()
Returns an array of all job vertices that are registered with the job graph.
List<JobVertex> getVerticesSortedTopologicallyFromSources()
boolean hasUsercodeJarFiles()
Checks whether the JobGraph has user code JAR files attached.
boolean isApproximateLocalRecoveryEnabled()
boolean isCheckpointingEnabled()
Checks whether checkpointing is enabled for this job graph.
boolean isDynamic()
void setClasspaths(List<URL> paths)
Sets the classpaths required to run the job on a task manager.
void setDynamic(boolean dynamic)
void setExecutionConfig(ExecutionConfig executionConfig)
Sets the execution config.
void setInitialClientHeartbeatTimeout(long initialClientHeartbeatTimeout)
void setJobConfiguration(Configuration jobConfiguration)
void setJobID(JobID jobID)
Sets the ID of the job.
void setJobStatusHooks(List<JobStatusHook> hooks)
void setJobType(JobType type)
void setSavepointRestoreSettings(SavepointRestoreSettings settings)
Sets the savepoint restore settings.
void setSnapshotSettings(JobCheckpointingSettings settings)
Sets the settings for asynchronous snapshots.
void setUserArtifactBlobKey(String entryName, PermanentBlobKey blobKey)
void setUserArtifactRemotePath(String entryName, String remotePath)
String toString()
void writeUserArtifactEntriesToConfiguration()
-
-
-
Constructor Detail
-
JobGraph
public JobGraph(String jobName)
Constructs a new job graph with the given name, the given ExecutionConfig, and a random job ID. The ExecutionConfig will be serialized and can't be modified afterwards.
- Parameters:
jobName - The name of the job.
-
JobGraph
public JobGraph(@Nullable JobID jobId, String jobName)
Constructs a new job graph with the given job ID (or a random ID, if null is passed), the given name and the given execution configuration (see ExecutionConfig). The ExecutionConfig will be serialized and can't be modified afterwards.
- Parameters:
jobId - The id of the job. A random ID is generated, if null is passed.
jobName - The name of the job.
-
JobGraph
public JobGraph(@Nullable JobID jobId, String jobName, JobVertex... vertices)
Constructs a new job graph with the given name, the given ExecutionConfig, the given jobId or a random one if null supplied, and the given job vertices. The ExecutionConfig will be serialized and can't be modified afterwards.
- Parameters:
jobId - The id of the job. A random ID is generated, if null is passed.
jobName - The name of the job.
vertices - The vertices to add to the graph.
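The "given ID, or a random one if null is passed" rule can be illustrated with a small stand-alone sketch. It is only a model of the documented contract: JobHandleSketch is a hypothetical class, and java.util.UUID stands in for JobID.

```java
import java.util.UUID;

/** Hypothetical stand-in for a job handle; not part of Flink. */
final class JobHandleSketch {
    private final UUID id;
    private final String name;

    JobHandleSketch(UUID id, String name) {
        // Fall back to a freshly generated random ID when the caller passes null.
        this.id = (id == null) ? UUID.randomUUID() : id;
        this.name = name;
    }

    UUID getId() {
        return id;
    }

    String getName() {
        return name;
    }

    public static void main(String[] args) {
        UUID fixed = UUID.fromString("00000000-0000-0000-0000-000000000001");
        JobHandleSketch explicit = new JobHandleSketch(fixed, "wordcount");
        JobHandleSketch generated = new JobHandleSketch(null, "wordcount");
        System.out.println(explicit.getId().equals(fixed)); // the given ID is kept
        System.out.println(generated.getId() != null);      // a random ID was generated
    }
}
```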
-
-
Method Detail
-
getJobID
public JobID getJobID()
Returns the ID of the job.
- Returns:
- the ID of the job
-
setJobID
public void setJobID(JobID jobID)
Sets the ID of the job.
-
getName
public String getName()
Returns the name assigned to the job graph.
- Returns:
- the name assigned to the job graph
-
setJobConfiguration
public void setJobConfiguration(Configuration jobConfiguration)
-
getJobConfiguration
public Configuration getJobConfiguration()
Returns the configuration object for this job. Job-wide parameters should be set into that configuration object.
- Returns:
- The configuration object for this job.
-
getSerializedExecutionConfig
public SerializedValue<ExecutionConfig> getSerializedExecutionConfig()
Returns the ExecutionConfig in its serialized form.
- Returns:
- The serialized ExecutionConfig
-
setJobType
public void setJobType(JobType type)
-
getJobType
public JobType getJobType()
-
setDynamic
public void setDynamic(boolean dynamic)
-
isDynamic
public boolean isDynamic()
-
enableApproximateLocalRecovery
public void enableApproximateLocalRecovery(boolean enabled)
-
isApproximateLocalRecoveryEnabled
public boolean isApproximateLocalRecoveryEnabled()
-
setSavepointRestoreSettings
public void setSavepointRestoreSettings(SavepointRestoreSettings settings)
Sets the savepoint restore settings.
- Parameters:
settings - The savepoint restore settings.
-
getSavepointRestoreSettings
public SavepointRestoreSettings getSavepointRestoreSettings()
Returns the configured savepoint restore settings.
- Returns:
- The configured savepoint restore settings.
-
setExecutionConfig
public void setExecutionConfig(ExecutionConfig executionConfig) throws IOException
Sets the execution config. This method eagerly serializes the ExecutionConfig for future RPC transport. Further modification of the referenced ExecutionConfig object will not affect this serialized copy.
- Parameters:
executionConfig - The ExecutionConfig to be serialized.
- Throws:
IOException - Thrown if the serialization of the ExecutionConfig fails
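The effect of eager serialization can be shown with plain Java serialization. This is a minimal sketch of the idea, not Flink's implementation: ToyConfig and Holder are hypothetical types standing in for ExecutionConfig and the job graph, and the standard ObjectOutputStream replaces whatever serializer Flink uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class EagerSerializationSketch {

    /** Hypothetical config type standing in for ExecutionConfig. */
    static final class ToyConfig implements Serializable {
        private static final long serialVersionUID = 1L;
        int parallelism;

        ToyConfig(int parallelism) {
            this.parallelism = parallelism;
        }
    }

    /** Keeps only the serialized bytes, so later changes to the source object are not seen. */
    static final class Holder {
        private byte[] serializedConfig;

        void setConfig(ToyConfig config) throws IOException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(config); // snapshot taken here, at set time
            }
            serializedConfig = bytes.toByteArray();
        }

        ToyConfig readConfig() throws IOException, ClassNotFoundException {
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(serializedConfig))) {
                return (ToyConfig) in.readObject();
            }
        }
    }

    /** Returns the parallelism the holder sees after the caller mutates the original object. */
    static int parallelismAfterMutation() {
        try {
            ToyConfig config = new ToyConfig(4);
            Holder holder = new Holder();
            holder.setConfig(config);
            config.parallelism = 16; // mutation after the snapshot
            return holder.readConfig().parallelism;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parallelismAfterMutation()); // prints 4
    }
}
```

In this model, a caller who wants a later change to take effect has to call setConfig again with the modified object.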
-
addVertex
public void addVertex(JobVertex vertex)
Adds a new task vertex to the job graph if it is not already included.
- Parameters:
vertex - the new task vertex to be added
-
getVertices
public Iterable<JobVertex> getVertices()
Returns an Iterable to iterate all vertices registered with the job graph.
- Returns:
- an Iterable to iterate all vertices registered with the job graph
-
getVerticesAsArray
public JobVertex[] getVerticesAsArray()
Returns an array of all job vertices that are registered with the job graph. The order in which the vertices appear in the array is not defined.
- Returns:
- an array of all job vertices that are registered with the job graph
-
getNumberOfVertices
public int getNumberOfVertices()
Returns the number of all vertices.
- Returns:
- The number of all vertices.
-
getSlotSharingGroups
public Set<SlotSharingGroup> getSlotSharingGroups()
-
getCoLocationGroups
public Set<CoLocationGroup> getCoLocationGroups()
Returns all CoLocationGroup instances associated with this JobGraph.
- Returns:
- The associated CoLocationGroup instances.
-
setSnapshotSettings
public void setSnapshotSettings(JobCheckpointingSettings settings)
Sets the settings for asynchronous snapshots. A value of null means that snapshotting is not enabled.
- Parameters:
settings - The snapshot settings
-
getCheckpointingSettings
public JobCheckpointingSettings getCheckpointingSettings()
Gets the settings for asynchronous snapshots. This method returns null when checkpointing is not enabled.
- Returns:
- The snapshot settings
-
isCheckpointingEnabled
public boolean isCheckpointingEnabled()
Checks whether checkpointing is enabled for this job graph.
- Returns:
- true if checkpointing is enabled
-
findVertexByID
public JobVertex findVertexByID(JobVertexID id)
Searches for a vertex with a matching ID and returns it.
- Parameters:
id - the ID of the vertex to search for
- Returns:
- the vertex with the matching ID, or null if no vertex with such an ID could be found
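Because the lookup returns null when nothing matches, callers need a null check before using the result. The sketch below shows one way to handle that on the caller side with Optional. NullableLookupSketch is a hypothetical class, and plain strings stand in for JobVertexID and JobVertex.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class NullableLookupSketch {
    // Hypothetical registry: vertex ID -> vertex name.
    private final Map<String, String> vertexNamesById = new HashMap<>();

    void register(String id, String name) {
        vertexNamesById.put(id, name);
    }

    /** Mirrors the documented contract: the match, or null when nothing matches. */
    String findById(String id) {
        return vertexNamesById.get(id);
    }

    /** Caller-side helper that turns the nullable result into an Optional. */
    Optional<String> lookup(String id) {
        return Optional.ofNullable(findById(id));
    }

    public static void main(String[] args) {
        NullableLookupSketch registry = new NullableLookupSketch();
        registry.register("v1", "source");
        System.out.println(registry.lookup("v1").orElse("<not found>"));      // prints source
        System.out.println(registry.lookup("missing").orElse("<not found>")); // prints <not found>
    }
}
```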
-
setClasspaths
public void setClasspaths(List<URL> paths)
Sets the classpaths required to run the job on a task manager.
- Parameters:
paths - paths of the directories/JAR files required to run the job on a task manager
-
getMaximumParallelism
public int getMaximumParallelism()
Gets the maximum parallelism of all operations in this job graph.
- Returns:
- The maximum parallelism of this job graph
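Read literally, this is the largest parallelism found among the graph's vertices. A minimal sketch of that reduction follows, using a hypothetical list of per-vertex parallelism values. Returning -1 for an empty graph is an assumption of the sketch, not something this page states.

```java
import java.util.List;

final class MaxParallelismSketch {

    /** Returns the largest value in the list, or -1 (sketch assumption) if the list is empty. */
    static int maximumParallelism(List<Integer> vertexParallelisms) {
        int max = -1;
        for (int parallelism : vertexParallelisms) {
            max = Math.max(max, parallelism);
        }
        return max;
    }

    public static void main(String[] args) {
        // Three hypothetical vertices with parallelism 2, 8 and 4.
        System.out.println(maximumParallelism(List.of(2, 8, 4))); // prints 8
    }
}
```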
-
getVerticesSortedTopologicallyFromSources
public List<JobVertex> getVerticesSortedTopologicallyFromSources() throws InvalidProgramException
- Throws:
InvalidProgramException
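The method name suggests an ordering in which every vertex appears after all of its inputs, starting from the sources. How JobGraph computes that order is not described on this page. The sketch below uses Kahn's algorithm on a hypothetical adjacency map of vertex names, and throws IllegalStateException on a cycle as a stand-in for InvalidProgramException.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TopologicalSortSketch {

    /** Sorts vertices so that each one follows all of its inputs; edges maps a vertex to its downstream vertices. */
    static List<String> sortFromSources(Map<String, List<String>> edges) {
        // Count the incoming edges of every vertex.
        Map<String, Integer> remainingInputs = new LinkedHashMap<>();
        for (String vertex : edges.keySet()) {
            remainingInputs.put(vertex, 0);
        }
        for (List<String> targets : edges.values()) {
            for (String target : targets) {
                remainingInputs.merge(target, 1, Integer::sum);
            }
        }

        // Sources are the vertices without inputs.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : remainingInputs.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> sorted = new ArrayList<>();
        while (!ready.isEmpty()) {
            String vertex = ready.remove();
            sorted.add(vertex);
            for (String target : edges.getOrDefault(vertex, List.of())) {
                // A vertex becomes ready once all of its inputs have been emitted.
                if (remainingInputs.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }

        if (sorted.size() != remainingInputs.size()) {
            throw new IllegalStateException("The graph contains a cycle.");
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = new LinkedHashMap<>();
        edges.put("source", List.of("map", "sink"));
        edges.put("map", List.of("sink"));
        edges.put("sink", List.of());
        System.out.println(sortFromSources(edges)); // prints [source, map, sink]
    }
}
```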
-
addJar
public void addJar(Path jar)
Adds the path of a JAR file required to run the job on a task manager.
- Parameters:
jar - path of the JAR file required to run the job on a task manager
-
addJars
public void addJars(List<URL> jarFilesToAttach)
Adds the given jar files to the JobGraph via addJar(org.apache.flink.core.fs.Path).
- Parameters:
jarFilesToAttach - a list of the URLs of the jar files to attach to the job graph.
- Throws:
RuntimeException - if a jar URL is not valid.
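The pattern described here, converting each URL and surfacing an invalid one as an unchecked exception, can be sketched with the standard library alone. JarUrlSketch and its methods are hypothetical, and java.net.URI stands in for Flink's Path. Whether the real method performs the conversion this way is an assumption.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

final class JarUrlSketch {

    /** Converts each URL to a URI, rethrowing a syntax problem as an unchecked exception. */
    static List<URI> toUris(List<URL> jarUrls) {
        List<URI> uris = new ArrayList<>();
        for (URL url : jarUrls) {
            try {
                uris.add(url.toURI());
            } catch (URISyntaxException e) {
                throw new RuntimeException("URL is invalid: " + url, e);
            }
        }
        return uris;
    }

    /** Small helper so callers do not have to handle the checked MalformedURLException. */
    static URL fileUrl(String uri) {
        try {
            return URI.create(uri).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        List<URL> jars = List.of(fileUrl("file:///tmp/job.jar"));
        for (URI uri : toUris(jars)) {
            System.out.println(uri.getPath()); // prints /tmp/job.jar
        }
    }
}
```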
-
getUserJars
public List<Path> getUserJars()
Gets the list of assigned user jar paths.
- Returns:
- The list of assigned user jar paths
-
addUserArtifact
public void addUserArtifact(String name, DistributedCache.DistributedCacheEntry file)
Adds the path of a custom file required to run the job on a task manager.
- Parameters:
name - a name under which this artifact will be accessible through DistributedCache
file - path of a custom file required to run the job on a task manager
-
getUserArtifacts
public Map<String,DistributedCache.DistributedCacheEntry> getUserArtifacts()
Gets the map of assigned user artifacts, keyed by artifact name.
- Returns:
- The map of assigned user artifacts
-
addUserJarBlobKey
public void addUserJarBlobKey(PermanentBlobKey key)
Adds the BLOB referenced by the key to the JobGraph's dependencies.
- Parameters:
key - key of the BLOB that holds a JAR file required to run the job on a task manager
-
hasUsercodeJarFiles
public boolean hasUsercodeJarFiles()
Checks whether the JobGraph has user code JAR files attached.
- Returns:
- True, if the JobGraph has user code JAR files attached, false otherwise.
-
getUserJarBlobKeys
public List<PermanentBlobKey> getUserJarBlobKeys()
Returns a list of BLOB keys referring to the JAR files required to run this job.
- Returns:
- list of BLOB keys referring to the JAR files required to run this job
-
setUserArtifactBlobKey
public void setUserArtifactBlobKey(String entryName, PermanentBlobKey blobKey) throws IOException
- Throws:
IOException
-
setUserArtifactRemotePath
public void setUserArtifactRemotePath(String entryName, String remotePath)
-
writeUserArtifactEntriesToConfiguration
public void writeUserArtifactEntriesToConfiguration()
-
setJobStatusHooks
public void setJobStatusHooks(List<JobStatusHook> hooks)
-
getJobStatusHooks
public List<JobStatusHook> getJobStatusHooks()
-
setInitialClientHeartbeatTimeout
public void setInitialClientHeartbeatTimeout(long initialClientHeartbeatTimeout)
-
getInitialClientHeartbeatTimeout
public long getInitialClientHeartbeatTimeout()
-
-