public abstract class Dispatcher extends FencedRpcEndpoint<DispatcherId> implements DispatcherGateway
Modifier and Type | Class and Description |
---|---|
protected static class |
Dispatcher.ExecutionType
Enum to distinguish between initial job submission and re-submission for recovery.
|
RpcEndpoint.MainThreadExecutor
Modifier and Type | Field and Description |
---|---|
static ConfigOption<Duration> |
CLIENT_ALIVENESS_CHECK_DURATION |
static String |
DISPATCHER_NAME |
protected CompletableFuture<ApplicationStatus> |
shutDownFuture |
log, rpcServer
Modifier | Constructor and Description |
---|---|
|
Dispatcher(RpcService rpcService,
DispatcherId fencingToken,
Collection<JobGraph> recoveredJobs,
Collection<JobResult> recoveredDirtyJobs,
DispatcherBootstrapFactory dispatcherBootstrapFactory,
DispatcherServices dispatcherServices) |
protected |
Dispatcher(RpcService rpcService,
DispatcherId fencingToken,
Collection<JobGraph> recoveredJobs,
Collection<JobResult> recoveredDirtyJobs,
DispatcherBootstrapFactory dispatcherBootstrapFactory,
DispatcherServices dispatcherServices,
JobManagerRunnerRegistry jobManagerRunnerRegistry,
ResourceCleanerFactory resourceCleanerFactory) |
Modifier and Type | Method and Description |
---|---|
CompletableFuture<Acknowledge> |
cancelJob(JobID jobId,
Time timeout)
Cancel the given job.
|
CompletableFuture<CoordinationResponse> |
deliverCoordinationRequestToCoordinator(JobID jobId,
OperatorID operatorId,
SerializedValue<CoordinationRequest> serializedRequest,
Time timeout)
Deliver a coordination request to a specified coordinator and return the response.
|
CompletableFuture<Acknowledge> |
disposeSavepoint(String savepointPath,
Time timeout)
Dispose the given savepoint.
|
CompletableFuture<Integer> |
getBlobServerPort(Time timeout)
Returns the port of the blob server.
|
CompletableFuture<ApplicationStatus> |
getShutDownFuture() |
CompletableFuture<OperationResult<Long>> |
getTriggeredCheckpointStatus(AsynchronousJobOperationKey operationKey)
Get the status of a checkpoint triggered under the specified operation key.
|
CompletableFuture<OperationResult<String>> |
getTriggeredSavepointStatus(AsynchronousJobOperationKey operationKey)
Get the status of a savepoint triggered under the specified operation key.
|
protected CompletableFuture<org.apache.flink.runtime.dispatcher.Dispatcher.CleanupJobState> |
jobReachedTerminalState(ExecutionGraphInfo executionGraphInfo) |
CompletableFuture<Collection<JobID>> |
listJobs(Time timeout)
List the current set of submitted jobs.
|
protected void |
onFatalError(Throwable throwable) |
CompletableFuture<Void> |
onRemovedJobGraph(JobID jobId) |
void |
onStart()
User overridable callback which is called from
RpcEndpoint.internalCallOnStart() . |
CompletableFuture<Void> |
onStop()
User overridable callback which is called from
RpcEndpoint.internalCallOnStop() . |
CompletableFuture<Void> |
reportJobClientHeartbeat(JobID jobId,
long expiredTimestamp,
Time timeout)
The client reports the heartbeat to the dispatcher for aliveness.
|
CompletableFuture<CheckpointStatsSnapshot> |
requestCheckpointStats(JobID jobId,
Time timeout)
Requests the
CheckpointStatsSnapshot containing checkpointing information. |
CompletableFuture<ClusterOverview> |
requestClusterOverview(Time timeout)
Requests the cluster status overview.
|
CompletableFuture<ExecutionGraphInfo> |
requestExecutionGraphInfo(JobID jobId,
Time timeout)
Requests the
ExecutionGraphInfo containing additional information besides the ArchivedExecutionGraph . |
CompletableFuture<JobResourceRequirements> |
requestJobResourceRequirements(JobID jobId)
Read current
job resource requirements for a given job. |
CompletableFuture<JobResult> |
requestJobResult(JobID jobId,
Time timeout)
Requests the
JobResult of a job specified by the given jobId. |
CompletableFuture<JobStatus> |
requestJobStatus(JobID jobId,
Time timeout)
Request the
JobStatus of the given job. |
CompletableFuture<Collection<String>> |
requestMetricQueryServiceAddresses(Time timeout)
Requests the addresses of the
MetricQueryService to query. |
CompletableFuture<MultipleJobsDetails> |
requestMultipleJobDetails(Time timeout)
Requests job details currently being executed on the Flink cluster.
|
CompletableFuture<Collection<Tuple2<ResourceID,String>>> |
requestTaskManagerMetricQueryServiceAddresses(Time timeout)
Requests the addresses for the TaskManagers'
MetricQueryService to query. |
CompletableFuture<ThreadDumpInfo> |
requestThreadDump(Time timeout)
Requests the thread dump from the JobManager.
|
protected void |
runPostJobGloballyTerminated(JobID jobId,
JobStatus jobStatus) |
CompletableFuture<Acknowledge> |
shutDownCluster() |
CompletableFuture<Acknowledge> |
shutDownCluster(ApplicationStatus applicationStatus) |
CompletableFuture<Acknowledge> |
stopWithSavepoint(AsynchronousJobOperationKey operationKey,
String targetDirectory,
SavepointFormatType formatType,
TriggerSavepointMode savepointMode,
Time timeout)
Stops the job with a savepoint, returning a future that completes when the operation is
started.
|
CompletableFuture<String> |
stopWithSavepointAndGetLocation(JobID jobId,
String targetDirectory,
SavepointFormatType formatType,
TriggerSavepointMode savepointMode,
Time timeout)
Stops the job with a savepoint, returning a future that completes with the savepoint location
when the savepoint is completed.
|
CompletableFuture<Acknowledge> |
submitFailedJob(JobID jobId,
String jobName,
Throwable exception) |
CompletableFuture<Acknowledge> |
submitJob(JobGraph jobGraph,
Time timeout)
Submit a job to the dispatcher.
|
CompletableFuture<Acknowledge> |
triggerCheckpoint(AsynchronousJobOperationKey operationKey,
CheckpointType checkpointType,
Time timeout)
Triggers a checkpoint with the given savepoint directory as a target.
|
CompletableFuture<String> |
triggerCheckpoint(JobID jobID,
Time timeout) |
CompletableFuture<Acknowledge> |
triggerSavepoint(AsynchronousJobOperationKey operationKey,
String targetDirectory,
SavepointFormatType formatType,
TriggerSavepointMode savepointMode,
Time timeout)
Triggers a savepoint with the given savepoint directory as a target, returning a future that
completes when the operation is started.
|
CompletableFuture<String> |
triggerSavepointAndGetLocation(JobID jobId,
String targetDirectory,
SavepointFormatType formatType,
TriggerSavepointMode savepointMode,
Time timeout)
Triggers a savepoint with the given savepoint directory as a target, returning a future that
completes with the savepoint location when it is complete.
|
CompletableFuture<Acknowledge> |
updateJobResourceRequirements(JobID jobId,
JobResourceRequirements jobResourceRequirements)
Update
job resource requirements for a given job. |
getFencingToken
callAsync, closeAsync, getAddress, getEndpointId, getHostname, getMainThreadExecutor, getRpcService, getSelfGateway, getTerminationFuture, internalCallOnStart, internalCallOnStop, isRunning, registerResource, runAsync, scheduleRunAsync, scheduleRunAsync, start, stop, unregisterResource, validateRunsInMainThread
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getFencingToken
requestJob
getAddress, getHostname
close
@VisibleForTesting public static final ConfigOption<Duration> CLIENT_ALIVENESS_CHECK_DURATION
public static final String DISPATCHER_NAME
protected final CompletableFuture<ApplicationStatus> shutDownFuture
public Dispatcher(RpcService rpcService, DispatcherId fencingToken, Collection<JobGraph> recoveredJobs, Collection<JobResult> recoveredDirtyJobs, DispatcherBootstrapFactory dispatcherBootstrapFactory, DispatcherServices dispatcherServices) throws Exception
Exception
@VisibleForTesting protected Dispatcher(RpcService rpcService, DispatcherId fencingToken, Collection<JobGraph> recoveredJobs, Collection<JobResult> recoveredDirtyJobs, DispatcherBootstrapFactory dispatcherBootstrapFactory, DispatcherServices dispatcherServices, JobManagerRunnerRegistry jobManagerRunnerRegistry, ResourceCleanerFactory resourceCleanerFactory) throws Exception
Exception
public CompletableFuture<ApplicationStatus> getShutDownFuture()
public void onStart() throws Exception
RpcEndpoint
RpcEndpoint.internalCallOnStart()
.
This method is called when the RpcEndpoint is being started. The method is guaranteed to be executed in the main thread context and can be used to start the rpc endpoint in the context of the rpc endpoint's main thread.
IMPORTANT: This method should never be called directly by the user.
onStart
in class RpcEndpoint
Exception
- indicating that the rpc endpoint could not be started. If an exception
occurs, then the rpc endpoint will automatically terminate.public CompletableFuture<Void> onStop()
RpcEndpoint
RpcEndpoint.internalCallOnStop()
.
This method is called when the RpcEndpoint is being shut down. The method is guaranteed to be executed in the main thread context and can be used to clean up internal state.
IMPORTANT: This method should never be called directly by the user.
onStop
in class RpcEndpoint
public CompletableFuture<Acknowledge> submitJob(JobGraph jobGraph, Time timeout)
DispatcherGateway
submitJob
in interface DispatcherGateway
jobGraph
- JobGraph to submittimeout
- RPC timeoutpublic CompletableFuture<Acknowledge> submitFailedJob(JobID jobId, String jobName, Throwable exception)
submitFailedJob
in interface DispatcherGateway
public CompletableFuture<Collection<JobID>> listJobs(Time timeout)
DispatcherGateway
listJobs
in interface DispatcherGateway
timeout
- RPC timeoutpublic CompletableFuture<Acknowledge> disposeSavepoint(String savepointPath, Time timeout)
RestfulGateway
disposeSavepoint
in interface RestfulGateway
savepointPath
- identifying the savepoint to disposetimeout
- RPC timeoutpublic CompletableFuture<Acknowledge> cancelJob(JobID jobId, Time timeout)
RestfulGateway
cancelJob
in interface RestfulGateway
jobId
- identifying the job to canceltimeout
- of the operationpublic CompletableFuture<ClusterOverview> requestClusterOverview(Time timeout)
RestfulGateway
requestClusterOverview
in interface RestfulGateway
timeout
- for the asynchronous operationpublic CompletableFuture<MultipleJobsDetails> requestMultipleJobDetails(Time timeout)
RestfulGateway
requestMultipleJobDetails
in interface RestfulGateway
timeout
- for the asynchronous operationpublic CompletableFuture<JobStatus> requestJobStatus(JobID jobId, Time timeout)
RestfulGateway
JobStatus
of the given job.requestJobStatus
in interface RestfulGateway
jobId
- identifying the job for which to retrieve the JobStatustimeout
- for the asynchronous operationJobStatus
of the given jobpublic CompletableFuture<ExecutionGraphInfo> requestExecutionGraphInfo(JobID jobId, Time timeout)
RestfulGateway
ExecutionGraphInfo
containing additional information besides the ArchivedExecutionGraph
. If there is no such graph, then the future is completed with a
FlinkJobNotFoundException
.requestExecutionGraphInfo
in interface RestfulGateway
jobId
- identifying the job whose ExecutionGraphInfo
is requestedtimeout
- for the asynchronous operationExecutionGraphInfo
for the given jobId, otherwise
FlinkJobNotFoundException
public CompletableFuture<CheckpointStatsSnapshot> requestCheckpointStats(JobID jobId, Time timeout)
RestfulGateway
CheckpointStatsSnapshot
containing checkpointing information.requestCheckpointStats
in interface RestfulGateway
jobId
- identifying the job whose CheckpointStatsSnapshot
is requestedtimeout
- for the asynchronous operationCheckpointStatsSnapshot
for the given jobIdpublic CompletableFuture<JobResult> requestJobResult(JobID jobId, Time timeout)
RestfulGateway
JobResult
of a job specified by the given jobId.requestJobResult
in interface RestfulGateway
jobId
- identifying the job for which to retrieve the JobResult
.timeout
- for the asynchronous operationJobResult
once the job has finishedpublic CompletableFuture<Collection<String>> requestMetricQueryServiceAddresses(Time timeout)
RestfulGateway
MetricQueryService
to query.requestMetricQueryServiceAddresses
in interface RestfulGateway
timeout
- for the asynchronous operationpublic CompletableFuture<Collection<Tuple2<ResourceID,String>>> requestTaskManagerMetricQueryServiceAddresses(Time timeout)
RestfulGateway
MetricQueryService
to query.requestTaskManagerMetricQueryServiceAddresses
in interface RestfulGateway
timeout
- for the asynchronous operationpublic CompletableFuture<ThreadDumpInfo> requestThreadDump(Time timeout)
RestfulGateway
requestThreadDump
in interface RestfulGateway
timeout
- timeout of the asynchronous operationpublic CompletableFuture<Integer> getBlobServerPort(Time timeout)
DispatcherGateway
getBlobServerPort
in interface DispatcherGateway
timeout
- of the operationpublic CompletableFuture<String> triggerCheckpoint(JobID jobID, Time timeout)
triggerCheckpoint
in interface DispatcherGateway
public CompletableFuture<Acknowledge> triggerCheckpoint(AsynchronousJobOperationKey operationKey, CheckpointType checkpointType, Time timeout)
RestfulGateway
triggerCheckpoint
in interface RestfulGateway
operationKey
- the key of the operation, for deduplication purposescheckpointType
- checkpoint backup type (configured / full / incremental)timeout
- Timeout for the asynchronous operationexternal pointer
of
the savepoint.public CompletableFuture<OperationResult<Long>> getTriggeredCheckpointStatus(AsynchronousJobOperationKey operationKey)
RestfulGateway
getTriggeredCheckpointStatus
in interface RestfulGateway
operationKey
- key of the operationpublic CompletableFuture<Acknowledge> triggerSavepoint(AsynchronousJobOperationKey operationKey, String targetDirectory, SavepointFormatType formatType, TriggerSavepointMode savepointMode, Time timeout)
RestfulGateway
triggerSavepoint
in interface RestfulGateway
operationKey
- the key of the operation, for deduplication purposestargetDirectory
- Target directory for the savepoint.formatType
- Binary format of the savepoint.savepointMode
- context of the savepoint operationtimeout
- Timeout for the asynchronous operationpublic CompletableFuture<String> triggerSavepointAndGetLocation(JobID jobId, String targetDirectory, SavepointFormatType formatType, TriggerSavepointMode savepointMode, Time timeout)
DispatcherGateway
triggerSavepointAndGetLocation
in interface DispatcherGateway
jobId
- the job idtargetDirectory
- Target directory for the savepoint.formatType
- Binary format of the savepoint.savepointMode
- context of the savepoint operationtimeout
- Timeout for the asynchronous operationpublic CompletableFuture<OperationResult<String>> getTriggeredSavepointStatus(AsynchronousJobOperationKey operationKey)
RestfulGateway
getTriggeredSavepointStatus
in interface RestfulGateway
operationKey
- key of the operationpublic CompletableFuture<Acknowledge> stopWithSavepoint(AsynchronousJobOperationKey operationKey, String targetDirectory, SavepointFormatType formatType, TriggerSavepointMode savepointMode, Time timeout)
RestfulGateway
stopWithSavepoint
in interface RestfulGateway
operationKey
- key of the operation, for deduplicationtargetDirectory
- Target directory for the savepoint.formatType
- Binary format of the savepoint.savepointMode
- context of the savepoint operationtimeout
- for the rpc callpublic CompletableFuture<String> stopWithSavepointAndGetLocation(JobID jobId, String targetDirectory, SavepointFormatType formatType, TriggerSavepointMode savepointMode, Time timeout)
DispatcherGateway
stopWithSavepointAndGetLocation
in interface DispatcherGateway
jobId
- the job idtargetDirectory
- Target directory for the savepoint.savepointMode
- context of the savepoint operationtimeout
- for the rpc callpublic CompletableFuture<Acknowledge> shutDownCluster()
shutDownCluster
in interface RestfulGateway
public CompletableFuture<Acknowledge> shutDownCluster(ApplicationStatus applicationStatus)
shutDownCluster
in interface DispatcherGateway
public CompletableFuture<CoordinationResponse> deliverCoordinationRequestToCoordinator(JobID jobId, OperatorID operatorId, SerializedValue<CoordinationRequest> serializedRequest, Time timeout)
RestfulGateway
deliverCoordinationRequestToCoordinator
in interface RestfulGateway
jobId
- identifying the job which the coordinator belongs tooperatorId
- identifying the coordinator to receive the requestserializedRequest
- serialized request to delivertimeout
- RPC timeoutFlinkException
if the task is not running, or no
operator/coordinator exists for the given ID, or the coordinator cannot handle client
events.public CompletableFuture<Void> reportJobClientHeartbeat(JobID jobId, long expiredTimestamp, Time timeout)
RestfulGateway
reportJobClientHeartbeat
in interface RestfulGateway
public CompletableFuture<JobResourceRequirements> requestJobResourceRequirements(JobID jobId)
RestfulGateway
job resource requirements
for a given job.requestJobResourceRequirements
in interface RestfulGateway
jobId
- job to read the resource requirements forpublic CompletableFuture<Acknowledge> updateJobResourceRequirements(JobID jobId, JobResourceRequirements jobResourceRequirements)
RestfulGateway
job resource requirements
for a given job. When the
returned future is complete the requirements have been updated and were persisted in HA, but
the job may not have been rescaled (yet).updateJobResourceRequirements
in interface RestfulGateway
jobId
- job the given requirements belong tojobResourceRequirements
- new resource requirements for the jobprotected void runPostJobGloballyTerminated(JobID jobId, JobStatus jobStatus)
protected void onFatalError(Throwable throwable)
@VisibleForTesting protected CompletableFuture<org.apache.flink.runtime.dispatcher.Dispatcher.CleanupJobState> jobReachedTerminalState(ExecutionGraphInfo executionGraphInfo)
public CompletableFuture<Void> onRemovedJobGraph(JobID jobId)
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.