public class ActiveResourceManager<WorkerType extends ResourceIDRetrievable> extends ResourceManager<WorkerType> implements ResourceEventHandler<WorkerType>
ResourceManager
.
This resource manager actively requests and releases resources from/to the external resource
management frameworks. With different ResourceManagerDriver
provided, this resource
manager can work with various frameworks.
RpcEndpoint.MainThreadExecutor
Modifier and Type | Field and Description |
---|---|
protected Configuration |
flinkConfig |
blocklistHandler, ioExecutor, RESOURCE_MANAGER_NAME, resourceManagerMetricGroup
log, rpcServer
Constructor and Description |
---|
ActiveResourceManager(ResourceManagerDriver<WorkerType> resourceManagerDriver,
Configuration flinkConfig,
RpcService rpcService,
UUID leaderSessionId,
ResourceID resourceId,
HeartbeatServices heartbeatServices,
DelegationTokenManager delegationTokenManager,
SlotManager slotManager,
ResourceManagerPartitionTrackerFactory clusterPartitionTrackerFactory,
BlocklistHandler.Factory blocklistHandlerFactory,
JobLeaderIdService jobLeaderIdService,
ClusterInformation clusterInformation,
FatalErrorHandler fatalErrorHandler,
ResourceManagerMetricGroup resourceManagerMetricGroup,
ThresholdMeter startWorkerFailureRater,
Duration retryInterval,
Duration workerRegistrationTimeout,
Duration previousWorkerRecoverTimeout,
Executor ioExecutor) |
Modifier and Type | Method and Description |
---|---|
void |
declareResourceNeeded(Collection<ResourceDeclaration> resourceDeclarations) |
CompletableFuture<Void> |
getReadyToServeFuture()
Get the ready to serve future of the resource manager.
|
protected ResourceAllocator |
getResourceAllocator() |
protected Optional<WorkerType> |
getWorkerNodeIfAcceptRegistration(ResourceID resourceID)
Get worker node if the worker resource is accepted.
|
protected void |
initialize()
Initializes the framework specific components.
|
protected void |
internalDeregisterApplication(ApplicationStatus finalStatus,
String optionalDiagnostics)
The framework specific code to deregister the application.
|
void |
onError(Throwable exception)
Notifies that an error has occurred that the process cannot proceed.
|
void |
onPreviousAttemptWorkersRecovered(Collection<WorkerType> recoveredWorkers)
Notifies that workers of previous attempt have been recovered from the external resource
manager.
|
protected void |
onWorkerRegistered(WorkerType worker,
WorkerResourceSpec workerResourceSpec) |
void |
onWorkerTerminated(ResourceID resourceId,
String diagnostics)
Notifies that the worker has been terminated.
|
protected void |
registerMetrics() |
void |
requestNewWorker(WorkerResourceSpec workerResourceSpec)
Allocates a resource using the worker resource specification.
|
protected void |
terminate()
Terminates the framework specific components.
|
closeJobManagerConnection, closeTaskManagerConnection, declareRequiredResources, deregisterApplication, disconnectJobManager, disconnectTaskManager, getClusterPartitionsShuffleDescriptors, getInstanceIdByResourceId, getNumberOfRegisteredTaskManagers, getStartedFuture, getWorkerByInstanceId, heartbeatFromJobManager, heartbeatFromTaskManager, jobLeaderLostLeadership, listDataSets, notifyNewBlockedNodes, notifySlotAvailable, onFatalError, onNewTokensObtained, onStart, onStop, registerJobMaster, registerTaskExecutor, releaseClusterPartitions, removeJob, reportClusterPartitions, requestProfiling, requestResourceOverview, requestTaskExecutorThreadInfoGateway, requestTaskManagerDetailsInfo, requestTaskManagerFileUploadByNameAndType, requestTaskManagerFileUploadByType, requestTaskManagerInfo, requestTaskManagerLogList, requestTaskManagerMetricQueryServiceAddresses, requestTaskManagerProfilingList, requestThreadDump, sendSlotReport, setFailUnfulfillableRequest, stopWorkerIfSupported
getFencingToken
callAsync, closeAsync, getAddress, getEndpointId, getHostname, getMainThreadExecutor, getMainThreadExecutor, getRpcService, getSelfGateway, getTerminationFuture, internalCallOnStart, internalCallOnStop, isRunning, registerResource, runAsync, scheduleRunAsync, scheduleRunAsync, start, stop, unregisterResource, validateRunsInMainThread
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getFencingToken
getAddress, getHostname
close
protected final Configuration flinkConfig
public ActiveResourceManager(ResourceManagerDriver<WorkerType> resourceManagerDriver, Configuration flinkConfig, RpcService rpcService, UUID leaderSessionId, ResourceID resourceId, HeartbeatServices heartbeatServices, DelegationTokenManager delegationTokenManager, SlotManager slotManager, ResourceManagerPartitionTrackerFactory clusterPartitionTrackerFactory, BlocklistHandler.Factory blocklistHandlerFactory, JobLeaderIdService jobLeaderIdService, ClusterInformation clusterInformation, FatalErrorHandler fatalErrorHandler, ResourceManagerMetricGroup resourceManagerMetricGroup, ThresholdMeter startWorkerFailureRater, Duration retryInterval, Duration workerRegistrationTimeout, Duration previousWorkerRecoverTimeout, Executor ioExecutor)
protected void initialize() throws ResourceManagerException
ResourceManager
initialize
in class ResourceManager<WorkerType extends ResourceIDRetrievable>
ResourceManagerException
- which occurs during initialization and causes the resource
manager to fail.protected void terminate() throws ResourceManagerException
ResourceManager
terminate
in class ResourceManager<WorkerType extends ResourceIDRetrievable>
ResourceManagerException
protected void internalDeregisterApplication(ApplicationStatus finalStatus, @Nullable String optionalDiagnostics) throws ResourceManagerException
ResourceManager
This method also needs to make sure all pending containers that are not registered yet are returned.
internalDeregisterApplication
in class ResourceManager<WorkerType extends ResourceIDRetrievable>
finalStatus
- The application status to report.optionalDiagnostics
- A diagnostics message or null
.ResourceManagerException
- if the application could not be shut down.protected Optional<WorkerType> getWorkerNodeIfAcceptRegistration(ResourceID resourceID)
ResourceManager
getWorkerNodeIfAcceptRegistration
in class ResourceManager<WorkerType extends ResourceIDRetrievable>
resourceID
- The worker resource id@VisibleForTesting public void declareResourceNeeded(Collection<ResourceDeclaration> resourceDeclarations)
protected void onWorkerRegistered(WorkerType worker, WorkerResourceSpec workerResourceSpec)
onWorkerRegistered
in class ResourceManager<WorkerType extends ResourceIDRetrievable>
protected void registerMetrics()
registerMetrics
in class ResourceManager<WorkerType extends ResourceIDRetrievable>
public void onPreviousAttemptWorkersRecovered(Collection<WorkerType> recoveredWorkers)
ResourceEventHandler
onPreviousAttemptWorkersRecovered
in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
recoveredWorkers
- Collection of worker nodes, in the deployment specific type.public void onWorkerTerminated(ResourceID resourceId, String diagnostics)
ResourceEventHandler
onWorkerTerminated
in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
resourceId
- Identifier of the terminated worker.diagnostics
- Diagnostic message about the worker termination.public void onError(Throwable exception)
ResourceEventHandler
onError
in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
exception
- Exception that describes the error.@VisibleForTesting public void requestNewWorker(WorkerResourceSpec workerResourceSpec)
workerResourceSpec
- workerResourceSpec specifies the size of the to be allocated
resourcepublic CompletableFuture<Void> getReadyToServeFuture()
ResourceManager
getReadyToServeFuture
in class ResourceManager<WorkerType extends ResourceIDRetrievable>
protected ResourceAllocator getResourceAllocator()
getResourceAllocator
in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.