Class AdaptiveBatchScheduler
- java.lang.Object
-
- org.apache.flink.runtime.scheduler.SchedulerBase
-
- org.apache.flink.runtime.scheduler.DefaultScheduler
-
- org.apache.flink.runtime.scheduler.adaptivebatch.AdaptiveBatchScheduler
-
- All Implemented Interfaces:
AutoCloseable
,CheckpointScheduling
,GlobalFailureHandler
,SchedulerNG
,SchedulerOperations
,AutoCloseableAsync
public class AdaptiveBatchScheduler extends DefaultScheduler
This scheduler decides the parallelism of JobVertex according to the data volume it consumes. A dynamically built up ExecutionGraph is used for this purpose.
-
-
Field Summary
-
Fields inherited from class org.apache.flink.runtime.scheduler.DefaultScheduler
executionDeployer, executionSlotAllocator, failoverStrategy, log, schedulingStrategy, shuffleMaster
-
Fields inherited from class org.apache.flink.runtime.scheduler.SchedulerBase
executionVertexVersioner, inputsLocationsRetriever, jobInfo, jobManagerJobMetricGroup, operatorCoordinatorHandler, stateLocationRetriever
-
-
Constructor Summary
Constructors Constructor Description AdaptiveBatchScheduler(org.slf4j.Logger log, JobGraph jobGraph, Executor ioExecutor, Configuration jobMasterConfiguration, Consumer<ComponentMainThreadExecutor> startUpAction, ScheduledExecutor delayExecutor, ClassLoader userCodeLoader, CheckpointsCleaner checkpointsCleaner, CheckpointRecoveryFactory checkpointRecoveryFactory, JobManagerJobMetricGroup jobManagerJobMetricGroup, SchedulingStrategyFactory schedulingStrategyFactory, FailoverStrategy.Factory failoverStrategyFactory, RestartBackoffTimeStrategy restartBackoffTimeStrategy, ExecutionOperations executionOperations, ExecutionVertexVersioner executionVertexVersioner, ExecutionSlotAllocatorFactory executionSlotAllocatorFactory, long initializationTimestamp, ComponentMainThreadExecutor mainThreadExecutor, JobStatusListener jobStatusListener, Collection<FailureEnricher> failureEnrichers, ExecutionGraphFactory executionGraphFactory, ShuffleMaster<?> shuffleMaster, Duration rpcTimeout, VertexParallelismAndInputInfosDecider vertexParallelismAndInputInfosDecider, int defaultMaxParallelism, BlocklistOperations blocklistOperations, JobManagerOptions.HybridPartitionDataConsumeConstraint hybridPartitionDataConsumeConstraint, Map<JobVertexID,ForwardGroup> forwardGroupsByJobVertexId, BatchJobRecoveryHandler jobRecoveryHandler)
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description void
allocateSlotsAndDeploy(List<ExecutionVertexID> verticesToDeploy)
Allocate slots and deploy the vertex when slots are returned.CompletableFuture<Void>
closeAsync()
Trigger the closing of the resource and return the corresponding close future.List<CompletableFuture<Integer>>
computeDynamicSourceParallelism()
static VertexParallelismStore
computeVertexParallelismStoreForDynamicGraph(Iterable<JobVertex> vertices, int defaultMaxParallelism)
Compute theVertexParallelismStore
for all given vertices in a dynamic graph, which will set defaults and ensure that the returned store contains valid parallelisms, with the configured default max parallelism.protected MarkPartitionFinishedStrategy
getMarkPartitionFinishedStrategy()
protected void
handleTaskFailure(Execution failedExecution, Throwable error)
void
initializeVerticesIfPossible()
protected void
maybeRestartTasks(FailureHandlingResult failureHandlingResult)
Modifies the vertices which need to be restarted.protected void
onTaskFailed(Execution execution)
protected void
onTaskFinished(Execution execution, IOMetrics ioMetrics)
protected void
resetForNewExecution(ExecutionVertexID executionVertexId)
protected void
resetForNewExecutions(Collection<ExecutionVertexID> vertices)
protected void
startSchedulingInternal()
-
Methods inherited from class org.apache.flink.runtime.scheduler.DefaultScheduler
cancelAllPendingSlotRequestsForVertex, cancelAllPendingSlotRequestsInternal, cancelExecution, createFailureHandlingResultSnapshot, getNumberOfRestarts, handleGlobalFailure, notifyCoordinatorsAboutTaskFailure, recordTaskFailure
-
Methods inherited from class org.apache.flink.runtime.scheduler.SchedulerBase
acknowledgeCheckpoint, archiveFromFailureHandlingResult, archiveGlobalFailure, cancel, computeVertexParallelismStore, computeVertexParallelismStore, computeVertexParallelismStore, computeVertexParallelismStore, declineCheckpoint, deliverCoordinationRequestToCoordinator, deliverOperatorEventToCoordinator, failJob, getDefaultMaxParallelism, getExceptionHistory, getExecutionGraph, getExecutionJobVertex, getExecutionVertex, getJobGraph, getJobId, getJobTerminationFuture, getMainThreadExecutor, getResultPartitionAvailabilityChecker, getSchedulingTopology, notifyEndOfData, notifyKvStateRegistered, notifyKvStateUnregistered, registerJobMetrics, reportCheckpointMetrics, reportInitializationMetrics, requestCheckpointStats, requestJob, requestJobStatus, requestKvStateLocation, requestNextInputSplit, requestPartitionState, restoreState, setGlobalFailureCause, startCheckpointScheduler, startScheduling, stopCheckpointScheduler, stopWithSavepoint, transitionExecutionGraphState, transitionToRunning, transitionToScheduled, triggerCheckpoint, triggerSavepoint, updateAccumulators, updateTaskExecutionState
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.util.AutoCloseableAsync
close
-
Methods inherited from interface org.apache.flink.runtime.scheduler.SchedulerNG
requestJobResourceRequirements, updateJobResourceRequirements, updateTaskExecutionState
-
-
-
-
Constructor Detail
-
AdaptiveBatchScheduler
public AdaptiveBatchScheduler(org.slf4j.Logger log, JobGraph jobGraph, Executor ioExecutor, Configuration jobMasterConfiguration, Consumer<ComponentMainThreadExecutor> startUpAction, ScheduledExecutor delayExecutor, ClassLoader userCodeLoader, CheckpointsCleaner checkpointsCleaner, CheckpointRecoveryFactory checkpointRecoveryFactory, JobManagerJobMetricGroup jobManagerJobMetricGroup, SchedulingStrategyFactory schedulingStrategyFactory, FailoverStrategy.Factory failoverStrategyFactory, RestartBackoffTimeStrategy restartBackoffTimeStrategy, ExecutionOperations executionOperations, ExecutionVertexVersioner executionVertexVersioner, ExecutionSlotAllocatorFactory executionSlotAllocatorFactory, long initializationTimestamp, ComponentMainThreadExecutor mainThreadExecutor, JobStatusListener jobStatusListener, Collection<FailureEnricher> failureEnrichers, ExecutionGraphFactory executionGraphFactory, ShuffleMaster<?> shuffleMaster, Duration rpcTimeout, VertexParallelismAndInputInfosDecider vertexParallelismAndInputInfosDecider, int defaultMaxParallelism, BlocklistOperations blocklistOperations, JobManagerOptions.HybridPartitionDataConsumeConstraint hybridPartitionDataConsumeConstraint, Map<JobVertexID,ForwardGroup> forwardGroupsByJobVertexId, BatchJobRecoveryHandler jobRecoveryHandler) throws Exception
- Throws:
Exception
-
-
Method Detail
-
startSchedulingInternal
protected void startSchedulingInternal()
- Overrides:
startSchedulingInternal
in classDefaultScheduler
-
maybeRestartTasks
protected void maybeRestartTasks(FailureHandlingResult failureHandlingResult)
Modifies the vertices which need to be restarted. If any task needing restarting belongs to job vertices with unrecovered operator coordinators, all tasks within those job vertices need to be restarted once.- Overrides:
maybeRestartTasks
in classDefaultScheduler
-
resetForNewExecutions
protected void resetForNewExecutions(Collection<ExecutionVertexID> vertices)
- Overrides:
resetForNewExecutions
in classSchedulerBase
-
closeAsync
public CompletableFuture<Void> closeAsync()
Description copied from interface:AutoCloseableAsync
Trigger the closing of the resource and return the corresponding close future.- Specified by:
closeAsync
in interfaceAutoCloseableAsync
- Overrides:
closeAsync
in classSchedulerBase
- Returns:
- Future which is completed once the resource has been closed
-
onTaskFinished
protected void onTaskFinished(Execution execution, IOMetrics ioMetrics)
- Overrides:
onTaskFinished
in classDefaultScheduler
-
onTaskFailed
protected void onTaskFailed(Execution execution)
- Overrides:
onTaskFailed
in classDefaultScheduler
-
handleTaskFailure
protected void handleTaskFailure(Execution failedExecution, @Nullable Throwable error)
- Overrides:
handleTaskFailure
in classDefaultScheduler
-
allocateSlotsAndDeploy
public void allocateSlotsAndDeploy(List<ExecutionVertexID> verticesToDeploy)
Description copied from interface:SchedulerOperations
Allocate slots and deploy the vertex when slots are returned. Vertices will be deployed only after all of them have been assigned slots. The given order will be respected, i.e. tasks with smaller indices will be deployed earlier. Only vertices in CREATED state will be accepted. Errors will happen if scheduling Non-CREATED vertices.- Specified by:
allocateSlotsAndDeploy
in interfaceSchedulerOperations
- Overrides:
allocateSlotsAndDeploy
in classDefaultScheduler
- Parameters:
verticesToDeploy
- The execution vertices to deploy
-
resetForNewExecution
protected void resetForNewExecution(ExecutionVertexID executionVertexId)
- Overrides:
resetForNewExecution
in classSchedulerBase
-
getMarkPartitionFinishedStrategy
protected MarkPartitionFinishedStrategy getMarkPartitionFinishedStrategy()
- Overrides:
getMarkPartitionFinishedStrategy
in classSchedulerBase
-
computeDynamicSourceParallelism
public List<CompletableFuture<Integer>> computeDynamicSourceParallelism()
-
initializeVerticesIfPossible
@VisibleForTesting public void initializeVerticesIfPossible()
-
computeVertexParallelismStoreForDynamicGraph
@VisibleForTesting public static VertexParallelismStore computeVertexParallelismStoreForDynamicGraph(Iterable<JobVertex> vertices, int defaultMaxParallelism)
Compute theVertexParallelismStore
for all given vertices in a dynamic graph, which will set defaults and ensure that the returned store contains valid parallelisms, with the configured default max parallelism.- Parameters:
vertices
- the vertices to compute parallelism fordefaultMaxParallelism
- the global default max parallelism- Returns:
- the computed parallelism store
-
-