public class DefaultCompletedCheckpointStore<R extends ResourceVersion<R>> extends AbstractCompleteCheckpointStore
CompletedCheckpointStore. Combined with different
StateHandleStore, we could persist the completed checkpoints to various storage.
During recovery, the latest checkpoint is read from
StateHandleStore. If there is more
than one, only the latest one is used and older ones are discarded (even if the maximum number of
retained checkpoints is greater than one).
If there is a network partition and multiple JobManagers run concurrent checkpoints for the same program, it is OK to take any valid successful checkpoint as long as the "history" of checkpoints is consistent. Currently, after recovery we start out with only a single checkpoint to circumvent those situations.
|Constructor and Description|
|Modifier and Type||Method and Description|
Synchronously writes the new checkpoints to state handle store and asynchronously removes older ones.
Returns the max number of retained checkpoints.
Returns the current number of retained checkpoints.
This method returns whether the completed checkpoint store requires checkpoints to be externalized.
Shuts down the store.
findLowest, getSharedStateRegistry, unregisterUnusedState
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
public DefaultCompletedCheckpointStore(int maxNumberOfCheckpointsToRetain, StateHandleStore<CompletedCheckpoint,R> stateHandleStore, CheckpointStoreUtil completedCheckpointStoreUtil, Collection<CompletedCheckpoint> completedCheckpoints, SharedStateRegistry sharedStateRegistry, Executor executor)
maxNumberOfCheckpointsToRetain- The maximum number of checkpoints to retain (at least 1). Adding more checkpoints than this results in older checkpoints being discarded. On recovery, we will only start with a single checkpoint.
stateHandleStore- Completed checkpoints in external store
completedCheckpointStoreUtil- utilities for completed checkpoint store
executor- to execute blocking calls
public boolean requiresExternalizedCheckpoints()
public CompletedCheckpoint addCheckpointAndSubsumeOldestOne(CompletedCheckpoint checkpoint, CheckpointsCleaner checkpointsCleaner, Runnable postCleanup) throws Exception
checkpoint- Completed checkpoint to add.
PossibleInconsistentStateException- if adding the checkpoint failed and leaving the system in a possibly inconsistent state, i.e. it's uncertain whether the checkpoint metadata was fully written to the underlying systems or not.
public List<CompletedCheckpoint> getAllCheckpoints()
Returns an empty list if no checkpoint has been added yet.
public int getNumberOfRetainedCheckpoints()
public int getMaxNumberOfRetainedCheckpoints()
public void shutdown(JobStatus jobStatus, CheckpointsCleaner checkpointsCleaner) throws Exception
The job status is forwarded and used to decide whether state should actually be discarded
CheckpointsCleaner.cleanSubsumedCheckpoints(long, java.util.Set<java.lang.Long>, java.lang.Runnable, java.util.concurrent.Executor) should be called here to subsume unused state.
jobStatus- Job state on shut down
checkpointsCleaner- that will cleanup completed checkpoints if needed
Copyright © 2014–2023 The Apache Software Foundation. All rights reserved.