public class CheckpointCoordinator extends Object
Depending on the configured RecoveryMode
, the behaviour of the CompletedCheckpointStore
and CheckpointIDCounter
change. The default standalone
implementations don't support any recovery.
Modifier and Type | Field and Description |
---|---|
protected CheckpointIDCounter |
checkpointIdCounter
Checkpoint ID counter to ensure ascending IDs.
|
protected Object |
lock
Coordinator-wide lock to safeguard the checkpoint updates
|
Constructor and Description |
---|
CheckpointCoordinator(JobID job,
long baseInterval,
long checkpointTimeout,
ExecutionVertex[] tasksToTrigger,
ExecutionVertex[] tasksToWaitFor,
ExecutionVertex[] tasksToCommitTo,
ClassLoader userClassLoader,
CheckpointIDCounter checkpointIDCounter,
CompletedCheckpointStore completedCheckpointStore,
RecoveryMode recoveryMode) |
CheckpointCoordinator(JobID job,
long baseInterval,
long checkpointTimeout,
long minPauseBetweenCheckpoints,
int maxConcurrentCheckpointAttempts,
ExecutionVertex[] tasksToTrigger,
ExecutionVertex[] tasksToWaitFor,
ExecutionVertex[] tasksToCommitTo,
ClassLoader userClassLoader,
CheckpointIDCounter checkpointIDCounter,
CompletedCheckpointStore completedCheckpointStore,
RecoveryMode recoveryMode,
CheckpointStatsTracker statsTracker) |
Modifier and Type | Method and Description |
---|---|
ActorGateway |
createActivatorDeactivator(akka.actor.ActorSystem actorSystem,
UUID leaderSessionID) |
protected long |
getAndIncrementCheckpointId() |
protected ActorGateway |
getJobStatusListener() |
int |
getNumberOfPendingCheckpoints() |
int |
getNumberOfRetainedSuccessfulCheckpoints() |
Map<Long,PendingCheckpoint> |
getPendingCheckpoints() |
List<CompletedCheckpoint> |
getSuccessfulCheckpoints() |
boolean |
isShutdown() |
protected void |
onCancelCheckpoint(long canceledCheckpointId)
Callback on cancellation of a checkpoint.
|
protected void |
onFullyAcknowledgedCheckpoint(CompletedCheckpoint checkpoint)
Callback on full acknowledgement of a checkpoint.
|
protected void |
onShutdown()
Callback on shutdown of the coordinator.
|
boolean |
receiveAcknowledgeMessage(AcknowledgeCheckpoint message)
Receives an AcknowledgeCheckpoint message and returns whether the
message was associated with a pending checkpoint.
|
boolean |
receiveDeclineMessage(DeclineCheckpoint message)
Receives a
DeclineCheckpoint message and returns whether the
message was associated with a pending checkpoint. |
boolean |
restoreLatestCheckpointedState(Map<JobVertexID,ExecutionJobVertex> tasks,
boolean errorIfNoCheckpoint,
boolean allOrNothingState) |
protected void |
setJobStatusListener(ActorGateway jobStatusListener) |
void |
shutdown()
Shuts down the checkpoint coordinator.
|
void |
startCheckpointScheduler() |
void |
stopCheckpointScheduler() |
boolean |
triggerCheckpoint(long timestamp)
Triggers a new checkpoint and uses the given timestamp as the checkpoint
timestamp.
|
boolean |
triggerCheckpoint(long timestamp,
long nextCheckpointId)
Triggers a new checkpoint and uses the given timestamp as the checkpoint
timestamp.
|
protected final Object lock
protected final CheckpointIDCounter checkpointIdCounter
public CheckpointCoordinator(JobID job, long baseInterval, long checkpointTimeout, ExecutionVertex[] tasksToTrigger, ExecutionVertex[] tasksToWaitFor, ExecutionVertex[] tasksToCommitTo, ClassLoader userClassLoader, CheckpointIDCounter checkpointIDCounter, CompletedCheckpointStore completedCheckpointStore, RecoveryMode recoveryMode) throws Exception
Exception
public CheckpointCoordinator(JobID job, long baseInterval, long checkpointTimeout, long minPauseBetweenCheckpoints, int maxConcurrentCheckpointAttempts, ExecutionVertex[] tasksToTrigger, ExecutionVertex[] tasksToWaitFor, ExecutionVertex[] tasksToCommitTo, ClassLoader userClassLoader, CheckpointIDCounter checkpointIDCounter, CompletedCheckpointStore completedCheckpointStore, RecoveryMode recoveryMode, CheckpointStatsTracker statsTracker) throws Exception
Exception
protected void onShutdown()
protected void onCancelCheckpoint(long canceledCheckpointId)
protected void onFullyAcknowledgedCheckpoint(CompletedCheckpoint checkpoint)
public void shutdown() throws Exception
Exception
public boolean isShutdown()
public boolean triggerCheckpoint(long timestamp) throws Exception
timestamp
- The timestamp for the checkpoint.Exception
public boolean triggerCheckpoint(long timestamp, long nextCheckpointId) throws Exception
timestamp
- The timestamp for the checkpoint.nextCheckpointId
- The checkpoint ID to use for this checkpoint or -1
if
the checkpoint ID counter should be queried.Exception
public boolean receiveDeclineMessage(DeclineCheckpoint message) throws Exception
DeclineCheckpoint
message and returns whether the
message was associated with a pending checkpoint.message
- Checkpoint decline from the task managerException
public boolean receiveAcknowledgeMessage(AcknowledgeCheckpoint message) throws Exception
message
- Checkpoint ack from the task managerException
- If the checkpoint cannot be added to the completed checkpoint store.public boolean restoreLatestCheckpointedState(Map<JobVertexID,ExecutionJobVertex> tasks, boolean errorIfNoCheckpoint, boolean allOrNothingState) throws Exception
Exception
public int getNumberOfPendingCheckpoints()
public int getNumberOfRetainedSuccessfulCheckpoints()
public Map<Long,PendingCheckpoint> getPendingCheckpoints()
public List<CompletedCheckpoint> getSuccessfulCheckpoints() throws Exception
Exception
protected long getAndIncrementCheckpointId()
protected ActorGateway getJobStatusListener()
protected void setJobStatusListener(ActorGateway jobStatusListener)
public void startCheckpointScheduler()
public void stopCheckpointScheduler()
public ActorGateway createActivatorDeactivator(akka.actor.ActorSystem actorSystem, UUID leaderSessionID)
Copyright © 2014–2017 The Apache Software Foundation. All rights reserved.