public class CheckpointFailureManager extends Object
Modifier and Type | Class and Description |
---|---|
static interface | CheckpointFailureManager.FailJobCallback: A callback interface about how to fail a job. |
Modifier and Type | Field and Description |
---|---|
static String | EXCEEDED_CHECKPOINT_TOLERABLE_FAILURE_MESSAGE |
static int | UNLIMITED_TOLERABLE_FAILURE_NUMBER |
Constructor and Description |
---|
CheckpointFailureManager(int tolerableCpFailureNumber, CheckpointFailureManager.FailJobCallback failureCallback) |
Modifier and Type | Method and Description |
---|---|
void | checkFailureCounter(CheckpointException exception, long checkpointId) |
void | handleCheckpointException(PendingCheckpoint pendingCheckpoint, CheckpointProperties checkpointProperties, CheckpointException exception, ExecutionAttemptID executionAttemptID, JobID job, PendingCheckpointStats pendingCheckpointStats, CheckpointStatsTracker statsTracker): For failures on the JM, all checkpoints count against the failure counter. |
void | handleCheckpointSuccess(long checkpointId): Handle checkpoint success. |
public static final int UNLIMITED_TOLERABLE_FAILURE_NUMBER
public static final String EXCEEDED_CHECKPOINT_TOLERABLE_FAILURE_MESSAGE
public CheckpointFailureManager(int tolerableCpFailureNumber, CheckpointFailureManager.FailJobCallback failureCallback)
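The constructor pairs a tolerable failure number with a callback that fails the job. As a rough illustration of that idea, the sketch below models a counter of continuous checkpoint failures that invokes a callback once the threshold is exceeded and resets on success. It is a simplified, hypothetical model, not Flink's implementation: the class `TolerableFailureCounter`, its methods, the use of `Consumer<Throwable>` in place of `FailJobCallback`, the `Integer.MAX_VALUE` sentinel, and the rule that each checkpoint ID is counted at most once are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical, simplified model of a tolerable-failure counter (not Flink code). */
class TolerableFailureCounter {
    /** Assumed sentinel meaning "never fail the job because of checkpoint failures". */
    static final int UNLIMITED = Integer.MAX_VALUE;

    private final int tolerableFailures;
    private final Consumer<Throwable> failJob; // stands in for a fail-job callback
    private final Set<Long> countedCheckpointIds = new HashSet<>();
    private int continuousFailures = 0;
    private long lastSucceededCheckpointId = Long.MIN_VALUE;

    TolerableFailureCounter(int tolerableFailures, Consumer<Throwable> failJob) {
        if (tolerableFailures < 0) {
            throw new IllegalArgumentException("tolerableFailures must be >= 0");
        }
        this.tolerableFailures = tolerableFailures;
        this.failJob = failJob;
    }

    /** Records a failed checkpoint and fails the job once the threshold is exceeded. */
    synchronized void onFailure(Throwable cause, long checkpointId) {
        if (tolerableFailures == UNLIMITED) {
            return; // unlimited tolerance: nothing to count
        }
        if (checkpointId <= lastSucceededCheckpointId) {
            return; // assumption: failures older than the latest success are ignored
        }
        // Assumption: each checkpoint ID contributes at most one failure.
        if (countedCheckpointIds.add(checkpointId)) {
            continuousFailures++;
        }
        if (continuousFailures > tolerableFailures) {
            reset();
            failJob.accept(new RuntimeException("Exceeded tolerable checkpoint failures", cause));
        }
    }

    /** A newer successful checkpoint resets the continuous failure count. */
    synchronized void onSuccess(long checkpointId) {
        if (checkpointId > lastSucceededCheckpointId) {
            lastSucceededCheckpointId = checkpointId;
            reset();
        }
    }

    synchronized int currentFailures() {
        return continuousFailures;
    }

    private void reset() {
        continuousFailures = 0;
        countedCheckpointIds.clear();
    }
}
```

With a threshold of 1 in this model, a single failed checkpoint is tolerated, a second consecutive failure on a different checkpoint ID triggers the callback, and any intervening success starts the count over.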
public void handleCheckpointException(@Nullable PendingCheckpoint pendingCheckpoint, CheckpointProperties checkpointProperties, CheckpointException exception, @Nullable ExecutionAttemptID executionAttemptID, JobID job, @Nullable PendingCheckpointStats pendingCheckpointStats, CheckpointStatsTracker statsTracker)
Failures on TM:
pendingCheckpoint - the failed checkpoint, if it was already initialized.
checkpointProperties - the checkpoint properties, used to determine which handling strategy applies.
exception - the checkpoint exception.
executionAttemptID - the execution attempt ID, used as a safeguard.
job - the JobID.
pendingCheckpointStats - the pending checkpoint statistics.
statsTracker - the tracker for checkpoint statistics.
public void checkFailureCounter(CheckpointException exception, long checkpointId)
public void handleCheckpointSuccess(long checkpointId)
checkpointId - the checkpoint ID, used to count the continuous failure number based on the checkpoint ID sequence.
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.