Checkpoint
CheckpointConfig
Configuration that captures all checkpointing related settings.
Checks whether checkpointing is enabled.
Gets the checkpointing mode (exactly-once vs.
Sets the checkpointing mode (
Gets the interval in which checkpoints are periodically scheduled.
Sets the interval in which checkpoints are periodically scheduled.
Gets the maximum time that a checkpoint may take before being discarded.
Sets the maximum time that a checkpoint may take before being discarded.
Gets the minimal pause between checkpointing attempts.
Sets the minimal pause between checkpointing attempts.
Gets the maximum number of checkpoint attempts that may be in progress at the same time.
Sets the maximum number of checkpoint attempts that may be in progress at the same time.
This determines the behaviour of tasks if there is an error in their local checkpointing.
Sets the expected behaviour for tasks in case that they encounter an error in their checkpointing procedure.
Get the defined number of consecutive checkpoint failures that will be tolerated, before the whole job is failed over.
This defines how many consecutive checkpoint failures will be tolerated, before the whole job is failed over.
Returns whether checkpoints should be persisted externally.
Returns whether unaligned checkpoints are enabled.
Enables unaligned checkpoints, which greatly reduce checkpointing times under backpressure.
Enables unaligned checkpoints, which greatly reduce checkpointing times under backpressure (experimental).
Only relevant if
Returns the alignment timeout, as configured via
Only relevant if
Returns the alignment timeout, as configured via
Checks whether unaligned checkpoints are forced, despite currently non-checkpointable iteration feedback or custom partitioners.
Checks whether unaligned checkpoints are forced, despite iteration feedback or custom partitioners.
Cleanup behaviour for externalized checkpoints when the job is cancelled. |
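Among the CheckpointConfig settings, the tolerable number of consecutive checkpoint failures is easy to misread, so a minimal sketch of the counting rule may help. `CheckpointFailureTracker` is a hypothetical class written for illustration, and it assumes that a successful checkpoint resets the count and that the job fails over once the count exceeds the tolerated number.

```python
class CheckpointFailureTracker:
    """Toy counter for consecutive checkpoint failures (illustrative only)."""

    def __init__(self, tolerable_failures):
        if tolerable_failures < 0:
            raise ValueError("tolerable_failures must be non-negative")
        self.tolerable_failures = tolerable_failures
        self.consecutive_failures = 0

    def on_success(self):
        # Assumption: any completed checkpoint resets the streak.
        self.consecutive_failures = 0

    def on_failure(self):
        """Record a failure; return True if the job should fail over."""
        self.consecutive_failures += 1
        return self.consecutive_failures > self.tolerable_failures


tracker = CheckpointFailureTracker(tolerable_failures=2)
decisions = [tracker.on_failure(), tracker.on_failure()]   # two failures tolerated
tracker.on_success()                                       # streak is reset
decisions += [tracker.on_failure(), tracker.on_failure(), tracker.on_failure()]
```

With a tolerance of 2, only the third failure in a row triggers a failover in this sketch; isolated failures separated by a success never do.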
CheckpointStorage
Checkpoint storage defines how StateBackends store their state for fault tolerance in streaming applications. Different implementations store their checkpoints in different ways and have different requirements and availability guarantees.
For example, JobManagerCheckpointStorage stores checkpoints in the memory of the JobManager. It is lightweight and has no additional dependencies, but it is not scalable and only supports small state sizes. This checkpoint storage policy is convenient for local testing and development.
FileSystemCheckpointStorage stores checkpoints in a filesystem. For systems like HDFS, NFS drives, S3, and GCS, this storage policy supports large state sizes, on the order of many terabytes, while providing a highly available foundation for streaming applications. This checkpoint storage policy is recommended for most production deployments.
Raw Bytes Storage
The CheckpointStorage creates services for raw bytes storage.
The raw bytes storage (through the CheckpointStreamFactory) is the fundamental service that simply stores bytes in a fault-tolerant fashion. This service is used by the JobManager to store checkpoint and recovery metadata, and it is typically also used by the keyed and operator state backends to store checkpoint state.
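To illustrate what "stores bytes in a fault-tolerant fashion" can mean in practice, here is a minimal local-filesystem sketch. It is not the CheckpointStreamFactory API: `BytesStreamFactory`, `AtomicBytesStream`, and `close_and_get_handle` are hypothetical names, and the write-to-temp-then-rename approach is one possible technique chosen for this example.

```python
import os
import tempfile


class AtomicBytesStream:
    """Hypothetical stream: bytes become visible only after a successful close."""

    def __init__(self, target_path):
        self._target_path = target_path
        directory = os.path.dirname(target_path) or "."
        # Write to a temporary file in the same directory as the target.
        fd, self._tmp_path = tempfile.mkstemp(dir=directory)
        self._file = os.fdopen(fd, "wb")

    def write(self, data):
        self._file.write(data)

    def close_and_get_handle(self):
        # Flush to disk, then atomically move the file into place.
        self._file.flush()
        os.fsync(self._file.fileno())
        self._file.close()
        os.replace(self._tmp_path, self._target_path)
        return self._target_path


class BytesStreamFactory:
    """Hypothetical factory that creates streams under one base directory."""

    def __init__(self, base_dir):
        self._base_dir = base_dir

    def create_stream(self, name):
        return AtomicBytesStream(os.path.join(self._base_dir, name))


base_dir = tempfile.mkdtemp()
stream = BytesStreamFactory(base_dir).create_stream("chk-1-metadata")
stream.write(b"checkpoint ")
stream.write(b"metadata")
handle = stream.close_and_get_handle()
with open(handle, "rb") as f:
    restored = f.read()
```

A reader of the target path never observes a half-written file in this sketch: a crash before the rename leaves only the temporary file behind.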
Serializability
Implementations need to be serializable (java.io.Serializable), because they are distributed across parallel processes (for distributed execution) together with the streaming application code.
Because of that, CheckpointStorage implementations are meant to act as _factories_ that create the proper state stores providing access to the persistent layer. That way, the storage policy can be very lightweight (containing only configuration), which makes it easier to serialize.
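The factory idea can be sketched in a few lines. This is an illustration under stated assumptions, not the real interface: `StorageConfig` and `InMemoryStore` are hypothetical names, and a JSON round trip stands in for the Java serialization that ships the policy to worker processes.

```python
import json
from dataclasses import asdict, dataclass


class InMemoryStore:
    """Heavyweight runtime object; created on the worker, never shipped."""

    def __init__(self, max_state_bytes):
        self.max_state_bytes = max_state_bytes
        self._blobs = {}

    def put(self, key, data):
        if len(data) > self.max_state_bytes:
            raise ValueError("state too large for this store")
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]


@dataclass(frozen=True)
class StorageConfig:
    """Hypothetical storage policy: configuration only, so it serializes easily."""

    max_state_bytes: int

    def to_bytes(self):
        return json.dumps(asdict(self)).encode("utf-8")

    @classmethod
    def from_bytes(cls, payload):
        return cls(**json.loads(payload.decode("utf-8")))

    def create_store(self):
        # The factory step: build the actual store where it is needed.
        return InMemoryStore(self.max_state_bytes)


config = StorageConfig(max_state_bytes=1024)
shipped = StorageConfig.from_bytes(config.to_bytes())  # stands in for distribution
store = shipped.create_store()
store.put("chk-1", b"state")
```

Only the small configuration object crosses the process boundary; the store with its buffers is created after deserialization.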
Thread Safety
Checkpoint storage implementations have to be thread-safe. Multiple threads may be creating streams concurrently.
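A minimal sketch of that requirement follows, assuming a lock around the shared bookkeeping is sufficient. `ThreadSafeStreamFactory` is a hypothetical class, and streams are modelled as plain byte buffers rather than real output streams.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ThreadSafeStreamFactory:
    """Hypothetical factory whose stream creation is safe to call concurrently."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = 0
        self._open_streams = {}

    def create_stream(self):
        # The lock makes id allocation and registration one atomic step.
        with self._lock:
            stream_id = self._next_id
            self._next_id += 1
            self._open_streams[stream_id] = bytearray()
        return stream_id


factory = ThreadSafeStreamFactory()
with ThreadPoolExecutor(max_workers=8) as pool:
    ids = list(pool.map(lambda _: factory.create_stream(), range(1000)))
```

Without the lock, two threads could read the same `_next_id` and hand out duplicate stream ids; with it, every caller receives a distinct stream.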
JobManagerCheckpointStorage checkpoints state directly to the JobManager's memory (hence the name), but savepoints will be persisted to a file system.
FileSystemCheckpointStorage checkpoints state as files to a filesystem.
A wrapper around a customized Java checkpoint storage.