pyflink.datastream.checkpoint_config.CheckpointConfig.set_checkpoint_storage
- CheckpointConfig.set_checkpoint_storage(storage: pyflink.datastream.checkpoint_storage.CheckpointStorage) → pyflink.datastream.checkpoint_config.CheckpointConfig
Checkpoint storage defines how state backends checkpoint their state for fault tolerance in streaming applications. Different implementations store their checkpoints in different ways and have different requirements and availability guarantees.
For example, JobManagerCheckpointStorage stores checkpoints in the memory of the JobManager. It is lightweight and has no additional dependencies, but it is not highly available and supports only small state sizes. This checkpoint storage policy is convenient for local testing and development.
FileSystemCheckpointStorage stores checkpoints in a filesystem. On systems such as HDFS, NFS drives, S3, and GCS, this storage policy supports large state sizes, on the order of many terabytes, while providing a highly available foundation for stateful applications. This checkpoint storage policy is recommended for most production deployments.
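The choice between the two storage policies described above can be sketched as follows. This is a minimal illustration, not a prescribed pattern: it assumes a PyFlink release that ships the `pyflink.datastream.checkpoint_storage` module, and the helper names `storage_kind` and `configure_checkpoint_storage`, the 10-second interval, and the `file:///tmp/flink-checkpoints` path are hypothetical placeholders.

```python
from typing import Optional


def storage_kind(checkpoint_dir: Optional[str]) -> str:
    """Hypothetical helper: use filesystem storage only when a directory URI is given."""
    return "filesystem" if checkpoint_dir else "jobmanager"


def configure_checkpoint_storage(env, checkpoint_dir: Optional[str] = None) -> None:
    """Set the checkpoint storage on a StreamExecutionEnvironment (illustrative only)."""
    # Imported lazily so storage_kind() stays usable without a Flink installation.
    from pyflink.datastream.checkpoint_storage import (
        FileSystemCheckpointStorage,
        JobManagerCheckpointStorage,
    )

    if storage_kind(checkpoint_dir) == "filesystem":
        # Durable storage (e.g. an HDFS or S3 URI), suited to large state.
        storage = FileSystemCheckpointStorage(checkpoint_dir)
    else:
        # Checkpoints kept in JobManager memory: small state, local testing only.
        storage = JobManagerCheckpointStorage()

    env.get_checkpoint_config().set_checkpoint_storage(storage)


if __name__ == "__main__":
    from pyflink.datastream import StreamExecutionEnvironment

    env = StreamExecutionEnvironment.get_execution_environment()
    env.enable_checkpointing(10_000)  # interval in milliseconds (assumed value)
    # Placeholder path; a production job would point at shared, durable storage.
    configure_checkpoint_storage(env, "file:///tmp/flink-checkpoints")
```

Calling `configure_checkpoint_storage(env)` without a directory falls back to JobManagerCheckpointStorage, which matches the local-development use case; passing a URI on a shared filesystem selects FileSystemCheckpointStorage.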