Expiring Snapshots
Table Store writers generate one or two snapshots per commit. Each snapshot may add new data files or mark old data files as deleted. However, the marked data files are not physically deleted right away, because Table Store also supports time traveling to an earlier snapshot. They are deleted only when the snapshot expires.
Currently, expiration is automatically performed by Table Store writers when committing new changes. By expiring old snapshots, old data files and metadata files that are no longer used can be deleted to release disk space.
Snapshot expiration is controlled by the following table properties.
Option | Required | Default | Type | Description |
---|---|---|---|---|
snapshot.time-retained | No | 1 h | Duration | The maximum time of completed snapshots to retain. |
snapshot.num-retained.min | No | 10 | Integer | The minimum number of completed snapshots to retain. |
snapshot.num-retained.max | No | Integer.MAX_VALUE | Integer | The maximum number of completed snapshots to retain. |
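The following Python sketch illustrates one plausible way the three options could interact when a writer decides which snapshots to expire. It is not Table Store's actual implementation: the `Snapshot` class, the `select_expired` function, and the precedence rules (the maximum count is enforced first, the newest `snapshot.num-retained.min` snapshots are always kept, and the time limit applies to everything in between) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a completed snapshot."""
    snapshot_id: int
    commit_time_ms: int


def select_expired(
    snapshots: List[Snapshot],
    now_ms: int,
    time_retained_ms: int,
    num_retained_min: int,
    num_retained_max: int,
) -> List[Snapshot]:
    """Return the snapshots that would expire under the assumed rules."""
    ordered = sorted(snapshots, key=lambda s: s.snapshot_id)
    total = len(ordered)

    # Assumption: anything beyond the newest num_retained_max snapshots
    # expires regardless of its age.
    must_expire = max(0, total - num_retained_max)
    # Assumption: the newest num_retained_min snapshots never expire.
    may_expire = max(0, total - num_retained_min)

    cutoff = must_expire
    # Between those two bounds, expire from the oldest snapshot onward
    # while the snapshot is older than the retention time.
    while (
        cutoff < may_expire
        and ordered[cutoff].commit_time_ms + time_retained_ms < now_ms
    ):
        cutoff += 1
    return ordered[:cutoff]
```

Under these assumptions, a table with 15 snapshots committed 5 minutes apart, the default `snapshot.time-retained` of 1 h and `snapshot.num-retained.min` of 10 would expire only the 5 oldest snapshots, even if more of them are older than one hour.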
Please note that a retention time that is too short, or a retained snapshot count that is too small, may result in:
- Batch queries failing because they cannot find a file. For example, if the table is relatively large and a batch query takes 10 minutes to read it, but the snapshot from 10 minutes ago expires in the meantime, the query ends up reading a snapshot that has already been deleted.
- Streaming read jobs on table files (without the external log system) failing to restart. When the job restarts, the snapshot it recorded may already have expired.
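As a rough way to reason about both risks, the sketch below estimates how long a snapshot stays readable after it is committed and compares that with a read duration or a job downtime. It is a back-of-the-envelope model, not part of Table Store: the function names are hypothetical, and it assumes that one snapshot is produced per fixed commit interval and that the options interact as "keep at least the minimum count, at most the maximum count, and otherwise honor the time limit".

```python
def estimated_retention_window_ms(
    commit_interval_ms: int,
    time_retained_ms: int,
    num_retained_min: int,
    num_retained_max: int,
) -> int:
    """Estimate how long a snapshot survives after its commit (assumed model)."""
    # Time during which the snapshot is still among the newest num_retained_min.
    protected_by_min = num_retained_min * commit_interval_ms
    # Time after which the snapshot falls outside the newest num_retained_max.
    forced_out_by_max = num_retained_max * commit_interval_ms
    return min(max(time_retained_ms, protected_by_min), forced_out_by_max)


def is_at_risk(
    duration_ms: int,
    commit_interval_ms: int,
    time_retained_ms: int,
    num_retained_min: int,
    num_retained_max: int,
) -> bool:
    """True if a read (or a job downtime) may outlive the snapshot it relies on."""
    window = estimated_retention_window_ms(
        commit_interval_ms, time_retained_ms, num_retained_min, num_retained_max
    )
    return duration_ms >= window


MINUTE_MS = 60_000
INT_MAX = 2**31 - 1

# A 10-minute batch read with commits every minute, a hypothetical
# 5-minute retention time, and a minimum of 3 retained snapshots.
risky = is_at_risk(10 * MINUTE_MS, MINUTE_MS, 5 * MINUTE_MS, 3, INT_MAX)

# The same read with the default values from the table above.
safe = is_at_risk(10 * MINUTE_MS, MINUTE_MS, 60 * MINUTE_MS, 10, INT_MAX)
```

In this model the first configuration puts the 10-minute batch query at risk, while the defaults do not. The same comparison applies to a streaming job: pass the expected downtime before restart as `duration_ms`.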