24ebd24c79175f7426f4c489dc5a006e75f09dfb | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-01-04 16:22:52+01:00
More accurate skipping of sstables in read path
This patch improves the following things:
1. SSTable metadata will store a covered slice instead of min/max clusterings. The difference is that for slices there is available the type of a bound rather than just a clustering. In particular it will provide the information whether the lower and upper bound of an sstable is opened or closed.
2. SSTable metadata will store a flag whether the SSTable contains any partition level deletions or not
3. The above two changes required to introduce a new major format for SSTables - oa
4. Single partition read command makes use of the above changes. In particular an sstable can be skipped when it does not intersect with the column filter, does not have partition level deletions and does not have statics; In case there are partition level deletions, but the other conditions are satisfied, only the partition header needs to be accessed (tests attached)
5. Skipping SSTables assuming those three conditions are satisfied has been implemented also for partition range queries (tests attached). Also added minor separate statistics to record the number of accessed sstables in partition reads because now not all of them need to be accessed.
6. Artificial lower bound marker is now an object on its own and is not implemented as a special case of range tombstone bound.
7. Extended the lower bound optimization usage due the 1 and 2
8. Do not initialize iterator just to get a cached partition and associated columns index. The purpose of using lower bound optimization was to avoid opening an iterator of an sstable if possible.
9. Add key range to stats metadata
[f369595b1c] Add fields to sstable version and placeholders in stats serializer
[f5c3f772e2] Add hasKeyRange and hasLegacyMinMax
[3cde51f4e1] Add partition level deletion presence marker to sstable stats
[67b2ee2152] Extract AbstractTypeSerializer
[c77b475d6c] Refactor slices intersection checking
[ceb5af3a38] Store min and max clustering as a slice in stats metadata as and improved min/max
[d1f8973929] Implement MetadataCollectorBench
[335369da84] Apply partition level deletion presence marker optimizations to single partition read command
[2497a009b9] Lower bound optimization - add slices and isReverseOrder fields to UnfilteredRowIteratorWithLowerBound
[e32ee31177] Lower bound optimization - Replace usage of RangeTombstoneMarker as a lower bound with ArtificialBoundMarker
[e213e712c4] Lower bound optimization - improve usage of lower bound optimization
[c4f93006b1] Apply read path improvements to partition range queries
[5fa462266c] Add key range to StatsMetadata
[79a7339ed4] Use key range from stats if possible
[266ed2749b] Added new sstables for LegacySSTableTest
patch by Jacek Lewandowski; reviewed by Branimir Lambov and C. Scott Andreas for CASSANDRA-18134
Co-authored-by: Branimir Lambov <blambov>
Co-authored-by: Sylvain Lebresne <pcmanus>
Co-authored-by: Jacek Lewandowski <jacek-lewandowski>
Co-authored-by: Jakub Zytka <jakubzytka>