949b760f5516c139591473038917247b1fd7f500 | Author: Mike Adamson <madamson@datastax.com>
| 2023-08-30 11:51:04+01:00
Add support for a vector search index in SAI
- Adds jbellis/jvector (1.0.2) library for DiskANN-based indexes on floating-point vectors
- Adds ORDER BY ANN OF capability to do ANN search and order the results by score
patch by Mike Adamson; reviewed by Andrés de la Peña, Jonathan Ellis for CASSANDRA-18715
Co-authored-by: Jonathan Ellis <jbellis@gmail.com>
Co-authored-by: Zhao Yang <zhaoyangsingapore@gmail.com>
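As a rough illustration of what ORDER BY ANN OF returns, the sketch below ranks stored vectors by their similarity to a query vector and keeps the best `limit` rows. It is an exact brute-force scan, not the DiskANN graph search that jvector provides, and cosine similarity is assumed as the score. The names `cosine_similarity` and `order_by_ann` are hypothetical and do not appear in the patch.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def order_by_ann(rows, query, limit):
    """Return the `limit` row keys with the highest score, best first.

    `rows` maps a row key to its vector. Every row is scored, so this is
    exact nearest-neighbour search standing in for the approximate one.
    """
    scored = ((cosine_similarity(vec, query), key) for key, vec in rows.items())
    return [key for _, key in heapq.nlargest(limit, scored)]
```

An approximate index trades a small loss in recall for not having to score every row, which is the point of using a graph-based structure instead of a scan like this one.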
655a2455ac29395b0a303e6ad7fc4d458b18932d | Author: Mike Adamson <madamson@datastax.com>
| 2023-07-28 17:38:20+01:00
Reduce size of per-SSTable index components for SAI
This patch removes the PRIMARY_KEY_TRIE component and adds KeyLookup.Cursor#clusteredSeekToKey() to
search for clustering keys within a partition. To support this, a new on-disk component,
PARTITION_SIZES, has been added; it holds the size of each partition in the SSTable.
patch by Mike Adamson; reviewed by Caleb Rackliffe and Andrés de la Peña for CASSANDRA-18673
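The idea of using per-partition sizes to bound a clustering-key search can be sketched as follows. This is a minimal illustration, not the code in the patch: it assumes "size" means the number of rows in a partition, that clustering keys are held in a flat list in SSTable order, and that keys are sorted within each partition. The function name and its parameters are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def clustered_seek_to_key(clustering_keys, partition_sizes, partition_index, target):
    """Return the row id of the first row in the given partition whose
    clustering key is >= target, or -1 if the partition has no such row."""
    # First row id of each partition: running total of the preceding sizes.
    starts = [0] + list(accumulate(partition_sizes))
    lo = starts[partition_index]
    hi = lo + partition_sizes[partition_index]
    # Binary search restricted to the partition's row range [lo, hi).
    pos = bisect_left(clustering_keys, target, lo, hi)
    return pos if pos < hi else -1
```

Because the partition size gives the upper bound of the search range, the search never strays into the next partition, and no separate trie over full primary keys is needed in this sketch.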
ebea2ba6ade00a6f156787ca4ee36b2f8eb003ad | Author: Jonathan Ellis <jbellis@datastax.com>
| 2023-06-26 14:50:01-05:00
Upgrade to lucene-core 9.7.0
Notes on the upgrade path:
- RamIndexOutput is replaced with ResettableByteBuffersIndexOutput, an extension of ByteBuffersIndexOutput, which was the closest thing to a replacement of RamIndexOutput.
- Lucene exposes the code we needed from DirectReaders more or less directly in DirectReader now, so the old copied code has been deleted.
- Lucene changed its data files to be little endian, but to keep its compatibility story simple it retained big endian (BE) for the header and footer ints. That's the cause of the changes in SAICodecUtils.
- We could gain a bit of performance by making our own code natively little endian, but that is too big a change for this patch.
patch by Jonathan Ellis; reviewed by Andrés de la Peña, Caleb Rackliffe, and Mike Adamson for CASSANDRA-18494
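The mixed byte order described in the upgrade notes can be illustrated with a small sketch: the header int is written big endian while the data ints are written little endian. This is not the SAICodecUtils code; `HEADER_MAGIC`, `write_component`, and `read_component` are hypothetical names, the magic value is arbitrary, and the footer is omitted for brevity.

```python
import struct

HEADER_MAGIC = 0x12345678  # hypothetical value, for illustration only


def write_component(values):
    """Serialize 32-bit ints with a big-endian header and little-endian body."""
    header = struct.pack(">i", HEADER_MAGIC)                 # big endian
    body = b"".join(struct.pack("<i", v) for v in values)    # little endian
    return header + body


def read_component(data):
    """Check the big-endian header, then decode the little-endian body."""
    (magic,) = struct.unpack_from(">i", data, 0)
    if magic != HEADER_MAGIC:
        raise ValueError("bad header magic")
    count = (len(data) - 4) // 4
    return list(struct.unpack_from("<%di" % count, data, 4))
```

A reader that applied one byte order to the whole file would either reject the header or misread every data value, which is why the header and footer need separate handling.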
eb208d3561eaf645f74f60b54c71ebe5bfc24c33 | Author: Mike Adamson <madamson@datastax.com>
| 2023-05-09 12:29:01+01:00
Add basic text analysis to SAI, including "case_sensitive", "normalize", and "ascii" modes
patch by Mike Adamson; reviewed by Caleb Rackliffe and Andrés de la Peña for CASSANDRA-18479
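A minimal sketch of how the three modes might transform a term before it is indexed or matched. The semantics and defaults here are assumptions: "normalize" is taken to mean Unicode NFC normalization, "ascii" is approximated by stripping combining marks (a full ASCII folding filter covers more characters), and disabling "case_sensitive" is taken to mean lowercasing. The function `analyze` and its `ascii_fold` parameter (standing in for the "ascii" option) are hypothetical.

```python
import unicodedata


def analyze(term, case_sensitive=True, normalize=False, ascii_fold=False):
    """Apply the selected analysis steps to a single text term."""
    if normalize:
        # Compose equivalent character sequences into one canonical form.
        term = unicodedata.normalize("NFC", term)
    if not case_sensitive:
        term = term.lower()
    if ascii_fold:
        # Decompose accented characters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFKD", term)
        term = "".join(c for c in decomposed if not unicodedata.combining(c))
    return term
```

The same transformation has to be applied to both the indexed value and the query term, otherwise the two would not compare equal.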
cde91e56f09d9ebf315c79c9a81b89f70f4eb724 | Author: Mike Adamson <madamson@datastax.com>
| 2023-01-19 14:24:46+00:00
In-memory index implementation with query path
This includes the following elements of the Storage Attached Index:
- Memtable-attached indexes backed by an in-memory trie structure for byte-comparable values
- Query path for the in-memory index
- Index status propagation
- Randomized testing for Memtable-attached indexes
patch by Mike Adamson; reviewed by Caleb Rackliffe and Andrés de la Peña for CASSANDRA-18058
Co-authored-by: Mike Adamson <madamson@datastax.com>
Co-authored-by: Caleb Rackliffe <calebrackliffe@gmail.com>
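To illustrate the idea of a trie-backed in-memory index over byte-comparable values, here is a toy sketch. It is not the structure used in the patch: `TrieNode` and `MemtableIndex` are hypothetical names, terms are assumed to be already encoded as bytes whose unsigned byte order matches the value order, and concurrency and memory accounting are ignored.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # next byte -> TrieNode
        self.keys = set()   # primary keys of rows whose term ends here


class MemtableIndex:
    def __init__(self):
        self.root = TrieNode()

    def add(self, term: bytes, primary_key):
        """Index `primary_key` under the byte-encoded `term`."""
        node = self.root
        for b in term:
            node = node.children.setdefault(b, TrieNode())
        node.keys.add(primary_key)

    def search(self, term: bytes):
        """Return the primary keys indexed under exactly `term`."""
        node = self.root
        for b in term:
            node = node.children.get(b)
            if node is None:
                return set()
        return set(node.keys)

    def terms(self):
        """Yield indexed terms in unsigned byte order."""
        stack = [(b"", self.root)]
        while stack:
            prefix, node = stack.pop()
            if node.keys:
                yield prefix
            # Push larger bytes first so the smallest child is popped next.
            for b in sorted(node.children, reverse=True):
                stack.append((prefix + bytes([b]), node.children[b]))
```

Because the encoding is byte-comparable, walking the trie in byte order visits terms in value order, which is what makes range queries possible on a structure like this.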