Mike Adamson last 3 years


 17 Collaborator
Brandon Williams , Mick Semb Wever , Štefan Miklošovič , Berenguer Blasi , Ekaterina Dimitrova , Caleb Rackliffe , David Capwell , Andrés de la Peña , Jacek Lewandowski , Branimir Lambov , Zhao Yang , Jonathan Ellis , Maxwell Guo , Piotr Kołaczkowski , Jeremiah Jordan , Jason Rutherglen , Jonathon Ellis

 20 Patch  5 Review
7aab61b06357ce0b59977715f82fed1ad24474b4, 7447ee5bddb31ea71a232a44d64dbb7dd0010708, 3b05051f8678c28bc9d93a89123c68f8d0b93b7b, 5ec0669bec0a1cda46b0079a6043bd6ef12b3056, 6a7bef12ecdf59e3a67c81b89c13e3c2bf7e19d8, 3928f2992f94aa2b4e56bee0f36a0bdb31087116, a9e6ed37874f2240039086309e7849bea42c07e2, 0e42b77c9735d1124fe0a5766447f29c891cdb5b, c4d11c4372906ae1dea9e6c31c1136f122e8a1b2, 9697be1131bd8bb2332199000ad55dad12524fd2, 949b760f5516c139591473038917247b1fd7f500, e45c1092f91edd63591f562b2120ea6a5fd3edd5, 9ce86e0ff8b6344b528a0640f9dafa23f97dd85a, 655a2455ac29395b0a303e6ad7fc4d458b18932d, 2531cb045897d5b771f79039d194a1f679d8629a, eb208d3561eaf645f74f60b54c71ebe5bfc24c33, cba3e19ccd81d705ca9f89c0eedab65824e9dd16, 6f125c80420f6d249b5414d886e1b4a93cc34e7f, e5e0f3a8441503107b1ca2128cf8366e5e44d893, cde91e56f09d9ebf315c79c9a81b89f70f4eb724 1d7bae3697b97e64de2c2b958427ef86a1b17731, d16e8d3653dce8ed767a040c06dbaabc47a9b474, 83203a14c400ff99cfb2a5b7e655a663ea882c2b, ebea2ba6ade00a6f156787ca4ee36b2f8eb003ad, ae537abc6494564d7254a2126465522d86b44c1e

7aab61b06357ce0b59977715f82fed1ad24474b4 | Author: Mike Adamson <mikea@apache.org>
 | 2024-02-26 12:42:53+00:00

    Use glove vectors instead of random vectors in vector tests
    - avoid randomisation to make tests more consistent
    - use heap_buffers for VectorDistributedTest for consistency with other tests
    
    patch by Mike Adamson; reviewed by Ekaterina Dimitrova for CASSANDRA-19185

1d7bae3697b97e64de2c2b958427ef86a1b17731 | Author: Caleb Rackliffe <calebrackliffe@gmail.com>
 | 2024-02-22 15:08:23-06:00

    Record latencies for SAI post-filtering reads against local storage
    
    patch by Caleb Rackliffe; reviewed by Mike Adamson for CASSANDRA-18940

7447ee5bddb31ea71a232a44d64dbb7dd0010708 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-12-21 09:12:58+00:00

    Avoid random IndexStreamingFailureTest failures
    
    Change how ByteBuddy injections are handled to avoid ByteBuddy
    failures after node restarts
    
    Patch by Mike Adamson; reviewed by Caleb Rackliffe for CASSANDRA-19084

3b05051f8678c28bc9d93a89123c68f8d0b93b7b | Author: Mike Adamson <madamson@datastax.com>
 | 2023-12-12 17:14:41+00:00

    Simplify segment building in SAI to use single in-memory structure
      This removes the RAMStringIndexer for literal indexes and replaces
      it with a SegmentTrieBuffer that replaces BlockBalancedTreeRamBuffer
      for literal and numeric indexes.
    
    patch by Mike Adamson; reviewed by Andrés de la Peña, Caleb Rackliffe for CASSANDRA-18598

5ec0669bec0a1cda46b0079a6043bd6ef12b3056 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-12-12 12:19:59+00:00

    Fix resource cleanup after SAI query timeouts
    
    patch by Mike Adamson; reviewed by Caleb Rackliffe for CASSANDRA-19177

6a7bef12ecdf59e3a67c81b89c13e3c2bf7e19d8 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-11-28 10:48:23+00:00

    Fix SAI intersection queries
    
     - Fix comparison in PostingListRangeIterator for updating skip token
     - Fix binary search in KeyLookup.clusteredSeekToKey
     - Added new on-disk component for storing partition sizes by partition ID
    
     patch by Mike Adamson; reviewed by Caleb Rackliffe, Mick Semb Wever for CASSANDRA-19011

3928f2992f94aa2b4e56bee0f36a0bdb31087116 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-11-27 13:06:33+00:00

    Provide truncate task for SAI
    
    If a table is truncated during the initial build of an index there is a chance that the index build will get interrupted and it won't get marked queryable. This patch provides a truncate task for SAI that marks the index queryable during truncation.
    
     patch by Mike Adamson; reviewed by Caleb Rackliffe, Michael Semb Wever for CASSANDRA-19032

a9e6ed37874f2240039086309e7849bea42c07e2 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-11-24 15:26:00+00:00

    Fix broken indexing tests when using SAI
     - This fixes a number of broken tests when the default index is set to SAI
     - Composite partition indexes were being filtered prior to row filtering in the
       index searcher resulting in incorrect results
     - Static and non-static index intersection was failing because static primary keys
       were not comparing correctly against non-static primary keys
    
     patch by Mike Adamson; reviewed by Andres de la Peña, Michael Semb Wever for CASSANDRA-19034

0e42b77c9735d1124fe0a5766447f29c891cdb5b | Author: Mike Adamson <madamson@datastax.com>
 | 2023-11-10 14:49:41+00:00

    Improve code model around IndexContext
    
     - Replace IndexContext with IndexTermType and IndexDefinition
     - Move index specific managers, factories and metrics to StorageAttachedIndex
     - Refactor Expression to explicitly define indexed and unindexed expressions
    
     patch by Mike Adamson; reviewed by Andres de la Peña, Caleb Rackliffe for CASSANDRA-18166

c4d11c4372906ae1dea9e6c31c1136f122e8a1b2 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-10-30 09:46:52+00:00

    Fix VectorUpdateDeleteTest for JDK 17
      Removed use of reflection and directly set
      relevant property to avoid JDK 17 errors
    
    patch by Mike Adamson; reviewed by Stefan Miklosovic, Michael Semb Wever and Andrés de la Peña for CASSANDRA-18715

e45c1092f91edd63591f562b2120ea6a5fd3edd5 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-10-04 11:27:50+01:00

    Correctly remove Index.Group from IndexRegistry
    
    The Index.Group was being left in the list indexGroups in the SecondaryIndexManager because the incorrect
    key was being used to remove it from the map
    
    patch by Mike Adamson; reviewed by Caleb Rackliffe and Zhao Yang for CASSANDRA-18905
    
    Co-authored-by: Zhao Yang <zhaoyangsingapore@gmail.com>

9697be1131bd8bb2332199000ad55dad12524fd2 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-09-28 16:54:31+01:00

    Fix dtests returning ordering columns that have not been selected
    
    patch by Mike Adamson; reviewed by adelapena, brandonwilliams and
    Jeremiah Jordan for CASSANDRA-18892

d16e8d3653dce8ed767a040c06dbaabc47a9b474 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
 | 2023-09-18 12:44:08+02:00

    Do not create sstable files before registering in txn
    
    Refactoring prevents the situation where some sstable components, like
    data or index, are created before the new sstable is registered with
    lifecycle transaction, which leads to a problem such that there is
    a short time when incomplete sstable components are present. At the same
    time, no transaction file is created, which leads to the possibility
    that the sstable can be recognized as completed by various
    transaction-aware listers.
    
    Patch by Jacek Lewandowski; reviewed by Branimir Lambov, Mike Adamson for CASSANDRA-18737

949b760f5516c139591473038917247b1fd7f500 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-08-30 11:51:04+01:00

    Add support for a vector search index in SAI
    
    - Adds jbellis/jvector (1.0.2) library for DiskANN based indexes on floating point vectors
    - Adds ORDER BY ANN OF capability to do ANN search and order the results by score
    
     patch by Mike Adamson; reviewed by Andrés de la Peña, Jonathon Ellis for CASSANDRA-18715
    
    Co-authored-by: Jonathon Ellis <jbellis@gmail.com>
    Co-authored-by: Zhao Yang <zhaoyangsingapore@gmail.com>

9ce86e0ff8b6344b528a0640f9dafa23f97dd85a | Author: Mike Adamson <madamson@datastax.com>
 | 2023-08-08 17:07:01+01:00

    SAI result retriever is filtering too many rows
    
    This patch fixes a bug in the SegmentMetadata that
    was only storing the partition key for min and max
    primary keys for a segment. It also contains some
    refactoring of the PrimaryKey to remove the deferred
    loading of PrimaryKeys by the PrimaryKeyMaps.
    
    Patch by Mike Adamson; reviewed by Caleb Rackliffe and Andrés de la Peña for CASSANDRA-18734

655a2455ac29395b0a303e6ad7fc4d458b18932d | Author: Mike Adamson <madamson@datastax.com>
 | 2023-07-28 17:38:20+01:00

    Reduce size of per-SSTable index components for SAI
    
    This patch removes the PRIMARY_KEY_TRIE component and adds KeyLookup.Cursor#clusteredSeekToKey() to
    search for clustering keys within a partition. To do this a new on-disk component
    PARTITION_SIZES has been added that holds the size of each partition in the SSTable.
    
    patch by Mike Adamson; reviewed by Caleb Rackliffe and Andres de la Peña for CASSANDRA-18673

83203a14c400ff99cfb2a5b7e655a663ea882c2b | Author: Caleb Rackliffe <calebrackliffe@gmail.com>
 | 2023-07-14 01:44:26-07:00

    Importer should build SSTable indexes successfully before making new SSTables readable
    
    - Avoid validation in response to SSTableAddedNotification, as it should already have been done somewhere else
    - Change SSTableWriter to prevent commit when a failure is thrown out of an index build
    
    patch by Caleb Rackliffe; reviewed by Mike Adamson and Andres de la Peña for CASSANDRA-18670

2531cb045897d5b771f79039d194a1f679d8629a | Author: Mike Adamson <madamson@datastax.com>
 | 2023-07-13 11:24:55+01:00

    Fix concurrency in bbtree reader by cloning state
    
    patch by Mike Adamson; reviewed by Andrés de la Peña and Caleb Rackliffe for CASSANDRA-18669

ebea2ba6ade00a6f156787ca4ee36b2f8eb003ad | Author: Jonathan Ellis <jbellis@datastax.com>
 | 2023-06-26 14:50:01-05:00

    Upgrade to lucene-core 9.7.0
    
    Notes on the upgrade path:
    - RamIndexOutput is replaced with ResettableByteBuffersIndexOutput, an extension of ByteBuffersIndexOutput, which was the closest thing to a replacement of RamIndexOutput.
    - Lucene exposes the code we needed from DirectReaders more or less directly in DirectReader now, so the old copied code has been deleted.
    - Lucene changed its data files to be little endian, but to keep its compatibility story simple it retained BE for the header and footer ints. That's the cause of the changes in SAICodecUtils.
    - We could gain a bit of performance making our own code natively little endian but that is too big of a change for this patch.
    
    patch by Jonathan Ellis; reviewed by Andrés de la Peña, Caleb Rackliffe, and Mike Adamson for CASSANDRA-18494

ae537abc6494564d7254a2126465522d86b44c1e | Author: David Capwell <dcapwell@apache.org>
 | 2023-06-21 15:27:26-07:00

    Added support for type VECTOR<type, dimension>
    
    patch by David Capwell; reviewed by Andres de la Peña, Maxwell Guo, Mike Adamson for CASSANDRA-18504

6f125c80420f6d249b5414d886e1b4a93cc34e7f | Author: Mike Adamson <madamson@datastax.com>
 | 2023-06-12 11:25:17+01:00

    Numeric on-disk index write and search
    
    Includes:
      - The disk/v1/kdtree package containing the
    kdtree writer and reader
      - The implementation code to tie these into
    the existing read and write paths. The main parts
    of this are the NumericIndexWriter and the
    NumericIndexSegmentSearcher
      - Additional testing for the new code
    
    patch by Mike Adamson; reviewed by Caleb Rackliffe and Andres de la Peña for CASSANDRA-18067
    
    Co-authored-by: Mike Adamson <madamson@datastax.com>
    Co-authored-by: Caleb Rackliffe <calebrackliffe@gmail.com>
    Co-authored-by: Piotr Kołaczkowski <pkolaczk@gmail.com>
    Co-authored-by: Jason Rutherglen <jason.rutherglen@gmail.com>
    Co-authored-by: Zhao Yang <zhaoyangsingapore@gmail.com>

cba3e19ccd81d705ca9f89c0eedab65824e9dd16 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-05-10 15:05:15+01:00

    Query all ranges at once for SAI distributed queries
    
    patch by Mike Adamson; reviewed by Caleb Rackliffe, Andres de la Peña, and Berenguer Blasi for CASSANDRA-18515

eb208d3561eaf645f74f60b54c71ebe5bfc24c33 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-05-09 12:29:01+01:00

    Add basic text analysis to SAI, including "case_sensitive", "normalize", and "ascii" modes
    
    patch by Mike Adamson; reviewed by Caleb Rackliffe and Andres de la Peña for CASSANDRA-18479

e5e0f3a8441503107b1ca2128cf8366e5e44d893 | Author: Mike Adamson <mikeatdot@gmail.com>
 | 2023-04-13 17:23:13+01:00

    Literal on-disk index and index write path (#9)
    
    This commit contains the following additions
     to SAI:
     - The index write path and index building
       based around StorageAttachedIndexBuilder
       and StorageAttachedIndexWriter
     - The on-disk index versioning using the
       SSTable Descriptor analog IndexDescriptor
       with Version and OnDiskFormat
     - The literal on-disk index using the
       LiteralIndexWriter
    
    patch by Mike Adamson; reviewed by Caleb Rackliffe and Andres de la Peña for CASSANDRA-18062
    
    Co-authored-by: Mike Adamson <mikeatdot@gmail.com>
    Co-authored-by: Caleb Rackliffe <calebrackliffe@gmail.com>
    Co-authored-by: Andres de la Peña <a.penya.garcia@gmail.com>
    Co-authored-by: Piotr Kołaczkowski <pkolaczk@gmail.com>
    Co-authored-by: Jason Rutherglen <jason.rutherglen@gmail.com>

cde91e56f09d9ebf315c79c9a81b89f70f4eb724 | Author: Mike Adamson <madamson@datastax.com>
 | 2023-01-19 14:24:46+00:00

    In-memory index implementation with query path
    
    This includes the following elements of the Storage Attached Index:
    - Memtable-attached indexes backed by an in-memory trie structure for byte-comparable values
    - Query path for the in-memory index
    - Index status propagation
    - Randomized testing for Memtable-attached indexes
    
    patch by Mike Adamson; reviewed by Caleb Rackliffe and Andres de la Peña for CASSANDRA-18058
    
    Co-authored-by: Mike Adamson <madamson@datastax.com>
    Co-authored-by: Caleb Rackliffe <calebrackliffe@gmail.com>