Alex Petrov last 1 month

3 Collaborators

Sam Tunnicliffe , Benedict Elliott Smith , David Capwell

3 Patch: 5785d86ddad62932e21eb427deede09c77352213, 7d46b7c6b005425ff40085d1fad9fa87988cebca, c55e251dbbeffa35c85aa2d9c1605ff93ac7a340
2 Review: 4f49ca5e29d9c7207654a1f3c4eac9c9f0b84e5e, f543851dffbf2907580f91608be3c1aadc4ccb95

5785d86ddad62932e21eb427deede09c77352213 | Author: Alex Petrov <oleksandr.petrov@gmail.com>
 | 2024-12-04 10:51:57+01:00

    Fix SimulatedDepsTest
    
    Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20114

7d46b7c6b005425ff40085d1fad9fa87988cebca | Author: Alex Petrov <oleksandr.petrov@gmail.com>
 | 2024-12-04 08:59:08+01:00

    Fix SimulatedAccordTaskTest
    
    Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20114

f543851dffbf2907580f91608be3c1aadc4ccb95 | Author: Benedict Elliott Smith <benedict@apache.org>
 | 2024-11-27 09:48:14+00:00

    Fixes
     - Fix notifying unmanaged after update redundant before/bootstrap
     - Do not infer invalid if we have a single round of replies with minKnown not decided and maxKnown erased - in this case store the knowledge for next request.
     - Fix SyncPoint topology selection
     - Fix CheckStatusOkFull.with(InvalidIf)
     - Fix NotifyWaitingOn
     - ExecuteTxn should only contact latest topology for follow-up requests
     - DurableBefore.min should not go backwards on new epoch topology, journal replay was not correctly handling PreApplied, partialTxn can be null if not owned
     - Fix notify pre-bootstrap that arrives post-bootstrap
     - Avoid GC race condition on Propagate where we can incorrectly infer a shard is stale
     - Ensure redundantBefore on previously-owned range does not imply redundant before for overlapping queries on still-owned range
     - Ensure we don't mark stale unless all of the quorum we contacted had erased, else we may have raced with the agreement and erase
     - Fix Invalidate when no route found for FetchData does not report to all requested local epochs
     - Fix WAS_OWNED_RETIRED without durableBefore at Universal can lead to assertions with RX that we permit to execute but that have not yet
     - Fix initialiseWaitingOn can in some cases transitively notify the command we're updating via maybeCleanup of dependencies, but the command isn't yet updated so isn't ready
     - Fix encountering a command that is pre-bootstrap, and for which we have locally 'applied' a superseding RX, so that we do not know its outcome locally (so we do not clean up the command), but also it must have been decided - and we should not respond with future dependencies.
     - Epoch failures on CoordinatePreAccept should trigger the CoordinatePreAccept failure handler
     - Use the shard bound rather than GC bound for fallback dependency
     - LatestDeps should be sliced to actual route, so as not to use both PreAccepted AND Stable deps as though Stable
     - Fix various callback issues with node.withEpoch and Recover/Propose.isDone
     - RecoverWithRoute can encounter a partially truncated transaction where the Deps for one shard are not committed. Must fetch LatestDeps.
     - Tighten LatestDeps semantics for Recover
     - CommandsForKey: do not restore pruned as APPLIED
     - Ensure prune points execute in the epoch in which they are declared
     - must merge all fast path votes including those from earlier epochs that may have witnessed a later transaction
     - Recoveries that know the transaction is committed a priori should skip the Accept phase
     - Maintain GC behaviour for redundant commands that are pre-bootstrap
     - don't apply ERASE to CommandsForKey to avoid breaking pruning
     - Introduce clearBefore to ProgressLog to more consistently handle cleaning up redundant transactions (and avoid triggering burn test invariants)
     - don't replay journal of a bootstrapping node in burn test
     - Recover, Accept or Commit reply from epoch that has been retired should be treated as Success rather than Redundant
     - Distinguish completely REDUNDANT+PRE_BOOTSTRAP from partially GC_BEFORE and REDUNDANT+PRE_BOOTSTRAP - latter can make stronger inferences based on the GC_BEFORE intersection (could perhaps be treated as simply GC_BEFORE)
     - RX must register historical transactions with CFK
     - CommandStore.bootstrapper must wait for coordinate sync via same mechanism as sync()
     - Don't start topology change for shard where all replicas are already bootstrapping
     - Reify executes et al in StoreParticipants
     - LocalListeners txn listener reentry may erase the entry entirely
     - use registerAt in AbstractRequest for expirations, use correct time for expiresAt in ListAgent
     - use txnId.epoch() for pruning, as must be before both txnId and executeAt of prune point for coordinating dependencies
     - compute accurate KnownMap when affected by bootstrap or staleness
     - upgradeTruncated should calculate Definition and Deps separately
     - Invalidate should not sort before Erased when calculating max reply or max knowledge reply
     - avoid another infinite loop at end of burn test
     - avoid another epoch loading edge case
     - pass through low/high epochs to ensure we propagate information to all waiting command stores
     - RX must adopt a non-pruned dependency that has a higher TxnId (if it is itself behind the prune point)
     - rejects should also be calculated on COMMITTED started before
     - remove Apply Factory wrapper for RX, redundant now we have CoordinationAdapters (and has faulty epoch logic)
     - for RX ensure we return maximum writes for each epoch we intersect (same effectively as pruning logic)
     - rework updateUnmanaged to improve clarity
     - BeginRecovery constructor of LatestDeps should use touches() not owns() to compute localDeps
     - BeginRecovery superseding calculation was incorrectly treating startedBefore Committed and Accepted the same, when the point at which a dep should be known differs
     - Refactor Command visiting, porting C* integration to accord-core
     - RelationMultiMap Builder should resize keys and keyLimits independently
     - CommandsForKey Serialization moved to accord-core
     - losing ownership of range should trigger re-registration of unmanaged waiting on commit of a no-longer owned txn
    
    patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20172

c55e251dbbeffa35c85aa2d9c1605ff93ac7a340 | Author: Alex Petrov <oleksandr.petrov@gmail.com>
 | 2024-11-26 15:25:32+01:00

    Implement field saving/loading in AccordJournal
    
    Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20114

4f49ca5e29d9c7207654a1f3c4eac9c9f0b84e5e | Author: David Capwell <dcapwell@apache.org>
 | 2024-11-08 13:43:50-08:00

    TCM's Retry.Deadline#retryIndefinitely is dangerous if used with RemoteProcessor as the deadline does not impact message retries
    
    patch by David Capwell; reviewed by Alex Petrov, Sam Tunnicliffe for CASSANDRA-20059