47 Collaborator |
Brandon Williams , Štefan Miklošovič , Caleb Rackliffe , Andrés de la Peña , Berenguer Blasi , Ekaterina Dimitrova , Mick Semb Wever , Marcus Eriksson , David Capwell , Sam Tunnicliffe , Alex Petrov , Benjamin Lerer , Branimir Lambov , Yifan Cai , Benedict Elliott Smith , Francisco Guerrero , Maxim Muzafarov , Aleksey Yeschenko , Mike Adamson , Sylvain Lebresne , Jackson Fleming , C. Scott Andreas , Maxwell Guo , Jeremiah Jordan , Claude Warren , Bernardo Botella , Shailaja Koppu , Jason Rutherglen , Dan Jatnieks , Szymon Miężał , Szymon Miezal , Ningzi Zhan , Dimitar Dimitrov , Jakub Zytka , German Eichberger , reviewed by Brandon Williams , Paul Rütter , Lukasz Antoniak , Calib Rackliffe , Ben Manes , Amit Pawar , Adriano Bonacin , @jacek-lewandowski , @blambov , , , |
53 Patch |
27 Review |
ad26ffcd577a09c07fe92bd3ce78ee33dfe0a191,
f92998190ccfc688e22d035318848a2f61987585,
5be57829b03ef980933ba52ecc0549787f653da4,
d422eb1f353d27264665bfe3357dac1160814ea1,
3edca0041caf95a03453c533dc70bdc62e6dabd9,
f0ea12c6d7683697d9a5ca0c99c2b7dc3bc11230,
2fc2be54caf4f1861cac8e93146ace03de5e1db6,
3259bea5332ed7d28fac8709c854adfee596dda5,
dece96f21dbdbbb3176d6544d72bef3d44571dc6,
b7b2aa5de57a6433f3f861bcaefe467d64784d1b,
4d61359c214fbe8ee8b8edc822cad79f98b337bc,
7a2bfdc56d2441d27b467614c2b25fe915ae34bf,
9dbd63a5b9aa2f2398b02ee5c72d8e977f56867d,
86e07595f744eb2a250cf6c25ee7cb09c6dbd849,
ac71d0f56efda081cf3c602eae8897b64cf84dac,
8486d678b0ee89931627aa5a00a2c5577f93f0a0,
6de90bf75c2a5138f4ed72ff6ed588dc180e8a9d,
d16e8d3653dce8ed767a040c06dbaabc47a9b474,
3658ba58c7d0be0803cbd7480c73d46705c3372d,
15f355a0062148e1ca511e8fc515e0cba380790d,
b0530276809f2cec3b170446396b4bd0869948e7,
53d1644ff4142f4383a773408c142c34954063f5,
ad26ffcd577a09c07fe92bd3ce78ee33dfe0a191,
4ea7bb25b4079e951202762aeaabe1d23be5303c,
5143bd81e82c35ce686dd40860ec2aebe30aaf22,
fad1f7457032544ab6a7b40c5d38ecb8b25899bb,
f16fb6765b8a3ff8f49accf61c908791520c0d6e,
fe0e04c2319afab958b3da83e7b54c84bced9dc2,
f96659c5306e62666e21c371c2ded646dd51672b,
b7e1e44a909c3a1d11e9c387db680c74d31b879f,
24ebd24c79175f7426f4c489dc5a006e75f09dfb,
cfe9641fbec0dc62c9a0f4f156c702e2cfa6ad4e,
9bf128aaa33c21427c826dab82414e3772d2ba24,
9213335f59293926b2d643fa8a156a882495dd42,
76be530a364b376c1d69d8447230ad5cf023be7f,
a78db628b0bcae6b1d30829b7510093ec4bca0ef,
570732375e4186741388adb81afeab6f155f57b9,
c1d89c32d27921d1f77f05d29ee248b8922a4c76,
e966c45afcf8bef47df245ccb851386e5ce60505,
458bfd16c7ec759705f920e7ef9a8f2bb5a3f4b5,
77d6bbf25a59d44422f0cbee2631f2fca9170e1a,
05fa92475ccb2beb70a96ddee83c04b65a2cdbfb,
0040fea3797ea3e497691e9d1e2660711c60ac4d,
56ec75cc6524e7c514c64a1db839d9da738e42b6,
2b2c6decfafc6235ad537e72073fab2fd4467e2f,
a690f339ab0f2b98c69621ca5a0bad10ae9a7919,
d543dae2cd0d6540d95eb3252d79e75393fd993d,
3af85373d70fb5d549447f6520da1d11a228d71a,
4bf723424564206b33782f364c0edaeb9210532c,
bb5e2335ecdfa15fc2d8d6c21b1cf7933c9b5002,
ddc7ca342f8a84565a01d769c4ec71502a2278d6,
17f67484f4597e223a669dda4d52fb2cc2250ddf,
2aea316f85e68b4e4739b61260faf5ed91552d5f |
3755934e5224a6e9f826a0a594d415c36465d449,
7c55c73825e341315e520381968338d57afbb67a,
c53d3ac8c6a743b7e730d2ac358516842b024133,
2f836fa59687d79705c96d5836978c9266813780,
5be57829b03ef980933ba52ecc0549787f653da4,
5d46ff27968050e51425083fc3ab8b7d4a51fcd5,
0e4c2f4befa22caa68b34f95d0169b4685bc7e0d,
b59b832eba014e8d2fc93133cb3db41b509a1c26,
eb30005251cd8c10732ecab8365ebaa45f5fcbde,
4093be5295e15a3665b7ca49e9527785f5893bd9,
725655dda2776fef35567496a6e331102eb7610d,
c273017b256d385fa0904d410306c7677aca4726,
e9198d6a660c96c21fe1c5bf0bbf19fbfc619189,
903857b4ef01b577db2cbcf3ea9a9b194dede21c,
4ea7bb25b4079e951202762aeaabe1d23be5303c,
ffe4d85df23e22be78b8047e91e4a065c5c73c06,
f6509086483983176f82a4b72912927693b6e573,
58a3b12508f97e44d3812f6c97e5a969dc6b5a1b,
cfe9641fbec0dc62c9a0f4f156c702e2cfa6ad4e,
49dfb805e9045c856181d6c2ac3b586b98d1a82a,
7c55c73825e341315e520381968338d57afbb67a,
562cb26010659830dd1192939ac815a0f6cb3502,
30641ea7b6b8253651562aeb0102778a0f9a405b,
e966c45afcf8bef47df245ccb851386e5ce60505,
58a3b12508f97e44d3812f6c97e5a969dc6b5a1b,
1da18efb2ffd3f9efc3b8b178b2a8d38a6831056,
dc6918d5e68a0849e8e38b7a49d0a822b95a6781 |
2f836fa59687d79705c96d5836978c9266813780 | Author: Stefan Miklosovic <smiklosovic@apache.org>
| 2024-03-08 11:32:40+01:00
Set uuid_sstable_identifiers_enabled to true for cassandra-latest.yaml
patch by Stefan Miklosovic; reviewed by Branimir Lambov, Jacek Lewandowski for CASSANDRA-19460
2fc2be54caf4f1861cac8e93146ace03de5e1db6 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-12-13 10:44:29+01:00
Fix the correspondingMessagingVersion of SSTable format and improve TTL overflow tests coverage
Patch by Jacek Lewandowski; reviewed by Berenguer Blasi for CASSANDRA-19197
f0ea12c6d7683697d9a5ca0c99c2b7dc3bc11230 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-12-12 12:36:23+01:00
Add a startup check to fail startup when using invalid configuration with certain Kernel and FS type
Patch by Jacek Lewandowski; reviewed by Maxwell Guo, Stefan Miklosovic for CASSANDRA-19196
d422eb1f353d27264665bfe3357dac1160814ea1 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-12-08 10:41:50+01:00
Fix storage_compatibility_mode for streaming
- Rename and refactor a bit property for overriding storage compatibility mode
- Fix streaming for tools
- consider bulkloader as a client instead of tool
Patch by Jacek Lewandowski; reviewed by Berenguer Blasi, Branimir Lambov for CASSANDRA-19126
3edca0041caf95a03453c533dc70bdc62e6dabd9 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-12-05 11:17:43+01:00
Memoize Cassandra version and add a backoff interval for failed schema pulls
Also, fixes MigrationCoordinatorTest and adds version assertions to Instance.startup
Patch by Jacek Lewandowski; reviewed by Ekaterina Dimitrova for CASSANDRA-18902
5be57829b03ef980933ba52ecc0549787f653da4 | Author: Szymon Miężał <szymon.miezal@datastax.com>
| 2023-11-08 17:41:45+01:00
Backport CASSANDRA-16418 to 3.x
When a node is decommissioned, it triggers data transfer to other nodes.
During this transfer process, receiving nodes temporarily hold token ranges in a pending state.
However, the current cleanup process doesn't account for these pending ranges when calculating token ownership,
leading to inadvertent cleanup of data already stored in SSTables.
To address this issue, this patch introduces two changes.
Firstly, it backports CASSANDRA-16418, introducing a preventive check in `StorageService#forceKeyspaceCleanup`.
This check disallows the initiation of cleanup when a node contains any pending ranges for the requested keyspace.
Secondly, it reintroduces a similar condition to test for the existence of pending ranges in `CompactionManager#performCleanup`.
This ensures the safety of this API as well.
Patch by Szymon Miezal; reviewed by Brandon Williams, Jacek Lewandowski for CASSANDRA-18824
Co-authored-by: Szymon Miezal <szymon.miezal@datastax.com>
Co-authored-by: Jacek Lewandowski <lewandowski.jacek@gmail.com>
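The guard described in this commit message can be illustrated with a minimal sketch. It assumes a simplified model in which pending ranges are tracked per keyspace as plain strings; `CleanupGuardSketch` and all of its methods are hypothetical names for illustration and are not the actual `StorageService` or `CompactionManager` code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration only; not Cassandra's real classes.
final class CleanupGuardSketch
{
    // keyspace name -> token ranges this node is still receiving (pending)
    private final Map<String, Set<String>> pendingRanges = new HashMap<>();

    void addPendingRange(String keyspace, String range)
    {
        pendingRanges.computeIfAbsent(keyspace, k -> new HashSet<>()).add(range);
    }

    void clearPendingRanges(String keyspace)
    {
        pendingRanges.remove(keyspace);
    }

    boolean hasPendingRanges(String keyspace)
    {
        Set<String> ranges = pendingRanges.get(keyspace);
        return ranges != null && !ranges.isEmpty();
    }

    // Returns false (cleanup refused) while any range of the keyspace is pending,
    // because ownership computed without pending ranges would treat data that is
    // still being received as not owned and remove it.
    boolean tryCleanup(String keyspace)
    {
        if (hasPendingRanges(keyspace))
            return false;
        // ... the actual cleanup work would run here ...
        return true;
    }
}
```

In this sketch the same check would be applied at both entry points, so that neither path can start cleanup while a transfer is in flight.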
7a2bfdc56d2441d27b467614c2b25fe915ae34bf | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-11-07 16:18:54+01:00
Fix incorrect seeking through the sstable iterator by IndexState
Patch by Jacek Lewandowski; reviewed by Alex Petrov and Maxim Muzafarov for CASSANDRA-18932
0e4c2f4befa22caa68b34f95d0169b4685bc7e0d | Author: Bereng <berenguerblasi@gmail.com>
| 2023-11-07 07:24:57+01:00
Default to nb instead of nc for sstable formats
patch by Berenguer Blasi; reviewed by Francisco Guerrero, Jacek Lewandowski, Michael Semb Wever for CASSANDRA-19010
b59b832eba014e8d2fc93133cb3db41b509a1c26 | Author: Stefan Miklosovic <smiklosovic@apache.org>
| 2023-10-26 12:21:44+02:00
Remove crc_check_chance from CompressionParams
patch by Stefan Miklosovic; reviewed by Maxwell Guo, Jacek Lewandowski, Branimir Lambov for CASSANDRA-18872
5d46ff27968050e51425083fc3ab8b7d4a51fcd5 | Author: Claude Warren <claude.warren@aiven.io>
| 2023-10-25 13:00:50+02:00
Remove dependency on Sigar in favor of OSHI
patch by Claude Warren; reviewed by Stefan Miklosovic, Jacek Lewandowski, Michael Semb Wever for CASSANDRA-16565
Co-authored-by: Stefan Miklosovic <smiklosovic@apache.org>
86e07595f744eb2a250cf6c25ee7cb09c6dbd849 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-10-20 12:35:33+02:00
Retrieve keyspaces metadata and schema version consistently in DescribeStatement
The fix makes DescribeStatement wait for in-progress schema transformations to finish before returning the first page. This way, the metadata and schema version encoded in the result set metadata are guaranteed to be consistent.
Patch by Jacek Lewandowski; reviewed by Benjamin Lerer, Ekaterina Dimitrova for CASSANDRA-18921
ac71d0f56efda081cf3c602eae8897b64cf84dac | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-10-11 12:12:36+02:00
Fixed the inconsistency between distributedKeyspaces and distributedAndLocalKeyspaces
Patch by Jacek Lewandowski; reviewed by Benjamin Lerer, Berenguer Blasi, Ekaterina Dimitrova, Jeremiah Jordan for CASSANDRA-18747
3259bea5332ed7d28fac8709c854adfee596dda5 | Author: Amit Pawar <Amit.Pawar@amd.com>
| 2023-10-06 01:23:37+05:30
Enable Direct-IO feature for CommitLog files using Java native APIs.
Patch by Amit Pawar and Jacek Lewandowski; reviewed by Branimir Lambov and Maxwell Guo for CASSANDRA-18464
Co-authored-by: Amit Pawar <Amit.Pawar@amd.com>
Co-authored-by: Jacek Lewandowski <lewandowski.jacek@gmail.com>
9dbd63a5b9aa2f2398b02ee5c72d8e977f56867d | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-10-05 09:09:07+02:00
Fix KeyCacheTest for cases when early open is disabled
Patch by Jacek Lewandowski; reviewed by Berenguer Blasi, Branimir Lambov, Caleb Rackliffe for CASSANDRA-18911
8486d678b0ee89931627aa5a00a2c5577f93f0a0 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-09-28 12:36:38+02:00
Fix CQLConnectionTest and SimpleClient
There are a couple of fixes in this patch. As explained on the ticket, some of the flaky failures are caused by a race between two events that terminate the connection. The first is a legitimate close resulting from an expected failure. In this case, the server sends the error message and closes the connection, and the test expects to receive that message. However, the test kept sending more messages, which could not be received because the server had already closed the connection. They were bounced by the OS, causing an immediate connection shutdown on the client side, even before the client could receive the error message.
The fix is to stop sending messages once the message that is expected to cause a failure has been sent. Other accompanying modifications include using Awaitility to wait for specific events and taking into account the configured maximum number of consecutive failures.
Also added some more logging to help investigate failures in the future.
Patch by Jacek Lewandowski; reviewed by Sam Tunnicliffe for CASSANDRA-16949
3658ba58c7d0be0803cbd7480c73d46705c3372d | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-09-20 11:44:41+02:00
JMH improvements - faster build and async profiler
- Don't create uber jar for microbenchmarks
- Add async profiler to jmh tests
- Benchmark classes names validation
- Add jmh.args property to make it possible passing extra args to JMH
- Add missing test/anttasks to idea configuration
Patch by Jacek Lewandowski; reviewed by Branimir Lambov, Maxim Muzafarov, Stefan Miklosovic for CASSANDRA-18871
d16e8d3653dce8ed767a040c06dbaabc47a9b474 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-09-18 12:44:08+02:00
Do not create sstable files before registering in txn
The refactoring prevents the situation where some sstable components, like
data or index, are created before the new sstable is registered with the
lifecycle transaction. Previously there was a short window in which
incomplete sstable components were present while no transaction file
existed yet, so the sstable could be recognized as completed by various
transaction-aware listers.
Patch by Jacek Lewandowski; reviewed by Branimir Lambov, Mike Adamson for CASSANDRA-18737
4093be5295e15a3665b7ca49e9527785f5893bd9 | Author: Brandon Williams <brandonwilliams@apache.org>
| 2023-09-08 09:02:55-05:00
Nodetool paxos-only repair is no longer incremental
Patch by Ningzi Zhan; reviewed by brandonwilliams, jlewandowski, and
Maxwell Guo for CASSANDRA-18466
725655dda2776fef35567496a6e331102eb7610d | Author: Brandon Williams <brandonwilliams@apache.org>
| 2023-09-08 09:02:55-05:00
Nodetool paxos-only repair is no longer incremental
Patch by Ningzi Zhan; reviewed by brandonwilliams, jlewandowski, and
Maxwell Guo for CASSANDRA-18466
b0530276809f2cec3b170446396b4bd0869948e7 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-09-08 11:20:30+02:00
Set the delay to 0 for unit tests and fix GuardrailDiskUsageTest
Patch by Jacek Lewandowski; reviewed by Berenguer Blasi and Jeremiah Jordan for CASSANDRA-18821
# Conflicts:
# test/distributed/org/apache/cassandra/distributed/test/ColumnMaskTest.java
# test/unit/org/apache/cassandra/cql3/CQLTester.java
eb30005251cd8c10732ecab8365ebaa45f5fcbde | Author: Ekaterina Dimitrova <ekaterina.dimitrova@datastax.com>
| 2023-08-29 11:52:08-04:00
Upgrade caffeine cache and fix CIDR permissions cache invalidation
patch by Ekaterina Dimitrova; reviewed by Jacek Lewandowski, Ben Manes, Yifan Cai, Shailaja Koppu for CASSANDRA-18805
4d61359c214fbe8ee8b8edc822cad79f98b337bc | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-08-22 10:47:29+02:00
CASSANDRA-18785: Add support for Sonar analysis
Patch by Jacek Lewandowski; reviewed by Brandon Williams, Maxim Muzafarov, Michael Semb Wever, Stefan Miklosovic for CASSANDRA-18785
3755934e5224a6e9f826a0a594d415c36465d449 | Author: Ekaterina Dimitrova <ekaterina.dimitrova@datastax.com>
| 2023-07-17 16:11:17-04:00
Drop JDK8 and add JDK17, remove eclipse-warnings in favor of Checker Framework and upgrade checkstyle
patch by Ekaterina Dimitrova; reviewed by Jeremiah Jordan, Berenguer Blasi, Michael Semb Wever and Jacek Lewandowski for CASSANDRA-18255
ad26ffcd577a09c07fe92bd3ce78ee33dfe0a191 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-07-17 12:22:33+02:00
Run checks in a separate task and fix build warnings
Patch by Jacek Lewandowski; reviewed by Mick Semb Wever and Stefan Miklosovic for CASSANDRA-18618
1da18efb2ffd3f9efc3b8b178b2a8d38a6831056 | Author: Mick Semb Wever <mck@apache.org>
| 2023-07-16 13:15:43+02:00
Fix upgrade_through_versions_test.py::TestUpgrade* tests
Make the generated upgrade_through_versions_test tests run on pytest >7.2.0
pytest-7.2.0 changed how markers were inherited, https://github.com/pytest-dev/pytest/issues/7792
Replace the marker with runtime pytest.skip call to ensure generated tests are run but not the base class.
Remove how internode_ssl was changing seeds to append the ssl storage port, it's not needed as the tests always already set enable_legacy_ssl_storage_port to true.
Filter upgrade steps by what JDKs they require and what the current JDK is (or what JAVA<jdk_version>_HOME vars are defined).
Replace any version in the multi-step upgrade path with the current code (when it matches). This enables forward upgrade testing.
patch by Mick Semb Wever; reviewed by Brandon Williams, Jacek Lewandowski for CASSANDRA-18499
c273017b256d385fa0904d410306c7677aca4726 | Author: Stefan Miklosovic <smiklosovic@apache.org>
| 2023-07-10 13:15:57+02:00
Add AzureSnitch
As we were implementing the snitch itself, we noticed that the constructors
for cloud-based snitches are unnecessarily complicated and
we took the opportunity to make them simpler.
patch by Stefan Miklosovic; reviewed by German Eichberger and Jacek Lewandowski for CASSANDRA-18646
903857b4ef01b577db2cbcf3ea9a9b194dede21c | Author: Stefan Miklosovic <smiklosovic@apache.org>
| 2023-06-29 17:30:02+02:00
Deprecate CloudstackSnitch and remove duplicate code in snitches
The patch also refactors existing cloud snitches to get rid of the duplicate code,
this is the logical follow-up of CASSANDRA-16555 where AbstractCloudMetadataServiceConnector was introduced.
patch by Stefan Miklosovic; reviewed by Jacek Lewandowski, Jackson Fleming and Maxwell Guo for CASSANDRA-18438
4bf723424564206b33782f364c0edaeb9210532c | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-06-28 14:18:46+02:00
Fix native_transport_ssl_test.py::TestNativeTransportSSL::test_connect_to_ssl
With the change in Netty, namely https://github.com/netty/netty/pull/13314, it throws NotSslRecordException
53d1644ff4142f4383a773408c142c34954063f5 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-06-13 09:44:30-05:00
Upgraded to Netty 4.1.96
- Add Bouncycastle dependency
- Upgrade tcnative boringssl
- Add TLSv1.3 to encryption options tests
- Revert defaults after changes in Netty 4.1.75
- Remove Guava 18 from deps - we accidentally ended up with Guava 30+ and 18 on the classpath because JimFS includes it as a transitive dependency.
Patch by Jacek Lewandowski and Brandon Williams; reviewed by Ekaterina Dimitrova and Berenguer Blasi for CASSANDRA-17992
Co-authored-by: Jacek Lewandowski <lewandowski.jacek@gmail.com>
Co-authored-by: Brandon Williams <driftx@gmail.com>
5143bd81e82c35ce686dd40860ec2aebe30aaf22 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-05-26 09:29:37+02:00
Track the amount of read data per row
If an sstable is corrupted in a nasty way, we may read invalid cell sizes and try to read much more data for a row than we should. In rare scenarios this can lead even to OOMs.
This simple fix tracks and limits the amount of data that is read per row. Each row has its size stored in the preamble, which can be used as a limit. If the deserialization code tries to read more than that, it will simply fail with EOF, which prevents more serious problems later.
Patch by Jacek Lewandowski; reviewed by Berenguer Blasi and Maxwell Guo for CASSANDRA-18513
# Conflicts:
# src/java/org/apache/cassandra/db/rows/UnfilteredSerializer.java
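The per-row limit described in the commit above can be sketched as a small byte budget around the input. This is only an illustration under simplifying assumptions: `RowByteBudgetSketch` and `readCell` are hypothetical names, and the real change lives in Cassandra's own deserialization code, which is not reproduced here.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not the class used by Cassandra.
final class RowByteBudgetSketch
{
    private final InputStream in;
    private long remaining; // bytes still allowed for the current row

    // rowSize is assumed to come from the row preamble
    RowByteBudgetSketch(InputStream in, long rowSize)
    {
        this.in = in;
        this.remaining = rowSize;
    }

    // Reads one cell of the declared size. The size is checked against the
    // remaining budget BEFORE allocating, so a corrupted (huge) cell size
    // fails with EOF instead of triggering a large allocation.
    byte[] readCell(int declaredSize) throws IOException
    {
        if (declaredSize < 0 || declaredSize > remaining)
            throw new EOFException("cell size " + declaredSize + " exceeds remaining row bytes " + remaining);

        byte[] buf = new byte[declaredSize];
        int off = 0;
        while (off < declaredSize)
        {
            int n = in.read(buf, off, declaredSize - off);
            if (n < 0)
                throw new EOFException("underlying stream ended early");
            off += n;
        }
        remaining -= declaredSize;
        return buf;
    }
}
```

A rejected read leaves the budget untouched in this sketch, so a caller that catches the exception can still see how many bytes of the row were legitimately consumed.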
ffe4d85df23e22be78b8047e91e4a065c5c73c06 | Author: Bernardo Botella Corbi <contacto@bernardobotella.com>
| 2023-05-18 16:21:16-07:00
Use WithProperties in try-with-resources to improve properties handling in tests
patch by Bernardo Botella Corbi; reviewed by Stefan Miklosovic, Maxim Muzafarov and Jacek Lewandowski for CASSANDRA-18453
f6509086483983176f82a4b72912927693b6e573 | Author: Maxim Muzafarov <maxmuzaf@gmail.com>
| 2023-05-03 12:15:37+02:00
Moved system properties and envs to CassandraRelevantProperties and CassandraRelevantEnv respectively
Patch by Maxim Muzafarov; reviewed by Stefan Miklosovic and Jacek Lewandowski for CASSANDRA-17797
fe0e04c2319afab958b3da83e7b54c84bced9dc2 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-04-25 15:33:48+02:00
Fix sstable formats configuration
- refactored sstable format configuration
- sstable formats are discovered via ServiceLoader
- options configuration for sstable formats can be included in yaml
- yaml may include selected sstable format and version (version is not yet supported)
- auto saved caches refactored - they include additional metadata component which contains necessary mappings
patch by Jacek Lewandowski; reviewed by David Capwell for CASSANDRA-18441
fad1f7457032544ab6a7b40c5d38ecb8b25899bb | Author: Branimir Lambov <branimir.lambov@datastax.com>
| 2023-04-21 11:52:10+03:00
Rename the byte-comparable translation version to OSS50
Also fix some minor issues in ByteComparable.md
patch by Branimir Lambov and Jacek Lewandowski; reviewed by Caleb Rackliffe and Maxwell Guo for CASSANDRA-18398
f96659c5306e62666e21c371c2ded646dd51672b | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-03-14 14:29:39+01:00
Save host id to system.local and flush immediately after startup
patch by Adriano Bonacin and Jacek Lewandowski; reviewed by Stefan Miklosovic and Sam Tunnicliffe for CASSANDRA-18153
ddc7ca342f8a84565a01d769c4ec71502a2278d6 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-03-08 14:49:56+01:00
Repair the previous fix in offline_tools_test.py to work for < 4.2
patch by Jacek Lewandowski, reviewed by Brandon Williams and Ekaterina Dimitrova for CASSANDRA-18308
b7e1e44a909c3a1d11e9c387db680c74d31b879f | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-03-02 12:46:25+01:00
SSTable format API
Summary of the changes:
Format, reader and writer
---------------------------
There are a lot of refactorings around sstable-related classes, aiming to extract the most generic functionality to the top-level entities and push down implementation-specific stuff to the actual implementation. In particular, the top-level, implementation-agnostic classes/interfaces are the SSTableFormat interface, SSTable, SSTableReader, SSTableWriter, IVerifier, and IScrubber. The rest of the codebase has been reviewed for explicit usages of big-table-format-specific sstable classes and refactored. SSTable, SSTableReader, and SSTableWriter have their builders. Builders make a hierarchy that follows the same inheritance structure as readers and writers.
There are also partial implementations that add support for some features and may or may not be used by the custom implementations. They include:
- AbstractSSTableFormat - adds an implementation of some initialization methods - in practice, all of the format implementations should extend this class
- SSTableReaderWithFilter - adds support for Bloom filter to the reader
- SortedTableWriter - generic implementation for a writer which writes partitions in the default order to the data file, supports Bloom filter and some index of partitions
- IndexSummarySupport - interface implemented by the readers using index summaries
- KeyCacheSupport - interface implemented by the readers using row key cache
Descriptor
---------------------------
Refactored the Descriptor class so that:
- All paths are created from the base directory File rather than from a String
- All the methods named *filename* producing full paths were made private; their current implementations are returning file names rather than paths (the naming was inconsistent)
- The usages of the `filenameFor` method were refactored to use the `fileFor` method
- The usages of the `fromFilename` method were refactored to use a `fromFileWithComponent(..., false).left` expression
In essence, the Descriptor class is no longer working on String-based paths.
Index summaries
---------------------------
Removed the index summary from the generic SSTableReader class and created an interface IndexSummarySupport to be implemented by the readers that need it. Methods in related classes that refer back to the reader were refactored to support just readers of the SSTableReader & IndexSummarySupport type. Therefore, we will no longer need to assume that the generic SSTableReader has anything to do with an index summary.
A new IndexSummaryComponent class encloses data fields from the index summary file (note that aside from the index summary itself, the file includes the first and last partition of the sstable). The class has been extracted to deal with those fields and have that logic in a single place.
Filter
---------------------------
Refactored IFilter and its serialization - in particular, added the `serialize` method to the IFilter interface and moved loading/saving logic to a separate utility class FilterComponent.
Extracted the SSTableReaderWithFilter abstract reader extending the generic SSTableReader with filter support.
Extracted bloom filter metrics into separate entities so that they can be plugged in if the implementation uses a filter.
Cache
---------------------------
Refactored CacheService to support different key-cache values. CacheService now supports arbitrary IRowIndexEntry implementation as a key-cache value. A new version of the auto-saving cache was created ("g") because some information about the type of serialized row index entry needs to be known before it is deserialized (or skipped). Therefore, the SSTableFormat type ordinal number is stored, which is sufficient because the IRowIndexEntry serializer is specific to the sstable format type.
Similarly to the IndexSummarySupport, a new KeyCacheSupport interface has to be implemented to mark the reader as supporting key-cache. It contains the default implementation of several methods the rest of the system relies on when the key-cache is supported.
Other changes
---------------------------
- Fixed disabling chunk cache - the enable(boolean) method in ChunkCache does not make any sense - it gives the false impression that it can disable the chunk cache once enabled, while in fact it only clears it. Added setFileCacheEnabled to DatabaseDescriptor
- Made WrappingUnfilteredRowIterator an interface
- DataInputStreamPlus extends InputStream - this makes it possible for input stream-based inheritors of DataInputPlus to extend DataInputStreamPlus. It simplifies coding because sometimes we want to get DataInputPlus implementation extending InputStream as an argument.
- Table and keyspace metrics were made pluggable - in particular, added the ability for a certain format to register gauges that are specific only to that format and make no sense for others
- Implemented mmapped region extension for compressed data
- Refactored FileHandle so that it is no longer closable
- Implemented WrappingRebufferer
- Introduced the SSTable.Owner interface to make SSTable implementation not reference higher-level entities directly. SSTable accepts passing null as the owner when there is no owner (like sometimes in offline tools) or passing a mock when needed in tests.
Individual commits
---------------------------
[4a87cd36fe] Fix disabling chunk cache
[c84c75ccf3] Made WrappingUnfilteredRowIterator an interface
[253d2b828e] Add getType to SSTableFormat
[3f169dcc20] Remove getIndexSerializer from SSTableFormat
[05bae1833b] Pull down rowIndexEntrySerializer field
[da675f2809] Moved RowIndexEntry
[673f0c5c39] Reduce usages of RowIndexEntry
[c72538be91] Refactor CacheService to support different key cache values
[54d33ee656] Minor refactoring of ColumnIndex
[93862df967] Just moved AbstractSSTableIterator to o.a.c.io.sstable.format
[9e4566a1de] Refactored AbstractSSTableIterator
[a4e61e80bb] Extracted IScrubber and IVerifier interfaces
[20f78c7419] Push down implementation of SSTableReader.firstKeyBeyond
[f2c24e5774] Moved SSTableReader.getSampleIndexesForRanges to IndexSummary
[b6c3a6c1ea] Moved SSTableReader.getKeySamples implementation to IndexSummary
[c4b90ebb33] Refactor InstanceTidier so that it is more generic
[918d5a9e74] Refactor dropping page cache
[a52fb4d558] Refactor sstable metrics
[f6d10f930f] NEW (fix up) - DataInputStreamPlus extends InputStream
[8f6a56d972] Getting rid of index summary in SSTableReader
[4a918bf725] Removed direct usages of primary index from SSTableReader
[358fa32602] Refactor KeyIterator so that it is sstable format agnostic
[14c09d89c2] Remove explicit usage of Components outside of format specific classes
[feff14e137] Move clone methods implementation from SSTableReader to BigTableReader
[64e9787b10] Move saveIndexSummary and saveBloomFilter to SSTableReaderBuilder
[ae71fe6ed8] Moved indexSummary field to BigTableReader and made it private
[df9fd8c4b9] Moved ifile field to BigTableReader and made it private
[2be6ea9ecf] Moved static open methods for BigTableReader to the reader factory
[bc0e55ac48] Minor refactoring around IFilter and its serialization
[5b95704beb] Minor refactorings around IndexSummary
[87812335e8] Extracted TOCComponent class to deal with TOC file
[fdad092a6a] Extracted CompressionInfoComponent class
[39b47e388d] Extracted StatsComponent as a helper for elements of SSTable metadata
[cdb55bff47] Fix SSTable.getMinimalKey
[b99c6d5805] Refactor FileHandle so that it is no longer closable
[77b7f7ace5] Implement WrappingRebufferer
[b6868914dd] Add progressPercentage to ProgressInfo
[7fd4956e5b] Moved copy/rename/hardLink methods from SSTableWriter to SSTable
[1ccc6bf148] Create generic SSTableBuilder and IOOptions
[da58a81102] Refactor SSTableReaderBuilder
[4501ddba1c] Refactor ColumnIndex
[d4f9e1a64b] Extracted non-big-table-specific functionality from BigTableWriter to SortedTableWriter
[379525d01e] Refactor BigTableZeroCopyWriter to SSTableZeroCopyWriter as it is not specific to big format
[8ac37f83bc] Extract EmptySSTableScanner out from BigTableScanner
[ee6673f1cf] Implement SSTableWriterBuilder
[bb26629235] Refactor opening early / final
[a327595015] Refactored SSTableWriter factory
[16ffd7334b] Extract non-big-format-specific logic from scrubber and verifier
[75e02db6af] Allow to specify the default SSTableFormat via system property
[a7b9d0d628] Small fixes around streaming
[407f977c36] Move guard collection size
[0529e57d2f] Remove explicit references to big format
[61509963ec] Unclassified minor changes
[da28d1af3a] Replaced getCreationTimeFor(Component) with getDataCreationTime()
[e99c834de6] !!! Reformatting
[882b7baa5a] Rename SSTableReader.maybePresent and fix its redundant usages
[b70c983bea] Implement mmapped region extension for compressed data
[d7ff3970de] Introduce SSTable.Owner interface
[e9feb9c462] Replaced getCreationTimeFor(Component) with getDataCreationTime()
[ee8082fb07] Created SSTableFormat.deleteOrphanedComponents
[e62950fd3d] Refactor metrics further
[cefa5b3814] Extract key cache support into separate entity
[dd55101ca1] Extracted SSTableReaderWithFilter
[510b651824] Implement customizable component types
[2be512d9fa] Pluggable SSTableFormat by making SSTableFormat.Type not an enum
[670836b55d] Refactor CRC and digest validators
[00c91103bc] Extract delete method to delete SSTables and purge row cache entries
[0819dc9fc2] Extracted trySkipFileCacheBefore(key) to SSTableReader
[732f841750] Added missing overrides in ForwardingSSTableReader
[db623218fd] Update DatabaseDescriptorRefTest
[c018c468e5] Cleanup
[eafc836242] Add @SuppressWarnings("resource") where needed
[3b7c911dd6] Documentation
patch by Jacek Lewandowski, reviewed by Branimir Lambov for CASSANDRA-17056
Co-authored-by: @jacek-lewandowski
Co-authored-by: @blambov
cfe9641fbec0dc62c9a0f4f156c702e2cfa6ad4e | Author: Stefan Miklosovic <smiklosovic@apache.org>
| 2023-02-17 15:42:03+01:00
Fix possible NoSuchFileException when removing a snapshot
patch by Stefan Miklosovic; reviewed by Jacek Lewandowski for CASSANDRA-18211
Co-authored-by: Jacek Lewandowski <lewandowski.jacek@gmail.com>
9bf128aaa33c21427c826dab82414e3772d2ba24 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-01-25 09:01:16+01:00
Improve unit tests performance
- Don't flush schema on every schema update in unit tests
- Use unix command to delete test data
- Shorten teardown
- Stable processor count presented by JMX on Jenkins, CircleCI and local
Patch by <jacek-lewandowski>, reviewed by <michaelsembwever> and <josh-mckenzie> for CASSANDRA-17427
49dfb805e9045c856181d6c2ac3b586b98d1a82a | Author: maxwellguo <cclive1601@gmail.com>
| 2023-01-16 19:49:38+01:00
Add compaction_properties column to system.compaction_history table and nodetool compactionhistory command
patch by Maxwell Guo; reviewed by Stefan Miklosovic and Jacek Lewandowski for CASSANDRA-18061
24ebd24c79175f7426f4c489dc5a006e75f09dfb | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-01-04 16:22:52+01:00
More accurate skipping of sstables in read path
This patch improves the following things:
1. SSTable metadata will store a covered slice instead of min/max clusterings. The difference is that a slice also records the type of each bound rather than just a clustering. In particular, it tells whether the lower and upper bound of an sstable is open or closed.
2. SSTable metadata will store a flag whether the SSTable contains any partition level deletions or not
3. The above two changes required introducing a new major format for SSTables - oa
4. Single partition read command makes use of the above changes. In particular, an sstable can be skipped when it does not intersect with the column filter, does not have partition level deletions and does not have statics. In case there are partition level deletions, but the other conditions are satisfied, only the partition header needs to be accessed (tests attached)
5. Skipping SSTables when those three conditions are satisfied has also been implemented for partition range queries (tests attached). Also added minor separate statistics to record the number of accessed sstables in partition reads because now not all of them need to be accessed.
6. Artificial lower bound marker is now an object on its own and is not implemented as a special case of range tombstone bound.
7. Extended the lower bound optimization usage due to 1 and 2
8. Do not initialize iterator just to get a cached partition and associated columns index. The purpose of using lower bound optimization was to avoid opening an iterator of an sstable if possible.
9. Add key range to stats metadata
[f369595b1c] Add fields to sstable version and placeholders in stats serializer
[f5c3f772e2] Add hasKeyRange and hasLegacyMinMax
[3cde51f4e1] Add partition level deletion presence marker to sstable stats
[67b2ee2152] Extract AbstractTypeSerializer
[c77b475d6c] Refactor slices intersection checking
[ceb5af3a38] Store min and max clustering as a slice in stats metadata as an improved min/max
[d1f8973929] Implement MetadataCollectorBench
[335369da84] Apply partition level deletion presence marker optimizations to single partition read command
[2497a009b9] Lower bound optimization - add slices and isReverseOrder fields to UnfilteredRowIteratorWithLowerBound
[e32ee31177] Lower bound optimization - Replace usage of RangeTombstoneMarker as a lower bound with ArtificialBoundMarker
[e213e712c4] Lower bound optimization - improve usage of lower bound optimization
[c4f93006b1] Apply read path improvements to partition range queries
[5fa462266c] Add key range to StatsMetadata
[79a7339ed4] Use key range from stats if possible
[266ed2749b] Added new sstables for LegacySSTableTest
patch by Jacek Lewandowski; reviewed by Branimir Lambov and C. Scott Andreas for CASSANDRA-18134
Co-authored-by: Branimir Lambov <blambov>
Co-authored-by: Sylvain Lebresne <pcmanus>
Co-authored-by: Jacek Lewandowski <jacek-lewandowski>
Co-authored-by: Jakub Zytka <jakubzytka>
f16fb6765b8a3ff8f49accf61c908791520c0d6e | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2023-01-03 18:17:13+01:00
Implementation of the trie-indexed SSTable format (BTI) as described in CEP-25.
Documentation in the included BTIFormat.md.
patch by Branimir Lambov and Jacek Lewandowski; reviewed by Caleb Rackliffe and Maxwell Guo for CASSANDRA-18398
9213335f59293926b2d643fa8a156a882495dd42 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2022-11-03 17:29:18+01:00
Fix Splitter sometimes creating more splits than requested
Splitter.splitOwnedRanges creates an extra split for some inputs. For example, when we request 7 ranges from the 0..31 range, it returns 8 ranges. An assertion in that method verifies that it returns the requested number of splits. Since those numbers differ, Cassandra would fail when started with assertions enabled.
patch by Jacek Lewandowski; reviewed by Marcus Eriksson for CASSANDRA-18013
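The 7-from-0..31 case above can be reproduced with a small sketch. This is not the Cassandra implementation: `naive_split` and `split_range` are hypothetical helpers, and the fixed-step approach is only one way such an extra split can arise, not necessarily the cause fixed in CASSANDRA-18013.

```python
def naive_split(left, right, parts):
    """Walk the range with a fixed, rounded-down step (hypothetical buggy variant)."""
    step = (right - left) // parts
    ranges = []
    start = left
    while start < right:
        end = min(start + step, right)
        ranges.append((start, end))
        start = end
    # The leftover remainder becomes an extra range when the span
    # is not a multiple of the requested number of parts.
    return ranges


def split_range(left, right, parts):
    """Split (left, right] into exactly `parts` contiguous sub-ranges."""
    span = right - left
    if parts < 1 or span < parts:
        raise ValueError("cannot create that many non-empty splits")
    # Derive every boundary from the full span, so rounding never accumulates
    # and the last boundary is always `right`.
    bounds = [left + span * i // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    print(len(naive_split(0, 31, 7)), naive_split(0, 31, 7))
    print(len(split_range(0, 31, 7)), split_range(0, 31, 7))
```

Computing boundaries from the total span keeps the number of splits equal to the requested count, which is the property the assertion mentioned above checks.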
a78db628b0bcae6b1d30829b7510093ec4bca0ef | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2022-08-25 20:38:45+02:00
Fix scrubber falling into infinite loop
Fixes scrubber falling into infinite loop when the last partition is broken in data file and compression is enabled.
Patch by Jacek Lewandowski, reviewed by Brandon Williams, for CASSANDRA-17862
570732375e4186741388adb81afeab6f155f57b9 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2022-06-10 11:43:53+02:00
Fix a race condition where a keyspace can be opened while it is being removed
patch by Jacek Lewandowski; reviewed by Andrés de la Peña and Ekaterina Dimitrova for CASSANDRA-17658
c1d89c32d27921d1f77f05d29ee248b8922a4c76 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2022-06-10 11:43:53+02:00
Fix a race condition where a keyspace can be opened while it is being removed
patch by Jacek Lewandowski; reviewed by Andrés de la Peña and Ekaterina Dimitrova for CASSANDRA-17658
458bfd16c7ec759705f920e7ef9a8f2bb5a3f4b5 | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2022-04-26 14:43:49+02:00
Add information on whether sstables are dropped or not to SchemaChangeListener
patch by Jacek Lewandowski; reviewed by Alex Petrov for CASSANDRA-17582
4ea7bb25b4079e951202762aeaabe1d23be5303c | Author: Stefan Miklosovic <smiklosovic@apache.org>
| 2022-04-25 15:02:03+02:00
Add support for AWS Ec2 IMDSv2
patch by Stefan Miklosovic; reviewed by Jacek Lewandowski and Brandon Williams for CASSANDRA-16555
Co-authored-by: Jacek Lewandowski <lewandowski.jacek@gmail.com>
Co-authored-by: Paul Rütter <paul@blueconic.com>
05fa92475ccb2beb70a96ddee83c04b65a2cdbfb | Author: jacek-lewandowski <lewandowski.jacek@gmail.com>
| 2022-03-25 13:09:34+00:00
Remove accidentally committed wrong legacy sstables
patch by Jacek Lewandowski; reviewed by Andrés de la Peña and Benjamin Lerer for CASSANDRA-17482
0040fea3797ea3e497691e9d1e2660711c60ac4d | Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
| 2022-01-24 11:51:13+01:00
Implement sstable generation identifier as uuid
Patch by Jacek Lewandowski; reviewed by Andrés de la Peña, Benjamin Lerer and Dan Jatnieks for CASSANDRA-17048
562cb26010659830dd1192939ac815a0f6cb3502 | Author: Branimir Lambov <branimir.lambov@datastax.com>
| 2021-11-11 15:39:21+02:00
MemtableTrie using multiple buffers
This replaces the size doubling and copying required to grow the trie
with an allocation of a new buffer. This improves the cost of expansion
at the expense of increasing individual read and write costs.
patch by Branimir Lambov; reviewed by Jason Rutherglen, Jacek Lewandowski, Andres de la Peña and Caleb Rackliffe for CASSANDRA-17240
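The trade-off described in this commit message can be illustrated with a minimal sketch. `ChunkedBuffer` is a hypothetical class, not the MemtableTrie code: it grows by allocating another fixed-size buffer instead of doubling and copying, and every access pays for an extra index computation.

```python
class ChunkedBuffer:
    """Append-only byte store that grows by adding fixed-size chunks."""

    def __init__(self, chunk_size=1024):
        self.chunk_size = chunk_size
        self.chunks = [bytearray(chunk_size)]
        self.length = 0

    def append(self, byte):
        chunk_index, offset = divmod(self.length, self.chunk_size)
        if chunk_index == len(self.chunks):
            # Growth allocates one new chunk; existing data is never copied.
            self.chunks.append(bytearray(self.chunk_size))
        self.chunks[chunk_index][offset] = byte
        self.length += 1
        return self.length - 1

    def get(self, position):
        if not 0 <= position < self.length:
            raise IndexError(position)
        # Each read maps the position to (chunk, offset) first,
        # which is the added per-access cost.
        chunk_index, offset = divmod(position, self.chunk_size)
        return self.chunks[chunk_index][offset]


if __name__ == "__main__":
    buf = ChunkedBuffer(chunk_size=4)
    for value in range(10):
        buf.append(value)
    print(len(buf.chunks), [buf.get(i) for i in range(10)])
```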
7c55c73825e341315e520381968338d57afbb67a | Author: Branimir Lambov <branimir.lambov@datastax.com>
| 2021-01-20 15:42:36+02:00
Adds a trie-based memtable implementation
patch by Branimir Lambov; reviewed by Jason Rutherglen, Jacek Lewandowski, Andres de la Peña and Caleb Rackliffe for CASSANDRA-17240
30641ea7b6b8253651562aeb0102778a0f9a405b | Author: Branimir Lambov <branimir.lambov@datastax.com>
| 2021-01-11 16:02:12+02:00
Provides the Trie interface with MemtableTrie implementation
also includes functionality to merge, intersect and iterate on tries.
patch by Branimir Lambov; reviewed by Jason Rutherglen, Jacek Lewandowski, Andres de la Peña and Caleb Rackliffe for CASSANDRA-17240
e966c45afcf8bef47df245ccb851386e5ce60505 | Author: jacek-lewandowski <jacek.lewandowski@datastax.com>
| 2020-11-06 14:59:56+01:00
ByteComparable API
Provides an API for converting all values of types that can be used in
primary keys to byte sequences that can be compared lexicographically
by unsigned byte value (i.e. byte-comparable sequences) and back.
patch by Branimir Lambov, Dimitar Dimitrov and Jacek Lewandowski;
reviewed by Caleb Rackliffe, Dimitar Dimitrov, Jacek Lewandowski and Aleksey Yeschenko for CASSANDRA-6936
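The idea of a byte-comparable sequence can be shown with a small sketch for signed 32-bit integers. It uses the common sign-bit-flip technique and is an assumption for illustration only; `encode_int32` and `decode_int32` are hypothetical names, and the actual ByteComparable encoding in Cassandra may differ.

```python
import struct

SIGN_BIT = 0x80000000


def encode_int32(value):
    """Encode a signed 32-bit int so unsigned byte order matches numeric order."""
    # Big-endian puts the most significant byte first; flipping the sign bit
    # makes negative values sort before non-negative ones.
    return struct.pack(">I", (value & 0xFFFFFFFF) ^ SIGN_BIT)


def decode_int32(data):
    """Invert encode_int32 and return the original signed value."""
    (raw,) = struct.unpack(">I", data)
    raw ^= SIGN_BIT
    return raw - (1 << 32) if raw >= (1 << 31) else raw


if __name__ == "__main__":
    values = [7, -1, 0, -(2 ** 31), 2 ** 31 - 1, -5, 1]
    # Python compares bytes objects lexicographically by unsigned byte value.
    by_bytes = sorted(values, key=encode_int32)
    print(by_bytes == sorted(values))
    print(all(decode_int32(encode_int32(v)) == v for v in values))
```

Sorting by the encoded bytes gives the same order as sorting the numbers, and decoding restores each value, which matches the "and back" requirement in the commit message.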