38 Collaborator |
Brandon Williams , Mick Semb Wever , Štefan Miklošovič , Berenguer Blasi , Ekaterina Dimitrova , Caleb Rackliffe , David Capwell , Andrés de la Peña , Yifan Cai , Josh McKenzie , Alex Petrov , Jacek Lewandowski , Jon Meredith , Dinesh Joshi , Maxim Muzafarov , Doug Rohrer , Andy Tolbert , Maxwell Guo , Abe Ratnofsky , Saranya Krishnakumar , Bernardo Botella , Jyothsna Konisa , James Berragan , Arjun Ashok , Maulin Vasavada , Yuriy Semchyshyn , jberragan , jkonisa , Shailaja Koppu , reviewed by Yifan Cai , reviewed by Francisco Guerrero for CASSANDRA-19119 , cclive1601 , Yifan , Saranya , Francisco , Doug , Dinesh , Andrew Tolbert |
113 Patch |
64 Review |
fe025c7f79e76d99e0db347518a7872fd4a114bc,
25291ff3fd99f92cdb0a7d5d2125442282d42ff8,
bb68141861e77623f0d0b13f72846651a71c1017,
a0af41f666c23a840d9df3f06729ed5fd2c06cd1,
8c89e2adb7680ecb4dd3cb2a562206fb8cb50d4a,
26c374da4f03e4a6b64e414805cd92f3eb0a36c6,
6ffa43f68b8d10ca84d4a00bf81269527b4e14df,
9c796dfb272daa3ce57a2dc5cbeadd9273e1ac72,
83c169ec9e36324f27bf562951362f4a03c3c688,
9184dd5a998366dc2b5c18d4954b13b033efcf80,
a250126f0f277b43a18cb665ccd02a105271bc33,
a9725b681b948f2122f3d48b96a5c4e7403d2c39,
ffc4c89c3df7ad0ae73ebefdcb7e15a2790c0a52,
e773bbd9c6c52a2fedc127cb7ab77a1fdbeb63d2,
61be4d836213f708d9a29e59b9ef1df0bebef29a,
543608ba39d5803b963d14821abe193ff0796b4f,
df16b3750dc2c1b6b9bcdece6f81dfd3de7ebdfa,
0dc5a289e8dd586150253d951e6e229480c0ffc8,
e0a61f73b9b9d14db3e68aafb38257a7689557b9,
0448f15e3db392f2f60db332fabf6309aa3d5089,
58dc085bbb33bba84becd6adc3b31e9325eda4eb,
a424d836811508078f0761fd4650c31e330e1886,
ba7e58c43c334260f296a28079ad3a1d5a3e3605,
1afdb68bea7aef495a3cdc895b778fc4ea2c72e0,
995d1683eb688a75d53cbaef4b1507905dcf24e0,
453107f9a0a7c00b21299f426fb24dda82d735eb,
7f8db256377ff4e449a83282b2561ce5e7b74adf,
971941cac0ca62d03503509b92d035624388ffba,
e95786a077e1137dcaae206854986987edc6a71e,
2a25c5defc413e6511a2fceee16e87d090b961c8,
a57aeb63ee65a06f1b9a2dc7eb77684f6853ebf2,
77c815071a66fb53b97e9e07695417004dd88804,
4a6b8c9cfe0c6286d12c7d561941a24c25a206ef,
20795db4d708b9287e0a2281695923bfb6fa9138,
f848cd063e5e1671c84807615f5eae809253971d,
af0060f1325bf8edf74171f60326a3427e13e01d,
b5570109c19acaf91281fd7901041c0c2b1f3b6c,
1d7b3f10722b52482d2123a4784dda9c92949137,
e329f3232fae867879aba2cd0a766404aaf9a427,
cf09cded72dd04e272f56fba0b7d9cceb0c4f894,
49723c720e0ae5762d77dfb7568c9b290a877560,
888a546f84790c0a0b1b930e682cf597caaa0d61,
529171b1f6dee277a9087eb9da7242ce17873643,
ad936f6482aee2a05fa45ba4fdd06267958298f6,
b16b89f7f4a15441af757fa5a84b7e1e0420dcba,
31fa33fcb446e522947f899d948de4042be04c62,
c1a6225dac62db94fd3faee92513e84ba3b9b3b7,
d8e9e2359db1f5561eb872d28ab16098b5a62c1d,
36d48b209c84470930d5ce3d4f4ab1c406dae60e,
8045f8eedad4510ecbf66bb739d464eec741aced,
c7b170cf2e7764159a3d9cf4ca0abc6db1659e51,
f8605b3c3dfdfdc5ef1b8327f4fd657efca8f9b7,
6b650d956ed9ca57b99065cdc3c2c81d2ef0c2d1,
63292010803875af6496ce7c787f404e66311375,
ccda6798caae7eb7ef11ac9ee248f2e80720a6e2,
4be61d482c9a717fa4ebe2fc4b7d09a230dac68a,
cf796be039b9b08364de11fd7fec8a00cf699616,
ca8dd2e0381a19ede99ce4b70959ce7a584ea0d2,
20caf6efdd50c276e22f3a681b87e883891877b8,
a4ca352a1720744a2424c64c498f31b2460924e8,
ab1884d2eedace79857489cb9dbe455ada9e4ad1,
3cbb3d19c6f043b3a20e4933e5ff7a0e3d58f0a9,
9912a620a0e67d9aa723037aaf5237598a895eb7,
2cc2eb5844e1f7ec54b2934589c2f4a6a3d226bc,
40c3fac3891013457035fcfc4944b27535d5d701,
5d2cbaf5cb810f53689bf227e2c1f78a9a2b2e9f,
38cdacb2e7418e2aefbcffb1754dcd324c46028d,
02cc6548f291528e9749a51d103463f9552f4b4e,
30c04eb38a796183643bdcbaff8f425d90ebf671,
6e358acfce071cad16ac88c15dc2229bbb8a7944,
5471b66c1e69e057bbbe75e4ffe67c1891cd9495,
1759eb535ff2dbd3dea3fa8626eba0fe70ecb113,
05c0bbe29f75d678596af09abb0c68d15e93f7ba,
07deb07d2e57a32c67643385403980a9a3c3fc95,
a7a2c29e990acb2363eb7a15cc4b970ebdc04753,
54abba8d7d870da5055bef79a51cf52fad980deb,
24a08f22707901f7641e48f0c26e54b05c0e03c3,
2e233ec579e4d2a23021116027d75e776e7ad9ec,
7a5e710a2173e492907edc4094e052157a562103,
bc219cbf75bdbfdc7a95b3160ef17332c9274b44,
67281b31010791fa7f0d02dd0f776862e15846d3,
b31a451873eb411788a6d94c3cacf881f5c3cb86,
8dce35f1cb3c204be669548ee286055b12e67fe9,
16a0e5f3193c9764d897a98126a2c3c8b4c498d5,
45de9e08e7dc177f1f3273456b62af7ee0f5dbdb,
f4014c06d7668541010d59cc932970e9ebfc36f5,
86420f9d52991fb148b322031df55494669532d3,
aea798dc7e517af520a403d4d86f3bc6bed65092,
690101840d4d8f9c656bb0ca114f6619af80e1cf,
47fdb6448b6956249790d5dc7bb76b699d35c079,
cbbf33d001b6f953be5654f00d7dfb54011a7619,
295358095db80ced4b8f54f603f7bd9833a8f175,
d1d0dd70951c9997ca7f9eeb184da64a0eb8fed7,
c00c454d698e5a29caf58e61ed52ab48d08fd7fe,
d28442ae712c1597052493aa3d2353a2de2495c2,
46c35d0ef2efb66512133a7913df9936b0a80dc8,
dc0e79b9c483562ec0920d69e886715eb329c426,
fc08d45b283e701aa6d558e99cd18318394b0de7,
d949d8c2b9813c3e8429ece34c364a356bd7d6eb,
e82fceaecfe5ea04ac3ddff92be5a6a41456333c,
550bdfa1c6082537e2cfb93449128a61dbe3a1fb,
c7c3bbca2c7cb415b39689e924fa2357c239f043,
457b36bcb3c8a865cca83ca6c402246798113ab4,
82b3c0a79c9322142738a4ec2ff7d4d4c0be2370,
69766bca399cc779e0f2f8e859e39f7e29a17b7a,
02d9136cfa72c8990120eca0f4fe5f52587bceb5,
ee1c83722bfb1155bef762cdfb2c86034857f2d0,
cbae09ca71b9eb9a581b77c23844da21474b095a,
bd0b41fb82134844a15fbb43126424d96706d08e,
f0fae2deeee20df15ac1105af2163af2a7e7953d,
b87b0edd310d1ef93c507bbbb1ae51e1b0b319c6,
7764214d1fb44fb6139a622f403bb05610e8f7b1,
1633cd9c6c3d88d5c66825fab76a369266509f7e |
c853efffa8b173a3afe1b966456bb77db5a68883,
f410b0fa0bc5adbb674654a0e27b02282971cfec,
659558c980c67a80287ca7ccdfc8a70b1a56b7e2,
ca729f6fad70eb07dcf590274e8664b865af2d42,
98a0b54c4025ef21aa3fb56f1962c4771e095652,
d336dda1123af0c272c69e42b6214577e30447e1,
4120b8ce4f1bc7bd7ce101e4e298fc2211a21fe0,
c09d0d929baeaa02f3438313c7979ccf6b4b3c5a,
d5cea135c98bb98b16b215d309ead22e86f1329f,
bfcb21fbebfef14fbfe626bfd39d66f5e5c51018,
f46444b6285fad5453a4ab845b873fc03942ba76,
50273d98e4780b57da37400752eab69e65cd41bc,
0e4c2f4befa22caa68b34f95d0169b4685bc7e0d,
b9586501a6b6cdfe465302448018785652c9b966,
4bfca2badb3284657a65d8910a4f77eaf7689b31,
8bfe0e5878c64ed25591aae50643187bc8ab7241,
e5e13c02ccf386093153fd6824fd85ef7bd24eb3,
dd08314ed654aafa60b2a82fc4953aac171ba3ef,
ed3901823a5fe9f8838d8b592a1b7703b12e810b,
26dd119679605bf61ad3caa24a70509e5be5aac9,
3a6f6907314670fdb2b316db8f08ffd85da88851,
945a4fc23ac1f60b8380be3b60aef89caf3daba2,
dad3e86dfe73ae1ba4aa5a23cf8194bed3f46322,
dd536f2e70118cd5d0c319f5be3e54e3d50eb288,
a3ca0897c2190c2c18992ca2b7e5255318ff3eba,
3bcc5297bd3115ff5949a9295eed6a9ad03fd096,
4624a17098e055e0abf9a6025451d4352cb9c147,
ff9ac41b4695c1df59f5293f69e0d3a1ce0da9f4,
f123406e458c0112145f37dcd3f8c20ba47c949d,
8655ca54a5d0749fccb2ad6a06ec230e8b0de24e,
9f0f475ba87df6c631029748bfafb4169bfa6465,
cfe293dadcf7a1d4491591cfd39fc410a8fa52ba,
dbbd211cd420eb185d0579f16f5d46abc7bafeb4,
e168011c40de2ca48d138514640838067e61feea,
3023a204c8ef16f886bd3dc219f7534b7edbaf2a,
bac08796181979afef4cc518789a380edef500f0,
84d84fe36b0d6e250c3d221c28c40b6925e4c222,
458a3630f882ae2b2a9cee272cf85ca7ff42f5cd,
798182a6fda562538c2f44e4f3f92a7cb68cd81c,
2a693d721182c1d354ff1323b1324fe06ac03f36,
a242b352c28947427a9bfc30295a487017439fd9,
466e7bf5e160d1667c12ea1de1b79ba27670aba4,
aea798dc7e517af520a403d4d86f3bc6bed65092,
c5d6dfd1bc9b682d704d28f77807ba72317b1944,
164243e78f1557a34bc699ebc716b532781d6422,
6ce33604bbd9acbee092ab3c4f7f11c0d434f730,
a13532272051d4e4608f92d53bdd997103e8ea19,
cf6de14d5b96ea173d6a1b2dad9bb64d563df06c,
c3e8803b3331bc7ef81797ac52a8417524f67edc,
d61e44f78fa4ba5ec395e1e39c507d666fddefd1,
e0ae9d7484e242f6af495aac2cb4d8dc121fba89,
047d13806078bafdc3954273b0e240dbbb976bd4,
8c20b452dd0728a6fad6d276a7be9fa1b9274495,
b5ba5fad4df490d1b7d47889361db910589409b8,
fa6df8e2c09ad3d27bfe8c0ce016c839094630f6,
e8fb77f4813b469d73d39c84acf1e1fe7a40702b,
0aaf5659028dd874c8d666c636f11eae63c429e6,
672d66a64a21e23c4d81c089b426360c2bb708b7,
bbfca46129992e83055ba9b0b4f836871eef0990,
680cc9395c55a88217f2de975f62ad588e8c95d5,
87a729feb4660f57bacb2a4be73e1bb2d509578b,
deebdf97ad01f23550d7d3b42d98c7bf111e2f95,
f24951ab6ea2b1e9af4013b030675c70d31adb90,
9523a38b3f1b5bc4313e2949896ddc1fff58afbe |
58dc085bbb33bba84becd6adc3b31e9325eda4eb | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-12-30 15:31:48-08:00
ninja fix: update CHANGES for 8d88fba753db9a464bb0b562b87bd0d5b6271c69 (CASSSIDECAR-158)
a424d836811508078f0761fd4650c31e330e1886 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-12-30 15:28:42-08:00
CASSSIDECAR-182: Refactor access to delegate methods to simplify (#168)
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSSIDECAR-182
ba7e58c43c334260f296a28079ad3a1d5a3e3605 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-12-19 15:55:38-08:00
CASSSIDECAR-174: Mechanism to have a reduced number of Sidecar instances run operations (#160)
Patch by Francisco Guerrero; Reviewed by Yifan Cai, Doug Rohrer for CASSSIDECAR-174
1afdb68bea7aef495a3cdc895b778fc4ea2c72e0 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-12-12 16:48:45-08:00
CASSSIDECAR-178: Stopping Sidecar can take a long time (#162)
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSSIDECAR-178
995d1683eb688a75d53cbaef4b1507905dcf24e0 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-12-06 13:02:34-08:00
CASSSIDECAR-122: yaml configuration defaults to a file that doesn't e… (#153)
* CASSSIDECAR-122: yaml configuration defaults to a file that doesn't exist
Patch by Francisco Guerrero; Reviewed by Jon Haddad, Yifan Cai for CASSSIDECAR-122
dd536f2e70118cd5d0c319f5be3e54e3d50eb288 | Author: Yifan Cai <ycai@apache.org>
| 2024-11-15 15:31:14-08:00
CASSANDRA-20066: Expose detailed bulk write failure message for better insight (#92)
Patch by Yifan Cai; Reviewed by Doug Rohrer, Francisco Guerrero for CASSANDRA-20066
a3ca0897c2190c2c18992ca2b7e5255318ff3eba | Author: Yifan Cai <ycai@apache.org>
| 2024-11-05 14:22:45-08:00
CASSANDRA-19994: Add dataTransferApi and TwoPhaseImportCoordinator for coordinated write (#91)
Patch by Yifan Cai; Reviewed by Doug Rohrer, Francisco Guerrero for CASSANDRA-19994
3bcc5297bd3115ff5949a9295eed6a9ad03fd096 | Author: jberragan <jberragan@gmail.com>
| 2024-10-17 10:02:25-07:00
CASSANDRA-19980: Remove SparkSQL dependency from CassandraBridge so that it can be used independent from Spark (#88)
Patch by James Berragan; Reviewed by Francisco Guerrero, Yifan Cai for CASSANDRA-19980
453107f9a0a7c00b21299f426fb24dda82d735eb | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-10-09 15:45:42-07:00
CASSANDRASC-147: Expose vert.x filesystem options configuration (#138)
By default, vert.x will attempt to resolve files from the application classpath
when it is unable to find them in the local filesystem. Additionally, by default
vert.x will cache any files that it reads from the classpath into the local
filesystem.
For Sidecar, this optimization is unnecessary as Sidecar doesn't package anything
in the classpath that might be used while running the application.
In this commit, we disable this optimization by default, but expose configuration
options to tune this behavior as needed.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-147
f123406e458c0112145f37dcd3f8c20ba47c949d | Author: Yifan Cai <ycai@apache.org>
| 2024-09-11 21:03:47-07:00
CASSANDRA-19909: Add writer options COORDINATED_WRITE_CONFIG to define coordinated write to multiple Cassandra clusters (#79)
The option specifies the configuration (in JSON) for coordinated write.
See org.apache.cassandra.spark.bulkwriter.coordinatedwrite.CoordinatedWriteConf.
When the option is present, SIDECAR_CONTACT_POINTS, SIDECAR_INSTANCES and LOCAL_DC are ignored if they are present.
Patch by Yifan Cai; Reviewed by Doug Rohrer, Francisco Guerrero for CASSANDRA-19909
7f8db256377ff4e449a83282b2561ce5e7b74adf | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-09-05 15:10:37-07:00
CASSANDRASC-145: Move root project into a subproject in gradle (#136)
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-145
9f0f475ba87df6c631029748bfafb4169bfa6465 | Author: Arjun Ashok <arjun_ashok@apple.com>
| 2024-09-05 13:24:01-07:00
CASSANDRA-19873: Removes checks for blocked instances from bulk-write path (#76)
Patch by Arjun Ashok; Reviewed by Yifan Cai, Francisco Guerrero for CASSANDRA-19873
cfe293dadcf7a1d4491591cfd39fc410a8fa52ba | Author: Yifan Cai <ycai@apache.org>
| 2024-08-30 11:08:00-07:00
CASSANDRA-19842: Consistency level check incorrectly passes when majority of the replica set is unavailable for write (#75)
Patch by Yifan Cai; Reviewed by Doug Rohrer, Francisco Guerrero for CASSANDRA-19842
e168011c40de2ca48d138514640838067e61feea | Author: Yifan Cai <ycai@apache.org>
| 2024-08-07 17:02:38-07:00
CASSANDRA-19806: Stream sstable eagerly when bulk writing to reclaim local disk space sooner (#69)
Patch by Yifan Cai; Reviewed by Francisco Guerrero for CASSANDRA-19806
971941cac0ca62d03503509b92d035624388ffba | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-08-06 15:06:30-07:00
Fixes updating traffic shaping options throws IllegalStateException (#130)
Patch by Francisco Guerrero; Reviewed by Saranya Krishnakumar, Arjun Ashok, Yifan Cai for CASSANDRASC-140
3023a204c8ef16f886bd3dc219f7534b7edbaf2a | Author: jberragan <jberragan@gmail.com>
| 2024-08-03 07:51:52+01:00
CASSANDRA-19807: Improve the core bulk reader test system to match actual and expected rows by concatenating the partition keys with the serialized hex string instead of utf-8 string (#70)
Patch by James Berragan; Reviewed by Francisco Guerrero, Yifan Cai for CASSANDRA-19807
84d84fe36b0d6e250c3d221c28c40b6925e4c222 | Author: jberragan <jberragan@gmail.com>
| 2024-07-22 13:38:28-07:00
CASSANDRA-19791: Remove other uses of Apache Commons lang for hashcode, equality and random string generation (#67)
Patch by James Berragan; Reviewed by Francisco Guerrero, Yifan Cai for CASSANDRA-19791
458a3630f882ae2b2a9cee272cf85ca7ff42f5cd | Author: jberragan <jberragan@gmail.com>
| 2024-07-17 14:29:21-07:00
CASSANDRA-19778: Split out BufferingInputStream stats into separate i… (#66)
Split BufferingInputStream stats into separate interface so class level generics are not required for the Stats interface
Patch by James Berragan; Reviewed by Bernardo Botella, Francisco Guerrero, Yifan Cai for CASSANDRA-19778
798182a6fda562538c2f44e4f3f92a7cb68cd81c | Author: Yifan Cai <ycai@apache.org>
| 2024-07-16 21:44:18-07:00
CASSANDRA-19772: Deprecate option SIDECAR_INSTANCES and replace with SIDECAR_CONTACT_POINTS (#63)
This patch introduces a new option SIDECAR_CONTACT_POINTS for both bulk writer and reader. The option name better describes the purpose, which is to specify the initial contact points to discover the cluster topology. The existing option SIDECAR_INSTANCES is used for the same purpose and is now deprecated.
In addition, it allows including the port value in the addresses when defining SIDECAR_CONTACT_POINTS.
Patch by Yifan Cai; Reviewed by Francisco Guerrero for CASSANDRA-19772
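As a rough illustration of contact points that may carry a port, the sketch below parses a comma-separated `host[:port]` list. The class `ContactPoints`, its `parse` method, the sample host names, and the default-port parameter are all hypothetical and are not taken from the Analytics code base. IPv6 literals are deliberately not handled.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not the parser used by Cassandra Analytics.
final class ContactPoints
{
    // Parses "host1:9043,host2" into unresolved addresses and applies
    // defaultPort to entries that do not carry an explicit port.
    static List<InetSocketAddress> parse(String value, int defaultPort)
    {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String entry : value.split(","))
        {
            String trimmed = entry.trim();
            if (trimmed.isEmpty())
            {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0 ? defaultPort : Integer.parseInt(trimmed.substring(colon + 1));
            result.add(InetSocketAddress.createUnresolved(host, port));
        }
        return result;
    }

    public static void main(String[] args)
    {
        // The host names and the default port are made-up sample values.
        System.out.println(parse("sidecar-1:9043, sidecar-2", 8080));
    }
}
```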
2a693d721182c1d354ff1323b1324fe06ac03f36 | Author: Yifan Cai <ycai@apache.org>
| 2024-07-16 15:20:52-07:00
CASSANDRA-19774: Bump Cassandra Sidecar version (#65)
Update Cassandra Sidecar commit sha: 55a9efee30555d3645680c6524043a6c9bc1194b
Patch by Yifan Cai; Reviewed by Francisco Guerrero for CASSANDRA-19774
a242b352c28947427a9bfc30295a487017439fd9 | Author: jberragan <jberragan@gmail.com>
| 2024-07-12 14:57:38-07:00
CASSANDRA-19748: Refactoring to introduce new cassandra-analytics-common module with minimal dependencies (#62)
- Add new module cassandra-analytics-common with no dependencies on Spark or Cassandra and minimal standard dependencies (Guava, Jackson, Commons Lang, Kryo, etc.)
- Move standalone classes to cassandra-analytics-common module.
Some additional refactoring and clean up:
- Rename SSTableInputStream -> BufferingInputStream
- Rename SSTableSource -> CassandraFileSource
- Introduce CassandraFile interface to be the implementing class for SSTable and CommitLog.
- Generalize IStats to work across different CassandraFile types
- Rename methods in StreamScanner to make the API clearer.
- Move ComplexTypeBuffer, ListBuffer, MapBuffer, SetBuffer, UdtBuffer to standalone classes
- Delete unused classes RangeTombstone, ReplicaSet and CollectionElement.
- Remove commons lang as a dependency
- Rename Rid to RowData
Patch by James Berragan; Reviewed by Bernardo Botella, Dinesh Joshi, Francisco Guerrero, Yifan Cai, Yuriy Semchyshyn for CASSANDRA-19748
b31a451873eb411788a6d94c3cacf881f5c3cb86 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-06-27 15:51:15-07:00
CASSANDRA-19727: Bulk writer fails validation stage when writing to a cluster using RandomPartitioner (#61)
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19727
8dce35f1cb3c204be669548ee286055b12e67fe9 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-06-20 14:24:21-07:00
CASSANDRA-19716: Invalid mapping when timestamp is used as a partition key during bulk writes (#60)
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19716
ca729f6fad70eb07dcf590274e8664b865af2d42 | Author: Bernardo Botella <contacto@bernardobotella.com>
| 2024-06-07 11:44:31-07:00
CASSANDRA-19685 - Add auto_hints_cleanup_enabled to web documentation
patch by Bernardo Botella; reviewed by Francisco Guerrero, Yifan Cai for CASSANDRA-19685
bb68141861e77623f0d0b13f72846651a71c1017 | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2024-05-29 13:38:02-07:00
CASSANDRA-19669: Audit Log entries are missing identity for mTLS connections
Patch by Francisco Guerrero; Reviewed by Bernardo Botella, Andrew Tolbert, Dinesh Joshi for CASSANDRA-19669
f4014c06d7668541010d59cc932970e9ebfc36f5 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-05-10 13:17:04-07:00
CASSANDRA-19626 Fix NullPointerException when reading static column with null values (#58)
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19626
e95786a077e1137dcaae206854986987edc6a71e | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-05-08 18:04:05-07:00
CASSANDRASC-129: Remove tableId from list snapshot response (#120)
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-129
466e7bf5e160d1667c12ea1de1b79ba27670aba4 | Author: Yifan Cai <52585731+yifan-c@users.noreply.github.com>
| 2024-05-03 10:11:18-07:00
CASSANDRA-19616: Integrate with the latest sidecar client (#56)
The patch updates the analytics code to consume the latest sidecar client after CASSANDRASC-127
Patch by Yifan Cai; Reviewed by Francisco Guerrero for CASSANDRA-19616
2a25c5defc413e6511a2fceee16e87d090b961c8 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-04-25 12:39:46-07:00
CASSANDRASC-125: Import Queue pendingImports metrics is reporting an … (#117)
The pending imports metric does not aggregate across all keyspaces/tables. In this commit,
we aggregate the queue sizes and report on a per-host basis.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-125
77c815071a66fb53b97e9e07695417004dd88804 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-04-23 10:44:28-07:00
CASSANDRASC-123: Add missing method to retrieve the InetSocketAddress to DriverUtils (#114)
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-123
aea798dc7e517af520a403d4d86f3bc6bed65092 | Author: Yifan Cai <52585731+yifan-c@users.noreply.github.com>
| 2024-04-22 15:46:08-07:00
CASSANDRA-19563: Support bulk write via S3 (#53)
This commit adds a configuration (writer) option to pick a transport other than the previously-implemented "direct upload to all sidecars" (now known as the "Direct" transport). The second transport, now being implemented, is the "S3_COMPAT" transport, which allows the job to upload the generated SSTables to an S3-compatible storage system, and then inform the Cassandra Sidecar that those files are available for download & commit.
Additionally, a plug-in system was added to allow communications between custom transport hooks and the job, so the custom hook can provide updated credentials and out-of-band status updates on S3-related issues.
Co-Authored-By: Yifan Cai <ycai@apache.org>
Co-Authored-By: Doug Rohrer <drohrer@apple.com>
Co-Authored-By: Francisco Guerrero <frankgh@apache.org>
Co-Authored-By: Saranya Krishnakumar <saranya_k@apple.com>
Patch by Yifan Cai, Doug Rohrer, Francisco Guerrero, Saranya Krishnakumar; Reviewed by Francisco Guerrero for CASSANDRA-19563
4a6b8c9cfe0c6286d12c7d561941a24c25a206ef | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-04-21 14:22:36-07:00
CASSANDRASC-94 Reduce filesystem call while streaming SSTables (#91)
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-94
690101840d4d8f9c656bb0ca114f6619af80e1cf | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-04-08 14:33:50-07:00
CASSANDRA-19526: Optionally enable TLS in the server and client for Analytics testing
All integration tests today run without TLS, which is generally fine because they run locally. However,
it is helpful to be able to start up the sidecar with TLS enabled in the integration test framework so
that third-party tests could connect via secure connections for testing purposes.
Co-authored-by: Doug Rohrer <drohrer@apple.com>
Co-authored-by: Francisco Guerrero <frankgh@apache.org>
Patch by Doug Rohrer, Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19526
20795db4d708b9287e0a2281695923bfb6fa9138 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-04-03 15:16:57-07:00
CASSANDRASC-116: Allow for JmxClient to be extensible
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-116
cbbf33d001b6f953be5654f00d7dfb54011a7619 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-04-03 11:08:32-07:00
CASSANDRA-19519: Migrate remaining integration tests to the single dtest cluster per class model (#49)
Additionally, we remove the unused test framework code after migrating the tests.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19519
295358095db80ced4b8f54f603f7bd9833a8f175 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-04-02 16:26:40-07:00
CASSANDRA-19513: Refactor Cassandra bridge (#48)
This commit splits the bridge implementation from the shaded `cassandra-all` library. This separation
allows for better integration of different `cassandra-all` implementations. Additionally, it better
separates the actual bridge code from the Cassandra code.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19513
d1d0dd70951c9997ca7f9eeb184da64a0eb8fed7 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-04-02 12:01:49-07:00
Ninja fix for CASSANDRA-19340
Revert "Make sure bridge exists"
This reverts commit 98baab1b8f0d5d7eb93f8d13db3b0a7a985fb03a.
We revert this commit because the commit message was lost during merge.
We immediately add the same commit with the correct commit message, to
avoid rewriting git history.
c00c454d698e5a29caf58e61ed52ab48d08fd7fe | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-04-01 12:11:52-07:00
CASSANDRA-19507 Fix bulk reads of multiple tables that potentially have the same data file name (#47)
When reading multiple data frames using bulk reader from different tables, it is possible to encounter a data
file name being retrieved from the same Sidecar instance. Because the `SSTable`s are cached in the `SSTableCache`,
it is possible that the `org.apache.cassandra.spark.reader.SSTableReader` uses the incorrect `SSTable` if it was
cached with the same `#hashCode`.
In this patch, the equality takes into account the keyspace, table, and snapshot name.
Additionally, we implement the `hashCode` and `equals` method in `org.apache.cassandra.clients.SidecarInstanceImpl` to utilize the `SSTableCache` correctly. Once the methods are implemented, the issue originally described in JIRA is surfaced.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19507
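The cache-key idea described in CASSANDRA-19507 can be sketched as follows. This is a minimal illustration, assuming a key made only of keyspace, table, snapshot name, and data file name. `SnapshotFileKey` is a hypothetical class, not the actual `SSTable` implementation, which may compare further fields such as the Sidecar instance.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical cache key for illustration; not the project's SSTable class.
final class SnapshotFileKey
{
    private final String keyspace;
    private final String table;
    private final String snapshot;
    private final String dataFileName;

    SnapshotFileKey(String keyspace, String table, String snapshot, String dataFileName)
    {
        this.keyspace = keyspace;
        this.table = table;
        this.snapshot = snapshot;
        this.dataFileName = dataFileName;
    }

    @Override
    public boolean equals(Object other)
    {
        if (this == other)
        {
            return true;
        }
        if (!(other instanceof SnapshotFileKey))
        {
            return false;
        }
        SnapshotFileKey that = (SnapshotFileKey) other;
        // The file name alone is not enough: two tables can share a data file name.
        return keyspace.equals(that.keyspace)
               && table.equals(that.table)
               && snapshot.equals(that.snapshot)
               && dataFileName.equals(that.dataFileName);
    }

    @Override
    public int hashCode()
    {
        // Must use the same fields as equals so hash-based caches behave consistently.
        return Objects.hash(keyspace, table, snapshot, dataFileName);
    }

    public static void main(String[] args)
    {
        Map<SnapshotFileKey, String> cache = new HashMap<>();
        cache.put(new SnapshotFileKey("ks", "t1", "snap", "nb-1-big-Data.db"), "sstable of t1");
        cache.put(new SnapshotFileKey("ks", "t2", "snap", "nb-1-big-Data.db"), "sstable of t2");
        System.out.println(cache.size()); // the two tables occupy separate entries
    }
}
```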
d28442ae712c1597052493aa3d2353a2de2495c2 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-03-27 13:32:39-07:00
CASSANDRA-19500 Fix XXHash32Digest calculated digest value (#46)
This PR bumps the Sidecar version to the current latest HEAD of Sidecar. Bumping the
version surfaced an issue with the way we are producing digest strings for the XXHash32
implementation. The hash value is not masked and this causes the negative sign to be
forwarded producing the incorrect hash result.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19500
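One way an unmasked 32-bit hash can carry its sign into a digest string is shown below. This is only a sketch of the general pitfall, assuming the digest is rendered as hexadecimal through a `long`. `DigestHex` and its methods are hypothetical names, and the actual fix in the commit may differ.

```java
// Illustrative only: DigestHex is a hypothetical class, not part of Analytics or Sidecar.
final class DigestHex
{
    // Widening a negative int to long sign-extends it, so the hex string
    // gains eight leading 'f' characters.
    static String unmasked(int hash)
    {
        return Long.toHexString(hash);
    }

    // Masking with 0xFFFFFFFFL keeps only the low 32 bits, i.e. the unsigned value.
    static String masked(int hash)
    {
        return Long.toHexString(hash & 0xFFFFFFFFL);
    }

    public static void main(String[] args)
    {
        int hash = 0x80000001; // sample 32-bit value with the sign bit set
        System.out.println(unmasked(hash)); // prints ffffffff80000001
        System.out.println(masked(hash));   // prints 80000001
    }
}
```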
164243e78f1557a34bc699ebc716b532781d6422 | Author: Arjun Ashok <arjun_ashok@apple.com>
| 2024-03-22 16:22:44-07:00
CASSANDRA-19418 - Changes to report additional bulk analytics job stats for instrumentation (#41)
Patch by Arjun Ashok; Reviewed by Doug Rohrer, Yifan Cai, Francisco Guerrero for CASSANDRA-19418
f848cd063e5e1671c84807615f5eae809253971d | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-03-21 15:26:06-07:00
CASSANDRASC-107: Improve logging for slice restore task (#108)
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-107
af0060f1325bf8edf74171f60326a3427e13e01d | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-03-07 16:25:18-08:00
CASSANDRASC-113 Fix flaky JmxClientTest (#105)
In this PR, we fix the race condition that occurs when determining the port number to use for the registry.
Currently, the port is determined in the `availablePort` method, where a socket is opened on port
0. The OS will assign a port number for the socket, but we immediately close the socket, and use the determined
port number to run the test. This PR brings a better approach by directly using port 0 while creating the
registry, thus avoiding the intermediate step and directly using the port that originally was assigned
to the registry without releasing it until the end of the test.
Additionally, in this PR we rename the integration test JmxClientTest, whose name collides with the
unit test. This allows for a better IDE integration and debugging experience.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-113
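The race described in CASSANDRASC-113 can be illustrated with plain sockets. The sketch below uses `java.net.ServerSocket` as a stand-in for the registry, which is an assumption made for brevity. `EphemeralPortExample` and its method names are hypothetical and do not come from the Sidecar tests.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical illustration; ServerSocket stands in for the JMX registry used in the test.
final class EphemeralPortExample
{
    // Racy pattern: the OS-assigned port is released when the probe socket
    // closes, so another process may grab it before the caller binds again.
    static int probeThenRelease() throws IOException
    {
        try (ServerSocket probe = new ServerSocket(0))
        {
            return probe.getLocalPort();
        }
    }

    // Safer pattern: bind to port 0 once and keep the socket open for as long
    // as the port is needed; the caller reads the port from the live socket.
    static ServerSocket bindAndHold() throws IOException
    {
        return new ServerSocket(0);
    }

    public static void main(String[] args) throws IOException
    {
        try (ServerSocket held = bindAndHold())
        {
            System.out.println("holding port " + held.getLocalPort());
        }
    }
}
```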
6ce33604bbd9acbee092ab3c4f7f11c0d434f730 | Author: Saranya Krishnakumar <saranya_k@apple.com>
| 2024-03-06 14:32:22-08:00
CASSANDRA-19424 Check for expired certificate during start up validation (#43)
patch by Saranya Krishnakumar; reviewed by Francisco Guerrero, Yifan Cai for CASSANDRA-19424
a13532272051d4e4608f92d53bdd997103e8ea19 | Author: Yifan Cai <52585731+yifan-c@users.noreply.github.com>
| 2024-03-05 11:06:36-08:00
CASSANDRA-19452 Use constant reference time during bulk read process (#44)
patch by Yifan Cai; reviewed by Francisco Guerrero, James Berragan for CASSANDRA-19452
46c35d0ef2efb66512133a7913df9936b0a80dc8 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-02-19 20:50:16-08:00
CASSANDRA-19411: Bulk reader fails to produce a row when regular column values are null
Bulk Reader won't emit a row when the regular column values are all `null`. For example,
a schema `PK` = `a`, `b` ; `CK` = `c`, `d` ; and columns = `e`, `f`.
| a | b | c | d | e | f |
| --- | --- | --- | --- | ---- | ---- |
| pk1 | pk2 | ck1 | ck2 | null | null |
When queried from Analytics bulk reader, it won't produce a row.
This issue also occurs when the projected regular column values are all `null`, where
other non-projected columns might have some values.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19411
a0af41f666c23a840d9df3f06729ed5fd2c06cd1 | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2024-02-15 13:19:28-08:00
CASSANDRA-18951: Add option for MutualTlsAuthenticator to restrict the certificate validity period
In this commit, we introduce two new optional settings for the `server_encryption_options`
and the `client_encryption_options`. The options are `max_certificate_validity_period` and
`certificate_validity_warn_threshold`. Both options can be configured as a duration
configuration parameter as defined by the `DurationSpec` (see CASSANDRA-15234). The resolution
for these new properties is minutes.
When specified, the certificate validation implementation will take that information
and reject certificates that are older than the maximum allowed certificate validity period,
translating into a rejection from the authenticating user.
The `certificate_validity_warn_threshold` option can be configured to emit warnings (log entries)
when the certificate exceeds the validity threshold.
patch by Francisco Guerrero; reviewed by Andy Tolbert, Abe Ratnofsky, Dinesh Joshi for CASSANDRA-18951
c3e8803b3331bc7ef81797ac52a8417524f67edc | Author: Yifan Cai <ycai@apache.org>
| 2024-02-13 09:52:57-08:00
CASSANDRA-19285 Fix flaky Host replacement tests and shrink tests
The flakiness is caused by inspecting a class whose classloader is already closed. The fix is to include those classes in the sharedClassLoader, so that the classLoader is not closed during the test.
patch by Yifan Cai; reviewed by Francisco Guerrero for CASSANDRA-19285
b5570109c19acaf91281fd7901041c0c2b1f3b6c | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-02-12 21:13:23-08:00
CASSANDRASC-104 Relocate Sidecar common classes in vertx-client-shaded
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-104
fc08d45b283e701aa6d558e99cd18318394b0de7 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-01-31 14:35:34-08:00
CASSANDRA-19351 No longer need to synchronize on Schema.instance after Cassandra 4.0.12
We no longer need to synchronize on the `Schema.instance` in Analytics after the release of Cassandra
4.0.12, that includes a synchronization fix in https://issues.apache.org/jira/browse/CASSANDRA-18317.
This commit cleans up TODOs pending on that code being released.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19351
dc0e79b9c483562ec0920d69e886715eb329c426 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-01-31 13:44:23-08:00
CASSANDRA-19369 Use XXHash32 for digest calculation of SSTables
This commit adds the ability to use the XXHash32 digest algorithm, which is newly supported in Cassandra Sidecar.
The commit allows for backwards compatibility to perform MD5 checksumming, but it now defaults to XXHash32.
A new Writer option is added:
```
.option(WriterOptions.DIGEST.name(), "XXHASH32") // or
.option(WriterOptions.DIGEST.name(), "MD5")
```
This option defaults to XXHash32, when not provided, but it can be configured to use the legacy MD5 algorithm.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19369
4120b8ce4f1bc7bd7ce101e4e298fc2211a21fe0 | Author: Andy Tolbert <6889771+tolbertam@users.noreply.github.com>
| 2024-01-31 11:06:59-06:00
Expose auth mode in system_views.clients, nodetool clientstats, metrics
Adds 'authenticationMode' and 'metadata' fields to AuthenticatedUser to add context
about how the user was authenticated and updates system_views.clients,
nodetool clientstats (behind --verbose flag) to include this information.
Also adds new metrics to ClientMetrics to help operators identify which
authentication modes are being used.
patch by Andy Tolbert; reviewed by Francisco Guerrero, Stefan Miklosovic for CASSANDRA-19366
c09d0d929baeaa02f3438313c7979ccf6b4b3c5a | Author: Andy Tolbert <andy_tolbert@apple.com>
| 2024-01-30 16:41:54-08:00
Allow CQL client certificate authentication to work without sending an AUTHENTICATE request
patch by Andy Tolbert; reviewed by Abe Ratnofsky, Dinesh Joshi, Francisco Guerrero, Jyothsna Konisa for CASSANDRA-18857
1d7b3f10722b52482d2123a4784dda9c92949137 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-01-29 10:13:50-08:00
CASSANDRASC-98: Improve logging for traffic shaping / rate limiting configurations
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-98
e329f3232fae867879aba2cd0a766404aaf9a427 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-01-25 10:30:52-08:00
CASSANDRASC-97: Add support for additional digest validation during SSTable upload
In this commit we add the ability to support additional digest algorithms for verification
during SSTable uploads. We introduce the `DigestVerifierFactory` which now supports
XXHash32 and MD5 `DigestVerifier`s.
This commit also adds support for XXHash32 digests. Clients can now send the XXHash32 digest
instead of MD5. This would allow both the clients and server the flexibility to utilize a more
performant algorithm.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-97
e0ae9d7484e242f6af495aac2cb4d8dc121fba89 | Author: Yifan Cai <ycai@apache.org>
| 2024-01-24 15:38:41-08:00
CASSANDRA-19334 Upgrade to Cassandra 4.0.12 and remove BufferMode and BatchSize options
In cassandra-all:4.0.12, improvements were made for the CQLSSTableWriter. The sorted writer can now produce size-capped SSTables. It replaces the need for the unsorted sstable writer, which has to buffer and sort data on flushing. The dataset to write in the spark application is already sorted. Avoiding the unsorted writer prevents wasting CPU time on sorting already-sorted data. Since the sorted sstable writer does not need to buffer data, its size estimation is more accurate than the unsorted one, meaning the produced sstable files are closer to the expected size.
With the unsorted sstable writer removed, the RowBufferMode option is no longer required.
With size-capping supported in the sorted writer, the BatchSize option is no longer required.
Patch by Yifan Cai; reviewed by Francisco Guerrero for CASSANDRA-19334
d949d8c2b9813c3e8429ece34c364a356bd7d6eb | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-01-22 09:00:52-08:00
CASSANDRA-19275 Fix flaky host replacement tests and shrink tests
This patch fixes flaky tests when a `BindException` occurs during cluster provisioning.
When a `BindException` is encountered, cluster provisioning is retried for up to
`MAX_CLUSTER_PROVISION_RETRIES`.
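A minimal sketch of such a retry loop, assuming the limit counts total attempts. The class name, the `provisionWithRetries` helper, and the value of the constant are hypothetical; only `BindException` and the constant's name come from this commit message.

```java
import java.net.BindException;
import java.util.concurrent.Callable;

// Hypothetical helper for illustration; not the project's actual test code.
final class ProvisionRetrySketch
{
    // Assumed value: the commit message names the constant but not its value.
    static final int MAX_CLUSTER_PROVISION_RETRIES = 5;

    // Runs the provisioner, retrying only when a port clash (BindException) occurs.
    static <T> T provisionWithRetries(Callable<T> provisioner)
    {
        BindException lastFailure = null;
        for (int attempt = 1; attempt <= MAX_CLUSTER_PROVISION_RETRIES; attempt++)
        {
            try
            {
                return provisioner.call();
            }
            catch (BindException e)
            {
                lastFailure = e; // the port was taken; provision again
            }
            catch (Exception e)
            {
                // Any other failure is not retried in this sketch.
                throw new IllegalStateException("Cluster provisioning failed", e);
            }
        }
        throw new IllegalStateException("Gave up after " + MAX_CLUSTER_PROVISION_RETRIES + " attempts",
                                        lastFailure);
    }
}
```

Only the bind failure is retried here, so unrelated provisioning errors still surface immediately.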
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19275
cf09cded72dd04e272f56fba0b7d9cceb0c4f894 | Author: Francisco Guerrero <frankgh@apache.org>
| 2024-01-19 17:38:55-08:00
CASSANDRASC-96 Fix typo in foundation package under common org.apache.cassandra.sidecar
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-96
fa6df8e2c09ad3d27bfe8c0ce016c839094630f6 | Author: Arjun Ashok <arjun_ashok@apple.com>
| 2024-01-16 12:24:02-08:00
CASSANDRA-19272 Add new writer option for blocklisted instances and corresponding integration tests
Patch by Arjun Ashok; Reviewed by Francisco Guerrero, Yifan Cai for CASSANDRA-19272
d5cea135c98bb98b16b215d309ead22e86f1329f | Author: Caleb Rackliffe <calebrackliffe@gmail.com>
| 2024-01-08 15:23:35-06:00
Revert unnecessary read lock acquisition when reading ring version in TokenMetadata introduced in CASSANDRA-16286
patch by Caleb Rackliffe; reviewed by Francisco Guerrero for CASSANDRA-19107
550bdfa1c6082537e2cfb93449128a61dbe3a1fb | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-12-19 12:50:43-08:00
CASSANDRA-19251 Speed up integration tests
This commit introduces an opinionated way to run integration tests where a test class
reuses the same in-jvm dtest cluster, and it offers a certain ordering that helps tests
run faster.
The test setup does the following:
- Find the Cassandra version to run
- Provision a cluster for the test
- Initialize schemas required for tests
- Start the Sidecar service
The above approach guarantees that Sidecar is ready once the setup method completes,
which means we no longer need to spend time waiting for schema propagation. This
optimization also helps in reducing test time.
The drawback of this approach is that if a test needs the cluster to be in some state,
for example a node in joining state while executing the bulk test, then that cluster
can only be used for tests in that state. This means that testing different states of
the cluster requires a new test class.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19251
888a546f84790c0a0b1b930e682cf597caaa0d61 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-12-13 16:45:39-08:00
CASSANDRASC-88: Allow DriverUtils to be pluggable
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-88
529171b1f6dee277a9087eb9da7242ce17873643 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-12-11 13:49:36-08:00
CASSANDRASC-87: Add JMX health checks during the periodic health checks
In this commit, we add health checks based on the JMX connectivity to the managed
Cassandra instances. Additionally, we construct the NodeSettings object based on
JMX. This allows the Sidecar process to be able to determine an adapter for the
node even if the node is in joining state, or its binary port has been disabled.
Co-authored-by: Doug Rohrer <doug@therohrers.org>
Co-authored-by: Francisco Guerrero <frankgh@apache.org>
Patch by Doug Rohrer, Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-87
d61e44f78fa4ba5ec395e1e39c507d666fddefd1 | Author: Yuriy Semchyshyn <yuriy@semchyshyn.com>
| 2023-11-29 17:49:29-06:00
CASSANDRA-19377 Startup Validation Failures when Checking Sidecar Connectivity
patch by Yuriy Semchyshyn; reviewed by Francisco Guerrero, Yifan Cai for CASSANDRA-19377
49723c720e0ae5762d77dfb7568c9b290a877560 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-11-27 09:13:21-08:00
CASSANDRASC-85: Expose TTL option for the create snapshot endpoint
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-85
ad936f6482aee2a05fa45ba4fdd06267958298f6 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-11-15 17:09:49-08:00
CASSANDRASC-82: Expose additional SSL configuration options for the Sidecar Service
Patch by Francisco Guerrero; Reviewed by Doug Rohrer, Yifan Cai for CASSANDRASC-82
b16b89f7f4a15441af757fa5a84b7e1e0420dcba | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-11-15 16:45:32-08:00
CASSANDRASC-84: Expose additional node settings
Sidecar exposes settings from the Cassandra node via the node settings API endpoint. The information exposed is
limited, and we need to start exposing additional information from the `system.local` table, for example
`datacenter` information, owned token ranges, and the local address and port for the native protocol. This
information can be consumed by Sidecar itself, as well as the Cassandra Analytics library.
In this commit, we expose additional settings for the node.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-84
c7c3bbca2c7cb415b39689e924fa2357c239f043 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-11-14 16:28:14-08:00
CASSANDRA-19031: Fix bulk writing when using identifiers that need quotes
Cassandra treats all identifiers as lower case unless explicitly quoted by the users,
(i.e. keyspace names, table names, column names, etc). We can define a case-sensitive
identifier or we can use a reserved word as an identifier by quoting it during DDL
creation.
In the analytics library, bulk writing fails when we encounter these identifiers. In
this commit, we fix the issue by properly propagating the information about whether
identifiers need to be quoted by exposing a new dataframe option (`quote_identifiers`).
When set to `true`, it will _maybe_ quote the keyspace/table/column names and it will
properly be able to write data when using mixed-case or reserved words in the
identifiers.
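A rough sketch of what "maybe quote" could mean, assuming an identifier is left alone only when it is already lower case and is not a reserved word. The `maybeQuote` helper and the tiny reserved-word set are hypothetical, not the library's actual implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the analytics library's code.
final class QuoteIdentifiersSketch
{
    // Identifiers Cassandra would read back unchanged without quotes.
    private static final Pattern SAFE_UNQUOTED = Pattern.compile("[a-z][a-z0-9_]*");

    // A small illustrative subset of CQL reserved words, not the full list.
    private static final Set<String> RESERVED_SUBSET =
        new HashSet<>(Arrays.asList("select", "from", "where", "table"));

    // Quotes the identifier only when leaving it bare would change its meaning.
    static String maybeQuote(String identifier)
    {
        boolean safe = SAFE_UNQUOTED.matcher(identifier).matches()
                       && !RESERVED_SUBSET.contains(identifier);
        if (safe)
        {
            return identifier;
        }
        // Embedded double quotes are escaped by doubling them.
        return '"' + identifier.replace("\"", "\"\"") + '"';
    }
}
```

With this rule, `users` stays as is, while `MyTable` and `table` come back wrapped in double quotes.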
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19031
31fa33fcb446e522947f899d948de4042be04c62 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-11-14 14:42:17-08:00
CASSANDRASC-76: Sidecar does not handle keyspaces and table names with mixed case
Cassandra Sidecar does not properly handle API requests for endpoints that have keyspaces
or table names that need to be quoted, for example when names have mixed case or when
the name is a reserved keyword in Cassandra.
In this commit, we perform special handling when the keyspace or table names are quoted
in the path params. We add tests to ensure that handling is correct.
Additionally, we fix the validation for keyspaces and table names without quotes and add
special validation for quoted names.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-76
457b36bcb3c8a865cca83ca6c402246798113ab4 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-11-13 16:16:36-08:00
CASSANDRA-19024 Fix bulk reading when using identifiers that need quotes
Cassandra treats all identifiers as lower case unless explicitly quoted by the users,
(i.e. keyspace names, table names, column names, etc). We can define a case-sensitive
identifier or we can use a reserved word as an identifier by quoting it during DDL
creation.
In the analytics library, bulk reads fail when we encounter these identifiers. In this
commit, we fix the issue by properly propagating information about whether identifiers
need to be quoted by exposing a new data frame option (`quote_identifiers`). When set to
`true`, it will maybe quote the keyspace/table and it will properly be able to read data
when these situations are encountered.
Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19024
c1a6225dac62db94fd3faee92513e84ba3b9b3b7 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-11-07 10:13:06-08:00
CASSANDRASC-83: Require gossip to be enabled for ring and token ranges mapping endpoints
The ring and token ranges mapping endpoints leverage JMX to produce the payload. Ring and token ranges
mapping information is derived from gossip information available to the Cassandra instance. If gossip
has been disabled, the information might provide an inconsistent view of the cluster which is
undesirable. We should instead return a 503 (Service Unavailable) to the client whenever the client
tries to access these endpoints and gossip has been disabled.
In this commit, we ensure the endpoints return a 503 (Service Unavailable) when gossip is disabled.
Patch by Francisco Guerrero; Reviewed by Dinesh Joshi, Josh McKenzie, Yifan Cai for CASSANDRASC-83
0e4c2f4befa22caa68b34f95d0169b4685bc7e0d | Author: Bereng <berenguerblasi@gmail.com>
| 2023-11-07 07:24:57+01:00
Default to nb instead of nc for sstable formats
patch by Berenguer Blasi; reviewed by Francisco Guerrero, Jacek Lewandowski, Michael Semb Wever for CASSANDRA-19010
87a729feb4660f57bacb2a4be73e1bb2d509578b | Author: Saranya Krishnakumar <saranya_k@apple.com>
| 2023-11-06 13:32:01-08:00
CASSANDRA-19903: Get Sidecar port through CassandraContext
Patch by Saranya Krishnakumar; Reviewed by Dinesh Joshi, Francisco Guerrero, Josh McKenzie for CASSANDRA-19903
d8e9e2359db1f5561eb872d28ab16098b5a62c1d | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-10-27 10:13:24-07:00
CASSANDRASC-81: Improve TokenRangeReplicasResponse payload
The `TokenRangeReplicasResponse` returns a list with `ReplicaMetadata`. This information is used by clients to look up
replica metadata. The lookup is done with the replica information, which consists of the `ip:port`.
Clients are looping over the `ReplicaMetadata` list and matching IP and port to retrieve the metadata object. Instead,
we can improve the payload by changing the data structure from a list to a map, and have clients look up by the replica
(ip + port), without having to loop.
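The change in shape can be sketched as follows. The `ReplicaMetadata` class below is a hypothetical stand-in with assumed fields, and the `ip:port` key format is taken from the description above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types for illustration; field names are assumptions.
final class ReplicaLookupSketch
{
    static final class ReplicaMetadata
    {
        final String address;
        final int port;
        final String datacenter;

        ReplicaMetadata(String address, int port, String datacenter)
        {
            this.address = address;
            this.port = port;
            this.datacenter = datacenter;
        }
    }

    // Indexes the metadata by "ip:port" so callers can look up a replica directly.
    static Map<String, ReplicaMetadata> indexByReplica(List<ReplicaMetadata> metadata)
    {
        Map<String, ReplicaMetadata> byReplica = new LinkedHashMap<>();
        for (ReplicaMetadata entry : metadata)
        {
            byReplica.put(entry.address + ":" + entry.port, entry);
        }
        return byReplica;
    }
}
```

A client holding the map can call `get("10.0.0.2:9042")` instead of scanning the list for a matching IP and port.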
Patch by Francisco Guerrero; Reviewed by Arjun Ashok, Dinesh Joshi, Yifan Cai for CASSANDRASC-81
36d48b209c84470930d5ce3d4f4ab1c406dae60e | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-10-27 09:38:11-07:00
CASSANDRASC-80: HealthCheckPeriodicTask execute never completes the promise when instances are empty
When the HealthCheckPeriodicTask executes and the instances are null or empty, the promise never completes.
This prevents subsequent scheduled health checks from taking place, because the PeriodicTaskExecutor
schedules a new task only if no other tasks are active. As a result, the HealthCheckPeriodicTask never
performs health checks once this condition is encountered.
In this commit, we fix the issue by completing the promise when this condition is encountered.
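The shape of the fix can be illustrated with a plain `CompletableFuture` standing in for the promise. The `execute` signature below is hypothetical and is not the Sidecar task's real API.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical sketch; CompletableFuture stands in for the task's promise.
final class HealthCheckTaskSketch
{
    static <T> void execute(List<T> instances, Consumer<T> healthCheck, CompletableFuture<Void> promise)
    {
        if (instances == null || instances.isEmpty())
        {
            // Nothing to check: complete anyway so the executor can schedule the next run.
            promise.complete(null);
            return;
        }
        try
        {
            for (T instance : instances)
            {
                healthCheck.accept(instance);
            }
            promise.complete(null);
        }
        catch (RuntimeException e)
        {
            promise.completeExceptionally(e);
        }
    }
}
```

Every path through the method completes the promise, which is the property the scheduler relies on.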
Patch by Francisco Guerrero; Reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-80
8c89e2adb7680ecb4dd3cb2a562206fb8cb50d4a | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2023-10-17 10:18:06-07:00
Correct comment for nc SSTable format
Patch by frankgh; reviewed by brandonwilliams and bereng for CASSANDRA-18933
0aaf5659028dd874c8d666c636f11eae63c429e6 | Author: Arjun Ashok <arjun_ashok@apple.com>
| 2023-10-09 07:53:40-07:00
CASSANDRA-18852 - Changes to make bulk writer resilient to cluster resize operations
Patch by Arjun Ashok, Saranya Krishnakumar; Reviewed by Yifan Cai, Francisco Guerrero, Doug Rohrer for CASSANDRA-18852
Co-authored-by: Arjun Ashok <arjun_ashok@apple.com>
Co-authored-by: Saranya Krishnakumar <saranya_k@apple.com>
8045f8eedad4510ecbf66bb739d464eec741aced | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-10-04 12:01:58-07:00
CASSANDRASC-77: Upgrade vertx to version 4.4.6 to bring hot reloading and traffic shaping options
Vertx 4.4.6 brings two features that we integrate into Sidecar.
1. Hot reloading of SSL certificates. This allows a running cluster to reload
certificates without having to restart the service.
2. Traffic shaping options. This makes it possible to protect the
service by configuring ingress/egress limits.
Additionally, this patch introduces the SidecarServerEvents messaging. It
leverages vertx's EventBus to publish and consume messages when server
starts, server stops, on CQL connection ready, or CQL disconnection,
and when all CQL connections are ready.
Patch by Francisco Guerrero; Reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-77
b9586501a6b6cdfe465302448018785652c9b966 | Author: Jon Meredith <jonmeredith@apache.org>
| 2023-09-21 16:07:29-06:00
Internode legacy SSL storage port certificate is not hot reloaded on update
patch by Jon Meredith; reviewed by Dinesh Joshi, Francisco Guerrero for CASSANDRA-18681
c7b170cf2e7764159a3d9cf4ca0abc6db1659e51 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-09-21 14:16:56-07:00
CASSANDRASC-74: Stream sstable components API fails on secondary index files
In this commit, we fix streaming secondary index SSTable component files. We add
tests to validate that the index files can be streamed. We also add compatibility
for older clients that don't have the fix.
Patch by Francisco Guerrero; Reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-74
4bfca2badb3284657a65d8910a4f77eaf7689b31 | Author: Bereng <berenguerblasi@gmail.com>
| 2023-09-15 09:24:07+02:00
IDEA to mark unused imports as error
patch by Berenguer Blasi; reviewed by Caleb Rackliffe, Francisco Guerrero, Jacek Lewandowski, Maxim Muzafarov, Stefan Miklosovic for CASSANDRA-18853
f8605b3c3dfdfdc5ef1b8327f4fd657efca8f9b7 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-09-02 12:35:54-07:00
CASSANDRASC-72: Split unit tests and integration tests in CircleCI config
As the number of integration tests continues to grow, it is desirable to split
unit tests and integration tests. It is also desirable to parallelize integration
tests, with each integration test as its own job. The goals of this improvement
are:
- Fail fast on checkstyle or minor errors
- Speed up test runtime by running integration tests in parallel
- Isolate failing tests to specific combinations of Cassandra
In this commit, we run unit tests individually for both java 8 and java 11. We split
integration tests into their own jobs, one per java version and cassandra version
combination.
Patch by Francisco Guerrero; Reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-72
6b650d956ed9ca57b99065cdc3c2c81d2ef0c2d1 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-08-25 14:46:03-07:00
CASSANDRASC-71: Allow configuring permissions for uploaded SSTables
This commit introduces a new configuration for SSTable uploads (`file_permissions`) which allows
an operator to configure the desired file permissions used for files that are uploaded via SSTable
upload.
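As a sketch of how such a setting could be applied, assuming the configured value uses the nine-character POSIX form (for example `rw-r--r--`). That format and the helper names are assumptions, not something this commit message states.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper; the accepted value format is an assumption.
final class UploadPermissionsSketch
{
    // Parses a value such as "rw-r--r--"; throws IllegalArgumentException if malformed.
    static Set<PosixFilePermission> parse(String configured)
    {
        return PosixFilePermissions.fromString(configured);
    }

    // Applies the configured permissions to an uploaded file (POSIX file systems only).
    static void apply(Path uploadedFile, String configured)
    {
        try
        {
            Files.setPosixFilePermissions(uploadedFile, parse(configured));
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }
}
```

Parsing the value once at startup would surface a malformed setting before any upload is attempted.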
Patch by Francisco Guerrero; Reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-71
6ffa43f68b8d10ca84d4a00bf81269527b4e14df | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2023-08-25 11:10:48-06:00
Support Dynamic Port Allocation for in-jvm dtest framework
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Jon Meredith, Yifan Cai for CASSANDRA-18722
63292010803875af6496ce7c787f404e66311375 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-08-18 10:38:19-07:00
CASSANDRASC-69: Refactor Sidecar configuration
This commit refactors the Sidecar configuration. It separates the configuration
objects from the cluster objects. For example, by default `InstancesConfig` will
be provisioned from the configured `InstanceConfiguration`.
POJOs now represent the configuration from the yaml files. Adding new configurations
entails adding new POJOs and their relationships as well as updating the default
yaml configuration file.
Patch by Francisco Guerrero; Reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-69
f24951ab6ea2b1e9af4013b030675c70d31adb90 | Author: Yuriy Semchyshyn <yuriy@semchyshyn.com>
| 2023-08-14 14:09:12-05:00
CASSANDRA-18810: Cassandra Analytics Start-Up Validation
Patch by Yuriy Semchyshyn; Reviewed by Dinesh Joshi, Francisco Guerrero, Yifan Cai for CASSANDRA-18810
9c796dfb272daa3ce57a2dc5cbeadd9273e1ac72 | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2023-07-28 09:26:20-07:00
Skip ColumnFamilyStore#topPartitions initialization when client or tool mode
This commit skips the initialization of `topPartitions` in `org.apache.cassandra.db.ColumnFamilyStore`
when running in client or tool mode. The `TopPartitionTracker` class will attempt to query the system
keyspace, which when running in client or tool mode will not be part of the KeyspaceMetadata. This
causes a warning to be printed out with a stacktrace that can be misleading. The warning is similar to
this:
```
WARN org.apache.cassandra.db.SystemKeyspace: Could not load stored top SIZES partitions for ...
org.apache.cassandra.db.KeyspaceNotDefinedException: keyspace system does not exist
at org.apache.cassandra.schema.Schema.validateTable(Schema.java:xxx) ~[?:?]
at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:xxx) ~[?:?]
at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:xxx) ~[?:?]
at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:xxx) ~[?:?]
at org.apache.cassandra.cql3.QueryProcessor.parseAndPrepare(QueryProcessor.java:xxx) ~[?:?]
...
```
In this commit, we check whether we run in client or tool mode, and skip initialization
of `topPartitions` in those cases.
Patch by Francisco Guerrero; Reviewed by Dinesh Joshi, Yifan Cai for CASSANDRA-18697
82b3c0a79c9322142738a4ec2ff7d4d4c0be2370 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-07-25 12:41:10-07:00
CASSANDRA-18692 Fix bulk writes with Buffered RowBufferMode
When setting Buffered RowBufferMode as part of the `WriterOption`s,
`org.apache.cassandra.spark.bulkwriter.RecordWriter` ignores that configuration and instead
uses the batch size to determine when to finalize an SSTable and start writing a new SSTable,
if more rows are available.
In this commit, we fix `org.apache.cassandra.spark.bulkwriter.RecordWriter#checkBatchSize`
to take into account the configured `RowBufferMode`. Specifically, for the
`UNBUFFERED` RowBufferMode we check the batchSize of the SSTable during writes, and for
`BUFFERED` that check has no effect.
Co-authored-by: Doug Rohrer <doug@therohrers.org>
Patch by Francisco Guerrero, Doug Rohrer; Reviewed by Dinesh Joshi, Yifan Cai for CASSANDRA-18692
ccda6798caae7eb7ef11ac9ee248f2e80720a6e2 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-07-21 09:42:17-07:00
CASSANDRASC-67 Fix relocation of native libraries for vertx-client-shaded
OpenSSL is currently unavailable in the vertx-client-shaded library. This is due to the
incorrect relocation of the netty native libraries under META-INF/native/libnetty.
In this commit we fix the relocation of the native libraries and we ensure that the
relocation is properly configured by adding tests that guarantee OpenSSL loads correctly
with the correct relocation.
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-67
4be61d482c9a717fa4ebe2fc4b7d09a230dac68a | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-07-18 08:29:10-07:00
CASSANDRASC-66 Fix builds in Apache CI
Currently, tests running under `org.apache.cassandra.sidecar.HealthServiceSslTest`
[are failing](https://ci-cassandra.apache.org/job/cassandra~sidecar/42/) when running
inside ASF's CI. Logs are showing that some resources (keystore and truststore) are
not found. This is causing the tests to fail.
In this commit, we read the resource from the stream, which is guaranteed to exist
as long as the resource exists and the resource name is correct, then we write the
resource to a temporary directory, and use the file name to set the keystore and
truststore path as part of the configuration options.
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-66
ca8dd2e0381a19ede99ce4b70959ce7a584ea0d2 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2023-07-11 04:30:59-07:00
CASSANDRASC-64: File descriptor is not being closed on MD5 checksum
This commit fixes an issue where the file descriptors are not being closed during
the MD5 checksum during SSTable upload. This can potentially cause the JVM to run
out of available file descriptors during execution, and it would force an operator
to restart the Sidecar process to return to a working state.
In this commit, we ensure the file is closed after the checksum is calculated.
Additionally, this commit fixes a rare ConcurrentModificationException encountered
in `SSTableImporter` where the `importQueuePerHost` does not use a thread-safe map.
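The file-descriptor part of the fix follows a common pattern, sketched below with try-with-resources. The `md5Hex` helper and the hex encoding are illustrative assumptions, not the Sidecar implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not the Sidecar's checksum code.
final class ChecksumSketch
{
    static String md5Hex(Path file)
    {
        // try-with-resources closes the stream (and its descriptor) even if reading fails.
        try (InputStream in = Files.newInputStream(file))
        {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1)
            {
                digest.update(buffer, 0, read);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest())
            {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```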
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-64
cf796be039b9b08364de11fd7fec8a00cf699616 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2023-06-27 21:20:13-07:00
CASSANDRASC-63: Support credential rotation in JmxClient
JMX credentials in a Cassandra instance can be rotated on a cadence, on every bounce, or by some other
means. In those cases, the `JmxClient` will no longer be able to connect to the instance, completely
losing the ability to talk to that instance.
In this commit, we allow the `JmxClient` to support credential changes so that it can continue to talk to the
Cassandra instance uninterrupted, without any potential downtime to the Sidecar service.
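One way to support rotation, sketched under the assumption that credentials are read through suppliers at connect time rather than captured once. The class and method names are hypothetical; `JMXConnector.CREDENTIALS` is the standard JMX environment key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;
import javax.management.remote.JMXConnector;

// Hypothetical sketch; not the Sidecar JmxClient.
final class RotatingCredentialsSketch
{
    private final Supplier<String> roleSupplier;
    private final Supplier<String> passwordSupplier;

    RotatingCredentialsSketch(Supplier<String> roleSupplier, Supplier<String> passwordSupplier)
    {
        this.roleSupplier = roleSupplier;
        this.passwordSupplier = passwordSupplier;
    }

    // Built on every (re)connect, so a rotated password is picked up without a restart.
    Map<String, Object> connectionEnvironment()
    {
        Map<String, Object> environment = new HashMap<>();
        environment.put(JMXConnector.CREDENTIALS,
                        new String[]{ roleSupplier.get(), passwordSupplier.get() });
        return environment;
    }
}
```

The returned map is the kind of environment that `JMXConnectorFactory.connect` accepts as its second argument.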
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-63
02d9136cfa72c8990120eca0f4fe5f52587bceb5 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-06-27 10:28:04-07:00
CASSANDRA-18631: Add Release Audit Tool (RAT) plugin to Analytics
This commit adds the Release Audit Tool (RAT) plugin to `build.gradle` which adds a new task
`rat`. This new task makes sure that the license headers are valid and present in the source
files during the `check` task.
To run the RAT plugin, you can run:
```
./gradlew rat
```
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Michael Semb Wever for CASSANDRA-18631
69766bca399cc779e0f2f8e859e39f7e29a17b7a | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-06-27 10:03:56-07:00
CASSANDRA-18662: Fix cassandra-analytics-core-example
This commit fixes the `SampleCassandraJob` available under the `cassandra-analytics-core-example`
subproject.
Fix checkstyle issues
Fix serialization issue in SidecarDataTransferApi
The `sidecarClient` field in `SidecarDataTransferApi` is declared as transient,
which causes NPEs on executors when they try to perform an SSTable
upload.
This commit completely avoids serializing the `dataTransferApi` field in the
`CassandraBulkWriterContext`, and lazily initializes it during the `transfer()`
method invocation. We guard the initialization to a single thread by making the
`transfer()` method synchronized. The `SidecarDataTransferApi` can be recreated
when needed using the already serialized `clusterInfo`, `jobInfo`, and `conf`
fields.
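The pattern described above can be sketched with stand-in types. `TransferApi`, the single `clusterInfo` string, and the `roundTrip` helper are hypothetical and exist only to show a transient field being rebuilt after deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the bulk writer context; not the real class.
final class LazyTransferContextSketch implements Serializable
{
    private static final long serialVersionUID = 1L;

    // Non-serializable stand-in for the data transfer API.
    static final class TransferApi
    {
        private final String clusterInfo;

        TransferApi(String clusterInfo)
        {
            this.clusterInfo = clusterInfo;
        }

        String upload(String sstable)
        {
            return sstable + " -> " + clusterInfo;
        }
    }

    private final String clusterInfo;              // serialized with the context
    private transient TransferApi dataTransferApi; // never serialized, rebuilt on demand

    LazyTransferContextSketch(String clusterInfo)
    {
        this.clusterInfo = clusterInfo;
    }

    // synchronized so only one thread performs the lazy initialization
    synchronized String transfer(String sstable)
    {
        if (dataTransferApi == null)
        {
            dataTransferApi = new TransferApi(clusterInfo);
        }
        return dataTransferApi.upload(sstable);
    }

    // Simulates shipping the context to an executor via Java serialization.
    static LazyTransferContextSketch roundTrip(LazyTransferContextSketch context)
    {
        try
        {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes))
            {
                out.writeObject(context);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray())))
            {
                return (LazyTransferContextSketch) in.readObject();
            }
        }
        catch (IOException | ClassNotFoundException e)
        {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip the transient field is null, and the first `transfer()` call on the copy rebuilds it from the serialized state instead of failing with an NPE.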
Fix setting ROW_BUFFER_MODE to BUFFERED
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Yifan Cai for CASSANDRA-18662
20caf6efdd50c276e22f3a681b87e883891877b8 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2023-06-27 05:56:07-07:00
CASSANDRASC-61: Add Release Audit Tool (RAT) plugin to Sidecar
This commit adds the Release Audit Tool (RAT) plugin to `build.gradle` which
adds a new build task `rat`. This new task will make sure that the license
headers are present in source files during the `check` task.
To run the RAT plugin, you can run:
```
./gradlew rat
```
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Michael Semb Wever for CASSANDRASC-61
a4ca352a1720744a2424c64c498f31b2460924e8 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2023-06-23 16:23:26-07:00
CASSANDRASC-59: Expose JMX host and port from JMXClient
In this commit, we expose the JMX host and port from JMXClient to make it available
to implementations of the `ICassandraAdapter`.
This information can be valuable as different implementations and integrations need
to have this information available during the application execution.
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-59
ab1884d2eedace79857489cb9dbe455ada9e4ad1 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2023-06-22 10:52:10-07:00
CASSANDRASC-58: Support retries in Sidecar Client on Invalid Checksum
On rare occasions an SSTable upload will receive a corrupted SSTable. Bit flips
are expected to occur occasionally while transmitting SSTables from client to
server.
This commit adds support for retries in Sidecar Client when a checksum mismatch
is encountered during SSTable upload, allowing clients to retry the upload.
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-58
40c3fac3891013457035fcfc4944b27535d5d701 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2023-06-21 11:08:37-07:00
CASSANDRASC-57: Remove RESTEasy
This commit removes RESTEasy and associated dependencies from the project. Vertx's handlers
are now preferred and there are no more endpoints on RESTEasy in this project.
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-57
9912a620a0e67d9aa723037aaf5237598a895eb7 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2023-06-19 16:41:10-07:00
CASSANDRASC-56: Create staging directory if it doesn't exist
During SSTable upload, the upload will fail if the configured staging directory does not
exist. When this occurs an operator must manually create the directory, which increases
the configuration toil.
In this commit, we automatically create the staging directory if it doesn't exist during
SSTable upload. This improves the overall operational experience when running the Sidecar.
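A minimal sketch of the idea using the JDK's `Files.createDirectories`, which creates missing parent directories and does nothing when the directory already exists. The helper name is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not the Sidecar upload handler.
final class StagingDirectorySketch
{
    // Safe to call before every upload: a no-op when the directory already exists.
    static Path ensureStagingDirectory(Path stagingDir)
    {
        try
        {
            return Files.createDirectories(stagingDir);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }
}
```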
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-56
9523a38b3f1b5bc4313e2949896ddc1fff58afbe | Author: jkonisa <jkonisa@apple.com>
| 2023-06-15 13:31:01-07:00
CASSANDRA-18605 Adding support for TTL & Timestamps for bulk writes
This commit introduces a new feature in Spark Bulk Writer to support writes with
constant/per_row based TTL & Timestamps.
Patch by Jyothsna Konisa; Reviewed by Dinesh Joshi, Francisco Guerrero, Yifan Cai for CASSANDRA-18605
cbae09ca71b9eb9a581b77c23844da21474b095a | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-06-14 11:52:55-07:00
CASSANDRA-18600 Add NOTICE.txt file
The NOTICE.txt file is currently missing in the repository. This commit adds the file to
comply with ASF's guidance.
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Michael Semb Wever, Berenguer Blasi for CASSANDRA-18600
2cc2eb5844e1f7ec54b2934589c2f4a6a3d226bc | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2023-06-14 11:50:45-07:00
CASSANDRASC-54 Add NOTICE.txt file
The NOTICE.txt file is currently missing in the repository. This commit adds the file to
comply with ASF's guidance.
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Michael Semb Wever for CASSANDRASC-54
deebdf97ad01f23550d7d3b42d98c7bf111e2f95 | Author: Doug Rohrer <drohrer@apple.com>
| 2023-06-14 13:33:29-04:00
CASSANDRA-18759: Use in-jvm dtest framework from Sidecar for testing
This commit introduces the use of the in-jvm dtest framework for testing
Analytics workloads. It can spin up a Cassandra cluster, including the
necessary Sidecar process, to test writing to and reading from Cassandra
using the analytics library.
Additional changes made in this commit include
* Use concurrent collections in MockBulkWriterContext (Fixes flaky test StreamSessionConsistencyTest)
The StreamSessionConsistency test uses the MockBulkWriter context, but it wasn't originally used
(before this test was added) in a multi-threaded environment. Because of this, it would occasionally
throw ConcurrentModificationExceptions, which would cause the stream test to fail in a
non-deterministic way. This commit adds the use of concurrent/synchronous collections to the
MockBulkWriterContext to make sure it doesn't throw these spurious errors.
* Make the StartupValidation system thread-safe by using ThreadLocals
instead of static collections, and clearing them once validation is
complete.
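The last point can be illustrated with a small sketch, assuming validations are tracked as a per-thread list of names. The class and its methods are hypothetical, not the analytics library's StartupValidation code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of per-thread validation state; not the library's code.
final class StartupValidationSketch
{
    // Each thread sees its own list, unlike a shared static collection.
    private static final ThreadLocal<List<String>> VALIDATIONS = ThreadLocal.withInitial(ArrayList::new);

    static void register(String validationName)
    {
        VALIDATIONS.get().add(validationName);
    }

    // Returns what this thread registered and clears the state once validation is complete.
    static List<String> runAndClear()
    {
        try
        {
            return new ArrayList<>(VALIDATIONS.get());
        }
        finally
        {
            VALIDATIONS.remove();
        }
    }
}
```

Clearing in a `finally` block keeps state from leaking into a later job that happens to reuse the same thread.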
Patch by Doug Rohrer; Reviewed by Dinesh Joshi, Francisco Guerrero, Yifan Cai for CASSANDRA-18759
3cbb3d19c6f043b3a20e4933e5ff7a0e3d58f0a9 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2023-06-12 09:34:45-07:00
CASSANDRASC-53 Ignore unknown properties during Sidecar client deserialization
This commit modifies the way the `DecodableRequest` handles unknown properties in the
JSON payload. To support the evolution of the server API, we allow the Sidecar Client
to be more flexible when it encounters unknown properties, and we make it ignore these
new properties.
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Yifan Cai for CASSANDRASC-53
f0fae2deeee20df15ac1105af2163af2a7e7953d | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-06-08 12:40:22-07:00
CASSANDRA-18578 Add circleci configuration yaml for Cassandra Analytics
This commit adds the CircleCI configuration yaml to test against all the existing
profiles
- cassandra-analytics-core-spark2-2.11-jdk8
- cassandra-analytics-core-spark2-2.12-jdk8
- cassandra-analytics-core-spark3-2.12-jdk11
- cassandra-analytics-core-spark3-2.13-jdk11
Patch by Francisco Guerrero; Reviewed by Dinesh Joshi, Yifan Cai for CASSANDRA-18578
ee1c83722bfb1155bef762cdfb2c86034857f2d0 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-06-07 12:40:50-07:00
CASSANDRA-18574: Fix sample job documentation after Sidecar changes
This commit fixes the README file with documentation to set up and run the Sample job provided in the repository.
During Sidecar review, there was a suggestion to change the yaml property `uploads_staging_dir` to `staging_dir`.
That change however was not reflected as part of the sample job README.md.
patch by Francisco Guerrero; reviewed by Dinesh Joshi, Yifan Cai for CASSANDRA-18574
7764214d1fb44fb6139a622f403bb05610e8f7b1 | Author: Francisco Guerrero <frankgh@apache.org>
| 2023-05-24 14:21:59-07:00
CASSANDRA-18548: Add the .asf.yaml file
This commit adds the .asf.yaml file to control notifications and github settings
for the Cassandra Analytics project.
Patch by Francisco Guerrero; Reviewed by Brandon Williams, Yifan Cai for CASSANDRA-18548
b87b0edd310d1ef93c507bbbb1ae51e1b0b319c6 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2023-05-23 13:56:48-07:00
CASSANDRA-18545: Provide a SecretsProvider interface to abstract the secret provisioning
This commit introduces the SecretsProvider interface that abstracts the secrets provisioning.
This way different implementations of the SecretsProvider can be used to provide SSL secrets
for the Analytics job. We provide an implementation, SslConfigSecretsProvider, which provides
secrets based on the configuration for the job.
Patch by Francisco Guerrero; Reviewed by Dinesh Joshi, Yifan Cai for CASSANDRA-18545
38cdacb2e7418e2aefbcffb1754dcd324c46028d | Author: Dinesh Joshi <djoshi@apache.org>
| 2023-05-19 15:34:05-07:00
CEP-28: Implement Bulk API endpoints and introduce the Sidecar Client to Support Cassandra Analytics
This commit implements the remaining endpoints needed to perform Bulk Analytics operations that allow
reading and writing data from Cassandra in bulk. The new endpoints include:
- Endpoint to create snapshots
- Endpoint to clear a snapshot
- Endpoint to upload SSTable components
- Endpoint to clean up uploads for SSTable components
- Endpoint to import SSTable components
- Endpoint to retrieve gossip info
- Endpoint to retrieve the time skew for the server
- Endpoint to retrieve the ring information
Sidecar Client
Introduces the fully featured sidecar client to access Cassandra Sidecar endpoints.
It offers support for retries and Sidecar instance selection policies. The client
project itself is technology-agnostic, but we provide a vertx implementation for
the `HttpClient`. The Sidecar vertx-client can be published as a shaded-jar to be
consumed by clients where the dependencies can cause issues, especially in environments
where the dependencies are not always controlled by the consumers (for example Spark).
Patch by Doug, Francisco, Saranya, Yifan, Dinesh; reviewed by Dinesh Joshi and Yifan Cai for CASSANDRA-16222
Co-authored-by: Saranya Krishnakumar <saranya_k@apple.com>
Co-authored-by: Yifan Cai <ycai@apache.org>
Co-authored-by: Francisco Guerrero <francisco.guerrero@apple.com>
Co-authored-by: Doug Rohrer <drohrer@apple.com>
Co-authored-by: Dinesh Joshi <djoshi@apache.org>
1633cd9c6c3d88d5c66825fab76a369266509f7e | Author: Dinesh Joshi <djoshi@apache.org>
| 2023-05-19 14:57:47-07:00
CEP-28: Apache Cassandra Analytics
This is the initial commit for the Apache Cassandra Analytics project
where we support bulk reads from and bulk writes to Apache Cassandra from
Spark.
Patch by James Berragan, Doug Rohrer; Reviewed by Dinesh Joshi, Yifan Cai for CASSANDRA-16222
Co-authored-by: James Berragan <jberragan@apple.com>
Co-authored-by: Doug Rohrer <drohrer@apple.com>
Co-authored-by: Saranya Krishnakumar <saranya_k@apple.com>
Co-authored-by: Francisco Guerrero <francisco.guerrero@apple.com>
Co-authored-by: Yifan Cai <ycai@apache.org>
Co-authored-by: Jyothsna Konisa <jkonisa@apple.com>
Co-authored-by: Yuriy Semchyshyn <ysemchyshyn@apple.com>
Co-authored-by: Dinesh Joshi <djoshi@apache.org>
5d2cbaf5cb810f53689bf227e2c1f78a9a2b2e9f | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2023-05-05 10:48:02-07:00
CASSANDRASC-50: Deprecate the sidecar cassandra health endpoint containing instance segment
This commit deprecates the Cassandra Health endpoint containing the instance segment
in the path. This endpoint is currently unused and it is replaced by the health
endpoint with the `instanceId` query string parameter. Since the `instanceId` is optional
we move the path param (mandatory) to the query param (optional).
This commit also moves the CassandraHealthService from JAX-RS to a vertx Handler.
It also moves the HealthService to an inlined handler, simplifying the service.
patch by Francisco Guerrero; reviewed by Yifan Cai, Dinesh Joshi for CASSANDRASC-50
26c374da4f03e4a6b64e414805cd92f3eb0a36c6 | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2023-03-09 12:11:20-08:00
Synchronize CQLSSTableWriter#build on the Schema.instance object
In this commit the `org.apache.cassandra.io.sstable.CQLSSTableWriter#build` method synchronizes on the
`Schema.instance` object (instead of the `CQLSSTableWriter.class`) to prevent concurrent schema operations
from failing when the offline tools also update the schema.
For example, a table creation operation, which modifies the keyspace tables metadata, might end up
missing the update when a concurrent call to the `CQLSSTableWriter#build` method is accessing the
singleton Schema instance.
Patch by Francisco Guerrero, reviewed by Yifan Cai, Maxwell Guo, Alex Petrov for CASSANDRA-18317.
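The locking change described in this commit can be illustrated with a minimal Java sketch. `SchemaRegistry` and `WriterBuilder` are hypothetical stand-ins, not Cassandra's `Schema` or `CQLSSTableWriter`; the sketch only shows why the builder has to lock the same monitor as the other schema operations rather than its own class object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a process-wide schema singleton (not Cassandra's Schema class).
final class SchemaRegistry {
    static final SchemaRegistry instance = new SchemaRegistry();

    private final Map<String, String> tables = new HashMap<>();

    private SchemaRegistry() {
    }

    // Regular schema operations guard their updates with the singleton's monitor.
    void createTable(String name, String definition) {
        synchronized (this) {
            tables.put(name, definition);
        }
    }

    int tableCount() {
        synchronized (this) {
            return tables.size();
        }
    }
}

// Hypothetical stand-in for an offline writer builder that also touches the schema.
final class WriterBuilder {
    // Locking SchemaRegistry.instance (instead of WriterBuilder.class) makes build()
    // mutually exclusive with createTable(), so neither side can miss the other's update.
    static int build(String table, String definition) {
        synchronized (SchemaRegistry.instance) {
            // Java monitors are reentrant, so the nested lock in createTable() is fine.
            SchemaRegistry.instance.createTable(table, definition);
            return SchemaRegistry.instance.tableCount();
        }
    }
}
```

Had `build` locked `WriterBuilder.class`, it would hold a different monitor than `createTable`, and the two could interleave.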
02cc6548f291528e9749a51d103463f9552f4b4e | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2022-11-02 12:54:54-07:00
CASSANDRASC-47: Introduce JMX foundation in Sidecar
In this commit, we introduce the JMX foundation in Sidecar to enable the ability to communicate
with the Cassandra process. This commit adds new configuration parameters to configure the
`org.apache.cassandra.sidecar.common.JmxClient`. This client is available as part of the Cassandra
delegate.
A new interface is introduced and exposed through the `org.apache.cassandra.sidecar.common.ICassandraAdapter`.
The new interface `StorageOperations` is intended to interface with the Cassandra StorageService.
This commit provides an example implementation of the `takeSnapshot` method, which is also found
in the Cassandra code base. This should allow us to interact with the Cassandra process to create
snapshots.
A fix is required in the `CassandraSidecarDaemon` class, where the `healthCheck` runs the first time
only after the configured health check frequency (millis) has passed. This causes issues in the unit
tests as well as the actual execution of the service, as the `adapter` will be `null` until the first
health check is performed. To fix this issue, we perform a health check right after the server
successfully starts up.
Additional integration tests are added for testing the JMX integration with Cassandra. In the test,
we spin up a new Cassandra container using `testcontainers` and we perform validation against the
StorageService in Cassandra.
Co-authored-by: Doug Rohrer <drohrer@apple.com>
patch by Francisco Guerrero, Doug Rohrer; reviewed by Yifan Cai, Dinesh Joshi for CASSANDRASC-47
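The startup fix described in this commit can be sketched as follows. This is an illustration under assumptions: `HealthCheckedDaemon` and its members are hypothetical names, and a JDK `ScheduledExecutorService` stands in for whatever timer the Sidecar actually uses. The point is only that a periodic timer first fires after one full period, so an explicit check at startup is needed to populate the adapter.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical daemon sketch; the names do not come from the Sidecar code base.
final class HealthCheckedDaemon {
    // Stays null until the first health check has run, like the `adapter` described above.
    final AtomicReference<String> adapter = new AtomicReference<>();

    void healthCheck() {
        adapter.compareAndSet(null, "adapter-ready");
    }

    // Runs one health check immediately, then schedules the periodic ones.
    ScheduledExecutorService start(long periodMillis) {
        healthCheck();
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        // The first scheduled run only happens after periodMillis, hence the call above.
        timer.scheduleAtFixedRate(this::healthCheck, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return timer;
    }
}
```

Callers should shut the returned scheduler down when the daemon stops.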
05c0bbe29f75d678596af09abb0c68d15e93f7ba | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2022-10-14 11:48:23-07:00
CASSANDRASC-46: Migrate minikube to testcontainers for integration tests
The existing Cassandra Sidecar integration testing suite uses Minikube to provision a
Cassandra service and it performs tests against the running database. This database runs
on Minikube.
A new alternative is to use [testcontainers](https://www.testcontainers.org), which requires
Docker to run tests. The concept is similar, but the benefit is that *testcontainers* is
well integrated into the Java ecosystem and it works well with JUnit 5. By replacing Minikube
with `testcontainers` we can simplify the setup process and reduce the complexity for running
integration tests.
Additionally, `testcontainers` is supported as part of the testing infrastructure inside
[CircleCI](https://www.testcontainers.org/supported_docker_environment/continuous_integration/circle_ci/).
Currently, our Minikube tests are broken both locally and in CircleCI. Moving to `testcontainers`
would also unblock us for further development.
Minor fix when running testcontainers
patch by Francisco Guerrero; reviewed by Yifan Cai, Dinesh Joshi for CASSANDRASC-46
30c04eb38a796183643bdcbaff8f425d90ebf671 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2022-10-10 18:48:31-07:00
CASSANDRASC-45: Delegate methods to the RateLimiter
Sidecar offers a `SidecarRateLimiter` class that internally uses the
`com.google.common.util.concurrent.RateLimiter`. In this commit, we expose public methods
of the `RateLimiter` class using the delegate pattern. These methods will allow us to tweak
the settings of the `RateLimiter` that are available to us.
patch by Francisco Guerrero; reviewed by Yifan Cai, Dinesh Joshi for CASSANDRASC-45
5471b66c1e69e057bbbe75e4ffe67c1891cd9495 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2022-10-10 14:35:12-07:00
CASSANDRASC-44 Refactor health check to use vertx timer
The Vertx API offers a periodic timer that integrates with its internal thread pooling
mechanism. In this commit, we utilize vertx's periodic timer instead of using an
`Executors.newSingleThreadScheduledExecutor()` on each delegate.
Another benefit of this approach is that if the cluster topology changes, e.g.
node replacement, cluster expansion / shrink, then the health checks will be performed
against the actual nodes in the cluster, assuming we receive an updated view of the
cluster when invoking the `Configuration#getInstancesConfig()#instances()` method.
We no longer need to worry about decommissioning the single thread executors running
on each delegate.
patch by Francisco Guerrero; reviewed by Yifan Cai, Dinesh Joshi for CASSANDRASC-44
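The topology benefit described in this commit can be sketched in plain Java. This is not the vertx API: a JDK `ScheduledExecutorService` stands in for the shared periodic timer, and `SharedHealthCheckTimer`, `tick`, and `checked` are hypothetical names. What the sketch shows is that a single timer which re-reads the instance list on every tick follows topology changes without per-instance executors to create or decommission.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: one shared timer checks whatever instances are configured at tick time.
final class SharedHealthCheckTimer {
    private final Supplier<List<String>> instances;
    // Records the hosts that were checked, in order; stands in for a real health probe.
    final List<String> checked = new ArrayList<>();

    SharedHealthCheckTimer(Supplier<List<String>> instances) {
        this.instances = instances;
    }

    // One timer tick: re-read the instance list so topology changes are picked up.
    synchronized void tick() {
        for (String host : instances.get()) {
            checked.add(host);
        }
    }

    // A single scheduler for all instances, instead of one executor per delegate.
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::tick, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return timer;
    }
}
```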
6e358acfce071cad16ac88c15dc2229bbb8a7944 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2022-10-07 16:39:12-07:00
CASSANDRASC-43 Add Schema API
This commit introduces the Schema API. Two new endpoints are added:
- /api/v1/schema/keyspaces
This endpoint returns the SchemaResponse with the full schema for all keyspaces.
- /api/v1/schema/keyspaces/:keyspace
This endpoint returns the SchemaResponse with the full schema for the requested keyspace.
patch by Francisco Guerrero; reviewed by Yifan Cai, Dinesh Joshi for CASSANDRASC-43
dad3e86dfe73ae1ba4aa5a23cf8194bed3f46322 | Author: Brandon Williams <brandonwilliams@apache.org>
| 2022-08-22 11:09:48-05:00
Ignore new error from CASSANDRA-17805 when attempting to replace a live node
Patch by brandonwilliams; reviewed by dcapwell and frankgh for
CASSANDRA-17847
83c169ec9e36324f27bf562951362f4a03c3c688 | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2022-08-19 10:20:57-07:00
Fix BulkLoader to load entireSSTableThrottle and entireSSTableInterDcThrottle
patch by Francisco Guerrero; reviewed by Ekaterina Dimitrova, Yifan Cai for CASSANDRA-17677
9184dd5a998366dc2b5c18d4954b13b033efcf80 | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2022-08-15 09:23:56-07:00
When doing a host replacement, we need to check that the node is a live node before failing with "Cannot replace a live node..."
patch by Francisco Guerrero; reviewed by Brandon Williams, David Capwell for CASSANDRA-17805
e5e13c02ccf386093153fd6824fd85ef7bd24eb3 | Author: Ekaterina Dimitrova <ekaterina.dimitrova@datastax.com>
| 2022-08-01 17:35:16-04:00
Fix default value for compaction_throughput_mb_per_sec in Config class to match the one in cassandra.yaml
patch by Ekaterina Dimitrova; reviewed by Francisco Guerrero, Michael Semb Wever for CASSANDRA-17790
07deb07d2e57a32c67643385403980a9a3c3fc95 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2022-07-13 10:39:06-07:00
CASSANDRASC-40 Fix search in list snapshot endpoint
This commit fixes test setup in SnapshotUtils. Because of the incorrect test setup
the execution is providing incorrect results. For example, assume the following path
/cassandra-test/data/ks/tbl/snapshots/test-snapshot
The test was configuring data directories as ["/cassandra-test/data"], but in a real
execution data directories are provided as ["/cassandra-test"]. This is causing the
endpoint to return incorrect values in the JSON payload.
Additionally, the response was providing the port for Cassandra and not the Sidecar
port.
a7a2c29e990acb2363eb7a15cc4b970ebdc04753 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2022-07-08 17:09:35-07:00
CASSANDRASC-39: Allow Cassandra input validation to be configurable
It is sometimes desirable to allow configuring the Cassandra input validation in case
we want to further restrict the existing validations. In this commit, we introduce the
ability to inject the ValidationConfiguration. A default YamlValidationConfiguration is
provided and the sidecar.yaml is updated to reflect the existing default validation.
dd08314ed654aafa60b2a82fc4953aac171ba3ef | Author: Ekaterina Dimitrova <ekaterina.dimitrova@datastax.com>
| 2022-06-30 17:06:58-04:00
Uncomment prepared_statements_cache_size, key_cache_size, counter_cache_size, index_summary_capacity which were
commented out by mistake in a previous patch;
Fix breaking change with cache_load_timeout; cache_load_timeout_seconds <=0 and cache_load_timeout=0 are equivalent
and they both mean disabled;
Deprecate public method setRate(final double throughputMbPerSec) in Compaction Manager in favor of
setRateInBytes(final double throughputBytesPerSec);
Revert breaking change removal of StressCQLSSTableWriter.Builder.withBufferSizeInMB(int size). Deprecate it in favor
of StressCQLSSTableWriter.Builder.withBufferSizeInMiB(int size);
Fix precision issues, add new -m flag (for nodetool/setstreamthroughput, nodetool/setinterdcstreamthroughput,
nodetool/getstreamthroughput and nodetool/getinterdcstreamthroughput), add new -d flags (nodetool/getstreamthroughput,
nodetool/getinterdcstreamthroughput, nodetool/getcompactionthroughput);
Fix a bug with precision in nodetool/compactionstats;
Deprecate StorageService methods and add new ones for stream_throughput_outbound, inter_dc_stream_throughput_outbound,
compaction_throughput_outbound in the JMX MBean `org.apache.cassandra.db:type=StorageService`;
Removed getEntireSSTableStreamThroughputMebibytesPerSec in favor of new getEntireSSTableStreamThroughputMebibytesPerSecAsDouble
in the JMX MBean `org.apache.cassandra.db:type=StorageService`;
Removed getEntireSSTableInterDCStreamThroughputMebibytesPerSec in favor of getEntireSSTableInterDCStreamThroughputMebibytesPerSecAsDouble
in the JMX MBean `org.apache.cassandra.db:type=StorageService`
Patch by Ekaterina Dimitrova; reviewed by Caleb Rackliffe, Francisco Guerrero for CASSANDRA-17225
a9725b681b948f2122f3d48b96a5c4e7403d2c39 | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2022-06-29 11:15:10-07:00
Fix AbstractCell#toString throws MarshalException for cell in collection
patch by Francisco Guerrero; reviewed by Caleb Rackliffe, Yifan Cai for CASSANDRA-17695
54abba8d7d870da5055bef79a51cf52fad980deb | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2022-06-29 09:10:28-07:00
CASSANDRASC-38 Add endpoint to list snapshot files
This commit adds two new endpoints to allow listing snapshot files.
The first endpoint takes a snapshot name as a path parameter, and searches for
all the snapshots matching the provided name. The result lists all the snapshot
files for all matching snapshots. Additionally, secondary index files can be
included in the response by providing the query param
`includeSecondaryIndexFiles=true`.
```
/api/v1/snapshots/:snapshot
```
The second endpoint takes a keyspace, table name, and snapshot name and searches
for a unique snapshot matching the provided snapshot name in the given keyspace
and table name. The result lists the snapshot files matching the given keyspace,
table name, and snapshot name. Similarly to the first endpoint, secondary index
files can be included in the response by providing the query param
`includeSecondaryIndexFiles=true`.
```
/api/v1/keyspace/:keyspace/table/:table/snapshots/:snapshot
```
24a08f22707901f7641e48f0c26e54b05c0e03c3 | Author: Saranya Krishnakumar <saranya_k@apple.com>
| 2022-06-27 10:47:26-07:00
Optimize file path builder and have separate handler for streaming file
patch by Francisco Guerrero, Saranya Krishnakumar; reviewed by Yifan Cai, Dinesh Joshi for CASSANDRASC-37
ed3901823a5fe9f8838d8b592a1b7703b12e810b | Author: Jyothsna Konisa <jkonisa@apple.com>
| 2022-05-24 10:21:16-07:00
Adding support for TLS client authentication for internode communication
patch by Jyothsna Konisa; reviewed by Bernardo Botella, Francisco Guerrero, Jon Meredith, Maulin Vasavada, Yifan Cai for CASSANDRA-17513
ffc4c89c3df7ad0ae73ebefdcb7e15a2790c0a52 | Author: Doug Rohrer <drohrer@apple.com>
| 2022-05-17 15:09:16-04:00
Fix issue where frozen maps may not be serialized in the correct order
patch by Doug Rohrer, Francisco Guerrero and Yifan Cai; reviewed by Andrés de la Peña and Caleb Rackliffe for CASSANDRA-17623
Co-authored-by: Doug Rohrer <drohrer@apple.com>
Co-authored-by: Francisco Guerrero <frank.guerrero@gmail.com>
Co-authored-by: Yifan Cai <ycai@apache.org>
2e233ec579e4d2a23021116027d75e776e7ad9ec | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2022-04-27 10:24:00-07:00
CASSANDRASC-36: Support for ErrorHandler in Sidecar
This commit adds a default `ErrorHandler` to the `Route#failureHandler()`. The default implementation
for Sidecar is `ErrorHandlerImpl`.
Additionally, we allow for custom implementations of the `ErrorHandler` to be injected (via module
overrides). This allows downstream projects to provide custom implementations of the `ErrorHandler` to
fit the specific needs of the project.
Patch by Francisco Guerrero; Reviewed by Dinesh Joshi, Saranya Krishnakumar and Yifan Cai for CASSANDRASC-36
bc219cbf75bdbfdc7a95b3160ef17332c9274b44 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2022-03-29 12:09:06-07:00
Add the CONTRIBUTING.md doc with guidelines and best practices
The document delves into the contribution guidelines, source code best practices
and source code style.
61be4d836213f708d9a29e59b9ef1df0bebef29a | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2022-03-16 01:31:00+01:00
expose gossip information in system_views.gossip_info virtual table
patch by Francisco Guerrero; reviewed by Stefan Miklosovic and Yifan Cai for CASSANDRA-17002
This commit adds a new virtual table that exposes the gossip information in tabular format.
The information is the same as the information presented through the `nodetool gossipinfo`
command, but the virtual table splits the version and value from `VersionedValue` into two
different columns. This is intended to help clients reading the vtable without the need of
parsing the version:value information (as it currently stands in gossipinfo).
The token value does not have a column. This is consistent with the gossipinfo output which
always renders ":<hidden>" for the Token value. Only the token_version column is available.
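For context, the client-side parsing that the separate columns make unnecessary might look like the sketch below. It assumes each rendered entry has the form `<version>:<value>`; that format, the class `RenderedVersionedValue`, and the sample inputs are assumptions for illustration, not taken from the Cassandra code base.

```java
// Hypothetical helper; it assumes a rendered entry of the form "<version>:<value>".
final class RenderedVersionedValue {
    final int version;
    final String value;

    private RenderedVersionedValue(int version, String value) {
        this.version = version;
        this.value = value;
    }

    static RenderedVersionedValue parse(String rendered) {
        // Split on the first ':' only, because the value itself may contain colons.
        int separator = rendered.indexOf(':');
        if (separator < 0) {
            throw new IllegalArgumentException("expected <version>:<value> but got: " + rendered);
        }
        int version = Integer.parseInt(rendered.substring(0, separator));
        return new RenderedVersionedValue(version, rendered.substring(separator + 1));
    }
}
```

With the virtual table, a client reads the version and the value from their own columns and needs no such helper.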
7a5e710a2173e492907edc4094e052157a562103 | Author: Francisco Guerrero <francisco.guerrero@apple.com>
| 2022-03-11 15:43:24-08:00
CASSANDRASC-34: Allow for LoggerHandler to be injected
Currently, `vertxRouter` adds an instance of `LoggerHandler` to the top level route.
This is prescriptive and it doesn't allow for a different implementation of the LoggerHandler
to be injected.
In this commit, `LoggerHandler` is created in the `MainModule` as a singleton and then
injected in the `vertxRouter` method. This allows for a new implementation of the `LoggerHandler`
to be provided.
543608ba39d5803b963d14821abe193ff0796b4f | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2022-02-07 11:25:45-08:00
Instance failed to start up due to NPE in StartupClusterConnectivityChecker
patch by Francisco Guerrero; reviewed by Stefan Miklosovic, Yifan Cai for CASSANDRA-17347
df16b3750dc2c1b6b9bcdece6f81dfd3de7ebdfa | Author: David Capwell <dcapwell@apache.org>
| 2022-02-04 10:15:58-08:00
When streaming sees a ClosedChannelException this triggers the disk failure policy
patch by David Capwell, Francisco Guerrero; reviewed by Caleb Rackliffe, Dinesh Joshi for CASSANDRA-17116
0448f15e3db392f2f60db332fabf6309aa3d5089 | Author: David Capwell <David Capwell>
| 2022-02-04 10:15:46-08:00
When streaming sees a ClosedChannelException this triggers the disk failure policy
patch by David Capwell, Francisco Guerrero; reviewed by Caleb Rackliffe, Dinesh Joshi for CASSANDRA-17116
945a4fc23ac1f60b8380be3b60aef89caf3daba2 | Author: Shailaja Koppu <s_koppu@apple.com>
| 2022-02-01 09:53:49-08:00
Add a virtual table for exposing prepared statements metrics
patch by Shailaja Koppu; reviewed by Ekaterina Dimitrova, Francisco Guerrero, Yifan Cai for CASSANDRA-17224
e773bbd9c6c52a2fedc127cb7ab77a1fdbeb63d2 | Author: cclive1601 <cclive1601@gmail.com>
| 2022-01-21 11:07:37-08:00
Reject snapshot names with special character
Patch by Maxwell Guo, Saranya Krishnakumar and Francisco Guerrero; review
by Benjamin Lerer, Chris Lohfink, Stefan Miklosovic for CASSANDRA-15297
This commit also adds unit tests for the snapshot nodetool command.
Co-authored-by: Saranya Krishnakumar <saranya_k@apple.com>
Co-authored-by: Francisco Guerrero <francisco.guerrero@apple.com>
0dc5a289e8dd586150253d951e6e229480c0ffc8 | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2022-01-14 16:13:00-08:00
Preserve tests that use BigInt numbers
Patch by Francisco Guerrero; reviewed by brandonwilliams and ycai for
CASSANDRA-17133
e0a61f73b9b9d14db3e68aafb38257a7689557b9 | Author: Francisco Guerrero <frank.guerrero@gmail.com>
| 2022-01-14 15:36:35-08:00
Revert "Add unix time conversion functions"
This reverts commit 8ddcd43b0cfcebfda882a238532d00905fe85eb8.