CASSANDRA-20567: SAI marks an index as non-empty when a partial partition/row modifications is flushed due to repair #4105

dcapwell · 2025-04-18T22:00:02Z

No description provided.

dcapwell · 2025-04-18T22:01:31Z

src/java/org/apache/cassandra/index/sai/disk/v1/MemtableIndexWriter.java

@@ -218,7 +218,7 @@ private void flushVectorIndex(long startTime, Stopwatch stopwatch) throws IOExce
    private void completeIndexFlush(long cellCount, long startTime, Stopwatch stopwatch) throws IOException
    {
        // create a completion marker indicating that the index is complete and not-empty
-        ColumnCompletionMarkerUtil.create(indexDescriptor, indexIdentifier, false);
+        ColumnCompletionMarkerUtil.create(indexDescriptor, indexIdentifier, cellCount == 0);


This is the actual patch for the bug fix. Repair causes multiple SSTables to be generated, and the tests do partial partition/row updates, so SAI sees a column as having data but writes nothing for this range, which leads to cellCount == 0.
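
A minimal sketch of the effect of this change, assuming a much-simplified model of the flush. `CompletionMarkerSketch`, `cellCount(...)`, and the two `markerIsEmpty...` helpers are hypothetical names made up for illustration; they are not the real SAI classes or signatures. Only the `cellCount == 0` check comes from the diff above.

```java
import java.util.Set;

// Hypothetical, simplified model; these are NOT Cassandra's real classes or signatures.
final class CompletionMarkerSketch
{
    // Counts the indexed keys that actually landed in the sstable being flushed.
    static long cellCount(Set<Integer> keysWithIndexedValue, Set<Integer> keysInThisSSTable)
    {
        return keysWithIndexedValue.stream().filter(keysInThisSSTable::contains).count();
    }

    // Behaviour before the patch: the marker always claimed "not empty".
    static boolean markerIsEmptyBeforePatch(long cellCount)
    {
        return false;
    }

    // Behaviour after the patch: the marker is "empty" exactly when nothing was written.
    static boolean markerIsEmptyAfterPatch(long cellCount)
    {
        return cellCount == 0;
    }

    public static void main(String[] args)
    {
        // The memtable indexed keys 1 and 2, but this sstable only covers keys 7 and 8.
        long cells = cellCount(Set.of(1, 2), Set.of(7, 8));
        System.out.println("cellCount=" + cells);
        System.out.println("before patch, empty=" + markerIsEmptyBeforePatch(cells));
        System.out.println("after patch, empty=" + markerIsEmptyAfterPatch(cells));
    }
}
```

In this model, the pre-patch marker says the index has data even though no cells were written for the range, which is the mismatch described above.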

dcapwell · 2025-04-18T22:05:08Z

test/distributed/org/apache/cassandra/distributed/test/cql3/SingleNodeTableWalkTest.java

@@ -371,6 +371,8 @@ public void test() throws IOException
                                                                        .add(this::selectTokenRange))
                                  .addIf(State::hasEnoughMemtable, StatefulASTBase::flushTable)
                                  .addIf(State::hasEnoughSSTables, StatefulASTBase::compactTable)
+                                  .addAllIf(BaseState::allowRepair, b -> b.add(StatefulASTBase::incrementalRepair)
+                                                                          .addIf(BaseState::isConsistent, StatefulASTBase::previewRepair))


A preview repair fails if it finds a mismatch, so the test can only call it safely when we know the replicas should be consistent.

For Accord, we flush during the prepare phase, which happens before any barriers, so we can't just say "Accord makes us consistent", because we might not be (you need to go through Accord to be consistent).
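
To illustrate the gating, here is a rough sketch of a conditional command builder. `CommandGatingSketch`, its `State`, and its `Builder` are hypothetical stand-ins written for this example; they only mimic the shape of the `addAllIf`/`addIf` calls in the diff and are not the real test harness API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical miniature of a conditional command builder; NOT the real test harness API.
final class CommandGatingSketch
{
    // Minimal state: whether repair is allowed, and whether replicas are known to agree.
    static final class State
    {
        final boolean allowRepair;
        final boolean consistent;

        State(boolean allowRepair, boolean consistent)
        {
            this.allowRepair = allowRepair;
            this.consistent = consistent;
        }
    }

    static final class Builder
    {
        private final State state;
        private final List<String> commands = new ArrayList<>();

        Builder(State state)
        {
            this.state = state;
        }

        // Unconditionally registers a command.
        Builder add(String command)
        {
            commands.add(command);
            return this;
        }

        // Registers a command only when the condition holds for the current state.
        Builder addIf(Predicate<State> condition, String command)
        {
            if (condition.test(state))
                commands.add(command);
            return this;
        }

        // Runs a nested group of registrations only when the condition holds.
        Builder addAllIf(Predicate<State> condition, Consumer<Builder> nested)
        {
            if (condition.test(state))
                nested.accept(this);
            return this;
        }

        List<String> build()
        {
            return List.copyOf(commands);
        }
    }

    // Mirrors the shape of the diff: preview repair is nested under "repair allowed"
    // and additionally gated on "known consistent".
    static List<String> repairCommands(State state)
    {
        return new Builder(state)
               .addAllIf(s -> s.allowRepair, b -> b.add("incrementalRepair")
                                                  .addIf(s -> s.consistent, "previewRepair"))
               .build();
    }
}
```

With this shape, a state that allows repair but is not known to be consistent only ever gets the incremental repair command, so a preview repair cannot fail on an expected mismatch.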

dcapwell · 2025-04-18T22:05:42Z

test/distributed/org/apache/cassandra/distributed/test/cql3/StatefulASTBase.java

@@ -144,7 +149,7 @@ protected static String nextKeyspace()

    protected void clusterConfig(IInstanceConfig config)
    {
-
+        config.set("repair.retries.max_attempts", Integer.MAX_VALUE);


These are jvm-dtest clusters, so retry nonstop until it works. These tests don't do chaos testing, so it's really the "happy path" case.
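
As a rough illustration of why an effectively unbounded attempt limit is fine on the happy path, here is a generic retry loop. `RetrySketch.retry` is a hypothetical helper written for this example; it is not how Cassandra implements `repair.retries.max_attempts`.

```java
import java.util.function.BooleanSupplier;

// Hypothetical generic retry loop; NOT Cassandra's repair retry implementation.
final class RetrySketch
{
    // Runs the attempt until it succeeds or maxAttempts is exhausted.
    // Returns the number of attempts used, or -1 if every attempt failed.
    static int retry(int maxAttempts, BooleanSupplier attempt)
    {
        int used = 0;
        while (used < maxAttempts)
        {
            used++;
            if (attempt.getAsBoolean())
                return used;
        }
        return -1;
    }

    public static void main(String[] args)
    {
        int[] calls = { 0 };
        // Fails twice, then succeeds; the huge limit is never approached.
        int used = retry(Integer.MAX_VALUE, () -> ++calls[0] >= 3);
        System.out.println("attempts used: " + used);
    }
}
```

Without injected faults, the attempt succeeds after a small number of tries, so the limit only guards against giving up too early.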

dcapwell · 2025-04-18T22:22:48Z

test/distributed/org/apache/cassandra/fuzz/sai/AccordFullMultiNodeSAITest.java

@@ -25,7 +25,7 @@
 import org.apache.cassandra.harry.gen.SchemaGenerators;
 import org.apache.cassandra.service.consensus.TransactionalMode;

-@Ignore("CASSANDRA-20567: Repair is failing due to missing SAI index files when using zero copy streaming")
+@Ignore("It was believed that these tests were failing due to CASSANDRA-20567, but in fixing that issue it was found that the tests are still failing!  Harry is detecting an incorrect response...")


I was really hoping this patch would fix these tests, so we have to root-cause further.

I had a ton of seeds from CI, and none reproduced locally =(

dcapwell · 2025-04-18T22:23:19Z

test/unit/org/apache/cassandra/repair/FuzzTestBase.java

@@ -557,53 +554,12 @@ static RepairOption previewOption(RandomSource rs, Cluster.Node coordinator, Str

    private static RepairOption repairOption(RandomSource rs, Cluster.Node coordinator, String ks, Gen<List<String>> tablesGen, Gen<RepairType> repairTypeGen, Gen<PreviewType> previewTypeGen, Gen<RepairParallelism> repairParallelismGen)
    {
-        RepairType type = repairTypeGen.next(rs);


Rather than having different tests randomize repair options, I pulled this out into RepairGenerators so the logic can be reused.

dcapwell · 2025-04-18T22:24:35Z

test/unit/org/apache/cassandra/utils/CassandraGenerators.java

            if (nextBoolean(rnd))
-                options.put(LeveledCompactionStrategy.LEVEL_FANOUT_SIZE_OPTION, SourceDSL.integers().between(1, 100).generate(rnd).toString());
+            {
+                // there is a relationship between sstable size and fanout, so respect it


One seed randomly hit this case, so fix it so it doesn't happen again.

dcapwell · 2025-04-22T16:43:25Z

I am going to move the fuzz test logic out of this patch, as it keeps being flaky due to ZCS (zero-copy streaming) allocating direct memory and failing. I am testing removing this allocation (which should make the test no longer flaky), but that has nothing to do with SAI, so it is best done as a separate ticket.


dcapwell force-pushed the CASSANDRA-20567 branch from 0d6d5a8 to 77c6170 Compare April 22, 2025 16:58

dcapwell mentioned this pull request Apr 22, 2025

CASSANDRA-20577: zero copy streaming allocates direct memory that isnt used, but does help to fragment the memory space #4107

Open

dcapwell added 4 commits April 23, 2025 14:21

CASSANDRA-20567: SAI marks an index as non-empty when a partial parti…

8d5a39c

…tion/row modifications is flushed due to repair

remove CASSANDRA-20577 fromt he diff

f07c626

added static columns... why not

eb2450a

forgot to index the column

bdbc5a1

dcapwell force-pushed the CASSANDRA-20567 branch from db40871 to bdbc5a1 Compare April 23, 2025 21:21

if the intersect(memtable, rowMapping) is empty, dont bother to try t…

355edeb

…o flush

maedhroz approved these changes Apr 23, 2025
