Details
- Bug
- Resolution: Fixed
- Major
- 7.0.5, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.2.0
- Untriaged
- 0
- Unknown
Description
If we have a legacy DCP consumer that has not enabled collections (or SeqnoAdvanced) and we need to stream to a seqno above the high seqno of the _default collection, we just end the stream (MB-48759).
However, if the client has requested an endSeqno that requires streaming items from memory, and the backfill cannot reach the seqno of the cursor registered in the CheckpointManager, we set pendingBackfill = true (ActiveStream::registerCursor) when re-registering the cursor from markDiskSnapshot, and then reschedule the backfill.
If the cursor's seqno cannot be reached because the purgeSeqno has advanced past the highSeqno of the _default collection, we end up rescheduling a backfill with startSeqno <= purgeSeqno, which fails.
—
Let's look at an example:
Seqno | 1 2 3 4
Coll  | _ _ _ A
Type  | S S D S

Legend:
_  default collection
A  other collection
S  store
D  delete
We purge up to seqno 3 (so the tombstone is now purged and purgeSeqno = 3).
If we have a stream from zero to infinity, we register a cursor in the CheckpointManager and set the pendingBackfill flag to true, so we can resume the streaming after the disk backfill.
The first backfill is a disk backfill which can only process items 1 and 2, not 3, because 3 has been purged.
When we reschedule a backfill, it starts at the last backfill seqno + 1, so 3. In the DCPBackfillBySeqnoDisk::create code we check if startSeqno <= purgeSeqno and fail in that case, with reason=Rollback.
As noted above, the stream request is from 0, so asking the consumer to roll back creates a loop, which lasts until new mutations are made to the _default collection at seqnos past the purgeSeqno.
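The example above can be sketched as a small model. This is a simplified illustration and not the kv_engine code: `Item`, `lastSeqnoSentToLegacy` and `rescheduledBackfillRollsBack` are hypothetical names, and the only logic taken from the ticket is that the rescheduled backfill starts at the last backfill seqno + 1 and is rejected when startSeqno <= purgeSeqno.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified model; these are not the real kv_engine types.
enum class Collection { Default, Other };

struct Item {
    uint64_t seqno;
    Collection collection;
};

// Last seqno a legacy (non-collection-aware) consumer receives from a disk
// backfill: only items in the _default collection are sent to it.
uint64_t lastSeqnoSentToLegacy(const std::vector<Item>& disk) {
    uint64_t last = 0;
    for (const auto& item : disk) {
        if (item.collection == Collection::Default && item.seqno > last) {
            last = item.seqno;
        }
    }
    return last;
}

// The check the ticket describes for DCPBackfillBySeqnoDisk::create: the
// rescheduled backfill starts at lastBackfillSeqno + 1 and is rejected
// (reason=Rollback) when that start is at or below the purge seqno.
bool rescheduledBackfillRollsBack(uint64_t lastBackfillSeqno,
                                  uint64_t purgeSeqno) {
    const uint64_t startSeqno = lastBackfillSeqno + 1;
    return startSeqno <= purgeSeqno;
}
```

With the disk contents left after the purge (seqnos 1 and 2 in _default, seqno 4 in collection A) and purgeSeqno = 3, `lastSeqnoSentToLegacy` yields 2 and `rescheduledBackfillRollsBack(2, 3)` is true on every retry. Once a new _default mutation lands at seqno 5, the last sent seqno becomes 5 and the check no longer fires.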
Issue | Resolution |
A rollback loop affected legacy clients when collections were used and a tombstone newer than the last mutation in the default collection was purged. | The lastReadSeqno is now incremented when the client is not collection-aware. |
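One possible reading of this resolution, as a hedged sketch: the stream advances lastReadSeqno for every item the backfill reads, including items that are filtered out because the client is not collection-aware. `StreamState` and `processBackfillItem` are hypothetical names for illustration only, and the actual change in kv_engine may differ.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified model; these are not the real kv_engine types.
enum class Collection { Default, Other };

struct Item {
    uint64_t seqno;
    Collection collection;
};

struct StreamState {
    bool collectionAware = false;
    uint64_t lastReadSeqno = 0;
    std::vector<uint64_t> sent; // seqnos actually sent to the consumer
};

void processBackfillItem(StreamState& stream, const Item& item) {
    const bool visible = stream.collectionAware ||
                         item.collection == Collection::Default;
    if (visible) {
        stream.sent.push_back(item.seqno);
    }
    // Assumed behaviour of the fix: record the seqno as read even when the
    // item is not sent, so a rescheduled backfill starts after it.
    stream.lastReadSeqno = item.seqno;
}
```

Under this assumption, backfilling seqnos 1, 2 and 4 from the example sends only 1 and 2 to the legacy client but leaves lastReadSeqno at 4, so the next backfill would start at 5, above purgeSeqno = 3.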