As identified during SyncReplication testing, ActiveStream::processItems() can fail to correctly mark a snapshot as a Checkpoint boundary, if a checkpoint_start item is the last one it processes.
DCP Snapshot represents an ordered sequence of mutations / deletions. It doesn't necessarily map 1:1: to a checkpoint, because the ActiveStream may fetch items from the CheckpointManager before the Checkpoint has been closed. For example, consider an open Checkpoint which currently contains seqnos 1..2 (but can still have items added); and a DCP client with a cursor point at the beginning of it:
If an ActiveStream is ready to process data, it will call getAllItemsForCursor which will advance the cursor to the current end of the Checkpoint, and return 3 queued_items:
These are handled by processItems and result in the following DCP mutations being added to the readyQ:
At this point the cursor is pointing after the current Checkpoint contents:
Now, consider some more items are added to the checkpoint manager and a new Checkpoint added:
ActiveStream wants to process more data, calls getAllItemsForCursor and receives the following queued_items:
These are again handled by processItems; and should result in the next SnapshotMarker having the MARKER_FLAG_CHK flag set - to indicate this is the start of a new Checkpoint. However, it cannot yet add a SnapshotMarker to the readyQ because SnapshotMarkers must define a non-empty seqno range - i.e. we cannot create one until at least one Mutation is also ready to process.
The bug is in this handling - ActiveStream::processItems fails to account for this "pending" SnapshotMarker and ignores it. When ActiveStream::processItems is next called (with one more more mutations) the created Snapshot marker does not have MARKER_FLAG_CHK flag set.
In the context of normal mutations (non-SyncWrites) this doesn't really matter - while the replica won't create a Checkpoint when it should, there's no significant consequence to this, other than the replica may end up with larger (and more costly to maintain) checkpoints. However for SyncWrites this breaks the model, as we cannot de-duplicate SyncWrite related items (prepare_sync_write / commit_sync_write).
|For Gerrit Dashboard: MB-32862|
|104260,5||MB-32862: Correctly encode DCP SnapMarker if last item processed is chk_start||master||kv_engine||Status: MERGED||+2||+1|