Couchbase Server / MB-56084

Non-collection-aware client might enter a rollback loop if purgeSeqno is above the _default collection highSeqno

Details

    • Untriaged

    Description

      If we have a legacy DCP consumer that has not enabled collections (or SeqnoAdvanced), and we need to stream to a seqno above the _default collection's high seqno, we just end the stream (MB-48759).

      However, if the client has requested an endSeqno that requires streaming items from memory, and the backfill cannot reach the seqno of the cursor registered in the CheckpointManager, we set pendingBackfill = true (ActiveStream::registerCursor) when re-registering the cursor from markDiskSnapshot, and we reschedule the backfill.

      If the reason we cannot reach the cursor's seqno is that the purgeSeqno has advanced past the highSeqno of the _default collection, we end up rescheduling a backfill with startSeqno <= purgeSeqno, which fails.

      Let's look at an example:

      Seqno | 1 2 3 4
      Coll  | _ _ _ A
      Type  | S S D S
       
      Legend:
      _ default collection
      A other collection
      S store
      D delete

      * Example is simplified: we'd have SystemEvents for the collection create somewhere.
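
      The example history, and the purge described next, can be modelled in a few lines. This is only an illustrative sketch: the Item type and the high_seqno and purge helpers are hypothetical names, not KV-Engine code. It shows that once the tombstone at seqno 3 is purged, the purge seqno (3) sits above the _default collection's high seqno on disk (2), while the vBucket high seqno is still 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    seqno: int
    collection: str  # "_default" or a named collection such as "A"
    op: str          # "store" or "delete"


# The history from the table above (SystemEvents omitted, as in the example).
HISTORY = [
    Item(1, "_default", "store"),
    Item(2, "_default", "store"),
    Item(3, "_default", "delete"),
    Item(4, "A", "store"),
]


def high_seqno(items, collection=None):
    """Highest seqno overall, or within one collection if one is given."""
    seqnos = [i.seqno for i in items
              if collection is None or i.collection == collection]
    return max(seqnos, default=0)


def purge(items, purge_seqno):
    """Drop tombstones at or below purge_seqno; stores are always kept."""
    return [i for i in items
            if not (i.op == "delete" and i.seqno <= purge_seqno)]


on_disk = purge(HISTORY, purge_seqno=3)
print([i.seqno for i in on_disk])        # seqnos remaining on disk
print(high_seqno(on_disk, "_default"))   # _default high seqno on disk
print(high_seqno(on_disk))               # vBucket high seqno
```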

      We purge up to seqno 3 (so the tombstone is now purged and purgeSeqno = 3).

      If we have a stream from zero to infinity, we register a cursor in the CheckpointManager and set the pendingBackfill flag to true, so we can resume the streaming after the disk backfill.

      The first backfill is a disk backfill which can only process items 1 and 2, not 3, because 3 has been purged.

      When we reschedule a backfill, it starts at the last backfill seqno + 1, so 3. In the DCPBackfillBySeqnoDisk::create code we check if startSeqno <= purgeSeqno and fail in that case, with reason=Rollback.
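
      The arithmetic of the rescheduled backfill can be sketched as follows. The function names are hypothetical, and the sketch models only the comparison described above, not the full logic of DCPBackfillBySeqnoDisk::create.

```python
def next_backfill_start(last_backfill_seqno):
    """A rescheduled backfill resumes one past the last seqno it processed."""
    return last_backfill_seqno + 1


def check_backfill_start(start_seqno, purge_seqno):
    """Mirror the described check: fail with Rollback if the start is at or
    below the purge seqno, because purged tombstones may have been missed."""
    return "Rollback" if start_seqno <= purge_seqno else "OK"


# The first backfill processed seqnos 1 and 2; purgeSeqno is 3.
start = next_backfill_start(last_backfill_seqno=2)
print(start, check_backfill_start(start, purge_seqno=3))
```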

      As noted above, the stream request is from 0, so asking the consumer to roll back creates a loop that persists until more mutations are made to the _default collection, past the purgeSeqno.
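
      A toy simulation of the loop, under simplifying assumptions: legacy_stream_from_zero is a hypothetical helper, and a further backfill is treated as pending whenever the stream has not reached the vBucket high seqno. Every retry from 0 ends in Rollback until a new _default mutation lands above the purgeSeqno.

```python
def legacy_stream_from_zero(disk_items, purge_seqno):
    """Model one stream request from seqno 0 by a non-collection-aware client.

    disk_items is a list of (seqno, collection) pairs still on disk.
    Returns (status, seqnos_sent).
    """
    # The legacy client only ever receives _default collection items.
    sent = [seqno for seqno, coll in disk_items if coll == "_default"]
    last_backfill_seqno = max(sent, default=0)
    vb_high_seqno = max((seqno for seqno, _ in disk_items), default=0)

    # Simplification (assumption): another backfill is pending whenever the
    # stream has not yet reached the vBucket high seqno.
    if last_backfill_seqno < vb_high_seqno:
        start_seqno = last_backfill_seqno + 1
        if start_seqno <= purge_seqno:
            return "Rollback", sent
    return "OK", sent


# State after purging the tombstone at seqno 3.
disk = [(1, "_default"), (2, "_default"), (4, "A")]

# The client is told to roll back, restarts from 0, and hits the same state.
statuses = [legacy_stream_from_zero(disk, purge_seqno=3)[0] for _ in range(3)]
print(statuses)

# A new mutation to _default above the purgeSeqno breaks the loop.
disk.append((5, "_default"))
status_after_write = legacy_stream_from_zero(disk, purge_seqno=3)[0]
print(status_after_write)
```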

       

      Issue Resolution
      A rollback loop affected legacy clients when collections were used and a tombstone newer than the last mutation in the default collection was purged. The lastReadSeqno is now incremented when the client is not collection-aware.
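
      One way to picture the effect of the fix, as a sketch only: the ticket states that lastReadSeqno is incremented for non-collection-aware clients, but how far it is advanced is an assumption here (to the end of the disk snapshot), and next_backfill_start is a hypothetical helper rather than the actual change.

```python
def next_backfill_start(last_default_seqno, snapshot_end_seqno,
                        collection_aware, with_fix):
    """Return the start seqno of a rescheduled backfill.

    Assumption: with the fix, a legacy stream advances lastReadSeqno to the
    end of the disk snapshot even though no item was sent for those seqnos.
    """
    last_read_seqno = last_default_seqno
    if with_fix and not collection_aware:
        last_read_seqno = max(last_read_seqno, snapshot_end_seqno)
    return last_read_seqno + 1


PURGE_SEQNO = 3

# Example from the description: _default items 1 and 2 sent, snapshot ends at 4.
before = next_backfill_start(2, 4, collection_aware=False, with_fix=False)
after = next_backfill_start(2, 4, collection_aware=False, with_fix=True)

print(before, before <= PURGE_SEQNO)  # start at or below purgeSeqno: Rollback
print(after, after <= PURGE_SEQNO)    # start above purgeSeqno: no Rollback
```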

          People

            vesko.karaganev Vesko Karaganev