  Couchbase Server / MB-55101

Max Visible Seqno and High Completed Seqno could be incorrect for an incomplete snapshot

Details

    • Untriaged
    • 0
    • Unknown

    Description

      Issue noted during code review whilst developing another feature. It is theoretical and unproven.

      The max visible seqno (MVS) and high-completed seqno (HCS) are special 'markers' tracked per vbucket for different purposes. For example, MVS denotes the highest committed seqno, i.e. a prepare or abort will not 'increment' this counter. It exists so that, for example, a DCP client can call getAllVBSeqnos, which returns the MVS, and a DCP client that has not enabled prepare/abort then doesn't 'miss' events on its stream.
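
      A minimal sketch of the MVS behaviour described above, written as a toy Python model rather than kv_engine code. The names ToyVBucket, Op and VISIBLE_OPS are hypothetical, and the sketch assumes that every operation consumes a seqno while only mutations and commits are visible to a client that has not enabled prepare/abort.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Op(Enum):
    MUTATION = auto()
    PREPARE = auto()
    COMMIT = auto()
    ABORT = auto()


# Assumption: only these operation types are visible to a client that has
# not enabled prepare/abort on its stream.
VISIBLE_OPS = {Op.MUTATION, Op.COMMIT}


@dataclass
class ToyVBucket:
    """Hypothetical stand-in for a vbucket; not the kv_engine class."""

    high_seqno: int = 0
    max_visible_seqno: int = 0

    def apply(self, op: Op) -> int:
        # Every operation consumes a seqno...
        self.high_seqno += 1
        # ...but only visible operations move the MVS.
        if op in VISIBLE_OPS:
            self.max_visible_seqno = self.high_seqno
        return self.high_seqno


vb = ToyVBucket()
for op in (Op.MUTATION, Op.PREPARE, Op.ABORT, Op.PREPARE):
    vb.apply(op)

print(vb.high_seqno, vb.max_visible_seqno)  # prints "4 1"
```

      In this model a client that streams up to the MVS (1) stops at the last item it can actually see, instead of waiting for seqno 4, which belongs to a prepare it would never be sent.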

      The MVS/HCS are replicated in snapshot markers and this is where this issue could occur.

      If a cluster is building a new replica, the active sends it a disk snapshot. The snapshot marker, which is sent ahead of the data, includes the MVS/HCS; the replica vbucket reads these values and places them into its VBucket object. Only then are the snapshot items transmitted.
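
      The ordering described above can be sketched with a toy model. ToySnapshotMarker, ToyReplica and their field names are hypothetical and do not mirror the real DCP structures; the seqno values are made up for illustration. The point is only that the MVS/HCS are stored when the marker is read, before any item of the snapshot has arrived.

```python
from dataclasses import dataclass


@dataclass
class ToySnapshotMarker:
    """Hypothetical marker; field names are illustrative only."""

    start_seqno: int
    end_seqno: int
    max_visible_seqno: int
    high_completed_seqno: int


@dataclass
class ToyReplica:
    """Hypothetical replica vbucket state; not the kv_engine class."""

    high_seqno: int = 0
    max_visible_seqno: int = 0
    high_completed_seqno: int = 0

    def on_snapshot_marker(self, marker: ToySnapshotMarker) -> None:
        # The marker values are applied immediately, ahead of the data.
        self.max_visible_seqno = marker.max_visible_seqno
        self.high_completed_seqno = marker.high_completed_seqno

    def on_item(self, seqno: int) -> None:
        # Items only advance the high seqno as they actually arrive.
        self.high_seqno = seqno


replica = ToyReplica()
replica.on_snapshot_marker(
    ToySnapshotMarker(
        start_seqno=1,
        end_seqno=2_000_000,
        max_visible_seqno=2_000_000,
        high_completed_seqno=1_999_990,
    )
)
replica.on_item(1)  # only the first item of the snapshot has arrived so far

print(replica.high_seqno, replica.max_visible_seqno, replica.high_completed_seqno)
# prints "1 2000000 1999990"
```

      While the snapshot is incomplete, the toy replica holds an MVS and HCS that are far ahead of the data it has actually received.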

      Next, imagine that a new problem occurs and data loss is accepted: the replica which is still being built is forced to become active. Assume it has received only 1 mutation, yet its MVS and HCS are millions of seqnos higher.

      I cannot see any recovery or protection in place. We would of course already be in a bad place because of the accepted data loss, but the vbucket state is now also out of sync. E.g. getAllVBSeqnos will report the huge MVS, yet a new DCP stream will never reach such a seqno.
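
      To make the out-of-sync state concrete, here is a small sketch under stated assumptions. The helper stream_reaches_target and the values used are hypothetical, and the check that the MVS should not exceed the highest seqno held by the vbucket is an assumed invariant used for illustration, not something taken from the kv_engine code.

```python
def stream_reaches_target(available_seqnos, target_seqno):
    """Return True if a stream over the given seqnos ever reaches target_seqno."""
    return any(seqno >= target_seqno for seqno in available_seqnos)


# State of the hypothetical promoted vbucket from the scenario above.
items_on_vbucket = [1]    # it received a single mutation before promotion
reported_mvs = 2_000_000  # value taken from the snapshot marker (made up)

# A client treats the reported MVS as the seqno it expects to stream up to.
print(stream_reaches_target(items_on_vbucket, reported_mvs))  # prints "False"

# Assumed invariant: the MVS should never exceed the highest seqno held.
consistent = reported_mvs <= max(items_on_vbucket)
print(consistent)  # prints "False"
```

      With the items currently on the vbucket, the stream ends at seqno 1 while the client is still expecting seqno 2,000,000.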

          People

            jwalker Jim Walker