  1. Couchbase Server
  2. MB-53189

Set failover table branch point based on in-memory state

Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: Morpheus
    • Affects Version/s: 6.6.2, 6.6.5, 7.0.4, 7.1.1
    • Component/s: couchbase-bucket
    • None
    • 1

    Description

      Summary

      We should investigate using the last complete in-memory snapshot for calculating the failover table branch point, instead of the last on-disk persisted snapshot.

      Context

      During failover, when a replica vBucket is promoted to active, KV-Engine creates a failover table entry to identify the new history. The branch point is placed at the most recent consistent point which has been persisted to disk - see KVBucket::setVBucketState_UNLOCKED:

      KVBucket::setVBucketState_UNLOCKED

          if (to == vbucket_state_active && oldstate != vbucket_state_active &&
              transfer == TransferVB::No) {
              // Changed state to active and this isn't a transfer (i.e.
              // takeover), which means this is a new fork in the vBucket history
              // - create a new failover table entry.
              const snapshot_range_t range = vb->getPersistedSnapshot();
              auto highSeqno = range.getEnd() == vb->getPersistenceSeqno()
                                       ? range.getEnd()
                                       : range.getStart();
              vb->createFailoverEntry(highSeqno);
      

      Note how highSeqno is computed: if a complete snapshot has been persisted (the end of the persisted snapshot equals the persistence seqno) then we place the failover branch point at the end of that snapshot; if not, we place the branch point at the start of that snapshot - i.e. the previous consistent point.
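
      To make the rule concrete, here is a minimal standalone sketch of the same selection logic. PersistedRange and persistedBranchPoint are hypothetical names used only for illustration; they are not kv_engine types or functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the persisted snapshot range (not the real
// kv_engine snapshot_range_t).
struct PersistedRange {
    uint64_t start;
    uint64_t end;
};

// Mirrors the ternary in the excerpt above: branch at the end of the
// persisted snapshot if it is fully on disk, otherwise fall back to its
// start (the previous consistent point).
uint64_t persistedBranchPoint(PersistedRange persisted,
                              uint64_t persistenceSeqno) {
    assert(persisted.start <= persisted.end);
    return persisted.end == persistenceSeqno ? persisted.end
                                             : persisted.start;
}
```

      With a persisted snapshot of [100, 200], a persistence seqno of 200 gives a branch point of 200, whereas a persistence seqno of 150 (snapshot only partially on disk) gives a branch point of 100.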

      Historically this made sense as a vBucket state change was persisted to disk asynchronously with respect to the sequence of mutations - i.e. we would persist the setVBState immediately, "in the middle" of the outstanding mutations of the current snapshot, and hence we could only consider the on-disk state when determining the failover branch point.

      Proposal

      Since MB-35331 (https://review.couchbase.org/c/kv_engine/+/113904), included in v6.5.0, the vBucket state change is recorded in a meta-item and enqueued (in-order) in the CheckpointManager. As such, if we happen to have a complete snapshot in memory (which is not yet persisted to disk) then we should be able to set the failover table branch point at the end of that complete in-memory snapshot.

      This allows us to move the failover table branch point to a higher sequence number than we currently do, but still have a valid, consistent branch point, which in turn should reduce the amount of rollback a DCP consumer may need to perform when re-connecting to this newly-promoted active VB.
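
      As a rough illustration of the difference, the sketch below compares the current (on-disk) and proposed (in-memory) branch points for the same vBucket. It assumes that "complete in memory" can be tested the same way as on disk, i.e. the in-memory snapshot end equals the in-memory high seqno; that test, and every name here (SeqnoRange, PromotionState, currentBranchPoint, proposedBranchPoint), is an assumption for illustration and not existing kv_engine code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot range [start, end] of sequence numbers.
struct SeqnoRange {
    uint64_t start;
    uint64_t end;
};

// Hypothetical summary of a replica vBucket at the moment of promotion.
struct PromotionState {
    SeqnoRange persistedSnapshot; // last snapshot range written to disk
    uint64_t persistenceSeqno;    // highest seqno persisted to disk
    SeqnoRange memorySnapshot;    // last snapshot range received in memory
    uint64_t memoryHighSeqno;     // highest seqno received in memory
};

// End of the range if the snapshot is complete, otherwise its start.
uint64_t consistentPoint(SeqnoRange snap, uint64_t highestSeen) {
    assert(snap.start <= snap.end);
    return snap.end == highestSeen ? snap.end : snap.start;
}

// Current behaviour: only the on-disk state is considered.
uint64_t currentBranchPoint(const PromotionState& s) {
    return consistentPoint(s.persistedSnapshot, s.persistenceSeqno);
}

// Proposed behaviour (assumption): consider the in-memory state instead.
uint64_t proposedBranchPoint(const PromotionState& s) {
    return consistentPoint(s.memorySnapshot, s.memoryHighSeqno);
}
```

      For example, if snapshot [100, 200] has been fully received in memory (high seqno 200) but only persisted up to seqno 150, the current logic branches at 100 while the sketched in-memory logic branches at 200. If the in-memory snapshot is itself incomplete (say high seqno 180), both fall back to 100.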

      (See MB-53172 for an example scenario where this was significant.)

            People

              owend Daniel Owen
              drigby Dave Rigby (Inactive)