Details

Type: Improvement
Resolution: Unresolved
Priority: Major
Fix Version/s: Morpheus
Affects Version/s: 6.6.2, 6.6.5, 7.0.4, 7.1.1
Component/s: couchbase-bucket
Labels: None
Story Points: 1

Description

Summary

We should investigate using the last complete in-memory snapshot for calculating the failover table branch point, instead of the last on-disk persisted snapshot.

Context

During failover, when a replica vBucket is promoted to active, KV-Engine creates a failover table entry to identify the new history. The branch point is created at the most recent consistent point which has been persisted to disk - see KVBucket::setVBucketState_UNLOCKED:

KVBucket::setVBucketState_UNLOCKED
    if (to == vbucket_state_active && oldstate != vbucket_state_active &&
        transfer == TransferVB::No) {
        // Changed state to active and this isn't a transfer (i.e.
        // takeover), which means this is a new fork in the vBucket history
        // - create a new failover table entry.
        const snapshot_range_t range = vb->getPersistedSnapshot();
        auto highSeqno = range.getEnd() == vb->getPersistenceSeqno()
                                 ? range.getEnd()
                                 : range.getStart();
        vb->createFailoverEntry(highSeqno);

Note the calculation of highSeqno: if a complete snapshot has been persisted, we place the failover branch point at the end of that snapshot; if not, we place the branch point at the start of that snapshot - i.e. the previous consistent point.
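The rule can be isolated as a small standalone function. This is only an illustrative sketch: SnapshotRange and persistedBranchPoint are hypothetical names invented here, not kv_engine types or functions, and the body just mirrors the ternary in the excerpt.

```cpp
#include <cstdint>

// Hypothetical stand-in for snapshot_range_t (assumption, not the real type).
struct SnapshotRange {
    uint64_t start;
    uint64_t end;
};

// Mirrors the excerpt: if everything up to the snapshot end is on disk, the
// snapshot is complete and its end is a consistent point; otherwise fall
// back to the snapshot start (the previous consistent point).
uint64_t persistedBranchPoint(const SnapshotRange& persisted,
                              uint64_t persistenceSeqno) {
    return persisted.end == persistenceSeqno ? persisted.end
                                             : persisted.start;
}
```

For a persisted snapshot of [10, 20], a persistence seqno of 20 gives a branch point of 20, while a persistence seqno of 15 gives 10.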

Historically this made sense as a vBucket state change was persisted to disk asynchronously with respect to the sequence of mutations - i.e. we would persist the setVBState immediately, "in the middle" of the outstanding mutations of the current snapshot, and hence we could only consider the on-disk state when determining the failover branch point.

Proposal

Since MB-35331 (https://review.couchbase.org/c/kv_engine/+/113904), included in v6.5.0, the vBucket state change is recorded in a meta-item and enqueued (in-order) in the CheckpointManager. As such, if we happen to have a complete snapshot in memory (which is not yet persisted to disk), then we should be able to set the failover table branch point at the end of that complete in-memory snapshot.

This allows us to move the failover table branch point to a higher sequence number than we currently do, but still have a valid, consistent branch point, which in turn should reduce the amount of rollback a DCP consumer may need to perform when re-connecting to this newly-promoted active VB.
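One way the proposed selection might look is sketched below. Everything here is an assumption made for illustration: VBucketView, its fields, and both functions are hypothetical and are not part of kv_engine. Falling back to the existing persisted-snapshot rule when the in-memory snapshot is incomplete is also an assumption; the ticket does not specify that case.

```cpp
#include <cstdint>

// Hypothetical stand-in for snapshot_range_t (assumption).
struct SnapshotRange {
    uint64_t start;
    uint64_t end;
};

// Hypothetical view of the vBucket state needed for the decision.
struct VBucketView {
    SnapshotRange persistedSnapshot; // last snapshot range written to disk
    uint64_t persistenceSeqno;       // highest seqno on disk
    SnapshotRange inMemorySnapshot;  // latest snapshot range held in memory
    uint64_t inMemoryHighSeqno;      // highest seqno held in memory
};

// Existing rule: branch at the last consistent point on disk.
uint64_t currentBranchPoint(const VBucketView& vb) {
    return vb.persistedSnapshot.end == vb.persistenceSeqno
                   ? vb.persistedSnapshot.end
                   : vb.persistedSnapshot.start;
}

// Sketch of the proposal: prefer the end of a complete in-memory snapshot.
uint64_t proposedBranchPoint(const VBucketView& vb) {
    if (vb.inMemoryHighSeqno == vb.inMemorySnapshot.end) {
        return vb.inMemorySnapshot.end;
    }
    // Assumed fallback: keep today's behaviour.
    return currentBranchPoint(vb);
}
```

With snapshot [1, 10] fully persisted and snapshot [11, 20] fully received in memory but not yet on disk, currentBranchPoint returns 10 whereas proposedBranchPoint returns 20, so a reconnecting DCP consumer that had reached seqno 20 would not need to roll back under this sketch.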

(See MB-53172 for an example scenario where this was significant.)

Issue Links

is duplicated by

MB-23451 Rollback to a point which captures full snapshot in checkpoint mgr (but not persisted)

Resolved

relates to

MB-53172 [6.6.5 build 10104] - Secondary Index rollback to zero after KV node auto failover

Resolved

People

Assignee: Daniel Owen

Reporter: Dave Rigby (Inactive)

Votes: 0

Watchers: 3

Dates

Due: 31/May/23

Created: 29/Jul/22 8:08 AM

Updated: 07/Jul/23 1:36 AM

Set failover table branch point based on in-memory state