Details

Type: Bug
Resolution: Fixed
Priority: Blocker
Fix Version/s: 6.6.1
Affects Version/s: 6.6.0
Component/s: couchbase-bucket
Labels:
- approved-for-6.6.1
- request-dev-verify

Triage: Untriaged
Story Points: 1
Is this a Regression?: Unknown

Description

Since commit 2bd86cd710b5f8f07feb908b29b3f4a48b383b8c for MB-35889, we no longer add keys for disk checkpoints to the keyIndexes (committedKeyIndex and metaKeyIndex). This decreases memory usage when receiving disk snapshots, as keys within a disk snapshot should already have been deduped. It also assumes that a disk checkpoint will always have MARKER_FLAG_CHK set in the DCP snapshot marker's flags. However, this was only introduced in 6.5.0, which means that when a 6.6.0 node has a DCP Consumer we might receive a snapshot marker with MARKER_FLAG_DISK set but not MARKER_FLAG_CHK. This allows the consumer to update an open memory checkpoint to a disk checkpoint if the snapshot marker's start_seqno > 0, effectively merging the disk and memory checkpoint ranges (as can be seen below).

void PassiveStream::processMarker(SnapshotMarker* marker) {
...
        // We could be connected to a non sync-repl, so if the max-visible is
        // not transmitted (optional is false), set visible to snap-end
        auto visibleSeq =
                marker->getMaxVisibleSeqno().value_or(marker->getEndSeqno());
        if (marker->getFlags() & MARKER_FLAG_DISK && vb->getHighSeqno() == 0) {
            vb->setReceivingInitialDiskSnapshot(true);
            ckptMgr.createSnapshot(cur_snapshot_start.load(),
                                   cur_snapshot_end.load(),
                                   hcs,
                                   checkpointType,
                                   visibleSeq);
        } else {
            if (marker->getFlags() & MARKER_FLAG_CHK ||
                vb->checkpointManager->getOpenCheckpointId() == 0) {
                ckptMgr.createSnapshot(cur_snapshot_start.load(),
                                       cur_snapshot_end.load(),
                                       hcs,
                                       checkpointType,
                                       visibleSeq);
            } else {
                // If we are reconnecting then we need to update the snap end
                // and potentially the checkpoint type as We do not send the
                // CHK snapshot marker flag for disk snapshots.
                ckptMgr.updateCurrentSnapshot(
                        cur_snapshot_end.load(), visibleSeq, checkpointType);
....
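
The branch selection in the excerpt above can be isolated into a small, self-contained sketch. This is an illustration only, not kv_engine code: `MarkerAction`, `decideMarkerAction`, and the `kMarkerFlag*` constants are hypothetical names, and the flag bit values are assumptions made for the sketch. It shows that a marker carrying the DISK flag without the CHK flag, arriving at a vBucket whose high seqno is already above 0 and which has an open checkpoint, falls through to the "update current snapshot" path.

```cpp
#include <cstdint>

// Flag bit values are assumptions for this sketch; consult the DCP
// snapshot-marker documentation for the authoritative definitions.
constexpr std::uint32_t kMarkerFlagDisk = 0x02;
constexpr std::uint32_t kMarkerFlagChk = 0x04;

// Hypothetical summary of the two outcomes in processMarker().
enum class MarkerAction { CreateSnapshot, UpdateCurrentSnapshot };

// Mirrors only the branch structure of the excerpt: which path a marker takes
// given its flags, the vBucket high seqno and the open checkpoint id.
MarkerAction decideMarkerAction(std::uint32_t flags,
                                std::uint64_t vbHighSeqno,
                                std::uint64_t openCheckpointId) {
    const bool disk = (flags & kMarkerFlagDisk) != 0;
    const bool chk = (flags & kMarkerFlagChk) != 0;

    if (disk && vbHighSeqno == 0) {
        // Initial disk snapshot: always a fresh checkpoint.
        return MarkerAction::CreateSnapshot;
    }
    if (chk || openCheckpointId == 0) {
        // Producer asked for a new checkpoint, or none is open yet.
        return MarkerAction::CreateSnapshot;
    }
    // DISK without CHK lands here: the open checkpoint is updated in place,
    // which may change its type from memory to disk.
    return MarkerAction::UpdateCurrentSnapshot;
}
```

For example, `decideMarkerAction(kMarkerFlagDisk, 10, 5)` yields `MarkerAction::UpdateCurrentSnapshot`, whereas adding `kMarkerFlagChk` to the flags yields `MarkerAction::CreateSnapshot`.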

Then, when processing the queued_items for this disk checkpoint, we call Checkpoint::queueDirty() and might find the key in the committedKeyIndex:

QueueDirtyStatus Checkpoint::queueDirty(const queued_item& qi,
                                        CheckpointManager* checkpointManager) {
..
        // Check in the appropriate key index if an item already exists.
        auto& keyIndex =
                qi->isCommitted() ? committedKeyIndex : preparedKeyIndex;
        auto it = keyIndex.find(makeIndexKey(qi));
        // Before de-duplication could discard a delete, store the largest
        // "rev-seqno" encountered
        if (qi->isDeleted() &&
            qi->getRevSeqno() > maxDeletedRevSeqno.value_or(0)) {
            maxDeletedRevSeqno = qi->getRevSeqno();
        }
        // Check if this checkpoint already has an item for the same key
        // and the item has not been expelled.
        if (it != keyIndex.end()) {
            if (it->second.mutation_id > highestExpelledSeqno) {
                // Normal path - we haven't expelled the item. We have a valid
                // cursor position to read the item and make our de-dupe checks.
                const auto currPos = it->second.position;
                if (!(canDedup(*currPos, qi))) {
                    return QueueDirtyStatus::FailureDuplicateItem;
                }
....
                addItemToCheckpoint(qi);
                // Reduce the size of the checkpoint by the size of the
                // item being removed.
                queuedItemsMemUsage -= ((*currPos)->size());
                // Remove the existing item for the same key from the list.
                toWrite.erase(currPos);
            } else {

The item that the committedKeyIndex entry points to (currPos) is then erased from the toWrite CheckpointQueue. However, the entry itself remains in the committedKeyIndex: because isDiskCheckpoint() now returns true, we do not update the entry to point to the new queued_item, so it is left pointing to the freed queued_item that was in the toWrite queue:

    if (qi->getKey().size() > 0 && !isDiskCheckpoint()) {
        ChkptQueueIterator last = end();
        // --last is okay as the list is not empty now.
        index_entry entry = {--last, qi->getBySeqno()};
        // Set the index of the key to the new item that is pushed back into
        // the list.
        if (qi->isCheckPointMetaItem()) {
            // Insert the new entry into the metaKeyIndex
            auto result = metaKeyIndex.emplace(makeIndexKey(qi), entry);
            if (!result.second) {
                // Did not manage to insert - so update the value directly
                result.first->second = entry;
            }
        } else {
            // Insert the new entry into the keyIndex
            auto& keyIndex =
                    qi->isCommitted() ? committedKeyIndex : preparedKeyIndex;
            auto result = keyIndex.emplace(makeIndexKey(qi), entry);
            if (!result.second) {
                // Did not manage to insert - so update the value directly
                result.first->second = entry;
            }
        }
    }
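
The failure mode can be reproduced in miniature with a list and a map of iterators into it. The following is a minimal sketch under stated assumptions, not kv_engine code: `ToyItem`, `ToyIndexEntry`, `ToyCheckpoint`, and `indexIsStale` are hypothetical names, and the de-dupe logic is reduced to "replace the older item for the same key". Staleness is detected by comparing seqnos rather than by dereferencing the stale iterator, which would be undefined behavior.

```cpp
#include <cstdint>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>

// Toy stand-ins for queued_item and index_entry (hypothetical types).
struct ToyItem {
    std::string key;
    std::uint64_t seqno;
};

struct ToyIndexEntry {
    std::list<ToyItem>::iterator position;
    std::uint64_t seqno;
};

struct ToyCheckpoint {
    std::list<ToyItem> toWrite;
    std::unordered_map<std::string, ToyIndexEntry> keyIndex;
    bool disk = false; // set when a memory checkpoint is updated to disk

    void queueDirty(const ToyItem& item) {
        auto found = keyIndex.find(item.key);
        toWrite.push_back(item);
        if (found != keyIndex.end()) {
            // De-dupe: erase the older item that the index entry points at.
            toWrite.erase(found->second.position);
        }
        // Mirrors the "!isDiskCheckpoint()" guard: the index is only
        // refreshed for memory checkpoints.
        if (!disk) {
            keyIndex.insert_or_assign(
                    item.key,
                    ToyIndexEntry{std::prev(toWrite.end()), item.seqno});
        }
    }

    // True if the index holds an entry for key that no longer matches the
    // item currently queued for that key. Never dereferences the iterator.
    bool indexIsStale(const std::string& key) const {
        auto found = keyIndex.find(key);
        if (found == keyIndex.end()) {
            return false;
        }
        for (const auto& item : toWrite) {
            if (item.key == key) {
                return item.seqno != found->second.seqno;
            }
        }
        return true;
    }
};
```

Queueing `{"k", 1}`, setting `disk = true`, and then queueing `{"k", 2}` leaves `indexIsStale("k")` returning true: the entry still carries seqno 1 and an iterator to an erased list node. Queueing the same key a third time in this toy model would erase through that stale iterator, which is the kind of access the description attributes the crash to.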

Fix:
To fix this, we need to ensure that we continue adding values to the keyIndexes if the checkpoint was created as CheckpointType::Memory and is then updated to CheckpointType::Disk by CheckpointManager::updateCurrentSnapshot().
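
One way to picture the proposed fix is to decide whether to index keys from the type the checkpoint was created with, rather than from its current type. The sketch below is an assumption about the shape of such a change, not the merged patch: `SketchCheckpoint`, `SketchItem`, `SketchCheckpointType`, `updateType`, `shouldIndexKeys`, and `indexedSeqno` are all hypothetical names, with `updateType` standing in for the type change made by CheckpointManager::updateCurrentSnapshot().

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>

enum class SketchCheckpointType { Memory, Disk };

struct SketchItem {
    std::string key;
    std::uint64_t seqno;
};

class SketchCheckpoint {
public:
    explicit SketchCheckpoint(SketchCheckpointType t)
        : type(t), createdAs(t) {
    }

    // Stand-in for the type change applied when the snapshot is updated.
    void updateType(SketchCheckpointType t) {
        type = t;
    }

    SketchCheckpointType getType() const {
        return type;
    }

    // Keep indexing for the checkpoint's whole lifetime if it started as a
    // memory checkpoint, even after it has been updated to disk.
    bool shouldIndexKeys() const {
        return createdAs == SketchCheckpointType::Memory;
    }

    void queueDirty(const SketchItem& item) {
        auto found = keyIndex.find(item.key);
        toWrite.push_back(item);
        if (found != keyIndex.end()) {
            toWrite.erase(found->second); // de-dupe the older item
        }
        if (shouldIndexKeys()) {
            keyIndex.insert_or_assign(item.key, std::prev(toWrite.end()));
        }
    }

    // Seqno of the item the index points at, or 0 if the key is not indexed.
    std::uint64_t indexedSeqno(const std::string& key) const {
        auto found = keyIndex.find(key);
        return found == keyIndex.end() ? 0 : found->second->seqno;
    }

    std::size_t queuedItems() const {
        return toWrite.size();
    }

private:
    SketchCheckpointType type;
    SketchCheckpointType createdAs;
    std::list<SketchItem> toWrite;
    std::unordered_map<std::string, std::list<SketchItem>::iterator> keyIndex;
};
```

With this policy, a checkpoint created as memory and later updated to disk keeps its index entries pointing at live items across repeated de-dupes, while a checkpoint created as disk still skips indexing and so retains the memory saving described above.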

Attachments

Issue Links

relates to
- MB-35889 Replication can get stuck when checkpoint memory overhead is very high (Closed)
- MB-42780 [Upgrade] Rebalance_in failed with reason "bulk_set_vbucket_state_failed :: sync_shutdown_many_i_am_trapping_exits" (Closed)

Activity

People

Assignee: Ashwin Govindarajulu

Reporter: Richard deMellow

Votes: 0

Watchers: 3

Dates

Created: 03/Sep/20 8:20 AM

Updated: 15/Jan/21 2:28 AM

Resolved: 15/Sep/20 9:07 AM

Gerrit Reviews

There are no open Gerrit changes

There are 4 closed Gerrit changes

MB-41283: Fix crash due to keyIndexes pointing to freed queued_items: Gerrit Review:

Merge branch 'mad-hatter' into master: Gerrit Review:

MB-42780: Logically revert MB-41283: Gerrit Review:

MB-42780: Logically revert MB-41283: Gerrit Review:

Crash in checkpoint code due to keyIndexes pointing to freed queued_items
