DeleteWithMeta endlessly attempts to bgfetch tempNonExistent items

Description

delWithMeta (used internally by XDCR, backup, and replication) may get stuck in a loop, repeatedly attempting to bgfetch a value for a non-existent item in order to preserve xattrs.

See the two noted MBs for context on why this is here.

Scenario as follows:

  • Item with system xattrs is deleted, pruned value (retaining the system attrs) is persisted, value removed from the HashTable

  • Incoming request queues a meta bgfetch, adds a tempInitial item in the HashTable

  • meta bgfetch finds the delete, calls restoreMeta which will set the datatype and some other meta fields, changing the item from tempInitial->tempDeleted. This is still not considered resident.

  • Time passes, and the delete is purged from disk

  • A non-meta (i.e., wants the value too) bgfetch is requested

  • bgfetch finds nothing on disk, changes the item from tempDeleted->tempNonExistent

Now we have a temp item, with datatype set (could be xattrs), but nothing on disk for that key.
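The state transitions in the scenario above can be sketched as a small model. This is an illustrative simplification, not kv_engine code: the names `TempItem`, `DiskResult`, `completeMetaBgFetch`, `completeFullBgFetch`, and `replayScenario` are hypothetical, and the datatype bit value is assumed for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of a temp item in the HashTable.
// None of these names mirror kv_engine's real types.
enum class TempState { Initial, Deleted, NonExistent };

constexpr std::uint8_t kDatatypeRaw = 0x00;
constexpr std::uint8_t kDatatypeXattr = 0x04; // assumed bit value, for illustration

struct TempItem {
    TempState state = TempState::Initial;
    std::uint8_t datatype = kDatatypeRaw;
    bool resident = false; // temp items never hold a value
};

// What a background fetch found on disk for the key.
struct DiskResult {
    bool found;
    std::uint8_t datatype;
};

// Meta-only bgfetch: if a delete is on disk, restore its metadata
// (including the datatype) and move tempInitial -> tempDeleted.
void completeMetaBgFetch(TempItem& item, const DiskResult& disk) {
    assert(item.state == TempState::Initial);
    if (disk.found) {
        item.datatype = disk.datatype;
        item.state = TempState::Deleted;
    } else {
        item.state = TempState::NonExistent;
    }
}

// Full (value) bgfetch: if nothing is on disk, the item becomes
// tempNonExistent, but the previously restored datatype is left untouched.
void completeFullBgFetch(TempItem& item, const DiskResult& disk) {
    if (!disk.found) {
        item.state = TempState::NonExistent;
        return;
    }
    item.datatype = disk.datatype;
    item.resident = true; // simplification of "value is now in memory"
}

// Replays the scenario: meta bgfetch finds the delete, the delete is
// purged from disk, then a full bgfetch finds nothing.
TempItem replayScenario() {
    TempItem item;
    completeMetaBgFetch(item, DiskResult{true, kDatatypeXattr});
    completeFullBgFetch(item, DiskResult{false, kDatatypeRaw});
    return item;
}
```

In this model, `replayScenario()` ends with an item that is tempNonExistent, not resident, and still carries the xattr datatype bit, which is the combination described above.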

The above snippet would request a bgfetch for that item, but no further progress can be made: there is nothing on disk to fetch, and the item already indicates this because it is tempNonExistent. The fetcher reads from disk and notifies the op that it has completed; the frontend then attempts the delWithMeta again, queuing another bgfetch, and so on.

This most obviously manifests as one (or more) front-end memcached threads spinning at 100% CPU.
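A minimal sketch of why the retry never terminates, and of one way a guard could break the loop. This is a toy model under stated assumptions, not the actual kv_engine change: `decideUnguarded`, `decideGuarded`, and `countBgFetches` are hypothetical names, the datatype bit value is assumed, and the real check in delWithMeta may be structured differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified types; they do not mirror kv_engine's real ones.
enum class TempState { Initial, Deleted, NonExistent };
constexpr std::uint8_t kDatatypeXattr = 0x04; // assumed bit value, for illustration

struct TempItem {
    TempState state;
    std::uint8_t datatype;
    bool resident;
};

enum class Decision { NeedBgFetch, Proceed };

// Unguarded check: any non-resident item whose datatype says "xattrs"
// triggers a value bgfetch so the xattrs can be preserved.
Decision decideUnguarded(const TempItem& item) {
    const bool mayHaveXattrs = (item.datatype & kDatatypeXattr) != 0;
    return (mayHaveXattrs && !item.resident) ? Decision::NeedBgFetch
                                             : Decision::Proceed;
}

// Guarded check (assumed shape of a fix): a tempNonExistent item already
// records that nothing is on disk, so fetching again cannot help.
Decision decideGuarded(const TempItem& item) {
    if (item.state == TempState::NonExistent) {
        return Decision::Proceed;
    }
    return decideUnguarded(item);
}

// Drives the front-end retry loop against an empty disk. Returns how many
// bgfetches were queued before the op could proceed, capped at maxAttempts
// so the demonstration itself terminates.
template <typename DecideFn>
int countBgFetches(TempItem item, DecideFn decide, int maxAttempts) {
    assert(maxAttempts >= 0);
    int fetches = 0;
    while (fetches < maxAttempts && decide(item) == Decision::NeedBgFetch) {
        ++fetches;
        // Nothing on disk: the fetch completes, the item stays
        // tempNonExistent and non-resident, and the op is retried.
        item.state = TempState::NonExistent;
    }
    return fetches;
}
```

Starting from `TempItem{TempState::NonExistent, kDatatypeXattr, false}`, the unguarded check requests a fetch on every attempt until the cap is reached, while the guarded check proceeds without queuing any fetch.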

 

Issue

XDCR or restore from backup entered an endless loop when using a deleteWithMeta operation to overwrite a document that had been deleted or expired some time ago. A specific, unanticipated in-memory state caused this; CPU usage increased and the connection became unusable for further operations.

Resolution

deleteWithMeta is now resilient to temporary non-existent values with xattr datatype.

Environment

None

Link to Log File, atop/blg, CBCollectInfo, Core dump

None

Release Notes Description

None

Activity

CB robot August 23, 2023 at 3:01 PM

Build couchbase-server-8.0.0-1384 contains kv_engine commit 6e4a296 with commit message:
: Ensure deleteWithMeta does not bgfetch for non-existent items

CB robot August 23, 2023 at 3:01 PM

Build couchbase-server-8.0.0-1384 contains kv_engine commit 23419cb with commit message:
,,: Merge branch 7.1.4 into 7.1.x

CB robot August 23, 2023 at 3:01 PM

Build couchbase-server-8.0.0-1384 contains kv_engine commit 7e098a5 with commit message:
,,: Merge branch '7.1.4' into neo

CB robot August 23, 2023 at 1:59 PM

Build couchbase-server-7.6.0-1402 contains kv_engine commit 6e4a296 with commit message:
: Ensure deleteWithMeta does not bgfetch for non-existent items

CB robot August 23, 2023 at 1:59 PM

Build couchbase-server-7.6.0-1402 contains kv_engine commit 23419cb with commit message:
,,: Merge branch 7.1.4 into 7.1.x

Fixed

Details

Is this a Regression?

No

Triage

Untriaged

Created May 17, 2023 at 1:21 PM
Updated September 2, 2024 at 11:13 AM
Resolved May 25, 2023 at 3:31 PM