DeleteWithMeta endlessly attempts to bgfetch tempNonExistent items

Description

delWithMeta (used internally by XDCR, backup and replication) may get stuck in a loop, repeatedly attempting to bgfetch a value for a non-existent item in order to preserve xattrs.

See the two noted MBs for context on why this is here.

Scenario as follows:

Item with system xattrs is deleted, pruned value (retaining the system attrs) is persisted, value removed from the HashTable
Incoming request queues a meta bgfetch, adds a tempInitial item in the HashTable
meta bgfetch finds the delete, calls restoreMeta which will set the datatype and some other meta fields, changing the item from tempInitial->tempDeleted. This is still not considered resident.
Time passes, and the delete is purged from disk
A non-meta (i.e., wants the value too) bgfetch is requested
bgfetch finds nothing on disk, changes the item from tempDeleted->tempNonExistent

Now we have a temp item, with datatype set (could be xattrs), but nothing on disk for that key.
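The scenario above can be replayed as a small state model. This is a minimal sketch in Python, not kv_engine code: the names TempItem, DiskDoc, complete_meta_bgfetch and complete_full_bgfetch are hypothetical, and has_xattrs stands in for the datatype field that restoreMeta sets.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class TempState(Enum):
    TEMP_INITIAL = auto()
    TEMP_DELETED = auto()
    TEMP_NON_EXISTENT = auto()


@dataclass
class DiskDoc:
    """Hypothetical stand-in for what is persisted for a key."""
    deleted: bool
    has_xattrs: bool


@dataclass
class TempItem:
    """Hypothetical stand-in for the temp item held in the HashTable."""
    state: TempState = TempState.TEMP_INITIAL
    has_xattrs: bool = False  # models the datatype set by restoreMeta
    resident: bool = False


def complete_meta_bgfetch(item: TempItem, on_disk: Optional[DiskDoc]) -> None:
    """Meta-only fetch: copies metadata, never makes the item resident."""
    if on_disk is None:
        item.state = TempState.TEMP_NON_EXISTENT
    elif on_disk.deleted:
        item.has_xattrs = on_disk.has_xattrs  # the restoreMeta step
        item.state = TempState.TEMP_DELETED


def complete_full_bgfetch(item: TempItem, on_disk: Optional[DiskDoc]) -> None:
    """Value fetch: on a miss only the state changes, the datatype stays."""
    if on_disk is None:
        item.state = TempState.TEMP_NON_EXISTENT
    else:
        item.resident = True


def replay_scenario() -> TempItem:
    disk: Optional[DiskDoc] = DiskDoc(deleted=True, has_xattrs=True)  # pruned delete persisted
    item = TempItem()                   # incoming request adds a tempInitial item
    complete_meta_bgfetch(item, disk)   # tempInitial -> tempDeleted
    disk = None                         # the delete is purged from disk
    complete_full_bgfetch(item, disk)   # tempDeleted -> tempNonExistent
    return item
```

Calling replay_scenario() in this model leaves an item that is tempNonExistent, not resident, and still flagged as carrying xattrs, which is the combination described above.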

The above snippet would request a bgfetch for that item, but no further progress would be made: there is nothing on disk to fetch, and the item already indicates this because it is tempNonExistent. The fetcher would read from disk and notify the op that it has completed; the frontend would then attempt the delWithMeta again, queuing another bgfetch, and so on.
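The retry cycle can be illustrated with a toy model. This is a sketch under stated assumptions, not the actual kv_engine logic or the actual fix: needs_bgfetch_buggy, needs_bgfetch_guarded and run_del_with_meta are hypothetical names, and the guard only illustrates the idea of not fetching for an item already known to be absent from disk.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TempItem:
    """Hypothetical stand-in for the temp item held in the HashTable."""
    state: str
    has_xattrs: bool
    resident: bool = False


def needs_bgfetch_buggy(item: TempItem) -> bool:
    # Assumed pre-fix behaviour: fetch whenever xattrs may need
    # preserving and the value is not in memory.
    return item.has_xattrs and not item.resident


def needs_bgfetch_guarded(item: TempItem) -> bool:
    # Illustrative guard: a tempNonExistent item has nothing on disk,
    # so fetching again cannot make progress.
    return needs_bgfetch_buggy(item) and item.state != "tempNonExistent"


def run_del_with_meta(item: TempItem,
                      needs_bgfetch: Callable[[TempItem], bool],
                      max_attempts: int = 1000) -> int:
    """Return how many bgfetches were queued before the op could proceed.

    A result equal to max_attempts means the op never completed; the cap
    exists only so that this sketch terminates.
    """
    fetches = 0
    for _ in range(max_attempts):
        if not needs_bgfetch(item):
            return fetches
        # The bgfetch finds nothing on disk; the item stays tempNonExistent
        # and the frontend is notified to retry the operation.
        item.state = "tempNonExistent"
        fetches += 1
    return fetches


stuck = run_del_with_meta(TempItem("tempNonExistent", True), needs_bgfetch_buggy)
fixed = run_del_with_meta(TempItem("tempNonExistent", True), needs_bgfetch_guarded)
```

In this model the unguarded check hits the attempt cap every time, while the guarded check queues no bgfetch and lets the operation continue.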

This most obviously manifests as one (or more) front-end memcached threads spinning at 100% CPU.

Issue	Resolution
XDCR or a restore from backup could enter an endless loop when using a deleteWithMeta operation to overwrite a document that had been deleted or had expired some time ago. This was caused by a specific unanticipated in-memory state; CPU usage increased and the connection became unusable for further operations.	deleteWithMeta is now resilient to temporary non-existent values with xattr datatype.

Components

Affects versions

Fix versions

7.2.1

7.1.5

Labels

Environment

None

Link to Log File, atop/blg, CBCollectInfo, Core dump

None

Release Notes Description

None

Linked issues

is caused by

MB-36087

DeleteWithMeta against a value evicted xattr document crashes memcached

relates

MB-57020

memcached should detect and abort stuck commands

Activity

CB robot August 23, 2023 at 3:01 PM

Build couchbase-server-8.0.0-1384 contains kv_engine commit 6e4a296 with commit message:
: Ensure deleteWithMeta does not bgfetch for non-existent items

CB robot August 23, 2023 at 3:01 PM

Build couchbase-server-8.0.0-1384 contains kv_engine commit 23419cb with commit message:
,,: Merge branch 7.1.4 into 7.1.x

CB robot August 23, 2023 at 3:01 PM

Build couchbase-server-8.0.0-1384 contains kv_engine commit 7e098a5 with commit message:
,,: Merge branch '7.1.4' into neo

CB robot August 23, 2023 at 1:59 PM

Build couchbase-server-7.6.0-1402 contains kv_engine commit 6e4a296 with commit message:
: Ensure deleteWithMeta does not bgfetch for non-existent items

CB robot August 23, 2023 at 1:59 PM

Build couchbase-server-7.6.0-1402 contains kv_engine commit 23419cb with commit message:
,,: Merge branch 7.1.4 into 7.1.x

Fixed

Details
Assignee
Ashwin Govindarajulu
Reporter
James Harrison(Deactivated)
Is this a Regression?
No
Triage
Untriaged
Due date
May 22, 2023
Story Points
0
Priority
Major

Created May 17, 2023 at 1:21 PM

Updated September 2, 2024 at 11:13 AM

Resolved May 25, 2023 at 3:31 PM
