[checkpoint] Allocation in replica checkpoints can push the system to hard OOM

Description

Problem

There are multiple scenarios where replica checkpoints might allocate most of the memory on a node in a state where that memory is not releasable. That can result in hard OOM and possible deadlock in scenarios like rebalance or bulk load.

[…] is an example of livelock at rebalance. It shows that, even without ongoing mutations, a replica disk checkpoint can end up stuck in the open state, which means that we cannot recover all the memory associated with it.

While those scenarios are uncommon in on-premise environments, the system breaks quite quickly in environments with many small buckets if someone attempts simple loads with, for example, low memory quotas and larger-than-usual document sizes.

Original proposal

The (current) invariant / assumption is that there is always one open checkpoint. Hence we cannot close the last one (even though we have the last marker), as we don't know what the seqnos for the next checkpoint are going to be.

If we relaxed that for replicas (which I think makes sense given they are essentially slaved to the active), then we could close the checkpoint as soon as the last mutation arrives, and hence remove that checkpoint once it's unreferenced.

This only works for disk checkpoints, as we need to know where the checkpoint ends, not where the snapshot ends.

Final proposal

Force-closing the open checkpoint on the replica comes with its own issues; see the historical conversation in the comments for details.

In the end we solve this by allowing ItemExpel to remove all the mutations in a checkpoint.
Note that, unlike the original proposal, the ItemExpel fix is wider in scope and isn't restricted to Disk Checkpoints. It therefore also improves our memory-recovery ability on Memory Checkpoints and helps with any similar issue caused by those.
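
As a rough illustration of the difference (a toy model, not kv_engine code): the sketch below treats a checkpoint as an ordered list of items and compares an expel pass that always retains the last item with one that may release every item already processed. `ToyCheckpoint`, `processed`, `keep_last` and `demo` are hypothetical names invented for this example, and the sizes are arbitrary units.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCheckpoint:
    """Toy stand-in for a checkpoint: an ordered list of (seqno, size) items."""

    items: list = field(default_factory=list)
    # Number of items (from the front) that every reader has already processed.
    processed: int = 0

    def queue(self, seqno: int, size: int) -> None:
        self.items.append((seqno, size))

    def mem_usage(self) -> int:
        return sum(size for _, size in self.items)

    def expel(self, keep_last: bool) -> int:
        """Release already-processed items and return the amount freed.

        keep_last=True models the behaviour described in this ticket
        (the last item is never expelled); keep_last=False models the
        enhanced behaviour, where all items may be released.
        """
        limit = self.processed
        if keep_last:
            limit = min(limit, max(len(self.items) - 1, 0))
        freed = sum(size for _, size in self.items[:limit])
        del self.items[:limit]
        self.processed -= limit
        return freed


def demo() -> tuple:
    ckpt = ToyCheckpoint()
    for seqno, size in [(1, 10), (2, 10), (3, 200)]:
        ckpt.queue(seqno, size)
    ckpt.processed = 3  # nothing in the checkpoint is still needed

    freed_old = ckpt.expel(keep_last=True)   # the large last item stays resident
    stuck = ckpt.mem_usage()
    freed_new = ckpt.expel(keep_last=False)  # now the last item can go too
    return freed_old, stuck, freed_new, ckpt.mem_usage()
```

With one large final item, the first pass in `demo()` leaves most of the memory resident, which is the kind of situation the ItemExpel change is meant to address.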

Issue: The last item in a replica checkpoint was not expelled. In scenarios such as a large average item size, a high number of replicas, or a low Bucket quota, this could result in a data node entering an unrecoverable Out-of-Memory state.
Resolution: ItemExpel has been enhanced to release all the items in a checkpoint when memory conditions allow.

Components

Affects versions

6.5.1

6.6.6

7.1.3

Fix versions

7.2.1

Labels

Environment

None

Link to Log File, atop/blg, CBCollectInfo, Core dump

None

Release Notes Description

None

Attachments

Linked issues

Activity

CB robot September 20, 2023 at 5:45 AM

Build capella-analytics-1.0.0-1025 contains kv_engine commit 632b63d with commit message:
: Don't reuse touched-by-expel checkpoint in CM::createSnapshot

CB robot September 20, 2023 at 5:45 AM

Build capella-analytics-1.0.0-1025 contains kv_engine commit 39abd19 with commit message:
: Ensure no logic change in CM::getSnapshotInfo()

CB robot September 20, 2023 at 5:45 AM

Build capella-analytics-1.0.0-1025 contains kv_engine commit f4d1bab with commit message:
: Ensure no logic change in CM::getVisibleSnapshotEndSeqno()

CB robot September 19, 2023 at 10:43 AM

Build couchbase-server-8.0.0-1410 contains kv_engine commit 632b63d with commit message:
: Don't reuse touched-by-expel checkpoint in CM::createSnapshot

CB robot September 19, 2023 at 10:43 AM

Build couchbase-server-8.0.0-1410 contains kv_engine commit 39abd19 with commit message:
: Ensure no logic change in CM::getSnapshotInfo()

Fixed

Details
Assignee
Ashwin Govindarajulu
Reporter
Daniel Owen
Is this a Regression?
Yes
Triage
Triaged
Due date
Aug 11, 2023
Story Points
1
Sprint
None
Priority
Critical
Created May 13, 2020 at 3:41 PM

Updated September 20, 2023 at 5:45 AM

Resolved August 11, 2023 at 1:29 PM
