Limit the Checkpoint memory usage

Description

Currently, checkpoints can consume up to the entire bucket quota. This has a number of problematic side effects in DGM scenarios, e.g. the Resident Ratio dropping to 0 and high replica checkpoint memory usage contributing to replication live-locks.

It's important to highlight that capping the Checkpoint quota is expected to directly solve some existing issues (like the Resident Ratio one). The same doesn't apply to other issues (like the DCP live-lock), where high Checkpoint memory usage is probably just one of several contributing factors. Those other factors are being investigated and addressed in dedicated MBs.
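
To illustrate the idea of capping checkpoint memory at a fraction of the bucket quota, here is a minimal sketch. It is not the kv_engine implementation: the class `CheckpointQuota`, its methods, and the ratio parameter are all hypothetical names chosen for this example, and the 50% ratio used in the usage note is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch only: none of these names come from kv_engine.
// Tracks checkpoint memory against a cap that is a fraction of the bucket quota.
class CheckpointQuota {
public:
    // bucketQuota: total bucket memory quota in bytes.
    // maxRatio: fraction of the bucket quota checkpoints may use (assumed knob).
    CheckpointQuota(std::size_t bucketQuota, double maxRatio)
        : limit(static_cast<std::size_t>(bucketQuota * maxRatio)) {
        assert(maxRatio > 0.0 && maxRatio <= 1.0);
    }

    // True while checkpoint memory usage is strictly below the cap.
    bool hasCapacity() const {
        return used < limit;
    }

    // Accounts for `bytes` of new checkpoint data. Returns false and leaves
    // the usage unchanged if the allocation would exceed the cap.
    bool tryAllocate(std::size_t bytes) {
        if (bytes > limit - used) {
            return false;
        }
        used += bytes;
        return true;
    }

    // Releases memory, e.g. after checkpoints are removed; clamps at zero.
    void release(std::size_t bytes) {
        used -= (bytes < used ? bytes : used);
    }

    std::size_t getUsed() const { return used; }
    std::size_t getLimit() const { return limit; }

private:
    std::size_t limit;
    std::size_t used = 0;
};
```

With a 1000-byte bucket quota and a ratio of 0.5, the cap is 500 bytes: an allocation that would push usage past 500 is rejected, so checkpoints can no longer grow to fill the whole bucket quota and squeeze out resident data.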

Affects versions

None

Fix versions

Labels

Environment

None

Release Notes Description

None

Attachments

4
  • 16 Jul 2021, 03:07 PM
  • 16 Jul 2021, 01:58 PM
  • 16 Jul 2021, 01:56 PM
  • 16 Jul 2021, 01:56 PM

Activity

Paolo Cocchi January 13, 2022 at 11:16 AM

Verified by unit tests.

CB robot August 12, 2021 at 8:51 AM

Build couchbase-server-7.1.0-1133 contains kv_engine commit 0e5ae2c with commit message:
MB-46827: Remove cursor_dropping_checkpoint_mem_<lower/upper>_mark

CB robot August 11, 2021 at 11:34 AM

Build couchbase-server-7.1.0-1130 contains kv_engine commit cb334fa with commit message:
MB-46827: Introduce the new checkpoint memory recovery logic

CB robot July 23, 2021 at 9:02 PM

Build couchbase-server-7.1.0-1091 contains kv_engine commit cd05676 with commit message:
MB-46827: Remove VBucketMap::getVBucketsTotalCheckpointMemoryUsage

CB robot July 23, 2021 at 9:02 PM

Build couchbase-server-7.1.0-1091 contains kv_engine commit 084a46c with commit message:
MB-46827: Introduce KVBucket::hasCapacityInCheckpoints()

Done

Details

Created June 9, 2021 at 1:37 PM
Updated July 18, 2023 at 12:00 PM
Resolved August 11, 2021 at 11:29 AM