Memcached crashed in CheckpointManager::expelUnreferencedCheckpointItems() during rollback

Description

Steps To Recreate:

  1. Create a 4 node cluster

  2. Create a magma bucket with (bucket_history_retention_seconds=600, bucket_history_retention_bytes=6000000000)

  3. Create 5000000 items (doc size = 256)

  4. Start new doc ops (update:expiry)

  5. Trigger compaction

  6. SIGKILL memcached once

  7. Observe that memcached crashes in CheckpointManager::expelUnreferencedCheckpointItems (this=0x7f6bcc52de40)

Note:
The actual test covers crash recovery: it repeatedly kills memcached while data loading is in progress. Between two SIGKILLs, the test waits for cluster warmup to finish and then waits a further 30 to 60 seconds before the next kill, so the total time between two SIGKILLs is warmup_time + 30 to 60 seconds. In this case, however, the crash was observed after the first kill, so memcached was killed only once.

Core Dump was found on node 172.23.121.115

BackTrace:

QE-TEST:

Job: http://qe-jenkins1.sc.couchbase.com/job/test_suite_executor-TAF/24359/consoleFull

 

Issue

In rare cases, after a failover or memcached restart, a replica rollback under memory pressure could cause a crash in the Data Service.

Resolution

Memory pressure recovery logic (item expelling) is now skipped while a replica rollback is in progress.
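
The idea behind the resolution can be illustrated with a minimal sketch. This is not the kv_engine change itself: the class `VBucketSketch` and all of its members are hypothetical names invented for illustration, and the sketch assumes a single mutex guards both the item list and a rollback flag. It only shows the general pattern of making a memory-recovery path a no-op while a rollback is running.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for per-vBucket checkpoint state.
// None of these names are taken from kv_engine.
class VBucketSketch {
public:
    void queueItem(std::string item) {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.push_back(std::move(item));
    }

    // Mark the start of a replica rollback.
    void beginRollback() {
        std::lock_guard<std::mutex> guard(mutex_);
        rollbackInProgress_ = true;
    }

    // Keep only the first `keep` items, then clear the rollback flag.
    void finishRollback(std::size_t keep) {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.resize(std::min(keep, items_.size()));
        rollbackInProgress_ = false;
    }

    // Memory-recovery path. For simplicity every queued item is treated
    // as unreferenced. Returns the number of items expelled, or 0 when a
    // rollback is in progress and expelling is skipped.
    std::size_t expelItems() {
        std::lock_guard<std::mutex> guard(mutex_);
        if (rollbackInProgress_) {
            return 0;
        }
        const std::size_t expelled = items_.size();
        items_.clear();
        return expelled;
    }

    std::size_t itemCount() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;
    std::deque<std::string> items_;
    bool rollbackInProgress_ = false;
};

// Usage: expelling is skipped during rollback and works again afterwards.
inline void demo() {
    VBucketSketch vb;
    vb.queueItem("a");
    vb.queueItem("b");
    vb.beginRollback();
    assert(vb.expelItems() == 0); // skipped while rollback is running
    vb.finishRollback(1);
    assert(vb.expelItems() == 1); // normal behaviour resumes
}
```

Checking the flag under the same lock that the rollback takes is a design choice of this sketch; it avoids a window in which expelling could start just after the rollback begins.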

Environment

7.2.0-5318

Link to Log File, atop/blg, CBCollectInfo, Core dump

https://cb-engineering.s3.amazonaws.com/new.tar.gz

Release Notes Description

None

Activity

CB robot September 26, 2023 at 5:01 AM

Build capella-analytics-1.0.0-1030 contains kv_engine commit 2fac253 with commit message:
[BP] : Make ItemExpel resilient to VBucket rollback

CB robot September 25, 2023 at 11:11 AM

Build couchbase-server-7.6.0-1542 contains kv_engine commit 2fac253 with commit message:
[BP] : Make ItemExpel resilient to VBucket rollback

CB robot September 25, 2023 at 9:47 AM

Build couchbase-server-8.0.0-1416 contains kv_engine commit 2fac253 with commit message:
[BP] : Make ItemExpel resilient to VBucket rollback

CB robot September 21, 2023 at 3:04 PM

Build couchbase-server-7.2.3-6603 contains kv_engine commit 2fac253 with commit message:
[BP] : Make ItemExpel resilient to VBucket rollback

CB robot September 1, 2023 at 1:29 PM

Build couchbase-server-8.0.0-1392 contains kv_engine commit 4e1e1b7 with commit message:
: Fix comment in CM::expelUnreferencedCheckpointItems()

Fixed

Details

Is this a Regression?

Unknown

Triage

Triaged

Created April 27, 2023 at 3:29 AM
Updated October 11, 2024 at 7:38 AM
Resolved August 4, 2023 at 2:28 PM