[BP to 7.2.x] - Indexer rebalance appears to have been hung for 7+ hours.

Description

QE test

There are 10 databases in the cluster. The GSI nodes were scaled from 2 to 4 because the 2 old nodes were over the HWM. During the scaling operation, GSI internally performs an indexer rebalance, which appears to have been hung for many hours.

Issue

During scaling, a GSI indexer rebalance froze and did not complete. This was because an index snapshot was not correctly deleted and recreated.

Resolution

A flag now governs snapshot handling to ensure snapshots are correctly deleted or recreated when indexes are updated during rebalancing.
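
The flag-based handling described above, together with the commit message recorded on this ticket ("Do not destroy a snapshot that is already deleted"), suggests an idempotent destroy. The following is a minimal sketch of that idea only, not the actual indexer change: the names `Snapshot`, `deleted`, `destroy_calls`, and `replace_snapshot` are all hypothetical and do not come from the Couchbase code base.

```python
# Hypothetical sketch: none of these names come from the Couchbase indexer code.
class Snapshot:
    """Illustrative stand-in for an index snapshot that owns a resource."""

    def __init__(self, name):
        self.name = name
        self.deleted = False      # flag: has this snapshot already been destroyed?
        self.destroy_calls = 0    # counts real destroy work, for illustration only

    def destroy(self):
        """Release the snapshot once; any later call is a no-op."""
        if self.deleted:
            return False          # already deleted: do nothing
        self.destroy_calls += 1   # the actual cleanup would happen here
        self.deleted = True
        return True


def replace_snapshot(current, new_name):
    """Destroy the current snapshot if one exists, then create a new one."""
    if current is not None:
        current.destroy()         # safe even if current was destroyed earlier
    return Snapshot(new_name)
```

In this sketch, calling `destroy()` a second time returns `False` and performs no cleanup, so a rebalance step that revisits an already deleted snapshot would not attempt to destroy it again.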

Components

Affects versions

Fix versions

Labels

Environment

7.5.0-4002

Link to Log File, atop/blg, CBCollectInfo, Core dump

http://supportal.couchbase.com/snapshot/765b1c97d61fba57bb48c07d58de3abe::0
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-001.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-002.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-003.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-010.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-011.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-012.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-i-node-008.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-i-node-009.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-i-node-013.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-i-node-014.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-q-node-004.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-q-node-005.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-s-node-006.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-s-node-007.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip

Release Notes Description

None

Activity

Varun Velamuri August 28, 2023 at 2:27 PM

Pavan PB August 10, 2023 at 6:16 AM

The 7.1.5 ticket was validated via code instrumentation. Tagging this with the request-dev-verify label. The system test runs have not seen any rebalance hang issues.

CB robot June 14, 2023 at 1:31 AM

Build couchbase-server-7.2.1-5788 contains indexing commit 9d51282 with commit message:
Do not destroy a snapshot that is already deleted

Fixed

Details

Assignee

Reporter

Is this a Regression?

Unknown

Triage

Untriaged

Story Points

Priority

Created April 6, 2023 at 3:37 AM
Updated September 18, 2023 at 2:04 PM
Resolved June 13, 2023 at 10:40 PM