[BP to 7.2.x] - Indexer rebalance seems to be hung for 7+ hours.
Environment
7.5.0-4002
Link to Log File, atop/blg, CBCollectInfo, Core dump
http://supportal.couchbase.com/snapshot/765b1c97d61fba57bb48c07d58de3abe::0
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-001.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-002.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-003.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-010.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-011.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-d-node-012.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-i-node-008.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-i-node-009.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-i-node-013.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-i-node-014.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-q-node-004.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-q-node-005.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-s-node-006.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
s3://cb-customers-secure/index_rebl_hung/2023-03-28/collectinfo-2023-03-28t043534-ns_1@svc-s-node-007.vo-mjwawd04g3-v.sandbox.nonprod-project-avengers.com.zip
Release Notes Description
None
Activity

Varun Velamuri August 28, 2023 at 2:27 PM
Verified this issue based on code-instrumentation as per the steps mentioned in: https://couchbasecloud.atlassian.net/browse/MB-56169?focusedCommentId=859307&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel

Pavan PB August 10, 2023 at 6:16 AM
The 7.1.5 ticket was validated via code instrumentation. Tagging this with the request-dev-verify label. The system test runs have not seen any rebalance hang issues.

CB robot June 14, 2023 at 1:31 AM
Build couchbase-server-7.2.1-5788 contains indexing commit 9d51282 with commit message:
Do not destroy a snapshot that is already deleted
Fixed
Details
Assignee
Pavan PB
Reporter
Varun Velamuri
Is this a Regression?
Unknown
Triage
Untriaged
Story Points
0
Priority
Critical
Created April 6, 2023 at 3:37 AM
Updated September 18, 2023 at 2:04 PM
Resolved June 13, 2023 at 10:40 PM
QE test
There are 10 databases in the cluster. The GSI nodes were scaled from 2 to 4 because the 2 old nodes were over the HWM. During the scaling operation, GSI internally performs an indexer rebalance, which appears to have been hanging for many hours.
Issue
During scaling, a GSI indexer rebalance froze and did not complete. This was because an index snapshot was not correctly deleted and recreated.
Resolution
A flag now governs snapshot handling so that snapshots are correctly deleted or recreated when indexes are updated during rebalancing.
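This ticket does not show the fix itself. The only description is the commit message "Do not destroy a snapshot that is already deleted". As a rough illustration of that idea, and not the indexer's actual code, the Go sketch below guards a destroy call with a deleted flag so that a second call is a no-op. The `Snapshot` type, its fields, and the `Destroy` and `Releases` methods are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// Snapshot is a hypothetical stand-in for an index snapshot.
// It is NOT the indexer's real type; it exists only for illustration.
type Snapshot struct {
	mu       sync.Mutex
	deleted  bool // set once the underlying resources have been released
	releases int  // how many times resources were actually released
}

// Destroy releases the snapshot's resources at most once.
// It returns true if this call performed the release, and false if the
// snapshot had already been deleted (in which case nothing is done).
func (s *Snapshot) Destroy() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.deleted {
		return false // already deleted: do not destroy again
	}
	s.deleted = true
	s.releases++ // placeholder for the real resource release
	return true
}

// Releases reports how many times resources were actually released.
func (s *Snapshot) Releases() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.releases
}

func main() {
	s := &Snapshot{}
	fmt.Println(s.Destroy())  // first call releases the resources
	fmt.Println(s.Destroy())  // second call is a no-op
	fmt.Println(s.Releases()) // resources were released exactly once
}
```

The mutex is there on the assumption that rebalance and index-update paths could both reach the same snapshot concurrently. Whether the real fix needs or uses such locking is not stated in this ticket.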