Details
Bug
Resolution: Fixed
Major
7.2.3
Untriaged
0
Unknown
Analytics Sprint 32
Description
Found this server issue during 7.2.3 Capella testing.
An AWS cluster with AMI couchbase-cloud-server-7.2.3-6705-x86_64-v1.0.24 failed to rebalance after a node was randomly killed and a replacement node was spawned.
Recurring error in the server logs:
Rebalance exited with reason {service_rebalance_failed,cbas, {worker_died, {'EXIT',<0.22011.134>, {rebalance_failed,
}}}}.
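When scanning logs for this failure, a minimal sketch like the following can pull the failing service and the top-level reason out of a line of this shape. The regular expression and the function name `parse_rebalance_failure` are assumptions for illustration and not part of any Couchbase tooling; the inner details that are elided in the error above stay elided.

```python
import re

# Shape of the failure line quoted above; the nested details are elided in the report.
LINE = ("Rebalance exited with reason {service_rebalance_failed,cbas, "
        "{worker_died, {'EXIT',<0.22011.134>, {rebalance_failed, }}}}.")

# Hypothetical pattern: captures the service name and the first nested reason atom.
PATTERN = re.compile(
    r"Rebalance exited with reason \{service_rebalance_failed,"
    r"\s*(?P<service>\w+),\s*\{(?P<reason>\w+),"
)


def parse_rebalance_failure(line):
    """Return (service, reason) if the line matches the pattern, else None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return m.group("service"), m.group("reason")


print(parse_rebalance_failure(LINE))
```

Here the service is `cbas` (Analytics) and the top-level reason is `worker_died`, which is what points at the Analytics service rather than the data service.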
Cluster: https://ui.sbx-3.sandbox.nonprod-project-avengers.com/database/datatools?oid=259d212d-002f-40cb-9d87-dcc138110c8c&pid=42270d0b-d978-4f19-b3d1-c833193668fc&dbid=acc8336e-afcf-46d3-bec1-8acafa6dd124
Datadog logs: https://app.datadoghq.com/logs?query=%40clusterId%3Aacc8336e-afcf-46d3-bec1-8acafa6dd124&cols=host%2Cservice&index=*&messageDisplay=inline&refresh_mode=sliding&stream_sort=desc&viz=stream&from_ts=1699521768839&to_ts=1699525368839&live=true
Server logs: https://cb-engineering.s3.amazonaws.com/Aman/collectinfo-2023-11-09T095131-ns_1%40svc-dqisea-node-001.t9msz82bn5isouab.sandbox.nonprod-project-avengers.com.zip
Corresponding AV ticket: https://couchbasecloud.atlassian.net/browse/AV-67126
For Gerrit Dashboard: MB-59582

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
201900,2 | MB-59582: short circuit waitForSeqnos w/ length 0 | neo | analytics-dcp-client | MERGED | +2 | +1 |
201901,7 | MB-59582: disregard seqno differences > collection high seqno in kv master | neo | cbas-core | MERGED | +2 | +1 |