Details
Description
Steps to reproduce:
- Create a 4-node cluster with 2 cbas nodes and 2 KV nodes.
- Set the cbas replica count to 3.
- Create cbas infrastructure such as dataverses, datasets, and indexes.
- The actual number of replicas will be 1, as there are only 2 cbas nodes.
- Rebalance in 2 more cbas nodes.
- While the rebalance is in progress, stop Couchbase Server on one of the existing cbas nodes.
- The rebalance fails as expected. Verify that no data loss occurred on the cbas side and that the actual replica count is still 1.
- Start the Couchbase Server that was stopped in step 6.
- Rebalance again.
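The replica counts expected in the steps above can be sketched as follows. This is a minimal illustration that assumes the effective replica count is capped at one less than the number of cbas nodes. That rule is inferred from step 4 and is not confirmed by this ticket. The function name `effective_replicas` is hypothetical.

```python
def effective_replicas(configured: int, cbas_nodes: int) -> int:
    """Return the replica count a cluster can actually hold.

    Assumption (not confirmed by the ticket): each replica needs its own
    cbas node in addition to the node holding the active copy, so the
    effective count is capped at cbas_nodes - 1.
    """
    return max(0, min(configured, cbas_nodes - 1))


# Replica set to 3 with only 2 cbas nodes, as in the steps above.
before_rebalance = effective_replicas(configured=3, cbas_nodes=2)

# After the 2 additional cbas nodes are rebalanced in successfully.
after_rebalance = effective_replicas(configured=3, cbas_nodes=4)
```

Under this assumption the count stays at 1 while the rebalance-in has not succeeded, which matches the check in the failed-rebalance step.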
Observation:
The logs in the web UI show that analytics reported a rebalance failure, while ns-server reported that the same rebalance completed successfully.
Hot-reloaded memcached.json for config change of the following keys: [<<"scramsha_fallback_salt">>] (repeated 1 times, last seen 58.067673 secs ago)
memcached_config_mgr 000 ns_1@172.23.104.217 9:34:06 PM 30 Jan, 2022

Rebalance completed successfully.
Rebalance Operation Id = f9728f5f1e58640e26f6d09702154175
ns_orchestrator 000 ns_1@172.23.104.161 9:33:52 PM 30 Jan, 2022

Analytics Service unable to successfully rebalance 929d7e5eeb4809f400ba94c836dcc0a4 due to 'HYR0003: Failure on node 08e19c5052c5c4d8f1b64ab038e93fb7'; see analytics_info.log for details
analytics 000 ns_1@172.23.104.179 9:33:51 PM 30 Jan, 2022

Bucket "EExRjRyxSYOul-4-870000" rebalance appears to be swap rebalance
ns_vbucket_mover 000 ns_1@172.23.104.161 9:33:45 PM 30 Jan, 2022

Started rebalancing bucket EExRjRyxSYOul-4-870000
ns_rebalancer 000 ns_1@172.23.104.161 9:33:45 PM 30 Jan, 2022

Starting rebalance, KeepNodes = ['ns_1@172.23.104.217','ns_1@172.23.104.163','ns_1@172.23.104.201','ns_1@172.23.104.202','ns_1@172.23.104.161','ns_1@172.23.104.179'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = f9728f5f1e58640e26f6d09702154175
ns_orchestrator 000 ns_1@172.23.104.161 9:33:45 PM 30 Jan, 2022
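The mismatch shown in the log above could be checked automatically in a test. The following is a minimal sketch, assuming the log entries have already been collected as (module, message) pairs. The `LogEntry` type and the `has_conflicting_outcome` function are hypothetical names and are not part of any Couchbase API. The matched phrases are taken from the messages quoted above.

```python
import re
from typing import Iterable, NamedTuple


class LogEntry(NamedTuple):
    module: str   # e.g. "analytics" or "ns_orchestrator"
    message: str  # log text as shown in the web UI


# Phrases copied from the log messages quoted in this ticket.
ANALYTICS_FAILURE = re.compile(r"unable to successfully rebalance", re.IGNORECASE)
ORCHESTRATOR_SUCCESS = re.compile(r"Rebalance completed successfully", re.IGNORECASE)


def has_conflicting_outcome(entries: Iterable[LogEntry]) -> bool:
    """Return True if analytics reported a rebalance failure while
    ns_orchestrator reported that the rebalance completed successfully."""
    entries = list(entries)  # allow generators, since we scan twice
    analytics_failed = any(
        e.module == "analytics" and ANALYTICS_FAILURE.search(e.message)
        for e in entries
    )
    orchestrator_ok = any(
        e.module == "ns_orchestrator" and ORCHESTRATOR_SUCCESS.search(e.message)
        for e in entries
    )
    return analytics_failed and orchestrator_ok


observed = [
    LogEntry("ns_orchestrator", "Rebalance completed successfully."),
    LogEntry(
        "analytics",
        "Analytics Service unable to successfully rebalance "
        "929d7e5eeb4809f400ba94c836dcc0a4 due to 'HYR0003: Failure on node "
        "08e19c5052c5c4d8f1b64ab038e93fb7'; see analytics_info.log for details",
    ),
]
conflict = has_conflicting_outcome(observed)
```

This sketch does not correlate entries by rebalance operation ID or timestamp, so a stricter check would restrict the scan to entries belonging to a single rebalance.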
Issue Links
- relates to: MB-30766 [System Test] Rebalance operation for any service fails because of analytics nodes rebalance error - Datasets in different partitions have different DCP states (Closed)