Details
- Bug
- Resolution: Fixed
- Critical
- 7.0.3
- 7.1.0-2534
- Untriaged
- Centos 64-bit
- 1
- No
- CX Sprint 285
Description
Steps to reproduce:
1. Set up a 5-node cluster like the one below:
Node | Services | CPU_utilization | Mem_total | Mem_free | Swap_mem_used | Active / Replica | Version |
172.23.105.19 | cbas | 0.251256281407 | 3.91 GiB | 3.06 GiB | 39.00 MiB / 3.50 GiB | 0 / 0 | 6.6.4-9961-enterprise |
172.23.105.31 | index, kv, n1ql | 0.753768844221 | 3.91 GiB | 3.38 GiB | 80.50 MiB / 3.50 GiB | 0 / 0 | 6.6.4-9961-enterprise |
172.23.105.20 | cbas | 0.501253132832 | 3.91 GiB | 3.13 GiB | 114.57 MiB / 3.50 GiB | 0 / 0 | 6.6.4-9961-enterprise |
172.23.105.244 | index, kv, n1ql | 0.503778337531 | 3.91 GiB | 3.41 GiB | 39.00 MiB / 3.50 GiB | 0 / 0 | 6.6.4-9961-enterprise |
172.23.105.245 | cbas | 1.00250626566 | 3.91 GiB | 3.12 GiB | 94.25 MiB / 3.50 GiB | 0 / 0 | 6.6.4-9961-enterprise |
2. Create a single KV bucket and load data into it.
3. Create the following CBAS infra: 2 dataverses, 8 datasets and 3 indexes.
4. Upgrade each node in the cluster using the "online swap upgrade" method. The cluster should then look something like this:
Nodes | Services | Version | CPU | Status | Membership / Recovery |
172.23.105.19 | cbas | 7.1.0-2534-enterprise | 0.777331995988 | Cluster node | active / none |
172.23.105.20 | index, kv, n1ql | 7.1.0-2534-enterprise | 1.15577889447 | Cluster node | active / none |
172.23.105.244 | cbas | 7.1.0-2534-enterprise | 0.57701956849 | Cluster node | active / none |
172.23.105.245 | index, kv, n1ql | 7.1.0-2534-enterprise | 1.15490836053 | Cluster node | active / none |
172.23.105.24 | cbas | 7.1.0-2534-enterprise | 1.44927536232 | Cluster node | active / none |
5. Rebalance the cluster again to enable the CBAS service.
6. Validate that the pre-upgrade CBAS infra is still intact and that no data was lost.
7. Enable CBAS replicas and set the replica count to 3.
8. Rebalance for the replica setting to take effect.
9. Load more docs into the bucket that was created before the upgrade and verify that ingestion into the datasets happens as expected.
10. Delete all the data from the bucket and verify that the data was flushed from the datasets.
11. Create new scopes and collections in the existing bucket and load data.
12. Create new buckets, scopes and collections and load data.
13. Create new CBAS infra: 10 dataverses, 30 datasets, 10 synonyms and 5 indexes.
14. Verify ingestion completed for all the newly created datasets.
15. Fail over one of the CBAS nodes (any node except the CBAS CC node).
16. A CBAS service crash is observed, and CBAS does not come back up.
17. Rebalancing after adding the failed-over node back also throws errors:
Analytics Service unable to successfully rebalance 474e248ff3cee0964d49d60bdbb255d2 due to 'java.lang.InterruptedException: sleep interrupted'; see analytics_info.log for details
Analytics Service unable to successfully rebalance bcd9bae69f337714db5221bfa407208d due to 'java.lang.Exception: replica 3@172.23.105.19:9120 failed'; see analytics_info.log for details
Rebalance exited with reason {{badmatch,failed},
[{ns_rebalancer,rebalance_body,5,
[{file,"src/ns_rebalancer.erl"},
{line,508}]},
{async,'-async_init/4-fun-1-',3,
[{file,"src/async.erl"},{line,191}]}]}.
Rebalance Operation Id = 37586dd42292b66579d4bf0104f975ef
Analytics Service unable to successfully rebalance adb71312dce14443412608f6b5731c08 due to 'java.lang.Exception: replica 3@172.23.105.19:9120 failed'; see analytics_info.log for details
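For triage, the Analytics rebalance failures quoted above can be grouped by cause. The following is a minimal sketch, not part of the test suite: the regular expression is an assumption based only on the message wording seen in this ticket, the function name `group_failures` is hypothetical, and the ticket does not state what the 32-character hex identifier denotes, so it is treated here as an opaque ID.

```python
import re
from collections import defaultdict

# Assumed message shape, taken from the three messages quoted in this ticket.
# Other builds may word the message differently.
FAILURE_RE = re.compile(
    r"Analytics Service unable to successfully rebalance "
    r"(?P<ident>[0-9a-f]{32}) due to '(?P<cause>[^']*)'"
)


def group_failures(log_text):
    """Map each failure cause to the list of identifiers reported with it."""
    causes = defaultdict(list)
    for match in FAILURE_RE.finditer(log_text):
        causes[match.group("cause")].append(match.group("ident"))
    return dict(causes)


# The three messages from this ticket, one per line.
SAMPLE = "\n".join([
    "Analytics Service unable to successfully rebalance "
    "474e248ff3cee0964d49d60bdbb255d2 due to "
    "'java.lang.InterruptedException: sleep interrupted'; "
    "see analytics_info.log for details",
    "Analytics Service unable to successfully rebalance "
    "bcd9bae69f337714db5221bfa407208d due to "
    "'java.lang.Exception: replica 3@172.23.105.19:9120 failed'; "
    "see analytics_info.log for details",
    "Analytics Service unable to successfully rebalance "
    "adb71312dce14443412608f6b5731c08 due to "
    "'java.lang.Exception: replica 3@172.23.105.19:9120 failed'; "
    "see analytics_info.log for details",
])

if __name__ == "__main__":
    for cause, idents in group_failures(SAMPLE).items():
        print(f"{len(idents)} x {cause}: {', '.join(idents)}")
```

Grouping this way makes it visible that two of the three failures share the same cause, the failed replica `3@172.23.105.19:9120`.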
This issue was found after adding support for N2N encryption in upgrade tests.
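When re-running these steps, the post-upgrade state from step 4 can be checked mechanically before moving on to the failover. The sketch below parses the pipe-delimited node table as it appears in this ticket and reports nodes that are not on the target build. The helper names (`parse_table`, `nodes_not_on`, `nodes_with_service`) are hypothetical and the table text is copied from the description; nothing here calls a Couchbase API.

```python
def _cells(line):
    """Split one pipe-delimited row into stripped cell values."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def parse_table(text):
    """Parse a pipe-delimited table (header row first) into a list of dicts."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = _cells(lines[0])
    return [dict(zip(header, _cells(ln))) for ln in lines[1:]]


def nodes_not_on(rows, target_version):
    """Return the addresses of nodes whose Version differs from the target."""
    return [r["Nodes"] for r in rows if r["Version"] != target_version]


def nodes_with_service(rows, service):
    """Return the addresses of nodes that list the given service."""
    return [
        r["Nodes"]
        for r in rows
        if service in [s.strip() for s in r["Services"].split(",")]
    ]


# Post-upgrade table from step 4 of this ticket.
POST_UPGRADE = """
Nodes | Services | Version | CPU | Status | Membership / Recovery |
172.23.105.19 | cbas | 7.1.0-2534-enterprise | 0.777331995988 | Cluster node | active / none |
172.23.105.20 | index, kv, n1ql | 7.1.0-2534-enterprise | 1.15577889447 | Cluster node | active / none |
172.23.105.244 | cbas | 7.1.0-2534-enterprise | 0.57701956849 | Cluster node | active / none |
172.23.105.245 | index, kv, n1ql | 7.1.0-2534-enterprise | 1.15490836053 | Cluster node | active / none |
172.23.105.24 | cbas | 7.1.0-2534-enterprise | 1.44927536232 | Cluster node | active / none |
"""

if __name__ == "__main__":
    rows = parse_table(POST_UPGRADE)
    print("Nodes not upgraded:", nodes_not_on(rows, "7.1.0-2534-enterprise"))
    print("CBAS nodes:", nodes_with_service(rows, "cbas"))
```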
Issue Links
- is a backport of MB-51642 [Upgrade Test] CBAS service keeps crashing when one of the CBAS nodes is failed over (Closed)