Details
- Bug
- Resolution: Duplicate
- Major
- None
- 7.6.0
- None
- Untriaged
- 0
- Unknown
Description
Rebalance hangs while adding or removing nodes in a cluster seem to be occurring more frequently. Here's a case where I did the following:
- ./cluster_run -n 3 --dont-rename
- ./cluster_connect -n 1 -s 1024 -I 512 -M plasma -T n0:kv
- from UI add second node to cluster
- from UI add third node to cluster
- click on rebalance
After two minutes with no rebalance progress, I took cbcollects of all three nodes:
https://s3.amazonaws.com/cb-engineering/stevewatanabe/RebalanceHang16Oct2023/collectinfo-2023-10-17T000553-n_0%40127.0.0.1.zip
https://s3.amazonaws.com/cb-engineering/stevewatanabe/RebalanceHang16Oct2023/collectinfo-2023-10-17T000553-n_1%40127.0.0.1.zip
https://s3.amazonaws.com/cb-engineering/stevewatanabe/RebalanceHang16Oct2023/collectinfo-2023-10-17T000553-n_2%40127.0.0.1.zip
and took a screenshot.
rebalanceProgress showed:
$ xcurl localhost:9000/pools/default/rebalanceProgress | jq
{
  "status": "running",
  "n_2@127.0.0.1": {
    "progress": 0.6
  },
  "n_1@127.0.0.1": {
    "progress": 0
  },
  "n_0@127.0.0.1": {
    "progress": 0.4090909090909091
  }
}
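For anyone trying to reproduce this, the manual "watch rebalanceProgress for two minutes" check could be scripted roughly as below. This is only a sketch, not something taken from the logs: it assumes the same /pools/default/rebalanceProgress endpoint on localhost:9000 shown above, treats the rebalance as hung when the per-node progress values stay unchanged for 120 seconds, and assumes Administrator/asdasd as the dev-cluster credentials (adjust if yours differ). The helper names (node_progress, fetch_progress, wait_for_hang) are hypothetical.

```python
import base64
import json
import time
import urllib.request
from typing import Callable, Dict, Optional


def node_progress(payload: dict) -> Dict[str, float]:
    """Extract {node: progress} from a rebalanceProgress payload."""
    return {
        node: float(info["progress"])
        for node, info in payload.items()
        if isinstance(info, dict) and "progress" in info
    }


def fetch_progress(url: str, user: str, password: str) -> dict:
    """GET the rebalanceProgress JSON using HTTP basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def wait_for_hang(
    fetch: Callable[[], dict],
    stall_seconds: float = 120.0,
    interval: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Dict[str, float]]:
    """Poll until the rebalance stops running or progress stalls.

    Returns the stalled per-node progress if nothing changed for
    stall_seconds, or None if the status is no longer "running".
    """
    last = None
    last_change = clock()
    while True:
        payload = fetch()
        if payload.get("status") != "running":
            return None
        current = node_progress(payload)
        now = clock()
        if current != last:
            # Progress moved (or this is the first sample): reset the timer.
            last, last_change = current, now
        elif now - last_change >= stall_seconds:
            return current
        sleep(interval)


if __name__ == "__main__":
    # Assumed URL and credentials for a local cluster_run setup.
    url = "http://localhost:9000/pools/default/rebalanceProgress"
    stalled = wait_for_hang(lambda: fetch_progress(url, "Administrator", "asdasd"))
    if stalled is None:
        print("rebalance is no longer running")
    else:
        print("no progress for 120s:", json.dumps(stalled, indent=2))
```

The clock and sleep parameters are injectable only so the stall logic can be exercised without a live cluster or real waiting.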
Issue Links
- duplicates
- MB-59089 [Magma] :- Swap rebalance + CRUD on collections/data hangs on 1 DGM bucket
- Closed