- Take a cluster with 3 search nodes.
- Create a search index with 1 partition and 2 replicas.
- Remove one node from the cluster (via rebalance out).
- Upon rebalance completion, the rebalance button remains active (expected, because the search index's constraints cannot be met: 1 replica is missing).
- Now add the node back into the cluster and rebalance.
- At this point we expect the rebalance button to NOT be active anymore because the search index's constraints are met, but the logs indicate otherwise.
P.S. I haven't been able to reproduce this on cluster_run, but Jon Strabala has reproduced it on AWS instances. Attached are the logs from the 3 search nodes.
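The expectation in the steps above can be sketched as a simple count. This is a simplified model, not the search service's actual planner: it assumes every copy of a partition (1 active plus each replica) has to live on a different search node. The function names are hypothetical.

```python
def copies_required(num_replicas):
    """Copies of each partition: 1 active plus the configured replicas."""
    return num_replicas + 1


def constraints_satisfiable(num_search_nodes, num_replicas):
    """Assumption: each copy of a partition needs its own search node."""
    return num_search_nodes >= copies_required(num_replicas)


# Index from the steps above: 1 partition, 2 replicas.
print(constraints_satisfiable(3, 2))  # 3 search nodes
print(constraints_satisfiable(2, 2))  # after rebalancing one node out
```

Under this model, the 2-node state is the only one in which the rebalance button should remain active.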
Rebalance starts:
2023-01-05T20:18:00.136Z, ns_orchestrator:0:info:message(ns_1@10.0.0.23) - Starting rebalance, KeepNodes = ['ns_1@10.0.0.114','ns_1@10.0.0.144',
'ns_1@10.0.0.145','ns_1@10.0.0.163',
'ns_1@10.0.0.23','ns_1@10.0.0.251',
'ns_1@10.0.0.30','ns_1@10.0.0.54',
'ns_1@10.0.0.95'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = 8278626a19cceac5319915bbc5fc62ed
After the rebalance, ns_server determines that there are 3 search nodes in the cluster:
{{service_map,fts},
{['ns_1@10.0.0.163','ns_1@10.0.0.30','ns_1@10.0.0.54'],
{<<"44b39b4e34f1baea1a88bc58b3a40000">>,2329}}}
ns_server then checks the system status with the search service. Although search detects that there are 3 nodes in the cluster, it reports "could not meet replication constraints", which appears to be the problem.
[json_rpc:debug,2023-01-05T20:18:07.042Z,ns_1@10.0.0.163:json_rpc_connection-fts-service_api<0.23444.8>:json_rpc_connection:handle_call:156]sending jsonrpc call:{[{jsonrpc,<<"2.0">>},
{id,107},
{method,<<"ServiceAPI.GetCurrentTopology">>},
{params,[{[{rev,<<"MjE=">>},{timeout,30000}]}]}]}
[json_rpc:debug,2023-01-05T20:18:07.056Z,ns_1@10.0.0.163:json_rpc_connection-fts-service_api<0.23444.8>:json_rpc_connection:handle_info:89]got response: [{<<"id">>,107},
{<<"result">>,
{[{<<"rev">>,<<"MjI=">>},
{<<"nodes">>,
[<<"03544ee7c1e491540ba818097f01fca9">>,
<<"e636f4410dab1791c07ecce7aa6bb5a7">>,
<<"ea6ef01e396c22f9c796bca19d77dc1e">>]},
{<<"isBalanced">>,false},
{<<"messages">>,
[<<"warning: resource: \"ts02_fts_01\" -- could not meet replication constraints">>]}]}},
{<<"error">>,null}]
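The `rev` values appear in two encodings in these logs: base64 strings in the ns_server JSON-RPC trace (`MjE=`, `MjI=`) and Go byte slices in the search logs (`Rev:[49 52]`). A minimal sketch for lining them up follows. It assumes the revisions are ASCII decimal counters, which is what these samples suggest (for example `haveTopologyRev: 13` is answered with `Rev:[49 52]`); the helper names are hypothetical.

```python
import base64
from typing import Sequence


def decode_rpc_rev(rev_b64: str) -> int:
    """Decode the base64 'rev' string from the ns_server JSON-RPC trace."""
    return int(base64.b64decode(rev_b64).decode("ascii"))


def decode_log_rev(rev_bytes: Sequence[int]) -> int:
    """Decode the byte-slice form printed in the search log, e.g. [49 52]."""
    return int(bytes(rev_bytes).decode("ascii"))


print(decode_rpc_rev("MjE="))    # rev sent in the request -> 21
print(decode_rpc_rev("MjI="))    # rev in the response -> 22
print(decode_log_rev([49, 52]))  # -> 14
print(decode_log_rev([50, 48]))  # -> 20
```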
And it seems that ONLY one of the 3 nodes is complaining:
10.0.0.30
2023-01-05T20:18:01.186+00:00 [INFO] ctl/manager: GetCurrentTopology, haveTopologyRev: 13, changed, rv: &{Rev:[49 52] Nodes:[03544ee7c1e491540ba818097f01fca9 e636f4410dab1791c07ecce7aa6bb5a7 ea6ef01e396c22f9c796bca19d77dc1e] IsBalanced:true Messages:[]}
2023-01-05T20:18:07.042+00:00 [INFO] ctl/manager: GetCurrentTopology, haveTopologyRev: 14, changed, rv: &{Rev:[49 53] Nodes:[e636f4410dab1791c07ecce7aa6bb5a7 ea6ef01e396c22f9c796bca19d77dc1e 03544ee7c1e491540ba818097f01fca9] IsBalanced:true Messages:[]}
2023-01-05T20:18:07.056+00:00 [INFO] ctl/manager: GetCurrentTopology, haveTopologyRev: 15, changed, rv: &{Rev:[49 54] Nodes:[03544ee7c1e491540ba818097f01fca9 e636f4410dab1791c07ecce7aa6bb5a7 ea6ef01e396c22f9c796bca19d77dc1e] IsBalanced:true Messages:[]}

/mnt/datadisk/index/@fts:
total 4
1385168992 drwxrwx--- 4 couchbase couchbase 95 Jan 5 20:23 .
1383071840 drwxrwx--- 4 couchbase couchbase 33 Jan 5 19:01 ..
1385168993 -rw------- 1 couchbase couchbase 32 Jan 5 20:11 cbft.uuid
1388314720 drwx------ 2 couchbase couchbase 73 Jan 5 20:18 planPIndexes
1545601120 drwx------ 3 couchbase couchbase 86 Jan 5 20:15 ts02_fts_01_5e10ee0c69fc1b62_4c1c5584.pindex

10.0.0.54
2023-01-05T20:18:01.209+00:00 [INFO] ctl/manager: GetCurrentTopology, haveTopologyRev: , changed, rv: &{Rev:[52] Nodes:[03544ee7c1e491540ba818097f01fca9 e636f4410dab1791c07ecce7aa6bb5a7 ea6ef01e396c22f9c796bca19d77dc1e] IsBalanced:true Messages:[]}

/mnt/datadisk/index/@fts:
total 4
1460666464 drwxrwx--- 4 couchbase couchbase 95 Jan 5 20:23 .
1383071840 drwxrwx--- 4 couchbase couchbase 33 Jan 5 19:27 ..
1460666465 -rw------- 1 couchbase couchbase 32 Jan 5 20:18 cbft.uuid
1461715040 drwx------ 2 couchbase couchbase 73 Jan 5 20:18 planPIndexes
1654653024 drwx------ 3 couchbase couchbase 86 Jan 5 20:18 ts02_fts_01_5e10ee0c69fc1b62_4c1c5584.pindex

10.0.0.163
2023-01-05T20:18:01.181+00:00 [INFO] ctl/manager: GetCurrentTopology, haveTopologyRev: 19, changed, rv: &{Rev:[50 48] Nodes:[03544ee7c1e491540ba818097f01fca9 e636f4410dab1791c07ecce7aa6bb5a7 ea6ef01e396c22f9c796bca19d77dc1e] IsBalanced:false Messages:[warning: resource: "ts02_fts_01" -- could not meet replication constraints]}

/mnt/datadisk/index/@fts:
total 4
1385168992 drwxrwx--- 4 couchbase couchbase 95 Jan 5 20:23 .
1383071840 drwxrwx--- 4 couchbase couchbase 33 Jan 5 19:01 ..
1385168993 -rw------- 1 couchbase couchbase 32 Jan 5 20:10 cbft.uuid
1483735136 drwx------ 2 couchbase couchbase 73 Jan 5 20:18 planPIndexes
1557135456 drwx------ 3 couchbase couchbase 86 Jan 5 20:15 ts02_fts_01_5e10ee0c69fc1b62_4c1c5584.pindex
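To compare what each node reports without reading the lines by eye, the `GetCurrentTopology` entries can be parsed with a small script. This is only a sketch: the regular expression is derived from the log lines quoted in this report, not from a documented log format, and the function names are hypothetical.

```python
import re

# Pattern inferred from the "ctl/manager: GetCurrentTopology" lines above.
TOPOLOGY_RE = re.compile(
    r"haveTopologyRev: (?P<have>\d*), changed, rv: &\{Rev:\[(?P<rev>[\d ]*)\] "
    r"Nodes:\[(?P<nodes>[^\]]*)\] IsBalanced:(?P<balanced>true|false) "
    r"Messages:\[(?P<messages>.*)\]\}"
)


def parse_topology_line(line):
    """Return the fields of one GetCurrentTopology log line, or None."""
    m = TOPOLOGY_RE.search(line)
    if m is None:
        return None
    return {
        "have_rev": m.group("have"),
        # Rev is printed as a byte slice; assumed here to be ASCII text.
        "rev": bytes(int(b) for b in m.group("rev").split()).decode("ascii"),
        "nodes": sorted(m.group("nodes").split()),
        "is_balanced": m.group("balanced") == "true",
        "messages": m.group("messages"),
    }


def nodes_reporting_unbalanced(latest_line_by_host):
    """Hosts whose given log line reports IsBalanced:false."""
    hosts = []
    for host, line in latest_line_by_host.items():
        parsed = parse_topology_line(line)
        if parsed is not None and not parsed["is_balanced"]:
            hosts.append(host)
    return sorted(hosts)


NODES = ("03544ee7c1e491540ba818097f01fca9 e636f4410dab1791c07ecce7aa6bb5a7 "
         "ea6ef01e396c22f9c796bca19d77dc1e")
SAMPLE = {
    "10.0.0.54": "2023-01-05T20:18:01.209+00:00 [INFO] ctl/manager: "
                 "GetCurrentTopology, haveTopologyRev: , changed, rv: &{Rev:[52] "
                 "Nodes:[" + NODES + "] IsBalanced:true Messages:[]}",
    "10.0.0.163": '2023-01-05T20:18:01.181+00:00 [INFO] ctl/manager: '
                  'GetCurrentTopology, haveTopologyRev: 19, changed, rv: &{Rev:[50 48] '
                  'Nodes:[' + NODES + '] IsBalanced:false Messages:[warning: resource: '
                  '"ts02_fts_01" -- could not meet replication constraints]}',
}

print(nodes_reporting_unbalanced(SAMPLE))
```

Feeding it the last `GetCurrentTopology` line from each node's log makes it easy to see which node disagrees with the others.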