Details
- Bug
- Resolution: Fixed
- Critical
- 7.0.1
- Untriaged
- 1
- Yes
Description
Build : 7.0.1-5977
Test : -test tests/2i/cheshirecat/test_idx_clusterops_cheshire_cat_recovery.yml -scope tests/2i/cheshirecat/scope_idx_cheshire_cat_dgm.yml
Scale : 2
Iteration : 8th (day 3)
Twelve rebalance operations (including rebalance retries) failed between 2021-08-05T07:15:28 and 2021-08-05T08:09:44 with the following error:
[ns_server:error,2021-08-05T07:15:28.689-07:00,ns_1@172.23.97.215:service_rebalancer-index<0.13433.1863>:service_rebalancer:run_rebalance_worker:119]Worker terminated abnormally: {'EXIT',<0.22636.1863>,
    {{badmatch,
      {error,
       {unknown_error,
        <<"Protocol Conflict Error: Existing Rebalance Token Found">>}}},
     [{service_rebalancer,rebalance_worker,1,
       [{file,"src/service_rebalancer.erl"},
        {line,164}]},
      {proc_lib,init_p,3,
       [{file,"proc_lib.erl"},{line,234}]}]}}
[user:error,2021-08-05T07:15:28.692-07:00,ns_1@172.23.97.215:<0.9048.0>:ns_orchestrator:log_rebalance_completion:1416]Rebalance exited with reason {service_rebalance_failed,index,
    {worker_died,
     {'EXIT',<0.22636.1863>,
      {{badmatch,
        {error,
         {unknown_error,
          <<"Protocol Conflict Error: Existing Rebalance Token Found">>}}},
       [{service_rebalancer,rebalance_worker,1,
         [{file,"src/service_rebalancer.erl"},
          {line,164}]},
        {proc_lib,init_p,3,
         [{file,"proc_lib.erl"},{line,234}]}]}}}}.
Rebalance Operation Id = 4ffb583616e3db29eaada8060814357e
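To re-check the failure count from an ns_server log, a small script along these lines may help. It is only a sketch: the function names are hypothetical, and it assumes every log entry starts with a `[module:level,timestamp,` header like the two entries above, with continuation lines belonging to the preceding entry.

```python
import re
from datetime import datetime

# Header of an ns_server log entry, e.g. "[user:error,2021-08-05T07:15:28.692-07:00,..."
# (assumed format, taken from the two entries quoted in this ticket).
ENTRY_HEADER = re.compile(
    r"^\[(\w+):(\w+),(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{2}:\d{2}),"
)
CONFLICT = "Protocol Conflict Error: Existing Rebalance Token Found"


def split_entries(lines):
    """Group raw lines into (timestamp, text) entries.

    A line matching ENTRY_HEADER starts a new entry; any other line is
    treated as a continuation of the previous entry.
    """
    entries = []
    for line in lines:
        match = ENTRY_HEADER.match(line)
        if match:
            entries.append([datetime.fromisoformat(match.group(3)), line])
        elif entries:
            entries[-1][1] += "\n" + line
    return [(ts, text) for ts, text in entries]


def count_conflict_failures(lines, start, end):
    """Count 'Rebalance exited' entries in [start, end] that carry the conflict error."""
    return sum(
        1
        for ts, text in split_entries(lines)
        if start <= ts <= end
        and "Rebalance exited with reason" in text
        and CONFLICT in text
    )
```

`start` and `end` need to be timezone-aware `datetime` values (for example `datetime.fromisoformat("2021-08-05T07:15:28-07:00")`) so they can be compared with the parsed timestamps. Only the `Rebalance exited with reason` entries are counted, so the matching `Worker terminated abnormally` entry does not double the total.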
The indexer nodes in the cluster are :
172.23.107.2, 172.23.107.3, 172.23.107.4, 172.23.107.5, 172.23.97.216, 172.23.97.217
On 172.23.97.217, the following can be seen in the indexer logs around the time of the above rebalance failure:
2021-08-05T07:15:28.619-07:00 [Info] ServiceMgr::StartTopologyChange {a710619e06a78fafe47a00bf5001c163 [] topology-change-rebalance [{{74489e779980eda2f0e670ca180abc6d 5 <nil>} recovery-full} {{5fa598444337c8d73f779b6e8bef8a84 5 <nil>} recovery-full} {{f7e7ed8fefd9cd788d594b6dcc4ad22c 5 <nil>} recovery-full} {{732bbd597e5e5f841da2d912f49a0961 5 <nil>} recovery-full} {{db5cedf056b862d55cd091c0d82299d5 5 <nil>} recovery-full} {{4897f2a4f003b2716d736860c60f007b 5 <nil>} recovery-full}] []}
2021-08-05T07:15:28.635-07:00 [Info] ServiceMgr::cleanupOrphanTokens Found Rebalance Token &{74489e779980eda2f0e670ca180abc6d a5:ae:d8:21:c3:7a:a2:11 MoveIndex move index failure - index build is in progress for indexes: [bucket5:idx2_ASLCO4Z36M_idxprefix]. }
2021-08-05T07:15:28.637-07:00 [Info] updator: updating service map. server group=Group 1, indexerVersion=5 nodeAddr 172.23.97.217:8091 clusterVersion 5 excludeNode storageMode 2
2021-08-05T07:15:28.663-07:00 [Error] ServiceMgr::startRebalance Found Existing Global RToken &{74489e779980eda2f0e670ca180abc6d a5:ae:d8:21:c3:7a:a2:11 MoveIndex move index failure - index build is in progress for indexes: [bucket5:idx2_ASLCO4Z36M_idxprefix]. }
2021-08-05T07:15:28.663-07:00 [Info] ServiceMgr::runCleanupPhase path /indexing/rebalance/RebalanceToken isMaster true
2021-08-05T07:15:28.684-07:00 [Info] ServiceMgr::cleanupLocalRToken Cleanup
2021-08-05T07:15:28.684-07:00 [Info] ClustMgr:handleDelLocalValue Key RebalanceToken
2021-08-05T07:15:28.685-07:00 [Info] ServiceMgr::cleanupRebalanceRunning Cleanup
2021-08-05T07:15:28.685-07:00 [Info] ClustMgr:handleDelLocalValue Key RebalanceRunning
2021-08-05T07:15:28.686-07:00 [Info] ServiceMgr::StartTopologyChange returns Error Protocol Conflict Error: Existing Rebalance Token Found. isBalanced false.
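To follow the rebalance token through a longer indexer log, the token-related lines can be pulled out with a short filter. The sketch below is illustrative only: `parse_line`, `token_events` and the keyword list are hypothetical names chosen here, and the line layout (`timestamp [Level] Component::function message`) is inferred from the excerpt above rather than from any indexer documentation.

```python
import re
from typing import NamedTuple, Optional


class IndexerEvent(NamedTuple):
    timestamp: str
    level: str
    component: str
    function: str
    message: str


# Matches e.g. "2021-08-05T07:15:28.663-07:00 [Error] ServiceMgr::startRebalance Found ..."
# Also accepts a single colon ("ClustMgr:handleDelLocalValue") or no function ("updator: ...").
LINE = re.compile(r"^(\S+) \[(\w+)\] (\w+)::?(\w*)\s*(.*)$")

# Spellings of the rebalance token that appear in the excerpt above.
TOKEN_KEYWORDS = ("Rebalance Token", "RToken", "RebalanceToken")


def parse_line(line: str) -> Optional[IndexerEvent]:
    """Parse one indexer log line; return None if it does not match the assumed layout."""
    match = LINE.match(line)
    return IndexerEvent(*match.groups()) if match else None


def token_events(lines):
    """Return the parsed events whose function name or message mentions the rebalance token."""
    events = []
    for line in lines:
        event = parse_line(line)
        if event is None:
            continue
        haystack = event.function + " " + event.message
        if any(keyword in haystack for keyword in TOKEN_KEYWORDS):
            events.append(event)
    return events
```

Run over the excerpt above, this keeps the `cleanupOrphanTokens`, `startRebalance`, `runCleanupPhase`, `cleanupLocalRToken`, `handleDelLocalValue` (RebalanceToken key) and final `StartTopologyChange` lines, and drops the service-map update and the `RebalanceRunning` cleanup.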
This issue is similar to MB-46489, which was fixed in 7.0.0.
Issue Links
- backports to
- MB-47827 [BP 7.0.2 MB-47775] - [System Test] multiple subsequent rebalance failures due to error "Protocol Conflict Error: Existing Rebalance Token Found" (Closed)