Details
Bug
Resolution: Not a Bug
Major
7.6.2
Enterprise Edition 7.6.2 build 3674
Untriaged
0
Unknown
Description
Steps to reproduce
1. Create a 6-node cluster:
172.23.104.107, 172.23.104.235, 172.23.104.241, 172.23.104.250, 172.23.96.183, 172.23.96.197
2. Add a delay at the end of failover:
curl -k https://Administrator:password@localhost:18091/diag/eval -X POST -d 'testconditions:set(failover_end, {delay, 30000})'
3. Set the auto-failover timeout to 20 and max events to 5.
4. Bring down 172.23.104.107; failover starts:
[user:info,2024-05-27T09:31:10.346-07:00,ns_1@172.23.96.197:<0.20145.22>:ns_orchestrator:handle_start_failover:1863]Starting failover of nodes ['ns_1@172.23.104.107'] AllowUnsafe = false Operation Id = 71295d207d1c7738dc921afc3a5a84c9
[ns_server:info,2024-05-27T09:31:10.346-07:00,ns_1@172.23.96.197:<0.23084.262>:failover:pre_failover_config_sync:223]Going to sync with chronicle quorum
[ns_server:info,2024-05-27T09:31:10.565-07:00,ns_1@172.23.96.197:<0.23323.262>:ns_janitor:sanify_chain:670]Setting vbucket 0 in "bucket-0" on 'ns_1@172.23.104.250' from replica to active.
[ns_server:info,2024-05-27T09:31:10.565-07:00,ns_1@172.23.96.197:<0.23323.262>:ns_janitor:sanify_chain:670]Setting vbucket 1 in "bucket-0" on
5. While the current failover is still in progress, bring down 3 more nodes simultaneously: 172.23.104.235, 172.23.104.241, 172.23.96.183.
6. .107 is auto-failed over and, as expected, the server reports the following reason for the subsequent failover not being triggered:
Could not automatically fail over nodes (['ns_1@172.23.96.183', 'ns_1@172.23.104.241', 'ns_1@172.23.104.235']). Could not contact majority of servers. Orchestration may be compromised.
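For anyone re-running this, the setup in steps 2 and 3 could be scripted roughly as below. This is only a sketch: the step 3 settings are assumed to map to the `enabled`, `timeout` and `maxCount` form fields of `POST /settings/autoFailover` (the report only says "timeout" and "max Events"), and the helper names are made up for illustration. The Erlang expression is the one from step 2.

```python
from urllib.parse import urlencode


def autofailover_form(timeout, max_count, enabled=True):
    """Build a form body for POST /settings/autoFailover.

    Assumption: 'timeout' and 'maxCount' are the REST field names for the
    "timeout" and "max Events" values mentioned in step 3.
    """
    return urlencode({
        "enabled": "true" if enabled else "false",
        "timeout": timeout,
        "maxCount": max_count,
    })


def failover_delay_expr(delay):
    """Return the /diag/eval expression from step 2 for a given delay value."""
    return "testconditions:set(failover_end, {delay, %d})" % delay


if __name__ == "__main__":
    # Values used in this report: timeout 20, max events 5, delay 30000.
    print(autofailover_form(20, 5))
    print(failover_delay_expr(30000))
```

The two strings can then be sent with curl `-d`, in the same way as the step 2 command.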
The cluster is stuck in a state where rebalance keeps getting triggered.
/pools/default/tasks on the remaining nodes shows a rebalance running, and its rebalance ID keeps changing:
[{"statusId":"3afbc40a79e60b03651a7f1e815bec09","type":"rebalance","subtype":"failover","recommendedRefreshPeriod":0.25,"status":"running","progress":0,"perNode":{},"detailedProgress":{},"stageInfo":{},"rebalanceId":"c0689618a0408efda88a456a608e5ffd","nodesInfo":{"active_nodes":["ns_1@172.23.104.235","ns_1@172.23.104.241","ns_1@172.23.104.250","ns_1@172.23.96.183","ns_1@172.23.96.197"],"failover_nodes":["ns_1@172.23.96.183","ns_1@172.23.104.241","ns_1@172.23.104.235"],"master_node":"ns_1@172.23.96.197"},"masterNode":"ns_1@172.23.96.197"}]
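One way to confirm the "rebalance ID keeps changing" observation is to collect several /pools/default/tasks responses and count the distinct `rebalanceId` values among running rebalance tasks. A minimal sketch, assuming each snapshot is the already-parsed JSON list shown above; the function names and the second ID in the example are hypothetical.

```python
def rebalance_ids(snapshots):
    """Return distinct rebalanceId values, in first-seen order.

    Each snapshot is one parsed /pools/default/tasks response (a list of
    task dicts). Only tasks with type "rebalance" and status "running"
    are considered.
    """
    seen = []
    for tasks in snapshots:
        for task in tasks:
            if task.get("type") != "rebalance" or task.get("status") != "running":
                continue
            rid = task.get("rebalanceId")
            if rid is not None and rid not in seen:
                seen.append(rid)
    return seen


def is_churning(snapshots):
    """True if more than one rebalanceId shows up across the snapshots."""
    return len(rebalance_ids(snapshots)) > 1


if __name__ == "__main__":
    first = [{"type": "rebalance", "subtype": "failover", "status": "running",
              "rebalanceId": "c0689618a0408efda88a456a608e5ffd"}]
    # Hypothetical later poll with a different ID, for illustration only.
    later = [{"type": "rebalance", "subtype": "failover", "status": "running",
              "rebalanceId": "hypothetical-second-id"}]
    print(is_churning([first, first]))
    print(is_churning([first, later]))
```

A single long-running failover would keep one ID across polls; several IDs within a short window would match the retry loop described here.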
[chronicle:info,2024-05-27T10:16:56.380-07:00,ns_1@172.23.96.197:chronicle_leader<0.18979.22>:chronicle_leader:handle_election_result:698]Election failed: {error,{no_quorum,['ns_1@172.23.104.250', 'ns_1@172.23.96.197'], {6,'ns_1@172.23.96.197'}}}
[chronicle:info,2024-05-27T10:16:56.695-07:00,ns_1@172.23.96.197:<0.23731.267>:chronicle_leader:do_election_worker:892]Starting election. History ID: <<"6b34e0cd8bb9a15d550f1b01bf2a0b53">> Log position: {{6,'ns_1@172.23.96.197'},11917} Peers: ['ns_1@172.23.104.250','ns_1@172.23.104.241','ns_1@172.23.96.183', 'ns_1@172.23.104.235','ns_1@172.23.96.197'] Required quorum: {majority,{set,5,16,16,8,80,48, {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[], []}, {{['ns_1@172.23.104.250'], ['ns_1@172.23.104.241'], [], ['ns_1@172.23.96.183'], [], ['ns_1@172.23.104.235'], [], ['ns_1@172.23.96.197'], [],[],[],[],[],[],[],[]}}}}
[chronicle:info,2024-05-27T10:16:56.703-07:00,ns_1@172.23.96.197:chronicle_leader<0.18979.22>:chronicle_leader:handle_election_result:698]Election failed: {error,{no_quorum,['ns_1@172.23.104.250', 'ns_1@172.23.96.197'], {6,'ns_1@172.23.96.197'}}}
[ns_server:error,2024-05-27T10:16:58.126-07:00,ns_1@172.23.96.197:<0.20838.267>:leader_activities:report_error:944]Activity {default,failover} failed with error {no_quorum, [{required_quorum,majority}, {leases, ['ns_1@172.23.104.250', 'ns_1@172.23.96.197']}]}
[ns_server:info,2024-05-27T10:16:58.128-07:00,ns_1@172.23.96.197:leader_registry<0.19516.22>:leader_registry:handle_down:286]Process <0.23361.267> registered as 'ns_rebalance_observer' terminated.
[chronicle:info,2024-05-27T10:16:58.413-07:00,ns_1@172.23.96.197:<0.23020.267>:chronicle_leader:do_election_worker:892]Starting election. History ID: <<"6b34e0cd8bb9a15d550f1b01bf2a0b53">> Log position: {{6,'ns_1@172.23.96.197'},11917} Peers: ['ns_1@172.23.104.250','ns_1@172.23.104.241','ns_1@172.23.96.183', 'ns_1@172.23.104.235','ns_1@172.23.96.197'] Required quorum: {majority,{set,5,16,16,8,80,48, {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
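As I read the logs, the no_quorum errors come down to simple counting: after .107 is failed over, the election lists 5 peers, a majority of 5 is 3, and only .250 and .197 are reachable. A small sketch of that arithmetic, using the peer and lease lists from the log lines above; `has_majority` is a hypothetical helper, not chronicle's actual implementation.

```python
def has_majority(peers, reachable):
    """Return (ok, needed, alive) for a simple-majority quorum check.

    needed is floor(n / 2) + 1 over the distinct peers; alive counts the
    peers that also appear in the reachable list.
    """
    peer_set = set(peers)
    needed = len(peer_set) // 2 + 1
    alive = len(peer_set & set(reachable))
    return alive >= needed, needed, alive


if __name__ == "__main__":
    # Peers as listed in the "Starting election" log entry.
    peers = [
        "ns_1@172.23.104.250", "ns_1@172.23.104.241", "ns_1@172.23.96.183",
        "ns_1@172.23.104.235", "ns_1@172.23.96.197",
    ]
    # Nodes listed in the no_quorum / leases error.
    reachable = ["ns_1@172.23.104.250", "ns_1@172.23.96.197"]
    print(has_majority(peers, reachable))  # (False, 3, 2)
```

With 2 of 5 peers reachable, neither the leader election nor the failover activity can obtain a majority, which is consistent with the repeated "Election failed" and "Activity {default,failover} failed" entries.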
Also, when I bring the nodes back up, all 3 of them get failed over.