Details
Bug
Resolution: Unresolved
Major
7.6.0
Untriaged
0
Unknown
Description
This happens intermittently.
Steps to reproduce:
1. Create a 4-node cluster:
172.23.105.191
172.23.104.235
172.23.104.234
172.23.104.236
2. Stop the Couchbase service on 3 of the nodes:
172.23.104.234
172.23.104.236
172.23.105.191
3. Wait for the auto-failover timeout. As expected, auto-failover (AFO) is not triggered:
[ns_server:info,2024-03-10T23:56:21.204-07:00,ns_1@172.23.104.235:leader_registry<0.11245.200>:leader_registry:handle_down:286]Process <0.32271.200> registered as 'ns_rebalance_observer' terminated.
[user:info,2024-03-10T23:56:21.204-07:00,ns_1@172.23.104.235:<0.26000.200>:auto_failover:report_failover_error:710]Could not automatically fail over nodes (['ns_1@172.23.104.234']). Could not contact majority of servers. Orchestration may be compromised.
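The "Could not contact majority of servers" message is consistent with simple majority arithmetic for this topology: with 3 of 4 nodes stopped, the surviving node sees only itself. The sketch below illustrates that arithmetic only. The `has_majority` helper is hypothetical, and the strict-majority rule is an assumption, not ns_server's actual quorum logic.

```python
def has_majority(total_nodes: int, reachable_nodes: int) -> bool:
    """Return True if reachable_nodes is a strict majority of total_nodes.

    Hypothetical helper; assumes a plain strict-majority rule.
    """
    return reachable_nodes > total_nodes // 2


# Topology from the report: 4 nodes, 3 of them stopped.
cluster = ["172.23.105.191", "172.23.104.235", "172.23.104.234", "172.23.104.236"]
stopped = {"172.23.104.234", "172.23.104.236", "172.23.105.191"}
reachable = [node for node in cluster if node not in stopped]

print(reachable)
print(has_majority(len(cluster), len(reachable)))
```

Under this assumption, a 4-node cluster needs at least 3 reachable nodes, so the single surviving node cannot act on its own.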
4. Hard fail over all 3 stopped nodes from 172.23.104.235 with unsafe failover allowed (allowUnsafe = true):
[ns_server:info,2024-03-10T23:58:06.248-07:00,ns_1@172.23.104.235:leader_registry<0.11245.200>:leader_registry:handle_down:286]Process <0.5797.201> registered as 'ns_rebalance_observer' terminated.
[ns_server:info,2024-03-10T23:58:06.250-07:00,ns_1@172.23.104.235:<0.8161.201>:failover:restore_chronicle_quorum:122]Attempting quorum loss failover of = ['ns_1@172.23.104.234',
    'ns_1@172.23.104.236',
    'ns_1@172.23.105.191']
[ns_server:info,2024-03-10T23:58:06.250-07:00,ns_1@172.23.104.235:<0.25276.200>:chronicle_master:do_handle_call:136]Starting quorum failover with opaque {#Ref<0.1041056013.2727608321.221361>,
    ['ns_1@172.23.104.234',
    'ns_1@172.23.104.236',
    'ns_1@172.23.105.191']}, keeping nodes ['ns_1@172.23.104.235']
[chronicle:info,2024-03-10T23:58:06.252-07:00,ns_1@172.23.104.235:chronicle_leader<0.10670.200>:chronicle_leader:handle_new_history:531]History changed to <<"4d76aed31e0f1fba4206aa868d5a073f">>. Becoming an observer.
[ns_server:error,2024-03-10T23:58:16.252-07:00,ns_1@172.23.104.235:<0.25276.200>:chronicle_master:do_handle_call:142]Unsuccesfull quorum loss failover. (no_leader).
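The gap between the "Starting quorum failover" entry and the no_leader error can be measured straight from the log headers. The sketch below parses the timestamp that follows the `[component:level,` prefix. The `entry_time` helper and the regular expression are illustrative assumptions about the entry layout seen in this ticket, and the sample lines are shortened copies of the entries above.

```python
import re
from datetime import datetime

# Timestamp that follows "[component:level," at the start of an entry.
TS_RE = re.compile(
    r"^\[[^,\]]+,(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{2}:\d{2}),"
)


def entry_time(line: str) -> datetime:
    """Extract the timezone-aware timestamp from one log entry (hypothetical helper)."""
    match = TS_RE.match(line)
    if match is None:
        raise ValueError(f"no timestamp found in: {line[:40]!r}")
    return datetime.fromisoformat(match.group(1))


# Shortened copies of the two chronicle_master entries from this ticket.
start = ("[ns_server:info,2024-03-10T23:58:06.250-07:00,ns_1@172.23.104.235:"
         "<0.25276.200>:chronicle_master:do_handle_call:136]Starting quorum failover")
error = ("[ns_server:error,2024-03-10T23:58:16.252-07:00,ns_1@172.23.104.235:"
         "<0.25276.200>:chronicle_master:do_handle_call:142]Unsuccesfull quorum loss failover. (no_leader).")

elapsed = (entry_time(error) - entry_time(start)).total_seconds()
print(f"{elapsed:.3f} s between start and failure")
```

The two entries are almost exactly 10 seconds apart, which may point to a fixed wait for a chronicle leader expiring. The logs alone do not confirm this.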
172.16.1.176 - Administrator/UI [10/Mar/2024:23:58:16 -0700] "POST /controller/startFailover HTTP/1.1" 500 34 "http://172.23.104.235:8091/ui/index.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36" 13000
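To retry step 4 without the UI, the same POST can be built from a script. The sketch below constructs the request but does not send it. The endpoint path and host come from the access log entry above. The form field names `otpNode` and `allowUnsafe` are assumptions based on the Couchbase failover REST API and do not appear in this ticket, and `build_start_failover_request` is a hypothetical helper. Authentication is omitted.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_start_failover_request(base_url, otp_nodes, allow_unsafe=False):
    """Build (but do not send) a POST to /controller/startFailover.

    Field names otpNode and allowUnsafe are assumed, not taken from the ticket.
    """
    fields = [("otpNode", node) for node in otp_nodes]
    if allow_unsafe:
        fields.append(("allowUnsafe", "true"))
    body = urlencode(fields).encode("ascii")
    return Request(
        f"{base_url}/controller/startFailover",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Nodes from the quorum loss failover attempt in the logs.
req = build_start_failover_request(
    "http://172.23.104.235:8091",
    ["ns_1@172.23.104.234", "ns_1@172.23.104.236", "ns_1@172.23.105.191"],
    allow_unsafe=True,
)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Sending it would require adding credentials and passing `req` to `urllib.request.urlopen`. Since the issue is intermittent, logging the response status on each attempt would help show how often the 500 occurs.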