Details
Type: Improvement
Resolution: Unresolved
Priority: Major
Affects Version: 7.0.0
Environment: CentOS 7 64-bit; CB EE 7.0.0-5085
Votes: 1
Description
Summary:
The preferred way to fail over when multiple nodes are unresponsive is to fail all of them over at once (multi-node failover). The ask here is to force the multi-node failover dialog in the UI when the user attempts to fail over just one of the unresponsive nodes (by clicking Failover next to that server).
(Note that this is not a quorum failover)
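The multi-node failover being asked for can be issued in a single REST call today. A minimal sketch, assuming the hard-failover endpoint `/controller/failOver` accepts a repeated `otpNode` form parameter (the host, credentials, and node names below are placeholders taken from the repro):

```python
# Sketch: fail over both unresponsive nodes in one REST call instead of
# one by one. Assumes /controller/failOver accepts a repeated otpNode
# form parameter; credentials and addresses are placeholders.
from urllib.parse import urlencode

def build_failover_body(otp_nodes):
    """Encode a single request body that fails over all given nodes at once."""
    return urlencode([("otpNode", n) for n in otp_nodes])

body = build_failover_body(["ns_1@172.23.105.217", "ns_1@172.23.105.219"])
print(body)
# The request itself would then be something like:
#   curl -u Administrator:password -X POST \
#        http://172.23.105.215:8091/controller/failOver -d "$body"
```

Repeating `otpNode` rather than issuing two separate POSTs is what makes this a single multi-node failover instead of two sequential single-node failovers.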
Elaborating on the current behaviour with an example:
1. Create a 5-node cluster: .215, .217, .219, .237, .90
2. Load travel-sample with 3 replicas.
3. Stop the server on .217 and .219 to make these 2 nodes unresponsive.
Here is what currently happens when the user attempts to fail them over individually, one by one (instead of failing them over together):
On the UI:
The UI returned no response and appeared to keep processing the failover indefinitely.
Via the REST API:
Returns an unexpected server error.
In ns_server_error.log:
[ns_server:error,2021-05-05T00:26:40.393-07:00,ns_1@172.23.105.215:<0.16783.3>:ns_doctor:wait_statuses_loop:251]Couldn't get statuses for ['ns_1@172.23.105.219']

[ns_server:error,2021-05-05T00:26:40.393-07:00,ns_1@172.23.105.215:<0.16155.3>:menelaus_util:reply_server_error:206]Server error during processing: ["web request failed",
  {path,"/pools/default"},
  {method,'POST'},
  {type,error},
  {what,{badmatch,{error,{timeout,['ns_1@172.23.105.219']}}}},
  {trace,
   [{menelaus_web_pools,do_validate_memory_quota,4,[{file,"src/menelaus_web_pools.erl"},{line,407}]},
    {lists,foldl,3,[{file,"lists.erl"},{line,1263}]},
    {validator,handle,4,[{file,"src/validator.erl"},{line,79}]},
    {menelaus_web_pools,do_handle_pool_settings_post_loop,2,[{file,"src/menelaus_web_pools.erl"},{line,451}]},
    {request_throttler,do_request,3,[{file,"src/request_throttler.erl"},{line,58}]},
    {menelaus_util,handle_request,2,[{file,"src/menelaus_util.erl"},{line,217}]},
    {mochiweb_http,headers,6,[{file,"/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/mochiweb/mochiweb_http.erl"},{line,150}]},
    {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}]

[ns_server:error,2021-05-05T00:27:52.231-07:00,ns_1@172.23.105.215:<0.21777.3>:rebalance:progress:147]Couldn't reach ns_rebalance_observer

[ns_server:error,2021-05-05T00:28:02.609-07:00,ns_1@172.23.105.215:<0.21641.3>:ns_rebalance_observer:generic_get_call:108]Unexpected exception {exit,{noproc,{gen_server,call,[{via,leader_registry,ns_rebalance_observer},get_aggregated_progress,10000]}}}

[ns_server:error,2021-05-05T00:28:02.609-07:00,ns_1@172.23.105.215:<0.21641.3>:rebalance:progress:147]Couldn't reach ns_rebalance_observer

[ns_server:error,2021-05-05T00:28:13.282-07:00,ns_1@172.23.105.215:<0.28029.3>:ns_rebalance_observer:generic_get_call:108]Unexpected exception {exit,{noproc,{gen_server,call,[{via,leader_registry,ns_rebalance_observer},get_aggregated_progress,10000]}}}
|
(Note that failing them over one by one may still work if the bucket didn't have 3 replicas, I think.)
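The replica observation above can be sketched as a simple arithmetic check, assuming (as the note suggests) that a bucket with N replicas needs at least N + 1 live nodes to place one active copy plus each replica. The function name is illustrative, not an ns_server API:

```python
# Illustrative check (not an ns_server API): a bucket with num_replicas
# replicas needs at least num_replicas + 1 nodes (one active copy plus
# each replica), so failing nodes over one by one stops working once
# the remaining node count drops below that.
def enough_nodes_after_failover(total_nodes, nodes_to_fail_over, num_replicas):
    return total_nodes - nodes_to_fail_over >= num_replicas + 1

# The repro: 5 nodes, 3 replicas, 2 nodes failed over -> 3 left, 4 needed.
print(enough_nodes_after_failover(5, 2, 3))  # False
# With only 1 replica the same failover would leave enough nodes.
print(enough_nodes_after_failover(5, 2, 1))  # True
```

Under this assumption, steering the user into the multi-node failover dialog whenever the check fails for a one-node failover would give the behaviour the ticket asks for.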