Details
- Type: Task
- Resolution: Fixed
- Priority: Major
Description
Seen in https://cv.jenkins.couchbase.com/job/ns-server-ns-test/9367
Summary error:

*failed*
in function failover_tests:auto_failover_t/2 (test/failover_tests.erl, line 481)
in call from eunit_test:run_testfun/1 (eunit_test.erl, line 71)
in call from eunit_proc:run_test/1 (eunit_proc.erl, line 531)
in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 356)
in call from eunit_proc:handle_test/2 (eunit_proc.erl, line 514)
in call from eunit_proc:tests_inorder/3 (eunit_proc.erl, line 456)
in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 346)
in call from eunit_proc:run_group/2 (eunit_proc.erl, line 570)
**error:{assertNotEqual,[{module,failover_tests},
                         {line,481},
                         {expression,"proplists : get_value ( failover_complete , Counters )"},
                         {value,undefined}]}
output:<<"">>
|
Detailed logs:

failover_tests:365: -auto_failover_test_/0-fun-6- (Auto failover)...[ns_server:debug,2024-03-01T22:18:59.098Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover:init_logic_state:249]Using auto-failover logic state {state,[],
    [{service_state,kv,nil,false},
     {service_state,n1ql,nil,false},
     {service_state,index,nil,false},
     {service_state,fts,nil,false}],
    -1}
[user:info,2024-03-01T22:18:59.098Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover:enable_auto_failover:448]Enabled auto-failover with timeout 1 and max count 5
[ns_server:debug,2024-03-01T22:18:59.098Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover:update_and_save_auto_failover_state:467]No change in timeout 1
[ns_server:debug,2024-03-01T22:18:59.098Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover:update_and_save_auto_failover_state:475]No change in max count 5
[ns_server:debug,2024-03-01T22:18:59.108Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover:log_down_nodes_reason:435]Node q is considered down. Reason:"The cluster manager did not respond. "
[ns_server:debug,2024-03-01T22:18:59.108Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover:log_down_nodes_reason:435]Node b is considered down. Reason:"The cluster manager did not respond. "
[ns_server:debug,2024-03-01T22:18:59.109Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover:log_down_nodes_reason:435]Node a is considered down. Reason:"The cluster manager did not respond. "
[ns_server:debug,2024-03-01T22:18:59.109Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:130]Transitioned node {c,undefined} state new -> up
[ns_server:debug,2024-03-01T22:18:59.109Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,c,new,up,0}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.110Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:130]Transitioned node {d,undefined} state new -> up
[ns_server:debug,2024-03-01T22:18:59.110Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,d,new,up,0}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.110Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{a,undefined},0,new,undefined}
    ->{node_state,{a,undefined},0,half_down,undefined}
[ns_server:debug,2024-03-01T22:18:59.110Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,a,new,half_down,0}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.110Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{b,undefined},0,new,undefined}
    ->{node_state,{b,undefined},0,half_down,undefined}
[ns_server:debug,2024-03-01T22:18:59.110Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,b,new,half_down,0}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.110Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{q,undefined},0,new,undefined}
    ->{node_state,{q,undefined},0,half_down,undefined}
[ns_server:debug,2024-03-01T22:18:59.110Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,q,new,half_down,0}: {error,badarg}
|
[ns_server:debug,2024-03-01T22:18:59.112Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:process_frame:205]List of candidates changed from [] to [{a,undefined},{b,undefined},{q,undefined}]. Resetting counter
[ns_server:debug,2024-03-01T22:18:59.112Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{a,undefined},0,half_down,undefined}
    ->{node_state,{a,undefined},0,nearly_down,undefined}
[ns_server:debug,2024-03-01T22:18:59.112Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,a,half_down,nearly_down,0}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.112Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{b,undefined},0,half_down,undefined}
    ->{node_state,{b,undefined},0,nearly_down,undefined}
[ns_server:debug,2024-03-01T22:18:59.112Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,b,half_down,nearly_down,0}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.112Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{q,undefined},0,half_down,undefined}
    ->{node_state,{q,undefined},0,nearly_down,undefined}
[ns_server:debug,2024-03-01T22:18:59.112Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,q,half_down,nearly_down,0}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.114Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{a,undefined},0,nearly_down,undefined}
    ->{node_state,{a,undefined},1,nearly_down,undefined}
[ns_server:debug,2024-03-01T22:18:59.114Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,a,nearly_down,nearly_down,1}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.114Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{b,undefined},0,nearly_down,undefined}
    ->{node_state,{b,undefined},1,nearly_down,undefined}
[ns_server:debug,2024-03-01T22:18:59.114Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,b,nearly_down,nearly_down,1}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.114Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{q,undefined},0,nearly_down,undefined}
    ->{node_state,{q,undefined},1,nearly_down,undefined}
[ns_server:debug,2024-03-01T22:18:59.114Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,q,nearly_down,nearly_down,1}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.115Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{a,undefined},1,nearly_down,undefined}
    ->{node_state,{a,undefined},1,failover,undefined}
[ns_server:debug,2024-03-01T22:18:59.115Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,a,nearly_down,failover,1}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.115Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{b,undefined},1,nearly_down,undefined}
    ->{node_state,{b,undefined},1,failover,undefined}
[ns_server:debug,2024-03-01T22:18:59.116Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,b,nearly_down,failover,1}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.116Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:log_master_activity:132]Incremented down state:
    {node_state,{q,undefined},1,nearly_down,undefined}
    ->{node_state,{q,undefined},1,failover,undefined}
[ns_server:debug,2024-03-01T22:18:59.116Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_node_state_change,q,nearly_down,failover,1}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.116Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover_logic:process_frame:233]Decided on following actions: [{failover,[{a,undefined},{b,undefined},{q,undefined}]}]
|
[ns_server:debug,2024-03-01T22:18:59.116Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2366.0>:auto_failover:trigger_autofailover:636]Initiating failover request against orchestrator for [a,b,q]
[ns_server:debug,2024-03-01T22:18:59.193Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2355.0>:failover:start:54]Starting failover with Nodes = [a,b,q], Options = #{allow_unsafe => false,
    auto => true,
    down_nodes => [q,b,a],
    failover_reasons => [{a,"The cluster manager did not respond. "},
                         {b,"The cluster manager did not respond. "},
                         {q,"The cluster manager did not respond. "}]}
[user:info,2024-03-01T22:18:59.194Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:failover:orchestrate:172]Starting failing over [a,b,q]
[ns_server:debug,2024-03-01T22:18:59.194Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2355.0>:failover:start:78]Failover started. Pid = <0.2370.0>
[ns_server:debug,2024-03-01T22:18:59.194Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:master_activity_events:submit_cast:75]Failed to send master activity event {failover,[a,b,q]}: {error,badarg}
[user:info,2024-03-01T22:18:59.194Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2355.0>:ns_orchestrator:handle_start_failover:1861]Starting failover of nodes [a,b,q] AllowUnsafe = false Operation Id = b5db1694a3ab7b3a7b1a52f45889adfb
[ns_server:info,2024-03-01T22:18:59.194Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:failover:pre_failover_config_sync:219]Going to sync with chronicle quorum
[ns_server:debug,2024-03-01T22:18:59.197Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:master_activity_events:submit_cast:75]Failed to send master activity event {bucket_failover_started,"default",[a,b],<0.2370.0>}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.198Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:failover:promote_max_replicas:842]Retrieved the following vbuckets information: [{1,[]}]
[ns_server:debug,2024-03-01T22:18:59.198Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:failover:failover_bucket_prep:611]Original vbucket map: [[a,b,c,d]]
    VBucket map with failover applied: [[c,d,undefined,undefined]]
[ns_server:debug,2024-03-01T22:18:59.199Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:master_activity_events:submit_cast:75]Failed to send master activity event {set_ff_map,"default",undefined}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.201Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:ns_janitor:check_states_match:399]Found states mismatch in bucket "default":
    [{0,
      [c,d,undefined,undefined],
      [{c,active,[{high_prepared_seqno,1},{high_seqno,1}]},
       {d,replica,[{high_prepared_seqno,1},{high_seqno,1}]},
       {undefined,replica,[{high_prepared_seqno,1},{high_seqno,1}]},
       {undefined,replica,[{high_prepared_seqno,1},{high_seqno,1}]}]}]
[ns_server:debug,2024-03-01T22:18:59.201Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:ns_janitor:pull_config:418]Going to pull config
[ns_server:debug,2024-03-01T22:18:59.202Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:master_activity_events:submit_cast:75]Failed to send master activity event {bucket_failover_ended,"default",[a,b],<0.2370.0>}: {error,badarg}
[ns_server:debug,2024-03-01T22:18:59.202Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:failover:update_failover_vbuckets_sets:498]Updating failover_vbuckets for a with [{"default",[0]}]
    Existing vbuckets: []
    New vbuckets: [{"default",[0]}]
[ns_server:debug,2024-03-01T22:18:59.203Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:failover:update_failover_vbuckets_sets:498]Updating failover_vbuckets for b with [{"default",[0]}]
    Existing vbuckets: []
    New vbuckets: [{"default",[0]}]
[ns_server:debug,2024-03-01T22:18:59.203Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:failover:complete_services_failover:644]Failover nodes [a,b,q] from services [query]
[ns_server:debug,2024-03-01T22:18:59.209Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:ns_cluster_membership:service_clear_pending_failover:630]Clear pending failover for service query
[ns_server:debug,2024-03-01T22:18:59.210Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:failover:complete_failover_service:654]Failed over service query on nodes [a,b,q] successfully
[user:info,2024-03-01T22:18:59.210Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:failover:orchestrate:182]Failed over [a,b,q]: ok
[user:info,2024-03-01T22:18:59.210Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:failover:deactivate_nodes:237]Deactivating failed over nodes [a,b,q]
[ns_server:debug,2024-03-01T22:18:59.210Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:ns_cluster_membership:deactivate:227]Deactivate nodes [a,b,q]
[ns_server:debug,2024-03-01T22:18:59.210Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2370.0>:master_activity_events:submit_cast:75]Failed to send master activity event {failover_ended}: {error,badarg}
[user:info,2024-03-01T22:18:59.210Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2355.0>:ns_orchestrator:log_rebalance_completion:1661]Failover completed successfully.
    Rebalance Operation Id = b5db1694a3ab7b3a7b1a52f45889adfb
|
*failed*
in function failover_tests:auto_failover_t/2 (test/failover_tests.erl, line 481)
in call from eunit_test:run_testfun/1 (eunit_test.erl, line 71)
in call from eunit_proc:run_test/1 (eunit_proc.erl, line 531)
in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 356)
in call from eunit_proc:handle_test/2 (eunit_proc.erl, line 514)
in call from eunit_proc:tests_inorder/3 (eunit_proc.erl, line 456)
in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 346)
in call from eunit_proc:run_group/2 (eunit_proc.erl, line 570)
**error:{assertNotEqual,[{module,failover_tests},
                         {line,481},
                         {expression,"proplists : get_value ( failover_complete , Counters )"},
                         {value,undefined}]}
output:<<"">>
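Note that the assertion at failover_tests.erl line 481 reads the failover_complete counter immediately after the failover is triggered, yet the log above shows "Failover completed successfully." arriving slightly later: the counter is recorded asynchronously, so the test races the orchestrator. The merged fix is titled "Wait for failover to complete before checking counters"; a minimal sketch of that pattern follows. This is illustrative only, not the actual patch — the helper name wait_for_failover_complete/2 and the counter-fetching fun passed to it are hypothetical:

```erlang
%% Hedged sketch (not the actual patch): poll until the
%% failover_complete counter appears, instead of asserting right
%% after triggering the failover. FetchCounters is a caller-supplied
%% fun standing in for however the test reads ns_server's counters.
wait_for_failover_complete(0, _FetchCounters) ->
    {error, timeout};
wait_for_failover_complete(Retries, FetchCounters) ->
    Counters = FetchCounters(),
    case proplists:get_value(failover_complete, Counters) of
        undefined ->
            timer:sleep(100),
            wait_for_failover_complete(Retries - 1, FetchCounters);
        Count ->
            {ok, Count}
    end.
```

With a helper along these lines, the test would wait for `{ok, _}` before evaluating the `?assertNotEqual` at line 481, turning the race into a bounded wait.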
|
[ns_server:info,2024-03-01T22:18:59.217Z,test-yRgSvKGDx0djLt4A@b2c327656839:leader_registry<0.2319.0>:leader_registry:handle_down:286]Process <0.2368.0> registered as 'ns_rebalance_observer' terminated.
[ns_server:debug,2024-03-01T22:18:59.217Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_done,a,"cluster_manager_down"}: {error,badarg}
[user:info,2024-03-01T22:18:59.218Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover:log_failover_success:659]Node (a) was automatically failed over. Reason: The cluster manager did not respond.
[ns_server:debug,2024-03-01T22:18:59.224Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_done,b,"cluster_manager_down"}: {error,badarg}
[user:info,2024-03-01T22:18:59.224Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover:log_failover_success:659]Node (b) was automatically failed over. Reason: The cluster manager did not respond.
[ns_server:debug,2024-03-01T22:18:59.224Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:master_activity_events:submit_cast:75]Failed to send master activity event {autofailover_done,q,"cluster_manager_down"}: {error,badarg}
[user:info,2024-03-01T22:18:59.224Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2356.0>:auto_failover:log_failover_success:659]Node (q) was automatically failed over. Reason: The cluster manager did not respond.
[ns_server:info,2024-03-01T22:18:59.229Z,test-yRgSvKGDx0djLt4A@b2c327656839:leader_registry<0.2319.0>:leader_registry:handle_down:286]Process <0.2356.0> registered as 'auto_failover' terminated.
[ns_server:debug,2024-03-01T22:19:00.563Z,test-yRgSvKGDx0djLt4A@b2c327656839:leader_registry<0.2464.0>:leader_registry:handle_new_leader:275]New leader is 'test-yRgSvKGDx0djLt4A@b2c327656839'. Invalidating name cache.
[ns_server:info,2024-03-01T22:19:00.885Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.1614.0>:misc:start_singleton:913]start_singleton(gen_server, start_link, [{via,leader_registry,auto_reprovision},auto_reprovision,[],[]]): started as <0.2478.0> on 'test-yRgSvKGDx0djLt4A@b2c327656839'
[ns_server:info,2024-03-01T22:19:01.227Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.1614.0>:misc:start_singleton:913]start_singleton(gen_statem, start_link, [{via,leader_registry,ns_orchestrator},ns_orchestrator,[],[]]): started as <0.2500.0> on 'test-yRgSvKGDx0djLt4A@b2c327656839'
[ns_server:debug,2024-03-01T22:19:01.234Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.2501.0>:auto_failover:init:223]init auto_failover.
[ns_server:info,2024-03-01T22:19:01.235Z,test-yRgSvKGDx0djLt4A@b2c327656839:<0.1614.0>:misc:start_singleton:913]start_singleton(gen_server, start_link, [{via,leader_registry,auto_failover},auto_failover,[],[]]): started as <0.2501.0> on 'test-yRgSvKGDx0djLt4A@b2c327656839'
|
Attachments
For Gerrit Dashboard: MB-61017

| # | Subject | Branch | Project | Status | CR | V |
|---|---------|--------|---------|--------|----|---|
| 207455,5 | [BP] MB-61017: Wait for failover to complete before checking counters | trinity | ns_server | MERGED | +2 | +1 |
| 207721,2 | MB-61017: Wait for failover to complete before checking counters | master | ns_server | MERGED | +2 | +1 |
| 207818,1 | Merge remote-tracking branch 'couchbase/trinity' | master | ns_server | MERGED | +2 | +1 |