Details
Bug
Resolution: Fixed
Critical
6.0.0
Untriaged
No
CX Sprint 109, CX Sprint 110, CX Sprint 111, CX Sprint 113
Description
Build : 6.0.0-1261
Test : -test tests/analytics/test_analytics_rebalance.yml -scope tests/analytics/scope_analytics_rebalance.yml
Iteration : 3rd iteration
Scale : 40
In the 3rd iteration of the test, rebalance fails when removing one analytics node from the cluster. The same step passed in the first 2 iterations. The diag logs show the following:
Rebalance exited with reason {service_rebalance_failed,cbas,
{rebalance_failed,
}}
The analytics logs have rolled over because KV ops and queries are running constantly. The following is seen in debug.log:
=========================INFO REPORT=========================
{net_kernel,{net_kernel,875,nodedown,'ns_1@172.23.104.96'}}
[ns_server:debug,2018-06-20T07:35:10.061-07:00,ns_1@172.23.104.17:ns_config_log<0.164.0>:ns_config_log:log_common:227]config change:
{local_changes_count,<<"23e661baddf3ef30f1fd83f2604f5282">>} ->
[{'_vclock',[{<<"23e661baddf3ef30f1fd83f2604f5282">>,{50,63696724510}}]}]
[ns_server:debug,2018-06-20T07:35:10.061-07:00,ns_1@172.23.104.17:ns_config_log<0.164.0>:ns_config_log:log_common:227]config change:
{metakv,<<"/cbas/balanced/77429a92bfd53114e7ab2684207ed992">>} ->
[{'_vclock',[{<<"23e661baddf3ef30f1fd83f2604f5282">>,{1,63696724510}}]}|
<<"\"unbalanced\"">>]
[ns_server:debug,2018-06-20T07:35:10.062-07:00,ns_1@172.23.104.17:ns_config_rep<0.7742.0>:ns_config_rep:do_push_keys:330]Replicating some config keys ([{local_changes_count,
<<"23e661baddf3ef30f1fd83f2604f5282">>},
{metakv,
<<"/cbas/balanced/77429a92bfd53114e7ab2684207ed992">>}]..)
[ns_server:debug,2018-06-20T07:35:10.065-07:00,ns_1@172.23.104.17:service_agent-cbas<0.8641.0>:service_agent:cleanup_service:501]Cleaning up stale tasks:
[[{<<"rev">>,<<"OTkw">>},
{<<"id">>,<<"rebalance/77429a92bfd53114e7ab2684207ed992">>},
{<<"type">>,<<"task-rebalance">>},
{<<"status">>,<<"task-failed">>},
{<<"isCancelable">>,true},
{<<"progress">>,0},
{<<"errorMessage">>,
<<"Rebalance 77429a92bfd53114e7ab2684207ed992 failed, see analytics log for details">>},
{<<"extra">>,
{[{<<"cc-ejecting">>,false},
{<<"rebalanceId">>,<<"77429a92bfd53114e7ab2684207ed992">>}]}}]]
[ns_server:error,2018-06-20T07:35:10.067-07:00,ns_1@172.23.104.17:service_agent-cbas<0.8641.0>:service_agent:handle_call:182]Got rebalance-only call {if_rebalance,<21020.11748.58>,unset_rebalancer} that doesn't match rebalancer pid undefined
[ns_server:info,2018-06-20T07:35:10.083-07:00,ns_1@172.23.104.17:<0.12272.19>:diag_handler:log_all_dcp_stats:195]logging dcp stats
[ns_server:info,2018-06-20T07:35:10.084-07:00,ns_1@172.23.104.17:<0.12272.19>:diag_handler:log_all_dcp_stats:199]end of logging dcp stats
[ns_server:debug,2018-06-20T07:35:10.090-07:00,ns_1@172.23.104.17:ns_config_log<0.164.0>:ns_config_log:log_common:227]config change:
counters ->
[{'_vclock',[{<<"935a9b78efd6d77b557b56f820c122ad">>,{24,63696724510}}]},
{rebalance_fail,1},
{rebalance_start,12},
{rebalance_success,11}]
The node that is being removed is 172.23.104.21.
Supportal link (has link to logs): https://supportal.couchbase.com/snapshot/636ba169d816cb0a6b01675521a481fe::0
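For triaging a debug.log excerpt like the one in this description, a small script can pull out the error-level entries and the rebalance counters. The following is only a sketch: the header pattern is inferred from the log lines quoted in this ticket rather than taken from any ns_server specification, and the names `parse_entries`, `rebalance_counters`, and `SAMPLE` are hypothetical helpers introduced for illustration.

```python
import re

# Header layout inferred from the excerpt in this ticket (an assumption):
# [component:level,timestamp,node:process:module:function:line]message
HEADER = re.compile(
    r"^\[(?P<component>\w+):(?P<level>\w+),(?P<timestamp>[^,]+),"
    r"(?P<node>[^:]+):(?P<process>[^:]+):"
    r"(?P<module>\w+):(?P<function>\w+):(?P<line>\d+)\]"
    r"(?P<message>.*)$"
)
# Matches counter tuples such as {rebalance_fail,1}.
COUNTER = re.compile(r"\{(rebalance_\w+),(\d+)\}")


def parse_entries(text):
    """Split log text into entries; lines without a header are treated as
    continuations of the previous entry's message."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            entries.append(match.groupdict())
        elif entries:
            entries[-1]["message"] += "\n" + line
    return entries


def rebalance_counters(text):
    """Return the rebalance_* counters found in the text as a dict of ints."""
    return {name: int(value) for name, value in COUNTER.findall(text)}


# Three entries copied from the debug.log excerpt in this ticket.
SAMPLE = r"""
[ns_server:error,2018-06-20T07:35:10.067-07:00,ns_1@172.23.104.17:service_agent-cbas<0.8641.0>:service_agent:handle_call:182]Got rebalance-only call {if_rebalance,<21020.11748.58>,unset_rebalancer} that doesn't match rebalancer pid undefined
[ns_server:info,2018-06-20T07:35:10.083-07:00,ns_1@172.23.104.17:<0.12272.19>:diag_handler:log_all_dcp_stats:195]logging dcp stats
[ns_server:debug,2018-06-20T07:35:10.090-07:00,ns_1@172.23.104.17:ns_config_log<0.164.0>:ns_config_log:log_common:227]config change:
counters ->
[{'_vclock',[{<<"935a9b78efd6d77b557b56f820c122ad">>,{24,63696724510}}]},
{rebalance_fail,1},
{rebalance_start,12},
{rebalance_success,11}]
"""

if __name__ == "__main__":
    for entry in parse_entries(SAMPLE):
        if entry["level"] == "error":
            print(entry["timestamp"], entry["module"], entry["message"])
    print(rebalance_counters(SAMPLE))
```

On the quoted counters, `rebalance_start` (12) equals `rebalance_success` (11) plus `rebalance_fail` (1), which is consistent with this being the only failed rebalance recorded on that node's counters.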
Attachments
For Gerrit Dashboard: MB-30194
# | Subject | Branch | Project | Status | CR | V
---|---|---|---|---|---|---
96915,5 | MB-30472, MB-30194 Increase timeout on rebalance disconnect | master | asterix-opt | MERGED | +2 | +1