Couchbase Server / MB-30194

[system test] rebalance out of an analytics node failed

Details

    • Untriaged
    • No
    • CX Sprint 109, CX Sprint 110, CX Sprint 111, CX Sprint 113

    Description

      Build : 6.0.0-1261
      Test : -test tests/analytics/test_analytics_rebalance.yml -scope tests/analytics/scope_analytics_rebalance.yml
      Iteration : 3rd iteration
      Scale : 40

      In the 3rd iteration of the test, rebalance failed while removing one analytics node from the cluster. The same step passed in the first 2 iterations. The following is seen in the diag logs:

      Rebalance exited with reason {service_rebalance_failed,cbas,
          {rebalance_failed,
              {service_error,<<"Rebalance 77429a92bfd53114e7ab2684207ed992 failed, see analytics log for details">>}}}
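
The rebalance ID in this message is the handle for correlating the failure across logs (it reappears in the metakv key and the stale task entry further down). A minimal sketch of pulling it out of the failure reason follows; the helper name `extract_rebalance_id` is hypothetical, and the 32-hex-character ID format is an assumption based only on the one ID seen in this report.

```python
import re
from typing import Optional

# Assumption: rebalance IDs are 32 lowercase hex characters, as in this report.
REBALANCE_ID = re.compile(r"Rebalance ([0-9a-f]{32}) failed")


def extract_rebalance_id(reason: str) -> Optional[str]:
    """Return the rebalance ID embedded in a failure reason, or None."""
    match = REBALANCE_ID.search(reason)
    return match.group(1) if match else None


# Failure reason copied from the diag log excerpt in this issue.
reason = (
    '{service_error, <<"Rebalance 77429a92bfd53114e7ab2684207ed992 failed, '
    'see analytics log for details">>}'
)
rebalance_id = extract_rebalance_id(reason)
print(rebalance_id)
```

The extracted ID can then be used as a search string against the analytics logs, provided the relevant entries have not rolled over.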

      The analytics logs have already rolled over because KV ops and queries are running constantly. The following is seen in debug.log:

      =========================INFO REPORT=========================
      {net_kernel,{net_kernel,875,nodedown,'ns_1@172.23.104.96'}}
      [ns_server:debug,2018-06-20T07:35:10.061-07:00,ns_1@172.23.104.17:ns_config_log<0.164.0>:ns_config_log:log_common:227]config change:
      {local_changes_count,<<"23e661baddf3ef30f1fd83f2604f5282">>} ->
      [{'_vclock',[{<<"23e661baddf3ef30f1fd83f2604f5282">>,{50,63696724510}}]}]
      [ns_server:debug,2018-06-20T07:35:10.061-07:00,ns_1@172.23.104.17:ns_config_log<0.164.0>:ns_config_log:log_common:227]config change:
      {metakv,<<"/cbas/balanced/77429a92bfd53114e7ab2684207ed992">>} ->
      [{'_vclock',[{<<"23e661baddf3ef30f1fd83f2604f5282">>,{1,63696724510}}]}|
       <<"\"unbalanced\"">>]
      [ns_server:debug,2018-06-20T07:35:10.062-07:00,ns_1@172.23.104.17:ns_config_rep<0.7742.0>:ns_config_rep:do_push_keys:330]Replicating some config keys ([{local_changes_count,
                                         <<"23e661baddf3ef30f1fd83f2604f5282">>},
                                     {metakv,
                                         <<"/cbas/balanced/77429a92bfd53114e7ab2684207ed992">>}]..)
      [ns_server:debug,2018-06-20T07:35:10.065-07:00,ns_1@172.23.104.17:service_agent-cbas<0.8641.0>:service_agent:cleanup_service:501]Cleaning up stale tasks:
      [[{<<"rev">>,<<"OTkw">>},
        {<<"id">>,<<"rebalance/77429a92bfd53114e7ab2684207ed992">>},
        {<<"type">>,<<"task-rebalance">>},
        {<<"status">>,<<"task-failed">>},
        {<<"isCancelable">>,true},
        {<<"progress">>,0},
        {<<"errorMessage">>,
         <<"Rebalance 77429a92bfd53114e7ab2684207ed992 failed, see analytics log for details">>},
        {<<"extra">>,
         {[{<<"cc-ejecting">>,false},
           {<<"rebalanceId">>,<<"77429a92bfd53114e7ab2684207ed992">>}]}}]]
      [ns_server:error,2018-06-20T07:35:10.067-07:00,ns_1@172.23.104.17:service_agent-cbas<0.8641.0>:service_agent:handle_call:182]Got rebalance-only call {if_rebalance,<21020.11748.58>,unset_rebalancer} that doesn't match rebalancer pid undefined
      [ns_server:info,2018-06-20T07:35:10.083-07:00,ns_1@172.23.104.17:<0.12272.19>:diag_handler:log_all_dcp_stats:195]logging dcp stats
      [ns_server:info,2018-06-20T07:35:10.084-07:00,ns_1@172.23.104.17:<0.12272.19>:diag_handler:log_all_dcp_stats:199]end of logging dcp stats
      [ns_server:debug,2018-06-20T07:35:10.090-07:00,ns_1@172.23.104.17:ns_config_log<0.164.0>:ns_config_log:log_common:227]config change:
      counters ->
      [{'_vclock',[{<<"935a9b78efd6d77b557b56f820c122ad">>,{24,63696724510}}]},
       {rebalance_fail,1},
       {rebalance_start,12},
       {rebalance_success,11}]
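
For triage, it can help to reduce an excerpt like the one above to its error-level entries. The following is a rough sketch, not an ns_server tool: the header pattern is inferred only from the lines quoted in this issue, and the names `HEADER`, `Entry`, and `parse_entries` are hypothetical.

```python
import re
from typing import Iterator, NamedTuple, Optional

# Assumed header layout, inferred from the excerpt in this issue:
# [component:level,timestamp,node:process:module:function:line]message
HEADER = re.compile(
    r"\[(?P<component>\w+):(?P<level>\w+),"
    r"(?P<timestamp>[^,]+),"
    r"(?P<node>[^:]+):"
    r"(?P<process>[^:]*):"
    r"(?P<module>\w+):(?P<function>\w+):(?P<line>\d+)\]"
    r"(?P<message>.*)"
)


class Entry(NamedTuple):
    level: str
    timestamp: str
    node: str
    process: str
    module: str
    function: str
    line: int
    message: str


def parse_entries(text: str) -> Iterator[Entry]:
    """Yield one Entry per header line; non-header lines that follow an
    entry (including report banners) are appended to its message."""
    current: Optional[Entry] = None
    for raw in text.splitlines():
        stripped = raw.strip()
        m = HEADER.match(stripped)
        if m:
            if current is not None:
                yield current
            current = Entry(
                level=m["level"],
                timestamp=m["timestamp"],
                node=m["node"],
                process=m["process"],
                module=m["module"],
                function=m["function"],
                line=int(m["line"]),
                message=m["message"],
            )
        elif current is not None and stripped:
            current = current._replace(message=current.message + "\n" + stripped)
    if current is not None:
        yield current


# Two lines copied from the debug.log excerpt in this issue.
SAMPLE = """\
[ns_server:error,2018-06-20T07:35:10.067-07:00,ns_1@172.23.104.17:service_agent-cbas<0.8641.0>:service_agent:handle_call:182]Got rebalance-only call {if_rebalance,<21020.11748.58>,unset_rebalancer} that doesn't match rebalancer pid undefined
[ns_server:info,2018-06-20T07:35:10.083-07:00,ns_1@172.23.104.17:<0.12272.19>:diag_handler:log_all_dcp_stats:195]logging dcp stats
"""

entries = list(parse_entries(SAMPLE))
errors = [e for e in entries if e.level == "error"]
for e in errors:
    print(e.timestamp, e.node, f"{e.module}:{e.function}:{e.line}", e.message)
```

On the excerpt above, the only error-level entry is the `service_agent:handle_call:182` line, logged on 172.23.104.17 immediately after the stale rebalance task is cleaned up.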
      
      

      The node that is being removed is 172.23.104.21.

      Supportal link (has link to logs): https://supportal.couchbase.com/snapshot/636ba169d816cb0a6b01675521a481fe::0

          People

            Abdullah.Alamoudi Abdullah Alamoudi [X] (Inactive)
            mihir.kamdar Mihir Kamdar (Inactive)