Couchbase Server / MB-58072

[System Test] Rebalance-in of KV node fails with no_stats_for_this_vbucket error

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • None
    • 7.2.1
    • couchbase-bucket
    • Enterprise Edition 7.2.1 build 5882
    • Untriaged
    • Centos 64-bit
    • 0
    • Unknown

    Description

      Script to Repro

      ./sequoia -client 172.23.104.27:2375 -provider file:centos_pine.yml -test tests/integration/7.2/test_7.2.yml -scope tests/integration/7.2/scope_7.2_magma.yml -scale 3 -repeat 0 -log_level 0 -version 7.2.1-5882 -skip_setup=false -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=1209600 -show_topology=true
      

      We have been running longevity tests for almost 7 days. We initially hit the issue described in this comment on MB-57874: https://issues.couchbase.com/browse/MB-57874?focusedId=698741&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-698741

      We then hit another KV rebalance failure in a subsequent rebalance.

      [2023-07-31T04:08:39-07:00, sequoiatools/couchbase-cli:7.1:fe851a] server-add -c 172.23.108.103:8091 --server-add https://172.23.105.107 -u Administrator -p password --server-add-username Administrator --server-add-password password --services data
       
      [2023-07-31T04:08:51-07:00, sequoiatools/couchbase-cli:7.1:5a6558] rebalance -c 172.23.108.103:8091 -u Administrator -p password
      Error occurred on container - sequoiatools/couchbase-cli:7.1:[rebalance -c 172.23.108.103:8091 -u Administrator -p password]
       
      docker logs 5a6558
      docker start 5a6558
       
      *Unable to display progress bar on this os
      ERROR: Rebalance failed. See logs for detailed reason. You can try again.
      

      Rebalance start

      2023-07-31T04:08:45.907-07:00, ns_cluster:3:info:message(ns_1@172.23.105.107) - Node ns_1@172.23.105.107 joined cluster
      2023-07-31T04:08:46.045-07:00, memcached_config_mgr:0:info:message(ns_1@172.23.105.107) - Hot-reloaded memcached.json for config change of the following keys: [<<"scramsha_fallback_salt">>]
      2023-07-31T04:08:52.455-07:00, ns_orchestrator:0:info:message(ns_1@172.23.108.103) - Starting rebalance, KeepNodes = ['ns_1@172.23.104.137','ns_1@172.23.104.155',
                                       'ns_1@172.23.104.157','ns_1@172.23.104.67',
                                       'ns_1@172.23.104.69','ns_1@172.23.104.70',
                                       'ns_1@172.23.105.107','ns_1@172.23.105.111',
                                       'ns_1@172.23.105.168','ns_1@172.23.106.100',
                                       'ns_1@172.23.106.188','ns_1@172.23.108.103',
                                       'ns_1@172.23.120.107','ns_1@172.23.120.245',
                                       'ns_1@172.23.121.117','ns_1@172.23.123.28',
                                       'ns_1@172.23.96.148','ns_1@172.23.96.192',
                                       'ns_1@172.23.96.252','ns_1@172.23.96.253',
                                       'ns_1@172.23.97.119','ns_1@172.23.97.121',
                                       'ns_1@172.23.97.122','ns_1@172.23.97.239',
                                       'ns_1@172.23.99.11','ns_1@172.23.99.20',
                                       'ns_1@172.23.99.21','ns_1@172.23.99.25'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = dce5f69909dffdb90a1c5de0dd4015d9
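
When triaging, it can help to confirm mechanically that the newly added node (ns_1@172.23.105.107) is in KeepNodes and that nothing is being ejected. The following is a minimal Python sketch of that check. The helper name `parse_node_list` is hypothetical, the sample message is a shortened copy of the log line above, and the sketch assumes the node list never contains a `]` character.

```python
import re

# Shortened copy of the "Starting rebalance" message quoted above.
# Only four of the KeepNodes entries are reproduced here.
START_MSG = """Starting rebalance, KeepNodes = ['ns_1@172.23.104.137','ns_1@172.23.104.155',
                                 'ns_1@172.23.105.107','ns_1@172.23.108.103'], EjectNodes = [], Failed over and being ejected nodes = []"""


def parse_node_list(message: str, field: str) -> list[str]:
    """Return the quoted node names listed after '<field> = [' in a log message.

    Hypothetical helper: it relies only on the textual layout seen in this
    ticket, not on any documented ns_server log format.
    """
    match = re.search(rf"{re.escape(field)} = \[(.*?)\]", message, re.DOTALL)
    if match is None:
        return []
    # Node names are single-quoted Erlang atoms such as 'ns_1@172.23.105.107'.
    return re.findall(r"'([^']+)'", match.group(1))


keep_nodes = parse_node_list(START_MSG, "KeepNodes")
eject_nodes = parse_node_list(START_MSG, "EjectNodes")
print(len(keep_nodes), "nodes kept;", len(eject_nodes), "nodes ejected")
print("new node in KeepNodes:", "ns_1@172.23.105.107" in keep_nodes)
```

Run against the full message in this ticket, the same helper would list all 28 KeepNodes entries.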
      

      Rebalance failure

      2023-07-31T09:09:40.338-07:00, ns_vbucket_mover:0:critical:message(ns_1@172.23.108.103) - Worker <0.3117.2773> (for action {move,
                                           {464,
                                            ['ns_1@172.23.97.119',
                                             'ns_1@172.23.99.25'],
                                            ['ns_1@172.23.105.107',
                                             'ns_1@172.23.99.25'],
                                            []}}) exited with reason {unexpected_exit,
                                                                      {'EXIT',
                                                                       <0.4415.2773>,
                                                                       {{dcp_wait_for_data_move_failed,
                                                                         "default",
                                                                         464,
                                                                         'ns_1@172.23.97.119',
                                                                         ['ns_1@172.23.105.107',
                                                                          'ns_1@172.23.99.25'],
                                                                         {error,
                                                                          no_stats_for_this_vbucket}},
                                                                        [{ns_single_vbucket_mover,
                                                                          '-wait_dcp_data_move/5-fun-0-',
                                                                          5,
                                                                          [{file,          
                                                                            "src/ns_single_vbucket_mover.erl"},
                                                                           {line,         
                                                                            451}]},
                                                                         {proc_lib,        
                                                                          init_p,3,
                                                                          [{file,          
                                                                            "proc_lib.erl"},
                                                                           {line,
                                                                            211}]}]}}}
      2023-07-31T09:09:40.401-07:00, ns_orchestrator:0:critical:message(ns_1@172.23.108.103) - Rebalance exited with reason {mover_crashed,
                                    {unexpected_exit,
                                     {'EXIT',<0.4415.2773>,
                                      {{dcp_wait_for_data_move_failed,"default",
                                        464,'ns_1@172.23.97.119',
                                        ['ns_1@172.23.105.107','ns_1@172.23.99.25'],
                                        {error,no_stats_for_this_vbucket}},
                                       [{ns_single_vbucket_mover,
                                         '-wait_dcp_data_move/5-fun-0-',5,
                                         [{file,"src/ns_single_vbucket_mover.erl"},
                                          {line,451}]},
                                        {proc_lib,init_p,3,
                                         [{file,"proc_lib.erl"},{line,211}]}]}}}}.
      Rebalance Operation Id = dce5f69909dffdb90a1c5de0dd4015d9
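
To compare this failure against other occurrences, the key fields of the dcp_wait_for_data_move_failed tuple can be pulled out of the log text, together with the time between the rebalance start and the failure. The Python sketch below does this under stated assumptions: the function names are hypothetical, and the field meanings (bucket, vBucket, source node, node list, error reason) are inferred from the tuple's position-by-position match with the move action logged above, not from ns_server documentation.

```python
import re
from datetime import datetime

# Timestamps copied from the "Starting rebalance" and "Rebalance exited" lines.
START_TS = "2023-07-31T04:08:52.455-07:00"
FAIL_TS = "2023-07-31T09:09:40.338-07:00"

# Failure tuple copied from the ns_orchestrator message above.
FAILURE_MSG = """{{dcp_wait_for_data_move_failed,"default",
  464,'ns_1@172.23.97.119',
  ['ns_1@172.23.105.107','ns_1@172.23.99.25'],
  {error,no_stats_for_this_vbucket}},"""

# Whitespace between elements is optional, so both the compact and the
# pretty-printed variants of the message in this ticket match.
PATTERN = re.compile(
    r'dcp_wait_for_data_move_failed,\s*"(?P<bucket>[^"]+)",\s*'
    r"(?P<vbucket>\d+),\s*'(?P<source>[^']+)',\s*"
    r"\[(?P<nodes>[^\]]*)\],\s*"
    r"\{error,\s*(?P<reason>\w+)\}"
)


def parse_move_failure(message):
    """Extract the failure fields, or return None if the tuple is absent."""
    m = PATTERN.search(message)
    if m is None:
        return None
    return {
        "bucket": m["bucket"],
        "vbucket": int(m["vbucket"]),
        "source": m["source"],  # assumed to be the node the data moves from
        "nodes": re.findall(r"'([^']+)'", m["nodes"]),  # matches the new chain in the move action
        "reason": m["reason"],
    }


def elapsed_seconds(start, end):
    """Seconds between two ISO 8601 timestamps that carry a UTC offset."""
    return (datetime.fromisoformat(end) - datetime.fromisoformat(start)).total_seconds()


print(parse_move_failure(FAILURE_MSG))
print(elapsed_seconds(START_TS, FAIL_TS), "seconds from rebalance start to failure")
```

For the timestamps in this ticket the elapsed time is 18047.883 seconds, so the rebalance ran for roughly five hours before the move of vBucket 464 failed.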
      

      cbcollect_info attached.

People

    Balakumaran Gopal