Couchbase Server / MB-20508

Node remains in pending state after system reboot

Details

    Description

      Observed during longevity testing on build 4.5.1-2806.
      A rebalance-out failed because a node was down, yet that node remains in the cluster as Pending. It appears that some sort of deadlock has occurred.

      The first sign of instability was a net_tick_timeout:

      ]Node 'ns_1@172.23.108.103' saw that node 'ns_1@172.23.108.105' went down. Details: [{nodedown_reason, net_tick_timeout}]
       
      [ns_server:error,2016-08-11T11:38:51.820-07:00,ns_1@172.23.108.103:<0.27261.58>:ns_single_vbucket_mover:spawn_and_wait:131]Got unexpected exit signal {'EXIT',<0.27688.58>,
                                  {{nodedown,'ns_1@172.23.108.105'},
                                   {gen_server,call,
                                    [{'janitor_agent-default','ns_1@172.23.108.105'},
                                     {if_rebalance,<0.10567.58>,
                                      {wait_index_updated,227}},
                                     infinity]}}}
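
To pull node-down events like the one above out of a longer ns_server log, a small parser along these lines may help. This is only a sketch: the function name `find_nodedown_events` and the regular expression are assumptions modeled on the single message quoted in this report, and other builds may format the message differently.

```python
import re

# Pattern modeled on the one message quoted in this report.
# It is a hypothetical reconstruction, not an official log format.
NODEDOWN_RE = re.compile(
    r"Node '(?P<observer>[^']+)' saw that node '(?P<peer>[^']+)' went down\.\s*"
    r"Details: \[\{nodedown_reason,\s*(?P<reason>\w+)\}\]"
)


def find_nodedown_events(log_text):
    """Return (observer, peer, reason) tuples for each node-down message."""
    # Assumption: pasted logs are hard-wrapped mid-token (as in "went do" / "wn"),
    # so a newline plus the indentation after it is dropped before matching.
    # A wrap that falls on a space would lose that space and the line would
    # then not match.
    joined = re.sub(r"\n\s*", "", log_text)
    return [
        (m["observer"], m["peer"], m["reason"])
        for m in NODEDOWN_RE.finditer(joined)
    ]


if __name__ == "__main__":
    excerpt = (
        "]Node 'ns_1@172.23.108.103' saw that node 'ns_1@172.23.108.105' went do\n"
        "      wn. Details: [{nodedown_reason, net_tick_timeout}]"
    )
    for observer, peer, reason in find_nodedown_events(excerpt):
        print(observer, "lost", peer, "because of", reason)
```

Grouping the resulting tuples by peer makes it easier to see whether only .105 dropped out or whether several nodes lost contact at around the same time.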
      

      This led to the rebalance failing with:

      exited with {unexpected_exit,
                                {'EXIT',<0.27873.58>,
                                 {wait_seqno_persisted_failed,"default",225,19562,
                                  [{'ns_1@172.23.108.105',
                                    {'EXIT',
                                     {{nodedown,'ns_1@172.23.108.105'},
      

      Subsequent rebalance attempts fail for the same reason. The down node remains in a yellow Pending state, and the node that was being rebalanced out ('.104') also remains in the cluster in a Pending state.
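
One way to capture the node states described above without relying on the UI is to read them from the cluster REST endpoint `/pools/default` on port 8091. The sketch below assumes the response carries a `nodes` list whose entries have `hostname`, `status`, and `clusterMembership` fields. The helper names and the credentials passed to `fetch_pools_default` are hypothetical, and no claim is made here about which field values the UI renders as "Pending".

```python
import base64
import json
import urllib.request


def fetch_pools_default(base_url, user, password):
    """Fetch /pools/default from a cluster node using HTTP basic auth."""
    req = urllib.request.Request(base_url.rstrip("/") + "/pools/default")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def nodes_needing_attention(pools_default):
    """List (hostname, status, clusterMembership) for nodes that are not
    both healthy and active, according to the parsed /pools/default JSON."""
    flagged = []
    for node in pools_default.get("nodes", []):
        status = node.get("status")
        membership = node.get("clusterMembership")
        if status != "healthy" or membership != "active":
            flagged.append((node.get("hostname"), status, membership))
    return flagged


if __name__ == "__main__":
    # Hypothetical credentials; substitute real ones for the cluster under test.
    info = fetch_pools_default("http://172.23.108.103:8091", "Administrator", "password")
    for hostname, status, membership in nodes_needing_attention(info):
        print(hostname, status, membership)
```

Recording this output before and after each rebalance attempt would show whether .104 and .105 ever change state between retries.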

      Mcd trace on .105:
      https://s3.amazonaws.com/scalability-mcafee/nodedown/mcd_trace.txt

      Cluster is still live:
      http://172.23.108.103:8091

            People

              tommie Tommie McAfee (Inactive)