Couchbase Server / MB-21976

Node is in a broken state after rebalance out ("Node is unknown to this cluster")

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 5.0.0
    • 4.6.0
    • couchbase-bucket
    • None
    • 4.6.0-3552
    • Untriaged
    • No

    Description

      http://qa.sc.couchbase.com/job/centos-systest-launcher/571/console  (-test tests/fts/test_ftsScaleUpDown.yml  -scope tests/fts/scope_partialInitNodes.yml)

      When node 172.23.108.98 was rebalanced out, it ended up in a broken state and I can't open its Web Console.

       

      Node 'ns_1@172.23.108.98' is leaving cluster. ns_cluster 001 ns_1@172.23.108.98 9:51:05 AM Wed Dec 14, 2016
      Shutting down bucket "default" on 'ns_1@172.23.108.98' for deletion ns_memcached 000 ns_1@172.23.108.98 9:51:02 AM Wed Dec 14, 2016
      Reset auto-failover count auto_failover 000 ns_1@172.23.108.103 9:51:00 AM Wed Dec 14, 2016
      Rebalance completed successfully. ns_orchestrator 001 ns_1@172.23.108.103 9:51:00 AM Wed Dec 14, 2016
      Bucket "default" loaded on node 'ns_1@172.23.108.100' in 0 seconds. ns_memcached 000 ns_1@172.23.108.100 9:44:37 AM Wed Dec 14, 2016
      Bucket "default" loaded on node 'ns_1@172.23.108.96' in 0 seconds. ns_memcached 000 ns_1@172.23.108.96 9:44:37 AM Wed Dec 14, 2016
      Shutting down bucket "other-1" on 'ns_1@172.23.108.98' for deletion ns_memcached 000 ns_1@172.23.108.98 9:44:11 AM Wed Dec 14, 2016
      Bucket "default" rebalance does not seem to be swap rebalance ns_vbucket_mover 000 ns_1@172.23.108.103 9:44:06 AM Wed Dec 14, 2016
      Started rebalancing bucket default ns_rebalancer 000 ns_1@172.23.108.103 9:44:05 AM Wed Dec 14, 2016
      Bucket "other-1" loaded on node 'ns_1@172.23.108.100' in 0 seconds. ns_memcached 000 ns_1@172.23.108.100 9:40:29 AM Wed Dec 14, 2016
      Bucket "other-1" loaded on node 'ns_1@172.23.108.96' in 0 seconds. ns_memcached 000 ns_1@172.23.108.96 9:40:29 AM Wed Dec 14, 2016
      Shutting down bucket "other-2" on 'ns_1@172.23.108.98' for deletion ns_memcached 000 ns_1@172.23.108.98 9:40:04 AM Wed Dec 14, 2016
      Bucket "other-1" rebalance does not seem to be swap rebalance ns_vbucket_mover 000 ns_1@172.23.108.103 9:39:59 AM Wed Dec 14, 2016
      Started rebalancing bucket other-1 ns_rebalancer 000 ns_1@172.23.108.103 9:39:58 AM Wed Dec 14, 2016
      Bucket "other-2" loaded on node 'ns_1@172.23.108.96' in 0 seconds. ns_memcached 000 ns_1@172.23.108.96 9:35:23 AM Wed Dec 14, 2016
      Bucket "other-2" loaded on node 'ns_1@172.23.108.100' in 0 seconds. ns_memcached 000 ns_1@172.23.108.100 9:35:23 AM Wed Dec 14, 2016
      Node ns_1@172.23.108.100 joined cluster ns_cluster 003 ns_1@172.23.108.100 9:35:14 AM Wed Dec 14, 2016
      Couchbase Server has started on web port 8091 on node 'ns_1@172.23.108.100'. Version: "4.6.0-3552-enterprise". menelaus_sup 001 ns_1@172.23.108.100 9:35:14 AM Wed Dec 14, 2016
      Node 'ns_1@172.23.108.108' saw that node 'ns_1@172.23.108.100' came up. Tags: [] ns_node_disco 004 ns_1@172.23.108.108 9:35:10 AM Wed Dec 14, 2016
      Node 'ns_1@172.23.108.97' saw that node 'ns_1@172.23.108.100' came up. Tags: [] ns_node_disco 004 ns_1@172.23.108.97 9:35:10 AM Wed Dec 14, 2016
      Node 'ns_1@172.23.108.107' saw that node 'ns_1@172.23.108.100' came up. Tags: [] ns_node_disco 004 ns_1@172.23.108.107 9:35:10 AM Wed Dec 14, 2016
      Node 'ns_1@172.23.108.96' saw that node 'ns_1@172.23.108.100' came up. Tags: [] ns_node_disco 004 ns_1@172.23.108.96 9:35:10 AM Wed Dec 14, 2016
      Node ns_1@172.23.108.96 joined cluster ns_cluster 003 ns_1@172.23.108.96 9:35:01 AM Wed Dec 14, 2016
      Couchbase Server has started on web port 8091 on node 'ns_1@172.23.108.96'. Version: "4.6.0-3552-enterprise". menelaus_sup 001 ns_1@172.23.108.96 9:35:01 AM Wed Dec 14, 2016
      Node 'ns_1@172.23.108.97' saw that node 'ns_1@172.23.108.96' came up. Tags: [] ns_node_disco 004 ns_1@172.23.108.97 9:34:57 AM Wed Dec 14, 2016
      Node 'ns_1@172.23.108.108' saw that node 'ns_1@172.23.108.96' came up. Tags: [] ns_node_disco 004 ns_1@172.23.108.108 9:34:57 AM Wed Dec 14, 2016
      Node 'ns_1@172.23.108.107' saw that node 'ns_1@172.23.108.96' came up. Tags: [] ns_node_disco 004 ns_1@172.23.108.107 9:34:57 AM Wed Dec 14, 2016
      Bucket "other-2" rebalance does not seem to be swap rebalance ns_vbucket_mover 000 ns_1@172.23.108.103 9:34:53 AM Wed Dec 14, 2016
      Started rebalancing bucket other-2 ns_rebalancer 000 ns_1@172.23.108.103 9:34:51 AM Wed Dec 14, 2016
      Starting rebalance, KeepNodes = ['ns_1@172.23.108.100','ns_1@172.23.108.103',
      'ns_1@172.23.108.104','ns_1@172.23.108.107',
      'ns_1@172.23.108.108','ns_1@172.23.108.96',
      'ns_1@172.23.108.97'], EjectNodes = ['ns_1@172.23.108.98'], Failed over and being ejected nodes = []; no delta recovery nodes
      ns_orchestrator 004 ns_1@172.23.108.103 9:34:51 AM Wed Dec 14, 2016
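      The UI log above is listed newest-first, which makes the sequence on the ejected node hard to follow. A minimal sketch for putting such entries into chronological order, assuming each entry is a single line of the form `<message> <module> <code> <node> <time> <date>` as in the paste above. The names `parse_event` and `timeline` are hypothetical helpers, not part of any Couchbase tool, and entries whose message spans several lines (such as the "Starting rebalance" entry) are simply skipped:

```python
import re
from datetime import datetime

# Pattern inferred from the pasted UI log lines; this layout is an assumption.
EVENT_RE = re.compile(
    r"^(?P<message>.*?)\s+"
    r"(?P<module>[a-z_]+) (?P<code>\d{3}) "
    r"(?P<node>ns_1@[\d.]+) "
    r"(?P<time>\d{1,2}:\d{2}:\d{2} [AP]M) "
    r"[A-Z][a-z]{2} (?P<date>[A-Z][a-z]{2} \d{1,2}, \d{4})$"
)


def parse_event(line):
    """Parse one single-line UI log entry; return None if it does not match."""
    m = EVENT_RE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(
        f"{m['date']} {m['time']}", "%b %d, %Y %I:%M:%S %p"
    )
    return {
        "when": when,
        "node": m["node"],
        "module": m["module"],
        "message": m["message"],
    }


def timeline(lines, node=None):
    """Return parsed events oldest-first, optionally restricted to one node."""
    events = [e for e in map(parse_event, lines) if e is not None]
    if node is not None:
        events = [e for e in events if e["node"] == node]
    return sorted(events, key=lambda e: e["when"])


# Three entries copied from the log above.
SAMPLE_LINES = """\
Node 'ns_1@172.23.108.98' is leaving cluster. ns_cluster 001 ns_1@172.23.108.98 9:51:05 AM Wed Dec 14, 2016
Shutting down bucket "default" on 'ns_1@172.23.108.98' for deletion ns_memcached 000 ns_1@172.23.108.98 9:51:02 AM Wed Dec 14, 2016
Reset auto-failover count auto_failover 000 ns_1@172.23.108.103 9:51:00 AM Wed Dec 14, 2016
""".splitlines()

if __name__ == "__main__":
    for event in timeline(SAMPLE_LINES, node="ns_1@172.23.108.98"):
        print(event["when"].time(), event["module"], event["message"])
```

      Filtering on `ns_1@172.23.108.98` keeps only the entries reported by the ejected node itself.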
       

       

       

      curl -u Administrator:password 172.23.108.98:8091/nodes/self
      "Node is unknown to this cluster."

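      The same check can be scripted when several nodes need to be probed after a rebalance. A minimal sketch using only the Python standard library, assuming the endpoint, port and credentials shown in the curl call above. `fetch_nodes_self` and `reports_unknown_node` are hypothetical helper names, and the HTTP status returned alongside the message was not recorded here, so the sketch inspects only the response body:

```python
import base64
import json
import urllib.error
import urllib.request

# Message text taken from the curl output above.
UNKNOWN_NODE_MESSAGE = "Node is unknown to this cluster."


def fetch_nodes_self(host, user, password, port=8091, timeout=10):
    """Return the raw body of GET /nodes/self, regardless of HTTP status."""
    request = urllib.request.Request(f"http://{host}:{port}/nodes/self")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Error responses still carry a body; return it for inspection.
        return err.read().decode("utf-8", errors="replace")


def reports_unknown_node(body):
    """True if the body is the 'unknown node' message, JSON-quoted or bare."""
    text = body.strip()
    try:
        decoded = json.loads(text)
    except ValueError:
        decoded = text
    return decoded == UNKNOWN_NODE_MESSAGE


if __name__ == "__main__":
    body = fetch_nodes_self("172.23.108.98", "Administrator", "password")
    print("broken" if reports_unknown_node(body) else "responding normally")
```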
       

       

       
      Errors in the ns_server log on 172.23.108.98 around bucket deletion:
      [ns_server:error,2016-12-14T09:43:33.885-08:00,ns_1@172.23.108.98:ns_doctor<0.2071.0>:ns_doctor:update_status:308]The following buckets became not ready on node 'ns_1@172.23.108.97': ["other-1"], those of them are active ["other-1"]
      [ns_server:error,2016-12-14T09:51:31.979-08:00,ns_1@172.23.108.98:<0.8471.3>:timeout_diag_logger:do_diag:118]Got timeout {slow_bucket_stop,{{single_bucket_kv_sup,"default"},
      <0.3984.1>,supervisor,
      [single_bucket_kv_sup]}}
      Processes snapshot is:
       
      [ns_server:error,2016-12-14T09:51:31.979-08:00,ns_1@172.23.108.98:<0.8471.3>:timeout_diag_logger:do_diag:120]
      {<0.8334.3>,
      {'EXIT',
      {function_clause,
      [{proplists,get_value,
      [backtrace,undefined,undefined],
      [{file,"proplists.erl"},{line,225}]},
      {diag_handler,grab_process_info,1,
      [{file,"src/diag_handler.erl"},{line,139}]},
      {timeout_diag_logger,'do_diag/1-fun-0',2,
      [{file,"src/timeout_diag_logger.erl"},{line,116}]},
      {lists,foldl,3,[{file,"lists.erl"},{line,1248}]},
      {timeout_diag_logger,do_diag,1,
      [{file,"src/timeout_diag_logger.erl"},{line,115}]},
      {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,224}]}]}}}
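      For triage it can help to reduce a trace like this to its file and line locations. A rough sketch, assuming frames carry `{file,"..."},{line,N}` pairs as in the paste above. `frame_locations` is a hypothetical helper, and the pattern tolerates an optional backslash before each `{`, since some exports of this ticket escape the braces:

```python
import re

# Matches {file,"name"},{line,N}, with an optional backslash before each brace.
LOCATION_RE = re.compile(r'\\?\{file,"([^"]+)"\},\s*\\?\{line,(\d+)\}')


def frame_locations(trace_text):
    """Return (file, line) tuples in the order they appear, innermost first."""
    return [(name, int(line)) for name, line in LOCATION_RE.findall(trace_text)]


# Stack frames copied from the trace above.
TRACE = """
{function_clause,
[{proplists,get_value,
[backtrace,undefined,undefined],
[{file,"proplists.erl"},{line,225}]},
{diag_handler,grab_process_info,1,
[{file,"src/diag_handler.erl"},{line,139}]},
{timeout_diag_logger,'do_diag/1-fun-0',2,
[{file,"src/timeout_diag_logger.erl"},{line,116}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},
{timeout_diag_logger,do_diag,1,
[{file,"src/timeout_diag_logger.erl"},{line,115}]},
{proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,224}]}]}
"""

if __name__ == "__main__":
    for name, line in frame_locations(TRACE):
        print(f"{name}:{line}")
```

      The first location is where the `function_clause` error was raised; the remaining ones are its callers.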

       

      Attachments

        1. 172.23.108.100.zip
          10.19 MB
        2. 172.23.108.103.zip
          18.90 MB
        3. 172.23.108.104.zip
          6.90 MB
        4. 172.23.108.105.zip
          816 kB
        5. 172.23.108.107.zip
          2.01 MB
        6. 172.23.108.108.zip
          1.83 MB
        7. 172.23.108.96.zip
          9.42 MB
        8. 172.23.108.97.zip
          18.25 MB
        9. 172.23.108.98.zip
          19.16 MB
        10. 172.23.108.99.zip
          858 kB


            People

              Assignee: Daniel Owen (owend)
              Reporter: Andrei Baranouski (andreibaranouski)
              Votes: 0
              Watchers: 10
