Couchbase Server / MB-46233

Newly added 7.0 node to 6.6.2 cluster in UI looks like it's part of the cluster even before rebalance


Details

    • Untriaged
    • Centos 64-bit
    • 1
    • Yes

    Description

      Scripts to Repro
      1. Run the 6.6.2 longevity test for 3 days.

      ./sequoia -client 172.23.96.162:2375 -provider file:centos_third_cluster.yml -test tests/integration/test_allFeatures_madhatter_durability.yml -scope tests/integration/scope_Xattrs_Madhatter.yml -scale 3 -repeat 0 -log_level 0 -version 6.6.2-9588 -skip_setup=false -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true
      

      2. The cluster had 27 nodes at the end of the test. See
      3. Added six 7.0.0 nodes (172.23.105.102, 172.23.105.62, 172.23.106.232, 172.23.106.239, 172.23.106.37, 172.23.106.246) and removed six 6.6.2 nodes (172.23.110.75, 172.23.110.76, 172.23.105.61, 172.23.106.191, 172.23.106.209, 172.23.106.70)
      to do a swap rebalance of all the services (one of each kind).

      I see two problems and am not sure whether they are related. We can split up the bug if needed.

      Problem 1
      As soon as node 172.23.105.102 is added to the cluster, it becomes the master node for some reason. See . This is not a problem in itself. However, on the Servers page a customer cannot tell that this node has only been added and is not yet part of the cluster, since we haven't rebalanced yet.

      See

      I was searching for the newly added node and wondering what had happened, then I realised that the message New Node | Not taking traffic | Add pending rebalance is not displayed next to 172.23.105.102. This can cause a lot of confusion, especially when upgrading a cluster as large as this one.
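      As a cross-check that does not depend on the UI, the membership state of each node can be read from the cluster details returned by the REST API (GET /pools/default), where, to the best of my knowledge, a node that has been added but not yet rebalanced in is reported with clusterMembership set to inactiveAdded. The sketch below only filters an already parsed response. The helper name and the sample payload are hypothetical and are not taken from this cluster.

```python
def nodes_pending_rebalance(pools_default):
    """Return hostnames of nodes that were added but not yet rebalanced in.

    `pools_default` is the parsed JSON body of GET /pools/default.
    Assumption: such nodes carry clusterMembership == "inactiveAdded".
    """
    return [
        node["hostname"]
        for node in pools_default.get("nodes", [])
        if node.get("clusterMembership") == "inactiveAdded"
    ]


# Hypothetical, heavily trimmed payload for illustration only.
sample = {
    "nodes": [
        {"hostname": "172.23.105.102:8091", "clusterMembership": "inactiveAdded"},
        {"hostname": "172.23.110.75:8091", "clusterMembership": "active"},
    ]
}

print(nodes_pending_rebalance(sample))
```

      If the REST response lists 172.23.105.102 as pending while the Servers page does not show the pending-rebalance message, the problem is likely confined to the UI.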

      Problem 2
      Not sure whether this is related to the one above. As step 3 shows, this is a swap rebalance. So ideally, when I check the UI logs, I should see 6 nodes in the EjectNodes list and all 33 nodes (27 + 6 newly added) in the KeepNodes list. I see 33 nodes in KeepNodes but not the 6 in EjectNodes; EjectNodes is an empty list.
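      The expectation above can be written down as a small check. This is only a sketch of the arithmetic in this report, not of how the server builds the lists. The function names are hypothetical, and the 21 nodes that stay in the cluster are represented by placeholder names because their addresses are not listed here.

```python
ADDED = [
    "172.23.105.102", "172.23.105.62", "172.23.106.232",
    "172.23.106.239", "172.23.106.37", "172.23.106.246",
]
REMOVED = [
    "172.23.110.75", "172.23.110.76", "172.23.105.61",
    "172.23.106.191", "172.23.106.209", "172.23.106.70",
]


def expected_rebalance_lists(existing, added, removed):
    """Return (keep, eject) as this report expects them for a swap rebalance."""
    keep = sorted(set(existing) | set(added))
    eject = sorted(set(removed))
    return keep, eject


def check_observed(observed_keep, observed_eject, expected_keep, expected_eject):
    """Return a list of human-readable mismatches (empty if all matches)."""
    problems = []
    if set(observed_keep) != set(expected_keep):
        problems.append("KeepNodes differs from the expected list")
    missing = sorted(set(expected_eject) - set(observed_eject))
    if missing:
        problems.append(f"{len(missing)} node(s) missing from EjectNodes: {missing}")
    return problems


# Placeholder names stand in for the 21 nodes whose addresses are not in this report.
existing = REMOVED + [f"placeholder-node-{i}" for i in range(21)]
keep, eject = expected_rebalance_lists(existing, ADDED, REMOVED)
print(len(keep), len(eject))

# Observed in the logs: KeepNodes has all 33 nodes, EjectNodes is empty.
for problem in check_observed(keep, [], keep, eject):
    print(problem)
```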

      Servers page before rebalance -
      Logs post rebalance (Operation Id = 1ed7c9c9575434e42719ba91ea5ccac2) - See

      cbcollect_info attached.


          People

            Balakumaran.Gopal Balakumaran Gopal