  1. Couchbase Server
  2. MB-48448

[System Test] - Rebalance button is enabled post upgrade from 6.6.3 -> 7.0.2

Details

    • Triaged
    • 1
    • No

    Description

      Steps to Repro
      1. Run the following longevity script on 6.6.3 for 5 days.

      ./sequoia -client 172.23.104.254:2375 -provider file:centos_second_cluster.yml -test tests/integration/test_allFeatures_madhatter_durability.yml -scope tests/integration/scope_Xattrs_Madhatter.yml -scale 3 -repeat 0 -log_level 0 -version 6.6.3-9808 -skip_setup=true -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true
      

      At this point the cluster should have 27 nodes (9 KV, 6 index, 3 analytics, 3 FTS, 3 eventing and 3 N1QL).
      2. Create 10k metakv tombstones. This has been part of our testing since MB-44838 was fixed. We used to create around 25k in total for CC; here it has been reduced to around 12k.

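      #!/bin/bash
      # Brace expansion ({0..10000}) needs bash, not plain sh.
      for i in {0..10000}
          do
              # Create the key, then delete it so that a tombstone is left behind.
              curl -X PUT -u Administrator:password "http://localhost:8091/_metakv/key${i}" -d 'value=foo1'
              curl -X DELETE -v -u Administrator:password "http://localhost:8091/_metakv/key${i}"
          done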
      

      3. Swap rebalance 6 nodes, 1 of each service, with 7.0.2 nodes. The rebalance completes successfully.
      4. Fail over 6 of the 6.6.3 nodes, 1 of each service (graceful failover for KV), upgrade these nodes to 7.0.2, recover all 6 nodes (delta recovery for KV) and rebalance.
      5. Repeat step 4 until all the nodes in the cluster are upgraded to 7.0.2.
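
For illustration only, the per-node part of step 4 could be sketched as a dry run that prints the couchbase-cli calls instead of executing them. The helper name `plan_node_upgrade` and the node address in the usage note are hypothetical. The report only states that KV uses graceful failover and delta recovery; hard failover and full recovery for the other services are assumptions here, as is the reliance on `couchbase-cli failover` being graceful unless `--hard` is passed.

```shell
#!/bin/bash
# Dry-run sketch: prints (does not execute) the couchbase-cli calls for one
# node in step 4. plan_node_upgrade is a hypothetical helper name.
# Assumption: non-KV nodes use hard failover and full recovery.
plan_node_upgrade() {
    local cluster="$1" node="$2" service="$3"
    local auth="-u Administrator -p password"
    local hard=" --hard" recovery="full"
    if [ "$service" = "kv" ]; then
        hard=""             # assumed: failover is graceful unless --hard is given
        recovery="delta"
    fi
    echo "couchbase-cli failover -c $cluster $auth --server-failover $node$hard"
    echo "# upgrade $node to 7.0.2 here"
    echo "couchbase-cli recovery -c $cluster $auth --server-recovery $node --recovery-type $recovery"
}
```

Example: `plan_node_upgrade http://localhost:8091 node1.example:8091 kv` prints the failover and recovery calls for one KV node. Once all six nodes of a round are recovered, a single `couchbase-cli rebalance -c http://localhost:8091 -u Administrator -p password` would complete the round.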

      After the upgrade I validated that all the metakv tombstones were purged, then enabled IPv4-only and enforced TLS using the commands below.

      [root@localhost logs]# grep 'ns_config tombstone' debug.log
      [ns_server:debug,2021-09-14T04:26:34.722-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 11869 ns_config tombstone(s) up to timestamp 63798837690. Tombstones:
      [ns_server:debug,2021-09-14T04:30:38.296-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 1 ns_config tombstone(s) up to timestamp 63798837934. Tombstones:
      [ns_server:debug,2021-09-14T04:31:41.731-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 127 ns_config tombstone(s) up to timestamp 63798837998. Tombstones:
      [ns_server:debug,2021-09-14T04:39:42.419-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 1 ns_config tombstone(s) up to timestamp 63798838481. Tombstones:
      [ns_server:debug,2021-09-14T04:40:43.040-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 1 ns_config tombstone(s) up to timestamp 63798838542. Tombstones:
      [root@localhost logs]# 
      [root@localhost logs]#  /opt/couchbase/bin/couchbase-cli ip-family -c http://localhost:8091 -u Administrator -p password --set --ipv4only
      Switched IP family for node: http://172.23.106.134:8091
      Switched IP family for node: http://172.23.106.136:8091
      Switched IP family for node: http://172.23.106.137:8091
      Switched IP family for node: http://172.23.106.138:8091
      Switched IP family for node: http://172.23.120.58:8091
      Switched IP family for node: http://172.23.120.73:8091
      Switched IP family for node: http://172.23.120.74:8091
      Switched IP family for node: http://172.23.120.75:8091
      Switched IP family for node: http://172.23.120.77:8091
      Switched IP family for node: http://172.23.120.81:8091
      Switched IP family for node: http://172.23.120.86:8091
      Switched IP family for node: http://172.23.121.118:8091
      Switched IP family for node: http://172.23.121.77:8091
      Switched IP family for node: http://172.23.123.24:8091
      Switched IP family for node: http://172.23.123.25:8091
      Switched IP family for node: http://172.23.123.26:8091
      Switched IP family for node: http://172.23.123.31:8091
      Switched IP family for node: http://172.23.123.32:8091
      Switched IP family for node: http://172.23.123.33:8091
      Switched IP family for node: http://172.23.96.122:8091
      Switched IP family for node: http://172.23.96.14:8091
      Switched IP family for node: http://172.23.96.243:8091
      Switched IP family for node: http://172.23.97.105:8091
      Switched IP family for node: http://172.23.97.148:8091
      Switched IP family for node: http://172.23.97.149:8091
      Switched IP family for node: http://172.23.97.150:8091
      Switched IP family for node: http://172.23.97.151:8091
      SUCCESS: Switched IP family of the cluster
      [root@localhost logs]# /opt/couchbase/bin/couchbase-cli node-to-node-encryption -c http://localhost:8091 -u Administrator -p password --enable
      Turned on encryption for node: http://172.23.106.134:8091
      Turned on encryption for node: http://172.23.106.136:8091
      Turned on encryption for node: http://172.23.106.137:8091
      Turned on encryption for node: http://172.23.106.138:8091
      Turned on encryption for node: http://172.23.120.58:8091
      Turned on encryption for node: http://172.23.120.73:8091
      Turned on encryption for node: http://172.23.120.74:8091
      Turned on encryption for node: http://172.23.120.75:8091
      Turned on encryption for node: http://172.23.120.77:8091
      Turned on encryption for node: http://172.23.120.81:8091
      Turned on encryption for node: http://172.23.120.86:8091
      Turned on encryption for node: http://172.23.121.118:8091
      Turned on encryption for node: http://172.23.121.77:8091
      Turned on encryption for node: http://172.23.123.24:8091
      Turned on encryption for node: http://172.23.123.25:8091
      Turned on encryption for node: http://172.23.123.26:8091
      Turned on encryption for node: http://172.23.123.31:8091
      Turned on encryption for node: http://172.23.123.32:8091
      Turned on encryption for node: http://172.23.123.33:8091
      Turned on encryption for node: http://172.23.96.122:8091
      Turned on encryption for node: http://172.23.96.14:8091
      Turned on encryption for node: http://172.23.96.243:8091
      Turned on encryption for node: http://172.23.97.105:8091
      Turned on encryption for node: http://172.23.97.148:8091
      Turned on encryption for node: http://172.23.97.149:8091
      Turned on encryption for node: http://172.23.97.150:8091
      Turned on encryption for node: http://172.23.97.151:8091
      SUCCESS: Switched node-to-node encryption on
      [root@localhost logs]#  /opt/couchbase/bin/couchbase-cli setting-security -c http://localhost:8091 -u Administrator -p password --set --cluster-encryption-level strict
      SUCCESS: Security settings updated
      [root@localhost logs]# 
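
To compare the purge lines against the number of tombstones created, the per-line counts can be added up. This is a minimal sketch, assuming the log line format shown in the grep output above; `sum_purged_tombstones` is a hypothetical helper name.

```shell
#!/bin/bash
# Sum the counts from "Purged N ns_config tombstone(s)" lines read on stdin.
# sum_purged_tombstones is a hypothetical helper name.
sum_purged_tombstones() {
    # Keep only the number N from each matching line, then add the numbers up.
    sed -n 's/.*Purged \([0-9][0-9]*\) ns_config tombstone(s).*/\1/p' |
        awk '{ total += $1 } END { print total + 0 }'
}
```

Usage: `sum_purged_tombstones < debug.log`. For the five purge lines quoted in this report the counts add up to 11999 (11869 + 1 + 127 + 1 + 1), which is in line with the roughly 12k tombstones mentioned in step 2.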
      

      At this point I noticed that the Rebalance button was enabled. As far as I know, there was nothing to rebalance. I ran a rebalance anyway and it failed with a "some nodes down" error (possibly because services restart when IPv4-only and enforce-TLS are set). The next attempt failed with an eventing hang, which is tracked by MB-48449. This bug is to figure out why the Rebalance button was enabled in the first place. I don't remember whether the button was already enabled after the upgrade but before IPv4-only and enforce-TLS were enabled.
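
One way to capture the cluster's own view of whether a rebalance is needed, independent of the UI, might be to read the `balanced` flag from the `/pools/default` REST response before and after each configuration change. This is only a sketch: `cluster_balanced` is a hypothetical helper, and the assumption that the `balanced` field reflects the state of the Rebalance button has not been verified here.

```shell
#!/bin/bash
# Extract the "balanced" flag from a /pools/default JSON document on stdin.
# Prints "true" or "false", or "unknown" if the field is not found.
# cluster_balanced is a hypothetical helper; the field name is an assumption.
cluster_balanced() {
    local value
    value=$(grep -oE '"balanced":[[:space:]]*(true|false)' | head -n 1 |
        sed 's/.*:[[:space:]]*//')
    echo "${value:-unknown}"
}
```

Usage: `curl -s -u Administrator:password http://localhost:8091/pools/default | cluster_balanced`. Recording this value right after the upgrade and again after enabling IPv4-only and enforce-TLS would show at which point the cluster started reporting itself as unbalanced.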

      cbcollect_info is attached. This is the first time we are running this system test upgrade on 7.0.2, so there is no baseline and no last working build.

          People

            Balakumaran.Gopal Balakumaran Gopal
            Balakumaran.Gopal Balakumaran Gopal