Couchbase Server / MB-48448

[System Test] - Rebalance button is enabled post upgrade from 6.6.3 -> 7.0.2


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: 7.0.2
    • Fix Version: Neo
    • Component: ns_server
    • Environment: 6.6.3-9808 -> 7.0.2-6668
    • Triage: Triaged
    • 1
    • No

    Description

      Steps to Repro
      1. Run the following longevity script on 6.6.3 for 5 days.

      ./sequoia -client 172.23.104.254:2375 -provider file:centos_second_cluster.yml -test tests/integration/test_allFeatures_madhatter_durability.yml -scope tests/integration/scope_Xattrs_Madhatter.yml -scale 3 -repeat 0 -log_level 0 -version 6.6.3-9808 -skip_setup=true -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true
      

      At this point there should be a 27-node cluster (9 KV, 6 Index, 3 Analytics, 3 FTS, 3 Eventing, and 3 N1QL).
      2. Create 10k metakv tombstones. This has been part of our testing since MB-44838 was fixed. We used to have a total of around 25k for CC, and have reduced it here to around 12k.

       #!/bin/bash
      # Create and immediately delete each key so that it leaves a metakv tombstone behind.
      for i in {0..10000}
          do
              curl -X PUT -u Administrator:password http://localhost:8091/_metakv/key${i} -d 'value=foo1'
              curl -X DELETE -v -u Administrator:password http://localhost:8091/_metakv/key${i}
          done
      

      3. Swap rebalance 6 nodes, one of each service, with 7.0.2 nodes. The rebalance goes through successfully.
      4. Fail over 6 nodes (the 6.6.3 nodes), one of each service (KV uses graceful failover), upgrade these nodes to 7.0.2, recover all 6 nodes (KV uses delta recovery), and rebalance (a CLI sketch of one such iteration follows these steps).
      5. Repeat step 4 until all the nodes in the cluster are upgraded to 7.0.2.
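
      For reference, here is a minimal sketch of one iteration of step 4 driven from couchbase-cli. The node address and credentials are placeholders, the package upgrade itself happens out of band, and the actual test is orchestrated by sequoia rather than by these commands:

       #!/bin/bash
       # Illustrative only: one failover -> upgrade -> delta-recovery -> rebalance cycle.
       # CLUSTER, NODE and the credentials are placeholders for the cluster under test.
       CLUSTER=http://localhost:8091
       NODE=172.23.120.58:8091      # example 6.6.3 KV node being upgraded
       AUTH="-u Administrator -p password"

       # Graceful failover of the node (couchbase-cli failover is graceful unless --hard is given)
       /opt/couchbase/bin/couchbase-cli failover -c $CLUSTER $AUTH --server-failover $NODE

       # ... upgrade the Couchbase Server package on the node to 7.0.2 out of band ...

       # Delta recovery for the failed-over node, then rebalance the cluster
       /opt/couchbase/bin/couchbase-cli recovery -c $CLUSTER $AUTH --server-recovery $NODE --recovery-type delta
       /opt/couchbase/bin/couchbase-cli rebalance -c $CLUSTER $AUTH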

      After the upgrade I validated that all the metakv tombstones were purged, then enabled IPv4-only and enforced TLS (node-to-node encryption with cluster encryption level strict) using the commands below.

      [root@localhost logs]# grep 'ns_config tombstone' debug.log
      [ns_server:debug,2021-09-14T04:26:34.722-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 11869 ns_config tombstone(s) up to timestamp 63798837690. Tombstones:
      [ns_server:debug,2021-09-14T04:30:38.296-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 1 ns_config tombstone(s) up to timestamp 63798837934. Tombstones:
      [ns_server:debug,2021-09-14T04:31:41.731-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 127 ns_config tombstone(s) up to timestamp 63798837998. Tombstones:
      [ns_server:debug,2021-09-14T04:39:42.419-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 1 ns_config tombstone(s) up to timestamp 63798838481. Tombstones:
      [ns_server:debug,2021-09-14T04:40:43.040-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 1 ns_config tombstone(s) up to timestamp 63798838542. Tombstones:
      [root@localhost logs]# 
      [root@localhost logs]#  /opt/couchbase/bin/couchbase-cli ip-family -c http://localhost:8091 -u Administrator -p password --set --ipv4only
      Switched IP family for node: http://172.23.106.134:8091
      Switched IP family for node: http://172.23.106.136:8091
      Switched IP family for node: http://172.23.106.137:8091
      Switched IP family for node: http://172.23.106.138:8091
      Switched IP family for node: http://172.23.120.58:8091
      Switched IP family for node: http://172.23.120.73:8091
      Switched IP family for node: http://172.23.120.74:8091
      Switched IP family for node: http://172.23.120.75:8091
      Switched IP family for node: http://172.23.120.77:8091
      Switched IP family for node: http://172.23.120.81:8091
      Switched IP family for node: http://172.23.120.86:8091
      Switched IP family for node: http://172.23.121.118:8091
      Switched IP family for node: http://172.23.121.77:8091
      Switched IP family for node: http://172.23.123.24:8091
      Switched IP family for node: http://172.23.123.25:8091
      Switched IP family for node: http://172.23.123.26:8091
      Switched IP family for node: http://172.23.123.31:8091
      Switched IP family for node: http://172.23.123.32:8091
      Switched IP family for node: http://172.23.123.33:8091
      Switched IP family for node: http://172.23.96.122:8091
      Switched IP family for node: http://172.23.96.14:8091
      Switched IP family for node: http://172.23.96.243:8091
      Switched IP family for node: http://172.23.97.105:8091
      Switched IP family for node: http://172.23.97.148:8091
      Switched IP family for node: http://172.23.97.149:8091
      Switched IP family for node: http://172.23.97.150:8091
      Switched IP family for node: http://172.23.97.151:8091
      SUCCESS: Switched IP family of the cluster
      [root@localhost logs]# /opt/couchbase/bin/couchbase-cli node-to-node-encryption -c http://localhost:8091 -u Administrator -p password --enable
      Turned on encryption for node: http://172.23.106.134:8091
      Turned on encryption for node: http://172.23.106.136:8091
      Turned on encryption for node: http://172.23.106.137:8091
      Turned on encryption for node: http://172.23.106.138:8091
      Turned on encryption for node: http://172.23.120.58:8091
      Turned on encryption for node: http://172.23.120.73:8091
      Turned on encryption for node: http://172.23.120.74:8091
      Turned on encryption for node: http://172.23.120.75:8091
      Turned on encryption for node: http://172.23.120.77:8091
      Turned on encryption for node: http://172.23.120.81:8091
      Turned on encryption for node: http://172.23.120.86:8091
      Turned on encryption for node: http://172.23.121.118:8091
      Turned on encryption for node: http://172.23.121.77:8091
      Turned on encryption for node: http://172.23.123.24:8091
      Turned on encryption for node: http://172.23.123.25:8091
      Turned on encryption for node: http://172.23.123.26:8091
      Turned on encryption for node: http://172.23.123.31:8091
      Turned on encryption for node: http://172.23.123.32:8091
      Turned on encryption for node: http://172.23.123.33:8091
      Turned on encryption for node: http://172.23.96.122:8091
      Turned on encryption for node: http://172.23.96.14:8091
      Turned on encryption for node: http://172.23.96.243:8091
      Turned on encryption for node: http://172.23.97.105:8091
      Turned on encryption for node: http://172.23.97.148:8091
      Turned on encryption for node: http://172.23.97.149:8091
      Turned on encryption for node: http://172.23.97.150:8091
      Turned on encryption for node: http://172.23.97.151:8091
      SUCCESS: Switched node-to-node encryption on
      [root@localhost logs]#  /opt/couchbase/bin/couchbase-cli setting-security -c http://localhost:8091 -u Administrator -p password --set --cluster-encryption-level strict
      SUCCESS: Security settings updated
      [root@localhost logs]# 
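
      As a sanity check (not part of the run above), the security settings can be read back after enabling strict encryption; host and credentials are the same placeholders as in the commands above:

       # Read back the cluster security settings; clusterEncryptionLevel should now be "strict"
       /opt/couchbase/bin/couchbase-cli setting-security -c http://localhost:8091 -u Administrator -p password --get

       # The same information is also available over REST
       curl -s -u Administrator:password http://localhost:8091/settings/security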
      

      At this point I noticed that the Rebalance button was enabled, even though there was nothing to rebalance as far as I know. I did rebalance; it failed (with a nodes-down error, possibly from setting IPv4-only and enforcing TLS, since we restart services), and the next attempt failed with an eventing hang (tracked by MB-48449). This bug is to figure out why the Rebalance button was enabled in the first place. I don't particularly remember whether the Rebalance button was enabled after the upgrade but before enabling IPv4-only and enforce-TLS.
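
      For future debugging (this was not captured at the time), the cluster's own notion of balance can be checked over REST, which I believe is what drives the Rebalance button in the 7.0 UI; the "balanced" and "rebalanceStatus" fields below are assumed to be present in the 7.0 /pools/default response:

       # Does the cluster consider itself balanced, and is a rebalance currently running?
       curl -s -u Administrator:password http://localhost:8091/pools/default | grep -o '"balanced":[a-z]*'
       curl -s -u Administrator:password http://localhost:8091/pools/default | grep -o '"rebalanceStatus":"[a-z]*"'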

      cbcollect_info attached. This is the first time we are running this system-test upgrade on 7.0.2, hence there is no baseline as such and no last working build.
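
      For reference, a cluster-wide log collection equivalent to cbcollect_info can be started and polled from the CLI (illustrative only; not necessarily how the attached bundle was produced):

       # Kick off log collection on every node and check on its progress
       /opt/couchbase/bin/couchbase-cli collect-logs-start -c http://localhost:8091 -u Administrator -p password --all-nodes
       /opt/couchbase/bin/couchbase-cli collect-logs-status -c http://localhost:8091 -u Administrator -p password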

      Attachments


        Activity

          steve.watanabe Steve Watanabe added a comment -

          From the node .136 diag.log we see the successful rebalance.

          2021-09-14T04:27:56.502-07:00, ns_orchestrator:0:info:message(ns_1@172.23.106.136) - Starting rebalance, KeepNodes = ['ns_1@172.23.106.134','ns_1@172.23.106.136',
                                           'ns_1@172.23.106.137','ns_1@172.23.106.138',
                                           'ns_1@172.23.120.58','ns_1@172.23.120.73',
                                           'ns_1@172.23.120.74','ns_1@172.23.120.75',
                                           'ns_1@172.23.120.77','ns_1@172.23.120.81',
                                           'ns_1@172.23.120.86','ns_1@172.23.121.118',
                                           'ns_1@172.23.121.77','ns_1@172.23.123.24',
                                           'ns_1@172.23.123.25','ns_1@172.23.123.26',
                                           'ns_1@172.23.123.31','ns_1@172.23.123.32',
                                           'ns_1@172.23.123.33','ns_1@172.23.96.122',
                                           'ns_1@172.23.96.14','ns_1@172.23.96.243',
                                           'ns_1@172.23.97.105','ns_1@172.23.97.148',
                                           'ns_1@172.23.97.149','ns_1@172.23.97.150',
                                           'ns_1@172.23.97.151'], EjectNodes = [], Failed over and being ejected nodes = [], Delta recovery nodes = ['ns_1@172.23.120.58'],  Delta recovery buckets = all; Operation Id = c737e578d27f5e0e0fcef4a7992e7804
          2021-09-14T04:34:57.713-07:00, ns_orchestrator:0:info:message(ns_1@172.23.106.136) - Rebalance completed successfully.
          

          At this point, from the description in the ticket, the tombstones were being purged

          [ns_server:debug,2021-09-14T04:40:43.040-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 1 ns_config tombstone(s) up to timestamp 63798838542. Tombstones:
          

          From the http_access.log we see ipv4only being configured (/opt/couchbase/bin/couchbase-cli ip-family -c http://localhost:8091 -u Administrator -p password --set --ipv4only)

          172.23.106.134 - Administrator [14/Sep/2021:05:13:16 -0700] "POST /node/controller/setupNetConfig HTTP/1.1" 200 0 - "couchbase-cli 7.0.2-6668" 199
          

          and node-to-node encryption enabled (/opt/couchbase/bin/couchbase-cli node-to-node-encryption -c http://localhost:8091 -u Administrator -p password --enable)

          172.23.106.134 - Administrator [14/Sep/2021:05:13:44 -0700] "POST /node/controller/setupNetConfig HTTP/1.1" 200 0 - "couchbase-cli 7.0.2-6668" 3039
          

          which led to a flood of activity as the network changed, the nodes restarted, etc.

          2021-09-14T05:13:16.221-07:00, memcached_config_mgr:0:info:message(ns_1@172.23.106.134) - Hot-reloaded memcached.json for config change of the following keys: [<<"interfaces">>]
          2021-09-14T05:13:16.423-07:00, memcached_config_mgr:0:info:message(ns_1@172.23.106.137) - Hot-reloaded memcached.json for config change of the following keys: [<<"interfaces">>]
          <snip>
          2021-09-14T05:13:20.348-07:00, memcached_config_mgr:0:info:message(ns_1@172.23.97.151) - Hot-reloaded memcached.json for config change of the following keys: [<<"interfaces">>]
          2021-09-14T05:13:38.371-07:00, ns_node_disco:5:warning:node down(ns_1@172.23.106.134) - Node 'ns_1@172.23.106.134' saw that node 'ns_1@172.23.106.136' went down. Details: [{nodedown_reason,
          disconnect}]
          2021-09-14T05:13:38.372-07:00, ns_node_disco:5:warning:node down(ns_1@172.23.106.136) - Node 'ns_1@172.23.106.136' saw that node 'ns_1@172.23.106.134' went down. Details: [{nodedown_reason,
          connection_closed}]
          2021-09-14T05:13:38.460-07:00, ns_node_disco:4:info:node up(ns_1@172.23.106.134) - Node 'ns_1@172.23.106.134' saw that node 'ns_1@172.23.106.136' came up. Tags: []
          2021-09-14T05:13:38.461-07:00, ns_node_disco:4:info:node up(ns_1@172.23.106.136) - Node 'ns_1@172.23.106.136' saw that node 'ns_1@172.23.106.134' came up. Tags: []
          2021-09-14T05:13:38.461-07:00, ns_node_disco:5:warning:node down(ns_1@172.23.106.134) - Node 'ns_1@172.23.106.134'
          <and so on, and so on>
          

          As nodes restart, the rebalance button becomes enabled, and it appears to have been hit again to start this rebalance. Note that node communication is still being re-established at this point.

          2021-09-14T05:14:23.888-07:00, ns_node_disco:4:info:node up(ns_1@172.23.97.150) - Node 'ns_1@172.23.97.150' saw that node 'ns_1@172.23.123.24' came up. Tags: []
          2021-09-14T05:14:23.895-07:00, ns_node_disco:4:info:node up(ns_1@172.23.123.24) - Node 'ns_1@172.23.123.24' saw that node 'ns_1@172.23.97.150' came up. Tags: []
          2021-09-14T05:14:23.898-07:00, ns_node_disco:5:warning:node down(ns_1@172.23.97.151) - Node 'ns_1@172.23.97.151' saw that node 'ns_1@172.23.123.24' went down. Details: [{nodedown_reason,
                                                                                             connection_closed}]
          2021-09-14T05:14:23.899-07:00, ns_node_disco:5:warning:node down(ns_1@172.23.123.24) - Node 'ns_1@172.23.123.24' saw that node 'ns_1@172.23.97.151' went down. Details: [{nodedown_reason,
                                                                                             disconnect}]
          2021-09-14T05:14:23.916-07:00, ns_orchestrator:0:info:message(ns_1@172.23.106.136) - Starting rebalance, KeepNodes = ['ns_1@172.23.106.134','ns_1@172.23.106.136',
                                           'ns_1@172.23.106.137','ns_1@172.23.106.138',
                                           'ns_1@172.23.120.58','ns_1@172.23.120.73',
                                           'ns_1@172.23.120.74','ns_1@172.23.120.75',
                                           'ns_1@172.23.120.77','ns_1@172.23.120.81',
                                           'ns_1@172.23.120.86','ns_1@172.23.121.118',
                                           'ns_1@172.23.121.77','ns_1@172.23.123.24',
                                           'ns_1@172.23.123.25','ns_1@172.23.123.26',
                                           'ns_1@172.23.123.31','ns_1@172.23.123.32',
                                           'ns_1@172.23.123.33','ns_1@172.23.96.122',
                                           'ns_1@172.23.96.14','ns_1@172.23.96.243',
                                           'ns_1@172.23.97.105','ns_1@172.23.97.148',
                                           'ns_1@172.23.97.149','ns_1@172.23.97.150',
                                           'ns_1@172.23.97.151'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = a6ae6d7290debe4b130c2cbd55c0be62
          

          and the rebalance fails in the indexer

          2021-09-14T05:14:42.348-07:00, ns_orchestrator:0:critical:message(ns_1@172.23.106.136) - Rebalance exited with reason {service_rebalance_failed,index,
          {agent_died,<30267.20320.405>,noconnection}}.
          Rebalance Operation Id = a6ae6d7290debe4b130c2cbd55c0be62
          

          Setting ipv4only and node-to-node encryption led to nodes restarting and the rebalance button becoming enabled. If the node restarts had completed before the next rebalance was attempted, that rebalance would have succeeded and the rebalance button would have been disabled. On the rebalance attempt after the indexer failure, the rebalance failed in eventing. I believe that had a successful rebalance occurred, the button would have been disabled.
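
          For what it's worth, a small wait loop along these lines would avoid starting the rebalance while nodes are still re-establishing connections (a sketch only; it polls the standard node status field in /pools/default, and the host and credentials are placeholders):

          #!/bin/bash
          # Wait until no node in /pools/default reports an unhealthy or warming-up status.
          while curl -s -u Administrator:password http://localhost:8091/pools/default \
                  | grep -qE '"status":"(unhealthy|warmup)"'; do
              echo "waiting for all nodes to become healthy..."
              sleep 5
          done
          echo "all nodes healthy; safe to rebalance"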

          On my MacBook I started three 7.0.2 nodes with all services, configured the cluster, and added a default bucket. I then ran:

          ./couchbase-cli setting-autofailover -c localhost:9000 -u Administrator -p asdasd --enable-auto-failover 0
          ./couchbase-cli ip-family -c localhost:9000 -u Administrator -p asdasd --set --ipv4only
          ./couchbase-cli node-to-node-encryption -c localhost:9000 -u Administrator -p asdasd --enable
          ./couchbase-cli setting-security -c localhost:9000 -u Administrator -p asdasd --set --cluster-encryption-level strict
          

          and saw the nodes restart and then require a rebalance. I then clicked the rebalance button and let the rebalance complete. At that point the rebalance button was no longer clickable.
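
          The same completion check can also be made from the CLI rather than by watching the button (same local cluster and credentials as in the commands above):

          # Shows whether a rebalance is currently running and, if so, its progress
          ./couchbase-cli rebalance-status -c localhost:9000 -u Administrator -p asdasd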


          steve.watanabe Steve Watanabe added a comment - Balakumaran Gopal, I believe the nodes restarting after setting ipv4only and enabling node-to-node encryption left the cluster needing a rebalance; had the node restarts and communication re-establishment finished first, the rebalance would have succeeded and the button would have been disabled.


          Balakumaran.Gopal Balakumaran Gopal added a comment - Steve Watanabe - Thanks for the analysis. Even after repeated rebalances I noticed that the rebalance button was still enabled. Now I think it could be related to MB-48468.

          It should also be noted that we already have a similar bug raised by Sumedh Basarkod on Neo, MB-48001. So I think there is something here.


          steve.watanabe Steve Watanabe added a comment - Balakumaran Gopal, I'm going to need a reproduction, as I feel I've triaged the cause of the issue in this case (ipv4only and node-to-node encryption leading to restarts). It looks like the cited bugs MB-48468 and MB-48001 may be resolved, so a reproduction should include any fixes from those tickets.


          meni.hillel Meni Hillel (Inactive) added a comment - Balakumaran Gopal, I see there haven't been any updates for over a month, so I'd like to assume this ticket may no longer be relevant. Please reopen and provide additional details if this is not the case. We are scrubbing tickets to get a better representation of the count for Neo, and would appreciate your assistance in cleaning up old tickets like this.

          ritam.sharma Ritam Sharma added a comment -

          Meni Hillel - Please do not mark defects as incomplete; logs are attached to the ticket, and these tests are time-consuming to reproduce. I am going to reopen the defect and mark it as fixed, since it might have been indirectly fixed by other changes.


          Balakumaran.Gopal Balakumaran Gopal added a comment - Did not notice this after upgrading from 6.6.5-10076 -> 7.1.0-2117. Hence marking this closed.


          People

            Assignee: Balakumaran.Gopal Balakumaran Gopal
            Reporter: Balakumaran.Gopal Balakumaran Gopal
            Votes: 0
            Watchers: 6


              Gerrit Reviews

                There are no open Gerrit changes
