Couchbase Server / MB-51350

[Upgrade Test] Rebalance is failing when adding Index node after upgrading from 6.6.5 to 7.1.0


Details


    Description

      Rebalance-in of an upgraded index node fails when the node is upgraded from 6.6.5-10080 to 7.1.0-2434

       

      Steps to reproduce:

      1. Create a 4-node cluster with the service layout kv:query-kv-index-index (node 1: kv+query, node 2: kv, nodes 3 and 4: index)
      2. Create a bucket and load data into it
      3. Create a few indexes (the deferred builds can be verified with the first sketch after the error log below):

        CREATE INDEX `index_name_0` ON standard_bucket0(job_title) USING GSI WITH {"defer_build": true}
        BUILD INDEX ON standard_bucket0(`index_name_0`) USING GSI

        CREATE INDEX `employee2584caf24e2b4bc9a31c0eab2752d307job_title` ON standard_bucket0(job_title) WHERE job_title IS NOT NULL USING GSI WITH {"nodes": ["172.23.122.119:8091"], "defer_build": true}

        CREATE INDEX `employee2584caf24e2b4bc9a31c0eab2752d307join_yr` ON standard_bucket0(join_yr) WHERE join_yr > 2010 AND join_yr < 2014 USING GSI WITH {"nodes": ["172.23.122.119:8091"], "defer_build": true}

        BUILD INDEX ON standard_bucket0(`employee2584caf24e2b4bc9a31c0eab2752d307job_title`, `employee2584caf24e2b4bc9a31c0eab2752d307join_yr`) USING GSI

      4. Remove one index node and upgrade it to 7.1.0-2434
      5. Add the index node back and rebalance it in (see the second sketch after the error log below for a scripted version of steps 5 and 6)
      6. Rebalance fails with this error:

        {'node': 'ns_1@172.23.123.61', 'type': 'warning', 'code': 102, 'module': 'menelaus_web', 'tstamp': 1646741308694, 'shortText': 'client-side error report', 'text': 'Client-side error-report for user "Administrator" on node \'ns_1@172.23.123.61\':\nUser-Agent:Python-httplib2/0.13.1 (gzip)\nStarting rebalance from test, ejected nodes [\'ns_1@172.23.122.119\', \'ns_1@172.23.122.122\', \'ns_1@172.23.122.123\']', 'serverTime': '2022-03-08T04:08:28.694Z'}
        2022-03-08 04:09:08,781 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 11, 'module': 'menelaus_web', 'tstamp': 1646741308630, 'shortText': 'message', 'text': 'Deleted bucket "standard_bucket0"\n', 'serverTime': '2022-03-08T04:08:28.630Z'}
        2022-03-08 04:09:08,781 - root - ERROR - {'node': 'ns_1@172.23.122.122', 'type': 'info', 'code': 0, 'module': 'ns_memcached', 'tstamp': 1646741307658, 'shortText': 'message', 'text': 'Shutting down bucket "standard_bucket0" on \'ns_1@172.23.122.122\' for deletion', 'serverTime': '2022-03-08T04:08:27.658Z'}
        2022-03-08 04:09:08,782 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 0, 'module': 'ns_memcached', 'tstamp': 1646741307649, 'shortText': 'message', 'text': 'Shutting down bucket "standard_bucket0" on \'ns_1@172.23.123.61\' for deletion', 'serverTime': '2022-03-08T04:08:27.649Z'}
        2022-03-08 04:09:08,782 - root - ERROR - {'node': 'ns_1@172.23.122.122', 'type': 'info', 'code': 4, 'module': 'ns_node_disco', 'tstamp': 1646741214581, 'shortText': 'node up', 'text': "Node 'ns_1@172.23.122.122' saw that node 'ns_1@172.23.122.119' came up. Tags: []", 'serverTime': '2022-03-08T04:06:54.581Z'}
        2022-03-08 04:09:08,782 - root - ERROR - {'node': 'ns_1@172.23.122.123', 'type': 'info', 'code': 4, 'module': 'ns_node_disco', 'tstamp': 1646741214374, 'shortText': 'node up', 'text': "Node 'ns_1@172.23.122.123' saw that node 'ns_1@172.23.122.119' came up. Tags: []", 'serverTime': '2022-03-08T04:06:54.374Z'}
        2022-03-08 04:09:08,782 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 0, 'module': 'ns_config', 'tstamp': 1646741214205, 'shortText': 'message', 'text': 'Conflicting configuration changes to field email_alerts:\n[{recipients,["root@localhost"]},\n {sender,"couchbase@localhost"},\n {enabled,false},\n {email_server,[{user,[]},\n                {pass,"*****"},\n                {host,"localhost"},\n                {port,25},\n                {encrypt,false}]},\n {alerts,[auto_failover_node,auto_failover_maximum_reached,\n          auto_failover_other_nodes_down,auto_failover_cluster_too_small,\n          auto_failover_disabled,ip,disk,overhead,ep_oom_errors,\n          ep_item_commit_failed,audit_dropped_events,indexer_ram_max_usage,\n          ep_clock_cas_drift_threshold_exceeded,communication_issue]}] and\n[{recipients,["root@localhost"]},\n {sender,"couchbase@localhost"},\n {enabled,false},\n {email_server,[{user,[]},\n                {pass,"*****"},\n                {host,"localhost"},\n                {port,25},\n                {encrypt,false}]},\n {alerts,[auto_failover_node,auto_failover_maximum_reached,\n          auto_failover_other_nodes_down,auto_failover_cluster_too_small,\n          auto_failover_disabled,ip,disk,overhead,ep_oom_errors,\n          ep_item_commit_failed,audit_dropped_events,indexer_ram_max_usage,\n          ep_clock_cas_drift_threshold_exceeded,communication_issue,\n          time_out_of_sync,disk_usage_analyzer_stuck]},\n {pop_up_alerts,[auto_failover_node,auto_failover_maximum_reached,\n                 auto_failover_other_nodes_down,\n                 auto_failover_cluster_too_small,auto_failover_disabled,ip,\n                 disk,overhead,ep_oom_errors,ep_item_commit_failed,\n                 audit_dropped_events,indexer_ram_max_usage,\n                 ep_clock_cas_drift_threshold_exceeded,communication_issue,\n                 time_out_of_sync,disk_usage_analyzer_stuck]}], choosing the former.', 'serverTime': '2022-03-08T04:06:54.205Z'}
        2022-03-08 04:09:08,782 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 4, 'module': 'ns_node_disco', 'tstamp': 1646741214201, 'shortText': 'node up', 'text': "Node 'ns_1@172.23.123.61' saw that node 'ns_1@172.23.122.119' came up. Tags: []", 'serverTime': '2022-03-08T04:06:54.201Z'}
        2022-03-08 04:09:08,783 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 0, 'module': 'ns_cluster', 'tstamp': 1646741214010, 'shortText': 'message', 'text': "Started node add transaction by adding node 'ns_1@172.23.122.119' to nodes_wanted (group: undefined)", 'serverTime': '2022-03-08T04:06:54.010Z'}
        2022-03-08 04:09:08,783 - root - ERROR - {'node': 'ns_1@172.23.122.122', 'type': 'warning', 'code': 5, 'module': 'ns_node_disco', 'tstamp': 1646740948791, 'shortText': 'node down', 'text': "Node 'ns_1@172.23.122.122' saw that node 'ns_1@172.23.122.119' went down. Details: [{nodedown_reason,\n                                                                                     connection_closed}]", 'serverTime': '2022-03-08T04:02:28.791Z'} 
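
      As a hedged aside (not part of the original test output): before removing the index node in step 4, the deferred builds from step 3 can be confirmed as online by querying system:indexes through the N1QL REST service. A minimal sketch in Python, assuming 172.23.123.61 is the kv+query node, the default query port 8093, and placeholder Administrator credentials:

        # Hedged sketch (not from this report): confirm the deferred GSI builds from
        # step 3 reached the "online" state before the index node is removed in step 4.
        # The node IP is assumed from the log above; the password is a placeholder.
        import requests

        QUERY_URL = "http://172.23.123.61:8093/query/service"  # N1QL REST service
        AUTH = ("Administrator", "password")                    # placeholder credentials

        resp = requests.post(
            QUERY_URL,
            auth=AUTH,
            data={"statement": 'SELECT name, state FROM system:indexes '
                               'WHERE keyspace_id = "standard_bucket0";'},
        )
        resp.raise_for_status()

        for row in resp.json().get("results", []):
            # Every index should report state == "online" once BUILD INDEX completes.
            print(row["name"], row["state"])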
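
      Steps 5 and 6 can also be driven directly through the cluster REST API (the test harness uses the same endpoints via httplib2, per the User-Agent in the error report above). A minimal sketch, assuming 172.23.123.61 is the orchestrator node, 172.23.122.119 is the upgraded index node, and placeholder credentials; it adds the node back with only the index service, starts the rebalance, and polls until the rebalance completes or fails:

        # Hedged sketch (not from this report): add the upgraded index node back and
        # rebalance it in, mirroring steps 5-6. Node IPs come from the log above;
        # the credentials are placeholders.
        import time
        import requests

        ORCH = "http://172.23.123.61:8091"     # orchestrator / cluster-manager node
        AUTH = ("Administrator", "password")   # placeholder credentials

        # Step 5: add the upgraded 7.1.0 node back with only the index service.
        requests.post(
            f"{ORCH}/controller/addNode",
            auth=AUTH,
            data={
                "hostname": "172.23.122.119",
                "user": "Administrator",
                "password": "password",
                "services": "index",
            },
        ).raise_for_status()

        # A rebalance request needs every otpNode name currently known to the cluster.
        pool = requests.get(f"{ORCH}/pools/default", auth=AUTH).json()
        known_nodes = ",".join(node["otpNode"] for node in pool["nodes"])

        # Step 6: start the rebalance (nothing ejected) and poll its progress.
        requests.post(
            f"{ORCH}/controller/rebalance",
            auth=AUTH,
            data={"knownNodes": known_nodes, "ejectedNodes": ""},
        ).raise_for_status()

        while True:
            progress = requests.get(f"{ORCH}/pools/default/rebalanceProgress", auth=AUTH).json()
            print(progress)
            if progress.get("status") == "none":  # "none" once the rebalance stops (done or failed)
                break
            time.sleep(5)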

      I'm able to reproduce the failure locally as well.

       

      The Jenkins job run is available here - http://qa.sc.couchbase.com/job/test_suite_executor/448863/consoleText (Test1)

       


          People

            pavan.pb Pavan PB
            hemant.rajput Hemant Rajput
