Details
-
Bug
-
Resolution: Fixed
-
Critical
-
7.1.0
-
6.6.5-100080
7.1.0-2343
-
Untriaged
-
1
-
Yes
Description
Rebalance-in of upgraded index node fails when node is upgraded from 6.6.5-10080 to 7.1.0-2434
Steps to reproduce:
- Create a 4-node cluster with the following service configuration: kv:query-kv-index-index
- Create a bucket and load data into it
- Create a few indexes:
CREATE INDEX `index_name_0` ON standard_bucket0(job_title) USING GSI WITH {'defer_build': True}
BUILD INDEX on standard_bucket0(`index_name_0`) USING GSI
CREATE INDEX `employee2584caf24e2b4bc9a31c0eab2752d307job_title` ON standard_bucket0(job_title) WHERE job_title IS NOT NULL USING GSI WITH {'nodes': ['172.23.122.119:8091'], 'defer_build': True}
CREATE INDEX `employee2584caf24e2b4bc9a31c0eab2752d307join_yr` ON standard_bucket0(join_yr) WHERE join_yr > 2010 and join_yr < 2014 USING GSI WITH {'nodes': ['172.23.122.119:8091'], 'defer_build': True}
BUILD INDEX on standard_bucket0(employee2584caf24e2b4bc9a31c0eab2752d307job_title,employee2584caf24e2b4bc9a31c0eab2752d307join_yr) USING GSI
- Remove only one index node and upgrade it to 7.1.0-2434
- Add the index node back and rebalance it in
- The rebalance fails with this error:
{'node': 'ns_1@172.23.123.61', 'type': 'warning', 'code': 102, 'module': 'menelaus_web', 'tstamp': 1646741308694, 'shortText': 'client-side error report', 'text': 'Client-side error-report for user "Administrator" on node \'ns_1@172.23.123.61\':\nUser-Agent:Python-httplib2/0.13.1 (gzip)\nStarting rebalance from test, ejected nodes [\'ns_1@172.23.122.119\', \'ns_1@172.23.122.122\', \'ns_1@172.23.122.123\']', 'serverTime': '2022-03-08T04:08:28.694Z'}
2022-03-08 04:09:08,781 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 11, 'module': 'menelaus_web', 'tstamp': 1646741308630, 'shortText': 'message', 'text': 'Deleted bucket "standard_bucket0"\n', 'serverTime': '2022-03-08T04:08:28.630Z'}
2022-03-08 04:09:08,781 - root - ERROR - {'node': 'ns_1@172.23.122.122', 'type': 'info', 'code': 0, 'module': 'ns_memcached', 'tstamp': 1646741307658, 'shortText': 'message', 'text': 'Shutting down bucket "standard_bucket0" on \'ns_1@172.23.122.122\' for deletion', 'serverTime': '2022-03-08T04:08:27.658Z'}
2022-03-08 04:09:08,782 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 0, 'module': 'ns_memcached', 'tstamp': 1646741307649, 'shortText': 'message', 'text': 'Shutting down bucket "standard_bucket0" on \'ns_1@172.23.123.61\' for deletion', 'serverTime': '2022-03-08T04:08:27.649Z'}
2022-03-08 04:09:08,782 - root - ERROR - {'node': 'ns_1@172.23.122.122', 'type': 'info', 'code': 4, 'module': 'ns_node_disco', 'tstamp': 1646741214581, 'shortText': 'node up', 'text': "Node 'ns_1@172.23.122.122' saw that node 'ns_1@172.23.122.119' came up. Tags: []", 'serverTime': '2022-03-08T04:06:54.581Z'}
2022-03-08 04:09:08,782 - root - ERROR - {'node': 'ns_1@172.23.122.123', 'type': 'info', 'code': 4, 'module': 'ns_node_disco', 'tstamp': 1646741214374, 'shortText': 'node up', 'text': "Node 'ns_1@172.23.122.123' saw that node 'ns_1@172.23.122.119' came up. Tags: []", 'serverTime': '2022-03-08T04:06:54.374Z'}
2022-03-08 04:09:08,782 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 0, 'module': 'ns_config', 'tstamp': 1646741214205, 'shortText': 'message', 'text': 'Conflicting configuration changes to field email_alerts:\n[{recipients,["root@localhost"]},\n {sender,"couchbase@localhost"},\n {enabled,false},\n {email_server,[{user,[]},\n {pass,"*****"},\n {host,"localhost"},\n {port,25},\n {encrypt,false}]},\n {alerts,[auto_failover_node,auto_failover_maximum_reached,\n auto_failover_other_nodes_down,auto_failover_cluster_too_small,\n auto_failover_disabled,ip,disk,overhead,ep_oom_errors,\n ep_item_commit_failed,audit_dropped_events,indexer_ram_max_usage,\n ep_clock_cas_drift_threshold_exceeded,communication_issue]}] and\n[{recipients,["root@localhost"]},\n {sender,"couchbase@localhost"},\n {enabled,false},\n {email_server,[{user,[]},\n {pass,"*****"},\n {host,"localhost"},\n {port,25},\n {encrypt,false}]},\n {alerts,[auto_failover_node,auto_failover_maximum_reached,\n auto_failover_other_nodes_down,auto_failover_cluster_too_small,\n auto_failover_disabled,ip,disk,overhead,ep_oom_errors,\n ep_item_commit_failed,audit_dropped_events,indexer_ram_max_usage,\n ep_clock_cas_drift_threshold_exceeded,communication_issue,\n time_out_of_sync,disk_usage_analyzer_stuck]},\n {pop_up_alerts,[auto_failover_node,auto_failover_maximum_reached,\n auto_failover_other_nodes_down,\n auto_failover_cluster_too_small,auto_failover_disabled,ip,\n disk,overhead,ep_oom_errors,ep_item_commit_failed,\n audit_dropped_events,indexer_ram_max_usage,\n ep_clock_cas_drift_threshold_exceeded,communication_issue,\n time_out_of_sync,disk_usage_analyzer_stuck]}], choosing the former.', 'serverTime': '2022-03-08T04:06:54.205Z'}
2022-03-08 04:09:08,782 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 4, 'module': 'ns_node_disco', 'tstamp': 1646741214201, 'shortText': 'node up', 'text': "Node 'ns_1@172.23.123.61' saw that node 'ns_1@172.23.122.119' came up. Tags: []", 'serverTime': '2022-03-08T04:06:54.201Z'}
2022-03-08 04:09:08,783 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 0, 'module': 'ns_cluster', 'tstamp': 1646741214010, 'shortText': 'message', 'text': "Started node add transaction by adding node 'ns_1@172.23.122.119' to nodes_wanted (group: undefined)", 'serverTime': '2022-03-08T04:06:54.010Z'}
2022-03-08 04:09:08,783 - root - ERROR - {'node': 'ns_1@172.23.122.122', 'type': 'warning', 'code': 5, 'module': 'ns_node_disco', 'tstamp': 1646740948791, 'shortText': 'node down', 'text': "Node 'ns_1@172.23.122.122' saw that node 'ns_1@172.23.122.119' went down. Details: [{nodedown_reason,\n connection_closed}]", 'serverTime': '2022-03-08T04:02:28.791Z'}
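The log above is listed newest first, which makes the sequence (node down, node add, bucket deletion) hard to follow. A minimal sketch for reordering it, assuming each entry is a Python dict literal optionally preceded by the test-harness prefix, as in the lines above. The helper names `parse_entry` and `timeline` are hypothetical, and the sample holds three entries copied from the log:

```python
import ast
import re

# Matches the optional harness prefix, e.g. "2022-03-08 04:09:08,781 - root - ERROR - "
PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+ - \w+ - \w+ - ")

# Three entries copied verbatim from the log above (raw string keeps the \n escapes).
SAMPLE = r"""
2022-03-08 04:09:08,781 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 11, 'module': 'menelaus_web', 'tstamp': 1646741308630, 'shortText': 'message', 'text': 'Deleted bucket "standard_bucket0"\n', 'serverTime': '2022-03-08T04:08:28.630Z'}
2022-03-08 04:09:08,783 - root - ERROR - {'node': 'ns_1@172.23.123.61', 'type': 'info', 'code': 0, 'module': 'ns_cluster', 'tstamp': 1646741214010, 'shortText': 'message', 'text': "Started node add transaction by adding node 'ns_1@172.23.122.119' to nodes_wanted (group: undefined)", 'serverTime': '2022-03-08T04:06:54.010Z'}
2022-03-08 04:09:08,783 - root - ERROR - {'node': 'ns_1@172.23.122.122', 'type': 'warning', 'code': 5, 'module': 'ns_node_disco', 'tstamp': 1646740948791, 'shortText': 'node down', 'text': "Node 'ns_1@172.23.122.122' saw that node 'ns_1@172.23.122.119' went down. Details: [{nodedown_reason,\n connection_closed}]", 'serverTime': '2022-03-08T04:02:28.791Z'}
"""


def parse_entry(line):
    """Strip the harness prefix (if any) and parse the dict literal that follows."""
    return ast.literal_eval(PREFIX.sub("", line.strip()))


def timeline(lines):
    """Return (serverTime, node, module, shortText) tuples, oldest entry first."""
    entries = [parse_entry(line) for line in lines if line.strip()]
    entries.sort(key=lambda e: e["tstamp"])  # tstamp is in milliseconds
    return [(e["serverTime"], e["node"], e["module"], e["shortText"]) for e in entries]


for row in timeline(SAMPLE.splitlines()):
    print(*row)
```

Sorting uses `tstamp` and the display uses `serverTime`, since both fields are present in every entry shown here.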
I'm able to reproduce the test locally as well.
The Jenkins job run (Test1) is available here: http://qa.sc.couchbase.com/job/test_suite_executor/448863/consoleText
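For a local reproduction, the index-creation step can be scripted. The following is a rough sketch that only builds the statement strings in the same shape as the steps to reproduce. The helper names `create_index_stmt` and `build_index_stmt` are hypothetical and not part of the test framework, and the `WITH` clause is rendered as JSON rather than as the Python-style dict shown in the steps:

```python
import json


def create_index_stmt(name, keyspace, field, where=None, nodes=None, defer_build=True):
    """Build a CREATE INDEX statement with an optional WHERE filter and node placement."""
    options = {}
    if nodes:
        options["nodes"] = list(nodes)
    options["defer_build"] = defer_build
    stmt = f"CREATE INDEX `{name}` ON {keyspace}({field})"
    if where:
        stmt += f" WHERE {where}"
    # json.dumps renders the options as a JSON object (double quotes, true/false).
    return f"{stmt} USING GSI WITH {json.dumps(options)}"


def build_index_stmt(keyspace, names):
    """Build a BUILD INDEX statement for one or more deferred indexes."""
    cols = ",".join(f"`{n}`" for n in names)
    return f"BUILD INDEX ON {keyspace}({cols}) USING GSI"


# Values taken from the steps to reproduce.
print(create_index_stmt("index_name_0", "standard_bucket0", "job_title"))
print(create_index_stmt(
    "employee2584caf24e2b4bc9a31c0eab2752d307join_yr",
    "standard_bucket0",
    "join_yr",
    where="join_yr > 2010 and join_yr < 2014",
    nodes=["172.23.122.119:8091"],
))
print(build_index_stmt("standard_bucket0", ["index_name_0"]))
```

The statements would still need to be submitted to the query service; that part is left out because it depends on the client in use.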
For Gerrit Dashboard: MB-51350

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
171459,1 | MB-51350 fix incorrect return message when auth is skipped | neo | indexing | MERGED | +2 | +1 |
172081,2 | MB-51350 fix incorrect return message when auth is skipped | unstable | indexing | MERGED | +2 | +1 |