Details
Description
Many of the upgrade jobs are failing during swap rebalance:
2022-06-24 00:18:39,397 - root - INFO - rebalance operation started
2022-06-24 00:18:39,401 - root - INFO - rebalance percentage : 0.00 %
2022-06-24 00:18:49,414 - root - INFO - rebalance percentage : 66.00 %
2022-06-24 00:18:59,421 - root - INFO - rebalance percentage : 66.00 %
2022-06-24 00:19:19,458 - root - INFO - rebalance progress took 40.06 seconds
2022-06-24 00:19:19,458 - root - INFO - sleep for 10 seconds after rebalance...
2022-06-24 00:19:29,502 - root - INFO - removed all the nodes from cluster associated with ip:172.23.98.165 port:8091 ssh_username:root ? [('ns_1@172.23.107.246', 8091), ('ns_1@172.23.98.15', 8091), ('ns_1@172.23.98.16', 8091)]
The following errors are reported after the rebalance:
upgrade.upgrade_tests.UpgradeTests.test_upgrade
testrunner logs, diags and results are available under /data/workspace/centos-p0-upgrade-vset00-00-ce_ee-new-from-70x_2b/logs/testrunner-22-Jun-24_00-06-06/test_1
Exception in thread Thread-101:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "pytests/upgrade/upgrade_tests.py", line 749, in create_index_with_replica_and_query
    self.num_index_replicas)
  File "lib/couchbase_helper/tuq_helper.py", line 1172, in verify_replica_indexes
    index_host_name, index_id = self.get_index_details_using_index_name(index_name, index_map)
  File "lib/couchbase_helper/tuq_helper.py", line 1257, in get_index_details_using_index_name
    raise Exception("Index does not exist - {0}".format(index_name))
Exception: Index does not exist - random_index_648043

Exception in thread Thread-111:
Traceback (most recent call last):
  File "pytests/upgrade/upgrade_tests.py", line 975, in online_upgrade_swap_rebalance
    sleep_before_rebalance=15)
  File "lib/couchbase_helper/cluster.py", line 488, in rebalance
    return _task.result(timeout)
  File "lib/tasks/future.py", line 160, in result
    return self.__get_result()
  File "lib/tasks/future.py", line 112, in __get_result
    raise self._exception
  File "lib/tasks/task.py", line 840, in check
    raise Exception(msg)
Exception: Vbuckets were suffled! Expected active_vb for 172.23.98.16 are 683. And now are 341

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "pytests/upgrade/upgrade_tests.py", line 921, in online_upgrade
    self.online_upgrade_swap_rebalance()
  File "pytests/upgrade/upgrade_tests.py", line 993, in online_upgrade_swap_rebalance
    self.fail(ex)
  File "/usr/local/lib/python3.7/unittest/case.py", line 693, in fail
    raise self.failureException(msg)
AssertionError: Vbuckets were suffled! Expected active_vb for 172.23.98.16 are 683. And now are 341
Examples of failing jobs:
- http://qa.sc.couchbase.com/job/test_suite_executor/485153/
- http://qa.sc.couchbase.com/job/test_suite_executor/485059/
Thuan Nguyen, it would be good if you could rerun those jobs with log collection, if needed.
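
For triage, the vBucket numbers in the AssertionError can be compared against an even split. The sketch below is not part of testrunner: `parse_vb_mismatch` and `even_share` are hypothetical helper names, and the total of 1024 vBuckets is an assumption (the usual Couchbase default on Linux). It only extracts the counts from the message and shows what an even distribution across three nodes would look like.

```python
import re

# Assumption: the cluster uses 1024 vBuckets (the usual default on Linux).
TOTAL_VBUCKETS = 1024

# Matches the message raised from lib/tasks/task.py as it appears in the log.
MSG_RE = re.compile(
    r"Expected active_vb for (?P<ip>[\d.]+) are (?P<expected>\d+)\. "
    r"And now are (?P<actual>\d+)"
)


def parse_vb_mismatch(line):
    """Return (ip, expected, actual) from a log line, or None if it does not match."""
    m = MSG_RE.search(line)
    if m is None:
        return None
    return m.group("ip"), int(m.group("expected")), int(m.group("actual"))


def even_share(total, nodes):
    """Split `total` vBuckets across `nodes` as evenly as possible."""
    base, extra = divmod(total, nodes)
    return [base + 1] * extra + [base] * (nodes - extra)


line = ("Exception: Vbuckets were suffled! Expected active_vb for "
        "172.23.98.16 are 683. And now are 341")
print(parse_vb_mismatch(line))
print(even_share(TOTAL_VBUCKETS, 3))
```

Under that assumption, the observed 341 is what one node holds in an even three-node split, while the expected 683 equals 341 + 342, that is, two nodes' worth. This may mean the expected value was computed against a different node count than the cluster had after the swap, but that is a guess to be checked against the collected logs, not a diagnosis.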