Details
- Bug
- Resolution: Resolved
- Critical
- 7.0.2, 7.1.0
- Centos 7 64 bit; CB EE 7.0.2-6644
- Untriaged
- Centos 64-bit
- 1
- Unknown
Description
Script to Repro
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/durability_volume.ini -t volumetests.Collections.volume.test_volume_taf,nodes_init=6,bucket_spec=volume_templates.buckets_scalable_stats_for_volume_test,iterations=1,rerun=False,get-cbcollect-info=True,skip_validations=True,services_for_rebalance_in=kv:index,services_init=kv-n1ql-n1ql-kv:index-kv:index-kv:index,number_of_indexes=300,quota_percent=80,use_https=True,enforce_tls=True'
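The `-t` argument above packs the test path and all of its parameters into one comma-separated string. A minimal sketch for splitting it into readable key/value pairs follows; `parse_test_spec` is a hypothetical helper written for this ticket, not the parser the test framework actually uses.

```python
def parse_test_spec(spec):
    """Split 'test.path,key=value,...' into (test_path, params).

    Hypothetical helper for reading the repro command; it assumes that
    parameter values never contain commas, which holds for this command.
    """
    test_path, _, rest = spec.partition(",")
    params = {}
    for item in rest.split(","):
        if not item:
            continue
        key, _, value = item.partition("=")
        params[key] = value
    return test_path, params


# The -t value copied from the repro command above.
SPEC = (
    "volumetests.Collections.volume.test_volume_taf,nodes_init=6,"
    "bucket_spec=volume_templates.buckets_scalable_stats_for_volume_test,"
    "iterations=1,rerun=False,get-cbcollect-info=True,skip_validations=True,"
    "services_for_rebalance_in=kv:index,"
    "services_init=kv-n1ql-n1ql-kv:index-kv:index-kv:index,"
    "number_of_indexes=300,quota_percent=80,use_https=True,enforce_tls=True"
)

test_path, params = parse_test_spec(SPEC)
print(test_path)
for key in sorted(params):
    print("  %s = %s" % (key, params[key]))
```

Splitting `services_init` on `-` gives one entry per initial node (kv, n1ql, n1ql, kv:index, kv:index, kv:index), which lines up with the six nodes in the first Rebalance Overview table.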
Steps
1. Create a 6-node cluster with the n2n encryption level set to strict
2021-09-06 22:32:40,663 | test | INFO | pool-3-thread-9 | [table_view:display:72] Rebalance Overview
+----------------+--------------+-----------------------+---------------+--------------+
| Nodes          | Services     | Version               | CPU           | Status       |
+----------------+--------------+-----------------------+---------------+--------------+
| 172.23.105.175 | kv           | 7.0.2-6644-enterprise | 2.13032581454 | Cluster node |
| 172.23.106.233 | ['n1ql']     |                       |               | <--- IN ---  |
| 172.23.106.236 | ['n1ql']     |                       |               | <--- IN ---  |
| 172.23.106.238 | ['kv,index'] |                       |               | <--- IN ---  |
| 172.23.106.250 | ['kv,index'] |                       |               | <--- IN ---  |
| 172.23.106.251 | ['kv,index'] |                       |               | <--- IN ---  |
+----------------+--------------+-----------------------+---------------+--------------+
2. Create 15 buckets with 1000 collections with a few documents
3. Flush the documents
4. Create 300 GSI indexes on collections
2021-09-06 23:15:55,128 | test | INFO | MainThread | [Collections:build_deferred_indexes:224] online indexes count: 300
5. Load a few documents
6. Rebalance in a node with the kv and index services while data loading is in progress
2021-09-06 23:20:55,394 | test | INFO | pool-3-thread-1 | [table_view:display:72] Rebalance Overview
+----------------+--------------+-----------------------+---------------+--------------+
| Nodes          | Services     | Version               | CPU           | Status       |
+----------------+--------------+-----------------------+---------------+--------------+
| 172.23.105.175 | kv           | 7.0.2-6644-enterprise | 11.5120711563 | Cluster node |
| 172.23.106.250 | index, kv    | 7.0.2-6644-enterprise | 10.356448477  | Cluster node |
| 172.23.106.236 | n1ql         | 7.0.2-6644-enterprise | 3.07789740342 | Cluster node |
| 172.23.106.251 | index, kv    | 7.0.2-6644-enterprise | 10.9372979961 | Cluster node |
| 172.23.106.233 | n1ql         | 7.0.2-6644-enterprise | 2.97769893563 | Cluster node |
| 172.23.106.238 | index, kv    | 7.0.2-6644-enterprise | 9.12857697786 | Cluster node |
| 172.23.121.78  | ['kv,index'] |                       |               | <--- IN ---  |
+----------------+--------------+-----------------------+---------------+--------------+
The rebalance failed at 43% progress:
2021-09-06 23:24:05,075 | test | ERROR | pool-3-thread-1 | [rest_client:_rebalance_status_and_progress:1639] {u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try again.', u'type': u'rebalance', u'masterRequestTimedOut': False, u'statusId': u'da3dd761c86b379aa8bb6f4ec3475051', u'statusIsStale': False, u'lastReportURI': u'/logs/rebalanceReport?reportID=d890e2a55b0366d043c9bab3b7cd7bed', u'status': u'notRunning'} - rebalance failed
2021-09-06 23:24:05,125 | test | INFO | pool-3-thread-1 | [rest_client:print_UI_logs:2785] Latest logs from UI on 172.23.105.175:
2021-09-06 23:24:05,125 | test | ERROR | pool-3-thread-1 | [rest_client:print_UI_logs:2787] {u'code': 0, u'module': u'ns_orchestrator', u'type': u'critical', u'node': u'ns_1@172.23.105.175', u'tstamp': 1630995838453L, u'shortText': u'message', u'serverTime': u'2021-09-06T23:23:58.453Z', u'text': u'Rebalance exited with reason {service_rebalance_failed,index,\n {agent_died,<32558.27011.3>,\n {linked_process_died,<32558.10764.9>,\n {\'ns_1@172.23.121.78\',\n {timeout,\n {gen_server,call,\n [<32558.32400.3>,\n {call,"ServiceAPI.StartTopologyChange",\n #Fun<json_rpc_connection.0.77329884>},\n 60000]}}}}}}.\nRebalance Operation Id = 7a4d174bbdc32df7b48a00f09fe881a8'}
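The nested Erlang term in the `text` field is hard to read at a glance. The sketch below pulls out the parts that matter for triage: the failing service, the node named in the term, the RPC that timed out, the timeout value, and the rebalance operation id. `summarize_rebalance_failure` is a hypothetical helper, and its regular expressions are written against this one message only; other failure reasons may be shaped differently.

```python
import re

# The 'text' field from the UI log entry above, with the escaped
# newlines and quotes decoded.
TEXT = (
    "Rebalance exited with reason {service_rebalance_failed,index,\n"
    " {agent_died,<32558.27011.3>,\n"
    " {linked_process_died,<32558.10764.9>,\n"
    " {'ns_1@172.23.121.78',\n"
    " {timeout,\n"
    " {gen_server,call,\n"
    " [<32558.32400.3>,\n"
    " {call,\"ServiceAPI.StartTopologyChange\",\n"
    " #Fun<json_rpc_connection.0.77329884>},\n"
    " 60000]}}}}}}.\n"
    "Rebalance Operation Id = 7a4d174bbdc32df7b48a00f09fe881a8"
)


def _first_group(pattern, text):
    """Return the first capture group of pattern in text, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def summarize_rebalance_failure(text):
    """Extract key fields from a service_rebalance_failed message.

    Hypothetical helper; the patterns assume the message layout seen
    in this ticket.
    """
    timeout = _first_group(r"(\d+)\]", text)
    return {
        "service": _first_group(r"service_rebalance_failed,(\w+)", text),
        "node": _first_group(r"\{'([^']+)',", text),
        "rpc": _first_group(r'\{call,"([^"]+)"', text),
        "timeout_ms": int(timeout) if timeout is not None else None,
        "operation_id": _first_group(r"Rebalance Operation Id = (\w+)", text),
    }


summary = summarize_rebalance_failure(TEXT)
for field in ("service", "node", "rpc", "timeout_ms", "operation_id"):
    print("%s: %s" % (field, summary[field]))
```

Read this way, the message says that the `ServiceAPI.StartTopologyChange` call for the index service, reported against ns_1@172.23.121.78 (the node being rebalanced in), hit the 60000 ms `gen_server` call timeout.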
However, retrying the rebalance after a few hours succeeded.
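For anyone who needs to get past the failure while reproducing, the manual retry can be sketched as a small loop. This is an illustration only and not a fix for the underlying timeout. `retry_rebalance` and its `start_rebalance` callable are hypothetical; the callable stands in for whatever triggers the rebalance in your setup and is assumed to return True on success. The one-hour default delay is an arbitrary placeholder, since the ticket only says "a few hours".

```python
import time


def retry_rebalance(start_rebalance, attempts=3, delay_seconds=3600,
                    sleep=time.sleep):
    """Call start_rebalance() until it returns True or attempts run out.

    Returns the 1-based number of the attempt that succeeded, or None
    if every attempt failed. 'sleep' is injectable so the loop can be
    exercised without actually waiting.
    """
    for attempt in range(1, attempts + 1):
        if start_rebalance():
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)
    return None


# Dry run with a stand-in that fails once and then succeeds.
outcomes = iter([False, True])
waits = []
result = retry_rebalance(lambda: next(outcomes), sleep=waits.append)
print("succeeded on attempt", result, "after waits", waits)
```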
Issue Links
- is a backport of MB-48351 [Enforce-TLS] 'Rebalance exited with reason {service_rebalance_failed,index}..' (Closed)