Details
- Bug
- Resolution: Fixed
- Critical
- 6.5.0
- Untriaged
- Unknown
Description
Build: 6.5.0-2647
The following failure is seen intermittently in one of the tests. After the test completes successfully, the framework cleans up the cluster by deleting the buckets, removing all the nodes, and performing a rebalance. The rebalance after removal of the FTS node fails. I have seen this only very intermittently.
[ns_server:error,2019-03-18T22:38:09.589-07:00,ns_1@172.23.104.105:service_rebalancer-fts<0.8814.4>:service_rebalancer:run_rebalance:82]Agent terminated during the rebalance: {'DOWN',
  #Ref<0.541932241.3854303233.130179>,
  process,<22576.24369.1>,
  {linked_process_died,<22576.24478.1>,
  {timeout,
  {gen_server,call,
  [<22576.24448.1>,
  {call,
  "ServiceAPI.GetCurrentTopology",
  #Fun<json_rpc_connection.0.102434519>},
  60000]}}}}
[ns_server:error,2019-03-18T22:38:09.591-07:00,ns_1@172.23.104.105:service_rebalancer-fts<0.8814.4>:service_agent:process_bad_results:810]Service call unset_rebalancer (service fts) failed on some nodes:
  [{'ns_1@172.23.104.107',nack}]
[ns_server:warn,2019-03-18T22:38:09.591-07:00,ns_1@172.23.104.105:service_rebalancer-fts<0.8814.4>:service_rebalancer:run_rebalance:91]Failed to unset rebalancer on some nodes:
  {error,{bad_nodes,fts,unset_rebalancer,[{'ns_1@172.23.104.107',nack}]}}
[user:error,2019-03-18T22:38:09.592-07:00,ns_1@172.23.104.105:<0.2687.0>:ns_orchestrator:do_log_rebalance_completion:1206]Rebalance exited with reason {service_rebalance_failed,fts,
  {linked_process_died,<22576.24478.1>,
  {timeout,
  {gen_server,call,
  [<22576.24448.1>,
  {call,"ServiceAPI.GetCurrentTopology",
  #Fun<json_rpc_connection.0.102434519>},
  60000]}}}}. Operation Id = 87198007415e937d5007037fa814171e
Logs attached.
172.23.104.107 is the node being removed in this step. The buckets were deleted before the rebalance started:
[2019-03-18 22:37:14,480] - [bucket_helper:143] INFO - deleting existing buckets [u'default', u'sasl_bucket_1', u'sasl_bucket_2', u'sasl_bucket_3', u'standard_bucket_1', u'standard_bucket_2', u'standard_bucket_3'] on 172.23.104.105
For QE reference:
*Test*: ./testrunner -i /tmp/testexec.304.ini -p get-cbcollect-info=True,disable_HTP=True,index_type=upside_down,get-logs=False,stop-on-failure=False,fts_quota=750 -t fts.stable_topology_fts.StableTopFTS.create_simple_default_index,items=1000,cluster=D,F,standard_buckets=3,sasl_buckets=3,index_per_bucket=3,update=True,expires=30,memory_only=True,GROUP=P0
*Job*: centos-fts_mem-only-indexes
For Gerrit Dashboard: MB-33436

| # | Subject | Branch | Project | Status | CR | V |
|---|---|---|---|---|---|---|
| 107580,4 | MB-33436 - FTS rebalance failures | master | cbft | MERGED | +2 | +1 |
| 109492,5 | MB-33436 - intermittent rebalance failure | master | cbft | MERGED | +2 | +1 |
| 111818,12 | MB-33436: Add timeout condition for CtlMgr's GetTaskList API | master | cbgt | MERGED | +2 | +1 |