Details
- Bug
- Status: Closed
- Critical
- Resolution: Cannot Reproduce
- Cheshire-Cat
- Untriaged
- 1
- Unknown
Description
Build: 7.0.0-4584
Test: -test tests/fts/cheshire-cat/test_fts_clusterops_cheshire_cat_coll_crud.yml -scope tests/fts/cheshire-cat/scope_fts_cheshire_cat.yml
Test Cycle: 1
In the test:
- There are 5 buckets. 20 static fts indexes are created on collections of 3 of these buckets, and mutations run on these collections.
- On the collections of the other 2 buckets, indexes are created and dropped in a loop, and no mutations run on these collections.
- Continuously run queries on the indexes of the collections of bucket1 and bucket2
- wait for 15 mins
- kill cbft on 172.23.97.217 and wait for 15 mins
- stop all mutations and wait for 10 mins
- add fts node 172.23.107.4 and start rebalance and wait for 15 mins
- stop create index loop on bucket4 and bucket5
- wait for rebalance to complete
- Once rebalance is complete, kill cbft on 172.23.107.5 and wait for 15 mins
- start mutations on the collections of bucket1, bucket2 and bucket3
- wait for 2 mins and rebalance to remove node 172.23.97.232 and wait for 5 mins
- kill memcached on 172.23.97.237 and wait for 15 mins
- Add data node 172.23.97.232 and rebalance and wait for 10 mins
- kill cbft on 172.23.107.5 (which was at 2021-02-22T14:33:48-08:00) and wait for 15 mins
- Rebalance out fts node 172.23.97.217
The rebalance fails with the error below:
2021-03-02T20:43:43.556-08:00, ns_orchestrator:0:critical:message(ns_1@172.23.97.215) - Rebalance exited with reason {service_rebalance_failed,fts,
{worker_died,
{'EXIT',<0.21787.89>,
{rebalance_failed,inactivity_timeout}}}}.
Rebalance Operation Id = 4561892a1d3ce812c994a80d6f31df3a
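For triage across several runs, the failing service and cause can be pulled out of the ns_orchestrator entry above with a small script. This is only a sketch: the regular expression and the `parse_rebalance_failure` helper are hypothetical and cover only the failure shape seen in this report, not every rebalance error ns_server can log.

```python
import re

# Log entry copied from this report; line breaks inside the Erlang term are collapsed.
LOG_ENTRY = (
    "2021-03-02T20:43:43.556-08:00, ns_orchestrator:0:critical:"
    "message(ns_1@172.23.97.215) - Rebalance exited with reason "
    "{service_rebalance_failed,fts, {worker_died, "
    "{'EXIT',<0.21787.89>, {rebalance_failed,inactivity_timeout}}}}."
)

# Hypothetical pattern: matches only the failure shape shown in this report.
FAILURE_RE = re.compile(
    r"message\((?P<node>[^)]+)\).*?"                      # reporting node
    r"\{service_rebalance_failed,\s*(?P<service>\w+),.*?"  # failing service
    r"\{rebalance_failed,\s*(?P<cause>\w+)\}",             # innermost cause
    re.DOTALL,
)


def parse_rebalance_failure(entry):
    """Return node, service and cause as a dict, or None if the entry does not match."""
    match = FAILURE_RE.search(entry)
    return match.groupdict() if match else None


failure = parse_rebalance_failure(LOG_ENTRY)
print(failure)
```

Here the fields of interest are the service (`fts`) and the cause (`inactivity_timeout`), which together distinguish this failure from other rebalance exits in the same log.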
Logs:
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.107.2.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.107.3.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.107.4.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.107.5.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.215.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.216.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.217.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.227.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.232.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.235.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.236.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.237.zip
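The collectinfo URLs above share one pattern: a common prefix, the collection timestamp, and the node name with `@` percent-encoded as `%40`. A minimal sketch for rebuilding a URL from a node address follows; the `collectinfo_url` helper is hypothetical, and the prefix and timestamp are copied from this report rather than taken from any documented scheme.

```python
from urllib.parse import quote

# Values copied from the URLs listed in this report.
BASE = "https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070"
STAMP = "2021-03-03T050752"


def collectinfo_url(ip):
    """Build the collectinfo archive URL for one node (hypothetical helper)."""
    node = quote(f"ns_1@{ip}")  # '@' is percent-encoded as %40
    return f"{BASE}/collectinfo-{STAMP}-{node}.zip"


# The fts node being rebalanced out and the orchestrator node from the error message.
for ip in ("172.23.97.217", "172.23.97.215"):
    print(collectinfo_url(ip))
```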
Issue Links
- duplicates MB-44541 [System Test]Rebalance out fts node taking more than 21 hrs (Closed)
Girish Benakappa, can we have another run of this test? We have improved the logging around the area of the last occurrence, so if the issue happens there again, it will be easier to debug this time.