Couchbase Server / MB-44725

[System Test]service_rebalance_failed,fts: rebalance_failed,inactivity_timeout

Details

    • Untriaged
    • 1
    • Unknown

    Description

      Build: 7.0.0-4584

      Test: -test tests/fts/cheshire-cat/test_fts_clusterops_cheshire_cat_coll_crud.yml -scope tests/fts/cheshire-cat/scope_fts_cheshire_cat.yml
      Test Cycle: 1

      Test steps:

      • There are 5 buckets. 20 static FTS indexes are created on collections of 3 of these buckets, and mutations are running on these collections.
      • For the collections on the other 2 buckets, indexes are created and dropped in a loop; no mutations run on these collections.
      • Queries run continuously on the indexes of the collections of bucket1 and bucket2.
      • Wait for 15 mins.
      • Kill cbft on 172.23.97.217 and wait for 15 mins.
      • Stop all mutations and wait for 10 mins.
      • Add FTS node 172.23.107.4, start a rebalance, and wait for 15 mins.
      • Stop the create-index loop on bucket4 and bucket5.
      • Wait for the rebalance to complete.
      • Once the rebalance is complete, kill cbft on 172.23.107.5 and wait for 15 mins.
      • Start mutations on the collections of bucket1, bucket2 and bucket3.
      • Wait for 2 mins, rebalance to remove node 172.23.97.232, and wait for 5 mins.
      • Kill memcached on 172.23.97.237 and wait for 15 mins.
      • Add data node 172.23.97.232, rebalance, and wait for 10 mins.
      • Kill cbft on 172.23.107.5 (which was at 2021-02-22T14:33:48-08:00) and wait for 15 mins.
      • Rebalance out FTS node 172.23.97.217.

      Rebalance fails with the following error:

      2021-03-02T20:43:43.556-08:00, ns_orchestrator:0:critical:message(ns_1@172.23.97.215) - Rebalance exited with reason {service_rebalance_failed,fts,
                                    {worker_died,
                                     {'EXIT',<0.21787.89>,
                                      {rebalance_failed,inactivity_timeout}}}}.
      Rebalance Operation Id = 4561892a1d3ce812c994a80d6f31df3a 
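
When scanning many collected logs for this failure, it can help to pull the failing service and the underlying cause out of the orchestrator message. The following is a minimal sketch, not part of the test framework: the regular expression is inferred only from the single log entry quoted above, and the names `REBALANCE_EXIT` and `parse_rebalance_exit` are hypothetical.

```python
import re

# Pattern inferred from the one ns_orchestrator entry in this ticket;
# other rebalance failures may be shaped differently (assumption).
REBALANCE_EXIT = re.compile(
    r"Rebalance exited with reason \{(?P<kind>\w+),\s*(?P<service>\w+),"
    r".*?\{rebalance_failed,\s*(?P<cause>\w+)\}",
    re.DOTALL,  # the Erlang term spans several lines
)


def parse_rebalance_exit(message):
    """Return a dict with kind, service and cause, or None if no match."""
    match = REBALANCE_EXIT.search(message)
    return match.groupdict() if match else None


# The log entry from this ticket, reproduced as a multi-line string.
LOG = """2021-03-02T20:43:43.556-08:00, ns_orchestrator:0:critical:message(ns_1@172.23.97.215) - Rebalance exited with reason {service_rebalance_failed,fts,
                              {worker_died,
                               {'EXIT',<0.21787.89>,
                                {rebalance_failed,inactivity_timeout}}}}."""

result = parse_rebalance_exit(LOG)
print(result)
```

For the entry above, the sketch reports the service as `fts` and the cause as `inactivity_timeout`, matching the issue title.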

       

      Logs:

               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.107.2.zip
               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.107.3.zip
               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.107.4.zip
               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.107.5.zip
               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.215.zip
               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.216.zip
               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.217.zip
               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.227.zip
               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.232.zip
               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.235.zip
               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.236.zip
               url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.237.zip
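
Each archive name above embeds the URL-encoded node name (`%40` is `@`), so an archive can be mapped back to its cluster node. The following is a small sketch under that assumption; the helper name `node_from_collectinfo_url` is hypothetical, and the file-name layout is inferred only from the URLs listed in this ticket.

```python
from urllib.parse import unquote, urlparse


def node_from_collectinfo_url(url):
    """Extract the node name (e.g. ns_1@<ip>) from a collectinfo URL."""
    # Take the last path segment and decode percent-escapes such as %40.
    file_name = unquote(urlparse(url).path.rsplit("/", 1)[-1])
    # Drop the .zip suffix if present.
    stem = file_name[:-len(".zip")] if file_name.endswith(".zip") else file_name
    # Assumed layout: collectinfo-<timestamp>-<node name>
    return stem.rsplit("-", 1)[-1]


urls = [
    "https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.215.zip",
    "https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1614748070/collectinfo-2021-03-03T050752-ns_1%40172.23.97.217.zip",
]
nodes = [node_from_collectinfo_url(u) for u in urls]
print(nodes)
```

Here 172.23.97.215 is the node that logged the rebalance failure and 172.23.97.217 is the FTS node being rebalanced out, so those two archives are a reasonable place to start.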

          Activity

            Sreekanth Sivasankaran added a comment - Girish Benakappa, can we have another run of this test? We have improved logging around the area of the last occurrence, so if the issue recurs there, it should be easier to debug this time.

            Raju Suravarjjala added a comment - Girish Benakappa, can you start this system test run with a build later than couchbase-server-7.0.0-4894, or with this week's build?

            Girish Benakappa added a comment - Yes, Raju Suravarjjala. With 7.0.0-4897, a system test run to reproduce this issue has been running on centos2 for the last 13 hours. It has not hit this issue yet.

            Mihir Kamdar (Inactive) added a comment - The test was run multiple times with 7.0.0-4897. The issue has not been seen in 2 iterations. Now running once more with 7.0.0-4916.

            Mihir Kamdar (Inactive) added a comment - Closing, as this issue has not been seen in multiple test runs with 7.0.0-4916 and 7.0.0-4897. Will reopen if it is seen again.

            People

              mihir.kamdar Mihir Kamdar (Inactive)
              girish.benakappa Girish Benakappa