Couchbase Server / MB-57814

[System Test on cloud] Index/Query nodes getting failed over and added back frequently


Details

    • Untriaged
    • 0
    • Unknown

    Description

      A 5-node cluster with the following configuration:

      3 KV + 2 GSI/Query (n2-standard-8 with 200 GB disk + n2-standard-8 with 450 GB disk)

      During the system test run, the index/query nodes were repeatedly auto-failed over and added back. This is the message seen in diag.log:

       Node ('ns_1@svc-qi-node-005.r45djlf-eb-k-m7.sandbox.nonprod-project-avengers.com') was automatically failed over. Reason: The index service took too long to respond to the health check 
      

      The auto-failover interval is set to 10 seconds for cloud instances, but it is unclear why the nodes are auto-failed over so often: the auto-failover message appears 64 times in about 20 hours of the system test run so far.
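      To put the failover frequency in numbers, a minimal sketch along the following lines could tally the auto-failover messages per node and per reason. It assumes each event appears on a single line matching the diag.log message quoted above; the real log layout may differ, and the names `FAILOVER_RE` and `summarize_failovers` are hypothetical helpers, not part of any Couchbase tool.

```python
import re
from collections import Counter

# Pattern modeled on the single diag.log message quoted in this report.
# Assumption: one event per line; adjust if the real log wraps lines.
FAILOVER_RE = re.compile(
    r"Node \('(?P<node>[^']+)'\) was automatically failed over\. "
    r"Reason: (?P<reason>.+?)\s*$"
)


def summarize_failovers(lines, run_hours):
    """Count auto-failover events per node and per reason.

    lines: iterable of log lines (for example, an open diag.log file).
    run_hours: duration of the test run in hours, used for the rate.
    """
    per_node = Counter()
    reasons = Counter()
    for line in lines:
        match = FAILOVER_RE.search(line)
        if match:
            per_node[match.group("node")] += 1
            reasons[match.group("reason")] += 1
    total = sum(per_node.values())
    return {
        "total": total,
        "per_hour": total / run_hours,
        "per_node": dict(per_node),
        "reasons": dict(reasons),
    }
```

      With the figures reported here (64 events over about 20 hours), the rate works out to roughly 3.2 auto-failovers per hour, or about one every 19 minutes. The per-node breakdown would show whether both GSI/Query nodes are affected equally.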

      cbcollect logs:

      https://cb-engineering.s3.amazonaws.com/SysTest_11Jul_Slow_Indexing/collectinfo-2023-07-12T040332-ns_1%40svc-d-node-001.r45djlf-eb-k-m7.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/SysTest_11Jul_Slow_Indexing/collectinfo-2023-07-12T040332-ns_1%40svc-d-node-002.r45djlf-eb-k-m7.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/SysTest_11Jul_Slow_Indexing/collectinfo-2023-07-12T040332-ns_1%40svc-d-node-003.r45djlf-eb-k-m7.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/SysTest_11Jul_Slow_Indexing/collectinfo-2023-07-12T040332-ns_1%40svc-qi-node-004.r45djlf-eb-k-m7.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/SysTest_11Jul_Slow_Indexing/collectinfo-2023-07-12T040332-ns_1%40svc-qi-node-005.r45djlf-eb-k-m7.sandbox.nonprod-project-avengers.com.zip
      

      It is unclear whether this is in any way related to a different issue that was seen during the same run.


            People

              pavan.pb Pavan PB