Couchbase Server / MB-60647

Node auto failover | 7.6.0 2076

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • 7.6.0
    • 7.6.0
    • fts
    • Untriaged
    • 0
    • Unknown

    Description

      The following bug was found during Capella testing.
      AMI: couchbase-cloud-server-7.6.0-2076-x86_64-v1.0.28
      CSP: AWS
      All search nodes had ~99% CPU utilization for 3.5 hours.
      Node auto-failover did not occur during this window, but as soon as there was an OOM kill, the node was auto-failed over (service restart, etc.?).

      The UI shows no RAM contention on any of the FTS nodes, while CPU utilization stayed at ~99% throughout. The reason for the node failover is therefore a black box: it is unclear whether CPU contention or the OOM kill caused the failover.
      The node was successfully added back into the cluster after about 6 hours.

      The node that was failed over: svc-s-node-010.zumnyloovvbb6uxi.sandbox.nonprod-project-avengers.com

      Server logs:
      https://cb-engineering.s3.amazonaws.com/Aman/collectinfo-2024-02-02T125628-ns_1%40svc-d-node-004.zumnyloovvbb6uxi.sandbox.nonprod-project-avengers.com-redacted.zip
      https://cb-engineering.s3.amazonaws.com/Aman/collectinfo-2024-02-02T125628-ns_1%40svc-d-node-005.zumnyloovvbb6uxi.sandbox.nonprod-project-avengers.com-redacted.zip
      https://cb-engineering.s3.amazonaws.com/Aman/collectinfo-2024-02-02T125628-ns_1%40svc-d-node-006.zumnyloovvbb6uxi.sandbox.nonprod-project-avengers.com-redacted.zip
      https://cb-engineering.s3.amazonaws.com/Aman/collectinfo-2024-02-02T125628-ns_1%40svc-s-node-007.zumnyloovvbb6uxi.sandbox.nonprod-project-avengers.com-redacted.zip
      https://cb-engineering.s3.amazonaws.com/Aman/collectinfo-2024-02-02T125628-ns_1%40svc-s-node-008.zumnyloovvbb6uxi.sandbox.nonprod-project-avengers.com-redacted.zip
      https://cb-engineering.s3.amazonaws.com/Aman/collectinfo-2024-02-02T125628-ns_1%40svc-s-node-009.zumnyloovvbb6uxi.sandbox.nonprod-project-avengers.com-redacted.zip
      https://cb-engineering.s3.amazonaws.com/Aman/collectinfo-2024-02-02T125628-ns_1%40svc-s-node-010.zumnyloovvbb6uxi.sandbox.nonprod-project-avengers.com-redacted.zip
      https://cb-engineering.s3.amazonaws.com/Aman/collectinfo-2024-02-02T125628-ns_1%40svc-s-node-011.zumnyloovvbb6uxi.sandbox.nonprod-project-avengers.com-redacted.zip
      Cluster configuration:

      KV Nodes - 3

      FTS Nodes - 5

      16 vCPUs, 32 GB RAM per node
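
      As a cross-check of the log list against the cluster configuration above, the collectinfo archive names can be grouped by node role. This is only a sketch: it assumes that the `svc-d` hostname prefix marks a data (KV) node and `svc-s` a search (FTS) node, which the report does not state explicitly, and `group_nodes_by_role` is a hypothetical helper name.

```python
import re
from collections import defaultdict
from urllib.parse import unquote, urlparse

# Assumption (not stated in the report): "svc-d" = data (KV) node,
# "svc-s" = search (FTS) node.
ROLE_BY_PREFIX = {"d": "kv", "s": "fts"}

# Matches the short node name inside "ns_1@svc-<x>-node-<nnn>.<domain>".
NODE_RE = re.compile(r"ns_1@(svc-([a-z])-node-\d+)\.")


def group_nodes_by_role(urls):
    """Group short node names from collectinfo URLs by assumed role."""
    groups = defaultdict(list)
    for url in urls:
        # The "@" is percent-encoded as %40 in the URLs, so decode first.
        path = unquote(urlparse(url).path)
        match = NODE_RE.search(path)
        if match is None:
            continue  # not a collectinfo archive name in the expected shape
        node, prefix = match.groups()
        groups[ROLE_BY_PREFIX.get(prefix, "unknown")].append(node)
    return dict(groups)


if __name__ == "__main__":
    base = "https://cb-engineering.s3.amazonaws.com/Aman/"
    suffix = ".zumnyloovvbb6uxi.sandbox.nonprod-project-avengers.com-redacted.zip"
    urls = [
        base + "collectinfo-2024-02-02T125628-ns_1%40svc-d-node-004" + suffix,
        base + "collectinfo-2024-02-02T125628-ns_1%40svc-s-node-010" + suffix,
    ]
    for role, nodes in group_nodes_by_role(urls).items():
        print(role, len(nodes), nodes)
```

      Under that prefix assumption, the eight archives listed above split into three `svc-d` nodes (004 to 006) and five `svc-s` nodes (007 to 011), which matches the stated 3 KV and 5 FTS nodes.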

          People

            aman.srivastava Aman Srivastava
            Votes: 0
            Watchers: 7
