Couchbase Server / MB-49882

SystemEventLogs: 'Unable to automatically failover a node' event is not generated by ns_server

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Not a Bug
    • Neo
    • Neo
    • ns_server
    • Enterprise Edition 7.1.0 build 1833

    Description

      Build: 7.1.0-1833

      The following "unable to auto-failover node(s)" event is missing from the logs:

      Unable to Automatically Fail Over Node
      • Which node required auto-failover
      • Why automatic failover did not succeed
      • Which node was the orchestrator at the time
      • The automatic failover threshold

      Steps:

      • 5 node cluster, bucket with replica=1
      • Bring down all query nodes, or bring down 2 KV nodes at the same time

      Observation:

      The following message appears in the UI log:

      Could not auto-failover node ('ns_1@172.23.105.211'). Number of remaining nodes that are running data service is 1. You need at least 2 nodes.

      Expectation:

      The corresponding event should be logged in system_event_logs.

        Activity

          Hareen Kancharla added a comment (edited)

          Ashwin Govindarajulu: There are two categories of not being able to auto-failover.

          1) An auto-failover was started but was aborted or did not complete for a specific reason. We log events in this case.

          2) Warnings where the auto-failover is not attempted at all for some reason. We don't log events in this case.

          The wording used in the PRD ("why did autofailover not succeed" and "which node was orchestrator") seems applicable only to case 1, so only those events were logged.
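
          To illustrate the distinction between the two cases, here is a minimal sketch in Python. It is not the ns_server implementation (which is written in Erlang); the names Outcome, Logs, try_auto_failover and MIN_DATA_NODES, and the shape of the event record, are hypothetical. The minimum of 2 data nodes is taken from the UI message quoted in the description.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    COMPLETED = "completed"
    ABORTED = "aborted"              # case 1: started, then aborted
    NOT_ATTEMPTED = "not_attempted"  # case 2: never started


@dataclass
class Logs:
    ui_log: list = field(default_factory=list)         # user-visible warnings
    system_events: list = field(default_factory=list)  # system event log


# Assumption: taken from the "You need at least 2 nodes" UI message.
MIN_DATA_NODES = 2


def try_auto_failover(node, remaining_data_nodes, failover_fn, logs):
    """Hypothetical model of the two 'unable to auto-failover' categories."""
    # Case 2: a precondition fails, so the failover is never attempted.
    # Only a UI warning is written; no system event is generated.
    if remaining_data_nodes < MIN_DATA_NODES:
        logs.ui_log.append(
            f"Could not auto-failover node ('{node}'). Number of remaining "
            f"nodes that are running data service is {remaining_data_nodes}. "
            f"You need at least {MIN_DATA_NODES} nodes."
        )
        return Outcome.NOT_ATTEMPTED

    # Case 1: the failover starts but is aborted for a specific reason.
    # A system event is written with the node and the reason.
    try:
        failover_fn(node)
    except RuntimeError as exc:
        logs.system_events.append({"node": node, "reason": str(exc)})
        return Outcome.ABORTED

    return Outcome.COMPLETED
```

          Under these assumptions, the scenario in the description falls into case 2: the check fails before any failover starts, so the warning appears in the UI log while system_events stays empty.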

          People

            Hareen Kancharla
            Ashwin Govindarajulu