  1. Couchbase Server
  2. MB-49795

MultiNodeFailure: Failover is attempted when it can't failover 'N' nodes at the same time

Details

    Description

      Build: 7.1.0-1787

      Scenario:

      • 7 node cluster
      • Couchbase bucket with replicas=3
      • Set auto-failover with max_events=1 and timeout=5
      • Stop memcached on 2 nodes (172.23.100.13 and 172.23.100.14)
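
The auto-failover settings in the scenario above could be applied roughly as follows. This is a minimal sketch, not the reporter's actual test code. It assumes the settings are sent to the cluster manager's `/settings/autoFailover` REST endpoint with `enabled`, `timeout`, and `maxCount` form fields, and that the ticket's `max_events` corresponds to `maxCount`. The host `127.0.0.1`, port `8091`, and the helper name `build_auto_failover_request` are illustrative assumptions. The function only builds the request and does not send it.

```python
from urllib.parse import urlencode


def build_auto_failover_request(host, max_events, timeout_s, port=8091):
    """Build (url, form_body) for a POST that enables auto-failover.

    Assumption: max_events from the ticket maps to the `maxCount` field.
    """
    url = f"http://{host}:{port}/settings/autoFailover"
    body = urlencode({
        "enabled": "true",
        "timeout": timeout_s,    # seconds a node must be down before failover
        "maxCount": max_events,  # cap on auto-failover events
    })
    return url, body


# Values from the scenario: max_events=1, timeout=5 (host is a placeholder).
url, body = build_auto_failover_request("127.0.0.1", max_events=1, timeout_s=5)
print(url)
print(body)
```

The resulting URL and body can be passed to any HTTP client together with administrator credentials.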

      Observation:

      Auto-failover is triggered, fails with the following reason, and is retried continuously:

      Could not auto-failover more nodes (['ns_1@172.23.100.14']). Maximum number of auto-failover events (1) has been reached

      Expected behavior:

      Auto-failover should not be attempted when the configured max_events is less than the number of failed nodes in the cluster.
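
The expected behavior can be expressed as a simple pre-check. The sketch below only illustrates the rule stated in this report. It is not the ns_server implementation, and the function name `should_attempt_auto_failover` and the `events_used` parameter are hypothetical.

```python
def should_attempt_auto_failover(failed_nodes, max_events, events_used=0):
    """Return True only if every failed node fits in the remaining quota.

    Hypothetical helper: mirrors the reporter's expectation that no
    failover attempt is made when more nodes are down than max_events allows.
    """
    remaining = max_events - events_used
    return 0 < len(failed_nodes) <= remaining


# Scenario from this report: two nodes down, max_events=1.
failed = ["ns_1@172.23.100.13", "ns_1@172.23.100.14"]
print(should_attempt_auto_failover(failed, max_events=1))
```

With two failed nodes and `max_events=1` the check returns `False`, so under this rule no attempt (and no retry loop) would start.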

        Activity

          ashwin.govindarajulu Ashwin Govindarajulu created issue -
          meni.hillel Meni Hillel (Inactive) made changes -
          Field Original Value New Value
          Assignee Meni Hillel [ JIRAUSER25407 ] Hareen Kancharla [ JIRAUSER25304 ]
          meni.hillel Meni Hillel (Inactive) made changes -
          Assignee Hareen Kancharla [ JIRAUSER25304 ] Ashwin Govindarajulu [ ashwin.govindarajulu ]
          meni.hillel Meni Hillel (Inactive) made changes -
          Resolution Not a Bug [ 10200 ]
          Status Open [ 1 ] Resolved [ 5 ]
          ashwin.govindarajulu Ashwin Govindarajulu made changes -
          Assignee Ashwin Govindarajulu [ ashwin.govindarajulu ] Meni Hillel [ JIRAUSER25407 ]
          Resolution Not a Bug [ 10200 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          meni.hillel Meni Hillel (Inactive) made changes -
          Assignee Meni Hillel [ JIRAUSER25407 ] Artem Stemkovski [ artem ]
          artem Artem Stemkovski made changes -
          Resolution Fixed [ 1 ]
          Status Reopened [ 4 ] Resolved [ 5 ]
          ashwin.govindarajulu Ashwin Govindarajulu made changes -
          VERIFICATION STEPS Seeing the following log:
          Could not auto-failover more nodes (['ns_1@172.23.105.212']). Maximum number of auto-failover nodes (1) has been reached.
          Assignee Artem Stemkovski [ artem ] Ashwin Govindarajulu [ ashwin.govindarajulu ]
          Status Resolved [ 5 ] Closed [ 6 ]

          People

            ashwin.govindarajulu Ashwin Govindarajulu
            ashwin.govindarajulu Ashwin Govindarajulu
            Votes: 0
            Watchers: 8
