  1. Couchbase Server
  2. MB-24986

Autofailover of a node takes more than 8 secs when the failure is caused by a memcached failure.

Details

    Description

      1. Create a cluster with 3 nodes and at least 1 bucket.
      2. Enable autofailover and set the timeout to 5 secs.
      3. On any one of the nodes, stop the memcached process (the tests do this by sending SIGSTOP to the memcached process with kill). Note the time at which the failure was injected.
      4. Wait for autofailover of the node to complete. Note the time at which autofailover was initiated.
      We expect the failover to be initiated within 8 secs (5 secs is ideal, but we allow a 3 sec buffer for initiation). Instead, the failover is initiated after around 9-10 secs.
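The pass/fail criterion above can be written out as a small check. This is only a sketch of the arithmetic (5 sec timeout plus 3 sec buffer), not the actual testrunner code; the function name and the timestamps in the usage note are hypothetical.

```python
from datetime import datetime, timedelta

AUTOFAILOVER_TIMEOUT_S = 5  # timeout configured in step 2
BUFFER_S = 3                # allowance for failover initiation, per this report


def failover_within_deadline(failure_injected_at, failover_initiated_at,
                             timeout_s=AUTOFAILOVER_TIMEOUT_S,
                             buffer_s=BUFFER_S):
    """Return True if failover started within timeout + buffer of the failure."""
    elapsed = (failover_initiated_at - failure_injected_at).total_seconds()
    # A negative elapsed time means the timestamps were recorded in the wrong order.
    return 0 <= elapsed <= timeout_s + buffer_s
```

With a hypothetical injection time t0, `failover_within_deadline(t0, t0 + timedelta(seconds=8))` is accepted, while an initiation 9 or 10 secs after t0, as observed here, is rejected.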
      This is a regression compared to last week's build. The memcached failure tests were passing until last week's build (5.0.0-3088) but are now failing because autofailover is initiated after the expected time.
      The test run can be found here: http://qa.sc.couchbase.com/view/nserver/job/cen006-nserv-autofailover-memcached/35/consoleFull
      test_1, test_3, test_10, test_12 and test_13 all failed due to this issue.
      The issue can be reproduced by running the following test:
      ./testrunner -i <ini file> -t failover.AutoFailoverTests.AutoFailoverTests.test_autofailover,timeout=5,num_node_failures=1,failover_action=stop_memcached,nodes_init=3
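For a manual reproduction without testrunner, steps 2 and 3 could be sketched roughly as below. This is not what the test framework does internally. It assumes the cluster REST endpoint `/settings/autoFailover` on port 8091 with basic auth, and that the script runs on the node whose memcached process is to be paused. The helper names, host, and credentials are hypothetical.

```python
import base64
import os
import signal
import urllib.parse
import urllib.request


def build_autofailover_request(host, user, password, timeout_s=5):
    """Build (but do not send) a POST that enables autofailover with the given timeout."""
    body = urllib.parse.urlencode({"enabled": "true", "timeout": timeout_s}).encode()
    req = urllib.request.Request(
        f"http://{host}:8091/settings/autoFailover", data=body, method="POST"
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req


def pause_process(pid):
    """Freeze a process with SIGSTOP, equivalent to `kill -STOP <pid>` (Unix only)."""
    os.kill(pid, signal.SIGSTOP)


def resume_process(pid):
    """Resume a process previously frozen with SIGSTOP."""
    os.kill(pid, signal.SIGCONT)
```

The request would be sent with `urllib.request.urlopen(req)`. The memcached pid has to be looked up on the node (for example with `pgrep memcached`) before calling `pause_process`, and the time of that call is the failure injection time for step 3.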

      Attaching the logs from the run mentioned above for test_1 for the cluster.

          People

            bharath.gp Bharath G P