  1. Couchbase Server
  2. MB-12055

ns_janitor may lose replicas of nearly completed vbucket moves (was: {DCP}: Delta recovery impossible after retry of graceful failover because the first attempt failed)

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • 3.0
    • 3.0
    • ns_server
    • Security Level: Public
    • 1186, centos 6x, 10.6.2.144-10.6.2.160
    • Untriaged
    • Unknown

    Description

      build 1186

      Scenario
      1. Create a 7-node cluster
      2. Create the default bucket with 200 K items
      3. Start a graceful failover of one node
      4. Kill memcached on another node while the graceful failover is running
      5. Start a graceful failover of the same node as in step 3 again
      6. Add the node back with delta recovery
      7. Click Rebalance
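
      The REST calls behind steps 3, 5, 6 and 7 can be sketched roughly as follows. This is a minimal sketch, not the script used for this report. The endpoint paths are the ns_server REST paths as I understand them from the Couchbase Server documentation and should be checked against the build under test. The `ns_1@` node names and the choice of failed-over node are assumptions (the report does not name them), and `repro_plan` is a hypothetical helper. It only builds the requests and sends nothing.

```python
from urllib.parse import urlencode

# Assumed node names: the report lists hosts 10.6.2.144-150 but not their otpNode names.
ALL_NODES = [f"ns_1@10.6.2.{i}" for i in range(144, 151)]
# Assumption: the report does not say which node was failed over.
FAILOVER_NODE = "ns_1@10.6.2.145"


def repro_plan(failover_node, all_nodes):
    """Return (step, path, form_body) tuples for steps 3, 5, 6 and 7."""
    graceful_path = "/controller/startGracefulFailover"
    graceful_body = urlencode({"otpNode": failover_node})
    return [
        (3, graceful_path, graceful_body),
        # Step 4 (killing memcached on another node) is done on the host,
        # outside the REST API, so it has no entry here.
        (5, graceful_path, graceful_body),
        (6, "/controller/setRecoveryType",
         urlencode({"otpNode": failover_node, "recoveryType": "delta"})),
        (7, "/controller/rebalance",
         urlencode({"knownNodes": ",".join(all_nodes), "ejectedNodes": ""})),
    ]


if __name__ == "__main__":
    for step, path, body in repro_plan(FAILOVER_NODE, ALL_NODES):
        print(f"step {step}: POST {path}  {body}")
```

      Each tuple would be sent as a form-encoded POST to the cluster's administration port with administrator credentials. Keeping the plan as data makes it easy to log exactly which call preceded the delta recovery failure.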

      We see the following warning: "Fail Over Warning: Rebalance required, some data is not currently replicated!"

      In step 7, the rebalance fails for delta recovery with a message that delta recovery is not possible, even though all nodes in the cluster appear to be in a healthy state. This happens with 200 K items; with 100 K items the same scenario passes.

      I am attaching the logs for analysis. The warning above appears in both cases, so I am not sure which internal state of the system prevents the add-back with delta recovery.

      Logs from the failing run (200 K items):
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.144-8222014-1436-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.144-8222014-1445-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.145-8222014-1437-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.145-8222014-1445-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.146-8222014-1439-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.146-8222014-1445-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.147-8222014-1440-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.147-8222014-1446-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.148-8222014-1441-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.148-8222014-1446-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.149-8222014-1442-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.149-8222014-1446-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.150-8222014-1444-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.150-8222014-1446-couch.tar.gz

      Logs from the passing run (100 K items):

      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.144-8222014-1458-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.144-8222014-159-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.145-8222014-150-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.145-8222014-159-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.146-8222014-151-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.146-8222014-159-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.147-8222014-153-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.147-8222014-159-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.148-8222014-1510-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.148-8222014-154-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.149-8222014-1510-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.149-8222014-156-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.150-8222014-1510-couch.tar.gz
      https://s3.amazonaws.com/bugdb/jira/MB-12037/10.6.2.150-8222014-157-diag.zip

          People

            parag Parag Agarwal (Inactive)