Couchbase Server / MB-4832

failed rebalance followed by failover leads to very large data loss

Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major
    • Fix Version/s: 2.0-developer-preview-4
    • Affects Version/s: 2.0-developer-preview-4
    • Component/s: couchbase-bucket
    • Security Level: Public
    • Labels: None
    • Environment: dp4 -rc715, 3-5 nodes

    Description

      After loading about 6 million docs into a 3-node cluster, I attempted to add 2 more nodes and rebalance, but the rebalance hung and I stopped it after ~20 minutes. The cluster, however, showed that the nodes had been added, so I failed over 2 of the original nodes. The data loss was 72% (6 million to 700k docs). That seems rather high, though I suspect it was due to a failure to distribute items to the nodes that had just been rebalanced in.

      If this behavior is expected, I wonder whether it would be possible to warn the user about how much data will be lost before the failover goes ahead.
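      A rough sketch of how such a warning could be estimated, not how the server actually does it. It assumes a vBucket map given as a list of chains (active node first, then replicas, with None for an unassigned slot) and a per-vBucket item count. The function name `estimate_failover_loss` and both input shapes are hypothetical and chosen only for illustration. A vBucket is treated as lost when every node holding a copy of it is in the set being failed over.

```python
from typing import Dict, List, Optional, Set, Tuple


def estimate_failover_loss(
    vbucket_map: List[List[Optional[str]]],
    item_counts: Dict[int, int],
    failover_nodes: Set[str],
) -> Tuple[int, int, float]:
    """Estimate how many items would be lost by failing over the given nodes.

    vbucket_map[i] is the chain for vBucket i: [active, replica1, ...].
    item_counts maps a vBucket id to its item count (hypothetical input).
    Returns (lost_items, total_items, lost_fraction).
    """
    lost = 0
    total = 0
    for vb_id, chain in enumerate(vbucket_map):
        count = item_counts.get(vb_id, 0)
        total += count
        # Nodes that currently hold a copy of this vBucket.
        holders = {node for node in chain if node is not None}
        # If no holder survives the failover, the vBucket's items are gone.
        if not (holders - failover_nodes):
            lost += count
    fraction = lost / total if total else 0.0
    return lost, total, fraction


if __name__ == "__main__":
    # Toy 3-node map with one replica per vBucket; vBucket 3 has no replica.
    vb_map = [["a", "b"], ["b", "c"], ["c", "a"], ["a", None]]
    counts = {0: 100, 1: 200, 2: 300, 3: 400}
    lost, total, fraction = estimate_failover_loss(vb_map, counts, {"a", "b"})
    print(f"failover would lose {lost} of {total} items ({fraction:.0%})")
```

      In the scenario above, a partly completed rebalance could leave many vBuckets with all of their copies on the original nodes, which is the case this check is meant to flag before the failover is confirmed.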

      The real bug is probably in rebalance. Diags attached.

          People

            alkondratenko Aleksey Kondratenko (Inactive)
            tommie Tommie McAfee (Inactive)