[Internal Only Doc] cbrecovery tool to recover data for missing partitions

    Details

    • Sprint:
      PCI Team - Sprint 1, PCI Team - Sprint 2, PCI Team - Sprint 3, PCI Team - Sprint 4

      Description

      Attached is FINAL draft content for review on cbrecovery for missing data partitions:

      "Data Recovery from Remote Clusters" on pages 114 – 118.

      Mandatory Reviewers: Bin Cui, Abhinav

      Optional Reviewers: PerryK, JamesM

      Bug tracking doc update: http://www.couchbase.com/issues/browse/MB-8017

      Review timeframe: *Please provide your review input by EOD Wednesday, May 22 of next week.

      *Please add comments directly to the PDF or to the bug link above.
      ==================================

      Provide a tool to help with data recovery when nodes (beyond the number of replicas) fail. Currently if a rebalance operation is performed without restoring data, it causes loss of data. To help restore data, we need to develop a tool that manages the following:

      • Check which partitions are missing altogether
      • Block service for those partitions
      • Recover the missing partitions from the backup cluster
      • Start service on those partitions
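
The first step in the list above (finding partitions that are missing altogether) can be sketched as follows. This is only an illustration and not the cbrecovery implementation: the function name `find_missing_partitions` and the sample map are hypothetical, and the sketch assumes a vBucket map given as a list of node-index chains (active first, then replicas) in which -1 marks a slot with no node.

```python
from typing import List


def find_missing_partitions(vbucket_map: List[List[int]]) -> List[int]:
    """Return the IDs of partitions that have no copy on any node.

    Assumption: each entry is a chain [active, replica1, ...] of node
    indexes, and a negative index means no node holds that copy.
    """
    missing = []
    for vbucket_id, chain in enumerate(vbucket_map):
        # A partition is missing altogether only when neither the
        # active copy nor any replica copy is assigned to a node.
        if all(node_index < 0 for node_index in chain):
            missing.append(vbucket_id)
    return missing


# Hypothetical 4-partition map with one replica per partition.
example_map = [[0, 1], [-1, -1], [1, -1], [-1, 0]]
print(find_missing_partitions(example_map))  # prints [1]
```

Partition 3 in the sample has lost its active copy but still has a replica, so it is not reported; only partitions with no surviving copy would need to be recovered from the backup cluster.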

      This will require data to be backed up using XDCR on a separate cluster. The first version of the tool will be available in the early April time frame. It will be productized in the 2.1 release.

      Additional details here:

      Engineering spec: http://hub.internal.couchbase.com/confluence/display/~farshid/cbrecovery+tool.
      PM requirements: http://hub.internal.couchbase.com/confluence/display/PM/CBRecovery+Tool

      Ticket: http://www.couchbase.com/issues/browse/CBSE-301
      Ticket: http://www.couchbase.com/issues/browse/MB-8017

      Test-Plan: http://hub.internal.couchbase.com/confluence/display/QA/Cbrecovery+test+plan
      Bin, design: https://github.com/couchbase/couchbase-cli/blob/2.0.2/docs/cbrecovery-design_spec.md

      Need to build by 2.0.2 / April

      Customer request

        Activity

        kzeller kzeller added a comment -

        <para>
        <emphasis role="bold">Replicas set to Zero for
        Bucket</emphasis>. In this case you have zero replicas for a
        bucket in your cluster; this is the functional equivalent of
        having more than one bucket in the cluster and each bucket in
        the cluster has a different number of replicas. You can
        experience loss of vBuckets in either case but can recover if
        you have XDCR set up with another cluster.
        </para>

        If you have questions on this part please contact Anil.

        perry Perry Krug added a comment -

        Thanks Karen. I understand technically what it is saying, but the reason I made that comment was because I don't think users will understand what it means or how to deal with a multi-bucket cluster...so I was hoping for some specific step-by-step descriptions to make it clearer to the reader.

        Thanks

        anil Anil Kumar added a comment -

        Perry, we have added step-by-step instructions with screenshots showing how the recovery process will take place, e.g.: 1. Fail over the server 2. Add a replacement server or make sure there is enough capacity 3. Run CBRecovery, etc.

        As for the second scenario, i.e. when a bucket has zero replicas, I get your point about adding some description for the multi-bucket case. We will do that. Thanks!

        kzeller kzeller added a comment -

        Added scenario and images from Anil.


          People

          • Assignee:
            kzeller kzeller
            Reporter:
            steve Steve Yen