Attached is the FINAL draft content for review, covering cbrecovery for missing data partitions:
"Data Recovery from Remote Clusters" on pages 114 – 118.
Mandatory Reviewers: Bin Cui, Abhinav
Optional Reviewers: PerryK, JamesM
Bug tracking doc update: http://www.couchbase.com/issues/browse/MB-8017
Review timeframe: Please provide your review input by EOD Wednesday, May 22.
Please add comments directly to the PDF or to the bug link above.
Provide a tool to help with data recovery when more nodes fail than there are replicas. Currently, if a rebalance operation is performed after such a failure without first restoring the data, the data is lost. To help restore data, we need to develop a tool that manages the following:
- Check which partitions are missing altogether
- Block service for those partitions
- Recover the missing partitions from the backup cluster
- Start service on those partitions
This requires that the data be backed up to a separate cluster using XDCR. The first version of the tool will be available in the early April time frame. It will be productized in the 2.1 release.
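The four steps listed above can be sketched roughly as follows. This is only an illustration of the intended flow, not the cbrecovery implementation: the function names, the partition map layout, and the dictionaries standing in for the main and backup clusters are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical shapes, for illustration only:
#   PartitionMap: partition id -> nodes holding a copy (active first, then replicas)
#   ClusterData:  partition id -> {key: value}
PartitionMap = Dict[int, List[str]]
ClusterData = Dict[int, Dict[str, str]]


def find_missing_partitions(partition_map: PartitionMap,
                            failed_nodes: Set[str]) -> List[int]:
    """Step 1: a partition is missing when every node holding a copy has failed."""
    return sorted(
        pid for pid, owners in partition_map.items()
        if all(node in failed_nodes for node in owners)
    )


def recover_partitions(partition_map: PartitionMap,
                       failed_nodes: Set[str],
                       main_data: ClusterData,
                       backup_data: ClusterData) -> List[int]:
    """Run the four steps against in-memory stand-ins for the two clusters."""
    missing = find_missing_partitions(partition_map, failed_nodes)

    # Step 2: block service for the missing partitions.
    blocked = set(missing)

    for pid in missing:
        # Step 3: copy the partition's data back from the backup cluster.
        main_data[pid] = dict(backup_data.get(pid, {}))
        # Step 4: resume service on the recovered partition.
        blocked.discard(pid)

    return missing


if __name__ == "__main__":
    partition_map = {0: ["n1", "n2"], 1: ["n2", "n3"], 2: ["n3", "n1"]}
    backup = {0: {"k0": "v0"}, 1: {"k1": "v1"}, 2: {"k2": "v2"}}
    main = {1: {"k1": "v1"}, 2: {"k2": "v2"}}  # partition 0 lost with n1 and n2
    print(recover_partitions(partition_map, {"n1", "n2"}, main, backup))  # [0]
```

In this sketch, only partition 0 is reported as missing, because partitions 1 and 2 each still have a copy on the surviving node n3.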
Additional details here:
Engineering spec: http://hub.internal.couchbase.com/confluence/display/~farshid/cbrecovery+tool.
PM requirements: http://hub.internal.couchbase.com/confluence/display/PM/CBRecovery+Tool
Need to build by 2.0.2 / April