Details
Bug
Resolution: Unresolved
Critical
5.5.5, 6.0.2, 6.5.0
Triaged
Unknown
Description
Currently delta recovery goes through the following steps:
1. Queries vbucket states on delta nodes and deletes those vbuckets that diverged (only in madhatter).
2. Creates a special transitional vbucket map with all vbuckets to be recovered listed as replicas.
3. Waits for all delta nodes to warm up all buckets of interest (no replications are created here).
4. At the beginning of bucket rebalance, janitor cleanup is called. This creates replications to all vbuckets to be recovered.
5. Alternatively, if the rebalance is interrupted, a regular janitor run creates the replications.
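Step 2 can be illustrated with a small model. This is only a sketch in Python, not the ns_server code (which is Erlang): it assumes a vbucket map is a list of chains of the form `[active, replica, ...]` and that each delta node knows which vbuckets it is recovering. The function name `transitional_map` and the `recover` argument are hypothetical.

```python
from typing import Dict, List, Optional, Set

# One chain per vbucket: [active, replica1, ...]; None marks an empty slot.
Chain = List[Optional[str]]


def transitional_map(current_map: List[Chain],
                     recover: Dict[str, Set[int]]) -> List[Chain]:
    """Return a copy of current_map in which every delta node is appended
    as a replica to the chains of the vbuckets it is recovering.

    `recover` (hypothetical) maps a delta node name to its vbucket ids.
    """
    result = []
    for vb, chain in enumerate(current_map):
        new_chain = list(chain)
        for node, vbuckets in sorted(recover.items()):
            if vb in vbuckets and node not in new_chain:
                new_chain.append(node)
        result.append(new_chain)
    return result


# Node n3 is being delta-recovered and still holds data for vbuckets 0 and 2.
current = [["n1", "n2"], ["n2", "n1"], ["n1", "n2"]]
trans = transitional_map(current, {"n3": {0, 2}})
print(trans)
```

In this model, any code that later turns the map into replications will see `n3` as an ordinary replica for vbuckets 0 and 2, which is why steps 4 and 5 end up creating replications for all recovered vbuckets.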
Steps 4 and 5 are problematic: both create replications for all vbuckets being recovered at once. If many vbuckets need to be rolled back, this may overload memcached on the delta nodes (for an example of this, see CBSE-7262).
This behavior is a bit better in madhatter with this commit: https://github.com/couchbase/ns_server/commit/35f3c77c08b39ab3744094fb00cb0dd3dfab054f. But that still doesn't address all cases.
If possible, we should try to address this in the madhatter time frame. It would seem that the way to address this is to add extra metadata to bucket configs for vbuckets being recovered, and to have ns_janitor:cleanup/janitor_agent:apply_new_bucket_config not create replications for those vbuckets.
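A minimal sketch of that idea, written in Python rather than Erlang and not taken from ns_server: it assumes the bucket config carries a hypothetical `delta_recovery_vbuckets` entry (delta node name to set of vbucket ids) next to the vbucket map, and that the janitor derives the wanted replications from the chains. The key name, the config shape, and `wanted_replications` are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# One chain per vbucket: [active, replica1, ...]; None marks an empty slot.
Chain = List[Optional[str]]
# (source node, destination node, vbucket id)
Replication = Tuple[str, str, int]


def wanted_replications(bucket_config: Dict) -> List[Replication]:
    """List the replications implied by the vbucket map, skipping
    replicas that are marked as still being delta-recovered."""
    vb_map: List[Chain] = bucket_config["map"]
    # Hypothetical metadata: delta node -> vbuckets it is recovering.
    recovering: Dict[str, Set[int]] = bucket_config.get(
        "delta_recovery_vbuckets", {})

    result = []
    for vb, chain in enumerate(vb_map):
        active = chain[0]
        if active is None:
            continue  # no active copy, nothing to replicate from
        for replica in chain[1:]:
            if replica is None:
                continue
            if vb in recovering.get(replica, set()):
                continue  # deferred: left to the recovery logic
            result.append((active, replica, vb))
    return result


config = {
    "map": [["n1", "n2", "n3"], ["n2", "n1"], ["n1", "n2", "n3"]],
    "delta_recovery_vbuckets": {"n3": {0, 2}},
}
replications = wanted_replications(config)
print(replications)
```

With the marker present, the replications to `n3` are left out; once the marker is removed from the config, the same function returns them again, so the point at which they get created is controlled by whoever owns the metadata.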
Gerrit Reviews
For Gerrit Dashboard: MB-35782

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
114998,17 | MB-35782 do not create replications during delta recovery all at the | master | ns_server | NEW | 0 | 0 |