Description
This is the plan for turning P2P back on globally.
1. When P2P is turned on, each node waits at the start of every replication for all other nodes to respond to its pull request.
The first step is to loosen this restriction. If a node in the cluster is slow, the pipeline should start once the timeout occurs, even if that means starting from seqno 0, which is the pre-MB-9982 behavior.
2. Once the change is in, we will run system tests/longevity tests on couchstore with P2P turned on in the test.
2a. We should also run Pavithra Mahamani's feature test with these changes.
3. If the runs turn up any issues, I'll address them.
4. Once the runs are clean, we can re-enable P2P globally.
5. Once P2P is turned on globally, we can tackle any remaining issues that are not on the critical path.
Issue Links
- relates to
  - MB-49481 Investigate timeout observed in longevity with p2p enabled - RuntimeCtx:Execution timed out - Pipeline did not start in a timely manner, possibly due to busy source or target. Will try again... (Closed)
  - MB-49559 Investigate error observed in longevity with p2p enabled - Pipeline did not start in a timely manner, possibly due to busy source or target. Will try again... (Closed)
For Gerrit Dashboard: MB-49377

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
165188,6 | MB-49377: allow pipeline to start if p2p ckpt pull times out | master | goxdcr | MERGED | +2 | +1 |
165418,4 | MB-49377 - developing multi-cluster rebalancing test cases | master | goxdcr | MERGED | +2 | +1 |
165935,2 | MB-49377: re-enable p2p ckpt pull | master | goxdcr | MERGED | +2 | +1 |