Details
- Bug
- Resolution: Fixed
- Major
- 7.6.0, 7.1.4, 7.1.0, 7.1.1, 7.1.2, 7.2.0, 7.1.3, 7.2.1, 7.1.5, 7.2.2
- Untriaged
- 0
- Yes
Description
XDCR's backfill pipeline was designed such that the following occurs:
- Backfill pipeline is created and asks for a set of VBs
- Streams are created for that subset of VBs to request from KV DCP
- Once a single VB has finished out of the original subset of VBs, XDCR starts a timer
- If all the originally requested VBs finish before the timer expires, all is well.
- If some of the originally requested VBs have not finished when the timer fires, the pipeline restarts with the unfinished set of VBs.
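The restart step in the last bullet amounts to a set difference between the requested VBs and the finished ones. A minimal sketch, assuming VBs are identified by `uint16` numbers; the function name `unfinishedVBs` and its signature are hypothetical and are not taken from the XDCR source:

```go
package main

import "sort"

// unfinishedVBs is a hypothetical helper (not an XDCR API). Given the VBs the
// backfill pipeline originally requested and the set that finished before the
// timer fired, it returns the VBs the restarted pipeline would ask for,
// in ascending order.
func unfinishedVBs(requested []uint16, finished map[uint16]bool) []uint16 {
	remaining := make([]uint16, 0, len(requested))
	for _, vb := range requested {
		// Keep every requested VB that is not marked as finished.
		if !finished[vb] {
			remaining = append(remaining, vb)
		}
	}
	sort.Slice(remaining, func(i, j int) bool { return remaining[i] < remaining[j] })
	return remaining
}
```

An empty result corresponds to the "all is well" case, in which no restart is needed.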
MB-57304 in 7.2.2 introduced a DCP backfill limit of 64 streams.
As a result, only 64 VBs proceed at once while the rest wait. As soon as 1 VB of the first batch finishes, the timer starts, so the assumption that all VBs proceed at the same time no longer holds.
The end result is that the timer is too aggressive: it can fire before VBs outside the first batch of 64 have had a chance to finish.
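The effect can be illustrated with a deliberately simplified model. The sketch below assumes that VBs are served in batches of `streamLimit`, that every VB takes the same `ticksPerVB` logical ticks, and that the timer starts when the first batch finishes. None of these simplifications, nor the function name, come from the XDCR or KV code:

```go
package main

// unfinishedAtTimeout is a hypothetical model (not XDCR code). It returns how
// many VBs are still unfinished when a fixed timer of timeoutTicks, started
// when the first VB finishes, fires.
//
// Simplifying assumptions: VBs run in batches of streamLimit, every VB takes
// exactly ticksPerVB ticks, and a VB finishing exactly at the deadline counts
// as finished.
func unfinishedAtTimeout(totalVBs, streamLimit, ticksPerVB, timeoutTicks int) int {
	if totalVBs <= 0 {
		return 0
	}
	if streamLimit <= 0 || ticksPerVB <= 0 {
		return totalVBs // nothing can make progress in this model
	}
	// The first batch finishes at ticksPerVB, which is when the timer starts.
	deadline := ticksPerVB + timeoutTicks
	// Batch k (1-based) finishes at k*ticksPerVB.
	batchesDone := deadline / ticksPerVB
	finished := batchesDone * streamLimit
	if finished > totalVBs {
		finished = totalVBs
	}
	return totalVBs - finished
}
```

In this model, `unfinishedAtTimeout(1024, 64, 10, 5)` leaves 960 VBs unfinished when the timer fires, whereas with no effective stream limit (`streamLimit` equal to `totalVBs`) all VBs finish in time.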
We should revisit the timeout to make it more intelligent than a blanket timer. For example, the timer could be reset whenever the number of finished VBs is still progressing.
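One way to picture the suggestion is an idle detector whose deadline is pushed out every time another VB finishes. This is only a sketch of the idea and not the fix that was actually implemented; the type and method names are hypothetical, and times are passed in as logical milliseconds to keep the example deterministic:

```go
package main

// idleDetector is a hypothetical progress-aware replacement for a blanket
// timer. It asks for a restart only when no VB has finished for idleTimeoutMs.
type idleDetector struct {
	idleTimeoutMs  int64
	lastProgressMs int64
	finished       int
	started        bool
}

func newIdleDetector(idleTimeoutMs int64) *idleDetector {
	return &idleDetector{idleTimeoutMs: idleTimeoutMs}
}

// vbFinished records that one more VB completed at nowMs. The first call
// starts the idle clock; every later call resets it.
func (d *idleDetector) vbFinished(nowMs int64) {
	d.finished++
	d.lastProgressMs = nowMs
	d.started = true
}

// shouldRestart reports whether the pipeline should restart with the
// unfinished VBs: the clock has started, not all totalVBs are done, and no VB
// has finished for at least idleTimeoutMs.
func (d *idleDetector) shouldRestart(nowMs int64, totalVBs int) bool {
	if !d.started || d.finished >= totalVBs {
		return false
	}
	return nowMs-d.lastProgressMs >= d.idleTimeoutMs
}
```

With this shape, a pipeline that is steadily working through batches of 64 streams keeps resetting the clock, and only a genuinely stalled pipeline reaches the timeout.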
Issue Links
- is a backport of MB-59499 XDCR - make backfill pipeline idle detection more intelligent (Closed)