Details
- Bug
- Resolution: Fixed
- Major
- Cheshire-Cat
- Untriaged
- 1
- No
Description
I see a bunch of
2021-02-27T18:41:18.792-08:00 ERRO GOXDCR.GenericSupervisor: PipelineSupervisor_2c3bc0c5670030c5aed22087051b502d/bucket2/bucket2 Received error report : Collections Router 2c3bc0c5670030c5aed22087051b502d/bucket2/bucket2 error - unable to find last known target manifest version 3386 from collectionsManifestSvc - err: Unable to find target manifest for version 3386
that may need some investigating
Update:
It is possible that when a pipeline resumes, all the VBs are able to resume a DCP stream. This leads XDCR to declare the pipeline "ready for checkpoint" (checkpoint manager's isCheckpointAllowed()).
However, if XMEM is stuck or things are timing out, it is possible for certain VBs in ThroughSeqnoTracker to not have any data flow. Then, the next time ckpt mgr performs a checkpoint, it may retrieve 0 for the target manifest ID and commit that into a checkpoint. If this happens often enough, all the ckpts end up with 0 as their target manifest ID. This will then lead to the collections manifest service throwing away all manifests for the target.
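The failure mode described above can be illustrated with a minimal Go sketch. It is not goxdcr code and not the merged fix: every type and function name here (`checkpointRecord`, `manifestTracker`, `naiveCheckpoint`, `guardedCheckpoint`, `referencedManifests`) is hypothetical, and the sketch assumes the manifest service keeps only manifests that some checkpoint still references. It shows how a VB with no data flow yields a target manifest ID of 0, and one possible guard that carries the previous checkpoint's ID forward instead.

```go
package main

import "fmt"

// checkpointRecord is a hypothetical, simplified per-VB checkpoint.
type checkpointRecord struct {
	VBNo             uint16
	Seqno            uint64
	TargetManifestID uint64
}

// manifestTracker is a hypothetical stand-in for the per-VB bookkeeping of the
// last target manifest ID seen in the data flow.
type manifestTracker struct {
	lastSeen map[uint16]uint64
}

// targetManifestID returns 0 when no data has flowed for the VB, because a
// missing map key yields the zero value.
func (m *manifestTracker) targetManifestID(vb uint16) uint64 {
	return m.lastSeen[vb]
}

// naiveCheckpoint commits whatever the tracker reports, including 0.
func naiveCheckpoint(m *manifestTracker, vb uint16, seqno uint64) checkpointRecord {
	return checkpointRecord{VBNo: vb, Seqno: seqno, TargetManifestID: m.targetManifestID(vb)}
}

// guardedCheckpoint is one possible guard (an assumption, not the actual fix):
// if the tracker reports 0, reuse the ID from the previous checkpoint.
func guardedCheckpoint(m *manifestTracker, vb uint16, seqno uint64, prev checkpointRecord) checkpointRecord {
	id := m.targetManifestID(vb)
	if id == 0 {
		id = prev.TargetManifestID
	}
	return checkpointRecord{VBNo: vb, Seqno: seqno, TargetManifestID: id}
}

// referencedManifests collects the non-zero target manifest IDs that the given
// checkpoints still refer to. Under the sketch's assumption, anything not in
// this set is eligible to be discarded.
func referencedManifests(ckpts []checkpointRecord) map[uint64]bool {
	refs := make(map[uint64]bool)
	for _, c := range ckpts {
		if c.TargetManifestID != 0 {
			refs[c.TargetManifestID] = true
		}
	}
	return refs
}

func main() {
	// The pipeline has resumed, but no data has flowed for VB 5 yet.
	tracker := &manifestTracker{lastSeen: map[uint16]uint64{}}
	prev := checkpointRecord{VBNo: 5, Seqno: 100, TargetManifestID: 3386}

	naive := naiveCheckpoint(tracker, 5, 100)
	guarded := guardedCheckpoint(tracker, 5, 100, prev)

	fmt.Println("naive target manifest ID:", naive.TargetManifestID)
	fmt.Println("guarded target manifest ID:", guarded.TargetManifestID)
	fmt.Println("3386 still referenced (naive):", referencedManifests([]checkpointRecord{naive})[3386])
	fmt.Println("3386 still referenced (guarded):", referencedManifests([]checkpointRecord{guarded})[3386])
}
```

In the naive path, manifest version 3386 is no longer referenced by any checkpoint, which matches the "unable to find last known target manifest version 3386" error in the log above.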
Issue Links
- relates to: MB-44734 XDCR - collections router incorrectly handles manifest for rollback (Closed)
For Gerrit Dashboard: MB-44683

| # | Subject | Branch | Project | Status | CR | V |
|---|---|---|---|---|---|---|
| 147659,7 | MB-44683 - tgt manifests can be lost when pipeline restarts due to target errors | master | goxdcr | MERGED | +2 | +1 |