Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: 3.1.6, 4.1.2, 4.5.1, 5.0.0
Affects Version/s: 3.1.3
Component/s: ns_server
Labels: None
Triage: Untriaged
Is this a Regression?: Unknown

Description

As seen by a customer.

I managed to reproduce this by modifying the code so that it stores a checkpoint at the start of each replication.
Here is the sequence of events that sends the xdcr replicator into an infinite loop:
1. Some data for a vbucket is replicated.
2. A checkpoint is created with the latest seqno and UUID.
3. The vbucket becomes a replica (another node is added to the cluster).
4. The stream closes with status=2; dcp_notifier still remembers the last seqno and UUID.
5. The xdcr replicator pings the notifier several times without a stream being created, because the seqno and UUID match.
6. At this point it does not go into an infinite loop, because dcp_notifier is eventually terminated due to the vbucket becoming a replica.
7. The vbucket becomes active (the added node is removed from the cluster).
8. The xdcr replicator is created; it reads the seqno and UUID from the checkpoint.
9. The xdcr replicator calls dcp_notifier and gets an immediate reply, since the stream is closed and the seqno and UUID match.
10. It wakes up, doesn't get any data, and then calls dcp_notifier again with the same seqno and UUID.
11. Infinite loop.
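
The loop in steps 9 to 11 can be illustrated with a toy Python model. This is only a sketch and not the ns_server (Erlang) code: `Notifier`, `subscribe`, `run_replicator` and `fetch` are hypothetical names, and the immediate-reply rule is an assumption taken from steps 5 and 9. The `max_wakeups` cap exists only so that the demo terminates.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Notifier:
    """Toy stand-in for dcp_notifier (hypothetical, not the real module).

    It remembers the seqno and UUID of the last stream it saw close.
    """
    last_seqno: int
    last_uuid: str
    stream_open: bool = False

    def subscribe(self, seqno: int, uuid: str) -> bool:
        """Return True if the caller is woken up immediately."""
        # Assumed rule (steps 5 and 9): the stream is closed and the
        # requested position matches the remembered one, so reply at
        # once without opening a new stream.
        if not self.stream_open and (seqno, uuid) == (self.last_seqno, self.last_uuid):
            return True
        # Otherwise a stream is opened and the caller would wait for data.
        self.stream_open = True
        return False


def run_replicator(notifier: Notifier,
                   checkpoint: Tuple[int, str],
                   fetch: Callable[[int], int],
                   max_wakeups: int = 5) -> int:
    """Toy replicator loop; returns how many immediate wakeups occurred."""
    seqno, uuid = checkpoint          # step 8: position read from the checkpoint
    wakeups = 0
    while wakeups < max_wakeups:      # cap added only for the demo
        if not notifier.subscribe(seqno, uuid):
            break                     # a real replicator would block here
        wakeups += 1                  # step 10: woken up ...
        seqno = fetch(seqno)          # ... but no data means seqno is unchanged
    return wakeups


if __name__ == "__main__":
    no_new_data = lambda seqno: seqno
    # Checkpoint matches the remembered position: every call returns at once.
    print(run_replicator(Notifier(100, "uuid-a"), (100, "uuid-a"), no_new_data))
    # Checkpoint is behind: a stream is opened instead and the loop stops.
    print(run_replicator(Notifier(100, "uuid-a"), (90, "uuid-a"), no_new_data))
```

In this model the first call spins until the cap is hit, which corresponds to step 11, while the second call opens a stream and stops spinning. Without the cap, the first case would never exit.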