Verified the fix using Enterprise Edition 6.6.5 build 10086.
STEPS TO REPRODUCE
- Cluster consists of 2 KV nodes and 1 eventing node.
- Create 4 buckets, namely src_bucket, metadata, dst_bucket, and dst_bucket1.
- Create 2 handlers, bucket_op and timers, using the first 2 buckets as the source and metadata buckets, and the remaining 2 buckets as bucket aliases, 1 for each handler.
- Deploy both eventing handlers.
- Stop persistence on the active kv node.
- Load 50,000 documents into the source bucket.
- Once the consumer receives all mutations, kill memcached on the active node to trigger a rollback scenario.
CASE A
Reproduced original issue on 6.6.5-10080.
Eventing rolls back to the sequence number that DCP requests, which here is 0.
2022-02-16T21:32:40.650-08:00 [Info] Consumer::dcpRequestStreamHandle [worker_timers_0:/tmp/127.0.0.1:8091_0_3400968880.sock:26360] vb: 1 DCP stream start vbKvAddr: 172.23.106.64:11210 vbuuid: 222891922269008 startSeq: 1 snapshotStart: 1 snapshotEnd: 1
2022-02-16T21:32:40.651-08:00 [Info] Consumer::processDCPEvents [worker_timers_0:/tmp/127.0.0.1:8091_0_3400968880.sock:26360] vb: 1 got STREAMREQ status: ROLLBACK
2022-02-16T21:32:41.087-08:00 [Info] Consumer::handleFailoverLog [worker_timers_0:/tmp/127.0.0.1:8091_0_3400968880.sock:26360] vb: 1 rollback requested by DCP. Retrying DCP stream start vbuuid: 222891922269008 startSeq: 0 flog startSeqNo: 0
CASE B
Verified the fix on 6.6.5-10086.
In certain scenarios, Eventing avoids restarting the DCP stream from 0 by using a previous vbuuid available in the failover log.
For example:
2022-02-22T02:49:10.452-08:00 [Warn] DCPT[eventing:zWOibW56-8240:eventing:zWOibW56-8239:worker_timers_0_0_172.23.120.107:11210_172.23.104.67:8096/0] ##18e STREAMREQ(398) with rollback 0
2022-02-22T02:49:16.452-08:00 [Info] Consumer::processDCPEvents [worker_timers_0_0:/tmp/127.0.0.1:8091_0_3689154071.sock:26618] vb: 398 got STREAMREQ status: ROLLBACK
2022-02-22T02:49:16.452-08:00 [Info] Consumer::handleFailoverLog [worker_timers_0_0:/tmp/127.0.0.1:8091_0_3689154071.sock:26618] vb: 398 rollback requested by DCP. New vbuuid: 134910398526963 startSeq: 0 flog startSeqNo: 1306003
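To compare CASE A and CASE B across a whole eventing log rather than by eye, the handleFailoverLog lines can be scanned with a small script. This is a minimal sketch: the regular expression is derived only from the two log lines quoted in this comment, the names Rollback, find_rollbacks and restarted_from_zero are hypothetical, and it assumes that "flog startSeqNo" is the sequence number the retried stream request starts from.

```python
import re
from typing import Iterable, List, NamedTuple


class Rollback(NamedTuple):
    vb: int
    vbuuid: int
    start_seq: int
    flog_start_seq: int


# Matches both variants seen above ("Retrying DCP stream start vbuuid: ..."
# and "New vbuuid: ..."). Other builds may log a different format.
_ROLLBACK_RE = re.compile(
    r"Consumer::handleFailoverLog .*? vb: (\d+) rollback requested by DCP\."
    r".*?vbuuid: (\d+) startSeq: (\d+) flog startSeqNo: (\d+)"
)


def find_rollbacks(lines: Iterable[str]) -> List[Rollback]:
    """Collect every rollback reported by handleFailoverLog."""
    found = []
    for line in lines:
        match = _ROLLBACK_RE.search(line)
        if match:
            found.append(Rollback(*(int(g) for g in match.groups())))
    return found


def restarted_from_zero(rollback: Rollback) -> bool:
    """Assumption: a flog startSeqNo of 0 means the stream restarts from 0."""
    return rollback.flog_start_seq == 0
```

Calling find_rollbacks on the lines of eventing.log and filtering with restarted_from_zero lists the vbuckets that still restart from 0. Under the assumption above, the CASE A line is flagged and the CASE B line is not.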
Build couchbase-server-6.6.5-10086 contains eventing commit 2049505 with commit message:
MB-50947: Store failover log in checkpoint blob
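The idea behind keeping the failover log in the checkpoint can be sketched as follows. This is an illustration of the general DCP technique, not the code in commit 2049505. It assumes the failover log is a newest-first list of (vbuuid, seqno) pairs, where seqno is the point at which that vbuuid was created, so history up to that seqno is shared with the next older entry. The name fallback_stream_request is hypothetical.

```python
from typing import List, Optional, Tuple

# (vbuuid, seqno at which that vbuuid was created), newest entry first.
FailoverEntry = Tuple[int, int]


def fallback_stream_request(
    flog: List[FailoverEntry],
    rejected_vbuuid: int,
    last_seqno: int,
) -> Optional[Tuple[int, int]]:
    """Pick an older (vbuuid, start_seqno) to retry with after a ROLLBACK.

    Returns None when no older entry exists, in which case the caller
    has to restart the stream from 0.
    """
    for index, (vbuuid, created_at) in enumerate(flog):
        if vbuuid != rejected_vbuuid:
            continue
        if index + 1 >= len(flog):
            return None  # the rejected entry is already the oldest one
        previous_vbuuid, _ = flog[index + 1]
        # Mutations up to created_at belong to the older branch as well,
        # so never ask for more than that, nor more than was processed.
        return previous_vbuuid, min(last_seqno, created_at)
    return None  # rejected vbuuid is not in the stored failover log


# Illustrative values only, not taken from the logs above.
stored_flog = [(222, 1500), (111, 0)]
retry = fallback_stream_request(stored_flog, rejected_vbuuid=222, last_seqno=2000)
```

Without a stored failover log the consumer knows only its latest vbuuid, so a rejected stream request leaves it no option other than restarting from 0, which matches the behaviour in CASE A.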