Description
The customer has a cluster with 3 data nodes (02, 05, 06) and 8 replications out of the source cluster. One of the replications is in an inconsistent state: on node 02 it is running without any problem, but on nodes 05 and 06 it is not running. We see:
2023-05-07T20:10:00.887+03:00 INFO GOXDCR.ReplMgr: Updating status for paused replication ... total_docs=14535906, docs_processed=0, changes_left=14535906
Pausing and resuming the replication does not fix the problem. We see in the log:
2023-05-08T15:27:47.856+03:00 INFO GOXDCR.PipelineMgr: Update-now message is already delivered for ...
We see 7 updater goroutines on nodes 05 and 06, while node 02 has 8 (one for each of the 8 replications):
7 @ 0x43d376 0x44ccf2 0xa0105c 0x483822 0xa00e07 0xa00dd5 0x46cde1
# 0xa0105b github.com/couchbase/goxdcr/pipeline_manager.(*PipelineUpdater).run.func1+0x21b /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipeline_manager.go:1534
# 0x483821 sync.(*Once).doSlow+0xc1 /home/couchbase/.cbdepscache/exploded/x86_64/go-1.18.7/go/src/sync/once.go:68
# 0xa00e06 sync.(*Once).Do+0x46 /home/couchbase/.cbdepscache/exploded/x86_64/go-1.18.7/go/src/sync/once.go:59
# 0xa00dd4 github.com/couchbase/goxdcr/pipeline_manager.(*PipelineUpdater).run+0x14 /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipeline_manager.go:1523
So the problem nodes (05 and 06) are each missing the PipelineUpdater goroutine for this replication's pipeline.
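The per-node comparison above can be repeated mechanically. The following is a minimal sketch, not part of goxdcr, assuming the goroutine dump is in the text layout shown in the excerpt (a `<count> @ <addresses>` header followed by `#`-prefixed frame lines). The helper name `countGoroutines` and the shortened sample paths are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// countGoroutines returns how many goroutines in a text goroutine dump have
// a stack frame containing fn. Each group in the assumed layout starts with
// a "<count> @ <addresses>" header followed by "#"-prefixed frame lines.
func countGoroutines(dump, fn string) int {
	total, count, matched := 0, 0, false
	sc := bufio.NewScanner(strings.NewReader(dump))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "#") {
			// Frame line: add the group's count once, on the first match.
			if !matched && strings.Contains(line, fn) {
				total += count
				matched = true
			}
			continue
		}
		// Any non-frame line ends the previous group.
		count, matched = 0, false
		if head, _, ok := strings.Cut(line, " @"); ok {
			if n, err := strconv.Atoi(head); err == nil {
				count = n
			}
		}
	}
	return total
}

func main() {
	// Hypothetical sample modeled on the excerpt above (paths shortened).
	sample := "7 @ 0x43d376 0x44ccf2 0xa0105c\n" +
		"# 0xa0105b github.com/couchbase/goxdcr/pipeline_manager.(*PipelineUpdater).run.func1+0x21b pipeline_manager.go:1534\n" +
		"# 0xa00dd4 github.com/couchbase/goxdcr/pipeline_manager.(*PipelineUpdater).run+0x14 pipeline_manager.go:1523\n"
	fmt.Println(countGoroutines(sample, "(*PipelineUpdater).run"))
}
```

Running the same count against the dump from each node and comparing it with the number of replications (8 here) shows which nodes are short an updater.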
Issue Links
- relates to MB-57353 XDCR RAS (Open)