Details
- Bug
- Resolution: Fixed
- Major
- master
- None
Description
Steps to repro:
- Start a 3-node Couchbase Server cluster using Enterprise Edition 5.0.0 build 3456
- Replicas: 1
- Autofailover: enabled (max timeout: 30s)
- Create a data source with cbdatasource.NewBucketDataSource(), passing in an array of URLs: ["http://host1:8091", "http://host2:8091", "http://host3:8091"]
- On host2, run sudo systemctl stop couchbase-server to abruptly stop the node
- Add docs to Couchbase Server via the GoCB SDK
It appears that the cbdatasource client (Sync Gateway) does not receive DCP DataUpdate messages for docs that hash to vbuckets owned by host2, the node that was abruptly stopped. The expected behavior is that the cluster map would be refreshed by some mechanism, the client would notice that another node is now the active node for those vbuckets, and events for those vbuckets would then be received over the DCP feed from that node.
If the cluster is rebalanced to remove host2 from the server list, the cbdatasource client still does not receive the DCP DataUpdate messages.
If host2 is restarted, rejoins the cluster, and a rebalance add operation is run, the cbdatasource client then receives the DCP DataUpdate messages for those vbuckets.
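To make the expected behavior concrete, here is a minimal Go sketch of how a client could compare two cluster maps and find the vbuckets whose active node changed. It is not the cbdatasource implementation. It assumes the bucket config exposes a vBucketServerMap with a serverList and a vBucketMap whose first entry per vbucket is the index of the active node. The helper names (activeOwners, movedVBuckets), the 4-vbucket sample maps, and the 11210 data port are hypothetical and used for illustration only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical sample configs (illustration only): 4 vbuckets, 1 replica.
// sampleBefore has all three nodes; sampleAfter is the map after host2 is failed over.
const sampleBefore = `{"vBucketServerMap":{"serverList":["host1:11210","host2:11210","host3:11210"],"vBucketMap":[[0,1],[1,2],[2,0],[1,0]]}}`
const sampleAfter = `{"vBucketServerMap":{"serverList":["host1:11210","host3:11210"],"vBucketMap":[[0,1],[1,-1],[1,0],[0,-1]]}}`

// bucketConfig models only the part of the bucket config this sketch needs.
type bucketConfig struct {
	VBucketServerMap struct {
		ServerList []string `json:"serverList"`
		VBucketMap [][]int  `json:"vBucketMap"`
	} `json:"vBucketServerMap"`
}

// activeOwners returns the active node address for each vbucket,
// or "" when the map lists no valid active node for it.
func activeOwners(raw []byte) ([]string, error) {
	var cfg bucketConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	m := cfg.VBucketServerMap
	owners := make([]string, len(m.VBucketMap))
	for vb, chain := range m.VBucketMap {
		// chain[0] is the index of the active node in serverList.
		if len(chain) > 0 && chain[0] >= 0 && chain[0] < len(m.ServerList) {
			owners[vb] = m.ServerList[chain[0]]
		}
	}
	return owners, nil
}

// movedVBuckets maps each vbucket whose active node differs between
// the two maps to its new active node.
func movedVBuckets(before, after []string) map[uint16]string {
	moved := map[uint16]string{}
	for vb := range after {
		if vb >= len(before) || before[vb] != after[vb] {
			moved[uint16(vb)] = after[vb]
		}
	}
	return moved
}

func main() {
	before, err := activeOwners([]byte(sampleBefore))
	if err != nil {
		panic(err)
	}
	after, err := activeOwners([]byte(sampleAfter))
	if err != nil {
		panic(err)
	}
	// A DCP client would need to reopen streams for these vbuckets on the new node.
	for vb, node := range movedVBuckets(before, after) {
		fmt.Printf("vbucket %d is now active on %s\n", vb, node)
	}
}
```

With the sample maps, vbuckets 1 and 3 (both previously active on host2) are reported as moved. The behavior we expected is that the DCP streams for such vbuckets would be re-established against the new active node.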
Adam Fraser and I spent some time digging through the cbdatasource code to understand how the cluster map is reloaded in this scenario, but we couldn't find the mechanism (it may be there and we just couldn't spot it). Is there anything different about the way we need to call cbdatasource to make this work? The Sync Gateway code that interfaces with cbdatasource is in dcp_feed.go.
Related Sync Gateway issue: SG Issue 2197
go-couchbase version: 6c44a8829958bfe71283ed9fec2c28d722a3be27