Description
What's the issue?
When streaming data from a cluster during a rebalance, it's possible to end up in a situation where:
1) The 'OpenStream' request is cancelled (by gocbcore and not the user)
2) The 'End' function is called with an 'ErrRequestCanceled' error
3) The 'OpenStream' callback is never run
Why is this a problem?
When cbbackupmgr streams vBuckets sequentially, we expect the 'OpenStream' callback to always be run, whether or not the request failed. If we handle 'ErrRequestCanceled' as a known error and trigger a retry for the stream, we end up with a stream request timeout because the 'OpenStream' callback for the original request was never run.
Steps to reproduce
I've attached a small script which will stream all the data from the cluster in a sequential fashion. This will be used to help reproduce the issue; you may need to change some of the constants at the top of the script to ensure that it works for you (they should be very clear).
1) Install Couchbase server on three nodes
2) Cluster two of the nodes together
3) Create a bucket and load a decent amount of data into it (I'm using 750,000 50-byte documents)
4) Use the 'Servers' tab in the web UI to remove a single node
5) Add the third node into the cluster
6a) Trigger a rebalance
6b) Run the test script provided
Most vBuckets should stream successfully. However, for at least one vBucket the script should print that the stream errored, after which the stream request will time out (at which point the script continues streaming data from the other vBuckets).
(vb 128) Stream errored with: request canceled
(vb 128) Stream request timed out
Attachments
Issue Links
- causes: MB-39017 CBM failed to backup during cluster rebalance (Closed)
For Gerrit Dashboard: GOCBC-874

# | Subject | Branch | Project | Status | CR | V
---|---|---|---|---|---|---
126834,3 | GOCBC-874: Always remove reqs from waitingIn if status != success | master | gocbcore | MERGED | +2 | +1
138470,4 | Update agent_diag.go | master | gocbcore | ABANDONED | -1 | 0