Details
Bug
Resolution: Fixed
Major
Description
It's possible for the SDK to establish a connection to the server and for the server to then terminate that connection during bootstrap. When this happens, the SDK can end up trying to dispatch one of the bootstrap operations on a memdclient that is already closed.
The memdclient rejects the operation with a request cancelled error. If this error is returned for a bootstrap-critical operation (e.g. auth), bootstrap fails and the error is returned to the memdclient dialler.
If bootstrap fails against a node, the dialler normally waits a set amount of time before re-attempting bootstrap against that node. This wait does not happen when bootstrap fails with a request cancelled error, as in the scenario above. The SDK therefore retries bootstrap immediately, which can retrigger the same sequence from the top and lead to a tight loop of bootstrapping.
One effect of this is that we create a large number of buffered readers for connections in rapid succession, each 20MB, which can cause the SDK's memory usage to balloon.
Attachments
For Gerrit Dashboard: GOCBC-1177

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
162906,4 | GOCBC-1177: Be more selective on when to apply node dial backoff | master | gocbcore | MERGED | +2 | +1 |