Details
Type: Bug
Resolution: Fixed
Priority: Major
Description
It's possible for the SDK to establish a connection to the server and then for the server to terminate that connection during bootstrap. When this happens, the SDK can get into a state where the memdclient is already closed but the SDK still tries to dispatch one of the bootstrap operations.
The memdclient then rejects the operation with a request cancelled error. If this error is returned for a bootstrap-critical operation (e.g. auth), bootstrap fails and the error is returned to the memdclient dialler.
If bootstrap fails against a node, the dialler normally waits a set amount of time before re-attempting bootstrap against that node. This wait does not happen when bootstrap fails with a request cancelled error, as in the scenario above. The SDK therefore retries bootstrap immediately, which can retrigger the whole sequence from the top and result in a tight loop of bootstrap attempts.
One effect of this is that the SDK creates a large number of buffered readers for connections in rapid succession, each 20MB. This can cause the SDK's memory usage to explode.