Details
-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
2.11.4, 2.12.3
-
None
Description
As explained and fixed in [this pull request|https://github.com/couchbase/spymemcached/pull/38], when the memcached server returns an error code in the response payload (such as ERR_2BIG or ERR_BUSY), the operation unexpectedly transitions to COMPLETED. If that operation still has pending writes, the attempt to send them throws a NullPointerException, which ultimately kills the MemcachedIO thread. All IO then ceases irrecoverably, yet the API continues to accept requests without signaling an error.
This is difficult to detect without a global thread exception handler, because the exception is thrown on the IO thread rather than in the calling code.
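For anyone trying to confirm whether they are hitting this, here is a minimal sketch of a global handler using the standard {{Thread.setDefaultUncaughtExceptionHandler}} call. It does not touch spymemcached at all: the class name {{IoThreadDeathDetector}}, the thread name {{simulated-io-thread}}, and the simulated NPE are all made up for illustration, and a real deployment would likely forward to its own logging or alerting instead of stderr.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper, not part of spymemcached: records and logs any thread
// that dies from an uncaught exception.
class IoThreadDeathDetector {
    // Name of the most recent thread that died, and the exception that killed it.
    static final AtomicReference<String> lastDeadThread = new AtomicReference<>();
    static final AtomicReference<Throwable> lastError = new AtomicReference<>();

    // Installs a JVM-wide default handler. It is invoked on the dying thread
    // for any thread that has no handler of its own.
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            lastDeadThread.set(thread.getName());
            lastError.set(error);
            System.err.println("Thread died: " + thread.getName() + ": " + error);
        });
    }

    public static void main(String[] args) throws InterruptedException {
        install();

        // Stand-in for the IO thread: it dies with an NPE, similar in spirit
        // to the failure described in this issue (the name is an assumption).
        Thread io = new Thread(() -> {
            throw new NullPointerException("simulated failure while writing");
        }, "simulated-io-thread");
        io.start();
        io.join();

        // The handler runs before the thread terminates, so the record is
        // already populated once join() returns.
        System.out.println("Detected death of: " + lastDeadThread.get());
    }
}
```

Calling {{install()}} once at application startup is enough. Without something like this, the only visible symptom is requests that are accepted but never complete.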
This has been the root cause of a mysterious performance degradation in production for us for over a year, and the linked PR resolves it. I am posting the issue here in the hope that it gets more visibility, as the GitHub issue tracker doesn't seem to see much Couchbase activity.