While investigating an issue, I found that the server unexpectedly responds with a 0x0a command to optimized get requests. This appears to violate the protocol, and it causes all kinds of havoc for a client library.
The scenario is spymemcached performing an optimized get internally and reading the response for an operation during a rebalance. It has been observed both with a node being rebalanced into the cluster and with a node leaving the cluster.
After reading the minimum header, the client library checks the magic and then the response command. It expects a 0x00, but receives a 0x0a.
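To make the check concrete, here is a minimal sketch of that header validation. It is not spymemcached's actual code: the class and method names are hypothetical, and it assumes the standard memcached binary protocol layout (24-byte header, response magic 0x81 in byte 0, command opcode in byte 1).

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not taken from spymemcached.
final class ResponseHeaderCheck {
    // Assumed layout: memcached binary protocol, fixed 24-byte header.
    static final int MIN_HEADER_LEN = 24;
    static final byte RES_MAGIC = (byte) 0x81;

    /** Validates the magic byte and returns the response opcode (0-255). */
    static int readOpcode(byte[] header) {
        if (header.length < MIN_HEADER_LEN) {
            throw new IllegalArgumentException("short header: " + header.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(header);
        byte magic = buf.get(0);
        if (magic != RES_MAGIC) {
            throw new IllegalStateException(
                String.format("bad magic: 0x%02x", magic & 0xff));
        }
        return buf.get(1) & 0xff; // opcode is the second byte
    }

    /** True when the response opcode is the one the client is waiting for. */
    static boolean matchesExpected(byte[] header, int expectedOpcode) {
        return readOpcode(header) == expectedOpcode;
    }

    /** Builds a zeroed response header carrying the given opcode. */
    static byte[] headerWithOpcode(int opcode) {
        byte[] h = new byte[MIN_HEADER_LEN];
        h[0] = RES_MAGIC;
        h[1] = (byte) opcode;
        return h;
    }

    public static void main(String[] args) {
        // A response with the expected command passes the check.
        System.out.println(matchesExpected(headerWithOpcode(0x00), 0x00));
        // The response described in this report carries 0x0a and fails it.
        System.out.println(matchesExpected(headerWithOpcode(0x0a), 0x00));
    }
}
```

With a well-formed header carrying 0x0a, the magic check passes and only the opcode comparison fails, which matches the symptom described here.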
I poked through the server side code, and 0x0a corresponds to TAP_NOOP. I poked around a little further, and I don't see any situation where we would respond with this 0x0a to a non-TAP client.
The assertion this is raising is right here:
If I force spymemcached to not optimize, the problem goes away.
I will attach a packet capture.
|Fix Version/s||2.0-developer-preview-5 [ 10290 ]|
|Assignee||Damien Katz [ damien ]||Matt Ingenthron [ ingenthr ]|
|Component/s||clients [ 10042 ]|
|Component/s||storage-engine [ 10175 ]|
|Fix Version/s||.next [ 10205 ]|