Details
-
Bug
-
Resolution: Won't Do
-
Minor
-
None
-
1.0.2
-
Security Level: Public
-
None
-
tested on linux
Description
Let's assume for now that a new node has been added to the cluster.
1. When a configuration change is detected, the reconfigure() method of TapConnectionProvider is called.
2. TapConnectionProvider will call ((CouchbaseConnection)conn).reconfigure(bucket);
3. Inside CouchbaseConnection reconfigure() the new server will be found and added to the list newServers
4. Then createConnections(newServers) will be called in the parent class MemcachedConnection
5. Depending on the log level you'll see the message logged from getLogger().info("Added %s to connect queue", qa);
6. Then the code will hang at the line: https://github.com/couchbase/spymemcached/blob/master/src/main/java/net/spy/memcached/MemcachedConnection.java#L155
qa.setSk(ch.register(selector, ops, qa));
The code will hang until another packet is received on the channel. This is the expected behavior in Java NIO: a registration can block while another thread is inside select() on the same selector. The recommended practice is to perform registrations from the same thread that performs the selects. In this case we're registering from the thread that was monitoring for configuration changes, and selecting from the main run loop of the MemcachedConnection.
It's not a huge problem for us, because even in an idle situation we eventually receive a NOOP. However, this combined with another bug I was hitting made it really hard to troubleshoot. I'd suggest we look at ways to avoid this problem.
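One common way to avoid this kind of hang is to hand registrations over to the selecting thread instead of calling register() directly from the configuration thread. The following is only a minimal sketch of that pattern, not the spymemcached implementation: the class name SelectorLoop and its register() and shutdown() methods are hypothetical, and a real run loop would also dispatch the selected keys rather than just clearing them.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical helper: all register() calls on the selector happen on the
// thread that also calls select(), so they never wait on a blocked select().
class SelectorLoop implements Runnable {

    // A registration request queued by some other thread.
    private static final class Pending {
        final SelectableChannel channel;
        final int ops;
        final Object attachment;
        final CompletableFuture<SelectionKey> result = new CompletableFuture<>();

        Pending(SelectableChannel channel, int ops, Object attachment) {
            this.channel = channel;
            this.ops = ops;
            this.attachment = attachment;
        }
    }

    private final Selector selector;
    private final Queue<Pending> pending = new ConcurrentLinkedQueue<>();
    private volatile boolean running = true;

    SelectorLoop() throws IOException {
        this.selector = Selector.open();
    }

    // Safe to call from any thread (e.g. the one watching for config changes).
    // It only enqueues the request and wakes the selector; it does not register.
    CompletableFuture<SelectionKey> register(SelectableChannel channel, int ops, Object attachment) {
        Pending p = new Pending(channel, ops, attachment);
        pending.add(p);
        selector.wakeup(); // forces a blocked select() to return promptly
        return p.result;
    }

    void shutdown() {
        running = false;
        selector.wakeup();
    }

    @Override
    public void run() {
        try {
            while (running) {
                selector.select();
                drainPending();
                // A real loop would handle the ready keys here before clearing.
                selector.selectedKeys().clear();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            try {
                selector.close();
            } catch (IOException ignored) {
                // nothing useful to do while shutting down
            }
        }
    }

    // Runs on the selecting thread only.
    private void drainPending() {
        Pending p;
        while ((p = pending.poll()) != null) {
            try {
                p.channel.configureBlocking(false);
                p.result.complete(p.channel.register(selector, p.ops, p.attachment));
            } catch (IOException | RuntimeException e) {
                p.result.completeExceptionally(e);
            }
        }
    }
}
```

With something along these lines, the reconfiguration thread would enqueue the new node's channel and return immediately, and the registration would complete on the next pass of the run loop rather than waiting for traffic such as a NOOP.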
It's also possible this is a bug in spy and not the Java client, but I haven't studied how createConnections() is used in other contexts within spy, so it might just be how we use it when reconfiguring from the Java client.