Details
- Type: Bug
- Resolution: Cannot Reproduce
- Priority: Critical
- Affects Version/s: 1.8.1, 2.0, 2.0.1
- Security Level: Public
- 6-node Couchbase cluster v2.0.1 Community on CentOS 6.4 x86_64
- Application stack: node.js with node-memcache and client-side moxi v1.8.1, running on CentOS 6.4 x86_64
- CentOS 64-bit
Description
When the Couchbase topology has not changed, the bucketsStreaming API returns the same information, but with the known Couchbase nodes ("nodes":[]) in a different sort order.
This makes Moxi's simple string::compare detect a change, which triggers a topology update and the corresponding memory leak referenced as MB-3121.
The probability that the sort order differs from chunk to chunk increases with the number of nodes in the cluster.
In our configuration with only 6 nodes, Moxi triggers a topology update on almost every new bucketsStreaming chunk (a new chunk arrives every ~5-10 seconds), leaking gigabytes of memory every 24 hours and eventually triggering the kernel OOM killer.
Fixing MB-3121 is of course mandatory, but we think the bucketsStreaming API should return a sorted list of nodes to avoid unnecessary Moxi config changes (or Moxi should use a more robust topology comparison mechanism).
Moreover, MB-3121 was created on Dec 6, 2010: if this Moxi bug is so hard to fix, maybe sorting the bucketsStreaming node list could be prioritized instead.
Any chance to get it in 2.0.2?
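As a sketch of the order-insensitive comparison we have in mind (illustrative Python, not Moxi's actual code; the field names follow the bucketsStreaming JSON, but the `canonical` helper is hypothetical): parse each chunk as JSON, sort the "nodes" array by hostname, and compare the canonical serializations.

```python
import json

def canonical(chunk: str) -> str:
    """Re-serialize a bucketsStreaming chunk with the "nodes" array
    sorted by hostname, so node ordering no longer matters."""
    config = json.loads(chunk)
    config["nodes"] = sorted(config["nodes"],
                             key=lambda node: node.get("hostname", ""))
    return json.dumps(config, sort_keys=True)

# Two chunks describing the same topology, nodes in a different order:
a = '{"name":"default","nodes":[{"hostname":"n1:8091"},{"hostname":"n2:8091"}]}'
b = '{"name":"default","nodes":[{"hostname":"n2:8091"},{"hostname":"n1:8091"}]}'

assert a != b                        # naive string compare: spurious "change"
assert canonical(a) == canonical(b)  # order-insensitive compare: no change
```

Either side could apply this: the server by emitting a sorted "nodes" array, or Moxi by canonicalizing before comparing.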
—
How we came to this conclusion:
If we understand it correctly, Moxi keeps a long-lived HTTP connection to the bucketsStreaming API on one of the Couchbase cluster nodes and receives chunks delimited by "\n\n\n\n".
The last received chunk represents the current known topology of the Couchbase cluster.
Moxi performs some kind of string::compare between the new chunk and the previously received one:
- case 1: if they compare equal, Moxi considers that there is no topology change
- case 2: if they do not compare equal, Moxi considers that the topology has changed
When case 2 occurs, Moxi updates its known topology accordingly and increments its internal "config_ver" number <- this is where the memory leak referenced as MB-3121 happens.
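The mechanism above can be sketched as follows (illustrative Python, not Moxi's actual C code; the "\n\n\n\n" delimiter is the one we observed, the rest is a hypothetical model):

```python
# Split a bucketsStreaming body on the "\n\n\n\n" delimiter and apply
# the naive string comparison between consecutive chunks.
body = ('{"nodes":[{"hostname":"n1"},{"hostname":"n2"}]}\n\n\n\n'
        '{"nodes":[{"hostname":"n2"},{"hostname":"n1"}]}\n\n\n\n')

chunks = [c for c in body.split("\n\n\n\n") if c]
previous, current = chunks

if previous == current:
    print("case 1: no topology change")        # not taken here
else:
    print("case 2: topology change detected")  # taken: only the node order differs
```

Here the two chunks describe an identical topology, yet the naive compare still lands in case 2, which is exactly the spurious config_ver increment we observe.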
We can confirm that our client-side moxis detect erroneous topology changes: the config_ver number increases on (almost) every new bucketsStreaming chunk:
while true; do echo -e "-----\n$(date)"; echo "stats proxy" | nc localhost 11219 | grep config_ver; sleep 5; done
------
Tue Jun 4 10:55:10 GMT 2013
STAT 11219:active:info:config_ver 842
------
Tue Jun 4 10:55:15 GMT 2013
STAT 11219:active:info:config_ver 842
------
Tue Jun 4 10:55:20 GMT 2013
STAT 11219:active:info:config_ver 843
------
Tue Jun 4 10:55:25 GMT 2013
STAT 11219:active:info:config_ver 844
------
Tue Jun 4 10:55:30 GMT 2013
STAT 11219:active:info:config_ver 846
------
Tue Jun 4 10:55:40 GMT 2013
STAT 11219:active:info:config_ver 847