Details
-
Bug
-
Resolution: Won't Fix
-
Blocker
-
2.1.0
-
Security Level: Public
-
None
-
windows 2008 r2 64-bit
-
Windows 64-bit
Description
Environment:
7 Windows Server 2008 R2 64-bit machines, each with SSD and 8 GB RAM
1:10.3.121.173
2:10.3.121.169
3:10.3.121.171
4:10.3.3.214
5:10.3.121.47
6:10.3.3.180
7:10.3.3.181
Cluster setup:
7-node cluster
1 default bucket (3 GB, replica index enabled)
1 sasl bucket (3 GB, replica index disabled)
Load items into both buckets until the active resident ratio drops to 70%
Run the access phase for more than one day with:
default bucket: set 5%, get 80%, expired 5%
sasl bucket: set 5%, get 5%, delete 20%, expired 80%
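The operation mix above can be sketched as weighted random selection. This is only an illustration, not the test harness actually used: the function name `pick_operations` and the operation labels are hypothetical, and because the reported percentages do not sum to 100, the sketch treats them as relative weights (an assumption).

```python
import random
from collections import Counter

# Weights copied from the report. They do not sum to 100, so they are
# treated here as relative weights (assumption, not from the report).
DEFAULT_BUCKET_MIX = {"set": 5, "get": 80, "expired": 5}
SASL_BUCKET_MIX = {"set": 5, "get": 5, "delete": 20, "expired": 80}


def pick_operations(mix, count, seed=None):
    """Return `count` operation names drawn at random according to `mix`.

    Hypothetical helper: it only chooses operation names and does not
    talk to a Couchbase cluster.
    """
    rng = random.Random(seed)
    ops = list(mix)
    weights = [mix[op] for op in ops]
    return rng.choices(ops, weights=weights, k=count)


if __name__ == "__main__":
    # Print how often each operation was chosen for each bucket.
    for name, mix in (("default", DEFAULT_BUCKET_MIX), ("sasl", SASL_BUCKET_MIX)):
        sample = pick_operations(mix, 10000, seed=1)
        print(name, dict(Counter(sample)))
```

A real load generator would map each chosen name to a client call against the corresponding bucket; that part is left out because the report does not say which tool was used.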
After more than one day of running, one node in the cluster (10.3.121.171) became unstable and Erlang crashed.
Memory use was over 3 GB, over the memory quota for Couchbase Server, as shown in the capture.
Link to manifest file of this build http://builds.hq.northscale.net/latestbuilds/couchbase-server-enterprise_x86_64_2.0.2-807-rel.setup.exe.manifest.xml
Link to collect info of 6 nodes https://s3.amazonaws.com/packages.couchbase/collect_info/2_0_2/2013_05/7nodes_202-807_nodeDown_erlCrash_20130522-174845.tgz
Because node 171 is down, I could not get its collect info file.
The Erlang crash dump is also attached.
The cluster is in a failed state now.
Impact of this bug: node 10.3.121.171 is in an unstable state. One workaround is to fail over that node, add another node, and rebalance the failed node out.