Details
Type: Bug
Resolution: Won't Fix
Priority: Major
1.8.0
Security Level: Public
None
Environment: Windows 2008 R2 64-bit in EC2
Description
Install Couchbase 1.8.0 GA on 6 nodes in EC2, each with 14 GB RAM and 100 GB storage, on Feb 23 2012.
Create a cluster of 4 nodes with 2 buckets: default and default-0.
Load 50+ million items into the default bucket and 8+ million items into the default-0 bucket, using set, get, delete and expire operations with different item sizes:
./pytests/performance/mcsoda.py 23.20.45.23:11211 vbuckets=1024 expiration=120 min-value-size=10,50,128,256,512 ratio-sets=0.4 ratio-creates=0.7 ratio-deletes=0.2 ratio-expirations=0.2 max-items=20000000 max-ops=0 doc-cache=0 prefix=media_1_
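For anyone reproducing the load, the following sketch splits the command above into its target and key=value options so the settings can be compared between runs. It only tokenizes the command line as written in this report. It does not assume what mcsoda does with each option, and the function name parse_mcsoda_command is made up for illustration.

```python
import shlex

# Command copied verbatim from this report.
COMMAND = (
    "./pytests/performance/mcsoda.py 23.20.45.23:11211 vbuckets=1024 "
    "expiration=120 min-value-size=10,50,128,256,512 ratio-sets=0.4 "
    "ratio-creates=0.7 ratio-deletes=0.2 ratio-expirations=0.2 "
    "max-items=20000000 max-ops=0 doc-cache=0 prefix=media_1_"
)


def parse_mcsoda_command(command):
    """Return (script, target, options); option values stay as strings."""
    script, target, *pairs = shlex.split(command)
    # Every remaining token in this command has the form key=value.
    options = dict(pair.split("=", 1) for pair in pairs)
    return script, target, options


script, target, options = parse_mcsoda_command(COMMAND)

# min-value-size is a comma-separated list in the command above.
value_sizes = [int(size) for size in options["min-value-size"].split(",")]

print(target)
print(value_sizes)
for key in sorted(options):
    print(f"{key} = {options[key]}")
```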
Over 2 months, with the load running, we rebalanced nodes in and out in different ways:
adding a node and rebalancing, removing a node and rebalancing, and adding and removing a node at the same time and then rebalancing.
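The three rebalance variations can be sketched as request bodies for the Couchbase REST endpoint POST /controller/rebalance, which takes knownNodes and ejectedNodes form fields. This is only an illustration and not the script used in this test: the node names are hypothetical, the helper rebalance_body is made up, nothing is sent to a cluster, and a node being added is assumed to have already been joined (for example through /controller/addNode) before the rebalance is requested.

```python
from urllib.parse import parse_qs, urlencode


def rebalance_body(known_nodes, ejected_nodes=()):
    """Build the form body for POST /controller/rebalance (not sent here)."""
    unknown = set(ejected_nodes) - set(known_nodes)
    if unknown:
        # A node can only be ejected if it is also listed as known.
        raise ValueError(f"ejected nodes not in known nodes: {sorted(unknown)}")
    return urlencode({
        "knownNodes": ",".join(known_nodes),
        "ejectedNodes": ",".join(ejected_nodes),
    })


# Hypothetical node names, not the nodes from this report.
cluster = ["ns_1@10.0.0.1", "ns_1@10.0.0.2", "ns_1@10.0.0.3", "ns_1@10.0.0.4"]
new_node = "ns_1@10.0.0.5"
old_node = "ns_1@10.0.0.4"

# Add a node and rebalance: the new node is known, nothing is ejected.
rebalance_in = rebalance_body(cluster + [new_node])

# Remove a node and rebalance: the node stays in knownNodes and is ejected.
rebalance_out = rebalance_body(cluster, [old_node])

# Add and remove at the same time: one rebalance covers both changes.
swap = rebalance_body(cluster + [new_node], [old_node])

print(parse_qs(swap))
```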
We then saw memcached exit with status 255 several times:
Port server memcached on node 'ns_1@10.12.98.26' exited with status 255. Restarting. Messages: Suspend eq_tapq:replication_ns_1@10.12.95.171 for 5.00 secs
Suspend eq_tapq:replication_ns_1@10.12.95.171 for 5.00 secs
Suspend eq_tapq:replication_ns_1@10.12.95.171 for 5.00 secs
Suspend eq_tapq:replication_ns_1@10.12.95.171
and other errors such as:
Control connection to memcached on 'ns_1@10.12.87.41' disconnected: {{badmatch,
{error,
timeout}},
[
,
{mc_client_binary, stats, 4},
{ns_memcached, ensure_bucket_config, 4},
{ns_memcached, handle_info, 2},
{gen_server, handle_msg, 5},
{proc_lib, init_p_do_apply, 3}]}
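To see how often each node hit these two errors, a small scan over the log text can count them per node. The sketch below is an assumption-laden example: the regular expressions are derived only from the two messages quoted above, the layout of the files in the attached archive has not been checked, and the function name summarize is made up.

```python
import re
from collections import Counter

# Patterns derived from the two messages quoted in this report.
EXIT_RE = re.compile(
    r"Port server (\S+) on node '([^']+)' exited with status (\d+)"
)
DISCONNECT_RE = re.compile(
    r"Control connection to memcached on '([^']+)' disconnected"
)


def summarize(lines):
    """Count port-server exits and control-connection drops per node."""
    exits = Counter()
    disconnects = Counter()
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            server, node, status = match.groups()
            exits[(node, server, int(status))] += 1
        match = DISCONNECT_RE.search(line)
        if match:
            disconnects[match.group(1)] += 1
    return exits, disconnects


# Sample lines taken from the messages above.
SAMPLE = [
    "Port server memcached on node 'ns_1@10.12.98.26' exited with status 255. "
    "Restarting. Messages: Suspend eq_tapq:replication_ns_1@10.12.95.171 "
    "for 5.00 secs",
    "Control connection to memcached on 'ns_1@10.12.87.41' disconnected: "
    "{{badmatch,",
]

exits, disconnects = summarize(SAMPLE)
for (node, server, status), count in exits.items():
    print(f"{node}: {server} exited with status {status} x{count}")
for node, count in disconnects.items():
    print(f"{node}: control connection dropped x{count}")
```

To run it against a real log, pass an open file object instead of SAMPLE, since summarize only needs an iterable of lines.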
Here is the link to the zipped log files (100 MB):
https://s3.amazonaws.com/packages.couchbase/sandbox/windows_long_large_cluster_log.tar.gz