Details
Bug
Resolution: Won't Fix
Major
2.0
Security Level: Public
centos 6.2 64bit build 2.0.0-1781
Description
Cluster information:
- 8 CentOS 6.2 64-bit servers, each with a 4-core CPU
- Each server has 32 GB RAM and a 400 GB SSD disk.
- 24.8 GB RAM for Couchbase Server on each node
- SSD disk formatted as ext4, mounted on /data
- Each server has its own SSD drive; no disk is shared with another server.
- Create a cluster of 6 nodes with Couchbase Server 2.0.0-1781 installed:
10.6.2.37
10.6.2.38
10.6.2.44
10.6.2.45
10.6.2.42
10.6.2.43
- The remaining 2 servers (added to the cluster in the steps below):
10.6.2.39
10.6.2.40
- Cluster has 2 buckets, default (12GB) and saslbucket (12GB).
- Each bucket has one design doc with 2 views (default: d1, saslbucket: d11)
- Consistent view is enabled on the cluster (the default)
- Load 14 million items into both buckets. Each key has a size from 512 bytes to 1024 bytes
- Query all 4 views from the 2 design docs
- Data path /data
- View path /data
Manifest info from build 1781
http://builds.hq.northscale.net/latestbuilds/couchbase-server-enterprise_x86_64_2.0.0-1781-rel.rpm.manifest.xml
- Add 2 nodes, 39 and 40, and rebalance. During the rebalance, reboot nodes 42 and 43. Rebalance failed as expected.
- After the nodes finished warmup, rebalance again. Rebalance failed with bug MB-6490 on node 44.
- Failover node 44 and rebalance
- Checking the diags and couchdb log of node 45, I see a lot of errors like:
couchdb:info,2012-10-02T5:07:39.687,ns_1@10.6.2.45:<0.7295.16>:couch_log:info:39]Updater, set view `saslbucket`, replica group `_design/d11`, stopped with reason:
{updater_error, shutdown}
couchdb:info,2012-10-02T5:21:08.305,ns_1@10.6.2.45:<0.5482.262>:couch_log:info:39]Updater, set view `default`, replica group `_design/d1`, stopped with reason:
{updater_error, shutdown}
Link to collect info of all nodes: https://s3.amazonaws.com/packages.couchbase/collect_info/orange/2_0_0/201210/8nodes-col-info-1781-rebalance-hang-20121002-114333.tgz
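To get a rough count of these updater stops per bucket and design doc, a small script along the following lines may help. It is only a sketch: the regular expression is derived from the two log entries quoted above, the function name `count_updater_stops` is made up for illustration, and it assumes the reason is a flat tuple with no nested braces.

```python
import re
from collections import Counter

# Pattern inferred from the two log entries quoted in this report (an assumption,
# not an official log grammar). The reason tuple may sit on the following line,
# so whitespace, including newlines, is allowed after "stopped with reason:".
UPDATER_STOP = re.compile(
    r"Updater, set view `(?P<bucket>[^`]+)`, "
    r"(?P<kind>\w+) group `(?P<ddoc>[^`]+)`, "
    r"stopped with reason:\s*(?P<reason>\{[^}]*\})"
)


def count_updater_stops(log_text):
    """Tally updater stop events by (bucket, group kind, design doc, reason)."""
    counts = Counter()
    for m in UPDATER_STOP.finditer(log_text):
        key = (m.group("bucket"), m.group("kind"), m.group("ddoc"), m.group("reason"))
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # The two entries from the couchdb log of node 45, as quoted in the report.
    sample = (
        "couchdb:info,2012-10-02T5:07:39.687,ns_1@10.6.2.45:<0.7295.16>:couch_log:info:39]"
        "Updater, set view `saslbucket`, replica group `_design/d11`, stopped with reason:\n"
        "{updater_error, shutdown}\n"
        "couchdb:info,2012-10-02T5:21:08.305,ns_1@10.6.2.45:<0.5482.262>:couch_log:info:39]"
        "Updater, set view `default`, replica group `_design/d1`, stopped with reason:\n"
        "{updater_error, shutdown}\n"
    )
    for key, n in sorted(count_updater_stops(sample).items()):
        print(n, key)
```

To run it against a real log, read the file contents into a string and pass that to `count_updater_stops` in place of `sample`.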