Details
- Bug
- Resolution: Cannot Reproduce
- Critical
- 2.5.0
- Security Level: Public
- Windows Server 2008 R2 64-bit, CentOS 6.2 64-bit
- Windows 64-bit
Description
Environment:
9 Windows Server 2008 R2 64-bit servers (each with 8 GB RAM and SSD storage):
10.3.4.127
10.3.4.132
10.3.4.133
10.3.4.134
10.3.4.135
10.3.4.136
10.3.4.137
10.3.4.138
10.3.4.139
Cluster setup:
7-node cluster running Couchbase Server 2.5.0-915:
10.3.4.127
10.3.4.132
10.3.4.133
10.3.4.134
10.3.4.135
10.3.4.136
10.3.4.137
2 buckets:
sasl-1 with 1 replica, 1 doc with 1 public view
sasl-2 with 2 replicas, 1 doc with 1 public view
Test steps:
Load 7 M items, each 128 to 512 bytes in size, into each bucket. Then continue loading items
into the buckets until the active resident ratio drops to 80%.
Change the load setup so that the workload is a mix of set, get, delete, and update operations, plus items that expire within a few hours.
Then, while the load is running, add node 138 to the cluster and rebalance. About 30 minutes after the rebalance started, the rebalance of the second bucket (sasl-1) exited with this error:
Haven't heard from a higher priority node or a master, so I'm taking over. mb_master000 ns_1@10.3.4.132 23:33:52 - Mon Nov 18, 2013
Rebalance exited with reason {{{badmatch,[
[{misc,sync_shutdown_many_i_am_trapping_exits, 1}, {misc,try_with_maybe_ignorant_after,2}, {gen_server,terminate,6}, {proc_lib,init_p_do_apply,3}]},
{gen_server,call,
[<0.6315.374>, {shutdown_replicator,'ns_1@10.3.4.127'},
infinity]}}
ns_orchestrator002 ns_1@10.3.4.127 20:37:37 - Mon Nov 18, 2013
<0.6297.374> exited with {{{badmatch,[{<0.6316.374>,noproc}
]},
[
,
{misc,try_with_maybe_ignorant_after,2},
{gen_server,terminate,6},
{proc_lib,init_p_do_apply,3}]},
{gen_server,call,
[<0.6315.374>,
,
infinity]}} ns_vbucket_mover000 ns_1@10.3.4.127 20:37:37 - Mon Nov 18, 2013
Bucket "sasl-1" rebalance does not seem to be swap rebalance ns_vbucket_mover000 ns_1@10.3.4.127 20:32:02 - Mon Nov 18, 2013
Bucket "sasl-1" loaded on node 'ns_1@10.3.4.138' in 0 seconds. ns_memcached001 ns_1@10.3.4.138 20:31:58 - Mon Nov 18, 2013
Started rebalancing bucket sasl-1 ns_rebalancer000 ns_1@10.3.4.127 20:31:58 - Mon Nov 18, 2013
Bucket "sasl-2" rebalance does not seem to be swap rebalance ns_vbucket_mover000 ns_1@10.3.4.127 20:09:10 - Mon Nov 18, 2013
Bucket "sasl-2" loaded on node 'ns_1@10.3.4.138' in 0 seconds. ns_memcached001 ns_1@10.3.4.138 20:09:06 - Mon Nov 18, 2013
Started rebalancing bucket sasl-2 ns_rebalancer000 ns_1@10.3.4.127 20:09:05 - Mon Nov 18, 2013
Starting rebalance, KeepNodes = ['ns_1@10.3.4.134','ns_1@10.3.4.136',
'ns_1@10.3.4.132','ns_1@10.3.4.137',
'ns_1@10.3.4.133','ns_1@10.3.4.127',
'ns_1@10.3.4.135','ns_1@10.3.4.138'], EjectNodes = []
ns_orchestrator004 ns_1@10.3.4.127 20:09:04 - Mon Nov 18, 2013
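The timestamps in the log above can be turned into a rough timeline of the failed rebalance. The following is a minimal sketch, not part of the original report: the helper names (`parse_stamp`, `elapsed_seconds`) and the `EVENTS` labels are hypothetical, and the regular expression assumes the `HH:MM:SS - Day Mon DD, YYYY` stamp format seen in these UI log lines.

```python
import re
from datetime import datetime

# Month abbreviations as they appear in the UI log stamps (assumed English).
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# Matches stamps such as "20:37:37 - Mon Nov 18, 2013".
STAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}) - \w{3} (\w{3}) (\d{1,2}), (\d{4})")


def parse_stamp(line):
    """Extract the timestamp from one log line (hypothetical helper)."""
    match = STAMP.search(line)
    if match is None:
        raise ValueError("no timestamp found in: " + line)
    hour, minute, second, month, day, year = match.groups()
    return datetime(int(year), MONTHS[month], int(day),
                    int(hour), int(minute), int(second))


# Log lines copied from the report; the dictionary keys are illustrative labels.
EVENTS = {
    "rebalance_start": "ns_orchestrator004 ns_1@10.3.4.127 20:09:04 - Mon Nov 18, 2013",
    "sasl2_start": "Started rebalancing bucket sasl-2 ns_rebalancer000 ns_1@10.3.4.127 20:09:05 - Mon Nov 18, 2013",
    "sasl1_start": "Started rebalancing bucket sasl-1 ns_rebalancer000 ns_1@10.3.4.127 20:31:58 - Mon Nov 18, 2013",
    "rebalance_exit": "ns_orchestrator002 ns_1@10.3.4.127 20:37:37 - Mon Nov 18, 2013",
}


def elapsed_seconds(start_key, end_key):
    """Seconds between two labelled events in EVENTS."""
    delta = parse_stamp(EVENTS[end_key]) - parse_stamp(EVENTS[start_key])
    return int(delta.total_seconds())


if __name__ == "__main__":
    for start, end in [("rebalance_start", "rebalance_exit"),
                       ("sasl2_start", "sasl1_start"),
                       ("sasl1_start", "rebalance_exit")]:
        minutes, seconds = divmod(elapsed_seconds(start, end), 60)
        print(f"{start} -> {end}: {minutes} min {seconds} s")
```

By this calculation the rebalance exited roughly 28.5 minutes after it started, which agrees with the "about 30 minutes" in the test steps; sasl-2 took about 23 minutes, and sasl-1 failed under 6 minutes into its own rebalance.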
Issue Links
relates to:
- MB-7739 [windows] memcached connection is lost and rebalance failed with reason {{bulk_set_vbucket_state_failed (Closed)
- MB-7943 Memcached drops tap connections without warning (Closed)
- MB-9390 Tap producer closes the connection because it didn't receive any ACK for 10K TAP messages sent to the consumer (Closed)
- MB-9070 [system test] [windows] rebalance failed due to bad replica (Closed)
- MB-9639 [system test] rebalance failed with error badmatch,{error,etimedout (Closed)
For Gerrit Dashboard: MB-9596

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
30639,1 | MB-9596: improved rebalance diagnostics | for-rackaware | ns_server | MERGED | +2 | +1 |
30641,1 | Merge remote-tracking branch 'origin/for-rackaware' | master | ns_server | MERGED | +2 | +1 |