Details
Description
this happens in failover tests after we fail over a node and then rebalance from 10 nodes to 9 nodes.
10.1.5.61 is the only node in the cluster that crashes and never comes back up.
the previous test also failed because curr_item didn't match the curr_items_tot and I am guessing 10.1.5.61 has stopped responding before 6:20 which is the time stamped in the core.
core log :
http://qa.hq.northscale.net/job/multiple-nodes-ubuntu-64-extended/28/artifact/core-10.1.5.61-0.log
all other diags are also here :
master node : http://qa.hq.northscale.net/job/multiple-nodes-ubuntu-64-extended/28/artifact/10.1.3.105-diag.txt.zip
other nodes :
http://qa.hq.northscale.net/job/multiple-nodes-ubuntu-64-extended/28/artifact/