Details

Type: Bug
Resolution: Fixed
Priority: Blocker
Fix Version/s: 1.8.1
Affects Version/s: 1.8.1-release-candidate
Component/s: ns_server
Security Level: Public
Labels:
None
Environment:
18 node cluster, Centos
Build 181-918
2 buckets, 1024vbuckets

Description

Setup
1. Set up an 18-node cluster with 2 buckets: bucket1, bucket2
2. Enable auto-failover
3. Add a new node 126
4. Rebalance

Output
1. Rebalance works fine, but these log messages are seen:

Attached are the web-logs and logs from master node-104.

https://s3.amazonaws.com/bugdb/jira/web-log-largeCluster/ns-diag-20120618095246.txt
https://s3.amazonaws.com/bugdb/jira/web-log-largeCluster/10.3.2.104-8091-diag.txt.gz

Other related conversation
I have enabled auto-failover on the large cluster, and every time I rebalance in a node I get an error message: "Could not automatically failover node 'ns_1@10.3.121.126' because I think rebalance is running".
Node 126 is newly added and a rebalance was issued. Is this message displayed because the node is not yet ready to join the cluster?
The rebalance works fine, but I do not understand why auto-failover is attempted here. Any idea?

No. According to the logs, bucket1 was loaded at 19:32:04. Maybe some other buckets are still not ready on this node. May I have the logs?

People

Assignee: Aleksey Kondratenko (Inactive)

Reporter: Ketaki Gangal (Inactive)

Votes: 0

Watchers: 1

Dates

Created: 18/Jun/12 2:55 PM

Updated: 09/Jan/13 8:59 PM

Resolved: 19/Jun/12 4:53 PM

Gerrit Reviews

There are no open Gerrit changes.

There are 3 closed Gerrit changes:

MB-5602: consider buckets' servers list when computing down nodes: Gerrit Review:

Merge commit '0e6b2f70276f271d08bf1fe46c4b8da528c67c66' into master: Gerrit Review:

Merge remote branch 'origin/branch-181' into branch-18: Gerrit Review:

Auto-failover fails over a node if some of the buckets are already rebalanced out but the rebalance has been stopped or interrupted (auto-failover should fail over if all buckets are down)
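
One way to read the MB-5602 change title above ("consider buckets' servers list when computing down nodes") is sketched below in Python. This is an illustration only, not the ns_server code (which is written in Erlang): the function name `down_nodes`, the argument shapes, and the sample node names are all hypothetical. The assumption is that a node should count as down only with respect to buckets whose server list already includes it, so a node that is still being rebalanced in is not flagged for a bucket it has not joined yet.

```python
from typing import Dict, List, Set


def down_nodes(
    nodes: List[str],
    bucket_servers: Dict[str, List[str]],
    ready_buckets: Dict[str, Set[str]],
) -> List[str]:
    """Return the nodes that look down (hypothetical helper, not ns_server API).

    bucket_servers maps bucket name -> nodes in that bucket's server list.
    ready_buckets maps node name -> buckets reported ready on that node.
    """
    down = []
    for node in nodes:
        # Only buckets that already list this node are expected to be ready on it.
        expected = {b for b, servers in bucket_servers.items() if node in servers}
        ready = ready_buckets.get(node, set())
        # Assumption: the node is down if any expected bucket is not ready.
        # A node that no bucket lists yet has nothing expected and is not flagged.
        if expected - ready:
            down.append(node)
    return down


# Sample data loosely modelled on the report: node 126 is mid-rebalance,
# already serving bucket1 but not yet in bucket2's server list.
nodes = ["n104", "n126"]
bucket_servers = {"bucket1": ["n104", "n126"], "bucket2": ["n104"]}
ready_buckets = {"n104": {"bucket1", "bucket2"}, "n126": {"bucket1"}}

print(down_nodes(nodes, bucket_servers, ready_buckets))
```

Under this reading, comparing each node against the full list of buckets would flag node 126 because bucket2 is not ready on it, which would match the spurious auto-failover attempt described in the report. Restricting the check to each bucket's server list avoids that.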
