Details

Type: Bug
Resolution: Fixed
Priority: Blocker
Fix Version/s: 2.0
Affects Version/s: 2.0
Component/s: couchbase-bucket, ns_server
Security Level: Public
Labels:
- system-test
Environment:
build 10.3.3.59

Description

One node went down while loading data on a 22-node cluster (possibly related to the Xen hypervisor: the node could not ping its gateway, and the network interface had to be restarted).
While the node was down, I tried to fail it over and rebalance.
However, the rebalance never completes, and there appears to be no rebalance activity occurring on TAP.

Some activity seen in logs at time of node down:

10.3.3.59 sees node .60 go down:

[user:warn,2012-10-22T11:06:38.896,ns_1@10.3.3.59:ns_node_disco:ns_node_disco:handle_info:168]Node 'ns_1@10.3.3.59' saw that node 'ns_1@10.3.3.60' went down.

Around the same time, node .60 shows:

[ns_server:error,2012-10-22T11:06:00.350,ns_1@10.3.3.60:<0.12281.36>:ns_janitor:cleanup_with_states:84]Bucket "default" not yet ready on ['ns_1@10.3.2.84','ns_1@10.3.2.85',
'ns_1@10.3.2.110','ns_1@10.3.2.111',
'ns_1@10.3.2.112','ns_1@10.3.2.113',
'ns_1@10.3.2.114','ns_1@10.3.2.115',
'ns_1@10.3.3.59','ns_1@10.3.3.62',
'ns_1@10.3.3.65','ns_1@10.3.3.66',
'ns_1@10.3.3.69','ns_1@10.3.3.70',
'ns_1@10.3.121.90','ns_1@10.3.121.91',
'ns_1@10.3.2.107','ns_1@10.3.2.108',
'ns_1@10.3.2.109']
[ns_server:debug,2012-10-22T11:06:07.388,ns_1@10.3.3.60:<0.12508.36>:janitor_agent:new_style_query_vbucket_states_loop:116]Exception from query_vbucket_states of "default":'ns_1@10.3.2.85'
{'EXIT',{{nodedown,'ns_1@10.3.2.85'},
{gen_server,call,
[{'janitor_agent-default','ns_1@10.3.2.85'},
query_vbucket_states,infinity]}}}
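
For anyone correlating the attached logs across nodes, here is a minimal Python sketch that splits an entry header into its fields and pulls out the node reported as down. The header layout (component:level, timestamp, node:process:module:function:line) is inferred only from the excerpts above, not from ns_server documentation, and the names `parse_header`, `LogEntry` and `node_down_target` are hypothetical helpers introduced for illustration.

```python
import re
from typing import NamedTuple, Optional

# Header shape inferred from the excerpts in this ticket (an assumption):
# [component:level,timestamp,node:process:module:function:line]message
HEADER = re.compile(
    r"^\[(?P<component>\w+):(?P<level>\w+),"
    r"(?P<timestamp>[^,]+),"
    r"(?P<node>[^:]+):(?P<process>[^:]+):"
    r"(?P<module>\w+):(?P<function>\w+):(?P<line>\d+)\]"
    r"(?P<message>.*)$"
)

# Wording taken from the ns_node_disco warning quoted above.
NODE_DOWN = re.compile(r"saw that node '([^']+)' went down")


class LogEntry(NamedTuple):
    component: str
    level: str
    timestamp: str
    node: str
    process: str
    module: str
    function: str
    line: int
    message: str


def parse_header(text: str) -> Optional[LogEntry]:
    """Return the parsed first line of an entry, or None for continuation lines."""
    match = HEADER.match(text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    return LogEntry(**fields)


def node_down_target(entry: LogEntry) -> Optional[str]:
    """Return the node named in a 'went down' message, if the entry is one."""
    match = NODE_DOWN.search(entry.message)
    return match.group(1) if match else None


sample = (
    "[user:warn,2012-10-22T11:06:38.896,ns_1@10.3.3.59:ns_node_disco:"
    "ns_node_disco:handle_info:168]Node 'ns_1@10.3.3.59' saw that node "
    "'ns_1@10.3.3.60' went down."
)
entry = parse_header(sample)
print(entry.timestamp, entry.node, node_down_target(entry))
```

Continuation lines of a multi-line Erlang term (such as the node list above) do not match the header pattern and return None, so a caller would need to append them to the preceding entry.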

Attachments

- 10.3.3.59.debug.tar.gz (3.89 MB, 23/Oct/12 1:39 PM)
- 10.3.3.60.debug.tar.gz (830 kB, 23/Oct/12 1:39 PM)

People

Assignee: Chiyoung Seo (Inactive)

Reporter: Tommie McAfee (Inactive)

Votes: 0

Watchers: 1

Dates

Created: 23/Oct/12 1:39 PM

Updated: 09/Jan/13 11:55 PM

Resolved: 26/Oct/12 5:18 PM

Gerrit Reviews

There are no open Gerrit changes

There are 2 closed Gerrit changes:

MB-6992 Add more informative logs to checkpoint prioritization: Gerrit Review:

MB-6992 Control the flusher execution by the transaction size: Gerrit Review:

rebalance hangs after failing over disconnected node
