Details

Type: Bug
Resolution: Duplicate
Priority: Major
Fix Version/s: 2.0-developer-preview-4
Affects Version/s: 2.0-developer-preview-4
Component/s: couchbase-bucket
Security Level: Public
Labels: None
Environment:
6 CentOS VMs, 4 CPUs / 4 GB memory each

Running py-view.conf from Jenkins against build 553

Description

In a cluster of 6 nodes holding 100k documents, 5 nodes are ejected while 10k documents are being deleted. At the same time, the Python client queries the view to trigger re-indexing in the cluster and waits for all changes to be written to disk (that is, until ep_queue_size == 0).

However, the stats API returned an error when 'ep_queue_size' was queried. This caused the test to exit and left 1 node in an unknown state:

ERROR http://10.1.2.31:8091/nodes/self error 404 reason: unknown "Node is unknown to this cluster."
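
The wait-for-persistence step described above can be sketched as a polling loop that tolerates a transient stats failure instead of exiting on the first error. This is a minimal sketch, not the test framework's actual code: `wait_for_queue_drain`, `StatsError`, and the `fetch_stat` callable are hypothetical names introduced for illustration, and the caller is assumed to supply a function that returns the current value of a named stat.

```python
import time


class StatsError(Exception):
    """Hypothetical error raised by the caller-supplied stats fetcher."""


def wait_for_queue_drain(fetch_stat, timeout=60.0, interval=1.0,
                         max_errors=3, sleep=time.sleep):
    """Poll ep_queue_size until it reaches 0.

    fetch_stat is an assumed callable: fetch_stat(name) -> value.
    Returns True once the queue is empty, False on timeout or after
    max_errors consecutive stats failures.
    """
    deadline = time.monotonic() + timeout
    consecutive_errors = 0
    while time.monotonic() < deadline:
        try:
            queue_size = int(fetch_stat("ep_queue_size"))
        except StatsError:
            # Tolerate a few failures rather than aborting the whole run.
            consecutive_errors += 1
            if consecutive_errors >= max_errors:
                return False
        else:
            consecutive_errors = 0
            if queue_size == 0:
                return True
        sleep(interval)
    return False
```

Returning False rather than raising lets the caller decide whether to collect diagnostics or clean up the cluster before the next test runs.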

If I attempt to add the node to the cluster, it reports:
2012-01-24 18:25:39,492 - root - INFO - adding remote node : 10.1.2.31 to this cluster @ : 10.1.2.30
2012-01-24 18:26:09,558 - root - ERROR - http://10.1.2.30:8091/controller/addNode error 400 reason: unknown ["Prepare join failed. Timeout connecting to \"10.1.2.31\" on port 8091. This could be due to an incorrect host/port combination or a firewall in place between the servers."]
2012-01-24 18:26:09,558 - root - ERROR - add_node error : ["Prepare join failed. Timeout connecting to \"10.1.2.31\" on port 8091. This could be due to an incorrect host/port combination or a firewall in place between the servers."]
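
Since the error above is a connection timeout on port 8091, a quick TCP reachability check can help tell an unreachable node apart from one that is listening but misbehaving. The following is a rough sketch using only the Python standard library; `port_reachable` is a hypothetical helper name, and the host and port in the usage line are taken from the log above.

```python
import socket


def port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_reachable("10.1.2.31", 8091)` could be called before retrying the add-node request. A successful connection only shows that something is listening on the port, not that the node is healthy.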

Once a node enters this state, it sometimes blocks all remaining tests in the view run list. Attaching logs from the rebalance test case and the cluster diags.

Attachments

ns-diag-20120124175014.txt.gz (1.04 MB, 25/Jan/12 2:10 AM)
viewTests.log (246 kB, 25/Jan/12 2:10 AM)

People

Assignee: Tommie McAfee (Inactive)

Reporter: Tommie McAfee (Inactive)

Votes: 0

Watchers: 0

Dates

Due: 27/Jan/12

Created: 25/Jan/12 2:10 AM

Updated: 18/Jun/13 9:12 PM

Resolved: 07/Feb/12 11:57 AM

Gerrit Reviews

There are no open Gerrit changes

After a failed rebalance nodes/self api returns 404 "...unknown to this cluster"
