Couchbase Server / MB-4694

After a failed rebalance, the nodes/self API returns 404 "...unknown to this cluster"

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • 2.0-developer-preview-4
    • 2.0-developer-preview-4
    • couchbase-bucket
    • Security Level: Public
    • None
    • 6 centos vms, 4cpu/4gb memory each

      Running py-view.conf from jenkins against build 553

    Description

      In a 6-node cluster holding 100k documents, 5 nodes are ejected while 10k documents are being deleted. At the same time, the Python client queries the view to trigger re-indexing in the cluster and waits for all changes to be written to disk (that is, until ep_queue_size == 0).

      However, the stats API returned an error when 'ep_queue_size' was queried. This caused the test to exit and left 1 node in an unknown state:

      ERROR http://10.1.2.31:8091/nodes/self error 404 reason: unknown "Node is unknown to this cluster."
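
The wait on ep_queue_size described above can be sketched as a polling loop that tolerates a few stats errors instead of exiting on the first one. This is only an illustration, not the test framework's actual code: `wait_for_queue_drain`, `fetch_stat`, and all parameter names are hypothetical, and `fetch_stat` stands in for whatever call the client uses to read ep_queue_size.

```python
import time


def wait_for_queue_drain(fetch_stat, timeout=60.0, interval=1.0, max_errors=5,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until ep_queue_size reaches 0.

    fetch_stat is a hypothetical callable that returns the current
    ep_queue_size and raises an exception when the stats API fails.
    Returns True once the queue is empty, False on timeout or after
    max_errors consecutive failures.
    """
    deadline = clock() + timeout
    consecutive_errors = 0
    while clock() < deadline:
        try:
            if int(fetch_stat()) == 0:
                return True
            consecutive_errors = 0  # a successful read resets the error count
        except Exception:
            # Assumption: a stats error may be transient while nodes are
            # being ejected, so retry a bounded number of times.
            consecutive_errors += 1
            if consecutive_errors >= max_errors:
                return False
        sleep(interval)
    return False
```

Returning False rather than raising lets the caller decide whether to fail the test or clean up the cluster first.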

      If I attempt to add the node to the cluster, it reports:
      2012-01-24 18:25:39,492 - root - INFO - adding remote node : 10.1.2.31 to this cluster @ : 10.1.2.30
      2012-01-24 18:26:09,558 - root - ERROR - http://10.1.2.30:8091/controller/addNode error 400 reason: unknown ["Prepare join failed. Timeout connecting to \"10.1.2.31\" on port 8091. This could be due to an incorrect host/port combination or a firewall in place between the servers."]
      2012-01-24 18:26:09,558 - root - ERROR - add_node error : ["Prepare join failed. Timeout connecting to \"10.1.2.31\" on port 8091. This could be due to an incorrect host/port combination or a firewall in place between the servers."]
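
Since the addNode failure above is a connection timeout on port 8091, a test harness could check that the port accepts TCP connections before attempting the join. The following is a minimal sketch under that assumption; `port_reachable` is a hypothetical helper, not part of the test framework or of Couchbase, and it only tests TCP reachability, not whether the node is in a healthy state.

```python
import socket


def port_reachable(host, port=8091, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_reachable("10.1.2.31")` could be called before the add-node request so that an unreachable node is reported immediately instead of after the 30-second server-side timeout seen in the log.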

      Once a node enters this state, it sometimes blocks all remaining tests in the view run list. Attaching logs from the rebalance test case and cluster diags.

          People

            tommie Tommie McAfee (Inactive)