Couchbase Server / MB-1226

Rebalance after adding fourth node (separately) causes rebalance to hang

Details

    • Bug
    • Resolution: Fixed
    • None
    • 1.6.0 beta1
    • UI
    • None
    • Operating System: RHEL 5
      Platform: X64

    Description

      1. Servers: 10.1.3.119, 10.1.3.120, 10.1.3.133, 10.1.3.134

      2. Add the remaining servers to the cluster:

      • Add 10.1.3.120: pending rebalance == 1
      • Add 10.1.3.133: pending rebalance is still 1
      • Add 10.1.3.134: pending rebalance == 2

      3. Click Rebalance. The server list in the UI shows only 10.1.3.119, 10.1.3.120, and 10.1.3.134; 10.1.3.133 is missing.

      4. Accessing each server's console directly shows the same list, without 10.1.3.133. The exception is 10.1.3.133 itself, which behaves like a new server: I have to set it up from scratch (set a password, etc.).

      5. I add 10.1.3.133 again and set the password. This succeeds, and pending rebalance == 1.

      6. Click Rebalance again. Firebug shows continual GET requests whose response is always status "running":

      http://skitch.com/capttofu/de6bf/firebug-northscale-console-1.6.0a-166-gefc8175
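The endless "running" polls in step 6 are the visible symptom of the hang. A client that polls the same status (the CLI trace later in this report uses GET /pools/default/rebalanceProgress) could give up after a deadline instead of looping forever. The following is only a minimal sketch of that idea: `wait_for_rebalance`, its parameters, the "timed_out" return value, and the "none" status used in the usage lines are all hypothetical names chosen for illustration, and the HTTP call is left to a caller-supplied function rather than assumed.

```python
import time
from typing import Callable


def wait_for_rebalance(
    fetch_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the status is no longer "running" or the deadline passes.

    fetch_status is a caller-supplied function (hypothetical) that returns
    the "status" field of the rebalance progress response as a string.
    Returns the last non-"running" status, or "timed_out" if the status
    was still "running" when the deadline expired.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status != "running":
            return status
        if clock() >= deadline:
            return "timed_out"
        sleep(interval_s)


# Usage with a canned sequence of statuses instead of a live server.
# "none" is an assumed placeholder for "no rebalance in progress".
canned = iter(["running", "running", "none"])
result = wait_for_rebalance(lambda: next(canned), interval_s=0)
print(result)
```

Injecting `sleep` and `clock` keeps the sketch testable without waiting in real time; a real client would pass a `fetch_status` that performs the authenticated GET.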

      Now, using the command line tools on the node itself:

      [root@rhel-1 NorthScale]# /opt/NorthScale/bin/ep_engine/management/vbucketctl.py localhost:8080 list

      The command hangs. On one occasion the server itself hung completely. On another, while the server was still responsive, I could see the following:

      [root@rhel-1 NorthScale]# ps auxwwww|grep vbucket
      101 8246 0.0 0.1 26652 1352 ? Ss 02:25 0:00 ./bin/vbucketmigrator/vbucketmigrator -h 10.1.3.119:11210 -d 10.1.3.120:11210 -v -b 13 -b 12
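To make the process listing above easier to read, here is a small sketch that pulls the endpoints and vbucket ids out of the vbucketmigrator command line. It assumes, without confirmation from this report, that -h names the source, -d the destination, -b a vbucket id, and -v verbose mode; `parse_migrator_args` is a hypothetical helper, not part of any shipped tool.

```python
import shlex


def parse_migrator_args(cmdline: str) -> dict:
    """Extract source, destination and vbucket ids from a migrator command line.

    Flag meanings are assumptions: -h source, -d destination,
    -b vbucket id (repeatable), -v verbose.
    """
    tokens = iter(shlex.split(cmdline)[1:])  # skip the executable path
    info = {"source": None, "destination": None, "vbuckets": [], "verbose": False}
    for tok in tokens:
        if tok == "-h":
            info["source"] = next(tokens, None)
        elif tok == "-d":
            info["destination"] = next(tokens, None)
        elif tok == "-b":
            value = next(tokens, None)
            if value is not None:
                info["vbuckets"].append(int(value))
        elif tok == "-v":
            info["verbose"] = True
    return info


# Command line copied from the ps output above.
CMD = (
    "./bin/vbucketmigrator/vbucketmigrator "
    "-h 10.1.3.119:11210 -d 10.1.3.120:11210 -v -b 13 -b 12"
)
migration = parse_migrator_args(CMD)
print(migration)
```

Under those assumptions, the one migrator still alive was moving vbuckets 13 and 12 from 10.1.3.119 to 10.1.3.120, and nothing in the listing targets 10.1.3.133 or 10.1.3.134.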

      And from my own testbox:

      patg@patg-desktop:~/northscale/membase-cli$ ./membase rebalance-status -c 10.1.3.119:8080 -d -u admin -p adminadmin
      servers {'add': {}, 'remove': {}}
      METHOD: GET
      PARAMS: {}
      ENCODED_PARAMS:
      REST CMD: GET /pools/default/rebalanceProgress
      response.status: 200
      output_result {"status":"running"}
      running
      patg@patg-desktop:~/northscale/membase-cli$ ./membase rebalance-status -c 10.1.3.119:8080 -d -u admin -p adminadmin
      servers {'add': {}, 'remove': {}}
      METHOD: GET
      PARAMS: {}
      ENCODED_PARAMS:
      REST CMD: GET /pools/default/rebalanceProgress
      response.status: 200
      output_result {"status":"running"}
      running
      patg@patg-desktop:~/northscale/membase-cli$ ./membase rebalance -c 10.1.3.119:8080 -d -u admin -p adminadmin
      servers {'add': {}, 'remove': {}}
      METHOD: GET
      PARAMS: {}
      ENCODED_PARAMS:
      REST CMD: GET /pools/default
      response.status: 200
      METHOD: POST
      PARAMS: {'ejectedNodes': '', 'knownNodes': u'ns_1@10.1.3.119,ns_1@10.1.3.120,ns_1@10.1.3.133,ns_1@10.1.3.134'}

      ENCODED_PARAMS: ejectedNodes=&knownNodes=ns_1%4010.1.3.119%2Cns_1%4010.1.3.120%2Cns_1%4010.1.3.133%2Cns_1%4010.1.3.134
      REST CMD: POST /controller/rebalance
      response.status: 500
      output_result ["Unexpected server error, request logged."]
      METHOD: POST
      PARAMS: {'ejectedNodes': '', 'knownNodes': u'ns_1@10.1.3.119,ns_1@10.1.3.120,ns_1@10.1.3.133,ns_1@10.1.3.134'}

      ENCODED_PARAMS: ejectedNodes=&knownNodes=ns_1%4010.1.3.119%2Cns_1%4010.1.3.120%2Cns_1%4010.1.3.133%2Cns_1%4010.1.3.134
      REST CMD: POST /controller/rebalance
      response.status: 500
      ERROR: Internal Server Error unable to start rebalance.
      patg@patg-desktop:~/northscale/membase-cli$ ./membase server-list -c 10.1.3.119:8080
      METHOD: GET
      PARAMS: {}
      ENCODED_PARAMS:
      REST CMD: GET /pools/default
      response.status: 200
      ns_1@10.1.3.119 healthy
      ns_1@10.1.3.120 healthy
      ns_1@10.1.3.133 unhealthy
      ns_1@10.1.3.134 unhealthy
      patg@patg-desktop:~/northscale/membase-cli$
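The trace above shows the rebalance POST listing all four nodes in knownNodes while server-list reports two of them as unhealthy. As an illustration only, the sketch below parses the server-list lines, flags the unhealthy nodes, and rebuilds the form body so it can be compared with the ENCODED_PARAMS line in the trace. `parse_server_list` and `rebalance_body` are hypothetical helpers written for this example; they are not the membase CLI's own code.

```python
from urllib.parse import urlencode


def parse_server_list(text: str) -> dict:
    """Return {node_name: status} from lines like 'ns_1@10.1.3.119 healthy'."""
    nodes = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("ns_1@"):
            nodes[parts[0]] = parts[1]
    return nodes


def rebalance_body(known_nodes, ejected_nodes=()) -> str:
    """Build the form-encoded body seen in the POST /controller/rebalance trace."""
    return urlencode(
        {
            "ejectedNodes": ",".join(ejected_nodes),
            "knownNodes": ",".join(known_nodes),
        }
    )


# Lines copied from the server-list output above.
SERVER_LIST = """\
ns_1@10.1.3.119 healthy
ns_1@10.1.3.120 healthy
ns_1@10.1.3.133 unhealthy
ns_1@10.1.3.134 unhealthy
"""

nodes = parse_server_list(SERVER_LIST)
unhealthy = [name for name, status in nodes.items() if status != "healthy"]
body = rebalance_body(list(nodes))

print("unhealthy:", unhealthy)
print(body)
```

A client could use a check like this to warn before requesting a rebalance that includes unhealthy nodes. Whether that would have avoided the 500 response here is not established by this report.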

            People

              sean@northscale.com Sean Lynch (Inactive)
              patg@patg.net Patrick Galbraith