Couchbase Server / MB-6175

Master node is still orchestrating the rebalance operation even after it has been failed over

Details

    • Bug
    • Resolution: Won't Fix
    • Major
    • 2.0
    • 2.0-beta
    • ns_server
    • Security Level: Public
    • None
    • CentOS 6.2 64-bit

    Description

      Install Couchbase Server 2.0.0-1554 on 10 nodes.
      Load 55 million items into the default bucket.
      Remove node 25. Rebalance passed.
      Add back node 25, remove node 24, fail over node 23, and rebalance with no views and no auto-compaction enabled for data. Rebalance passed.
      Add back node 24 and rebalance with no views and no auto-compaction enabled for data. Rebalance passed.
      Add back node 23, remove node 14, and fail over node 13 (the master node). Node 13 is still the orchestrator after the failover. While the rebalance was running, shut down Couchbase Server on node 13; the rebalance failed.

      2012-08-09 12:54:09.853 ns_orchestrator:0:info:message(ns_1@10.3.121.13) - Starting failing over 'ns_1@10.3.121.13'
      2012-08-09 12:54:09.897 ns_memcached:2:info:message(ns_1@10.3.121.13) - Shutting down bucket "default" on 'ns_1@10.3.121.13' for deletion
      2012-08-09 12:54:10.341 ns_orchestrator:6:info:message(ns_1@10.3.121.13) - Failed over 'ns_1@10.3.121.13': ok
      2012-08-09 12:54:17.480 ns_orchestrator:4:info:message(ns_1@10.3.121.13) - Starting rebalance, KeepNodes = ['ns_1@10.3.121.15','ns_1@10.3.121.16',
      'ns_1@10.3.121.17','ns_1@10.3.121.20',
      'ns_1@10.3.121.22','ns_1@10.3.121.24',
      'ns_1@10.3.121.25','ns_1@10.3.121.23'], EjectNodes = ['ns_1@10.3.121.14']

      2012-08-09 12:54:17.564 ns_storage_conf:0:info:message(ns_1@10.3.121.23) - Deleting old data files of bucket "default"
      2012-08-09 12:54:20.746 ns_rebalancer:0:info:message(ns_1@10.3.121.13) - Started rebalancing bucket default
      2012-08-09 12:54:21.371 ns_memcached:1:info:message(ns_1@10.3.121.23) - Bucket "default" loaded on node 'ns_1@10.3.121.23' in 0 seconds.
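
      The symptom in the excerpt above (node 13 reports "Failed over ... ok" and then keeps emitting ns_orchestrator and ns_rebalancer messages) can be checked mechanically. The following is a minimal sketch, not part of ns_server or any Couchbase tool. The function name find_stale_orchestrator_lines is hypothetical, the line format is inferred only from the lines quoted in this report, and the lines are assumed to be in chronological order.

```python
import re

# Line shape inferred from the excerpt in this report (an assumption), e.g.:
# 2012-08-09 12:54:09.853 ns_orchestrator:0:info:message(ns_1@10.3.121.13) - Starting failing over '...'
LINE_RE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<module>\w+):\d+:\w+:\w+\((?P<node>[^)]+)\) - (?P<msg>.*)$"
)
FAILED_OVER_RE = re.compile(r"^Failed over '(?P<node>[^']+)'")

# Modules whose messages indicate that a node is driving a rebalance.
ORCHESTRATION_MODULES = {"ns_orchestrator", "ns_rebalancer"}


def find_stale_orchestrator_lines(log_text):
    """Return (timestamp, node, message) tuples for orchestration messages
    logged by a node after that same node was reported as failed over.

    Hypothetical helper for illustration; assumes chronological input.
    """
    failed_over = set()
    hits = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw)
        if match is None:
            continue  # blank line or continuation of a multi-line message
        node, msg = match.group("node"), match.group("msg")
        # Check before recording the failover so the "Failed over" line
        # itself is not counted as a stale orchestration message.
        if match.group("module") in ORCHESTRATION_MODULES and node in failed_over:
            hits.append((match.group("ts"), node, msg))
        done = FAILED_OVER_RE.match(msg)
        if done:
            failed_over.add(done.group("node"))
    return hits
```

      Run against the excerpt in this report, the helper would flag the "Starting rebalance" and "Started rebalancing bucket default" lines, both logged by 'ns_1@10.3.121.13' after its own failover completed.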

      Diags for all nodes are here:
      https://s3.amazonaws.com/packages.couchbase/diag-logs/large_cluster_2_0/10nodes-1554-failed-to-transfer-master-node.tgz

          People

            alkondratenko Aleksey Kondratenko (Inactive)
            thuan Thuan Nguyen