Couchbase Server / MB-6819

Rebalance exited with reason {not_all_nodes_are_ready_yet, ...} after 1.8.0->2.0 upgrade, when adding a 2.0 node to a mixed cluster

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 2.0-beta-2, 2.0
    • Component/s: ns_server
    • Security Level: Public
    • Labels:
      None

      Description

      Steps:
      1. Create a 1.8.0r .deb cluster of 3 nodes (1 standard, 1 SASL, 1 memcached bucket): 10.3.121.112, 10.3.121.113, 10.3.121.114
      2. Upgrade the master node (10.3.121.112) with 2.0.0-1797-rel.rpm
      Observation: node 10.3.121.112 is in pending state but working
      3. Add 1 node (10.3.121.115) with version 1797 and rebalance

      Result: rebalance failed

      2012-10-04 04:54:17.399 ns_orchestrator:4:info:message(ns_1@10.3.121.112) - Starting rebalance, KeepNodes = ['ns_1@10.3.121.112','ns_1@10.3.121.113',
      'ns_1@10.3.121.114','ns_1@10.3.121.115'], EjectNodes = []

      2012-10-04 04:54:17.431 ns_storage_conf:0:info:message(ns_1@10.3.121.112) - Deleting old data files of bucket "sasl-data"
      2012-10-04 04:54:17.432 ns_storage_conf:0:info:message(ns_1@10.3.121.112) - Deleting old data files of bucket "1-data"
      2012-10-04 04:54:17.438 ns_rebalancer:0:info:message(ns_1@10.3.121.112) - Started rebalancing bucket mem
      2012-10-04 04:54:17.439 ns_rebalancer:0:info:message(ns_1@10.3.121.112) - Started rebalancing bucket sasl
      2012-10-04 04:55:17.452 ns_orchestrator:2:info:message(ns_1@10.3.121.112) - Rebalance exited with reason

      {not_all_nodes_are_ready_yet, ['ns_1@10.3.121.113','ns_1@10.3.121.114']}

      These nodes are still running version 1.8.0.
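      The failure above can be illustrated with a minimal sketch (hypothetical Python, not the actual ns_server Erlang code): the rebalance coordinator checks readiness on every node before moving data and aborts with the list of nodes that did not report ready, matching the error shape in the log.

```python
def check_all_ready(node_ready):
    """node_ready maps a node name to its readiness flag (hypothetical model)."""
    not_ready = [n for n, ok in sorted(node_ready.items()) if not ok]
    if not_ready:
        # Mirrors the {not_all_nodes_are_ready_yet, [...]} error in the log.
        raise RuntimeError(f"not_all_nodes_are_ready_yet: {not_ready}")
    return True

# In this bug, the 1.8.0 nodes never answer the 2.0-style readiness call:
cluster = {
    "ns_1@10.3.121.112": True,   # upgraded to 2.0
    "ns_1@10.3.121.113": False,  # still 1.8.0
    "ns_1@10.3.121.114": False,  # still 1.8.0
    "ns_1@10.3.121.115": True,   # fresh 2.0 node
}
# check_all_ready(cluster) would raise:
# RuntimeError: not_all_nodes_are_ready_yet: ['ns_1@10.3.121.113', 'ns_1@10.3.121.114']
```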

      1. 10.3.121.112-8091-diag.txt.gz
        860 kB
        Andrei Baranouski
      2. 10.3.121.113-8091-diag.txt.gz
        240 kB
        Andrei Baranouski
      3. 10.3.121.114-8091-diag.txt.gz
        236 kB
        Andrei Baranouski
      4. 10.3.121.115-8091-diag.txt.gz
        229 kB
        Andrei Baranouski

        Activity

        farshid Farshid Ghods (Inactive) added a comment -

        can you also upload the manifest file ?
        farshid Farshid Ghods (Inactive) added a comment -

        Why was 10.3.121.112 in pending state after the upgrade? If we add the 2.0 node while 112 is in pending state, then rebalance will fail, and that's expected.
        andreibaranouski Andrei Baranouski added a comment -

        Yes, I believe this is a different issue; I will create it with the new configuration. In the logs I see that the 1.8 nodes were listed as 'not ready':
        Rebalance exited with reason

        {not_all_nodes_are_ready_yet, ['ns_1@10.3.121.113','ns_1@10.3.121.114']}

        http://builds.hq.northscale.net/latestbuilds/couchbase-server-enterprise_x86_64_2.0.0-1797-rel.rpm.manifest.xml
        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Duplicate of another upgrade bug. Same underlying cause: 2.0 nodes think the entire cluster is 2.0.
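        The root cause described in the comment above ("2.0 nodes think the entire cluster is 2.0") can be sketched hypothetically (this is not actual ns_server code): if the coordinator derives the cluster compatibility version only from the nodes that answer the new-style version query, 1.8.0 nodes that silently fail to answer get skipped, and a mixed cluster wrongly appears to be all-2.0.

```python
def effective_cluster_version(reported):
    """reported maps node -> version string, or None if the node did not
    answer the new-style query (hypothetical model of the bug)."""
    answered = [v for v in reported.values() if v is not None]
    # Buggy assumption: non-responders are skipped instead of being
    # treated as pre-2.0 nodes, so a mixed cluster looks all-2.0.
    return min(answered) if answered else None
```

        A correct version would treat a non-response as evidence of a pre-2.0 node and keep the cluster in compatibility mode.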

          People

          • Assignee:
            alkondratenko Aleksey Kondratenko (Inactive)
          • Reporter:
            andreibaranouski Andrei Baranouski
          • Votes:
            0 Vote for this issue
          • Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Gerrit Reviews

              There are no open Gerrit changes