  1. Couchbase Server
  2. MB-5434

Rebalance fails with wait_for_memcached_failed if the user stops a rebalance after one bucket has already been rebalanced out, because ns_server still looks for the "rebalanced out" bucket on the ejected nodes during the next rebalance attempt

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.2
    • Fix Version/s: 1.8.1
    • Component/s: ns_server
    • Security Level: Public

      Description

      The failure is evident from these logs (newest entry first):
      Rebalance exited with reason {wait_for_memcached_failed,"Authentication", ['ns_1@10.113.11.5','ns_1@10.113.11.11', 'ns_1@10.113.11.12','ns_1@10.113.11.16', 'ns_1@10.113.11.17','ns_1@10.113.11.19', 'ns_1@10.113.11.21','ns_1@10.113.11.23', 'ns_1@10.113.11.24','ns_1@10.113.11.27']}
      (repeated 1 times) ns_orchestrator002 ns_1@10.113.11.11 01:24:57 - Mon Jun 4, 2012
      Rebalance exited with reason {wait_for_memcached_failed,"Authentication", ['ns_1@10.113.11.5','ns_1@10.113.11.11', 'ns_1@10.113.11.12','ns_1@10.113.11.16', 'ns_1@10.113.11.17','ns_1@10.113.11.19', 'ns_1@10.113.11.21','ns_1@10.113.11.23', 'ns_1@10.113.11.24','ns_1@10.113.11.27']}

      ns_orchestrator002 ns_1@10.113.11.11 01:19:18 - Mon Jun 4, 2012
      Starting rebalance, KeepNodes = ['ns_1@10.113.11.1','ns_1@10.113.11.2',
      'ns_1@10.113.11.3','ns_1@10.113.11.4',
      'ns_1@10.113.11.6','ns_1@10.113.11.7',
      'ns_1@10.113.11.8','ns_1@10.113.11.9',
      'ns_1@10.113.11.10','ns_1@10.113.11.13',
      'ns_1@10.113.11.14','ns_1@10.113.11.15',
      'ns_1@10.113.11.18','ns_1@10.113.11.20',
      'ns_1@10.113.11.22','ns_1@10.113.11.25',
      'ns_1@10.113.11.26','ns_1@10.113.11.28',
      'ns_1@10.113.11.29','ns_1@10.113.11.30'], EjectNodes = ['ns_1@10.113.11.5',
      'ns_1@10.113.11.11',
      'ns_1@10.113.11.12',
      'ns_1@10.113.11.16',
      'ns_1@10.113.11.17',
      'ns_1@10.113.11.19',
      'ns_1@10.113.11.21',
      'ns_1@10.113.11.23',
      'ns_1@10.113.11.24',
      'ns_1@10.113.11.27']
      ns_orchestrator004 ns_1@10.113.11.11 01:19:07 - Mon Jun 4, 2012
      Rebalance exited with reason stopped
      ns_orchestrator002 ns_1@10.113.11.11 01:13:19 - Mon Jun 4, 2012
      Shutting down bucket "Authentication" on 'ns_1@10.113.11.17' for server shutdown ns_memcached002 ns_1@10.113.11.17 01:02:19 - Mon Jun 4, 2012
      Shutting down bucket "Authentication" on 'ns_1@10.113.11.19' for server shutdown ns_memcached002 ns_1@10.113.11.19 01:02:19 - Mon Jun 4, 2012
      Shutting down bucket "Authentication" on 'ns_1@10.113.11.23' for server shutdown ns_memcached002 ns_1@10.113.11.23 01:02:19 - Mon Jun 4, 2012
      Shutting down bucket "Authentication" on 'ns_1@10.113.11.11' for server shutdown ns_memcached002 ns_1@10.113.11.11 01:02:19 - Mon Jun 4, 2012
      Shutting down bucket "Authentication" on 'ns_1@10.113.11.12' for server shutdown ns_memcached002 ns_1@10.113.11.12 01:02:19 - Mon Jun 4, 2012
      Shutting down bucket "Authentication" on 'ns_1@10.113.11.5' for server shutdown ns_memcached002 ns_1@10.113.11.5 01:02:19 - Mon Jun 4, 2012
      Shutting down bucket "Authentication" on 'ns_1@10.113.11.16' for server shutdown ns_memcached002 ns_1@10.113.11.16 01:02:19 - Mon Jun 4, 2012
      Shutting down bucket "Authentication" on 'ns_1@10.113.11.24' for server shutdown ns_memcached002 ns_1@10.113.11.24 01:02:19 - Mon Jun 4, 2012
      Shutting down bucket "Authentication" on 'ns_1@10.113.11.21' for server shutdown ns_memcached002 ns_1@10.113.11.21 01:02:19 - Mon Jun 4, 2012
      Shutting down bucket "Authentication" on 'ns_1@10.113.11.27' for server shutdown ns_memcached002 ns_1@10.113.11.27 01:02:19 - Mon Jun 4, 2012
      Starting rebalance, KeepNodes = ['ns_1@10.113.11.1','ns_1@10.113.11.2',
      'ns_1@10.113.11.3','ns_1@10.113.11.4',
      'ns_1@10.113.11.6','ns_1@10.113.11.7',
      'ns_1@10.113.11.8','ns_1@10.113.11.9',
      'ns_1@10.113.11.10','ns_1@10.113.11.13',
      'ns_1@10.113.11.14','ns_1@10.113.11.15',
      'ns_1@10.113.11.18','ns_1@10.113.11.20',
      'ns_1@10.113.11.22','ns_1@10.113.11.25',
      'ns_1@10.113.11.26','ns_1@10.113.11.28',
      'ns_1@10.113.11.29','ns_1@10.113.11.30'], EjectNodes = ['ns_1@10.113.11.5',
      'ns_1@10.113.11.11',
      'ns_1@10.113.11.12',
      'ns_1@10.113.11.16',
      'ns_1@10.113.11.17',
      'ns_1@10.113.11.19',
      'ns_1@10.113.11.21',
      'ns_1@10.113.11.23',
      'ns_1@10.113.11.24',
      'ns_1@10.113.11.27']
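
The logs support one concrete observation: the nodes named in the wait_for_memcached_failed tuple are exactly the EjectNodes of the retried rebalance, and also exactly the nodes that had already shut down the "Authentication" bucket at 01:02:19. A minimal Python sketch of that cross-check follows. The node numbers are copied from the log entries above; the helper `node` and the variable names are illustrative only and are not part of ns_server.

```python
def node(last_octet: int) -> str:
    """Build a node name in the form used in the logs above."""
    return f"ns_1@10.113.11.{last_octet}"

# Nodes listed in the wait_for_memcached_failed tuple for bucket "Authentication".
failed_nodes = {node(n) for n in (5, 11, 12, 16, 17, 19, 21, 23, 24, 27)}

# EjectNodes from the "Starting rebalance" entries.
eject_nodes = {node(n) for n in (5, 11, 12, 16, 17, 19, 21, 23, 24, 27)}

# Nodes that logged 'Shutting down bucket "Authentication"' at 01:02:19.
shutdown_nodes = {node(n) for n in (17, 19, 23, 11, 12, 5, 16, 24, 21, 27)}

# KeepNodes from the "Starting rebalance" entries.
keep_nodes = {node(n) for n in (1, 2, 3, 4, 6, 7, 8, 9, 10, 13,
                                14, 15, 18, 20, 22, 25, 26, 28, 29, 30)}

# The three sets from the logs coincide, and none of them is a kept node.
all_match = failed_nodes == eject_nodes == shutdown_nodes
no_overlap = failed_nodes.isdisjoint(keep_nodes)
print(all_match, no_overlap)
```

Both checks hold for the data above, which is consistent with the title's diagnosis: the retried rebalance waits for memcached on nodes where the bucket no longer exists.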


        Activity

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        right.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Right. The idea is that, in the worst case, you can rebalance bucket after bucket. The usual failover would then simply remove these nodes from the buckets' server lists.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Fixed for 1.8.1: http://review.couchbase.org/16783

        thuan Thuan Nguyen added a comment -

        Integrated in github-ns-server-2-0 #364 (See http://qa.hq.northscale.net/job/github-ns-server-2-0/364/)
        MB-5434: don't add ejected nodes back to bucket's servers list (Revision 9ecbe6c6f1c596faf995dabc7e2abb3facd4c8b7)

        Result = SUCCESS
        Aliaksey Kandratsenka :
        Files :

        • src/ns_rebalancer.erl
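
The commit title says that ejected nodes are no longer added back to a bucket's servers list. The Python sketch below only illustrates that idea. It is not the code in src/ns_rebalancer.erl: the function names, the short node names, and in particular the "before" behaviour it models are assumptions inferred from the issue title and the commit message.

```python
def wait_list_before_fix(bucket_servers, keep_nodes, eject_nodes):
    """Assumed pre-fix behaviour: every node taking part in the rebalance,
    ejected ones included, ends up on the bucket's servers list.
    bucket_servers is deliberately ignored in this model."""
    return sorted(set(keep_nodes) | set(eject_nodes))


def wait_list_after_fix(bucket_servers, keep_nodes, eject_nodes):
    """Sketch of the fix: an ejected node stays on the list only if it is
    still one of the bucket's servers."""
    still_serving = set(eject_nodes) & set(bucket_servers)
    return sorted(set(keep_nodes) | still_serving)


keep = ["n1", "n2"]
eject = ["n3", "n4"]

# Hypothetical state after a stopped rebalance: bucket "A" was already
# rebalanced out of n3 and n4, bucket "B" had not been reached yet.
servers = {"A": ["n1", "n2"], "B": ["n1", "n2", "n3", "n4"]}

before_a = wait_list_before_fix(servers["A"], keep, eject)
after_a = wait_list_after_fix(servers["A"], keep, eject)
after_b = wait_list_after_fix(servers["B"], keep, eject)

print(before_a)  # ['n1', 'n2', 'n3', 'n4']
print(after_a)   # ['n1', 'n2']
print(after_b)   # ['n1', 'n2', 'n3', 'n4']
```

In this model the pre-fix list for bucket "A" includes n3 and n4, where the bucket has already been shut down, which corresponds to the wait_for_memcached_failed error in the description. With the fix, those nodes are skipped for "A" but still handled for "B".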
        ketaki Ketaki Gangal added a comment -

        Tested on build 927 - Works fine. Seeing no error.


          People

          • Assignee:
            alkondratenko Aleksey Kondratenko (Inactive)
            Reporter:
            farshid Farshid Ghods (Inactive)