Couchbase Server: MB-4559

Rolling upgrade from 1.7.2 to 1.8.0r-38 failed, unable to determine master node

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.7.2, 1.8.0
    • Fix Version/s: 1.8.0
    • Component/s: installer
    • Security Level: Public
    • Labels:
      None
    • Environment:
      CentOS 5.4, 64-bit.

      Description

      Steps:
      1) Create a 6-node 1.7.2 cluster.
      2) Load data.
      3) Swap a 1.7.2 node with a 1.8.0 node.
      4) Rebalance.

      Diags attached.

      ERROR:
      Server error during processing: ["web request failed",
       {path,"/controller/rebalance"},
       {type,exit},
       {what,
        {noproc,
         {gen_fsm,sync_send_event,
          [{global,ns_orchestrator},
           {start_rebalance,
            ['ns_1@10.72.51.127','ns_1@10.80.169.159',
             'ns_1@10.243.15.240','ns_1@10.98.177.74',
             'ns_1@10.2.183.29','ns_1@10.211.147.159'],
            [],[]}]}}},
       {trace,
        [{gen_fsm,sync_send_event,2},
         {menelaus_web,do_handle_rebalance,3},
         {menelaus_web,loop,3},
         {mochiweb_http,headers,5},
         {proc_lib,init_p_do_apply,3}]}]


      ERROR REPORT <0.2671.0> 2011-12-19 20:36:53
      ===============================================================================

      ** State machine mb_master terminating
      ** Last event in was {heartbeat,'ns_1@10.80.169.159',candidate,
      [{peers, ['ns_1@10.2.183.29','ns_1@10.243.15.240', 'ns_1@10.72.51.127','ns_1@10.80.169.159', 'ns_1@10.98.177.74']},
      {versioning,true}]}
      ** When State == candidate
      ** Data == {state,undefined,undefined,
      ['ns_1@10.2.183.29','ns_1@10.243.15.240',
      'ns_1@10.72.51.127','ns_1@10.80.169.159',
      'ns_1@10.98.177.74'],
      {1324,327011,7604},
      ['ns_1@10.2.183.29','ns_1@10.243.15.240',
      'ns_1@10.72.51.127','ns_1@10.98.177.74'],
      compatible}
      ** Reason for termination =
      ** {{case_clause,true},
      [{mb_master,higher_priority_node,3},
      {mb_master,candidate,2},
      {gen_fsm,handle_msg,7},
      {proc_lib,init_p_do_apply,3}]}

      CRASH REPORT <0.2671.0> 2011-12-19 20:36:53
      ===============================================================================
      Crashing process
      initial_call {mb_master,init,['Argument__1']}
      pid <0.2671.0>
      registered_name mb_master
      error_info
      {exit,{{case_clause,true},
      [{mb_master,higher_priority_node,3},
      {mb_master,candidate,2},
      {gen_fsm,handle_msg,7},
      {proc_lib,init_p_do_apply,3}]},
      [{gen_fsm,terminate,7},{proc_lib,init_p_do_apply,3}]}
      ancestors [ns_server_sup,ns_server_cluster_sup,<0.51.0>]
      messages []
      links [<0.118.0>,<0.526.0>,<0.55.0>]
      dictionary []
      trap_exit true
      status running
      heap_size 2584
      stack_size 24
      reductions 725


        Activity

        Aleksey Kondratenko (Inactive) added a comment -

        The fix is likely to backport a similar 2.0 change, where we try again until the orchestrator is born.
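
The "try again until orchestrator is born" idea from the comment above can be illustrated with a minimal sketch. It is written in Python rather than the Erlang used by ns_server, and it is not the actual patch: the function names, the attempt count, and the delay are assumptions made for illustration. The point is only that the caller polls the name lookup and sends the request once a process is registered, instead of failing immediately with noproc.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until_registered(
    lookup: Callable[[], Optional[T]],
    attempts: int = 50,
    delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Poll lookup() until it returns a non-None handle, else raise TimeoutError.

    `lookup` stands in for a global name lookup (hypothetical here); it
    returns None while no orchestrator has been registered yet.
    """
    for _ in range(attempts):
        handle = lookup()
        if handle is not None:
            return handle
        sleep(delay_s)  # name not registered yet; wait and retry
    raise TimeoutError(f"no orchestrator registered after {attempts} attempts")


def start_rebalance(lookup, nodes):
    """Hypothetical caller: resolve the orchestrator first, then send the request."""
    orchestrator = wait_until_registered(lookup)
    return orchestrator(("start_rebalance", nodes))
```

Bounding the number of attempts keeps a request from hanging forever if no master is ever elected; the real fix may choose different limits or a different waiting mechanism.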

        Steve Yen added a comment -

        I heard this was fixed? True?

        Aliaksey Artamonau added a comment -

        http://review.couchbase.org/11808

        Karan Kumar (Inactive) added a comment -

        1.8.0r-43 is the build with the patch. Verifying it now.

        Karan Kumar (Inactive) added a comment -

        Verified. The issue has not been seen again.

        Running upgrade tests in a loop.

        Karan Kumar (Inactive) added a comment -

        Verified with 1.8.0r-51.

        Thuan Nguyen added a comment -

        Integrated in github-ns-server-2-0 #233 (See http://qa.hq.northscale.net/job/github-ns-server-2-0/233/)

          People

          • Assignee: Karan Kumar (Inactive)
          • Reporter: Karan Kumar (Inactive)
