  1. Couchbase Server
  2. MB-5108

Rolling upgrade from 1.7.2 to latest 1.8.1 fails with failed rebalance: {type,exit}, {what,{noproc, {gen_fsm,sync_send_event,}}}

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.8.1-release-candidate
    • Fix Version/s: 1.8.1
    • Component/s: ns_server
    • Security Level: Public
    • Environment:
      Ubuntu 10.04

      Description

      Failing test
      upgradetests.MultipleNodeUpgradeTests.test_upgrade,initial_version=1.7.2,create_buckets=True,insert_data=True,start_upgraded_first=False,load_ratio=10,online_upgrade=True

      2012-04-18 05:05:23,369 - root - INFO - adding node : 10.3.121.92:8091 to the cluster
      2012-04-18 05:05:23,370 - root - INFO - adding remote node : 10.3.121.92 to this cluster @ : 10.3.121.98
      2012-04-18 05:05:24,121 - root - INFO - added node : ns_1@10.3.121.92 to the cluster
      2012-04-18 05:05:24,134 - root - INFO - rebalance params : password=password&ejectedNodes=&user=Administrator&knownNodes=ns_1%4010.3.121.94%2Cns_1%4010.3.121.92%2Cns_1%4010.3.121.98%2Cns_1%4010.3.121.93%2Cns_1%4010.3.121.97%2Cns_1%4010.3.121.95
      2012-04-18 05:05:24,140 - root - ERROR - http://10.3.121.98:8091/controller/rebalance error 500 reason: unknown ["Unexpected server error, request logged."]
      2012-04-18 05:05:24,140 - root - ERROR - rebalance operation failed

      INFO REPORT <0.6440.0> 2012-04-18 05:06:53
      ===============================================================================

      ns_log: logging menelaus_web:19:Server error during processing: ["web request failed",
          {path,"/controller/rebalance"},
          {type,exit},
          {what,
              {noproc,
                  {gen_fsm,sync_send_event,
                      [{global,ns_orchestrator},
                       {start_rebalance,
                           ['ns_1@10.3.121.94','ns_1@10.3.121.92',
                            'ns_1@10.3.121.98','ns_1@10.3.121.93',
                            'ns_1@10.3.121.97','ns_1@10.3.121.95'],
                           [],[]}]}}},
          {trace,
              [{gen_fsm,sync_send_event,2},
               {menelaus_web,do_handle_rebalance,3},
               {menelaus_web,loop,3},
               {mochiweb_http,headers,5},
               {proc_lib,init_p_do_apply,3}]}]

      1. 10.3.121.92-8091-diag.txt.gz
        218 kB
        Karan Kumar
      2. 10.3.121.93-8091-diag.txt.gz
        331 kB
        Karan Kumar
      3. 10.3.121.94-8091-diag.txt.gz
        335 kB
        Karan Kumar
      4. 10.3.121.95-8091-diag.txt.gz
        333 kB
        Karan Kumar
      5. 10.3.121.97-8091-diag.txt.gz
        330 kB
        Karan Kumar
      6. 10.3.121.98-8091-diag.txt.gz
        347 kB
        Karan Kumar
      7. 1cdc7aec-3cf0-4614-bcbf-1902c99b111f-10.3.121.92-diag.txt.gz
        162 kB
        Karan Kumar
      8. 1cdc7aec-3cf0-4614-bcbf-1902c99b111f-10.3.121.93-diag.txt.gz
        295 kB
        Karan Kumar
      9. 1cdc7aec-3cf0-4614-bcbf-1902c99b111f-10.3.121.94-diag.txt.gz
        297 kB
        Karan Kumar
      10. 1cdc7aec-3cf0-4614-bcbf-1902c99b111f-10.3.121.95-diag.txt.gz
        312 kB
        Karan Kumar
      11. 1cdc7aec-3cf0-4614-bcbf-1902c99b111f-10.3.121.97-diag.txt.gz
        306 kB
        Karan Kumar
      12. 1cdc7aec-3cf0-4614-bcbf-1902c99b111f-10.3.121.98-diag.txt.gz
        315 kB
        Karan Kumar

        Activity

        karan Karan Kumar (Inactive) added a comment -

        This looks to be a regression in ns_server.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Found this to be an issue in the old ns_server. Rebalance requests need to either be sent to a node running the new version, or this issue must be worked around by waiting and retrying. The commit that fixed it (for 1.8.0) is:

        commit d45ccaab92158d4a4fc882d3216d1557b7b39816
        Author: Aliaksey Kandratsenka <alk@tut.by>
        Date: Tue Nov 29 15:10:09 2011 +0300

        wait for orchestrator presense for key operations. MB-4214 MB-4559
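
The "waiting and retrying" workaround mentioned in this comment could be sketched on the test side roughly as follows. This is a minimal sketch, not the test framework's actual code: `retry_rebalance` and the `start_rebalance` callable are hypothetical names, and it assumes the HTTP 500 from /controller/rebalance is the transient noproc error shown in the description (no orchestrator registered yet).

```python
import time


def retry_rebalance(start_rebalance, attempts=5, delay=2.0, sleep=time.sleep):
    """Retry a rebalance request while the server answers HTTP 500.

    start_rebalance is a hypothetical callable that issues the POST to
    /controller/rebalance and returns the HTTP status code.
    Returns (attempt_number, status) for the first non-500 response.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        status = start_rebalance()
        if status != 500:
            return attempt, status
        # Assumption: a 500 here means the orchestrator is not registered
        # yet, so pause before asking again (but not after the last try).
        if attempt < attempts:
            sleep(delay)
    raise RuntimeError("rebalance still failing after %d attempts" % attempts)


if __name__ == "__main__":
    # Simulated server: two failures, then the request is accepted.
    responses = iter([500, 500, 200])
    print(retry_rebalance(lambda: next(responses), delay=0.0))
```

A retry loop like this only covers the noproc case; any other cause of a failed rebalance would need to be diagnosed separately.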

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Not a "bug".

        karan Karan Kumar (Inactive) added a comment -

        Still failing. The suggested workaround does not work.

        In the test we are issuing the rebalance call to the newly upgraded 1.8.1 node.

        This results in the following rebalance failure:

        Rebalance exited with reason
        {{case_clause,
             {badrpc,
                 {'EXIT',
                     {{badfun,#Fun<erl_eval.4.88154533>},
                      [{erlang,apply,2},
                       {rpc,'-handle_call_call/6-fun-0-',5}]}}}},
         [{ns_vbm_sup,change_vbucket_filter,4},
          {ns_vbm_sup,'-set_replicas/3-fun-2-',5},
          {lists,foldl,3},
          {ns_vbm_sup,set_replicas,3},
          {ns_vbm_sup,'-set_replicas/2-fun-1-',3},
          {lists,foreach,2},
          {ns_vbm_sup,apply_changes,2},
          {ns_vbucket_mover,sync_replicas,0}]}

        karan Karan Kumar (Inactive) added a comment -

        Waiting for the newly added node to become orchestrator does not solve this issue either.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        That's a different failure. Thanks for finding it.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Fixed in http://review.couchbase.org/15366

        thuan Thuan Nguyen added a comment -

        Integrated in github-ns-server-2-0 #342 (See http://qa.hq.northscale.net/job/github-ns-server-2-0/342/)
        reimplemented backwards-compat for change_vbucket_filter. MB-5108 (Revision 023a90b14d2530823602f9c0c1c03dc86c33013e)
        forward-ported new change_filter code (023a90b14). MB-5108 (Revision f6d217bec4b036b617f6ccf19404ac5ef8b0b793)

        Result = SUCCESS
        Aliaksey Kandratsenka :
        Files :

        • src/ns_vbm_sup.erl

        Aliaksey Kandratsenka :
        Files :

        • src/ns_vbm_sup.erl
        • src/cb_gen_vbm_sup.erl

          People

          • Assignee:
            alkondratenko Aleksey Kondratenko (Inactive)
            Reporter:
            karan Karan Kumar (Inactive)
