  1. Couchbase Server
  2. MB-7282

Erlang's global naming facility apparently drops a globally registered service while the actual service is still alive (was: impossible to change settings/autoFailover after rebalance)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Duplicate
    • Affects Version/s: 2.0
    • Fix Version/s: 3.0
    • Component/s: ns_server
    • Security Level: Public
    • Labels:
      None
    • Operating System:
      Centos 64-bit
    • Flagged:
      Release Note
    • Sprint:
      12/Aug - 30/Aug

      Description

      build
      ./testrunner -i resources/jenkins/centos-64-5node-failover.ini get-logs=True -t autofailovertests.AutoFailoverTests.test_enable,replicas=2,keys-count=1000000,num-buckets=2
      http://qa.hq.northscale.net/job/centos-64-2.0-failover-tests/480/consoleFull

      The test changes settings/autoFailover after rebalance, and the request fails with HTTP 500.

      test logs:

      2012-11-27 22:29:42,105 - INFO - MainProcess:MainThread - rest_client:_rebalance_progress - rebalance percentage : 99.5678770149 %
      2012-11-27 22:29:44,114 - INFO - MainProcess:MainThread - rest_client:_rebalance_progress - rebalance percentage : 99.6821896236 %
      2012-11-27 22:29:48,134 - INFO - MainProcess:MainThread - rest_client:monitorRebalance - rebalance progress took 1111.69762397 seconds
      2012-11-27 22:29:48,135 - INFO - MainProcess:MainThread - rest_client:monitorRebalance - sleep for 10 seconds after rebalance...
      2012-11-27 22:29:58,133 - INFO - MainProcess:MainThread - rest_client:update_autofailover_settings - settings/autoFailover params : enabled=true&timeout=30
      2012-11-27 22:29:58,140 - ERROR - MainProcess:MainThread - rest_client:_http_request - http://10.1.3.114:8091/settings/autoFailover error 500 reason: unknown ["Unexpected server error, request logged."]

      server logs:

      [menelaus:warn,2012-11-27T22:53:28.404,ns_1@10.1.3.114:<0.21541.23>:menelaus_web:loop:430]Server error during processing: ["web request failed",
       {path,"/settings/autoFailover"},
       {type,exit},
       {what,
        {noproc,
         {gen_server,call,
          [{global,auto_failover},
           {enable_auto_failover,30,1}]}}},
       {trace,
        [{gen_server,call,2},
         {menelaus_web,handle_settings_auto_failover_post,1},
         {menelaus_web,loop,3},
         {mochiweb_http,headers,5},
         {proc_lib,init_p_do_apply,3}]}]

      Alk, if we have to handle the error and retry again, please assign the ticket back to me
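
The retry mentioned above could look roughly like the sketch below. It is an illustration only, not testrunner code: `call_with_retry`, `attempts`, `delay`, and `sleep` are hypothetical names, and the `send` callable stands in for whatever issues the POST to settings/autoFailover and returns the HTTP status code. Whether a retry helps at all depends on the global name being registered again, which this ticket does not confirm.

```python
import time


def call_with_retry(send, attempts=3, delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-500 status or attempts run out.

    send is a hypothetical callable that performs the request and returns
    the HTTP status code as an int.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    status = None
    for attempt in range(attempts):
        status = send()
        if status != 500:
            return status
        # Pause before the next try, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay)
    return status
```

The last status is returned even when every attempt fails, so the caller can still log the 500 and fail the test.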

          Activity

          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          Workaround:

          wget --user=Administrator --password=asdasd --post-data='rpc:call(mb_master:master_node(), erlang, apply ,[fun () -> erlang:exit(erlang:whereis(mb_master), kill) end, []]).' http://localhost:8091/diag/eval
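
For test code that cannot shell out to wget, the same request can be sketched in Python. This is a rough equivalent under assumptions, not part of the original workaround: `build_diag_eval_request` and `KILL_MB_MASTER` are hypothetical names, the host, port, and credentials are the ones from the wget line, and the Authorization header is sent up front rather than after a challenge.

```python
import base64
import urllib.request

# Erlang expression copied verbatim from the wget workaround.
KILL_MB_MASTER = (
    "rpc:call(mb_master:master_node(), erlang, apply ,"
    "[fun () -> erlang:exit(erlang:whereis(mb_master), kill) end, []])."
)


def build_diag_eval_request(expr, host="localhost", port=8091,
                            user="Administrator", password="asdasd"):
    """Build (but do not send) a POST to /diag/eval with HTTP basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"http://{host}:{port}/diag/eval",
        data=expr.encode("utf-8"),
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


request = build_diag_eval_request(KILL_MB_MASTER)
# To actually run the workaround against a node:
# urllib.request.urlopen(request).read()
```

Building the request separately from sending it keeps the destructive step (killing mb_master on the master node) explicit.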

          andreibaranouski Andrei Baranouski added a comment -

          Same issue when a rebalance request failed: see MB-8682.

          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          This is believed to be an Erlang bug. To fix it, we currently plan to move to master-less cluster orchestration. See MB-8845.

          With that, removing it from the backlog.

          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          MB-9321

          maria Maria McDuff (Inactive) added a comment -

          closing as dupes.


            People

            • Assignee:
              alkondratenko Aleksey Kondratenko (Inactive)
              Reporter:
              andreibaranouski Andrei Baranouski