Couchbase Server
MB-5279

User is unable to add the node back to the cluster because the failed-over node is stuck in the warmup phase

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.8.1-release-candidate
    • Fix Version/s: 1.8.1
    • Component/s: ns_server
    • Security Level: Public
    • Labels:
    • Environment: CentOS 64 bit, 1.8.1-807

      Description

      We have not seen this issue previously on 1.8.1; it seems to be something new.

      Test case that failed:
      swaprebalance.SwapRebalanceFailedTests.test_add_back_failed_node,replica=3,num-buckets=5,num-swap=2,swap-orchestrator=True,keys-count=800000

      1) Fail over 2 nodes, including the orchestrator.
      2) Rebalance them out.
      3) Add them back and rebalance in, without cleaning them.

      [rebalance:error] [2012-05-10 8:48:59] [ns_1@10.1.3.74:<0.27930.16>:ns_rebalancer:do_wait_buckets_shutdown:170] Failed to wait deletion of some buckets on some nodes:
      [{'ns_1@10.1.3.77',
        {'EXIT',
         {old_buckets_shutdown_wait_failed,
          ["bucket-2","bucket-1","bucket-0","bucket-4","bucket-3"]}}}]

      [user:info] [2012-05-10 8:48:59] [ns_1@10.1.3.74:<0.27893.16>:ns_orchestrator:handle_info:246] Rebalance exited with reason
      {buckets_shutdown_wait_failed,
       [{'ns_1@10.1.3.77',
         {'EXIT',
          {old_buckets_shutdown_wait_failed,
           ["bucket-2","bucket-1","bucket-0","bucket-4","bucket-3"]}}}]}
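
      The failure term reads from the inside out: the outer buckets_shutdown_wait_failed tuple wraps a list of per-node failures, and each failure names the buckets that never finished shutting down. As a minimal, hypothetical Erlang fragment (not actual ns_server code) for pulling the stuck buckets out of such a term:

      -module(failure_term_sketch).
      -export([decode_failure/1]).

      %% Returns [{Node, StuckBuckets}] from a rebalance failure term.
      decode_failure({buckets_shutdown_wait_failed, PerNode}) ->
          [{Node, Buckets}
           || {Node, {'EXIT', {old_buckets_shutdown_wait_failed, Buckets}}}
              <- PerNode].

      Applied to the term above, this yields [{'ns_1@10.1.3.77', ["bucket-2","bucket-1","bucket-0","bucket-4","bucket-3"]}].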

        Activity

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Found that shutting down bucket-2 indeed took minutes, even though (according to the logs) we passed the force flag:

        [user:info] [2012-05-10 8:45:32] [ns_1@10.1.3.77:'ns_memcached-bucket-0':ns_memcached:terminate:296] Shutting down bucket "bucket-0" on 'ns_1@10.1.3.77' for deletion
        ... (no other messages from this erlang process in between) ...
        [ns_server:debug] [2012-05-10 8:49:38] [ns_1@10.1.3.77:'ns_memcached-bucket-0':ns_storage_conf:delete_bucket_db_directory:384] Result of deleting db directory: "/opt/couchbase/var/lib/couchbase/data/bucket-0-data": ok

        That's what the rebalance failure actually mentioned.

        I saw memcached crash on this .77 node shortly before we started the bucket deletion. So my guess is that warmup prevents a force shutdown from being quick. I.e.:

        [error_logger:info] [2012-05-10 8:45:04] [ns_1@10.1.3.77:error_logger:ale_error_logger_handler:log_report:72]
        =========================PROGRESS REPORT=========================
        supervisor: {local,ns_port_sup}
        started: [{pid,<0.29206.11>},
                  {name,
                   {memcached,"/opt/couchbase/bin/memcached",
                    ["-X",
                     "/opt/couchbase/lib/memcached/stdin_term_handler.so",
                     "-l","0.0.0.0:11210,0.0.0.0:11209:1000","-E",
                     "/opt/couchbase/lib/memcached/bucket_engine.so",
                     "-B","binary","-r","-c","10000","-e",
                     "admin=_admin;default_bucket_name=default;auto_create=false",
                     []],
                    [{env,
                      [{"EVENT_NOSELECT","1"},
                       {"MEMCACHED_TOP_KEYS","100"},
                       {"ISASL_PWFILE","/opt/couchbase/var/lib/couchbase/data/isasl.pw"},
                       {"ISASL_DB_CHECK_TIME","1"}]},
                     use_stdio,stderr_to_stdout,exit_status,
                     port_server_send_eol,stream]}},
                  {mfargs,
                   {supervisor_cushion,start_link,
                    [memcached,5000,ns_port_server,start_link,
                     [memcached,"/opt/couchbase/bin/memcached",
                      ["-X",
                       "/opt/couchbase/lib/memcached/stdin_term_handler.so",
                       "-l","0.0.0.0:11210,0.0.0.0:11209:1000","-E",
                       "/opt/couchbase/lib/memcached/bucket_engine.so",
                       "-B","binary","-r","-c","10000","-e",
                       "admin=_admin;default_bucket_name=default;auto_create=false",
                       []],
                      [{env,
                        [{"EVENT_NOSELECT","1"},
                         {"MEMCACHED_TOP_KEYS","100"},
                         {"ISASL_PWFILE","/opt/couchbase/var/lib/couchbase/data/isasl.pw"},
                         {"ISASL_DB_CHECK_TIME","1"}]},
                       use_stdio,stderr_to_stdout,exit_status,
                       port_server_send_eol,stream]]]}},
                  {restart_type,permanent},
                  {shutdown,10000},
                  {child_type,worker}]

        and prior to that:

        [error_logger:error] [2012-05-10 8:45:03] [ns_1@10.1.3.77:error_logger:ale_error_logger_handler:log_report:72]
        =========================CRASH REPORT=========================
        crasher:
          initial call: ns_port_server:init/1
          pid: <0.15086.9>
          registered_name: []
          exception exit: {abnormal,134}
            in function gen_server:terminate/6
          ancestors: [<0.15085.9>,ns_port_sup,ns_server_sup,
                      ns_server_cluster_sup,<0.60.0>]
          messages: [{'EXIT',#Port<0.174933>,normal}]
          links: [<0.15085.9>]
          dictionary: []
          trap_exit: true
          status: running
          heap_size: 121393
          stack_size: 24
          reductions: 1661845
        neighbours:

        and even more prior to that:

        [ns_server:info] [2012-05-10 8:44:34] [ns_1@10.1.3.77:<0.15086.9>:ns_port_server:log:161] memcached<0.15086.9>: [warn] Epoll ADD(5) on fd 247 failed. Old events were 0; read change was 1 (add); write change was 1 (add): Bad file descriptor
        memcached<0.15086.9>: Failed to add connection to libevent: Bad file descriptor
        memcached<0.15086.9>: memcached: bucket_engine.c:997: handle_disconnect: Assertion `es' failed.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Given all that, and especially that this is most likely not a regression (i.e. warmup likely delayed bucket deletion on 1.8.0 as well), assigning to Frank so that we can consider whether this is really a blocker.

        dipti Dipti Borkar added a comment -

        From Farshid: this doesn't seem like a regression, but it is a common use case that we should fix in 1.8.1. We need a fix for this.

        dipti Dipti Borkar added a comment -

        Perhaps the behavior here needs to be changed? E.g., don't allow a rebalance to start if a node is warming up? Or is there something else going on under the covers?

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Something else.

        We failed over the node.

        This means we requested deletion of the bucket on that node. The deletion may take some time, and we normally don't care, but we don't expect it to take ages.

        What happens next is that the node is added back while this deletion is still in progress, and we refuse, because we're going to re-create this bucket and need to complete its deletion first.

        This is something we have long wanted from ep-engine: we want bucket deletion to be reasonably quick. Always.

        If you still think we need the logic "don't start rebalance unless you can be reasonably sure it will succeed", we can work on it (see the sketch below). It doesn't look like a very trivial task, though.
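
        As a sketch of that pre-rebalance check, assuming a hypothetical bucket_shutdown_done/2 remote check and a 30-second timeout (neither taken from the actual ns_rebalancer code):

        -module(rebalance_precheck_sketch).
        -export([wait_buckets_shutdown/2]).

        -define(SHUTDOWN_WAIT_TIMEOUT, 30000).  % assumed timeout, in ms

        %% Before rebalance, confirm that old buckets finished shutting down
        %% on every node; abort with the error seen in this ticket otherwise.
        wait_buckets_shutdown(Nodes, Buckets) ->
            Failures =
                [{N, {'EXIT', {old_buckets_shutdown_wait_failed, Stuck}}}
                 || N <- Nodes,
                    Stuck <- [[B || B <- Buckets,
                                    not bucket_shutdown_done(N, B)]],
                    Stuck =/= []],
            case Failures of
                [] -> ok;
                _  -> exit({buckets_shutdown_wait_failed, Failures})
            end.

        bucket_shutdown_done(Node, Bucket) ->
            %% Hypothetical remote check; in this bug the per-bucket
            %% ns_memcached process is still busy warming up, so the call
            %% times out and the bucket is reported as stuck.
            Name = list_to_atom("ns_memcached-" ++ Bucket),
            try gen_server:call({Name, Node}, wait_shutdown,
                                ?SHUTDOWN_WAIT_TIMEOUT) of
                ok -> true
            catch
                exit:_ -> false
            end.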

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Correction "dont allow user to start rebalance unless you can reasonably sure it will succeed". Let me note that current behavior of performing all checks at the start of rebalance (after user has requested rebalance) is not very different. The difference is some potential confusion on behalf of user if rebalance fails immediately.

        dipti Dipti Borkar added a comment -

        Is it easy, then, to make it more obvious why the rebalance failed?

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Isn't the following good enough?

        Rebalance exited with reason
        {buckets_shutdown_wait_failed,
         [{'ns_1@10.1.3.77',
           {'EXIT',
            {old_buckets_shutdown_wait_failed,
             ["bucket-2","bucket-1","bucket-0","bucket-4","bucket-3"]}}}]}

        dipti Dipti Borkar added a comment -

        Thanks Aliaksey. I understand this issue better now. Let's see what can be done in ep_engine.

        farshid Farshid Ghods (Inactive) added a comment -

        ns_server should send the bucket-deletion request to the memcached that is warming up, and ep-engine should abort the warmup process and continue with the bucket deletion.

        dipti Dipti Borkar added a comment -

        Chiyoung, can you estimate the level of change needed to support this? How risky are these changes?

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        BTW, do we have reliable evidence that the bucket deletion was delayed because of warmup in this particular case? I.e., I know warmup was likely in progress, but that's all I know about this bucket deletion.

        farshid Farshid Ghods (Inactive) added a comment -

        Yes, we have confirmed that the warmup process was going on and that bucket deletion caused the whole thing to hang.

        If it helps, we can get a gdb backtrace while this is happening, but given that this was seen many times by QE and customers, I think we can safely assume that's the case.

        chiyoung Chiyoung Seo added a comment -

        Can ns_server move the database files, or delete them directly, before the failed-over node is added back?

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        We do delete the files, but only after bucket shutdown is complete.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        We have a patch for 1.8 that makes memcached exit immediately, without shutting down buckets, when failover happens. That could be an easy treatment for this problem.

        chiyoung Chiyoung Seo added a comment -

        I'm a little confused now. Why is the failed-over node warming up again?

        Anyway, if ns_server already has the patch for this issue, can I close this bug?

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        memcached crashed before the node was failed over. Then ns_server restarted memcached and recreated the buckets, and they started warming up.

        Then the node was failed over. ns_server deletes all buckets (with the force flag set to true), but that takes a long time, seemingly because of warmup. And that prevents rebalancing back in, because before starting a rebalance we now wait for pending bucket deletions to complete.

        The patch I mentioned is against memcached. The idea is that when we detect failover, we ask memcached to just exit without any cleanup.

        I don't know why, but Farshid and Dipti wanted ep-engine to be fixed. Given that warmup is very different between 2.0 and 1.8.1, I question whether we really need that, btw.
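
        For reference, a minimal sketch of that idea, under stated assumptions: Port is the Erlang port that ns_port_server holds for memcached (opened with exit_status, per the progress report above), and the "shutdown" stdin token follows the commit message quoted later in this ticket, though the exact protocol is an assumption:

        -module(fast_kill_sketch).
        -export([kill_memcached_fast/1]).

        %% On failover, tell memcached to exit immediately instead of
        %% waiting for graceful (warmup-blocked) bucket shutdown.
        kill_memcached_fast(Port) ->
            port_command(Port, <<"shutdown\n">>),  % assumed stdin token
            receive
                {Port, {exit_status, _Status}} ->
                    ok                             % memcached has exited
            after 5000 ->
                catch port_close(Port),            % fallback: close the port
                ok
            end.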

        farshid Farshid Ghods (Inactive) added a comment -

        >>The idea is when we detect failover we ask memcached to just exit without any cleanup.

        is this like kill -9 ?

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Yes, but it works on Windows too.

        farshid Farshid Ghods (Inactive) added a comment -

        OK, so if this is a force kill, then we don't have to fix this in ep-engine.

        For this specific issue, all we care about is the end-user experience: whether the user can add the node back to the cluster without any issues. Your patch takes care of that, so we can close this bug.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Patches backported and uploaded to Gerrit. I can rubber-stamp the ns_server patches, but not the memcached one.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        http://review.couchbase.org/16432
        http://review.couchbase.org/16436
        http://review.couchbase.org/16437
        http://review.couchbase.org/16438

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Should be fixed now, with the memcached and ns_server sides merged.

        thuan Thuan Nguyen added a comment -

        Integrated in github-ns-server-2-0 #357 (See http://qa.hq.northscale.net/job/github-ns-server-2-0/357/)
        MB-5279:bp: allow port_servers to send arbitrary data to port (Revision ab2b8ecf77658fda7266a9eaed1b474f29741e43)
        MB-5279:bp:made supervisor_cushion exit with same reason as child (Revision 5ba3d6e1634c613ff3b6c1678f3eb04449434834)
        MB-5279:bp: ask memcached to die real quick on failover (Revision 2f14d3a4107b6c7fccc4c479832048bb259f469c)
        MB-5279: send shutdown command instead of empty line to memcached (Revision 6118863758b88da374868d2adf7a890096067394)

        Result = SUCCESS
        Aliaksey Kandratsenka :
        Files :

        • src/ns_port_server.erl

        Aliaksey Kandratsenka :
        Files :

        • src/supervisor_cushion.erl

        Aliaksey Kandratsenka :
        Files :

        • src/ns_port_sup.erl
        • src/ns_server_sup.erl

        Aliaksey Kandratsenka :
        Files :

        • src/ns_port_server.erl

          People

          • Assignee: alkondratenko Aleksey Kondratenko (Inactive)
          • Reporter: karan Karan Kumar (Inactive)
          • Votes: 0
          • Watchers: 2
