Couchbase Server / MB-35748

Under unknown circumstances leader_lease_acquirer hangs trying to shut down worker

Details

    • Untriaged
    • Unknown

    Description

      Two issues were found:

      1. A deadlock in async code, exposed by the ns_rebalancer process ignoring a rebalance stop request.
      2. A race condition in the Erlang runtime: a process that switches trap_exit from true to false may effectively continue to trap exits for some time after the call to process_flag(trap_exit, false) returns. Because the process no longer expects to be trapping exits, the termination request is ignored. We could not get direct evidence that this is what happened in the specific instance where we saw this behaviour, but the race condition explains the evidence we do have.

      (1) is addressed by http://review.couchbase.org/#/c/114292/.
      (2) is addressed by https://github.com/couchbasedeps/erlang/commit/378cfabb7f5c48012a11f2fc3ad969f76d7f0caf.
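
For illustration only, here is a minimal Erlang sketch of the contract that issue (2) violates. It is not taken from ns_server or from the linked fix; the module name trap_exit_sketch, the worker/1 function, and the never_sent message are hypothetical. The sketch waits until the worker has already switched trap_exit back to false, so it shows the expected behaviour rather than reproducing the race. The race described above would require the exit signal to arrive around the time of the process_flag call.

```erlang
-module(trap_exit_sketch).
-export([run/0]).

%% Hypothetical worker: traps exits for a while, then stops trapping
%% and blocks in a selective receive that never matches.
worker(Parent) ->
    process_flag(trap_exit, true),
    %% ... work that must not be interrupted would go here ...
    process_flag(trap_exit, false),
    Parent ! {self(), not_trapping},
    receive
        never_sent -> ok
    end.

run() ->
    Parent = self(),
    {Pid, MRef} = spawn_monitor(fun() -> worker(Parent) end),
    %% Wait until the worker reports that it no longer traps exits.
    receive {Pid, not_trapping} -> ok end,
    %% A non-trapping process should be killed by this signal.
    exit(Pid, shutdown),
    receive
        {'DOWN', MRef, process, Pid, Reason} ->
            {terminated, Reason}
    after 1000 ->
        %% If the signal were instead delivered as an {'EXIT', _, _}
        %% message, it would sit unmatched in the mailbox and the
        %% worker would stay alive, which resembles the observed hang.
        still_alive
    end.
```

Calling trap_exit_sketch:run() on a runtime that honours the flag change is expected to return {terminated, shutdown}. A still_alive result would indicate that the exit was effectively trapped and then ignored.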

          People

            Aliaksey Artamonau (Inactive)
            Artem Stemkovski
            Votes: 0
            Watchers: 7
