Details
-
Bug
-
Resolution: Fixed
-
Critical
-
5.5.5
-
Untriaged
-
Unknown
Description
Two issues were found:
1. A deadlock in the async code, exposed by the ns_rebalancer process ignoring a rebalance stop request.
2. A race condition in the Erlang runtime: a process that switches trap_exit from true to false may continue to effectively trap exits for some time after the call to process_flag(trap_exit, false) returns. Since the process no longer expects to be trapping exits, a termination request that arrives in that window is delivered as a message that nothing handles, so it is effectively ignored. We couldn't get direct evidence that this is what happened in the specific instance where we saw this behaviour, but the race condition explains the evidence we do have.
(1) is addressed by http://review.couchbase.org/#/c/114292/.
(2) is addressed by https://github.com/couchbasedeps/erlang/commit/378cfabb7f5c48012a11f2fc3ad969f76d7f0caf.
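
For reference, the behaviour that (2) violates can be sketched as follows. This is a minimal illustration only, not code from ns_server: the module name trap_exit_sketch and the function run/0 are hypothetical, and a single run like this is not expected to reproduce the race reliably. It only shows what a caller is entitled to assume once process_flag(trap_exit, false) has returned.

```erlang
-module(trap_exit_sketch).
-export([run/0]).

%% Hypothetical sketch: all names here are illustrative assumptions.
run() ->
    Parent = self(),
    {Pid, Ref} = spawn_monitor(fun() ->
        %% Trap exits for a while, then stop trapping.
        process_flag(trap_exit, true),
        process_flag(trap_exit, false),
        Parent ! {ready, self()},
        %% From here on an exit signal should kill this process
        %% instead of arriving as an {'EXIT', From, Reason} message.
        receive
            {'EXIT', _From, Reason} ->
                %% Reached only if exits are still being trapped.
                Parent ! {still_trapping, Reason}
        end
    end),
    %% Wait until the child has switched trap_exit back off.
    receive {ready, Pid} -> ok end,
    %% Ask the child to terminate, as a supervisor or parent would.
    exit(Pid, shutdown),
    receive
        {'DOWN', Ref, process, Pid, DownReason} ->
            {terminated, DownReason};
        {still_trapping, ExitReason} ->
            {still_trapping, ExitReason}
    after 1000 ->
        timeout
    end.
```

On a runtime without the race, trap_exit_sketch:run() should return {terminated, shutdown}. A {still_trapping, shutdown} result would correspond to the situation described in (2), where the exit signal is converted to a message even though the process has already asked to stop trapping exits.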