Description
This code isn't executed when a rebalance is stopped, because we run rebalance_body via an async and the async code converts the Reason of every terminate request to "shutdown".
In this particular case, a user-initiated stop with Reason "{shutdown, stop}" is converted to "shutdown", so the clause matching "{shutdown, stop}" never fires.
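The conversion described above can be reproduced in isolation. The following is only a minimal sketch: the module name exit_reason_demo and the functions wrapper/1 and body/2 are hypothetical stand-ins, not the ns_server async module, and the wrapper's behaviour (always stopping its child with plain shutdown) is assumed from the description in this ticket.

```erlang
-module(exit_reason_demo).
-export([run/0]).

%% Sends a user-style stop to the wrapper and returns what the body observed.
run() ->
    Self = self(),
    Wrapper = spawn(fun () -> wrapper(Self) end),
    receive wrapper_ready -> ok end,
    exit(Wrapper, {shutdown, stop}),
    receive
        {body_saw, What} -> What
    after 1000 ->
        timeout
    end.

%% Hypothetical stand-in for the async wrapper (assumption): whatever reason
%% it is asked to stop with, it terminates its child with plain 'shutdown'.
wrapper(Report) ->
    process_flag(trap_exit, true),
    Me = self(),
    Child = spawn_link(fun () -> body(Me, Report) end),
    receive child_ready -> ok end,
    Report ! wrapper_ready,
    receive
        {'EXIT', From, _Reason} when From =/= Child ->
            exit(Child, shutdown)
    end,
    %% Wait for the child to finish before exiting ourselves.
    receive {'EXIT', Child, _} -> ok end.

%% Hypothetical stand-in for rebalance_body: reports which exit reason
%% actually reached it.
body(Parent, Report) ->
    process_flag(trap_exit, true),
    Parent ! child_ready,
    receive
        {'EXIT', _From, {shutdown, stop}} ->
            Report ! {body_saw, user_stop};
        {'EXIT', _From, shutdown} ->
            Report ! {body_saw, plain_shutdown}
    end.
```

In this sketch, exit_reason_demo:run() yields plain_shutdown rather than user_stop, which mirrors why a clause matching only {shutdown, stop} is never reached in the body.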
In the logs from a rebalance stopped on the baseline code, the "Got rebalance stop" message is never printed:
fd "debug.log" | xargs -I {} sh -c "rg \"Got rebalance stop\" {}"
The "Rebalance stopped by user" message logged by ns_orchestrator does appear:
fd "debug.log" | xargs -I {} sh -c "rg \"Rebalance stopped\" {}"
29150:[user:info,2022-10-07T11:31:04.021-07:00,n_0@192.168.86.150:<0.4997.0>:ns_orchestrator:log_rebalance_completion:1448]Rebalance stopped by user.
Based on a discussion with Artem: terminating the ns_vbucket_mover async is dangerous and can potentially lead to data loss, so the fix is changed to match on shutdown instead of {shutdown, stop}.
Artem Stemkovski, 1:03 PM: The consequences of the orphaned mover doing something could be quite serious. Like data corruption.
Issue Links
- is caused by: MB-58734 Async always exits child asyncs with 'shutdown', which prevents some async processes receiving the intended exit reason (Reopened)