Details
- Bug
- Resolution: Fixed
- Critical
- Cheshire-Cat
- Triaged
- 1
- Yes
- KV-Engine 2021-Jan
Description
As seen in MB-43348, if an exception is thrown (and not caught) inside a GlobalTask::run() method when using the FollyExecutorPool, that task can become "stuck" and never get scheduled again.
In the case of MB-43348, an exception was thrown during Flusher::run() and was only caught in folly::ThreadPoolExecutor::runTask(), i.e. the parent of our runTask method. As a result, the state that determines when the Flusher should next run was never updated, and the Flusher task was essentially hung.
For comparison, CB3ExecutorPool doesn't catch any exceptions, so in this situation it would crash memcached entirely. While that is arguably worse behaviour, it at least means memcached will restart and (if the issue that caused the exception was transient) can recover.
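To illustrate the failure mode and the "abort rather than hang" idea named in the change titles for this issue, here is a minimal sketch. It is not the kv_engine code: `SketchTask`, `runTaskOrDie` and `abortOnTaskException` are hypothetical names, and the real GlobalTask and FollyExecutorPool interfaces are not reproduced here. The point it shows is that the scheduling state is only updated after `run()` returns, so an exception swallowed further up the stack would leave the task unscheduled unless the wrapper turns it into a fatal error.

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>
#include <functional>
#include <iostream>
#include <string>

// Hypothetical stand-in for a scheduled task (NOT the real GlobalTask).
struct SketchTask {
    std::string name;
    std::function<bool()> run; // returns true if the task wants to run again
    bool rescheduled = false;  // stands in for the "when to run next" state
};

// Assumed default policy: log and abort the process so that it can be
// restarted, instead of leaving the task silently unscheduled.
inline void abortOnTaskException(const SketchTask& task, const char* what) {
    std::cerr << "Fatal: exception thrown from task '" << task.name
              << "': " << what << std::endl;
    std::abort();
}

// Runs the task. If run() throws, onFatal is invoked and the scheduling
// state is left untouched; otherwise the state is updated as usual.
template <typename FatalHandler>
void runTaskOrDie(SketchTask& task, FatalHandler onFatal) {
    assert(task.run && "task must have a run function");
    bool runAgain = false;
    try {
        runAgain = task.run();
    } catch (const std::exception& e) {
        onFatal(task, e.what());
        return;
    } catch (...) {
        onFatal(task, "unknown exception");
        return;
    }
    // Only reached when run() returned normally.
    task.rescheduled = runAgain;
}
```

In production-style use the call would be `runTaskOrDie(task, abortOnTaskException)`. Passing the handler as a parameter is a choice made for this sketch only, so that the behaviour can be exercised in a test without terminating the process.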
Issue Links
- has to be done before
  - MB-43529 dirtyQueueAge underflows if we get SuccessExistingItem in CheckpointManager (Closed)
For Gerrit Dashboard: MB-43373

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
143046,9 | MB-43373: Abort memcached if exception thrown from GlobalTask::run | master | kv_engine | MERGED | +2 | +1 |
143806,2 | Revert "MB-43373: Abort memcached if exception thrown from GlobalTask::run" | master | kv_engine | MERGED | +2 | +1 |
143807,4 | MB-43373: Abort memcached if exception thrown from GlobalTask::run v2 | master | kv_engine | MERGED | +2 | +1 |
147330,2 | Revert "MB-43373: Abort memcached if exception thrown from GlobalTask::run v2" | master | kv_engine | MERGED | +2 | +1 |