Details
Bug
Resolution: Fixed
Major
6.5.1, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5
Triaged
1
No
Description
The ActiveStreamCheckpointProcessorTask is incorrectly assigned to the AuxIO thread pool (which is intended for IO-bound tasks not directly related to reading / flushing data).
This bug traces back to when ActiveStreamCheckpointProcessorTask was created on the 3.x branch - from the review comments of the original patch (http://review.couchbase.org/c/ep-engine/+/57148/5/src/dcp-producer.cc#152):
Jim Walker
2015-11-24
The initial reason is to try and distribute the load. NONIO is already running the DcpConsumer Processor(s), itempager, expirypager, checkpoint remover, hashtable resizer and many more (http://src.couchbase.org/source/search?q=&defs=&refs=NONIO_TASK_IDX&path=&hist=&type=&project=3.1.1) whereas AUXIO is running DcpBackfill and access scanner (other tasks appear to be TAP related). As stated in the comments, for future (master branch) this should be revisited, maybe creating a DCP task type or increasing NONIO threads?
There are arguably two problems with running it on AuxIO:
- AuxIO tasks often have long runtimes because they perform synchronous disk IO, which can increase the scheduling latency of the ActiveStreamCheckpointProcessorTask. That task should ideally have low latency, given that it directly affects overall replication latency.
- The AuxIO thread pool is small (compared to the NonIO thread pool) - thread counts for a few different machine sizes:
#CPUs  #AuxIO threads  #NonIO threads
8      1               2
16     2               4
32     4               8

As such, DCP streaming (for example for in-memory phase of rebalance) can be limited in how much CPU it can use and be prematurely CPU-bound.
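To put rough numbers on that limit, the following sketch computes the largest share of the machine's CPUs that a task confined to each pool could use. The thread counts are copied from the table above rather than derived from any kv_engine sizing formula, and the names `PoolSizing`, `kObservedSizing` and `maxCpuSharePercent` are hypothetical, introduced only for this illustration. It also assumes each pool thread can occupy at most one CPU at a time.

```cpp
#include <array>
#include <cstddef>

// One row of the thread-count table from the description.
// (Hypothetical type, for illustration only.)
struct PoolSizing {
    std::size_t cpus;
    std::size_t auxIoThreads;
    std::size_t nonIoThreads;
};

// Values copied from the table in this ticket; they are NOT computed
// from the actual kv_engine thread-sizing logic.
constexpr std::array<PoolSizing, 3> kObservedSizing{{
        {8, 1, 2},
        {16, 2, 4},
        {32, 4, 8},
}};

// Upper bound on the percentage of the machine's CPUs that tasks
// confined to a pool of `threads` threads can use, assuming each
// thread occupies at most one CPU at a time.
constexpr double maxCpuSharePercent(std::size_t threads, std::size_t cpus) {
    return 100.0 * static_cast<double>(threads) / static_cast<double>(cpus);
}
```

For every row in the table, `maxCpuSharePercent(row.auxIoThreads, row.cpus)` evaluates to 12.5 and `maxCpuSharePercent(row.nonIoThreads, row.cpus)` to 25.0. Under these assumptions, a task on AuxIO can use at most one eighth of the machine and must share that with the other AuxIO tasks, whereas the NonIO pool offers twice that ceiling.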