Details
- Type: Improvement
- Resolution: Fixed
- Priority: Major
- Fix Version: 7.6.0
- Sprint: KV 2023-4
Description
Disk I/O Optimized thread counts for Reader and Writer threads were introduced in 6.5.0 alongside SyncWrites to improve both the throughput and latency of operations requiring disk reads/writes, by increasing the number of threads created for performing Reader / Writer tasks.
Currently (7.2.0), when Disk I/O Optimized is selected, the number of threads created is a function of the number of logical CPU cores available to the node (NCPU):
- Readers: NCPU threads, min 4, max 128.
- Writers: NCPU threads, min 4, max 128.
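The current sizing rule can be sketched as follows (a minimal illustration; `calcNumIOThreads` is a hypothetical name, not the actual KV-Engine function):

```cpp
#include <algorithm>
#include <cstddef>

// Current Disk I/O Optimized sizing: one thread per logical CPU core,
// clamped to the range [4, 128].
static size_t calcNumIOThreads(size_t numCPUs) {
    return std::clamp<size_t>(numCPUs, 4, 128);
}
```

So a 2-core node still gets 4 threads, a 48-core node gets 48, and anything above 128 cores is capped at 128.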
As per comments in the code, the rationale for these values is:

// Configure Reader threads based on CPU count; increased up
// to a maximum of 128 threads.
//
// Note: For maximum IO throughput we should create as many Reader
// threads as concurrent iops the system can support, given we use
// synchronous (blocking) IO and hence could utilise more threads than
// CPU cores. However, knowing the number of concurrent IOPs the system
// can support is hard, so we use #CPUs as a proxy for it - machines
// with lots of CPU cores are more likely to have more IO than little
// machines.
//
// However given we don't have test environments larger than
// ~128 cores, limit to 128.
(Writers have a similar comment and essentially the same rationale.)
However, recent experience with environments that have (relatively) high-latency disks, and/or disks with high maximum queue depths (e.g. NVMe), has highlighted that one IO thread per CPU core is woefully insufficient to saturate the available disk IO.
For example, on MB-56202 - specifically this summary table comment - we observed that the available GCP disk IOPS of ~15,000 could not be achieved without increasing the number of IO threads to 32 - 2x the number of CPU cores in that test. (Note that the experiment concerned DCP backfill using AuxIO threads, but I believe the principle also applies here.)
I propose we increase the coefficient applied to NCPU for the Disk I/O Optimized setting for Reader and Writer threads from 1 to a higher value - perhaps 2 or even 4. This should give users a much better out-of-the-box experience when Disk I/O Optimized is selected[*], particularly in high-latency disk environments such as EBS-style disks in the Cloud.
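Under this proposal, the sizing rule gains a multiplier applied before the clamp. A minimal sketch, assuming the existing min/max bounds of [4, 128] are retained (the function name and `coefficient` parameter are illustrative, not the actual KV-Engine code):

```cpp
#include <algorithm>
#include <cstddef>

// Proposed sizing: scale NCPU by a coefficient (e.g. 2 or 4) before
// clamping to the existing [4, 128] bounds.
static size_t calcNumIOThreads(size_t numCPUs, size_t coefficient) {
    return std::clamp<size_t>(numCPUs * coefficient, 4, 128);
}
```

With a coefficient of 2, a 16-core node (as in the MB-56202 experiment) would get 32 IO threads out of the box - the thread count that was needed there to achieve the disk's ~15,000 IOPS.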
[*] I believe Disk I/O Optimized is the setting used in Capella, so this would automatically apply there.
See also: MB-55086 where we recently increased the number of NonIO threads.