  1. Couchbase Server
  2. MB-49786

Default number of AuxIO threads too low for HiDD workloads

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 7.1.0
    • None
    • couchbase-bucket
    • None

    Description

      The number of AuxIO threads is currently computed as:

      static const size_t EP_MAX_AUXIO_THREADS = 8;
      ...
      size_t ExecutorPool::calcNumAuxIO(size_t threadCount) const {
          // 1. compute: ceil of 10% of total threads
          size_t count = maxGlobalThreads / 10;
          if (!count || maxGlobalThreads % 10) {
              count++;
          }
          // 2. adjust computed value to be within range
          if (count > EP_MAX_AUXIO_THREADS) {
              count = EP_MAX_AUXIO_THREADS;
          }
          // 3. Override with user's value if specified
          if (threadCount) {
              count = threadCount;
          }
          return count;
      }
      

      That is, one tenth of the number of CPU cores (rounded up), with a minimum of 1 and a maximum of 8, unless the user specifies a thread count explicitly.
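      To make the resulting values concrete, the calculation can be sketched as a standalone function. This is an illustrative restatement, not the actual ExecutorPool code: passing maxGlobalThreads as a parameter and the names kMaxAuxIOThreads and userThreadCount are assumptions made so the snippet compiles on its own.

```cpp
#include <cstddef>

// Mirrors EP_MAX_AUXIO_THREADS from the snippet above.
static const std::size_t kMaxAuxIOThreads = 8;

// Hypothetical free-function form of ExecutorPool::calcNumAuxIO.
// Assumption: maxGlobalThreads is passed in explicitly here; in the
// original it is read from the ExecutorPool object.
std::size_t calcNumAuxIO(std::size_t maxGlobalThreads,
                         std::size_t userThreadCount) {
    // 1. Ceiling of 10% of the total threads, never less than 1.
    std::size_t count = maxGlobalThreads / 10;
    if (!count || maxGlobalThreads % 10) {
        count++;
    }
    // 2. Clamp the computed value to the maximum.
    if (count > kMaxAuxIOThreads) {
        count = kMaxAuxIOThreads;
    }
    // 3. A non-zero user value overrides the computed one (and is not clamped).
    if (userThreadCount) {
        count = userThreadCount;
    }
    return count;
}
```

      With no user override, inputs of 4, 8 and 10 all give 1 AuxIO thread, 16 gives 2, 32 gives 4, 64 gives 7, and 80 or more give 8.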

      This has been seen to be insufficient for heavy-DGM workloads, which rely on backfill reading items from disk (instead of just returning values from the in-memory HashTable), and to negatively impact backfill throughput (MB-48693).

      As noted in MB-48693, other components such as FTS may create N streams per collection, per vbucket. As these streams normally start from 0, this may equate to a very large number of backfills (70k - 512k).

      KV currently attempts to serve only 4096 backfills at a time (DcpConnMap::numBackfillsThreshold); the rest are snoozed and started when capacity becomes available.
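      The effect of such a threshold can be sketched with a simple admission gate. This is a minimal illustration, not the real DcpConnMap logic: BackfillGate, tryStart, finish and countSnoozed are hypothetical names, and the sketch assumes all backfill requests arrive before any of them completes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical admission gate; it only models "at most `limit` backfills
// active at once" and is not the actual DcpConnMap implementation.
class BackfillGate {
public:
    explicit BackfillGate(std::size_t limit) : limit(limit) {}

    // Returns true if a backfill may start now; false means the caller
    // should snooze and retry once capacity is available.
    bool tryStart() {
        if (active >= limit) {
            return false;
        }
        ++active;
        return true;
    }

    // Called when an active backfill completes, freeing one slot.
    void finish() {
        assert(active > 0);
        --active;
    }

private:
    const std::size_t limit;
    std::size_t active = 0;
};

// Counts how many of `requested` backfills would be deferred if they all
// arrived at once (assumption: none finishes in the meantime).
std::size_t countSnoozed(std::size_t requested, std::size_t limit) {
    BackfillGate gate(limit);
    std::size_t snoozed = 0;
    for (std::size_t i = 0; i < requested; ++i) {
        if (!gate.tryStart()) {
            ++snoozed;
        }
    }
    return snoozed;
}
```

      Under these assumptions, countSnoozed(70000, 4096) returns 65904: of 70,000 requested backfills, only 4096 would be active and the remainder would wait.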

      In small deployments, there may be only one AuxIO thread. This means that, regardless of the number of BackfillManagerTasks [1], only one backfill will actually be reading items from disk at a time.

      As these tasks are typically IO bound, increasing the thread count is not expected to significantly increase CPU contention, but could improve the total rate at which items can be backfilled (in situations where disk IO is not saturated by a single backfill plus frontend ops).
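      The argument can be illustrated with a toy scheduling model. It is a sketch under strong assumptions, not a measurement: every task needs the same number of slices, each AuxIO thread completes one slice of one runnable task per round, tasks are served round-robin, and the disk is never saturated. The function name roundsToDrain and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Toy model: returns the number of rounds needed to finish all tasks when
// each AuxIO thread runs one slice of one runnable task per round.
// Assumption: disk IO is not saturated, so every thread makes progress.
std::size_t roundsToDrain(std::size_t numTasks,
                          std::size_t slicesPerTask,
                          std::size_t numAuxIOThreads) {
    assert(slicesPerTask > 0 && numAuxIOThreads > 0);
    // Each entry holds the slices remaining for one runnable task.
    std::deque<std::size_t> runnable(numTasks, slicesPerTask);
    std::size_t rounds = 0;
    while (!runnable.empty()) {
        // No more tasks can run at once than there are AuxIO threads.
        const std::size_t running =
                std::min(numAuxIOThreads, runnable.size());
        for (std::size_t i = 0; i < running; ++i) {
            const std::size_t remaining = runnable.front() - 1;
            runnable.pop_front();
            if (remaining > 0) {
                runnable.push_back(remaining); // reschedule unfinished task
            }
        }
        ++rounds;
    }
    return rounds;
}
```

      In this model, 6 tasks of 10 slices each take 60 rounds with one AuxIO thread, 15 rounds with four, and 10 rounds with eight (only six threads can be busy at once). Real gains depend on how close the disk already is to saturation.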


      [1] One per DcpProducer; there is generally one DcpProducer per consuming node per component.

            People

              ashwin.govindarajulu Ashwin Govindarajulu
              james.harrison James Harrison (Inactive)