Couchbase Server / MB-60276

"All" stats group goes to disk

Details

    • Triaged
    • 0
    • Yes
    • KV 2023-4, Magma-Jan18-2024

    Description

      The stats group "all" (no key) goes to disk to fetch history stats as part of the vBucket visitor. For couchstore buckets in particular this is pointless, as history (CDC) is not a supported feature for that storage backend. This exacerbates slow disk issues and can cause the node to be marked as not ready by ns_server if ns_server re-uses a connection currently performing "all" stats to gather the "warmup" stats that drive the bucket's "readiness" status. We should only fetch these stats if the StorageProperties indicate that history retention is supported.
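
      The proposed guard can be sketched as follows. This is a minimal Python model, not the KV-engine code: `StorageProperties.history_retention_supported`, `FakeStore`, `fetch_history_stats` and the stat key names are all hypothetical stand-ins chosen for illustration. It only shows the shape of the fix, which is that the per-vBucket visitor skips the disk read when the backend does not support history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageProperties:
    """Hypothetical stand-in for the backend capability descriptor."""
    history_retention_supported: bool


class FakeStore:
    """Hypothetical storage backend that counts simulated disk reads."""

    def __init__(self, properties):
        self.properties = properties
        self.disk_reads = 0

    def fetch_history_stats(self, vbid):
        # In the real system this is the slow step: one disk read per vBucket.
        self.disk_reads += 1
        return {f"vb_{vbid}:history_stat": 0}  # illustrative key name


def collect_all_stats(store, vbids):
    """Model of the "all" stats vBucket visitor with the proposed guard."""
    stats = {}
    for vbid in vbids:
        # In-memory stats are always cheap to gather.
        stats[f"vb_{vbid}:visited"] = True
        # Only go to disk when the backend can actually hold history.
        if store.properties.history_retention_supported:
            stats.update(store.fetch_history_stats(vbid))
    return stats
```

      With the guard in place, a store whose properties report no history support performs zero disk reads in this model, regardless of how many vBuckets are visited.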

      Affected configurations

      Affects nodes with Couchstore buckets only.

      Presentation

      STAT(all) commands, which ns_server issues every 10s for each bucket individually, can cause latency spikes. Before MB-61376 these commands run on the front-end thread, and because of this bug they also perform disk IO (which is slow) once for every vBucket (active or replica) on the node for that bucket.

      The customer may observe higher latency when those STAT commands run. Those STAT commands may show up as Slow operations, but only if they took longer than 500ms. These STAT commands will prevent other operations from running. An operation which was prevented from running might not be reported as a Slow operation, because Slow operations are timed from when the MCBP header was validated.
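
      A rough back-of-envelope model of the effect, assuming the per-vBucket disk reads are serial and dominate the runtime of the command. The function names and the example inputs are illustrative assumptions, not measurements; only the 500 ms Slow operation threshold comes from this ticket.

```python
SLOW_OP_THRESHOLD_MS = 500  # Slow operation reporting threshold quoted above


def estimate_all_stats_ms(vbuckets_on_node, disk_read_ms):
    """Estimated time one STAT(all) spends on disk IO, in milliseconds.

    Assumes one serial disk read per vBucket (active or replica).
    """
    return vbuckets_on_node * disk_read_ms


def would_log_slow_op(vbuckets_on_node, disk_read_ms):
    """True if the estimate exceeds the Slow operation threshold."""
    return estimate_all_stats_ms(vbuckets_on_node, disk_read_ms) > SLOW_OP_THRESHOLD_MS
```

      Under these assumptions, 512 vBuckets at 0.5 ms per read would occupy the thread for about 256 ms without ever being reported as a Slow operation, which is why the absence of Slow operation logs does not rule this bug out.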

      Detection

      Check for Slow operations for "STAT" with keylen: 0 (which is the "all" group). If those are present, and clients observe timeouts or higher latency, they are likely seeing this bug. There might not be "Slow operation" logs for the operations which timed out on the client.
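
      The log check above could be automated along these lines. This sketch assumes each Slow operation log line contains the text "Slow operation" followed by a JSON object with a "command" field and a "packet" object carrying "keylen"; that layout is an assumption and should be verified against the actual memcached.log on the node. The function name is hypothetical.

```python
import json


def find_slow_all_stats(lines):
    """Return parsed Slow operation entries for STAT with keylen 0.

    Assumes the (unverified) layout: '... Slow operation: {json}'.
    Lines that do not match or do not parse are skipped.
    """
    hits = []
    for line in lines:
        marker = line.find("Slow operation")
        if marker == -1:
            continue
        start = line.find("{", marker)
        if start == -1:
            continue
        try:
            entry = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # truncated or differently formatted line
        packet = entry.get("packet")
        if not isinstance(packet, dict):
            continue
        # keylen 0 means no stat key was given, i.e. the "all" group.
        if entry.get("command") == "STAT" and packet.get("keylen") == 0:
            hits.append(entry)
    return hits
```

      Usage would be something like `find_slow_all_stats(open(path))` with `path` pointing at a memcached log file. An empty result does not rule out the bug, since STAT commands that finish under the threshold are not logged.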

      Additionally, if MB-61466 is fixed, look at stats.log under the stat-timings header and observe the runtimes for the "all" stat group.

      If running on >=7.1.0, it is possible to manually run cbstats stat-timings to get the same information.

            People

              ashwin.govindarajulu Ashwin Govindarajulu
              ben.huddleston Ben Huddleston