Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-49782

Expose per stat group timings for cbstats

    XMLWordPrintable

Details

    • Untriaged
    • 1
    • Unknown
    • KV 2021-Nov, KV 2021-Dec, KV 2022-Jan

    Description

      Currently, cbstats timings exposes per-opcode timings.

      This is useful, as it gives distributions of op durations. However, a STAT operation does not correspond to a single internal code path; the requested stat key selects one of a number of stat groups, each of which may have a significantly different duration distribution.

      Consider the following:

      The following data is collected for "STAT"
      [  0.00 -   5.00]us (0.0000%)         1|
      [  5.00 -  37.00]us (10.0000%)    94980| #########################################
      [ 37.00 -  51.00]us (20.0000%)    84194| ####################################
      [ 51.00 -  59.00]us (30.0000%)   101334| ############################################
      [ 59.00 -  67.00]us (40.0000%)    62137| ##########################
      [ 67.00 -  79.00]us (50.0000%)    80450| ##################################
      [ 79.00 - 255.00]us (55.0000%)    43659| ##################
      [255.00 - 351.00]us (60.0000%)    41777| ##################
      [351.00 - 927.00]us (65.0000%)    47208| ####################
      [  0.93 -   1.02]ms (70.0000%)    48366| #####################
      [  1.02 -   1.22]ms (75.0000%)    32849| ##############
      [  1.22 -   1.41]ms (77.5000%)    30597| #############
      [  1.41 -   1.47]ms (80.0000%)    23194| ##########
      [  1.47 -   1.53]ms (82.5000%)    20802| #########
      [  1.53 -   1.60]ms (85.0000%)    12376| #####
      [  1.60 -   1.79]ms (87.5000%)    17822| #######
      [  1.79 -   1.92]ms (88.7500%)    10819| ####
      [  1.92 -   2.05]ms (90.0000%)    12051| #####
      [  2.05 -   2.17]ms (91.2500%)    22366| #########
      [  2.17 -   2.17]ms (92.5000%)        0|
      [  2.17 -   2.30]ms (93.7500%)    19817| ########
      [  2.30 -   2.30]ms (94.3750%)        0|
      [  2.30 -   2.30]ms (95.0000%)        0|
      [  2.30 -   2.43]ms (95.6250%)     8795| ###
      [  2.43 -   2.43]ms (96.2500%)        0|
      [  2.43 -   2.56]ms (96.8750%)     5250| ##
      [  2.56 -   2.69]ms (97.1875%)     4574| #
      [  2.69 -   2.69]ms (97.5000%)        0|
      [  2.69 -   2.82]ms (97.8125%)     3063| #
      [  2.82 -   3.07]ms (98.1250%)     4474| #
      [  3.07 -   3.07]ms (98.4375%)        0|
      [  3.07 -   3.20]ms (98.5938%)     3055| #
      [  3.20 -   3.20]ms (98.7500%)        0|
      [  3.20 -   3.33]ms (98.9062%)     1973|
      [  3.33 -   3.45]ms (99.0625%)     1146|
      [  3.45 -   3.58]ms (99.2188%)      597|
      [  3.58 -   3.71]ms (99.2969%)      612|
      [  3.71 -   3.84]ms (99.3750%)      819|
      [  3.84 -   4.09]ms (99.4531%)      405|
      [  4.09 -   9.22]ms (99.5312%)      691|
      [  9.22 -  13.31]ms (99.6094%)      647|
      [ 13.31 - 102.40]ms (99.6484%)      316|
      [102.40 - 139.26]ms (99.6875%)      399|
      [139.26 - 163.84]ms (99.7266%)      359|
      [163.84 - 180.22]ms (99.7656%)      394|
      [180.22 - 188.41]ms (99.8047%)      228|
      [188.41 - 196.61]ms (99.8242%)      239|
      [196.61 - 204.80]ms (99.8438%)      224|
      [204.80 - 204.80]ms (99.8633%)        0|
      [204.80 - 212.99]ms (99.8828%)      146|
      [212.99 - 229.38]ms (99.9023%)      197|
      [229.38 - 237.57]ms (99.9121%)       71|
      [237.57 - 245.76]ms (99.9219%)       52|
      [245.76 - 253.95]ms (99.9316%)       81|
      [253.95 - 278.53]ms (99.9414%)      109|
      [278.53 - 589.82]ms (99.9512%)       42|
      [  0.59 -   5.77]s (99.9561%)        42|
      [  5.77 -   7.86]s (99.9609%)        43|
      [  7.86 -   9.44]s (99.9658%)        51|
      [  9.44 -   9.96]s (99.9707%)        46|
      [  9.96 -  10.49]s (99.9756%)        35|
      [ 10.49 -  11.01]s (99.9780%)        23|
      [ 11.01 -  11.53]s (99.9805%)        14|
      [ 11.53 -  12.06]s (99.9829%)        65|
      [ 12.06 -  12.06]s (99.9854%)         0|
      [ 12.06 -  12.06]s (99.9878%)         0|
      [ 12.06 -  12.58]s (99.9890%)        17|
      [ 12.58 -  12.58]s (99.9902%)         0|
      [ 12.58 -  13.11]s (99.9915%)         8|
      [ 13.11 -  20.97]s (99.9927%)         7|
      [ 20.97 -  31.46]s (99.9939%)        10|
      [ 31.46 -  39.85]s (99.9945%)        13|
      [ 39.85 -  39.85]s (99.9951%)         0|
      [ 39.85 -  41.94]s (99.9957%)         2|
      [ 41.94 -  52.43]s (99.9963%)         9|
      [ 52.43 -  56.62]s (99.9969%)        10|
      [ 56.62 -  56.62]s (99.9973%)         0|
      [ 56.62 -  56.62]s (99.9976%)         0|
      [ 56.62 -  56.62]s (99.9979%)         0|
      [ 56.62 -  58.72]s (99.9982%)         3|
      [ 58.72 -  60.82]s (99.9985%)         9|
      [ 60.82 -  60.82]s (99.9986%)         0|
      [ 60.82 -  60.82]s (99.9988%)         0|
      [ 60.82 -  60.82]s (99.9989%)         0|
      [ 60.82 -  60.82]s (99.9991%)         0|
      [ 60.82 -  60.82]s (99.9992%)         0|
      [ 60.82 -  60.82]s (99.9993%)         0|
      [ 60.82 -  60.82]s (99.9994%)         0|
      [ 60.82 -  62.91]s (99.9995%)         2|
      [ 62.91 -  62.91]s (99.9995%)         0|
      [ 62.91 -  62.91]s (99.9996%)         0|
      [ 62.91 -  65.01]s (99.9997%)         1|
      [ 65.01 -  65.01]s (99.9997%)         0|
      [ 65.01 -  65.01]s (99.9997%)         0|
      [ 65.01 -  67.11]s (99.9998%)         2|
      [ 67.11 -  67.11]s (100.0000%)        0|
      Total: 846169 operations
      

      While it is useful to know that some stats call took 67s, that still leaves a lot of uncertainty in what exactly took that long. This may be expected or far longer than acceptable, depending on which stat group it really represents.

      To improve debugging of slow stats calls, each stat group (e.g., vbucket-details, collections-details ) should be broken out into its own histogram. These should only be exposed for cbstats; they don't necessarily add enough value when recorded over time to warrant the extra disk space usage for Prometheus.

      Attachments

        For Gerrit Dashboard: MB-49782
        # Subject Branch Project Status CR V

        Activity

          Build couchbase-server-7.1.0-1816 contains kv_engine commit 74e96cd with commit message:
          MB-49782: [1/2] Avoid unnecessary move construction of ScopeTimer1

          build-team Couchbase Build Team added a comment - Build couchbase-server-7.1.0-1816 contains kv_engine commit 74e96cd with commit message: MB-49782 : [1/2] Avoid unnecessary move construction of ScopeTimer1

          Build couchbase-server-7.1.0-1816 contains platform commit ac2dafa with commit message:
          MB-49782: Generalise ScopeTimer to support N listeners

          build-team Couchbase Build Team added a comment - Build couchbase-server-7.1.0-1816 contains platform commit ac2dafa with commit message: MB-49782 : Generalise ScopeTimer to support N listeners

          Build couchbase-server-7.1.0-1817 contains kv_engine commit d9406ce with commit message:
          MB-49782: [2/2] Replace ScopeTimer

          {1,2}

          usage with ScopeTimer

          build-team Couchbase Build Team added a comment - Build couchbase-server-7.1.0-1817 contains kv_engine commit d9406ce with commit message: MB-49782 : [2/2] Replace ScopeTimer {1,2} usage with ScopeTimer

          Build couchbase-server-7.1.0-1819 contains platform commit 139a8ad with commit message:
          MB-49782: Remove unused ScopeTimer1 and ScopeTimer2

          build-team Couchbase Build Team added a comment - Build couchbase-server-7.1.0-1819 contains platform commit 139a8ad with commit message: MB-49782 : Remove unused ScopeTimer1 and ScopeTimer2

          Build couchbase-server-7.1.0-2047 contains kv_engine commit 4040aad with commit message:
          MB-49782: Introduce per-stat-key getStats() timing histograms

          build-team Couchbase Build Team added a comment - Build couchbase-server-7.1.0-2047 contains kv_engine commit 4040aad with commit message: MB-49782 : Introduce per-stat-key getStats() timing histograms

          People

            james.harrison James Harrison
            james.harrison James Harrison
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              PagerDuty