Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-49782

Expose per stat group timings for cbstats

    XMLWordPrintable

Details

    • Untriaged
    • 1
    • Unknown
    • KV 2021-Nov, KV 2021-Dec, KV 2022-Jan

    Description

      Currently, cbstats timings exposes per-opcode timings.

      This is useful, as it gives distributions of op durations. However, a STAT operation does not correspond to a single internal code path; the requested stat key selects one of a number of stat groups, each of which may have a significantly different duration distribution.

      Consider the following:

      The following data is collected for "STAT"
      [  0.00 -   5.00]us (0.0000%)         1|
      [  5.00 -  37.00]us (10.0000%)    94980| #########################################
      [ 37.00 -  51.00]us (20.0000%)    84194| ####################################
      [ 51.00 -  59.00]us (30.0000%)   101334| ############################################
      [ 59.00 -  67.00]us (40.0000%)    62137| ##########################
      [ 67.00 -  79.00]us (50.0000%)    80450| ##################################
      [ 79.00 - 255.00]us (55.0000%)    43659| ##################
      [255.00 - 351.00]us (60.0000%)    41777| ##################
      [351.00 - 927.00]us (65.0000%)    47208| ####################
      [  0.93 -   1.02]ms (70.0000%)    48366| #####################
      [  1.02 -   1.22]ms (75.0000%)    32849| ##############
      [  1.22 -   1.41]ms (77.5000%)    30597| #############
      [  1.41 -   1.47]ms (80.0000%)    23194| ##########
      [  1.47 -   1.53]ms (82.5000%)    20802| #########
      [  1.53 -   1.60]ms (85.0000%)    12376| #####
      [  1.60 -   1.79]ms (87.5000%)    17822| #######
      [  1.79 -   1.92]ms (88.7500%)    10819| ####
      [  1.92 -   2.05]ms (90.0000%)    12051| #####
      [  2.05 -   2.17]ms (91.2500%)    22366| #########
      [  2.17 -   2.17]ms (92.5000%)        0|
      [  2.17 -   2.30]ms (93.7500%)    19817| ########
      [  2.30 -   2.30]ms (94.3750%)        0|
      [  2.30 -   2.30]ms (95.0000%)        0|
      [  2.30 -   2.43]ms (95.6250%)     8795| ###
      [  2.43 -   2.43]ms (96.2500%)        0|
      [  2.43 -   2.56]ms (96.8750%)     5250| ##
      [  2.56 -   2.69]ms (97.1875%)     4574| #
      [  2.69 -   2.69]ms (97.5000%)        0|
      [  2.69 -   2.82]ms (97.8125%)     3063| #
      [  2.82 -   3.07]ms (98.1250%)     4474| #
      [  3.07 -   3.07]ms (98.4375%)        0|
      [  3.07 -   3.20]ms (98.5938%)     3055| #
      [  3.20 -   3.20]ms (98.7500%)        0|
      [  3.20 -   3.33]ms (98.9062%)     1973|
      [  3.33 -   3.45]ms (99.0625%)     1146|
      [  3.45 -   3.58]ms (99.2188%)      597|
      [  3.58 -   3.71]ms (99.2969%)      612|
      [  3.71 -   3.84]ms (99.3750%)      819|
      [  3.84 -   4.09]ms (99.4531%)      405|
      [  4.09 -   9.22]ms (99.5312%)      691|
      [  9.22 -  13.31]ms (99.6094%)      647|
      [ 13.31 - 102.40]ms (99.6484%)      316|
      [102.40 - 139.26]ms (99.6875%)      399|
      [139.26 - 163.84]ms (99.7266%)      359|
      [163.84 - 180.22]ms (99.7656%)      394|
      [180.22 - 188.41]ms (99.8047%)      228|
      [188.41 - 196.61]ms (99.8242%)      239|
      [196.61 - 204.80]ms (99.8438%)      224|
      [204.80 - 204.80]ms (99.8633%)        0|
      [204.80 - 212.99]ms (99.8828%)      146|
      [212.99 - 229.38]ms (99.9023%)      197|
      [229.38 - 237.57]ms (99.9121%)       71|
      [237.57 - 245.76]ms (99.9219%)       52|
      [245.76 - 253.95]ms (99.9316%)       81|
      [253.95 - 278.53]ms (99.9414%)      109|
      [278.53 - 589.82]ms (99.9512%)       42|
      [  0.59 -   5.77]s (99.9561%)        42|
      [  5.77 -   7.86]s (99.9609%)        43|
      [  7.86 -   9.44]s (99.9658%)        51|
      [  9.44 -   9.96]s (99.9707%)        46|
      [  9.96 -  10.49]s (99.9756%)        35|
      [ 10.49 -  11.01]s (99.9780%)        23|
      [ 11.01 -  11.53]s (99.9805%)        14|
      [ 11.53 -  12.06]s (99.9829%)        65|
      [ 12.06 -  12.06]s (99.9854%)         0|
      [ 12.06 -  12.06]s (99.9878%)         0|
      [ 12.06 -  12.58]s (99.9890%)        17|
      [ 12.58 -  12.58]s (99.9902%)         0|
      [ 12.58 -  13.11]s (99.9915%)         8|
      [ 13.11 -  20.97]s (99.9927%)         7|
      [ 20.97 -  31.46]s (99.9939%)        10|
      [ 31.46 -  39.85]s (99.9945%)        13|
      [ 39.85 -  39.85]s (99.9951%)         0|
      [ 39.85 -  41.94]s (99.9957%)         2|
      [ 41.94 -  52.43]s (99.9963%)         9|
      [ 52.43 -  56.62]s (99.9969%)        10|
      [ 56.62 -  56.62]s (99.9973%)         0|
      [ 56.62 -  56.62]s (99.9976%)         0|
      [ 56.62 -  56.62]s (99.9979%)         0|
      [ 56.62 -  58.72]s (99.9982%)         3|
      [ 58.72 -  60.82]s (99.9985%)         9|
      [ 60.82 -  60.82]s (99.9986%)         0|
      [ 60.82 -  60.82]s (99.9988%)         0|
      [ 60.82 -  60.82]s (99.9989%)         0|
      [ 60.82 -  60.82]s (99.9991%)         0|
      [ 60.82 -  60.82]s (99.9992%)         0|
      [ 60.82 -  60.82]s (99.9993%)         0|
      [ 60.82 -  60.82]s (99.9994%)         0|
      [ 60.82 -  62.91]s (99.9995%)         2|
      [ 62.91 -  62.91]s (99.9995%)         0|
      [ 62.91 -  62.91]s (99.9996%)         0|
      [ 62.91 -  65.01]s (99.9997%)         1|
      [ 65.01 -  65.01]s (99.9997%)         0|
      [ 65.01 -  65.01]s (99.9997%)         0|
      [ 65.01 -  67.11]s (99.9998%)         2|
      [ 67.11 -  67.11]s (100.0000%)        0|
      Total: 846169 operations
      

      While it is useful to know that some stats call took 67s, that still leaves a lot of uncertainty in what exactly took that long. This may be expected or far longer than acceptable, depending on which stat group it really represents.

      To improve debugging of slow stats calls, each stat group (e.g., vbucket-details, collections-details ) should be broken out into its own histogram. These should only be exposed for cbstats; they don't necessarily add enough value when recorded over time to warrant the extra disk space usage for Prometheus.

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          james.harrison James Harrison created issue -
          owend Daniel Owen made changes -
          Field Original Value New Value
          Rank Ranked higher
          james.harrison James Harrison made changes -
          Assignee Daniel Owen [ owend ] James Harrison [ james.harrison ]
          james.harrison James Harrison made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          james.harrison James Harrison made changes -
          Sprint KV 2021-Nov [ 1866 ]
          owend Daniel Owen made changes -
          Rank Ranked higher
          owend Daniel Owen made changes -
          Sprint KV 2021-Nov [ 1866 ] KV 2021-Nov, Magma 2021-Sep-21 [ 1866, 1906 ]
          owend Daniel Owen made changes -
          Affects Version/s Neo [ 17615 ]
          owend Daniel Owen made changes -
          Sprint KV 2021-Nov, KV 2021-Dec [ 1866, 1906 ] KV 2021-Nov, KV 2021-Dec, Magma 2021-Sep-21 [ 1866, 1906, 1957 ]
          james.harrison James Harrison made changes -
          Resolution Fixed [ 1 ]
          Status In Progress [ 3 ] Resolved [ 5 ]
          ritam.sharma Ritam Sharma made changes -
          Labels request-dev-verify
          james.harrison James Harrison made changes -
          VERIFICATION STEPS Create 1 node cluster, with a bucket.

          perform some stats calls, with and without arguments e.g.,:

          {{cbstats ... collections-details}}
          {{cbstats ... collections-details 0}}


          Now call:

          {{cbstats ... stat-timings}}

          Expect output to contain histograms for all called stats (and others called by ns_server)

          {noformat}
           stat:collections-details (11 total)
                 0us - 59ms : ( 18.1818%) 2 ###########################################################
                59ms - 59ms : ( 18.1818%) 0
                59ms - 69ms : ( 36.3636%) 2 ###########################################################
                69ms - 69ms : ( 36.3636%) 0
                69ms - 73ms : ( 45.4545%) 1 #############################
                73ms - 77ms : ( 81.8182%) 4 ######################################################################################################################
                77ms - 77ms : ( 81.8182%) 0
                77ms - 86ms : ( 90.9091%) 1 #############################
                86ms - 86ms : ( 90.9091%) 0
                86ms - 98ms : (100.0000%) 1 #############################
              Avg : ( 73ms)
           stat:collections-details.single (4 total)
                 0us - 151us : ( 25.0000%) 1 #################################################################################
               151us - 151us : ( 25.0000%) 0
               151us - 183us : ( 75.0000%) 2 ###################################################################################################################################################################
               183us - 183us : ( 75.0000%) 0
               183us - 191us : (100.0000%) 1 #################################################################################
              Avg
          {noformat}
          Status Resolved [ 5 ] Closed [ 6 ]

          People

            james.harrison James Harrison
            james.harrison James Harrison
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              PagerDuty