  1. Couchbase Server
  2. MB-46675

Round KV metric sample timestamps for Prometheus


Details

    Description

      Prometheus encodes sample timestamps as delta-of-deltas (DoD) in milliseconds, for example:

      TS      Delta     DoD
      1000
      2000    1000
      3000    1000      0
      4010    1010      10
      5010    1000      -10
      

      These delta-of-deltas are encoded in a variable number of bits in the chunk files.

      If a DoD is exactly 0, it is encoded in a single bit. If it is nonzero, the next bit width Prometheus will use is 14 bits, plus a prefix. That is, if the DoD is off by even a single millisecond, the size on disk increases from 1 bit to 14 bits + 2 prefix bits.
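As a rough illustration of the cost described above, the following sketch computes the DoDs for the timestamps in the table and estimates the bits used per timestamp. The function names are hypothetical, and the size model is an assumption that covers only the two cases discussed here (a zero DoD, and a nonzero DoD small enough for the 14-bit bucket).

```python
def delta_of_deltas(timestamps_ms):
    """Return the DoD for each sample from the third one onwards."""
    deltas = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    return [b - a for a, b in zip(deltas, deltas[1:])]


def approx_timestamp_bits(dod):
    """Approximate encoded size, modelling only the two cases described above.

    Assumption: every nonzero DoD is small enough for the 14-bit bucket;
    the larger buckets are not modelled here.
    """
    return 1 if dod == 0 else 14 + 2


timestamps_ms = [1000, 2000, 3000, 4010, 5010]
dods = delta_of_deltas(timestamps_ms)
bits = [approx_timestamp_bits(d) for d in dods]
print(dods)  # [0, 10, -10]
print(bits)  # [1, 16, 16]
```

Note that a single scrape arriving 10ms late produces two nonzero DoDs: one when the interval lengthens and one when it returns to normal.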

      Pulling data from the breakdown of a set of logs in MB-45843 (bearing in mind that this is only a single data point), it appears that over the time covered by the logs, roughly 12% of sample DoDs were exactly 0 and 86% were encoded as 14+2 bits. The remainder were large enough to require more than 14 bits to encode (i.e., a relatively small number of samples had a DoD >~16s).

      However, the vast majority (97%) of all DoDs could have been encoded in 5 bits, suggesting they were 31ms or lower. If a value could have been encoded in 5 bits, but needed to be padded out to the next predetermined bitwidth, 14 bits, 9 or more bits are essentially wasted.

      To a degree, DoDs represent "jitter" in the scrape interval. If the interval was consistent to the millisecond, all DoDs would be exactly 0. Even the small jitter seen in most cases (<=31ms) increases the disk usage of a given sample significantly.

      The Prometheus exposition format does have provision for an exporter to include a timestamp with each sample. This means KV can take control over the reported time for each sample.
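For reference, a sample line in the text exposition format may carry an optional trailing timestamp, expressed as integer milliseconds since the Unix epoch. The sketch below formats such a line; the helper name, the metric name `kv_ops` and the label are hypothetical, and label values are assumed not to need escaping.

```python
def format_sample(name, labels, value, timestamp_ms):
    """Format one exposition-format sample line with an explicit timestamp.

    Assumption: label values contain no characters that require escaping.
    """
    label_text = ",".join(f'{key}="{val}"' for key, val in labels.items())
    return f"{name}{{{label_text}}} {value} {timestamp_ms}"


line = format_sample("kv_ops", {"bucket": "default"}, 42, 1620000000100)
print(line)  # kv_ops{bucket="default"} 42 1620000000100
```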

      The simplest method to increase how many DoDs are exactly zero would be for KV to round the sample time to the nearest 100ms. This means as long as Prometheus scrapes are received with an interval consistent to 1/10th of a second (which appears to be the case, based on the <=31ms DoDs for most samples), the computed DoD will be 0.

      This means that the sample time stored in Prometheus could be up to 50ms away from the true sample time. Given that the scrape interval is typically 10 or more seconds, the error this deviation may introduce in, for example, rate calculations is likely an acceptable tradeoff.
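A minimal sketch of the rounding idea, using hypothetical scrape times with a 10s interval and a few tens of milliseconds of jitter. The function names are illustrative only and are not taken from the KV code.

```python
def round_to_nearest_ms(timestamp_ms, granularity_ms=100):
    """Round a non-negative millisecond timestamp to the nearest multiple
    of granularity_ms (ties round up)."""
    return (timestamp_ms + granularity_ms // 2) // granularity_ms * granularity_ms


def delta_of_deltas(timestamps_ms):
    """Return the DoD for each sample from the third one onwards."""
    deltas = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    return [b - a for a, b in zip(deltas, deltas[1:])]


# Hypothetical raw scrape times (ms), nominally 10s apart.
raw = [10_004, 20_012, 30_007, 40_031, 50_025]
rounded = [round_to_nearest_ms(t) for t in raw]

print(delta_of_deltas(raw))      # [-13, 29, -30]
print(delta_of_deltas(rounded))  # [0, 0, 0]

# Largest shift introduced by rounding; never more than 50ms.
print(max(abs(r - t) for r, t in zip(rounded, raw)))  # 31
```

With these inputs every DoD collapses to 0 after rounding, at the cost of moving each stored timestamp by at most half the rounding granularity.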

      This does come with the risk of scrapes arriving at times very near a rounding boundary, so that the reported interval flip-flops up and down depending on whether the time was rounded up or down. However, the reported interval then varies by at most +/- 100ms, so the DoD is at most +/- 200ms. That is representable in about 8 bits, which will be expanded out to 14+2 bits, no worse than is seen without rounding.
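The boundary case can be sketched with hypothetical scrape times that straddle a 50ms rounding boundary. In this example the rounded interval alternates between 10100ms and 9900ms, so the DoD reaches 200ms in magnitude, which is still small enough for the 14-bit bucket. The function names are illustrative assumptions.

```python
def round_to_nearest_ms(timestamp_ms, granularity_ms=100):
    """Round a non-negative millisecond timestamp to the nearest multiple
    of granularity_ms (ties round up)."""
    return (timestamp_ms + granularity_ms // 2) // granularity_ms * granularity_ms


def delta_of_deltas(timestamps_ms):
    """Return the DoD for each sample from the third one onwards."""
    deltas = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    return [b - a for a, b in zip(deltas, deltas[1:])]


# Hypothetical scrape times (ms) landing just either side of x.x50.
raw = [10_048, 20_052, 30_049, 40_051]
rounded = [round_to_nearest_ms(t) for t in raw]

print(rounded)                   # [10000, 20100, 30000, 40100]
print(delta_of_deltas(raw))      # [-7, 5]
print(delta_of_deltas(rounded))  # [-200, 200]
```

Both the raw and the rounded DoDs are nonzero here, so each would be stored as 14+2 bits either way; rounding does not make this case more expensive.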

            People

              james.harrison James Harrison (Inactive)