  1. Couchbase Monitoring and Observability Stack
  2. CMOS-71

Output current checker state in Prometheus metrics

Details

    Description

      Grafana 8 has "State timeline" and "Status History" panels that would be perfect to plug our checker results into (see the attached screenshots).


      Unfortunately, the way we currently output checker results (a gauge that increments every time the state changes) makes these panels difficult to use. Grafana wants a single value to be the state to display. You can change what that value represents using value mappings (for example "0 = good, 1 = bad"), but it still expects a fixed set of values, and getting the current value out of a label using PromQL is an exercise in frustration.

      I suggest that, instead of the current approach (an ever-incrementing gauge where the current state is a label), we have the current state be the value of the gauge, with the usual set of cluster/node/checker-name labels - for example, output 0 for good, 1 for warn, 2 for alert, etc.
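To make the proposal concrete, here is a minimal sketch of what the suggested output could look like in the Prometheus text exposition format. It is illustrative only: the metric name `checker_state`, the label names `cluster`, `node` and `checker`, and the helper names are assumptions made for this example, not the project's actual identifiers. Only the 0/1/2 mapping comes from the suggestion above.

```python
from enum import IntEnum


class CheckerState(IntEnum):
    """Numeric states as suggested above: 0 for good, 1 for warn, 2 for alert."""
    GOOD = 0
    WARN = 1
    ALERT = 2


def escape_label_value(value: str) -> str:
    # The text exposition format requires backslash, double quote and
    # newline to be escaped inside label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_checker_metric(cluster: str, node: str, checker: str,
                          state: CheckerState,
                          metric_name: str = "checker_state") -> str:
    """Render one gauge sample whose value is the current checker state.

    The metric name and label names are hypothetical placeholders.
    """
    labels = {"cluster": cluster, "node": node, "checker": checker}
    label_str = ",".join(
        f'{name}="{escape_label_value(value)}"' for name, value in labels.items()
    )
    return f"{metric_name}{{{label_str}}} {int(state)}"


if __name__ == "__main__":
    print("# TYPE checker_state gauge")
    print(render_checker_metric("cluster-a", "node-1", "example_checker",
                                CheckerState.WARN))
```

With the state carried as the sample value, a Grafana value mapping such as "0 = good, 1 = warn, 2 = alert" can be applied to the series directly, without extracting anything from a label.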

      /cc Patrick Stephens

      Attachments

        1. image-2021-09-30-12-41-29-671.png
          29 kB
          Marks Polakovs
        2. image-2021-09-30-12-41-24-156.png
          32 kB
          Marks Polakovs

            People

              marks.polakovs Marks Polakovs (Inactive)
