  1. Couchbase Monitoring and Observability Stack
  2. CMOS-130

"Healthy Nodes" and "All Nodes Healthy" stats misleading

Details

    • Bug
    • Resolution: Done
    • Major
    • 0.1
    • None
    • cmos
    • None

    Description

      In the single-cluster overview dashboard, we have two stats: one shows the number of healthy nodes, and the other shows whether that number covers every node in the cluster. Both are calculated by combining the cbnode_healthy and cbnode_cluster_membership metrics from the Exporter.

      However, when I made the dashboard I misinterpreted the latter metric: it treats a failed-over node as no longer a member of the cluster. As a result, in a three-node cluster with one node failed over, the dashboard shows 2/2 nodes healthy and reports that everything is fine.

      Look into alternate ways of implementing this, or if there are none, remove the "All Nodes Healthy" metric.
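
The failure mode, and one possible alternative, can be sketched in a few lines of Python. This is only a model of the arithmetic, not the dashboard's actual query: the `NodeSample` type, both function names, and the 0/1 encoding of cbnode_healthy and cbnode_cluster_membership are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class NodeSample:
    name: str
    healthy: int  # assumed encoding: 1 if cbnode_healthy reports healthy, else 0
    member: int   # assumed encoding: 1 if cbnode_cluster_membership reports an active member, else 0


def membership_based(samples: List[NodeSample]) -> Tuple[int, int, bool]:
    """Count only nodes reported as members, so failed-over nodes drop out."""
    members = [s for s in samples if s.member == 1]
    healthy = sum(1 for s in members if s.healthy == 1)
    return healthy, len(members), healthy == len(members)


def all_known_nodes(samples: List[NodeSample]) -> Tuple[int, int, bool]:
    """Hypothetical alternative: every node that reports a sample is in the denominator."""
    total = len(samples)
    healthy = sum(1 for s in samples if s.member == 1 and s.healthy == 1)
    return healthy, total, total > 0 and healthy == total


cluster = [
    NodeSample("node1", healthy=1, member=1),
    NodeSample("node2", healthy=1, member=1),
    NodeSample("node3", healthy=0, member=0),  # failed over
]

print(membership_based(cluster))  # (2, 2, True): the misleading "all healthy" result
print(all_known_nodes(cluster))   # (2, 3, False)
```

Whether the second approach is usable depends on the Exporter still reporting a series for the failed-over node, which would need to be checked.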

      On a similar, though less damaging, note: "Unhealthy Nodes" can be misleading when there is no data (for example, with a misconfigured exporter), because it will show 0.
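
One way to think about the no-data case is to keep "no samples" distinct from "zero unhealthy nodes". The sketch below is a hypothetical illustration in Python, not the panel's implementation; the function names and the assumption that a cbnode_healthy value of 0 means unhealthy are mine.

```python
from typing import Optional, Sequence


def unhealthy_nodes(healthy_samples: Sequence[int]) -> Optional[int]:
    """Return the number of unhealthy nodes, or None when no samples arrived."""
    if not healthy_samples:
        return None  # no data is not the same as zero unhealthy nodes
    return sum(1 for value in healthy_samples if value == 0)


def render(value: Optional[int]) -> str:
    """Show an explicit placeholder instead of a reassuring 0."""
    return "No data" if value is None else str(value)


print(render(unhealthy_nodes([1, 1, 0])))  # 1
print(render(unhealthy_nodes([])))         # No data
```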

            People

              marks.polakovs Marks Polakovs (Inactive)