  1. Couchbase Monitoring and Observability Stack
  2. CMOS-328

Alerting Rules do not need to specify all labels


Details

    Description

      Most alerts are defined with labels that already exist on the series returned by the rule expression. Any labels returned by the expression (PromQL) are automatically added to the alert and sent to Alertmanager, so there is no reason to specify them again. For example, the following labels can be removed:

      • job
      • cluster
      • bucket
      • node

      These will automatically be available in the triggered alert, along with the "instance" label and whatever else is returned from the query. The rule below is an example that currently sets "job", "cluster", and "bucket" explicitly:

       

      - alert: CB90061-diskWriteQueueLength-Warn
        expr: |
          kv_ep_storage_age_seconds > 50 < 100
        for: 1m
        labels:
          job: couchbase_prometheus
          kind: bucket
          health_check_id: CB90061
          health_check_name: diskWriteQueueLength
          cluster: '{{ $labels.cluster }}'
          bucket: '{{ $labels.bucket }}'
          severity: warning
        annotations:
          title: "Long disk write queue on Bucket: {{ $labels.bucket }}"
          description: The age of items in the disk write queue is over 50 seconds.
          remediation: Review your hardware for malfunctions or sizing issues.
            If the problem persists, then please contact Couchbase Technical Support.
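
      A minimal sketch of how the same rule might look with the redundant labels removed. It assumes that the series returned by kv_ep_storage_age_seconds already carry the "job", "cluster", and "bucket" labels, as described above. Note that the original rule hard-codes "job: couchbase_prometheus"; once that line is dropped, the alert carries whatever "job" value the scraped series has, which may or may not be the same string.

      ```yaml
      - alert: CB90061-diskWriteQueueLength-Warn
        expr: |
          kv_ep_storage_age_seconds > 50 < 100
        for: 1m
        labels:
          # Only labels that the expression does not return are set here.
          # job, cluster, and bucket are assumed to come from the series itself.
          kind: bucket
          health_check_id: CB90061
          health_check_name: diskWriteQueueLength
          severity: warning
        annotations:
          # $labels.bucket still resolves, because it is read from the series labels.
          title: "Long disk write queue on Bucket: {{ $labels.bucket }}"
          description: The age of items in the disk write queue is over 50 seconds.
          remediation: Review your hardware for malfunctions or sizing issues.
            If the problem persists, then please contact Couchbase Technical Support.
      ```

      The annotations are unchanged: templates such as {{ $labels.bucket }} read from the labels of the series that triggered the alert, not from the rule's "labels" block.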
       

       

       


          People

            Unassigned Unassigned
            aaron.benton Aaron Benton (Inactive)