  1. Couchbase Server
  2. MB-61031

Prometheus chunks_head files can become corrupted (files contain all zeroes)

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: 7.6.0
    • Affects Version/s: 7.2.2
    • None
    • Untriaged
    • 0
    • No

    Description

      Creating this to track a known issue in Couchbase Server's Prometheus dependency (see https://github.com/prometheus/prometheus/pull/11338)

      The issue manifests as errors in the Prometheus log such as the following:

      caller=main.go:1063 level=error err="opening storage failed: /opt/couchbase/var/lib/couchbase/stats_data/chunks_head/<filename>: invalid magic number 0" 
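
      As a rough aid for triage, the affected file path can be pulled out of a log line of this shape. This is a minimal sketch that assumes the message wording matches the line quoted above; the function name corrupted_chunk_path is hypothetical, and other Prometheus versions may phrase the error differently.

```python
import re
from typing import Optional

# Pattern derived from the error line quoted in this ticket (assumption:
# the wording is stable for the Prometheus version in use).
_MAGIC_ERR = re.compile(
    r"opening storage failed: (?P<path>\S+): invalid magic number 0\b"
)


def corrupted_chunk_path(log_line: str) -> Optional[str]:
    """Return the file path named in an 'invalid magic number 0' error, else None."""
    match = _MAGIC_ERR.search(log_line)
    return match.group("path") if match else None
```

      Lines that do not contain the error return None, so the function can be applied to every line of a Prometheus log to collect candidate files.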

      On inspection, the file consists entirely of zero bytes. These files are collected as part of a cbcollect, so Support can verify this from a customer's uploaded logs.

      The error causes Prometheus to repeatedly restart.

      The end user has to fix this manually by deleting the corrupted file.
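
      The check and the manual cleanup described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names (is_all_zero, find_zeroed_chunks, remove_zeroed_chunks) are hypothetical, the directory to scan is supplied by the caller, and the default is a dry run that deletes nothing.

```python
from pathlib import Path
from typing import List


def is_all_zero(path: Path, block_size: int = 1 << 20) -> bool:
    """Return True if the file is non-empty and every byte is 0x00."""
    if path.stat().st_size == 0:
        return False  # an empty file is not the corruption described here
    with path.open("rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                return True  # reached EOF without seeing a non-zero byte
            if block.count(0) != len(block):
                return False


def find_zeroed_chunks(chunks_dir: Path) -> List[Path]:
    """List regular files directly under chunks_dir that are entirely zero bytes."""
    return sorted(p for p in chunks_dir.iterdir() if p.is_file() and is_all_zero(p))


def remove_zeroed_chunks(chunks_dir: str, dry_run: bool = True) -> List[Path]:
    """Report all-zero files; delete them only when dry_run is False."""
    zeroed = find_zeroed_chunks(Path(chunks_dir))
    if not dry_run:
        for p in zeroed:
            p.unlink()
    return zeroed
```

      Running remove_zeroed_chunks on the chunks_head directory with the default dry_run=True only lists the candidates, which leaves room to review them before calling it again with dry_run=False.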

      The 7.2.x releases, which use Prometheus v2.33.4, are vulnerable; earlier versions may be affected too.

      Trinity (7.6) uses Prometheus v2.45.0, which includes the fix from the PR linked above, so it should be able to recover from this state.

            People

              Assignee: Jack Bakes (jack.bakes)
              Reporter: Jack Bakes (jack.bakes)