MB-57615 (Couchbase Server)

Error after header write of chunked response on prometheus fetch results > 4K

Details

    • Untriaged
    • 0
    • Unknown
    • Analytics Sprint 21

    Description

      Observed in MB-57601 logs.

      In a recent system test run, CBAS logs a WARN on every Prometheus fetch whose result is larger than 4K. There are over 9900 instances of this warning in the logs.

      2023-06-23T03:50:37.288-07:00 WARN CBAS.server.ChunkedResponse [HttpExecutor(port:8095)-0] Error after header write of chunked response
      2023-06-23T03:50:47.288-07:00 WARN CBAS.server.ChunkedResponse [HttpExecutor(port:8095)-3] Error after header write of chunked response
      2023-06-23T03:50:57.288-07:00 WARN CBAS.server.ChunkedResponse [HttpExecutor(port:8095)-6] Error after header write of chunked response
      2023-06-23T03:51:07.288-07:00 WARN CBAS.server.ChunkedResponse [HttpExecutor(port:8095)-5] Error after header write of chunked response
      2023-06-23T03:51:17.288-07:00 WARN CBAS.server.ChunkedResponse [HttpExecutor(port:8095)-8] Error after header write of chunked response
      2023-06-23T03:51:27.288-07:00 WARN CBAS.server.ChunkedResponse [HttpExecutor(port:8095)-9] Error after header write of chunked response
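
      To gauge how widespread the warning is in a log bundle, the lines can be counted and the spacing between them checked. The following is a minimal sketch, not a Couchbase tool: the function name `summarize_warnings` is hypothetical, and the regular expression assumes the exact line layout shown in the excerpt above.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern derived from the log excerpt in this ticket; the layout of other
# log files may differ (assumption).
WARNING_RE = re.compile(
    r"^(?P<ts>\S+) WARN CBAS\.server\.ChunkedResponse "
    r"\[HttpExecutor\(port:\d+\)-\d+\] "
    r"Error after header write of chunked response"
)


def summarize_warnings(lines):
    """Return (count, gaps) for the chunked-response warning.

    gaps maps the interval between consecutive warnings, rounded to whole
    seconds, to the number of times that interval occurred.
    """
    timestamps = []
    for line in lines:
        match = WARNING_RE.match(line.strip())
        if match:
            timestamps.append(datetime.fromisoformat(match.group("ts")))
    gaps = Counter(
        round((later - earlier).total_seconds())
        for earlier, later in zip(timestamps, timestamps[1:])
    )
    return len(timestamps), gaps
```

      Run against the six lines above, this reports six warnings, each 10 seconds after the previous one, which is consistent with one warning per periodic fetch.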
      

      Also, the request returns a 500 to ns_server. I do not know whether ns_server still scrapes otherwise well-formed results that arrive with a 500 status; if it does not, the issue is more severe than the WARN spam in the log.

      EDIT: I can confirm from promtimer that all Analytics stats are missing when the 500 is returned to ns_server, so this is a pretty severe issue.
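
      The missing stats are what one would expect from a scraper that treats any non-200 response as a failed scrape and drops the body, however well-formed it is. The sketch below illustrates that behavior under that assumption; it is not ns_server or Prometheus code, the names `parse_metrics` and `scrape` are hypothetical, and the parser handles only simple `name value` lines of the Prometheus text format.

```python
def parse_metrics(body):
    """Parse simple 'name value' lines, skipping blanks and '#' comments.

    Simplification (assumption): optional trailing timestamps are not handled.
    """
    metrics = {}
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        metrics[name] = float(value)
    return metrics


def scrape(status, body):
    """Return parsed metrics only when the HTTP status is 200.

    Any other status is treated as a failed scrape and the body is discarded,
    even if it would have parsed cleanly.
    """
    if status != 200:
        return {}
    return parse_metrics(body)
```

      With the same body, `scrape(200, body)` yields the metrics while `scrape(500, body)` yields nothing, which matches the symptom seen in promtimer.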

      Issue Resolution
      When the Prometheus stats returned from Analytics exceeded 4 KiB, the status code was inadvertently set to 500 (Internal Server Error), which produced a large number of warnings in the Analytics warning log, and Couchbase Server discarded these statistics. The fix returns a 200 (OK) status code when the Prometheus stats exceed 4 KiB, so the stats are recorded properly and the warning is no longer logged.

            People

              Balakumaran.Gopal Balakumaran Gopal
              michael.blow Michael Blow