  1. Couchbase Server
  2. MB-44646

High Memory usage even after increasing the scrape interval to 10s




      This may or may not be a bug, but it would be good to understand why RAM usage on the .74 node remained high (even after waiting a sufficiently long time) once scrape_interval and scrape_timeout were increased to 10s (from the 1s set earlier).

      The cluster contains 30 buckets, 500 indexes x 2 replicas, close to 1000 collections, approx. 1000 scopes and 15 XDCR replications. The scrape interval was set to 1s at some point during the volume test, and when we set it back to 10s, RAM usage remained at 95%.

      Some consequences (probably a result of the high RAM usage on the .74 node):
      1. XDCR on the .74 node shows up as "No XDCR setup".
      2. The _prometheusMetrics endpoint on the .74 node does not return any metrics and hangs when called. This can be seen on the targets page of Prometheus. (On other nodes the XDCR metrics endpoint returns metrics as expected.)
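      To tell a hanging endpoint apart from one that is merely slow or refusing connections, a single request with an explicit timeout can be issued against each node. The following is a minimal sketch, not a tool used in this test: the function name probe_metrics is hypothetical, the URL in the usage comment is a placeholder, and the port shown there is an assumption. Authentication is not handled, so a 401 response would be reported as "error".

```python
import socket
import urllib.error
import urllib.request


def probe_metrics(url, timeout_s=10.0):
    """Issue one GET and classify the result as 'ok', 'timeout' or 'error'.

    'timeout' means no complete response arrived within timeout_s, which is
    the behaviour described for the stuck endpoint above.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            body = resp.read()
            # A healthy metrics endpoint returns HTTP 200 with a non-empty body.
            return "ok" if resp.status == 200 and body else "error"
    except (socket.timeout, TimeoutError):
        # Raised directly when reading the response times out.
        return "timeout"
    except urllib.error.URLError as exc:
        # Connect-phase timeouts arrive wrapped in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "error"
    except OSError:
        return "error"


# Usage (placeholder host; the port is an assumption, adjust for the service):
#   probe_metrics("http://<node>:9998/_prometheusMetrics", timeout_s=10.0)
```

      Using the same timeout as the configured scrape_timeout (10s here) would make the result comparable to what Prometheus sees on its targets page.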

      See and for RAM usage before resetting the prometheus settings.

      Also noticed that a lot of REST calls were timing out on the UI. See because of which XDCR replications were not visible on the UI.

      Also, here are the logs from when the scrape_interval was at 1s (i.e., before it was increased back to 10s)


        1. prometheus_targets.png
          451 kB
          Sumedh Basarkod
        2. Screen Shot 2021-03-01 at 1.05.32 PM.png
          451 kB
          Balakumaran Gopal
        3. Screen Shot 2021-03-01 at 11.12.28 AM.png
          2.36 MB
          Balakumaran Gopal
        4. Screen Shot 2021-03-01 at 11.12.43 AM.png
          2.69 MB
          Balakumaran Gopal



            dfinlay Dave Finlay
            sumedh.basarkod Sumedh Basarkod (Inactive)



