Couchbase Server / MB-47502

Disable stats decimation as a 7.0.1 workaround for a memory leak in Prometheus


Details

    • Untriaged
    • 1
    • Unknown

    Description

      On a test installation in AWS, Prometheus consumed ~25G of memory and was OOM-killed:

      level=info ts=2021-07-15T10:18:46.826Z caller=compact.go:494 component=tsdb msg="write block" mint=1626307200000 maxt=1626328800000 ulid=01FAMTT0VNVMV2149RSNW0M4GW duration=821.47751ms
      level=info ts=2021-07-15T10:18:47.407Z caller=compact.go:494 component=tsdb msg="write block" mint=1626328800000 maxt=1626336000000 ulid=01FAMTT1NAT3G7FPGE729R1PBR duration=580.965225ms
      level=info ts=2021-07-15T10:18:47.416Z caller=db.go:1152 component=tsdb msg="Deleting obsolete block" block=01FAMTR51NKY7WPP56W2201MMV
      level=info ts=2021-07-15T10:18:47.420Z caller=db.go:1152 component=tsdb msg="Deleting obsolete block" block=01FAMTR5VBSYVN4G5XEWFRXZPZ
      level=info ts=2021-07-15T10:18:47.436Z caller=db.go:1152 component=tsdb msg="Deleting obsolete block" block=01FAMTR3GW7WWJGKNF37FDXK60
      fatal error: runtime: out of memory
      runtime stack:
      runtime.throw(0x28d6767, 0x16)
          /home/couchbase/jenkins/workspace/cbdeps-platform-build/deps/go1.14.2/src/runtime/panic.go:1116 +0x72
      runtime.sysMap(0xc6cc000000, 0x4000000, 0x45aa2d8)
          /home/couchbase/jenkins/workspace/cbdeps-platform-build/deps/go1.14.2/src/runtime/mem_linux.go:169 +0xc5
      runtime.(*mheap).sysAlloc(0x45954a0, 0x400000, 0x45954a8, 0xb9)
          /home/couchbase/jenkins/workspace/cbdeps-platform-build/deps/go1.14.2/src/runtime/malloc.go:715 +0x1cd
      runtime.(*mheap).grow(0x45954a0, 0xb9, 0x0)
          /home/couchbase/jenkins/workspace/cbdeps-platform-build/deps/go1.14.2/src/runtime/mheap.go:1286 +0x11c
      runtime.(*mheap).allocSpan(0x45954a0, 0xb9, 0xfc10100, 0x45aa2e8, 0xc004f6bf28)
      

      Logs: https://s3.amazonaws.com/cb-engineering/stevewatanabe-19JUL21-AWS/collectinfo-2021-07-19T165101-ns_1%40127.0.0.1.zip
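
      To read the "write block" lines in the log excerpt above, it can help to convert mint and maxt into wall-clock times. Prometheus TSDB records both as milliseconds since the Unix epoch. The following is a minimal sketch, not part of the original report; the helper name block_range and the regular expression are illustrative assumptions.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical helper: matches the mint/maxt fields of a TSDB "write block" log line.
BLOCK_RE = re.compile(r"mint=(\d+) maxt=(\d+)")


def block_range(line):
    """Return (start, end, span) for a 'write block' line, or None if the fields are absent."""
    m = BLOCK_RE.search(line)
    if m is None:
        return None
    mint_ms, maxt_ms = (int(g) for g in m.groups())
    # mint/maxt are milliseconds since the Unix epoch, so divide by 1000 for seconds.
    start = datetime.fromtimestamp(mint_ms / 1000, tz=timezone.utc)
    end = datetime.fromtimestamp(maxt_ms / 1000, tz=timezone.utc)
    return start, end, end - start


# First "write block" line from the report, copied verbatim.
line = (
    'level=info ts=2021-07-15T10:18:46.826Z caller=compact.go:494 component=tsdb '
    'msg="write block" mint=1626307200000 maxt=1626328800000 '
    'ulid=01FAMTT0VNVMV2149RSNW0M4GW duration=821.47751ms'
)
start, end, span = block_range(line)
print(start.isoformat(), end.isoformat(), span)
# prints: 2021-07-15T00:00:00+00:00 2021-07-15T06:00:00+00:00 6:00:00
```

      Applied to the second "write block" line, the same conversion gives a two-hour block from 06:00 to 08:00 UTC on the same day.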

          Activity

            timofey.barmin Timofey Barmin created issue -
            timofey.barmin Timofey Barmin made changes -
            Field Original Value New Value
            Link This issue relates to MB-44878 [ MB-44878 ]
            meni.hillel Meni Hillel (Inactive) made changes -
            Priority: Major [ 3 ] → Critical [ 2 ]
            meni.hillel Meni Hillel (Inactive) made changes -
            Fix Version/s 7.0.1 [ 17104 ]
            meni.hillel Meni Hillel (Inactive) made changes -
            Assignee: Meni Hillel [ JIRAUSER25407 ] → Steve Watanabe [ steve.watanabe ]
            steve.watanabe Steve Watanabe made changes -
            Assignee: Steve Watanabe [ steve.watanabe ] → Timofey Barmin [ timofey.barmin ]
            meni.hillel Meni Hillel (Inactive) made changes -
            Link This issue blocks MB-47469 [ MB-47469 ]
            timofey.barmin Timofey Barmin made changes -
            Link This issue is cloned by MB-47816 [ MB-47816 ]
            timofey.barmin Timofey Barmin made changes -
            Summary: "Prometheus consumes inappropriate amount of memory and gets OOM killed" → "Disable stats decimation as a workaround for 7.0.1 because of memory leak in Prometheus"
            meni.hillel Meni Hillel (Inactive) made changes -
            Fix Version/s 7.0.2 [ 18012 ]
            Fix Version/s 7.0.1 [ 17104 ]
            meni.hillel Meni Hillel (Inactive) made changes -
            Fix Version/s 7.0.1 [ 17104 ]
            Fix Version/s 7.0.2 [ 18012 ]
            lynn.straus Lynn Straus made changes -
            Fix Version/s 7.0.2 [ 18012 ]
            lynn.straus Lynn Straus made changes -
            Fix Version/s 7.0.1 [ 17104 ]
            lynn.straus Lynn Straus made changes -
            Fix Version/s 7.0.1 [ 17104 ]
            timofey.barmin Timofey Barmin made changes -
            Summary: "Disable stats decimation as a workaround for 7.0.1 because of memory leak in Prometheus" → "Disable stats decimation as a 7.0.1 workaround for a memory leak in Prometheus"
            meni.hillel Meni Hillel (Inactive) made changes -
            Fix Version/s 7.0.1 [ 17104 ]
            meni.hillel Meni Hillel (Inactive) made changes -
            Fix Version/s 7.0.1 [ 17104 ]
            Fix Version/s 7.0.2 [ 18012 ]
            timofey.barmin Timofey Barmin made changes -
            Resolution Fixed [ 1 ]
            Status: Open [ 1 ] → Resolved [ 5 ]
            ritam.sharma Ritam Sharma made changes -
            Remote Link This issue links to "Page (Couchbase, Inc. Wiki)" [ 23033 ]
            Balakumaran.Gopal Balakumaran Gopal made changes -
            wayne Wayne Siu made changes -
            Labels approved-for-7.0.1
            wayne Wayne Siu made changes -
            Assignee: Timofey Barmin [ timofey.barmin ] → Ritam Sharma [ ritam.sharma ]
            Balakumaran.Gopal Balakumaran Gopal made changes -
            Status: Resolved [ 5 ] → Closed [ 6 ]
            malarky Chris Malarky made changes -
            Link This issue relates to CBSP-3994 [ CBSP-3994 ]
            drigby Dave Rigby made changes -
            Link This issue causes CBSE-10772 [ CBSE-10772 ]

            People

              ritam.sharma Ritam Sharma
              timofey.barmin Timofey Barmin