Couchbase Server - MB-51950

[Magma] Magma bucket not honouring RAM quota allocated when analytics is ingesting data from it.


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: 7.1.1
    • Fix Version: 7.1.2
    • Component: couchbase-bucket
    • Environment: Enterprise Edition 7.1.1 build 3025
    • Triage: Untriaged
    • Operating System: Centos 64-bit
    • 1
    • No

    Description

      Observation -
      Although only 1 GB of RAM is allocated to the magma bucket, its RAM usage reaches around 3.9 GB. This is observed while analytics is ingesting data from the bucket into its datasets.

      Cluster Info -

      Node           | Services | CPU utilization (%) | Mem total | Mem free | Swap used / total    | Active / Replica | Version
      172.23.108.0   | cbas     | 2.46                | 3.67 GiB  | 2.50 GiB | 0.0 Byte / 3.50 GiB  | 0 / 0            | 7.1.1-3025-enterprise
      172.23.108.1   | cbas     | 2.19                | 3.67 GiB  | 2.67 GiB | 0.0 Byte / 3.50 GiB  | 0 / 0            | 7.1.1-3025-enterprise
      172.23.108.102 | kv       | 1.91                | 3.67 GiB  | 3.05 GiB | 72.00 MiB / 3.50 GiB | 0 / 0            | 7.1.1-3025-enterprise
      172.23.108.100 | kv, n1ql | 2.16                | 3.67 GiB  | 2.94 GiB | 3.75 MiB / 3.50 GiB  | 0 / 0            | 7.1.1-3025-enterprise

       Steps to reproduce -
      1. Create a cluster as described above.
      2. Create a magma bucket with 512 MB of RAM per node allocated to it (with 2 KV nodes, the total RAM for the bucket becomes 1 GB) and replicas set to 1.
      3. Create 10 scopes (including the default scope), each with 25 collections.
      4. Load 5300 docs into each collection; each doc is 1 KB.
      5. Create a dataset on each collection of the KV bucket created above (a reproduction sketch in Python follows these steps).
      6. Observe that the RAM usage of the bucket exceeds its RAM allocation.
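
      A minimal reproduction sketch of steps 3-5, using the Couchbase Python SDK (4.x), is shown below. It is illustrative only: the node address, credentials, bucket name and the scope/collection/dataset naming are assumptions, not taken from the actual test framework.

# Reproduction sketch (assumed names and credentials) for steps 3-5 above.
import time

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.management.collections import CollectionSpec
from couchbase.options import ClusterOptions

cluster = Cluster("couchbase://172.23.108.100",
                  ClusterOptions(PasswordAuthenticator("Administrator", "password")))
bucket = cluster.bucket("default")        # the 512 MB-per-node magma bucket
mgr = bucket.collections()

doc = {"padding": "x" * 1024}             # ~1 KB document body

for s in range(10):                       # 10 scopes, including the default scope
    scope = "_default" if s == 0 else f"scope_{s}"
    if scope != "_default":
        mgr.create_scope(scope)
    for c in range(25):                   # 25 collections per scope
        coll_name = f"coll_{c}"
        mgr.create_collection(CollectionSpec(coll_name, scope_name=scope))
        time.sleep(1)                     # crude wait for the new collection to become available
        coll = bucket.scope(scope).collection(coll_name)
        for i in range(5300):             # 5300 x 1 KB docs per collection
            coll.upsert(f"doc_{i}", doc)
        # One analytics dataset per KV collection.
        stmt = (f"CREATE DATASET `ds_{scope}_{coll_name}` "
                f"ON `default`.`{scope}`.`{coll_name}`")
        list(cluster.analytics_query(stmt).rows())

# Start ingestion for all of the datasets on the Local link.
list(cluster.analytics_query("CONNECT LINK Local").rows())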

      RAM Quota: (screenshot attachment)

      Attachments

        Issue Links


          Activity

            sarath Sarath Lakshman added a comment -

            Umang Can you please upload the cbcollect logs? Is it full ejection mode for the bucket?

            umang.agrawal Umang added a comment - - edited

            Sarath Lakshman Please find the logs attached:
            https://cb-jira.s3.us-east-2.amazonaws.com/logs/MB-51950/collectinfo-2022-04-28T065055-ns_1%40172.23.105.244.zip
            https://cb-jira.s3.us-east-2.amazonaws.com/logs/MB-51950/collectinfo-2022-04-28T065055-ns_1%40172.23.105.245.zip
            https://cb-jira.s3.us-east-2.amazonaws.com/logs/MB-51950/collectinfo-2022-04-28T065055-ns_1%40172.23.105.31.zip
            https://cb-jira.s3.us-east-2.amazonaws.com/logs/MB-51950/collectinfo-2022-04-28T065055-ns_1%40172.23.121.243.zip

            The bucket is in full eviction mode.

            rohan.suri Rohan Suri added a comment -

            Nodes 31, 244 are the kv nodes. Node 244 looks fine and is within the quota. Node 31 has the problem. Looking at the memory domain stats:

            primary (kv memory) varies between 270MB and 2G.

            secondary (magma memory) is constant at 104MB.

            Looks like a kv issue. Interestingly there are no operations happening during this period. 


            sarath Sarath Lakshman added a comment -

            Thanks Rohan Suri.

            Daniel Owen Can somebody from kv-engine team take a look?

            umang.agrawal Umang added a comment -

            Rohan Suri Sarath Lakshman Cluster is still available, so if you need to check it out you can:
            http://172.23.105.31:8091/ui/index.html#/overview/stats?commonBucket=default&scenarioZoom=minute&scenario=z8l0fyz0p&statsHostname=all

            ben.huddleston Ben Huddleston added a comment -

            I took a look at node 31 and observed that the live cluster is still seeing this spiky memory usage in the primary domain. Initially I did a perf profile and found that we were spending 40% of the machine's CPU on the ItemPager resetting histograms. That turned out to be a bit of a red herring though, because residency is already 0% and the ItemPager is running because the baseline memory usage is above the high watermark. The questions now are why we have such massive memory usage spikes and why the bucket remains above the high watermark.

            I looked first at the spikes to see if I could work out what was allocating so much memory. I took a bunch of jemalloc profiles and got lucky capturing one of the allocations. The massive allocation was in the StatSnap task primarily due to DCP stats. The stats.json file was ~380MB! The majority of entries ended up being DCP stream stats. We have a LOT of DCP streams.

            [root@sa1502 ~]# cat /opt/couchbase/var/lib/couchbase/data/default/stats.json | tr , '\n' | grep num_streams
            "eq_dcpq:cbas:Local:default:3b72b91f41af06b0849f5eb9d17df449:0:num_streams":"42579"
            "eq_dcpq:cbas:Local:default:3b72b91f41af06b0849f5eb9d17df449:1:num_streams":"0"
            "eq_dcpq:cbas:Local:default:3b72b91f41af06b0849f5eb9d17df449:2:num_streams":"0"
            "eq_dcpq:cbas:Local:default:6038f10c08b51f865d5625a7ebab6c28:3:num_streams":"42579"
            "eq_dcpq:cbas:Local:default:6038f10c08b51f865d5625a7ebab6c28:4:num_streams":"21165"
            "eq_dcpq:cbas:Local:default:6038f10c08b51f865d5625a7ebab6c28:5:num_streams":"21165"
            "eq_dcpq:replication:ns_1@172.23.105.31->ns_1@172.23.105.244:default:num_streams":"512"
            

            For some reason analytics is creating 127,488 streams. We're using well over 100MB of memory for ActiveStream objects (which have a sizeof of 880 bytes). We can't get a dump of where memory is currently allocated in the process, only the difference between two points, so we can't say exactly where the memory is going, but I suspect we're using many hundreds of MB for analytics DCP streams in various buffers that are dynamically allocated.
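
            As a quick sanity check, the 127,488 figure matches the sum of the cbas num_streams values in the grep output above, and at 880 bytes per ActiveStream object that alone accounts for roughly 107 MiB:

# Sanity check on the stream counts quoted above (values copied from the grep output).
cbas_num_streams = [42579, 0, 0, 42579, 21165, 21165]
total = sum(cbas_num_streams)
print(total)                        # 127488 analytics streams on this node
print(total * 880 / 1024 ** 2)      # ~107 MiB just for the ActiveStream objects themselves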

            I'm going to spin out an MB to remove the DCP stats from StatSnap and stats.json, because we don't use them (stats.json is not even captured in a cbcollect), and we can give this MB to analytics to investigate the number of streams they are creating, as it's pretty excessive.


            ben.huddleston Ben Huddleston added a comment -

            MB-51968 tracks the StatSnap task issue.

            ben.huddleston Ben Huddleston added a comment -

            Till Westmann could somebody from the Analytics team please look into why we have so many DCP streams here.

            michael.blow Michael Blow added a comment -

            In 7.1.1 Analytics creates one stream per distinct collection. With 250 collections over 1024 vbuckets across two KV nodes, 125K streams per node doesn't seem surprising. I would need to look at the logs to understand the distribution, as it seems odd.

            In general, it seems odd to stream 250 distinct collections on a cluster with such a low memory configuration. Is there a reason we are testing this combination of scale and memory, Umang?
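
            A rough back-of-the-envelope under that model (assuming it means one DCP stream per collection per active vbucket on a node) lands close to the observed count:

# Rough estimate under the stream-per-collection model described above
# (assumption: one stream per collection per active vbucket on a KV node).
collections = 10 * 25                   # 250 collections in this test
active_vbuckets_per_node = 1024 // 2    # 1024 vbuckets split across 2 KV nodes
print(collections * active_vbuckets_per_node)   # 128000, close to the ~127.5K streams observed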

            umang.agrawal Umang added a comment -

            Michael Blow There was no intention to run the test at this scale. The actual test was to verify CBAS ingestion when the KV magma bucket was at 40% DGM.
            The total RAM allocated to the CBAS nodes is 6 GB; shouldn't that be enough?

            ben.huddleston Ben Huddleston added a comment - - edited

            Thanks Michael Blow.

            I understand that a stream per collection probably makes code simpler on the analytics side, but it does impact the data nodes in cases like this where we have thousands of streams. We could probably chalk this up as a sizing issue should a customer hit it, and we've seen cases during the testing of magma with 2i where the number of collections and indexes we were testing with was inappropriate for the spec of the cluster. That being said, Varun did a good write up on how 2i manages streams here https://issues.couchbase.com/browse/MB-48532?focusedCommentId=547098&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-547098. They create init streams which (ideally) use OSO backfills for single collections, and then 1 maintenance stream which streams the entire bucket to keep the collections up to date. This would reduce the number of streams required dramatically, and whilst it isn't a perfect solution for all cases (say you want to add analytics to only 1 of 250 collections) it is in keeping with how all DCP services have streamed DCP in the past (i.e. bucket wide). It might be worth chatting with the 2i team about this.

            Umang I can't comment on the CBAS quota, but I suspect that 1GB bucket quota might not be enough. We need to fix MB-51968 first to say for sure though.

            michael.blow Michael Blow added a comment -

            We won’t use one stream per collection after the initial backfill if/when the Morpheus work is complete; it just did not fit into Cheshire-Cat and Neo. We wouldn’t stream the entire bucket, but would only opt in for the collections we are interested in.


            ben.huddleston Ben Huddleston added a comment -

            Perfect, thanks Michael Blow.

            Umang, I'll leave this assigned to me for now; once we fix MB-51968 we can run the test again.

            ben.huddleston Ben Huddleston added a comment - - edited

            MB-51968 is fixed in build 7.2.0-1106. Not sure what the status of the 7.1.1 changes is, or whether it makes the bar; will check next week.

            owend Daniel Owen added a comment -

            Waiting to see whether the issue still exists now that MB-51968 has been resolved.

            umang.agrawal Umang added a comment -

            Daniel Owen I ran the test with Enterprise Edition 7.1.2 build 3348 and didn't see RAM exceed 1 GB. It seems the issue is resolved.

            owend Daniel Owen added a comment -

            Many thanks, Umang, for testing on the later build. Will resolve as fixed (due to MB-51968 being fixed).


            People

              owend Daniel Owen
              umang.agrawal Umang
              Votes: 0
              Watchers: 8

