Couchbase Server / MB-49037

AWS m6g.large rebalance hung due to backfilling paused


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: 7.1.0
    • Fix Version: 7.1.0
    • Component: couchbase-bucket
    • Build number: 7.1.0-1361

      OS: Amazon Linux 2
      ARM instance: m6g.large

      2 vCPU
      8 GB memory
      40 GB EBS

    Description

      During rebalance performance tests on ARM AWS instances, the tests consistently hang. An example job can be found here, along with the logs:

      http://perf.jenkins.couchbase.com/job/Cloud-Tester/600/

       

      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-07T223241-ns_1%40ec2-3-219-56-9.compute-1.amazonaws.com.zip
      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-07T223241-ns_1%40ec2-3-223-6-164.compute-1.amazonaws.com.zip
      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-07T223241-ns_1%40ec2-44-195-22-82.compute-1.amazonaws.com.zip

       

      The rebalance appears to hang on 'still waiting for backfill on connection', which occurs 115 times in the logs:

      [rebalance:debug,2021-10-07T22:35:41.445Z,ns_1@ec2-44-195-22-82.compute-1.amazonaws.com:<0.1108.3>:dcp_replicator:wait_for_data_move_on_one_node:192]Still waiting for backfill on connection "replication:ns_1@ec2-44-195-22-82.compute-1.amazonaws.com->ns_1@ec2-3-223-6-164.compute-1.amazonaws.com:bucket-1" bucket "bucket-1", partition 745, last estimate {0,0, <<"calculating-item-count">>}

      During this time memcached keeps returning <<"calculating-item-count">> with no estimate; CPU usage also spikes during this period.

       

      Attachments

        1. MB-49037_b1695.png
          389 kB
          Paolo Cocchi
        2. MB-49037_dcp-backoff.png
          41 kB
          Paolo Cocchi
        3. MB-49037_HT-ejection.png
          52 kB
          Paolo Cocchi
        4. MB-49037_ht-mem.png
          96 kB
          Paolo Cocchi
        5. MB-49037_mem.png
          135 kB
          Paolo Cocchi
        6. Screenshot 2021-10-20 at 13.37.03.png
          59 kB
          Dave Rigby
        7. Screenshot 2021-10-20 at 13.49.48.png
          267 kB
          Dave Rigby
        8. Screenshot 2021-10-20 at 13.52.05.png
          60 kB
          Dave Rigby
        9. x86 dashboard.png
          320 kB
          Dave Rigby


          Activity

            build-team Couchbase Build Team added a comment -

            Build couchbase-server-7.1.0-1730 contains kv_engine commit df37d73 with commit message:
            MB-49037: Add ep_ht_item_memory stat
            paolo.cocchi Paolo Cocchi added a comment - edited

            ht_item_memory is actually computed as (per each StoredValue):

            size_t size() const {
                return getObjectSize() + valuelen();
            }

            size_t StoredValue::getObjectSize() const {
                ..
                return sizeof(*this) + getKey().getObjectSize();
            }

            StoredValue::getObjectSize() (e.g. 63 bytes on macOS) is also what we compute as ht_metadata.
            So ht_item_memory includes ht_metadata, and it is therefore expected to be non-zero under Value Ejection even for a HashTable that contains only non-resident items.
            paolo.cocchi Paolo Cocchi added a comment - edited

            Interesting point: what we call "Mem Used - HashTable" in stats accounts for StoredValue metadata + Blob size.

            That means that in the Mem chart we are seeing ~400 MB of metadata + ~400 MB of Blobs -> ~800 MB of allocation reported against the HashTable.

            That is misleading. Blobs are reference-counted objects. When a Blob is ejected from the HT (which is the case for all Blobs in this scenario) it should not be counted towards HT mem-usage.
            Note that we still have ~400 MB of Blobs around, as they are referenced by the replica checkpoints.
            paolo.cocchi Paolo Cocchi added a comment -

            Summary

            The issue observed in build 1361 is a combination of:

            • The test using Value Ejection
            • All items already ejected
            • Replica checkpoints taking up the entire CM Quota (i.e. 50% of the Bucket Quota in build 1361)
            • Replica checkpoint memory not being recovered as per the default recovery thresholds (set in EP config)

            As per an offline chat with Sean Corrigan, this test used to succeed on some build before 1361, so this was a regression in 1361.
            But 1361 falls in the middle of the "Improvements" window, so some relevant things have changed since then:

            1. checkpoint_max_size was buggy in 1361 and was fixed in 1574 - note that the checkpoint_max_size param directly affects the effectiveness of memory recovery by Checkpoint Removal
            2. The CM Quota was set to 50% (of the Bucket Quota) in 1361, and has recently been reduced to 30%

            Those changes produce a very different checkpoint memory pattern in build 1695, which can be summarized as checkpoint memory recovery being much more effective, avoiding any OOM during the test:
            paolo.cocchi Paolo Cocchi added a comment -

            Hi Sean Corrigan, I'm resolving this as fixed in the more recent builds, thanks.

            People

              Assignee: owend Daniel Owen
              Reporter: sean.corrigan Sean Corrigan
              Votes: 0
              Watchers: 5

