Couchbase Server / MB-44735

[couchstore]: With 10 buckets, 5+default scopes/bucket, 5 collections/scope, >3TB on disk, Rebalance IN operation is stuck at 30.64%

    Description

      Steps:

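      1. Create a 2-node cluster.
      2. Create the required buckets, scopes and collections (10 buckets; 5 scopes plus the default scope per bucket; 5 collections per scope).
      3. Create 100,000,000 items per collection, sequentially.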

        Active Resident Threshold of GleamBookUsers0 is 100
        Active Resident Threshold of GleamBookUsers1 is 100
        Active Resident Threshold of GleamBookUsers2 is 100
        Active Resident Threshold of GleamBookUsers3 is 100
        Active Resident Threshold of GleamBookUsers4 is 100
        Active Resident Threshold of GleamBookUsers5 is 100
        Active Resident Threshold of GleamBookUsers6 is 100
        Active Resident Threshold of GleamBookUsers7 is 100
        Active Resident Threshold of GleamBookUsers8 is 100
        Active Resident Threshold of GleamBookUsers9 is 100
        

      4. Rebalance IN while loading docs
        Rebalance completed with progress: 100% in 783.251000166 sec

        Active Resident Threshold of GleamBookUsers0 is 100
        Active Resident Threshold of GleamBookUsers1 is 100
        Active Resident Threshold of GleamBookUsers2 is 100
        Active Resident Threshold of GleamBookUsers3 is 100
        Active Resident Threshold of GleamBookUsers4 is 100
        Active Resident Threshold of GleamBookUsers5 is 100
        Active Resident Threshold of GleamBookUsers6 is 100
        Active Resident Threshold of GleamBookUsers7 is 100
        Active Resident Threshold of GleamBookUsers8 is 100
        Active Resident Threshold of GleamBookUsers9 is 100
        

      5. Rebalance OUT while loading docs
        Rebalance completed with progress: 100% in 1958.66400003 sec

        Active Resident Threshold of GleamBookUsers0 is 100
        Active Resident Threshold of GleamBookUsers1 is 100
        Active Resident Threshold of GleamBookUsers2 is 100
        Active Resident Threshold of GleamBookUsers3 is 100
        Active Resident Threshold of GleamBookUsers4 is 100
        Active Resident Threshold of GleamBookUsers5 is 100
        Active Resident Threshold of GleamBookUsers6 is 100
        Active Resident Threshold of GleamBookUsers7 is 100
        Active Resident Threshold of GleamBookUsers8 is 100
        Active Resident Threshold of GleamBookUsers9 is 100
        

      6. Rebalance SWAP while loading docs
        Rebalance completed with progress: 100% in 5252.32400012 sec

        Active Resident Threshold of GleamBookUsers0 is 100
        Active Resident Threshold of GleamBookUsers1 is 100
        Active Resident Threshold of GleamBookUsers2 is 100
        Active Resident Threshold of GleamBookUsers3 is 100
        Active Resident Threshold of GleamBookUsers4 is 100
        Active Resident Threshold of GleamBookUsers5 is 100
        Active Resident Threshold of GleamBookUsers6 is 100
        Active Resident Threshold of GleamBookUsers7 is 100
        Active Resident Threshold of GleamBookUsers8 is 100
        Active Resident Threshold of GleamBookUsers9 is 100
        

      7. Rebalance IN/OUT while loading docs
        Rebalance completed with progress: 100% in 10523.181 sec

        Active Resident Threshold of GleamBookUsers0 is 100
        Active Resident Threshold of GleamBookUsers1 is 99.958179363
        Active Resident Threshold of GleamBookUsers2 is 100
        Active Resident Threshold of GleamBookUsers3 is 95.4621923919
        Active Resident Threshold of GleamBookUsers4 is 99.6915230079
        Active Resident Threshold of GleamBookUsers5 is 100
        Active Resident Threshold of GleamBookUsers6 is 99.9331886697
        Active Resident Threshold of GleamBookUsers7 is 99.9220528991
        Active Resident Threshold of GleamBookUsers8 is 92.1667923794
        Active Resident Threshold of GleamBookUsers9 is 94.7632706013
        

      8. Rebalance OUT/IN while loading docs
        Rebalance completed with progress: 100% in 28732.2089999 sec

        Active Resident Threshold of GleamBookUsers0 is 27.9740141957
        Active Resident Threshold of GleamBookUsers1 is 27.9909488378
        Active Resident Threshold of GleamBookUsers2 is 40.9782675253
        Active Resident Threshold of GleamBookUsers3 is 25.3353349135
        Active Resident Threshold of GleamBookUsers4 is 28.1773154437
        Active Resident Threshold of GleamBookUsers5 is 45.4830972114
        Active Resident Threshold of GleamBookUsers6 is 33.871868483
        Active Resident Threshold of GleamBookUsers7 is 33.3545983334
        Active Resident Threshold of GleamBookUsers8 is 26.0763313068
        Active Resident Threshold of GleamBookUsers9 is 30.8288544046
        

      9. Rebalance IN while loading docs. The rebalance is stuck at 30.64%.
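
The per-step output above can be condensed to compare rebalance duration with resident ratio across steps. The following is a minimal sketch, not part of the test framework: the function name `summarize_step` and both regular expressions are assumptions based only on the log lines quoted in this description.

```python
import re

# Patterns assumed from the log lines quoted in the steps above.
RATIO_RE = re.compile(r"Active Resident Threshold of (\S+) is ([\d.]+)")
REBALANCE_RE = re.compile(r"Rebalance completed with progress: (\d+)% in ([\d.]+) sec")


def summarize_step(log_text):
    """Return the rebalance duration and the buckets whose resident ratio is below 100."""
    ratios = {name: float(value) for name, value in RATIO_RE.findall(log_text)}
    match = REBALANCE_RE.search(log_text)
    duration = float(match.group(2)) if match else None
    below = {name: value for name, value in ratios.items() if value < 100.0}
    lowest = min(ratios, key=ratios.get) if ratios else None
    return {"duration_sec": duration, "below_100": below, "lowest_bucket": lowest}


if __name__ == "__main__":
    # A few lines copied from step 7 of this report.
    sample = """
    Rebalance completed with progress: 100% in 10523.181 sec
    Active Resident Threshold of GleamBookUsers0 is 100
    Active Resident Threshold of GleamBookUsers3 is 95.4621923919
    Active Resident Threshold of GleamBookUsers8 is 92.1667923794
    """
    print(summarize_step(sample))
```

Run against each step in turn, this makes the trend in the report easier to see: rebalance time grows from step to step while the resident ratio drops from 100 to well below 50 by step 8.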

      mem_used(GleamBookUsers0): [graph not preserved]

      Machine configuration used for the test:

      [root@hidd-srv-04 ~]# lscpu
      Architecture:          x86_64
      CPU op-mode(s):        32-bit, 64-bit
      Byte Order:            Little Endian
      CPU(s):                56
      On-line CPU(s) list:   0-55
      Thread(s) per core:    2
      Core(s) per socket:    14
      Socket(s):             2
      NUMA node(s):          2
      Vendor ID:             GenuineIntel
      CPU family:            6
      Model:                 79
      Model name:            Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
      Stepping:              1
      CPU MHz:               1314.550
      CPU max MHz:           3300.0000
      CPU min MHz:           1200.0000
      BogoMIPS:              4794.58
      Virtualization:        VT-x
      L1d cache:             32K
      L1i cache:             32K
      L2 cache:              256K
      L3 cache:              35840K
       
      [root@hidd-srv-04 ~]# df -h
      Filesystem                             Size  Used Avail Use% Mounted on
      devtmpfs                               126G     0  126G   0% /dev
      tmpfs                                  126G     0  126G   0% /dev/shm
      tmpfs                                  126G   27M  126G   1% /run
      tmpfs                                  126G     0  126G   0% /sys/fs/cgroup
      /dev/mapper/centos_hidd--srv--04-root  926G   82G  845G   9% /
      /dev/sda2                             1014M  174M  841M  18% /boot
      /dev/sda1                              200M   12M  189M   6% /boot/efi
      /dev/mapper/data-data                  7.0T  1.7T  5.4T  24% /data
      tmpfs                                   26G     0   26G   0% /run/user/0
       
      [root@hidd-srv-04 ~]# free -m
                    total        used        free      shared  buff/cache   available
      Mem:         257685      167255        1813          26       88616       89538
      Swap:          4095           0        4095
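
To put the `free -m` figures above into proportion, the `Mem:` row can be parsed and expressed as a share of total RAM. This is a rough sketch only; `parse_free_mem` is a hypothetical helper, and it assumes the six-column `free -m` layout shown above.

```python
def parse_free_mem(free_output):
    """Parse the 'Mem:' row of `free -m` output into a dict of MiB values."""
    lines = [line for line in free_output.splitlines() if line.strip()]
    header = lines[0].split()  # total, used, free, shared, buff/cache, available
    for line in lines[1:]:
        fields = line.split()
        if fields[0] == "Mem:":
            return dict(zip(header, (int(v) for v in fields[1:])))
    raise ValueError("no 'Mem:' row found")


if __name__ == "__main__":
    # Output copied from the machine listed in this report.
    output = """
                  total        used        free      shared  buff/cache   available
    Mem:         257685      167255        1813          26       88616       89538
    Swap:          4095           0        4095
    """
    mem = parse_free_mem(output)
    used_pct = 100.0 * mem["used"] / mem["total"]
    print("used: %.1f%% of %d MiB, buff/cache: %d MiB" % (used_pct, mem["total"], mem["buff/cache"]))
```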
      

      QE Test

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/magma_temp_job3.ini -p bucket_storage=couchstore,bucket_eviction_policy=fullEviction -t volumetests.Magma.volume.test_long_rebalance,nodes_init=2,replicas=1,skip_cleanup=True,num_items=100000000,doc_size=4096,batch_size=10,sdk_timeout=60,log_level=debug,infra_log_level=info,skip_cleanup=True,key_size=18,randomize_doc_size=False,randomize_value=True,assert_crashes_on_load=True,maxttl=300,num_buckets=10,num_scopes=5,num_collections=5,doc_ops=expiry,durability=None,pc=5,sdk_client_pool=True,get-cbcollect-info=True,iterations=50 -m rest'
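
The `-t` value in the command above is long enough that repeated keys are easy to miss (`skip_cleanup=True` appears twice). The sketch below splits such a value into a test name and its parameters and reports duplicated keys. It is an illustration only: `parse_test_spec` is a hypothetical helper, and the way the real test runner parses or resolves repeated keys is not stated in this report.

```python
from collections import Counter


def parse_test_spec(spec):
    """Split a '-t' value into (test_name, params, duplicate_keys)."""
    test_name, *pairs = spec.split(",")
    keys = []
    params = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        keys.append(key)
        params[key] = value  # in this sketch a later occurrence overwrites an earlier one
    duplicates = sorted(k for k, n in Counter(keys).items() if n > 1)
    return test_name, params, duplicates


if __name__ == "__main__":
    # A subset of the parameters from the command in this report.
    spec = (
        "volumetests.Magma.volume.test_long_rebalance,nodes_init=2,replicas=1,"
        "skip_cleanup=True,num_items=100000000,doc_size=4096,skip_cleanup=True,"
        "num_buckets=10,num_scopes=5,num_collections=5,doc_ops=expiry,maxttl=300"
    )
    name, params, duplicates = parse_test_spec(spec)
    print(name)
    print(params["num_items"], params["doc_size"])
    print("duplicated keys:", duplicates)
```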
      

      Attachments

        1. hwm-lwm-mem_used.png (206 kB)
        2. Rebalance.png (306 kB)
        3. Screen Recording 2021-04-07 at 8.55.57 AM.mov (19.87 MB)
        4. Screenshot 2021-03-08 at 14.02.40.png (248 kB)
        5. Screenshot 2021-03-08 at 14.24.34.png (171 kB)
        6. Screenshot 2021-03-08 at 14.27.51.png (195 kB)
        7. Screenshot 2021-04-08 at 09.35.15.png (396 kB)
        8. Screenshot 2021-04-08 at 09.38.39.png (396 kB)
        9. Screenshot 2021-04-08 at 10.49.49.png (128 kB)
        10. trace.67.tar.gz (24.99 MB)
        11. trace.68.tar.gz (24.94 MB)

            People

              ritesh.agarwal Ritesh Agarwal