Couchbase Server / MB-52490

[30TB, 1% KV DGM, CBAS]: Rebalance in 1 KV node has been stuck for 35 hours. No movement in data/vBuckets.

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Fix Version/s: Morpheus
    • Affects Version/s: 7.1.1
    • Component/s: couchbase-bucket
    • Environment: Enterprise Edition 7.1.1 build 3067

    Description

      1. Create a 3-node KV cluster.
      2. Create a Magma bucket with 1 replica and RAM=200GB.
      3. Load 10B documents of 1024 bytes each. This is about 20TB of active + replica data and puts the bucket at 1% DGM.
      4. Upsert the entire data set to create 50% fragmentation.
      5. Create 25 datasets on CBAS ingesting data from different collections. Let the ingestion start, then start an asynchronous SQL++ load at 10 QPS.
      6. Start an async CRUD data load:

        Read Start: 0
        Read End: 100000000
        Update Start: 0
        Update End: 100000000
        Expiry Start: 0
        Expiry End: 0
        Delete Start: 100000000
        Delete End: 200000000
        Create Start: 200000000
        Create End: 300000000
        Final Start: 200000000
        Final End: 300000000
        

      7. Rebalance in 1 KV node. The rebalance appears to have been stuck for hours.
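
The sizing figures in steps 3 and 4 can be checked with a few lines of arithmetic. This is a rough sketch only: it assumes RAM=200GB is the total bucket quota across the cluster, counts only document values (no keys, metadata, or storage overhead), and assumes no compaction runs during the upsert pass. The function names are illustrative and not part of any Couchbase tool.

```python
def data_footprint(num_docs, doc_bytes, replicas):
    """Bytes of document values across active and replica copies."""
    return num_docs * doc_bytes * (1 + replicas)


def resident_ratio(ram_quota_bytes, footprint_bytes):
    """Fraction of the data that could fit in the bucket's RAM quota."""
    return ram_quota_bytes / footprint_bytes


def fragmentation_after_upserts(full_upsert_passes):
    """Stale fraction if every pass rewrites each document once and nothing is compacted."""
    stale_versions = full_upsert_passes
    live_versions = 1
    return stale_versions / (stale_versions + live_versions)


footprint = data_footprint(10_000_000_000, 1024, replicas=1)
ratio = resident_ratio(200 * 10**9, footprint)
frag = fragmentation_after_upserts(1)

print(f"{footprint / 10**12:.2f} TB, {ratio:.2%} resident, {frag:.0%} fragmentation")
# prints "20.48 TB, 0.98% resident, 50% fragmentation"
```

Under these assumptions the numbers line up with the description: roughly 20TB of active + replica data, about 1% of it resident in memory, and half of the on-disk versions stale after one full upsert pass.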

      QE Test

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/magma_temp_job3.ini -p bucket_storage=magma,bucket_eviction_policy=fullEviction,rerun=False -t aGoodDoctor.Hospital.Murphy.ClusterOpsVolume,nodes_init=3,graceful=True,skip_cleanup=True,num_items=100000000,num_buckets=1,bucket_names=GleamBook,doc_size=1300,bucket_type=membase,eviction_policy=fullEviction,iterations=2,batch_size=1000,sdk_timeout=60,log_level=debug,infra_log_level=debug,rerun=False,skip_cleanup=True,key_size=18,randomize_doc_size=False,randomize_value=True,assert_crashes_on_load=True,num_collections=50,maxttl=10,num_indexes=25,pc=10,index_nodes=0,cbas_nodes=1,fts_nodes=0,ops_rate=200000,ramQuota=68267,doc_ops=create:update:delete:read,mutation_perc=100,rebl_ops_rate=50000,key_type=RandomKey -m rest'
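
The `-t` argument above is a test path followed by comma-separated `key=value` pairs, which is hard to read at a glance. The sketch below splits an excerpt of that string into a dictionary and reports keys that occur more than once. `parse_test_spec` is a hypothetical helper written for this illustration, not a testrunner function, and how the real framework resolves repeated keys is not shown here.

```python
from collections import Counter


def parse_test_spec(spec):
    """Split 'test.path,key=value,...' into (test_path, params, duplicated_keys)."""
    test_path, *pairs = spec.split(",")
    items = [pair.split("=", 1) for pair in pairs]
    counts = Counter(key for key, _ in items)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    # dict() keeps the last value for a repeated key; that is a property of
    # this sketch, not a statement about testrunner's behaviour.
    return test_path, dict(items), duplicates


# Excerpt of the -t value from the command above (not the full string).
spec = (
    "aGoodDoctor.Hospital.Murphy.ClusterOpsVolume,nodes_init=3,graceful=True,"
    "skip_cleanup=True,num_items=100000000,doc_size=1300,rerun=False,"
    "skip_cleanup=True,ramQuota=68267"
)

test_path, params, duplicates = parse_test_spec(spec)
print(test_path)
print(params["num_items"], params["doc_size"], params["ramQuota"])
print(duplicates)
```

Laying the parameters out this way makes it easier to compare the values the job was launched with against the steps in the description.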
      

            People

              owend Daniel Owen
              ritesh.agarwal Ritesh Agarwal

                Gerrit Reviews

                  There are 5 open Gerrit changes
