  1. Couchbase Server
  2. MB-55294

[Serverless]: DCP replication queue is hung for a few specific databases under a given workload.

Details

    • Bug
    • Resolution: Unresolved
    • Critical
    • Elixir
    • Elixir
    • couchbase-bucket
    • 7.5.0-3593

    Description

      1. Create 20 serverless databases. Wait for KV to scale from 3 to 6 nodes.
      2. Create GSI and FTS indexes on all the databases. The number of indexes varies based on the dataplane load pattern described below.
      3. Start the data load when all the indexes are ready.
      4. Data ingestion and index building started in FTS/GSI as data started loading into KV.
      5. FTS CPU usage started going up. This qualified for auto-scaling, so 2 more nodes were added to the cluster and rebalanced. The rebalance failed for the FTS service and the test failed.
      6. It was observed that the DCP replication queue was hung for a few specific databases, all running the same workload pattern. Details below.

      Workload distribution on 20 DBs:
      0, 4, 8, 16, 20 - Workload 1 -----> The DCP queue is hung on all of these DBs.
      1, 5, 9, 13, 17 - Workload 2
      2, 6, 10, 14, 18 - Workload 3
      3, 7, 11, 15, 19 - Workload 4
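
A quick sanity check on the distribution above can be sketched in Python. It assumes the 20 databases are indexed 0 through 19, which the ticket does not state explicitly; `DISTRIBUTION` and `check_coverage` are illustrative names, not part of the test framework. Under that assumption, the Workload 1 row as written omits index 12 and includes index 20, so that row may contain a typo.

```python
# Database indexes per workload, copied verbatim from the ticket.
DISTRIBUTION = {
    1: [0, 4, 8, 16, 20],
    2: [1, 5, 9, 13, 17],
    3: [2, 6, 10, 14, 18],
    4: [3, 7, 11, 15, 19],
}


def check_coverage(distribution, num_dbs=20):
    """Compare the assigned indexes against 0..num_dbs-1 (assumed indexing)."""
    assigned = [db for dbs in distribution.values() for db in dbs]
    expected = set(range(num_dbs))
    missing = sorted(expected - set(assigned))
    unexpected = sorted(set(assigned) - expected)
    duplicates = sorted({db for db in assigned if assigned.count(db) > 1})
    return missing, unexpected, duplicates


missing, unexpected, duplicates = check_coverage(DISTRIBUTION)
print("missing:", missing)        # indexes in 0..19 with no workload
print("unexpected:", unexpected)  # indexes outside 0..19
print("duplicates:", duplicates)  # indexes assigned more than once
```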

      Workload definitions:
      Workload 1:

      {
          "scopes": 1,
          "collections": 20,
          "num_items": 5000000,
          "ops": 5000,
          "doc_size": 1024,
          "pattern": [10, 80, 0, 10, 0],
          "load_type": ["create", "read", "delete"],
          "2i": (20, 20),  # 20 indexes and 20 QPS
          "FTS": (5, 5)  # 5 indexes and 5 QPS
      }
      

      Workload 2:

      {
          "scopes": 1,
          "collections": 10,
          "num_items": 2000000,
          "ops": 4000,
          "doc_size": 1024,
          "pattern": [10, 80, 0, 10, 0],
          "load_type": ["create", "read", "delete"],
          "2i": (10, 10),  # 10 indexes and 10 QPS
          "FTS": (10, 10)  # 10 indexes and 10 QPS
      }
      

      Workload 3:

      {
          "scopes": 1,
          "collections": 5,
          "num_items": 200000,
          "ops": 2000,
          "doc_size": 1024,
          "pattern": [10, 80, 0, 10, 0],
          "load_type": ["create", "read", "delete"],
          "2i": (10, 20),
          "FTS": (10, 5)
      }
      

      Workload 4:

      {
          "scopes": 1,
          "collections": 5,
          "num_items": 1000000,
          "ops": 2000,
          "doc_size": 1024,
          "pattern": [10, 80, 0, 10, 0],
          "load_type": ["create", "read", "delete"],
          "2i": (0, 2),  # 0 indexes and 2 QPS (KV Range scan)
          "FTS": (0, 0)
      }
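
To compare the load each workload puts on a database, the definitions above can be broken down per operation type. The sketch below is only illustrative and rests on assumptions the ticket does not confirm: that the five "pattern" slots are percentages in the order create, read, update, delete, expiry (inferred from "load_type" listing create, read and delete, which matches the non-zero slots), that "ops" is a per-database ops/sec target, and that the index counts are per database. `PATTERN_SLOTS`, `per_op_rates` and `aggregate` are hypothetical helper names.

```python
# Assumed meaning of the five "pattern" slots; not stated in the ticket.
PATTERN_SLOTS = ("create", "read", "update", "delete", "expiry")

# Values copied from the four workload definitions above.
WORKLOADS = {
    1: {"ops": 5000, "pattern": [10, 80, 0, 10, 0], "2i": (20, 20), "FTS": (5, 5)},
    2: {"ops": 4000, "pattern": [10, 80, 0, 10, 0], "2i": (10, 10), "FTS": (10, 10)},
    3: {"ops": 2000, "pattern": [10, 80, 0, 10, 0], "2i": (10, 20), "FTS": (10, 5)},
    4: {"ops": 2000, "pattern": [10, 80, 0, 10, 0], "2i": (0, 2), "FTS": (0, 0)},
}

# Each row of the workload distribution lists five databases.
DBS_PER_WORKLOAD = 5


def per_op_rates(ops, pattern):
    """Split a total ops target by the percentage pattern."""
    return {name: ops * pct // 100 for name, pct in zip(PATTERN_SLOTS, pattern)}


def aggregate(workloads, dbs_per_workload=DBS_PER_WORKLOAD):
    """Sum ops and index counts over all databases (per-database assumption)."""
    totals = {"ops": 0, "gsi_indexes": 0, "fts_indexes": 0}
    for w in workloads.values():
        totals["ops"] += w["ops"] * dbs_per_workload
        totals["gsi_indexes"] += w["2i"][0] * dbs_per_workload
        totals["fts_indexes"] += w["FTS"][0] * dbs_per_workload
    return totals


for wid, w in WORKLOADS.items():
    print(f"Workload {wid}:", per_op_rates(w["ops"], w["pattern"]))
print("Cluster totals:", aggregate(WORKLOADS))
```

Under these assumptions, Workload 1 has the highest per-database ops target and the most GSI indexes of the four, which may be relevant given that it is the only workload whose databases show the hung DCP queue.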
      

      Sample Database:

      Ritam Sharma, this affects the volume tests run for serverless.

      QE Test

      pc=500
      git fetch https://review.couchbase.org/TAF refs/changes/05/185605/1 && git checkout FETCH_HEAD
      sudo guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P "args=-i /tmp/ElixirVolumeTest.ini -p bucket_storage=magma,bucket_eviction_policy=fullEviction,rerun=False -t aGoodDoctor.serverlessHospital.Murphy.ElixirVolume,skip_cleanup=True,num_buckets=${num_buckets},bucket_names=GleamBook,doc_size=1024,bucket_type=membase,eviction_policy=fullEviction,iterations=${iterations},batch_size=1000,sdk_timeout=60,log_level=debug,infra_log_level=debug,key_size=18,randomize_doc_size=False,randomize_value=True,assert_crashes_on_load=True,maxttl=10,pc=${pc},mutation_perc=20,key_type=RandomKey,capella_run=true,skip_teardown_cleanup=true,wait_timeout=14400,index_timeout=36000,dataplane_id=${dataplane_id},cb_image=${cb_image},dn_image=${dn_image},dapi_image=${dapi_image},num_dataplanes=${num_dataplanes},runtype=serverless,kv_disk_size=${kv_disk_size},index_disk_size=300 -m rest"
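
The test argument in the command above is a long comma-separated list of key=value pairs, which is hard to review by eye. A minimal Python sketch for splitting such a string into a dictionary follows. It is a reading aid only and is not how testrunner itself parses its arguments; `parse_params` is a hypothetical helper, and `sample` is a shortened excerpt of the `-t` value with the shell placeholders left unexpanded.

```python
def parse_params(raw):
    """Split 'k1=v1,k2=v2,...' into a dict; tokens without '=' are returned separately."""
    params, bare = {}, []
    for token in raw.split(","):
        token = token.strip()
        if not token:
            continue
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value
        else:
            bare.append(token)  # e.g. the test name
    return params, bare


# Shortened excerpt of the -t argument from the command above.
sample = (
    "aGoodDoctor.serverlessHospital.Murphy.ElixirVolume,skip_cleanup=True,"
    "num_buckets=${num_buckets},bucket_names=GleamBook,doc_size=1024,"
    "maxttl=10,pc=${pc},mutation_perc=20,runtype=serverless"
)

params, bare = parse_params(sample)
print("test:", bare)
for key in sorted(params):
    print(f"{key} = {params[key]}")
```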
      

            People

              drigby Dave Rigby
              ritesh.agarwal Ritesh Agarwal