  Couchbase Server / MB-51413

Warmup aborted when disk is full and vbucket_state local doc does not exist instead of loading data

Details

    Description

      STEPS TO RECREATE:
      DISK FULL TEST

      1. Create a 4-node cluster.
      2. Create 5 million items (doc size = 2048) with replicas = 1.
      3. Fill the entire disk ("fallocate -l <space left on disk> <file_name>", e.g. "fallocate -l 84716M /data/full_disk_84716MB_1647101247.94").
      4. After the disk is full, start doc ops (create docs) until ep_data_write_failed > 0 (verified using cbstats).
      5. Kill memcached on all nodes (kill -9 $(pgrep memcached)). The time difference between the SIGKILL on each node was three seconds.
      6. Observed "2022-03-12T09:08:26.230072-08:00 CRITICAL (default) WarmupBackfillTask::run(): caught exception while running backfill - aborting warmup: WarmupVbucketVisitor::visit(): vb:107 shardId:3 failed to create BySeqnoScanContext, for backfill task:'Warmup - loading KV Pairs shard 3'"
        (Observed on node 172.23.122.247)
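
      A minimal Python sketch of the checks around steps 3, 4 and 6, for anyone scripting this by hand. It is not the QE test's implementation. The helper names are hypothetical, the "name: value" layout assumed for cbstats output is an assumption, and the log pattern is taken only from the single line quoted in step 6.

```python
import re
import time

MIB = 1024 * 1024

# Pattern based only on the log line quoted in step 6 (assumption: other
# occurrences of the failure use the same "vb:<n> shardId:<n>" wording).
ABORT_RE = re.compile(r"aborting warmup: .*?vb:(\d+) shardId:(\d+)")


def fallocate_command(free_bytes, directory="/data", timestamp=None):
    """Build the step 3 command that allocates all remaining space.

    The file name mirrors the example in the report:
    full_disk_<size>MB_<unix timestamp>.
    """
    if timestamp is None:
        timestamp = time.time()
    size_mib = free_bytes // MIB  # fallocate's "M" suffix means MiB
    path = f"{directory}/full_disk_{size_mib}MB_{timestamp:.2f}"
    return ["fallocate", "-l", f"{size_mib}M", path]


def data_write_failed(stats_text):
    """Return True once ep_data_write_failed > 0 (step 4).

    Assumes the stats text has one "name: value" pair per line.
    """
    match = re.search(
        r"^\s*ep_data_write_failed:\s*(\d+)\s*$", stats_text, re.MULTILINE
    )
    return match is not None and int(match.group(1)) > 0


def parse_warmup_abort(line):
    """Return (vbucket, shard) if the line is the step 6 abort, else None."""
    match = ABORT_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

      The free space passed to fallocate_command could come from shutil.disk_usage(directory).free, and the returned list can be handed to subprocess.run on the node being filled.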

      QE-TEST:

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/testexec.33408.ini bucket_storage=magma,rerun=false,bucket_eviction_policy=fullEviction,randomize_value=True,enable_dp=false,GROUP=P0,get-cbcollect-info=True,upgrade_version=7.1.0-1671 -t storage.magma.magma_disk_full.MagmaDiskFull.test_crash_recovery_disk_full,nodes_init=4,num_items=5000000,doc_size=2048,sdk_timeout=60,replicas=1,GROUP=P0'
      

      Note:

      1. After the above failure, teardown clears the disk space by removing the file created to fill the disk in step 3 (using rm -rf /data/full_disk_*). Even after the disk space is freed, all nodes stay in the amber state in the UI.
      2. This issue is not easily reproducible. I ran this test many times on the same build but hit the issue only once.
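
      The teardown cleanup in note 1 can be approximated as below. This is a sketch only: the function name is hypothetical, and unlike rm -rf it removes only regular files matching the prefix, not directories.

```python
import glob
import os


def remove_fill_files(directory="/data", prefix="full_disk_"):
    """Delete the filler files created in step 3 and return their paths."""
    pattern = os.path.join(glob.escape(directory), prefix + "*")
    removed = []
    for path in sorted(glob.glob(pattern)):
        if os.path.isfile(path):  # skip directories that match the prefix
            os.remove(path)
            removed.append(path)
    return removed
```

      Returning the removed paths makes it easy to log what teardown actually deleted before checking whether the nodes leave the amber state.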

      Cluster details: http://172.23.122.245:8091/ui/index.html#/buckets?commonBucket=default&scenarioZoom=minute&scenario=oombr8sk5

          People

            ankush.sharma Ankush Sharma