Couchbase Server / MB-50589

Warmup scan of large range of deleted items can hang warmup indefinitely

Details

    • Triaged
    • 1
    • Yes
    • KV 2022-Jan

    Description

      Believe that this was introduced with MB-47267.

      Warmup skips deleted items when scanning disk - https://github.com/couchbase/kv_engine/blob/a6acea19e938412df114fe77dfa6a408c2d92424/engines/ep/src/warmup.cc#L517-L524.

      The crux of the problem is that ScanContext::lastReadSeqno does not move when the scan sees deleted items. CouchKVStore passes the deleted-items filter down to couchstore, so the LoadStorageKVPairCallback is not invoked until a non-deleted item is found. MagmaKVStore filters out the deletes itself and moves on to the next item. For both KVStores, a resumed scan starts from lastReadSeqno + 1 if lastReadSeqno != 0.

      During warmup we decide to pause a scan if it has been running for more than some fixed amount of time. For Backfill tasks that time is set to 10 milliseconds.
      https://github.com/couchbase/kv_engine/blob/a6acea19e938412df114fe77dfa6a408c2d92424/engines/ep/src/warmup.cc#L969-L974
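
      The two rules above can be written down as a minimal Python sketch. This is a model for discussion, not kv_engine code: the function names, the scan_start_seqno parameter, and the fallback used when lastReadSeqno == 0 are assumptions made for illustration. Only the "lastReadSeqno + 1 if lastReadSeqno != 0" rule and the 10 millisecond figure come from the description.

```python
PAUSE_THRESHOLD_MS = 10  # figure quoted in the description above


def resume_seqno(last_read_seqno: int, scan_start_seqno: int) -> int:
    """Seqno at which a resumed scan starts reading.

    Mirrors the rule described above: lastReadSeqno + 1 if lastReadSeqno != 0.
    Falling back to the scan's original start seqno when nothing has been
    read yet is an assumption of this sketch.
    """
    if last_read_seqno != 0:
        return last_read_seqno + 1
    return scan_start_seqno


def should_pause(elapsed_ms: float, threshold_ms: float = PAUSE_THRESHOLD_MS) -> bool:
    """True once the scan has run for more than the allowed time slice."""
    return elapsed_ms > threshold_ms
```

      Note that resume_seqno depends only on lastReadSeqno, so any item that is read without updating lastReadSeqno is read again after a pause.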

      If we have an on disk structure as follows:

      [1:alive, 2:deleted, 3:deleted, ..., n:deleted, n+1:alive]

      Then lastReadSeqno is set to 1 when the first item is read, and that item is warmed up. If scanning seqnos 2 to n takes more than 10 milliseconds, Warmup decides to pause the scan when it reaches the item at n+1. Because lastReadSeqno is not updated while scanning 2 to n, the scan is restarted from 2 rather than from n+1. If the disk is consistently slow, warmup can hang indefinitely as each scan repeats over the same range of deleted items.
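
      The scenario can be reproduced with a small simulation. The sketch below is a simplified model, not kv_engine code: wall-clock time is replaced by a hypothetical per-slice "tick" budget (one tick per item read), the pause check is assumed to happen when the callback fires for a live item, and the names scan_once, run_warmup and advance_on_deleted are invented for illustration. Setting advance_on_deleted=True shows one possible remedy (moving lastReadSeqno past deleted items); it is not necessarily the change that resolved this issue.

```python
from dataclasses import dataclass


@dataclass
class ScanContext:
    start_seqno: int = 1
    last_read_seqno: int = 0  # 0 means "nothing read yet"


def scan_once(deleted_flags, ctx, budget, advance_on_deleted):
    """Run one scan slice; return True if the scan reached the end of disk.

    deleted_flags[i] is True when the item at seqno i + 1 is deleted.
    budget is the number of item reads allowed before a pause is requested
    (a stand-in for the 10 ms time slice).
    """
    if ctx.last_read_seqno != 0:
        seqno = ctx.last_read_seqno + 1
    else:
        seqno = ctx.start_seqno
    ticks = 0
    while seqno <= len(deleted_flags):
        ticks += 1  # every item read from disk costs one tick
        if deleted_flags[seqno - 1]:
            # Deleted item: the callback is never invoked for it.
            if advance_on_deleted:
                ctx.last_read_seqno = seqno  # hypothetical remedy
            seqno += 1
            continue
        # Live item: the callback runs and checks the time slice first.
        if ticks > budget:
            return False  # pause; last_read_seqno is left untouched
        ctx.last_read_seqno = seqno  # item is warmed up
        seqno += 1
    return True


def run_warmup(deleted_flags, budget, advance_on_deleted, max_slices=100):
    """Return the number of slices needed, or None if warmup never finishes."""
    ctx = ScanContext()
    for slices in range(1, max_slices + 1):
        if scan_once(deleted_flags, ctx, budget, advance_on_deleted):
            return slices
    return None


# [1:alive, 2..51:deleted, 52:alive] with a budget of 10 reads per slice
disk = [False] + [True] * 50 + [False]
stuck = run_warmup(disk, budget=10, advance_on_deleted=False)
fixed = run_warmup(disk, budget=10, advance_on_deleted=True)
```

      In this model stuck is None, because every slice restarts from seqno 2 and exhausts its budget before reaching seqno 52, while fixed is 2, because the second slice resumes directly at seqno 52.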

          People

            ritesh.agarwal Ritesh Agarwal
            ben.huddleston Ben Huddleston