  1. Couchbase Server
  2. MB-51941

Slow bucket initialization due to slow disk may cause delta recovery timeout


Details

    • Bug
    • Resolution: Unresolved
    • Major
    • backlog
    • 6.0.5, 6.5.2, 6.6.5, 7.0.4, 7.1.1
    • couchbase-bucket, ns_server
    • None
    • Untriaged
    • 1
    • Unknown

    Description

      This issue is very similar to MB-47267. During delta node recovery, ns_server imposes a 1-minute timeout for the janitor to find that Buckets have been created on the incoming node(s). As part of Bucket initialization we schedule the Warmup tasks (but do not wait for them to run), and these drive a lot of IO work. With many Buckets, the disk work required to initialize a given Bucket can become slow enough to hit this 1-minute timeout if the disk cannot cope with both the warmup of other Buckets and the initialization of that Bucket. Whilst we'd probably chalk this up as a slow disk issue if we saw it with a single Bucket, it has been observed that Warmup of other Buckets has an impact on the time it takes to initialize any given Bucket.
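To make the failure mode concrete, here is a minimal sketch of a poll-with-deadline loop of the kind described above. It is not ns_server code (ns_server is written in Erlang); the function name `wait_for_buckets`, the `is_created` callback, and the polling interval are assumptions made for illustration. Only the 60-second limit comes from the description.

```python
import time


def wait_for_buckets(is_created, buckets, timeout_s=60.0, poll_s=0.5,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until every bucket is reported as created or the deadline passes.

    Returns the set of buckets still missing at exit (empty set on success).
    `is_created` is a hypothetical callback: bucket name -> bool.
    """
    deadline = clock() + timeout_s
    pending = set(buckets)
    while pending:
        # Drop every bucket that has finished initializing since the last poll.
        pending = {b for b in pending if not is_created(b)}
        if not pending or clock() >= deadline:
            break
        sleep(poll_s)
    return pending
```

A non-empty result corresponds to the timeout in this report: one Bucket whose initialization is delayed by the warmup IO of the others is enough to fail the whole wait.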

      Potential solution

      1. We could perhaps stop scheduling the Warmup tasks during Bucket initialization and instead expose an API that lets ns_server schedule warmup. ns_server could then start warmup on all Buckets once it finds that all Buckets have been created (kv_engine does not know how many Buckets will be created).
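The proposal above can be sketched as a toy model in which Bucket creation no longer schedules warmup and the cluster manager triggers it once every expected Bucket exists. Everything here is hypothetical: `Engine`, `start_warmup`, and `recover_node` are illustrative names, not existing kv_engine or ns_server APIs, and the real components are written in C++ and Erlang rather than Python.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    name: str
    warmup_scheduled: bool = False


@dataclass
class Engine:
    """Stand-in for kv_engine: it does not know how many buckets to expect."""
    buckets: dict = field(default_factory=dict)

    def create_bucket(self, name):
        # Initialization only; warmup is deliberately NOT scheduled here.
        self.buckets[name] = Bucket(name)
        return self.buckets[name]

    def start_warmup(self, name):
        # Hypothetical API exposed to the cluster manager.
        self.buckets[name].warmup_scheduled = True


def recover_node(engine, expected_buckets):
    """Stand-in for ns_server: create all buckets first, then start warmup."""
    for name in expected_buckets:
        engine.create_bucket(name)
    # Only the caller knows the full bucket list, so only it can decide
    # that creation is complete and warmup IO may begin.
    if set(expected_buckets) <= set(engine.buckets):
        for name in expected_buckets:
            engine.start_warmup(name)
```

In this ordering, no warmup IO competes with Bucket initialization, so the time to create each Bucket no longer depends on how many other Buckets are warming up.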


          People

            owend Daniel Owen
            ben.huddleston Ben Huddleston