Couchbase Server / MB-43247

cbbackupmgr TEMP_OOM failures should be accounted for in retries

Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major
    • None
    • Affects versions: 6.5.0, 6.5.1, 6.6.0, 6.6.1, Cheshire-Cat
    • Component: tools
    • None
    • Untriaged
    • 1
    • No

    Description

      What's the issue?
      We've seen a couple of cases where restores "hang" indefinitely when the target bucket has a very low residency ratio. I suspect this is because 'TEMP_OOM' errors are retried indefinitely, i.e. retries are never exhausted, so no error is ever returned.

      Is there a workaround?
      Yes, restore to an adequately provisioned cluster.

      What's the fix?
      We should count 'TEMP_OOM' failures against a retry limit during a restore. They should be handled somewhat specially, though: we should allow more retries for 'TEMP_OOM' failures than for other errors because, in most scenarios, data will be evicted, allowing further progress.

            People

              james.lee James Lee