  1. Couchbase Server
  2. MB-60630

[Backup] Restore fails on locked GCP bucket due to constant retries on 403 response


Details

    • Untriaged
    • 0
    • Unknown
    • Tools 2024-Q1

    Description

      Problem

      In CBSE-16251, a customer moved a cloud backup archive from a normal GCP bucket to a locked GCP bucket (which allows creating new files, but prohibits modifying or deleting pre-existing objects), and then tried to restore from that bucket. Instead of failing, which would be the expected behaviour, the restore hung.

      All the data and services were successfully restored, but at the end of the restore, the sync archive step failed, as it requires updating the log file, which can't be done with a locked bucket.

      GCP correctly returns a 403 when cbbackupmgr tries to modify/delete an object in a locked bucket. The problem occurs because cbbackupmgr always retries on 403s (here and here) when using a GCP client, since it considers them intermittent.

      Fix

      Regardless of what error code is returned, we shouldn't retry indefinitely. Ideally we would configure the maximum number of retries on a per-operation level, but this is not possible. The GCP docs recommend using context timeouts instead, which would require changes to the tools-common cloud package, and backup is already a version behind.


            People

              safian.ali Safian Ali
              safian.ali Safian Ali
