Couchbase Server / MB-59804

[Guardrails] cbbackupmgr restore exits abruptly without a proper specific error message when the RR guardrail threshold has been breached


Details

    • Bug
    • Resolution: Won't Do
    • Major
    • None
    • 7.6.0
    • tools
    • Couchbase server version - 7.6.0-1813 (provisioned cluster profile)
    • Untriaged
    • Linux x86_64
    • 0
    • Unknown

    Description

      Steps:

      1. Initialise a 2 node 7.6.0 cluster running just the KV service.
      2. Create a Magma bucket named 'default' with 512MB RAM per node and 0 replicas.
      3. Configure magmaMinimum=50 in http://172.23.216.45:8091/settings/resourceManagement
      4. Load data into the bucket until the resident ratio guardrail has been breached.
        The resident ratio of the bucket on the 2 nodes after data loading is [44.52, 44.18], which is below the configured minimum (50%). KV mutations are now blocked.
      5. Configure a backup archive at /couchbase_data/backups and create a repository called 'example'.
      6. Back up the bucket data into the configured archive.
      7. Delete the existing bucket and create a new bucket with the same name and config as before.
      8. Restore data into the new bucket. The restore started but exited before completion with the error message 'Error restoring cluster: unknown kv status code (54)'.

      root@sd1606-deb10:/opt/couchbase/bin# ./cbbackupmgr restore -a /couchbase_data/backups -r example --cluster couchbase://172.23.216.45 --username Administrator --password password
      Restoring backup '2023-11-23T03_36_01.867066243-08_00'
      Transferring key value data for bucket 'default' at 6.25MiB/s (about 1m45s remaining)                                                                                  1888821 items / 1.65GiB
      [=================================================================================================================================X...................................................] 71.95%
      Error restoring cluster: unknown kv status code (54)
       
       
      | Transfer
      | --------
      | Status | Avg Transfer Rate | Started At                      | Finished At                     | Duration |
      | Failed | 6.27MiB/s         | Thu, 23 Nov 2023 04:02:52 -0800 | Thu, 23 Nov 2023 04:07:22 -0800 | 4m30s    |
       
       
      | Bucket
      | ------
      | Name    | Status | Transferred | Avg Transfer Rate | Started At                      | Finished At                     | Duration |
      | default | Failed | 1.65GiB     | 6.29MiB/s         | Thu, 23 Nov 2023 04:02:52 -0800 | Thu, 23 Nov 2023 04:07:22 -0800 | 4m29s    |
      |
      | Mutations                    | Deletions                    | Expirations                  |
      | ---------                    | ---------                    | -----------                  |
      | Received | Errored | Skipped | Received | Errored | Skipped | Received | Errored | Skipped |
      | 1888798  | 24      | 0       | 0        | 0       | 0       | 0        | 0       | 0       | 
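
      For reference, the guardrail condition from step 4 can be sketched as a simple comparison. This is only an illustration using the numbers in this report: the function name is hypothetical, and the strict "less than" comparison applied per node is an assumption, not the server's confirmed logic.

```python
def nodes_below_minimum(resident_ratios, minimum_percent):
    """Return (node_index, ratio) pairs for nodes whose resident ratio
    is below the configured minimum.

    ASSUMPTION: the guardrail uses a strict '<' comparison per node.
    """
    return [
        (index, ratio)
        for index, ratio in enumerate(resident_ratios)
        if ratio < minimum_percent
    ]


# Values taken from the reproduction steps in this report.
observed_ratios = [44.52, 44.18]
magma_minimum = 50

print(nodes_below_minimum(observed_ratios, magma_minimum))
# → [(0, 44.52), (1, 44.18)]
```

      Both nodes fall below the 50% minimum, which is consistent with KV mutations being rejected partway through the restore.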

      1. A specific error message indicating that the restore could not complete because the resident ratio guardrail threshold was breached would be helpful.
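
      A minimal sketch of the kind of message being requested is shown below. It is not cbbackupmgr code: the table and function names are hypothetical, and the mapping of status 54 to the resident ratio guardrail is an assumption based on the reproduction above rather than something this report confirms. Unrecognised codes fall back to the wording of the current error.

```python
# Hypothetical lookup table of KV status codes that deserve a friendlier message.
# ASSUMPTION: 54 is the status returned when the resident ratio guardrail
# rejects a mutation; this is inferred from the reproduction, not confirmed.
GUARDRAIL_STATUS_MESSAGES = {
    54: (
        "bucket resident ratio is below the configured guardrail minimum; "
        "mutations are blocked until the resident ratio recovers"
    ),
}


def describe_kv_status(code: int) -> str:
    """Return a descriptive message for a KV status code.

    Codes without an entry fall back to the generic wording seen in the report.
    """
    message = GUARDRAIL_STATUS_MESSAGES.get(code)
    if message is None:
        return f"unknown kv status code ({code})"
    return f"{message} (kv status code {code})"


print("Error restoring cluster: " + describe_kv_status(54))
```

      With a mapping like this, the user would see the cause of the failure directly instead of having to look up the numeric status code.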

            People

              vibhav.sp Vibhav S P