  1. Couchbase Server
  2. MB-58194

[CBM] --resume loses data when backing up to S3 with less than 5MB completed in a vBucket

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 7.2.1
    • master, 7.1.0, 7.2.0
    • tools
    • Untriaged
    • 0
    • Yes

    Description

      What is the problem?
      On master and 7.1.3 (the latest tools package on our website), I noticed that if I interrupted a backup to S3 with Ctrl-C and then resumed it with --resume, it reported fewer mutations than I expected (both at the end of the backup and via the info subcommand). After downloading the backup from S3, I found that the data and index files for some vBuckets were missing.

      James Lee informed me that when doing a --resume with S3:

      If you have less than 5MB of data for each vBucket - which will be likely for the sample buckets etc - then it'll restart from zero

      Looking at the code, it appears the threshold is 5MB uploaded so far for a vBucket, rather than the vBucket's total size being less than 5MB.
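
      The distinction can be sketched as follows. This is a minimal illustration, not the cbbackupmgr code: `decideResume`, `resumeAction` and the `uploadedBytes` parameter are hypothetical names, and tying the threshold to S3's 5 MiB minimum multipart part size is an assumption.

```go
package main

import "fmt"

// minPartSize is the S3 minimum size for every multipart part except the
// last one (5 MiB). Assumption: this is where the 5MB threshold comes from.
const minPartSize = 5 * 1024 * 1024

// resumeAction is a hypothetical type describing what a resume does with
// a single vBucket.
type resumeAction int

const (
	continueUpload resumeAction = iota
	restartFromZero
)

// decideResume looks at how many bytes were uploaded so far for a vBucket,
// not at the vBucket's total size. Below the threshold no usable part
// exists, so the vBucket has to be streamed again from the start.
func decideResume(uploadedBytes int64) resumeAction {
	if uploadedBytes < minPartSize {
		return restartFromZero
	}
	return continueUpload
}

func main() {
	// A large vBucket interrupted after 3 MiB still restarts from zero.
	fmt.Println(decideResume(3<<20) == restartFromZero)
	// A vBucket with 12 MiB already uploaded can continue.
	fmt.Println(decideResume(12<<20) == continueUpload)
}
```

      Under this reading, a vBucket of any size restarts from zero if the interruption happens before its first 5MB has been uploaded.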

      Looking in the logs I could see this:

      Aborting 764 invalid multipart uploads for vBuckets '[ <snip> ]'
      

      and these were the vBuckets for which the files were missing. It appears that when we abort multipart uploads, we do not correctly restream the affected vBuckets.
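
      The expected invariant is that every vBucket whose multipart upload is aborted also ends up in the set of vBuckets to restream. A rough sketch of such a check is below; `missingRestreams` and both slices are hypothetical and only illustrate the invariant, they are not taken from the tools code.

```go
package main

import (
	"fmt"
	"sort"
)

// missingRestreams returns the vBucket IDs whose multipart uploads were
// aborted but which are absent from the restream plan. A non-empty result
// corresponds to the data loss described in this issue.
func missingRestreams(aborted, restream []uint16) []uint16 {
	planned := make(map[uint16]struct{}, len(restream))
	for _, vb := range restream {
		planned[vb] = struct{}{}
	}

	var missing []uint16
	for _, vb := range aborted {
		if _, ok := planned[vb]; !ok {
			missing = append(missing, vb)
		}
	}

	// Sort for stable, readable output.
	sort.Slice(missing, func(i, j int) bool { return missing[i] < missing[j] })
	return missing
}

func main() {
	aborted := []uint16{512, 3, 17} // uploads aborted on resume
	restream := []uint16{17}        // vBuckets actually streamed again
	fmt.Println(missingRestreams(aborted, restream))
}
```

      In a correct resume the returned slice would be empty.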

      Customer impact
      If a customer has a backup to S3 (and only S3, not Azure/GCP) interrupted and then resumes it, the backup may report that it completed successfully while actually missing some vBuckets. This will definitely happen if each vBucket contains less than 5MB of data. It can also happen with larger vBuckets, but only if the interruption occurs before the first 5MB of a vBucket's data has finished uploading (i.e. it is fairly unlikely).

            People

              gilad.kalchheim Gilad Kalchheim
              Matt.Hall Matt Hall