Couchbase Server / MB-55633

[CBM] Potential misclassification of incremental backups

Details

    • Untriaged
    • 0
    • Yes

    Description

      TL;DR: As of 7.1.0, the backup type is only set once per backup, so whichever bucket is backed up first (the order is non-deterministic) determines the backup type. If that bucket is classified as a full backup but a later bucket is an incremental, the overall backup is misclassified as a full backup.

      To understand what's going on here, we should first understand how we classify the type of a backup.

      // setBackupType - Uses the given data ranges to determine what type of backup is going to be created, and then updates
      // the backup's '.info' file.
      func (s *Sink) setBackupType(dr value.DataRanges) error {
      	var (
      		err error
      		full = true
      	)
       
      	dr.ForEach(nil, func(_ uint16, vbDR *value.DataRange) error { //nolint:errcheck
      		full = full &&
      			(len(vbDR.Sink.Initial.Snapshots) == 0 &&
      				vbDR.Sink.Initial.HighSeqNo == 0 &&
      				vbDR.Sink.Initial.PurgeSeqNo == 0 &&
      				len(vbDR.Sink.Initial.FailoverLog) == 0)
       
      		return nil
      	})
       
      	if full {
      		err = s.backup.setBackupType(value.FullBackup)
      	} else {
      		err = s.backup.setBackupType(value.IncrementalBackup)
      	}
       
      	// <snip>
      }
      

      We go through the data ranges and determine whether they're all the default ranges (i.e. starting from scratch); this indicates whether we're performing a full backup or an incremental one.
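The check above can be sketched in isolation. This is a minimal model, not the cbbackupmgr code: `initialRange`, `isDefault` and `classify` are hypothetical names, and the field types are assumptions chosen only to mirror the four conditions in the snippet.

```go
package main

import "fmt"

// initialRange is a hypothetical stand-in for the 'Sink.Initial' part of a
// vBucket's data range; the field types are assumptions.
type initialRange struct {
	Snapshots   []uint64
	HighSeqNo   uint64
	PurgeSeqNo  uint64
	FailoverLog []uint64
}

// isDefault reports whether the range is untouched, i.e. nothing has been
// backed up for this vBucket before.
func (r initialRange) isDefault() bool {
	return len(r.Snapshots) == 0 &&
		r.HighSeqNo == 0 &&
		r.PurgeSeqNo == 0 &&
		len(r.FailoverLog) == 0
}

// classify returns "FULL" only if every vBucket starts from a default range;
// a single non-default range makes the bucket an incremental.
func classify(ranges []initialRange) string {
	for _, r := range ranges {
		if !r.isDefault() {
			return "INCR"
		}
	}

	return "FULL"
}

func main() {
	fresh := []initialRange{{}, {}}
	resumed := []initialRange{{}, {HighSeqNo: 42}}

	fmt.Println(classify(fresh))   // FULL
	fmt.Println(classify(resumed)) // INCR
}
```

Note that the classification is per bucket: it only looks at the data ranges it is handed, so the overall backup type depends on how the per-bucket results are combined.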

      // setBackupType - Sets the 'type' field in the backup's '.info' file. If the field is already set to 'INCR' and the
      // requested type is 'FULL' this function will return <nil> but will not update the backup type.
      func (bd *backupDir) setBackupType(backupType string) error {
      	// This function will be called for each bucket in the backup. If any of the buckets in the backup are incrementals
      	// the whole backup is classed as an incremental.
      	if bd.bInfo.Type == value.IncrementalBackup && backupType == value.FullBackup {
      		return nil
      	}
       
      	// <snip>
      }
      

      In MB-37023, we added a clause to the repo-level setBackupType function which causes the backup-level setBackupType function to be skipped once it has run.

      func (s *Sink) setBackupType(dr value.DataRanges) error {
      	// We've already set the backup type, exit early
      	if s.backupTypeSet {
      		return nil
      	}
       
      	// <snip>
       
      	s.backupTypeSet = true
       
      	return nil
      }
      

      This was an oversight introduced as part of the range refactor. Initially, the data ranges were persisted periodically using the PersistDataRanges function, which would have caused setBackupType to run far too often.

      Later, though, persistence was changed to a vBucket-level function (PersistVBDataRange), rendering the perceived optimisation void and leaving this bug behind.
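To make the failure mode concrete, here is a small, simplified model of the two setBackupType layers. It is a sketch under assumptions rather than the real implementation: `backupInfo`, `sink`, `run` and the `earlyExit` switch are hypothetical, and it assumes a single sink whose "already set" flag survives across all buckets in the backup.

```go
package main

import "fmt"

const (
	full = "FULL"
	incr = "INCR"
)

// backupInfo is a hypothetical stand-in for the backup's '.info' file.
type backupInfo struct{ Type string }

// setType mirrors the backup-level rule: once 'INCR' has been recorded, a
// later 'FULL' must not overwrite it.
func (b *backupInfo) setType(t string) {
	if b.Type == incr && t == full {
		return
	}

	b.Type = t
}

// sink is a hypothetical stand-in for the repo-level sink; it is assumed to
// be shared by every bucket in the backup.
type sink struct {
	info      backupInfo
	typeSet   bool
	earlyExit bool // true emulates the early-exit clause described above
}

// setBackupType records one bucket's type, unless the early exit skips it.
func (s *sink) setBackupType(bucketType string) {
	if s.earlyExit && s.typeSet {
		return
	}

	s.info.setType(bucketType)
	s.typeSet = true
}

// run feeds the per-bucket types in the given order and returns the overall
// backup type that ends up recorded.
func run(earlyExit bool, bucketTypes ...string) string {
	s := &sink{earlyExit: earlyExit}

	for _, t := range bucketTypes {
		s.setBackupType(t)
	}

	return s.info.Type
}

func main() {
	fmt.Println(run(true, full, incr))  // FULL (misclassified)
	fmt.Println(run(true, incr, full))  // INCR
	fmt.Println(run(false, full, incr)) // INCR
	fmt.Println(run(false, incr, full)) // INCR
}
```

With the early exit enabled, the result depends on which bucket happens to be processed first; without it, the backup-level rule makes the outcome independent of bucket order.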

          People

            gilad.kalchheim Gilad Kalchheim
            james.lee James Lee