  1. Couchbase Server
  2. MB-12270

Cbbackup 2.5.1 Run Directly Against File System May Fail w/OS Error 2 If Files Are Concurrently Compacted


Details

    • Bug
    • Resolution: Fixed
    • Major
    • 4.0.0
    • 2.5.1
    • tools
    • Security Level: Public
    • None
    • Untriaged
    • Centos 64-bit
    • No

    Description

      The customer runs cbbackup directly against the bucket data files rather than through TAP vBucket dumps, because TAP-based backups significantly change bucket residency percentages and degrade application performance. The general form of the command being used is:
      cbbackup couchstore-files:///<bucket data path> <backup_dir> -u <admin> -p <password>

      The customer is performing these backups against very large nodes and reports that cbbackup invariably fails with an OS error 2 exception (file not found). Further investigation shows that the missing file is present when cbbackup begins but is compacted at some point while the backup is running, so a different version of the data file is in place by the time cbbackup attempts to back it up. Attached are a listing of the directory as it exists when cbbackup starts, the error message, and a second listing of the directory taken after the missing-file exception was trapped. The listings show that the missing file "166.couch.2029" exists when cbbackup starts but has been replaced by file "166.couch.2030" by the time cbbackup tries to read it.

      While an apparent workaround might be to suspend compaction while they are doing backups, that might result in other performance problems and also cause file system exhaustion. So we would like to see if there can be a code fix for this.
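
      As an illustration only, one possible shape for such a fix is to resolve the data file name at open time instead of relying on the directory listing taken when the backup starts, and to rescan if the open fails with error 2. The sketch below assumes the `<vbucket id>.couch.<revision>` naming seen in the attached listings; the function names `latest_revision` and `open_vbucket_file` are hypothetical and are not part of cbbackup.

```python
import errno
import glob
import os
import re

# Assumed naming, taken from the listings in this ticket: <vbid>.couch.<rev>
_COUCH_RE = re.compile(r"^(\d+)\.couch\.(\d+)$")


def latest_revision(bucket_dir, vbucket_id):
    """Return the path of the highest-revision data file for a vBucket, or None."""
    best_rev, best_path = -1, None
    pattern = os.path.join(bucket_dir, "%d.couch.*" % vbucket_id)
    for path in glob.glob(pattern):
        m = _COUCH_RE.match(os.path.basename(path))
        # The regex skips names with extra suffixes (e.g. temporary files).
        if m and int(m.group(2)) > best_rev:
            best_rev, best_path = int(m.group(2)), path
    return best_path


def open_vbucket_file(bucket_dir, vbucket_id, max_attempts=5):
    """Open the current data file, rescanning if it disappears before the open.

    Hypothetical helper: if compaction replaces the file between the scan
    and the open, the ENOENT error is caught and the directory is rescanned.
    """
    for _ in range(max_attempts):
        path = latest_revision(bucket_dir, vbucket_id)
        if path is None:
            break
        try:
            return open(path, "rb")
        except (IOError, OSError) as e:
            if e.errno != errno.ENOENT:
                raise  # only "file not found" is treated as a compaction race
    raise IOError(errno.ENOENT,
                  "no data file found for vbucket %d" % vbucket_id)
```

      Once the file is open, a POSIX file system keeps the data readable through the open handle even if compaction later unlinks the old revision, so the race window is limited to the interval between the scan and the open.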

      Attached: directory listings taken before and after the OS error 2 (file not found) exception, showing the changed file revision, and the error message itself.

      Attachments

        1. cbbackup_error.out.txt
          5 kB
        2. ls_begin.out.txt
          4 kB
        3. ls_end.out.txt
          4 kB


          People

            bcui Bin Cui (Inactive)
            morrie Morrie Schreibman (Inactive)
            Votes: 0
            Watchers: 4


              Time Tracking

                Original Estimate: 40h
                Remaining Estimate: 40h
                Time Spent: Not Specified
