Couchbase Server / MB-27459

cbbackup can corrupt documents with xattrs

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • 5.1.0
    • 5.0.0, 5.0.1
    • tools
    • Untriaged
    • Unknown

    Description

      When performing a backup with cbbackup, we stream the documents including their xattrs (because, of course, these need backing up too). However, we store these directly in the val column of the SQLite cbb_msg table.

      In some cases, the value seems to be stored in roughly this format:

      NULL  ...  NULL  XATTR_NAME  NULL  {XATTR}  NULL  {BODY}
      

      This seems in line with the binary protocol format.

      In fact, when stored in this way, the restore seems to work fine for some keys (more on these below). There are cases, though, where the padding seems different, which leads to the document being restored as binary.
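
For illustration, the layout sketched above can be split back into xattrs and body with a few lines of Python. This is a minimal sketch, assuming the value follows the memcached xattr encoding: a 4-byte big-endian length for the whole xattr section, then for each xattr a 4-byte length followed by key, NUL, value, NUL, with the document body after the section. The function names are hypothetical and this is not the code cbbackup or cbrestore actually uses.

```python
import struct


def build_xattr_value(xattrs, body):
    """Encode {name: value_bytes} plus body using the assumed xattr layout."""
    section = b""
    for name, value in xattrs.items():
        pair = name.encode() + b"\x00" + value + b"\x00"
        section += struct.pack(">I", len(pair)) + pair
    # The leading length counts only the xattr section, not itself or the body.
    return struct.pack(">I", len(section)) + section + body


def split_xattr_value(blob):
    """Split an xattr-datatype value into ({name: value_bytes}, body)."""
    (section_len,) = struct.unpack_from(">I", blob, 0)
    end = 4 + section_len
    if end > len(blob):
        raise ValueError("xattr section length exceeds value size")
    xattrs = {}
    pos = 4
    while pos < end:
        (pair_len,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        pair = blob[pos:pos + pair_len]
        name, sep, rest = pair.partition(b"\x00")
        if not sep or not rest.endswith(b"\x00"):
            raise ValueError("malformed xattr pair")
        xattrs[name.decode()] = rest[:-1]  # drop the trailing NUL
        pos += pair_len
    return xattrs, blob[end:]
```

Round-tripping a value through `build_xattr_value` and `split_xattr_value` should return the original xattrs and body. If the length prefixes are altered on the way into or out of the backup file, the split fails or yields the wrong body, which would be consistent with documents coming back as binary.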

      Steps to reproduce

      • Create a cluster with two buckets, and load one with data that includes xattrs (I've used Sync Gateway, SG, as a real-world example here).
      • Picking a key (key_123), we can see that it has both a body and an xattr:

        subdoc> get key_123 -x _sync
        key_123              CAS=0x1507d50c13e60000
        0. Size=327, RC=0x00 Success (Not an error)
        {"rev":"2-7a40fa2b51531d65895b0f64d652bd20","sequence":391,"recent_sequences":[125,391],"history":{"revs":["1-ca9ad22802b66f662ff171f226211d5c","2-7a40fa2b51531d65895b0f64d652bd20"],"parents":[-1,0],"channels":[null,["04070"]]},"channels":{"04070":null},"cas":"0x0000e6130cd50715","time_saved":"2018-01-08T12:20:48.819534016Z"}
        1. Size=22, RC=0x00 Success (Not an error)
        {"channels":["04070"]}
        

      • Perform a backup of the bucket:

        $ /opt/couchbase/bin/cbbackup http://localhost:8091/ /vagrant/backup -b sg1 -u Administrator -p password
        

      • Restore the backup to the second bucket:

        $ /opt/couchbase/bin/cbrestore /vagrant/backup/2018-01-08T122403Z/2018-01-08T122403Z-full/ http://localhost:8091/ -b sg1 -B sg2 -u Administrator -p password
        

      • Verify the document in the second bucket. Note that the xattr (_sync) no longer exists, and that the body now starts with a spurious RN:

        subdoc> get key_123 -x _sync
        key_123              CAS=0x1507d50c13e60000
        0. Size=0, RC=0x3f Sub-document path does not exist
        1. Size=364, RC=0x00 Success (Not an error)
        RN_sync{"rev":"2-7a40fa2b51531d65895b0f64d652bd20","sequence":391,"recent_sequences":[125,391],"history":{"revs":["1-ca9ad22802b66f662ff171f226211d5c","2-7a40fa2b51531d65895b0f64d652bd20"],"parents":[-1,0],"channels":[null,["04070"]]},"channels":{"04070":null},"cas":"0x0000e6130cd50715","time_saved":"2018-01-08T12:20:48.819534016Z"}{"channels":["04070"]}
        

        This is perhaps seen more clearly by comparing the couch_dbdump output for both buckets, which also shows that the restored document now has the raw datatype:

        # /opt/couchbase/bin/couch_dbdump --key key_123 /opt/couchbase/var/lib/couchbase/data/sg1/65.couch.1
        Dumping "/opt/couchbase/var/lib/couchbase/data/sg1/65.couch.1":
          Doc ID: key_123
             seq: 2
             rev: 2
             content_meta: 128
             size (on disk): 312
             cas: 1515414047483625472, expiry: 0, flags: 0, datatype: 0x05 (json,xattr)
             size: 364
             xattrs: {"_sync":{"rev":"2-7a40fa2b51531d65895b0f64d652bd20","sequence":391,"recent_sequences":[125,391],"history":{"revs":["1-ca9ad22802b66f662ff171f226211d5c","2-7a40fa2b51531d65895b0f64d652bd20"],"parents":[-1,0],"channels":[null,["04070"]]},"channels":{"04070":null},"cas":"0x0000e6130cd50715","time_saved":"2018-01-08T12:20:48.819534016Z"}}
             data: (snappy) {"channels":["04070"]}
        

        # /opt/couchbase/bin/couch_dbdump --key key_123 /opt/couchbase/var/lib/couchbase/data/sg2/65.couch.1
        Dumping "/opt/couchbase/var/lib/couchbase/data/sg2/65.couch.1":
          Doc ID: key_123
             seq: 1
             rev: 2
             content_meta: 131
             size (on disk): 312
             cas: 1515414047483625472, expiry: 0, flags: 0, datatype: 0x00 (raw)
             size: 364
             data: (snappy)
         
        Total docs: 1
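
The sizes reported above are consistent with the restored body being the whole xattr-encoded value stored as plain data. The following sketch works through the arithmetic, assuming the memcached xattr layout (4-byte big-endian section length, then a 4-byte length per key/value pair, with NUL terminators after key and value). It is an illustration of the numbers in this report, not a confirmed root cause.

```python
import struct

KEY = b"_sync"
XATTR_SIZE = 327  # size of the _sync value reported by the subdoc get on sg1
BODY_SIZE = 22    # size of the document body reported by the subdoc get on sg1

pair_len = len(KEY) + 1 + XATTR_SIZE + 1   # key, NUL, value, NUL
section_len = 4 + pair_len                 # pair length prefix + pair
total_size = 4 + section_len + BODY_SIZE   # section length prefix + section + body

section_prefix = struct.pack(">I", section_len)
pair_prefix = struct.pack(">I", pair_len)

print(pair_len, section_len, total_size)   # 334 338 364
print(section_prefix, pair_prefix)         # b'\x00\x00\x01R' b'\x00\x00\x01N'
```

The total of 364 matches the size of the restored value, and the last byte of each length prefix is printable: 0x52 is `R` and 0x4E is `N`. With the unprintable bytes dropped from the terminal output, that gives the `RN_sync{...}{...}` string seen on sg2.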
        


      Interestingly, as mentioned, this doesn't seem to affect all documents:

      # /opt/couchbase/bin/couch_dbdump --no-body /opt/couchbase/var/lib/couchbase/data/sg1/* | grep 'datatype:.*' -o | sort | uniq -c
      Failed to open "/opt/couchbase/var/lib/couchbase/data/sg1/stats.json": malformed data in file
      Failed to open "/opt/couchbase/var/lib/couchbase/data/sg1/stats.json.old": malformed data in file
          267 datatype: 0x00 (raw)
          829 datatype: 0x01 (json)
         2048 datatype: 0x05 (json,xattr)
       
      # /opt/couchbase/bin/couch_dbdump --no-body /opt/couchbase/var/lib/couchbase/data/sg2/* | grep 'datatype:.*' -o | sort | uniq -c
      Failed to open "/opt/couchbase/var/lib/couchbase/data/sg2/stats.json": malformed data in file
      Failed to open "/opt/couchbase/var/lib/couchbase/data/sg2/stats.json.old": malformed data in file
         1639 datatype: 0x00 (raw)
          830 datatype: 0x01 (json)
          675 datatype: 0x05 (json,xattr)
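
To make the comparison explicit, the two tallies can be diffed. The sketch below only restates the counts printed above.

```python
from collections import Counter

# Datatype counts copied from the couch_dbdump output for each bucket.
sg1 = Counter({"raw": 267, "json": 829, "json,xattr": 2048})
sg2 = Counter({"raw": 1639, "json": 830, "json,xattr": 675})

delta = {datatype: sg2[datatype] - sg1[datatype] for datatype in sg1}

print(sum(sg1.values()), sum(sg2.values()))  # 3144 3144
print(delta)  # {'raw': 1372, 'json': 1, 'json,xattr': -1373}
```

Both buckets hold 3144 documents, so nothing was dropped. Of the 2048 documents with xattrs, 1373 lost the xattr datatype after the restore: 1372 became raw and 1 became json.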
      

      Taking one such example, key_1236, we can see that it's restored correctly:

      # /opt/couchbase/bin/couch_dbdump --key key_1236 /opt/couchbase/var/lib/couchbase/data/sg2/1002.couch.1
      Dumping "/opt/couchbase/var/lib/couchbase/data/sg2/1002.couch.1":
        Doc ID: key_1236
           seq: 2
           rev: 1
           content_meta: 128
           size (on disk): 278
           cas: 1515414065754734592, expiry: 0, flags: 0, datatype: 0x05 (json,xattr)
           size: 318
           xattrs: {"_sync":{"rev":"1-7e5eb5682c9532f9907c8255e725cb54","sequence":1504,"recent_sequences":[1504],"history":{"revs":["1-7e5eb5682c9532f9907c8255e725cb54"],"parents":[-1],"channels":[["15966"]]},"channels":{"15966":null},"cas":"0x0000f15410d50715","time_saved":"2018-01-08T12:21:07.080455581Z"}}
           data: (snappy) {"channels":["15966"]}
       
      Total docs: 1
      

      Inspecting these keys in sqlitebrowser shows that the difference is the padding:

      I've also attached a repro backup: backup.zip
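
The same inspection can be scripted instead of using sqlitebrowser. This is a minimal sketch: the cbb_msg table and its val column come from the description above, while the key column name and the helper function are assumptions made for illustration.

```python
import sqlite3


def val_prefix(conn, key, nbytes=16):
    """Return the first nbytes of val for key as spaced hex, or None if absent.

    Assumes cbb_msg has a `key` column; the key is matched both as text and
    as a blob because the storage class used by the backup is not known here.
    """
    row = conn.execute(
        "SELECT val FROM cbb_msg WHERE key = ? OR key = ? LIMIT 1",
        (key, key.encode()),
    ).fetchone()
    if row is None:
        return None
    val = row[0]
    if isinstance(val, str):
        val = val.encode()
    return bytes(val[:nbytes]).hex(" ")


# Usage (path is a placeholder for a SQLite file inside the backup directory):
#   conn = sqlite3.connect("/path/to/backup/file")
#   print(val_prefix(conn, "key_123"))
#   print(val_prefix(conn, "key_1236"))
```

Comparing the leading bytes of a correctly restored key with those of a corrupted one should show where the padding differs.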


            People

              Arunkumar Senthilnathan (Inactive)
              James Flather (Inactive)