Couchbase Server / MB-42352

delete_with_meta incompatibility issues between 6.6.0/6.5.1 and older versions when using user xattrs

    Details

    • Triage:
      Untriaged
    • Story Points:
      1
    • Is this a Regression?:
      Unknown

      Description

      TL;DR:

      In some versions of Couchbase Server, cbbackupmgr will back up tombstones that contain a body. Restoring these tombstones can fail, depending on the versions involved:

      Affected versions
      The version of cbbackupmgr does not matter. What matters is the version of the cluster that was backed up and the version of the cluster being restored to.

      Backup (rows) \ Restore (columns) versions   <=6.0.4   6.5.0   6.5.1   6.6.0
      <=6.0.4                                                                *
      6.5.0                                        N/A                       *
      6.5.1                                        N/A       N/A             *
      6.6.0                                        N/A       N/A     N/A     *

      * When restoring to Couchbase Server 6.6.0 there is a workaround, which is explained below.


      Problem
      cbbackupmgr restore fails against Couchbase Server 6.5.x and 6.6.0 because the delete_with_meta is rejected as invalid. On previous versions of Couchbase Server the delete_with_meta is accepted.
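
      The behaviour reported in this ticket can be summarised as a small lookup. This is only a sketch of the observations recorded here, not of the server logic: the function names and result strings are hypothetical, and any version or combination not tested in this report is returned as "unknown".

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a version string such as '6.5.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def expected_restore_result(target_version: str, prune_user_data: bool = False) -> str:
    """Outcome reported in this ticket when a tombstone with a body is restored.

    prune_user_data models allow_del_with_meta_prune_user_data on the target.
    """
    target = parse_version(target_version)
    if target < (6, 5, 0):
        # 6.0.x and lower accept the delete_with_meta.
        return "accepted"
    if target[:2] == (6, 5):
        if not prune_user_data:
            return "rejected"
        # Only 6.5.1 was tried with the setting enabled, and it hung.
        return "hangs" if target == (6, 5, 1) else "unknown"
    if target == (6, 6, 0):
        return "accepted" if prune_user_data else "rejected"
    return "unknown"


print(expected_restore_result("6.6.0"))
print(expected_restore_result("6.6.0", prune_user_data=True))
```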

      Notes

      I can see the point that delete_with_meta should never take a body. Unfortunately, DCP on previous versions of Couchbase Server does provide tombstones with bodies. cbbackupmgr backs up everything it is given, and on restore it gives everything back. This also affects XDCR from Couchbase Server 6.0.4 to 6.6.0, meaning two of our upgrade processes will not work for 6.6.0.

      Interestingly on 6.6.0 the following config is set:

      6.6.0

      # /opt/couchbase/bin/cbstats localhost config -b test -u Administrator -p password | grep allow_del
      ep_allow_del_with_meta_prune_user_data:                false
      

      There does not seem to be a way to see what this is set to on older versions.

      Steps to reproduce
      1. Create a document with user xattrs and a 10 second TTL on Couchbase Server 6.0.3

       /opt/couchbase/bin/cbc-subdoc -U couchbase://localhost/test -u Administrator -P password
       subdoc> set test value -x xattr=100 -e 10
      

      2. Wait 10 seconds
      3. Take a backup

      /opt/couchbase/bin/cbbackupmgr config -a backup -r zombie
      /opt/couchbase/bin/cbbackupmgr backup -a backup -r zombie  -c 10.112.194.101 -u Administrator -p password
      

      4. From the examine command we can see that the tombstone contains the xattr and a value:

      /opt/couchbase/bin/cbbackupmgr examine -a backup -r zombie   --bucket test --tombstones -k test
      Key: test
        SeqNo: 2
        Backup: 2020-10-29T15_05_45.745791997Z
        Deleted: true
        Size: 4B (key), 29B (meta), 3B (value)
        Meta: {"cas":1603983924472709120,"revseqno":2,"datatype":6}
        Xattrs: {"xattr":100}
        Value: value
      

      5. Restore the backup to Couchbase Server 6.6.0

       /opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie  -c localhost -u Administrator -p password
      (1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
      Transferring key value data for 'test' at 0B/s (about 0s remaining)                                                                                                                                                                                                     2 items / 28.20KB
      [===============================================================================================================================================================================================================================================================================] 100.00%
      Error restoring cluster: invalid argument
      Restore bucket 'test' failed
      Mutations restored: 0, Mutations failed to restore: 0
      Deletions restored: 0, Deletions failed to restore: 1
      Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
      

      memcached.log

      2020-10-29T15:28:25.674599+00:00 INFO 44: HELO [{"a":"gocbcore/v9.0.4 cbbackupmgr-Unknown-5e545ae","i":"97f4cdb1b440b26d/f0ac4f557072da3e"}] XATTR, XERROR, Select bucket, Snappy, AltRequestSupport, SyncReplication, SubdocCreateAsDeleted [ [::1]:57536 - [::1]:11210 (not authenticated) ]
      2020-10-29T15:28:25.684252+00:00 INFO 44: Client [::1]:57536 authenticated as <ud>Administrator</ud>
      

      The restore on 6.5.1 and 6.5.0 also fails:

      6.5.1 Restore

      [root@node1-cb660-centos7 vagrant]# /opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie  -c 10.112.201.101 -u Administrator -p password
      (1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
      Transferring key value data for 'test' at 0B/s (about 0s remaining)                                                        1 items / 28.20KB
      [==================================================================================================================================] 100.00%
      Error restoring cluster: invalid argument
      Restore bucket 'test' failed
      Mutations restored: 0, Mutations failed to restore: 0
      Deletions restored: 0, Deletions failed to restore: 1
      Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
      

      6.5.0 Restore

      # /opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie  -c 10.112.200.101 -u Administrator -p password
      (1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
      Transferring key value data for 'test' at 0B/s (about 0s remaining)                                                        1 items / 28.20KB
      [==================================================================================================================================] 100.00%
      Error restoring cluster: invalid argument
      Restore bucket 'test' failed
      Mutations restored: 0, Mutations failed to restore: 0
      Deletions restored: 0, Deletions failed to restore: 1
      Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
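
      When these restores are run from a test script, the summary lines can be checked instead of relying on the exit status. The following is a minimal sketch, assuming the output format shown in the transcripts above stays the same; the function names are hypothetical and not part of cbbackupmgr.

```python
import re
from typing import Dict

# Matches lines such as "Deletions restored: 0, Deletions failed to restore: 1".
SUMMARY_RE = re.compile(
    r"(Mutations|Deletions) restored: (\d+), \1 failed to restore: (\d+)"
)


def parse_restore_summary(output: str) -> Dict[str, Dict[str, int]]:
    """Extract restored/failed counts per kind from cbbackupmgr restore output."""
    counts: Dict[str, Dict[str, int]] = {}
    for kind, restored, failed in SUMMARY_RE.findall(output):
        counts[kind.lower()] = {"restored": int(restored), "failed": int(failed)}
    return counts


def restore_failed(output: str) -> bool:
    """True if the output reports an error or any item that failed to restore."""
    counts = parse_restore_summary(output)
    return "Error restoring cluster" in output or any(
        c["failed"] > 0 for c in counts.values()
    )


sample = """Error restoring cluster: invalid argument
Restore bucket 'test' failed
Mutations restored: 0, Mutations failed to restore: 0
Deletions restored: 0, Deletions failed to restore: 1
"""
print(parse_restore_summary(sample))
print(restore_failed(sample))
```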
      
      


      Restoring to previous versions of Couchbase Server (6.0.x and lower) works.

      The restore on 6.0.4:

      6.0.4 Restore

      /opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie  -c 10.112.194.101 -u Administrator -p password
      (1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
      Copied all data in 156.041335ms (Avg. 28.20KB/Sec)                                                                         1 items / 28.20KB
      [==================================================================================================================================] 100.00%
      Restore bucket 'test' succeeded
      Mutations restored: 0, Mutations failed to restore: 0
      Deletions restored: 1, Deletions failed to restore: 0
      Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
      Restore completed successfully
      

      The XDCR logs from Couchbase Server 6.0.4, where the target is 6.6.0, show that XDCR is also affected:

      6.0.4 goxdcr.log

      2020-10-29T16:44:48.605Z ERRO GOXDCR.XmemNozzle: xmem_84576cb22c4dd3a9bd354e17b2186eec/test/test_10.112.205.101:11210_0 received error response from setMeta client. Repairing connection. response status=EINVAL, opcode=0xa8, seqno=1, req.Key=<ud>[116 101 115 116]</ud>, req.Cas=0, req.Cas=0, req.Extras=[0 0 0 0 95 154 227 170 0 0 0 0 0 0 0 2 22 66 126 225 139 162 0 0]
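
      The req.Extras bytes in the log line above can be unpacked to see which metadata XDCR sent. This is a sketch that assumes the usual 24-byte with-meta extras layout (flags, expiry, revision seqno and CAS, all big-endian); the layout and the function name are assumptions and are not stated in this ticket.

```python
import struct
from typing import Dict

# req.Extras copied from the goxdcr.log line above.
EXTRAS = bytes([0, 0, 0, 0, 95, 154, 227, 170,
                0, 0, 0, 0, 0, 0, 0, 2,
                22, 66, 126, 225, 139, 162, 0, 0])


def decode_with_meta_extras(extras: bytes) -> Dict[str, int]:
    """Unpack the first 24 bytes: two 32-bit fields, then two 64-bit fields."""
    flags, expiry, rev_seqno, cas = struct.unpack(">IIQQ", extras[:24])
    return {"flags": flags, "expiry": expiry, "rev_seqno": rev_seqno, "cas": cas}


print(decode_with_meta_extras(EXTRAS))
```

      Under that assumption the revision seqno decodes to 2, which matches the "revseqno":2 reported by cbbackupmgr examine for the tombstone.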
      

      Workaround

      There is a workaround, but it only works for 6.6.0:

      cbepctl workaround

      # /opt/couchbase/bin/cbepctl localhost:11210 -b test -u Administrator -p password set flush_param allow_del_with_meta_prune_user_data true
      setting param: allow_del_with_meta_prune_user_data true
      set allow_del_with_meta_prune_user_data to true
      

      This will allow the restore to work:

      6.6.0 restore with workaround

      # /opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie  -c localhost -u Administrator -p password
      (1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
      Copied all data in 90.191456ms (Avg. 28.20KB/Sec)                                                                          1 items / 28.20KB
      [==================================================================================================================================] 100.00%
      Restore bucket 'test' succeeded
      Mutations restored: 0, Mutations failed to restore: 0
      Deletions restored: 1, Deletions failed to restore: 0
      Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
      Restore completed successfully
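
      If the workaround has to be applied to several buckets, the cbepctl invocation shown above can be generated rather than typed. This is a sketch only: build_cbepctl_args is a hypothetical helper, the binary path and port are taken from the command above, and passing "false" to turn the setting back off is an assumption that was not tested in this ticket.

```python
from typing import List


def build_cbepctl_args(host: str, bucket: str, username: str,
                       password: str, enabled: bool) -> List[str]:
    """Build the argv for the cbepctl command used in the workaround."""
    return [
        "/opt/couchbase/bin/cbepctl", f"{host}:11210",
        "-b", bucket,
        "-u", username,
        "-p", password,
        "set", "flush_param",
        "allow_del_with_meta_prune_user_data",
        "true" if enabled else "false",
    ]


args = build_cbepctl_args("localhost", "test", "Administrator", "password", True)
print(" ".join(args))
```

      The resulting list can be handed to subprocess.run(args, check=True) on a node where cbepctl is installed.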
      

      Unfortunately, when the same workaround is tried on 6.5.1, cbbackupmgr hangs and memcached logs an exception.

      6.5.1 restore with workaround hangs

      # /opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie  -c 10.112.201.101 -u Administrator -p password
      (1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
      Transferring key value data for 'test' at 192B/s (about 0s remaining)                                                      1 items / 28.20KB
      [================================================================================================================================== ] 99.81%
      

      memcached exception

      2020-10-29T18:16:55.484527+00:00 ERROR 45: exception occurred in runloop during packet execution. Cookie info: [{"aiostat":"success","connection":"[ 10.112.201.1:51125 - 10.112.201.101:11210 (<ud>Administrator</ud>) ]","engine_storage":"0x0000000000000000","ewouldblock":false,"packet":{"bodylen":57,"cas":0,"datatype":["Snappy","Xattr"],"extlen":30,"key":"<ud>test</ud>","keylen":4,"magic":"ClientRequest","opaque":134217728,"opcode":"DEL_WITH_META","vbucket":127},"refcount":0}] - closing connection ([ 10.112.201.1:51125 - 10.112.201.101:11210 (<ud>Administrator</ud>) ]): Blob::assign failed to inflate.  buffer.size:21 uncompressedLength:0
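
      The "datatype":6 reported by cbbackupmgr examine and the ["Snappy","Xattr"] list in the memcached exception describe the same thing. A minimal sketch of the mapping, assuming the KV protocol datatype bits JSON=0x01, Snappy=0x02 and Xattr=0x04 (the helper name is hypothetical):

```python
from typing import List

# Datatype bit flags; the values are an assumption based on the KV protocol.
DATATYPE_FLAGS = {0x01: "JSON", 0x02: "Snappy", 0x04: "Xattr"}


def decode_datatype(value: int) -> List[str]:
    """Return the names of the datatype bits set in value."""
    return [name for bit, name in DATATYPE_FLAGS.items() if value & bit]


print(decode_datatype(6))  # → ['Snappy', 'Xattr']
```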
      

              People

              Assignee:
              ashwin.govindarajulu Ashwin Govindarajulu
              Reporter:
              pvarley Patrick Varley