Details
Bug
Resolution: Done
Critical
6.5.1, 6.6.0, 6.5.0
Untriaged
1
Unknown
Description
TL;DR:
In some versions of Couchbase Server, cbbackupmgr will back up tombstones that contain a body. Restoring these tombstones can fail.
Affected versions
The version of cbbackupmgr does not matter. What matters is the version of the cluster that was backed up and the version of the cluster being restored to.
Backup version (rows) \ Restore version (columns) | <=6.0.4 | 6.5.0 | 6.5.1 | 6.6.0 |
---|---|---|---|---|
<=6.0.4 | * | | | |
6.5.0 | N/A | * | | |
6.5.1 | N/A | N/A | * | |
6.6.0 | N/A | N/A | N/A | * |
* When restoring to Couchbase Server 6.6.0 there is a workaround, which is explained below.
—
Problem
cbbackupmgr restore fails against Couchbase Server 6.5.x and 6.6.0 because the delete_with_meta request is rejected as invalid. Previous versions of Couchbase Server accept the delete_with_meta.
Notes
I can see the point that delete_with_meta should never take a body; unfortunately, DCP on previous versions of Couchbase Server does provide tombstones with bodies. cbbackupmgr backs up everything it is given, and on restore it gives everything back. This also affects XDCR from Couchbase Server 6.0.4 to 6.6.0, meaning two of our upgrade processes will not work for 6.6.0.
Interestingly on 6.6.0 the following config is set:
6.6.0:
# /opt/couchbase/bin/cbstats localhost config -b test -u Administrator -p password | grep allow_del
ep_allow_del_with_meta_prune_user_data: false
There does not seem to be a way to see what this is set to on older versions.
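As a rough illustration, the cbstats output above could be checked programmatically. This is only a sketch: the function names are hypothetical, and it assumes the output consists of `key: value` lines like the one shown. It returns `None` when the stat is not reported at all, which is what the note about older versions suggests would happen there.

```python
from typing import Dict, Optional

# Stat name as printed by cbstats on 6.6.0 in this report.
FLAG = "ep_allow_del_with_meta_prune_user_data"


def parse_cbstats_config(output: str) -> Dict[str, str]:
    """Parse 'key: value' lines from captured cbstats output into a dict."""
    stats: Dict[str, str] = {}
    for line in output.splitlines():
        # Split on the first colon only, so values containing ':' survive.
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def prune_user_data_allowed(output: str) -> Optional[bool]:
    """Return the flag as a bool, or None if the stat is not reported."""
    value = parse_cbstats_config(output).get(FLAG)
    if value is None:
        return None
    return value.lower() == "true"


if __name__ == "__main__":
    sample = " ep_allow_del_with_meta_prune_user_data: false"
    print(prune_user_data_allowed(sample))
```

The output text would have to be captured separately, for example by running the cbstats command shown above without the grep.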
Steps to reproduce
1. Create a document with user xattrs and a 10 second TTL on Couchbase Server 6.0.3
/opt/couchbase/bin/cbc-subdoc -U couchbase://localhost/test -u Administrator -P password
subdoc> set test value -x xattr=100 -e 10
2. Wait 10 seconds
3. Take a backup
/opt/couchbase/bin/cbbackupmgr config -a backup -r zombie
/opt/couchbase/bin/cbbackupmgr backup -a backup -r zombie -c 10.112.194.101 -u Administrator -p password
4. From the examine command we can see that the tombstone contains the xattr and a value:
/opt/couchbase/bin/cbbackupmgr examine -a backup -r zombie --bucket test --tombstones -k test
Key: test
SeqNo: 2
Backup: 2020-10-29T15_05_45.745791997Z
Deleted: true
Size: 4B (key), 29B (meta), 3B (value)
Meta: {"cas":1603983924472709120,"revseqno":2,"datatype":6}
Xattrs: {"xattr":100}
Value: value
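The examine output above can be screened for the problematic case with a small parser. The following is a sketch under assumptions, not part of cbbackupmgr: the helper names are made up, the field labels are taken from the output shown here, the datatype bit values (JSON 0x01, Snappy 0x02, Xattr 0x04) are the memcached binary protocol values as I understand them, and xattr keys starting with `_` are treated as system xattrs.

```python
import json
import re
from typing import Dict, List

# Assumed datatype bits from the memcached binary protocol.
DATATYPE_FLAGS = {0x01: "JSON", 0x02: "Snappy", 0x04: "Xattr"}

SAMPLE = """Key: test
SeqNo: 2
Backup: 2020-10-29T15_05_45.745791997Z
Deleted: true
Size: 4B (key), 29B (meta), 3B (value)
Meta: {"cas":1603983924472709120,"revseqno":2,"datatype":6}
Xattrs: {"xattr":100}
Value: value"""


def parse_examine(output: str) -> Dict[str, str]:
    """Turn 'Label: value' lines of examine output into a dict."""
    fields: Dict[str, str] = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def decode_datatype(datatype: int) -> List[str]:
    """List the names of the datatype bits that are set."""
    return [name for bit, name in DATATYPE_FLAGS.items() if datatype & bit]


def tombstone_has_user_data(fields: Dict[str, str]) -> bool:
    """True for a deleted item that still carries a value or user xattrs."""
    if fields.get("Deleted") != "true":
        return False
    match = re.search(r"(\d+)B \(value\)", fields.get("Size", ""))
    value_size = int(match.group(1)) if match else 0
    xattrs = json.loads(fields.get("Xattrs") or "{}")
    # Assumption: keys beginning with '_' are system xattrs.
    user_xattrs = [k for k in xattrs if not k.startswith("_")]
    return value_size > 0 or bool(user_xattrs)


if __name__ == "__main__":
    fields = parse_examine(SAMPLE)
    print(tombstone_has_user_data(fields))
    print(decode_datatype(json.loads(fields["Meta"])["datatype"]))
```

For the tombstone in this report, datatype 6 decodes to Snappy plus Xattr under the assumed bit values.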
5. Restore the backup to Couchbase Server 6.6.0
/opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie -c localhost -u Administrator -p password
(1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
Transferring key value data for 'test' at 0B/s (about 0s remaining) 2 items / 28.20KB
[===============================================================================================================================================================================================================================================================================] 100.00%
Error restoring cluster: invalid argument
Restore bucket 'test' failed
Mutations restored: 0, Mutations failed to restore: 0
Deletions restored: 0, Deletions failed to restore: 1
Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
memcached.log:
2020-10-29T15:28:25.674599+00:00 INFO 44: HELO [{"a":"gocbcore/v9.0.4 cbbackupmgr-Unknown-5e545ae","i":"97f4cdb1b440b26d/f0ac4f557072da3e"}] XATTR, XERROR, Select bucket, Snappy, AltRequestSupport, SyncReplication, SubdocCreateAsDeleted [ [::1]:57536 - [::1]:11210 (not authenticated) ]
2020-10-29T15:28:25.684252+00:00 INFO 44: Client [::1]:57536 authenticated as <ud>Administrator</ud>
The restore on 6.5.1 and 6.5.0 also fails:
6.5.1 Restore:
[root@node1-cb660-centos7 vagrant]# /opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie -c 10.112.201.101 -u Administrator -p password
(1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
Transferring key value data for 'test' at 0B/s (about 0s remaining) 1 items / 28.20KB
[==================================================================================================================================] 100.00%
Error restoring cluster: invalid argument
Restore bucket 'test' failed
Mutations restored: 0, Mutations failed to restore: 0
Deletions restored: 0, Deletions failed to restore: 1
Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
6.5.0 Restore:
# /opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie -c 10.112.200.101 -u Administrator -p password
(1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
Transferring key value data for 'test' at 0B/s (about 0s remaining) 1 items / 28.20KB
[==================================================================================================================================] 100.00%
Error restoring cluster: invalid argument
Restore bucket 'test' failed
Mutations restored: 0, Mutations failed to restore: 0
Deletions restored: 0, Deletions failed to restore: 1
Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
—
Restoring to previous versions of Couchbase Server (6.0.x and lower) works.
The restore on 6.0.4:
6.0.4 Restore:
/opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie -c 10.112.194.101 -u Administrator -p password
(1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
Copied all data in 156.041335ms (Avg. 28.20KB/Sec) 1 items / 28.20KB
[==================================================================================================================================] 100.00%
Restore bucket 'test' succeeded
Mutations restored: 0, Mutations failed to restore: 0
Deletions restored: 1, Deletions failed to restore: 0
Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
Restore completed successfully
—
The XDCR logs from Couchbase Server 6.0.4, where the target is 6.6.0, show that XDCR is affected as well:
6.0.4 goxdcr.log:
2020-10-29T16:44:48.605Z ERRO GOXDCR.XmemNozzle: xmem_84576cb22c4dd3a9bd354e17b2186eec/test/test_10.112.205.101:11210_0 received error response from setMeta client. Repairing connection. response status=EINVAL, opcode=0xa8, seqno=1, req.Key=<ud>[116 101 115 116]</ud>, req.Cas=0, req.Cas=0, req.Extras=[0 0 0 0 95 154 227 170 0 0 0 0 0 0 0 2 22 66 126 225 139 162 0 0]
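To make the goxdcr log line easier to read, the byte slices it prints can be decoded. This is a sketch that assumes the first 24 bytes of the with-meta extras are laid out as flags (4 bytes), expiry (4 bytes), revision seqno (8 bytes) and CAS (8 bytes), all big-endian; the source does not state this layout, and the function names are hypothetical.

```python
import struct
from typing import Dict, List


def parse_go_byte_slice(text: str) -> List[int]:
    """Parse a Go-style printed byte slice such as '[116 101 115 116]'."""
    return [int(part) for part in text.strip().strip("[]").split()]


def decode_with_meta_extras(extras: List[int]) -> Dict[str, int]:
    """Decode the leading 24 bytes of the extras (assumed layout, see above)."""
    raw = bytes(extras)
    if len(raw) < 24:
        raise ValueError("expected at least 24 bytes of extras")
    flags, expiry, rev_seqno, cas = struct.unpack(">IIQQ", raw[:24])
    return {"flags": flags, "expiry": expiry, "revseqno": rev_seqno, "cas": cas}


if __name__ == "__main__":
    # Values copied from the goxdcr.log line in this report.
    key = bytes(parse_go_byte_slice("[116 101 115 116]")).decode("utf-8")
    extras = parse_go_byte_slice(
        "[0 0 0 0 95 154 227 170 0 0 0 0 0 0 0 2 22 66 126 225 139 162 0 0]"
    )
    print(key)
    print(decode_with_meta_extras(extras))
```

Under that assumed layout, the logged request is for key `test` with revision seqno 2, which lines up with the tombstone created in the reproduction steps.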
Workaround
There is a workaround, but it only works for 6.6.0:
cbepctl workaround:
# /opt/couchbase/bin/cbepctl localhost:11210 -b test -u Administrator -p password set flush_param allow_del_with_meta_prune_user_data true
setting param: allow_del_with_meta_prune_user_data true
set allow_del_with_meta_prune_user_data to true
This will allow the restore to work:
6.6.0 restore with workaround:
# /opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie -c localhost -u Administrator -p password
(1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
Copied all data in 90.191456ms (Avg. 28.20KB/Sec) 1 items / 28.20KB
[==================================================================================================================================] 100.00%
Restore bucket 'test' succeeded
Mutations restored: 0, Mutations failed to restore: 0
Deletions restored: 1, Deletions failed to restore: 0
Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
Restore completed successfully
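If the workaround has to be applied in more than one place, the cbepctl invocation shown earlier can be generated per node. The sketch below only builds the argument lists; it assumes (the report does not say this) that the parameter needs to be set on each data node separately, and the function name and node list are illustrative.

```python
from typing import List

# Path and parameter name as used in the cbepctl command in this report.
CBEPCTL = "/opt/couchbase/bin/cbepctl"
PARAM = "allow_del_with_meta_prune_user_data"


def prune_user_data_commands(
    nodes: List[str], bucket: str, user: str, password: str, enabled: bool = True
) -> List[List[str]]:
    """Build one cbepctl argv per node (assumption: the setting is per node)."""
    value = "true" if enabled else "false"
    return [
        [CBEPCTL, f"{node}:11210", "-b", bucket, "-u", user, "-p", password,
         "set", "flush_param", PARAM, value]
        for node in nodes
    ]


if __name__ == "__main__":
    for argv in prune_user_data_commands(["localhost"], "test", "Administrator", "password"):
        # To actually run it: subprocess.run(argv, check=True)
        print(" ".join(argv))
```

Passing `enabled=False` produces the commands that would turn the setting back off once the restore has finished.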
Unfortunately, when the same workaround is tried on 6.5.1, cbbackupmgr hangs and memcached produces an exception.
6.5.1 restore with workaround hangs:
# /opt/couchbase/bin/cbbackupmgr restore -a backup -r zombie -c 10.112.201.101 -u Administrator -p password
(1/1) Restoring backup 2020-10-29T15_05_45.745791997Z '2020-10-29T15_05_45.745791997Z'
Transferring key value data for 'test' at 192B/s (about 0s remaining) 1 items / 28.20KB
[================================================================================================================================== ] 99.81%
memcached exception:
2020-10-29T18:16:55.484527+00:00 ERROR 45: exception occurred in runloop during packet execution. Cookie info: [{"aiostat":"success","connection":"[ 10.112.201.1:51125 - 10.112.201.101:11210 (<ud>Administrator</ud>) ]","engine_storage":"0x0000000000000000","ewouldblock":false,"packet":{"bodylen":57,"cas":0,"datatype":["Snappy","Xattr"],"extlen":30,"key":"<ud>test</ud>","keylen":4,"magic":"ClientRequest","opaque":134217728,"opcode":"DEL_WITH_META","vbucket":127},"refcount":0}] - closing connection ([ 10.112.201.1:51125 - 10.112.201.101:11210 (<ud>Administrator</ud>) ]): Blob::assign failed to inflate. buffer.size:21 uncompressedLength:0
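The observations in this report can be condensed into a small lookup, which may be handy when scripting upgrade checks. It is a sketch that encodes only what was tested here for a backup containing tombstones with bodies; the function names are hypothetical, and any combination the report does not cover is returned as "unknown" rather than guessed.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '6.5.1' into (6, 5, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def expected_restore_outcome(target_version: str, prune_user_data: bool = False) -> str:
    """Outcome reported in this ticket for restoring a tombstone with a body.

    prune_user_data reflects whether allow_del_with_meta_prune_user_data
    was set to true on the target before the restore.
    """
    version = parse_version(target_version)
    if version < (6, 5, 0):
        return "succeeds"  # 6.0.x and lower accept the delete_with_meta
    if version == (6, 6, 0):
        return "succeeds" if prune_user_data else "fails"
    if version in ((6, 5, 0), (6, 5, 1)):
        if not prune_user_data:
            return "fails"  # "invalid argument"
        # The hang was only reported on 6.5.1 in this ticket.
        return "hangs" if version == (6, 5, 1) else "unknown"
    return "unknown"


if __name__ == "__main__":
    for target in ("6.0.4", "6.5.0", "6.5.1", "6.6.0"):
        print(target, expected_restore_outcome(target),
              expected_restore_outcome(target, prune_user_data=True))
```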