Details
-
Task
-
Resolution: Done
-
Major
-
Cheshire-Cat
-
None
-
Enterprise Edition 7.0.0 build 4907
Description
Volume testing is quite chaotic; we observed 3 types of failing tasks during the run. The goal of this issue is to determine whether these task failures are expected under such chaotic conditions.
Cluster Setup:
Roughly 5 nodes are present in the cluster at any given time, with an extra 3 nodes kept as spares for the swap rebalance.
Services:
Each node runs the kv and backup services.
Testing:
The exact steps performed in the test can be found here: https://hub.internal.couchbase.com/confluence/pages/viewpage.action?pageId=50135893
The test made it to step 15 before I terminated it.
Backup Service Configuration:
There are 10 repositories, 'repo-plan1' .. 'repo-plan10', each with its own plan, 'plan1' .. 'plan10'. All plans share identical tasks.
Tasks: Backup every 15 minutes. Merge every 40 minutes between every 0 and 1 days.
Each repository has a unique archive location, '/tmp/my-archive/archive-plan1' .. '/tmp/my-archive/archive-plan10', to avoid lock contention issues.
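The layout above can be sketched in a few lines of Python. This is only an illustration of the naming scheme, assuming the archive directories follow the same 1..10 numbering as the repositories and plans; the `RepoLayout` class, the `build_layout` helper and the keys in `TASKS` are hypothetical names, not the Backup Service's own schema.

```python
from dataclasses import dataclass

ARCHIVE_ROOT = "/tmp/my-archive"  # mount point named in this ticket
REPO_COUNT = 10                   # 'repo-plan1' .. 'repo-plan10'


@dataclass(frozen=True)
class RepoLayout:
    """One repository, its plan and its archive (hypothetical helper type)."""
    repository: str
    plan: str
    archive: str


def build_layout(count=REPO_COUNT, root=ARCHIVE_ROOT):
    """Return one RepoLayout per repository, numbered from 1 to count."""
    return [
        RepoLayout(
            repository=f"repo-plan{i}",
            plan=f"plan{i}",
            archive=f"{root}/archive-plan{i}",
        )
        for i in range(1, count + 1)
    ]


# The tasks shared by every plan, written as plain data.
# The key names are illustrative only, not the service's field names.
TASKS = [
    {"type": "backup", "every_minutes": 15},
    {"type": "merge", "every_minutes": 40},
]

layout = build_layout()

# Every repository must have its own archive to avoid lock contention.
assert len({entry.archive for entry in layout}) == len(layout)
```

The final assertion encodes the reason given for the layout: no two repositories share an archive directory.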
Shared folder:
The shared folder '/data/share' is mounted at '/tmp/my-archive' on each machine using NFS.
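Because nodes are swapped in and out during the test, it may be worth confirming that the share is actually mounted on a node before reading anything into a failed task. A minimal sketch of such a check follows; the `archive_is_mounted` helper is a hypothetical name, and the check only confirms that the path is a mount point, not that it is an NFS mount or that it points at '/data/share'.

```python
import os


def archive_is_mounted(mount_point="/tmp/my-archive"):
    """Return True if mount_point exists and is a mount point on this machine.

    This does not verify the filesystem type or the remote export.
    """
    return os.path.ismount(mount_point)


if __name__ == "__main__":
    state = "mounted" if archive_is_mounted() else "NOT mounted"
    print(f"/tmp/my-archive is {state}")
```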
Attached:
The cbbackupmgr logs for each repository can be found in: backup-logs.zip
The server logs:
https://cb-engineering.s3.amazonaws.com/CBQE-6782/tools-qe/collectinfo-2021-04-14T150204-ns_1%40172.23.105.175.zip
https://cb-engineering.s3.amazonaws.com/CBQE-6782/tools-qe/collectinfo-2021-04-14T150204-ns_1%40172.23.106.233.zip
https://cb-engineering.s3.amazonaws.com/CBQE-6782/tools-qe/collectinfo-2021-04-14T150204-ns_1%40172.23.106.238.zip
https://cb-engineering.s3.amazonaws.com/CBQE-6782/tools-qe/collectinfo-2021-04-14T150204-ns_1%40172.23.106.251.zip
https://cb-engineering.s3.amazonaws.com/CBQE-6782/tools-qe/collectinfo-2021-04-14T150204-ns_1%40172.23.121.74.zip
https://cb-engineering.s3.amazonaws.com/CBQE-6782/tools-qe/172.23.121.78.zip
https://cb-engineering.s3.amazonaws.com/CBQE-6782/tools-qe/172.23.106.250.zip
https://cb-engineering.s3.amazonaws.com/CBQE-6782/tools-qe/172.23.106.236.zip
The test logs:
The task history:
Side commentary:
The testing is very chaotic: nodes performing backups are rebalanced out.
There were other interesting tasks, but I have omitted them as they appear to fall into the expected category, mainly relating to orphans or to merge tasks that lacked a sufficient number of backups.
Attempted Supportal upload: https://supportal.couchbase.com/snapshot/4f28ea5724c3bcddbefc2c1de8390e05::0
Attachments
Issue Links
- relates to: MB-45726 Investigate case where backup completes successfully, however, SQLite files may be empty (Open)