Details
-
Bug
-
Resolution: Fixed
-
Critical
-
7.1.0
-
Untriaged
-
1
-
No
-
Tools 2021 Dec
Description
Our longevity cluster had 2 cloud repos configured (my_repo_gcp and my_repo_s3). I was triaging some issues with them today and filed MB-49639 & MB-49640. Post this, both the repos ran into a permission denied error on their respective staging directories. James and I look at it and decided that we should archive these repos and reinit them to see if we run into the issue again. I went ahead and archived them with names my_repo_gcp_1 and my_repo_s3_1. When I logged into the cluster a little later to reinit them, saw that these repos were showing up as active again with the same permission denied error (screenshot attached). This is not expected. Once the repos are archived, they are expected to remain in that state.
From the backup service logs, here we can see the repos getting archived successfully at 2021-11-18T10:26:25:
2021-11-18T10:26:25.623-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/archived/my_repo_gcp_1","Value":"eyJpZCI6Im15X3JlcG9fZ2NwXzEiLCJwbGFuX25hbWUiOiJteV9wbGFuIiwic3RhdGUiOiJhcmNoaXZlZCIsImFyY2hpdmUiOiJnczovL2xvbmdldml0eV90ZXN0aW5nL2NlbnRvczEiLCJyZXBvIjoiYWYzNjA5YmMtNDI4Ny00Y2E2LWEwMDQtNjY1NWUzZTM0OWUxIiwidmVyc2lvbiI6MSwiY2xvdWRfaW5mbyI6eyJhdXRoX3R5cGUiOjAsImNyZWRlbnRpYWxzX25hbWUiOiI0NzA5NGQ2NC0zODBhLTRhMjctYTdkZS0wOTk3MmMwMTY2OGUiLCJzdGFnaW5nX2RpciI6Ii9kYXRhL2djcC9teV9yZXBvX2djcC1hZjM2MDliYy00Mjg3LTRjYTYtYTAwNC02NjU1ZTNlMzQ5ZTEiLCJwcm92aWRlciI6ImdjcCIsInJlZ2lvbiI6IiJ9LCJjcmVhdGlvbl90aW1lIjoiMjAyMS0xMS0xN1QxNDoyNDo0OC44OTg2NTc3ODYtMDg6MDAiLCJ1cGRhdGVfdGltZSI6IjIwMjEtMTEtMThUMTA6MjY6MjUuNTg4Nzk5NTA3LTA4OjAwIn0=","Rev":"g2gCbgUAMBML2w5sAAAAAWgCbQAAACBkNGJkNjZlYmZkMmZkNmI5MzNlZjVjNWUxOWViZTRjNGgCYQFuBQDRFgvbDmo="}}
|
2021-11-18T10:26:25.703-08:00 DEBUG (Leader Manager) Received store event {"eventType": 4}
|
2021-11-18T10:26:25.703-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/active/my_repo_gcp","Value":null,"Rev":null}}
|
2021-11-18T10:26:25.703-08:00 INFO (ClockKeeper) Unloading repository {"cluster": "self", "repository": "my_repo_gcp"}
|
2021-11-18T10:26:25.703-08:00 INFO (ClockKeeper) Unloaded repository tasks {"cluster": "self", "repository": "my_repo_gcp", "#tasks": 2}
|
2021-11-18T10:26:25.703-08:00 INFO (Manger) Repository archived {"cluster": "self", "oldID": "my_repo_gcp", "newID": "my_repo_gcp_1"}
|
2021-11-18T10:26:25.704-08:00 DEBUG (REST) (Attempt 1) (POST) Dispatching request to 'http://127.0.0.1:8091/_event'
|
2021-11-18T10:26:25.708-08:00 DEBUG (REST) (Attempt 1) (POST) (200) Received response from 'http://127.0.0.1:8091/_event'
|
2021-11-18T10:26:38.648-08:00 INFO (REST) POST /api/v1/cluster/self/repository/active/my_repo_s3/archive
|
2021-11-18T10:26:38.756-08:00 DEBUG (Leader Manager) Received store event {"eventType": 4}
|
2021-11-18T10:26:38.756-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/archived/my_repo_s3_1","Value":"eyJpZCI6Im15X3JlcG9fczNfMSIsInBsYW5fbmFtZSI6Im15X3BsYW4iLCJzdGF0ZSI6ImFyY2hpdmVkIiwiYXJjaGl2ZSI6InMzOi8vYmFja3VwLXNlcnZpY2UtdGVzdGluZy9sb25nZXZpdHktdGVzdGluZyIsInJlcG8iOiJiNjRmYTE0YS1mZDdjLTQ5MGMtOGMwYS04ZjkyYTM4NmVkNjIiLCJ2ZXJzaW9uIjoxLCJjbG91ZF9pbmZvIjp7ImF1dGhfdHlwZSI6MCwiY3JlZGVudGlhbHNfbmFtZSI6IjVmNmZlNmRmLWZjNjctNDU2Zi05ZDczLTJkODFiMGQ3Njg2NCIsInN0YWdpbmdfZGlyIjoiL2RhdGEvczMvbXlfcmVwb19zMy1iNjRmYTE0YS1mZDdjLTQ5MGMtOGMwYS04ZjkyYTM4NmVkNjIiLCJwcm92aWRlciI6ImF3cyIsInJlZ2lvbiI6ImV1LXdlc3QtMiJ9LCJjcmVhdGlvbl90aW1lIjoiMjAyMS0xMS0xN1QxNDoyNzo0Mi4xODU1NzAxODgtMDg6MDAiLCJ1cGRhdGVfdGltZSI6IjIwMjEtMTEtMThUMTA6MjY6MzguNzM0Mjg2MjIyLTA4OjAwIn0=","Rev":"g2gCbgUAMBML2w5sAAAAAWgCbQAAACBkNGJkNjZlYmZkMmZkNmI5MzNlZjVjNWUxOWViZTRjNGgCYQFuBQDeFgvbDmo="}}
|
2021-11-18T10:26:38.845-08:00 DEBUG (Leader Manager) Received store event {"eventType": 4}
|
2021-11-18T10:26:38.845-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/active/my_repo_s3","Value":null,"Rev":null}}
|
2021-11-18T10:26:38.845-08:00 INFO (ClockKeeper) Unloading repository {"cluster": "self", "repository": "my_repo_s3"}
|
2021-11-18T10:26:38.845-08:00 INFO (ClockKeeper) Unloaded repository tasks {"cluster": "self", "repository": "my_repo_s3", "#tasks": 2}
|
2021-11-18T10:26:38.845-08:00 INFO (Manger) Repository archived {"cluster": "self", "oldID": "my_repo_s3", "newID": "my_repo_s3_1"}
|
Then the test triggered rebalance which went through fine:
2021-11-18T10:56:45.704-08:00 INFO (Service Manager) Prepare topology change {"change": {"id":"bf90fbc8afadf20d4f366f03a9e61770","currentTopologyRev":null,"type":"topology-change-rebalance","keepNodes":[{"nodeInfo":{"nodeId":"d4bd66ebfd2fd6b933ef5c5e19ebe4c4","priority":2,"opaque":{"grpc_port":9124,"host":"172.23.123.28","http_port":8097}},"recoveryType":"recovery-full"}],"ejectNodes":[]}}
|
2021-11-18T10:56:45.711-08:00 INFO (Service Manager) Start topology change {"change": {"id":"bf90fbc8afadf20d4f366f03a9e61770","currentTopologyRev":null,"type":"topology-change-rebalance","keepNodes":[{"nodeInfo":{"nodeId":"d4bd66ebfd2fd6b933ef5c5e19ebe4c4","priority":2,"opaque":{"grpc_port":9124,"host":"172.23.123.28","http_port":8097}},"recoveryType":"recovery-full"}],"ejectNodes":[]}}
|
2021-11-18T10:56:45.711-08:00 INFO (Rebalance) Starting rebalance {"change": {"id":"bf90fbc8afadf20d4f366f03a9e61770","currentTopologyRev":null,"type":"topology-change-rebalance","keepNodes":[{"nodeInfo":{"nodeId":"d4bd66ebfd2fd6b933ef5c5e19ebe4c4","priority":2,"opaque":{"grpc_port":9124,"host":"172.23.123.28","http_port":8097}},"recoveryType":"recovery-full"}],"ejectNodes":[]}}
|
2021-11-18T10:56:45.727-08:00 INFO (Rebalance) Got old leader {"leader": "d4bd66ebfd2fd6b933ef5c5e19ebe4c4"}
|
2021-11-18T10:56:45.747-08:00 INFO (Rebalance) Got current nodes {"#nodes": 1}
|
2021-11-18T10:56:45.747-08:00 INFO (Rebalance) Setting self as leader
|
2021-11-18T10:56:45.748-08:00 INFO (Rebalance) Did the failover nodes will do eject node now {"eject nodes": []}
|
2021-11-18T10:56:45.749-08:00 INFO (Rebalance) Adding nodes now {"keep nodes": [{"nodeInfo":{"nodeId":"d4bd66ebfd2fd6b933ef5c5e19ebe4c4","priority":2,"opaque":{"grpc_port":9124,"host":"172.23.123.28","http_port":8097}},"recoveryType":"recovery-full"}]}
|
2021-11-18T10:56:45.749-08:00 INFO (Rebalance) Keep node {"nodeID": "d4bd66ebfd2fd6b933ef5c5e19ebe4c4"}
|
2021-11-18T10:56:45.749-08:00 DEBUG (Rebalance) Added node to MetaKV {"nodeID": "d4bd66ebfd2fd6b933ef5c5e19ebe4c4"}
|
2021-11-18T10:56:45.750-08:00 DEBUG (Rebalance) Checking if status match {"expected": "node in"}
|
2021-11-18T10:56:45.750-08:00 INFO (Rebalance) Nodes added
|
2021-11-18T10:56:45.750-08:00 INFO (Rebalance) Rebalance done {"err": null, "state": {}, "cancelled": false}
|
Corresponding step in the test:
[2021-11-18T10:39:43-08:00, sequoiatools/couchbase-cli:7.1:52ea0c] rebalance -c 172.23.108.103:8091 -u Administrator -p password
[2021-11-18T10:56:53-08:00, sequoiatools/cmd:7952a8] 60
Post this, we can see that the backup service is adding these repos again for some reason at 2021-11-18T11:08:
2021-11-18T11:08:10.231-08:00 DEBUG (Leader Manager) Received store event {"eventType": 4}
|
2021-11-18T11:08:10.231-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/active/my_repo_gcp","Value":"eyJpZCI6Im15X3JlcG9fZ2NwIiwicGxhbl9uYW1lIjoibXlfcGxhbiIsInN0YXRlIjoiYWN0aXZlIiwiYXJjaGl2ZSI6ImdzOi8vbG9uZ2V2aXR5X3Rlc3RpbmcvY2VudG9zMSIsInJlcG8iOiJhZjM2MDliYy00Mjg3LTRjYTYtYTAwNC02NjU1ZTNlMzQ5ZTEiLCJ2ZXJzaW9uIjoxLCJjbG91ZF9pbmZvIjp7ImF1dGhfdHlwZSI6MCwiY3JlZGVudGlhbHNfbmFtZSI6IjQ3MDk0ZDY0LTM4MGEtNGEyNy1hN2RlLTA5OTcyYzAxNjY4ZSIsInN0YWdpbmdfZGlyIjoiL2RhdGEvZ2NwL215X3JlcG9fZ2NwLWFmMzYwOWJjLTQyODctNGNhNi1hMDA0LTY2NTVlM2UzNDllMSIsInByb3ZpZGVyIjoiZ2NwIiwicmVnaW9uIjoiIn0sImNyZWF0aW9uX3RpbWUiOiIyMDIxLTExLTE3VDE0OjI0OjQ4Ljg5ODY1Nzc4Ni0wODowMCIsInVwZGF0ZV90aW1lIjoiMjAyMS0xMS0xN1QxNDoyNDo0OC44OTg2NTc3ODYtMDg6MDAifQ==","Rev":"g2gCbgUA7vsJ2w5sAAAAAWgCbQAAACBkNGJkNjZlYmZkMmZkNmI5MzNlZjVjNWUxOWViZTRjNGgCYQFuBQAw/QnbDmo="}}
|
2021-11-18T11:08:10.232-08:00 INFO (Event Handler) (Repo Change) Resuming repository {"repositoryID": "my_repo_gcp", "planName": "my_plan"}
|
2021-11-18T11:08:10.278-08:00 INFO (ClockKeeper) Adding job {"cluster": "self", "repository": "my_repo_gcp", "task": "backup-1"}
|
2021-11-18T11:08:10.417-08:00 INFO (ClockKeeper) Adding job {"cluster": "self", "repository": "my_repo_gcp", "task": "merge"}
|
2021-11-18T11:08:10.541-08:00 DEBUG (Leader Manager) Received store event {"eventType": 4}
|
2021-11-18T11:08:10.541-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/active/my_repo_s3","Value":"eyJpZCI6Im15X3JlcG9fczMiLCJwbGFuX25hbWUiOiJteV9wbGFuIiwic3RhdGUiOiJhY3RpdmUiLCJhcmNoaXZlIjoiczM6Ly9iYWNrdXAtc2VydmljZS10ZXN0aW5nL2xvbmdldml0eS10ZXN0aW5nIiwicmVwbyI6ImI2NGZhMTRhLWZkN2MtNDkwYy04YzBhLThmOTJhMzg2ZWQ2MiIsInZlcnNpb24iOjEsImNsb3VkX2luZm8iOnsiYXV0aF90eXBlIjowLCJjcmVkZW50aWFsc19uYW1lIjoiNWY2ZmU2ZGYtZmM2Ny00NTZmLTlkNzMtMmQ4MWIwZDc2ODY0Iiwic3RhZ2luZ19kaXIiOiIvZGF0YS9zMy9teV9yZXBvX3MzLWI2NGZhMTRhLWZkN2MtNDkwYy04YzBhLThmOTJhMzg2ZWQ2MiIsInByb3ZpZGVyIjoiYXdzIiwicmVnaW9uIjoiZXUtd2VzdC0yIn0sImNyZWF0aW9uX3RpbWUiOiIyMDIxLTExLTE3VDE0OjI3OjQyLjE4NTU3MDE4OC0wODowMCIsInVwZGF0ZV90aW1lIjoiMjAyMS0xMS0xN1QxNDoyNzo0Mi4xODU1NzAxODgtMDg6MDAifQ==","Rev":"g2gCbgUAo/wJ2w5sAAAAAWgCbQAAACBkNGJkNjZlYmZkMmZkNmI5MzNlZjVjNWUxOWViZTRjNGgCYQFuBQDe/QnbDmo="}}
|
2021-11-18T11:08:10.541-08:00 INFO (Event Handler) (Repo Change) Resuming repository {"repositoryID": "my_repo_s3", "planName": "my_plan"ved store event {"eventType": 4}
|
2021-11-18T11:08:10.231-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/active/my_repo_gcp","Value":"eyJpZCI6Im15X3JlcG9fZ2NwIiwicGxhbl9uYW1lIjoibXlfcGxhbiIsInN0YXRlIjoiYWN0aXZlIiwiYXJjaGl2ZSI6ImdzOi8vbG9uZ2V2aXR5X3Rlc3RpbmcvY2VudG9zMSIsInJlcG8iOiJhZjM2MDliYy00Mjg3LTRjYTYtYTAwNC02NjU1ZTNlMzQ5ZTEiLCJ2ZXJzaW9uIjoxLCJjbG91ZF9pbmZvIjp7ImF1dGhfdHlwZSI6MCwiY3JlZGVudGlhbHNfbmFtZSI6IjQ3MDk0ZDY0LTM4MGEtNGEyNy1hN2RlLTA5OTcyYzAxNjY4ZSIsInN0YWdpbmdfZGlyIjoiL2RhdGEvZ2NwL215X3JlcG9fZ2NwLWFmMzYwOWJjLTQyODctNGNhNi1hMDA0LTY2NTVlM2UzNDllMSIsInByb3ZpZGVyIjoiZ2NwIiwicmVnaW9uIjoiIn0sImNyZWF0aW9uX3RpbWUiOiIyMDIxLTExLTE3VDE0OjI0OjQ4Ljg5ODY1Nzc4Ni0wODowMCIsInVwZGF0ZV90aW1lIjoiMjAyMS0xMS0xN1QxNDoyNDo0OC44OTg2NTc3ODYtMDg6MDAifQ==","Rev":"g2gCbgUA7vsJ2w5sAAAAAWgCbQAAACBkNGJkNjZlYmZkMmZkNmI5MzNlZjVjNWUxOWViZTRjNGgCYQFuBQAw/QnbDmo="}}}
|
2021-11-18T11:08:10.568-08:00 INFO (ClockKeeper) Adding job {"cluster": "self", "repository": "my_repo_s3", "task": "backup-1"}
|
2021-11-18T11:08:10.681-08:00 INFO (ClockKeeper) Adding job {"cluster": "self", "repository": "my_repo_s3", "task": "merge"}
|
Server Logs:
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.137.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.155.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.157.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.5.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.67.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.69.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.70.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.106.100.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.106.188.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.108.103.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.120.245.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.121.117.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.121.3.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.123.27.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.123.28.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.96.148.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.96.252.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.96.253.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.97.119.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.97.121.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.97.122.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.97.242.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.98.135.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.99.11.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.99.20.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.99.21.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.99.25.zip
Attachments
Issue Links
- is duplicated by
-
MB-49639 [System Test][Backup Service] status of a backup task which took ~3hrs is not reflected properly in UI - reported as running when it completed successfully
- Closed