Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-49646

[System Test][Backup Service] Weird issue observed with archived repos

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 7.1.0
    • 7.1.0
    • ns_server
    • Untriaged
    • 1
    • No
    • Tools 2021 Dec

    Description

      Our longevity cluster had 2 cloud repos configured (my_repo_gcp and my_repo_s3). I was triaging some issues with them today and filed MB-49639 & MB-49640. Post this, both the repos ran into a permission denied error on their respective staging directories. James and I look at it and decided that we should archive these repos and reinit them to see if we run into the issue again. I went ahead and archived them with names my_repo_gcp_1 and my_repo_s3_1. When I logged into the cluster a little later to reinit them, saw that these repos were showing up as active again with the same permission denied error (screenshot attached). This is not expected. Once the repos are archived, they are expected to remain in that state.

      From the backup service logs, here we can see the repos getting archived successfully at 2021-11-18T10:26:25:

      2021-11-18T10:26:25.623-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/archived/my_repo_gcp_1","Value":"eyJpZCI6Im15X3JlcG9fZ2NwXzEiLCJwbGFuX25hbWUiOiJteV9wbGFuIiwic3RhdGUiOiJhcmNoaXZlZCIsImFyY2hpdmUiOiJnczovL2xvbmdldml0eV90ZXN0aW5nL2NlbnRvczEiLCJyZXBvIjoiYWYzNjA5YmMtNDI4Ny00Y2E2LWEwMDQtNjY1NWUzZTM0OWUxIiwidmVyc2lvbiI6MSwiY2xvdWRfaW5mbyI6eyJhdXRoX3R5cGUiOjAsImNyZWRlbnRpYWxzX25hbWUiOiI0NzA5NGQ2NC0zODBhLTRhMjctYTdkZS0wOTk3MmMwMTY2OGUiLCJzdGFnaW5nX2RpciI6Ii9kYXRhL2djcC9teV9yZXBvX2djcC1hZjM2MDliYy00Mjg3LTRjYTYtYTAwNC02NjU1ZTNlMzQ5ZTEiLCJwcm92aWRlciI6ImdjcCIsInJlZ2lvbiI6IiJ9LCJjcmVhdGlvbl90aW1lIjoiMjAyMS0xMS0xN1QxNDoyNDo0OC44OTg2NTc3ODYtMDg6MDAiLCJ1cGRhdGVfdGltZSI6IjIwMjEtMTEtMThUMTA6MjY6MjUuNTg4Nzk5NTA3LTA4OjAwIn0=","Rev":"g2gCbgUAMBML2w5sAAAAAWgCbQAAACBkNGJkNjZlYmZkMmZkNmI5MzNlZjVjNWUxOWViZTRjNGgCYQFuBQDRFgvbDmo="}}
      2021-11-18T10:26:25.703-08:00 DEBUG (Leader Manager) Received store event {"eventType": 4}
      2021-11-18T10:26:25.703-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/active/my_repo_gcp","Value":null,"Rev":null}}
      2021-11-18T10:26:25.703-08:00 INFO (ClockKeeper) Unloading repository {"cluster": "self", "repository": "my_repo_gcp"}
      2021-11-18T10:26:25.703-08:00 INFO (ClockKeeper) Unloaded repository tasks {"cluster": "self", "repository": "my_repo_gcp", "#tasks": 2}
      2021-11-18T10:26:25.703-08:00 INFO (Manger) Repository archived {"cluster": "self", "oldID": "my_repo_gcp", "newID": "my_repo_gcp_1"}
      2021-11-18T10:26:25.704-08:00 DEBUG (REST) (Attempt 1) (POST) Dispatching request to 'http://127.0.0.1:8091/_event'
      2021-11-18T10:26:25.708-08:00 DEBUG (REST) (Attempt 1) (POST) (200) Received response from 'http://127.0.0.1:8091/_event'
      2021-11-18T10:26:38.648-08:00 INFO (REST) POST /api/v1/cluster/self/repository/active/my_repo_s3/archive
      2021-11-18T10:26:38.756-08:00 DEBUG (Leader Manager) Received store event {"eventType": 4}
      2021-11-18T10:26:38.756-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/archived/my_repo_s3_1","Value":"eyJpZCI6Im15X3JlcG9fczNfMSIsInBsYW5fbmFtZSI6Im15X3BsYW4iLCJzdGF0ZSI6ImFyY2hpdmVkIiwiYXJjaGl2ZSI6InMzOi8vYmFja3VwLXNlcnZpY2UtdGVzdGluZy9sb25nZXZpdHktdGVzdGluZyIsInJlcG8iOiJiNjRmYTE0YS1mZDdjLTQ5MGMtOGMwYS04ZjkyYTM4NmVkNjIiLCJ2ZXJzaW9uIjoxLCJjbG91ZF9pbmZvIjp7ImF1dGhfdHlwZSI6MCwiY3JlZGVudGlhbHNfbmFtZSI6IjVmNmZlNmRmLWZjNjctNDU2Zi05ZDczLTJkODFiMGQ3Njg2NCIsInN0YWdpbmdfZGlyIjoiL2RhdGEvczMvbXlfcmVwb19zMy1iNjRmYTE0YS1mZDdjLTQ5MGMtOGMwYS04ZjkyYTM4NmVkNjIiLCJwcm92aWRlciI6ImF3cyIsInJlZ2lvbiI6ImV1LXdlc3QtMiJ9LCJjcmVhdGlvbl90aW1lIjoiMjAyMS0xMS0xN1QxNDoyNzo0Mi4xODU1NzAxODgtMDg6MDAiLCJ1cGRhdGVfdGltZSI6IjIwMjEtMTEtMThUMTA6MjY6MzguNzM0Mjg2MjIyLTA4OjAwIn0=","Rev":"g2gCbgUAMBML2w5sAAAAAWgCbQAAACBkNGJkNjZlYmZkMmZkNmI5MzNlZjVjNWUxOWViZTRjNGgCYQFuBQDeFgvbDmo="}}
      2021-11-18T10:26:38.845-08:00 DEBUG (Leader Manager) Received store event {"eventType": 4}
      2021-11-18T10:26:38.845-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/active/my_repo_s3","Value":null,"Rev":null}}
      2021-11-18T10:26:38.845-08:00 INFO (ClockKeeper) Unloading repository {"cluster": "self", "repository": "my_repo_s3"}
      2021-11-18T10:26:38.845-08:00 INFO (ClockKeeper) Unloaded repository tasks {"cluster": "self", "repository": "my_repo_s3", "#tasks": 2}
      2021-11-18T10:26:38.845-08:00 INFO (Manger) Repository archived {"cluster": "self", "oldID": "my_repo_s3", "newID": "my_repo_s3_1"}
      

      Then the test triggered rebalance which went through fine:

      2021-11-18T10:56:45.704-08:00 INFO (Service Manager) Prepare topology change {"change": {"id":"bf90fbc8afadf20d4f366f03a9e61770","currentTopologyRev":null,"type":"topology-change-rebalance","keepNodes":[{"nodeInfo":{"nodeId":"d4bd66ebfd2fd6b933ef5c5e19ebe4c4","priority":2,"opaque":{"grpc_port":9124,"host":"172.23.123.28","http_port":8097}},"recoveryType":"recovery-full"}],"ejectNodes":[]}}
      2021-11-18T10:56:45.711-08:00 INFO (Service Manager) Start topology change {"change": {"id":"bf90fbc8afadf20d4f366f03a9e61770","currentTopologyRev":null,"type":"topology-change-rebalance","keepNodes":[{"nodeInfo":{"nodeId":"d4bd66ebfd2fd6b933ef5c5e19ebe4c4","priority":2,"opaque":{"grpc_port":9124,"host":"172.23.123.28","http_port":8097}},"recoveryType":"recovery-full"}],"ejectNodes":[]}}
      2021-11-18T10:56:45.711-08:00 INFO (Rebalance) Starting rebalance {"change": {"id":"bf90fbc8afadf20d4f366f03a9e61770","currentTopologyRev":null,"type":"topology-change-rebalance","keepNodes":[{"nodeInfo":{"nodeId":"d4bd66ebfd2fd6b933ef5c5e19ebe4c4","priority":2,"opaque":{"grpc_port":9124,"host":"172.23.123.28","http_port":8097}},"recoveryType":"recovery-full"}],"ejectNodes":[]}}
      2021-11-18T10:56:45.727-08:00 INFO (Rebalance) Got old leader {"leader": "d4bd66ebfd2fd6b933ef5c5e19ebe4c4"}
      2021-11-18T10:56:45.747-08:00 INFO (Rebalance) Got current nodes {"#nodes": 1}
      2021-11-18T10:56:45.747-08:00 INFO (Rebalance) Setting self as leader
      2021-11-18T10:56:45.748-08:00 INFO (Rebalance) Did the failover nodes will do eject node now {"eject nodes": []}
      2021-11-18T10:56:45.749-08:00 INFO (Rebalance) Adding nodes now {"keep nodes": [{"nodeInfo":{"nodeId":"d4bd66ebfd2fd6b933ef5c5e19ebe4c4","priority":2,"opaque":{"grpc_port":9124,"host":"172.23.123.28","http_port":8097}},"recoveryType":"recovery-full"}]}
      2021-11-18T10:56:45.749-08:00 INFO (Rebalance) Keep node {"nodeID": "d4bd66ebfd2fd6b933ef5c5e19ebe4c4"}
      2021-11-18T10:56:45.749-08:00 DEBUG (Rebalance) Added node to MetaKV {"nodeID": "d4bd66ebfd2fd6b933ef5c5e19ebe4c4"}
      2021-11-18T10:56:45.750-08:00 DEBUG (Rebalance) Checking if status match {"expected": "node in"}
      2021-11-18T10:56:45.750-08:00 INFO (Rebalance) Nodes added
      2021-11-18T10:56:45.750-08:00 INFO (Rebalance) Rebalance done {"err": null, "state": {}, "cancelled": false}
      

      Corresponding step in the test:

      [2021-11-18T10:39:43-08:00, sequoiatools/couchbase-cli:7.1:52ea0c] rebalance -c 172.23.108.103:8091 -u Administrator -p password
      [2021-11-18T10:56:53-08:00, sequoiatools/cmd:7952a8] 60

      Post this, we can see that the backup service is adding these repos again for some reason at 2021-11-18T11:08:

      2021-11-18T11:08:10.231-08:00 DEBUG (Leader Manager) Received store event {"eventType": 4}
      2021-11-18T11:08:10.231-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/active/my_repo_gcp","Value":"eyJpZCI6Im15X3JlcG9fZ2NwIiwicGxhbl9uYW1lIjoibXlfcGxhbiIsInN0YXRlIjoiYWN0aXZlIiwiYXJjaGl2ZSI6ImdzOi8vbG9uZ2V2aXR5X3Rlc3RpbmcvY2VudG9zMSIsInJlcG8iOiJhZjM2MDliYy00Mjg3LTRjYTYtYTAwNC02NjU1ZTNlMzQ5ZTEiLCJ2ZXJzaW9uIjoxLCJjbG91ZF9pbmZvIjp7ImF1dGhfdHlwZSI6MCwiY3JlZGVudGlhbHNfbmFtZSI6IjQ3MDk0ZDY0LTM4MGEtNGEyNy1hN2RlLTA5OTcyYzAxNjY4ZSIsInN0YWdpbmdfZGlyIjoiL2RhdGEvZ2NwL215X3JlcG9fZ2NwLWFmMzYwOWJjLTQyODctNGNhNi1hMDA0LTY2NTVlM2UzNDllMSIsInByb3ZpZGVyIjoiZ2NwIiwicmVnaW9uIjoiIn0sImNyZWF0aW9uX3RpbWUiOiIyMDIxLTExLTE3VDE0OjI0OjQ4Ljg5ODY1Nzc4Ni0wODowMCIsInVwZGF0ZV90aW1lIjoiMjAyMS0xMS0xN1QxNDoyNDo0OC44OTg2NTc3ODYtMDg6MDAifQ==","Rev":"g2gCbgUA7vsJ2w5sAAAAAWgCbQAAACBkNGJkNjZlYmZkMmZkNmI5MzNlZjVjNWUxOWViZTRjNGgCYQFuBQAw/QnbDmo="}}
      2021-11-18T11:08:10.232-08:00 INFO (Event Handler) (Repo Change) Resuming repository {"repositoryID": "my_repo_gcp", "planName": "my_plan"}
      2021-11-18T11:08:10.278-08:00 INFO (ClockKeeper) Adding job {"cluster": "self", "repository": "my_repo_gcp", "task": "backup-1"}
      2021-11-18T11:08:10.417-08:00 INFO (ClockKeeper) Adding job {"cluster": "self", "repository": "my_repo_gcp", "task": "merge"}
      2021-11-18T11:08:10.541-08:00 DEBUG (Leader Manager) Received store event {"eventType": 4}
      2021-11-18T11:08:10.541-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/active/my_repo_s3","Value":"eyJpZCI6Im15X3JlcG9fczMiLCJwbGFuX25hbWUiOiJteV9wbGFuIiwic3RhdGUiOiJhY3RpdmUiLCJhcmNoaXZlIjoiczM6Ly9iYWNrdXAtc2VydmljZS10ZXN0aW5nL2xvbmdldml0eS10ZXN0aW5nIiwicmVwbyI6ImI2NGZhMTRhLWZkN2MtNDkwYy04YzBhLThmOTJhMzg2ZWQ2MiIsInZlcnNpb24iOjEsImNsb3VkX2luZm8iOnsiYXV0aF90eXBlIjowLCJjcmVkZW50aWFsc19uYW1lIjoiNWY2ZmU2ZGYtZmM2Ny00NTZmLTlkNzMtMmQ4MWIwZDc2ODY0Iiwic3RhZ2luZ19kaXIiOiIvZGF0YS9zMy9teV9yZXBvX3MzLWI2NGZhMTRhLWZkN2MtNDkwYy04YzBhLThmOTJhMzg2ZWQ2MiIsInByb3ZpZGVyIjoiYXdzIiwicmVnaW9uIjoiZXUtd2VzdC0yIn0sImNyZWF0aW9uX3RpbWUiOiIyMDIxLTExLTE3VDE0OjI3OjQyLjE4NTU3MDE4OC0wODowMCIsInVwZGF0ZV90aW1lIjoiMjAyMS0xMS0xN1QxNDoyNzo0Mi4xODU1NzAxODgtMDg6MDAifQ==","Rev":"g2gCbgUAo/wJ2w5sAAAAAWgCbQAAACBkNGJkNjZlYmZkMmZkNmI5MzNlZjVjNWUxOWViZTRjNGgCYQFuBQDe/QnbDmo="}}
      2021-11-18T11:08:10.541-08:00 INFO (Event Handler) (Repo Change) Resuming repository {"repositoryID": "my_repo_s3", "planName": "my_plan"ved store event {"eventType": 4}
      2021-11-18T11:08:10.231-08:00 INFO (Event Handler) (Repo Change) Repository change {"event": {"Type":4,"Path":"/cbbs/cluster/self/active/my_repo_gcp","Value":"eyJpZCI6Im15X3JlcG9fZ2NwIiwicGxhbl9uYW1lIjoibXlfcGxhbiIsInN0YXRlIjoiYWN0aXZlIiwiYXJjaGl2ZSI6ImdzOi8vbG9uZ2V2aXR5X3Rlc3RpbmcvY2VudG9zMSIsInJlcG8iOiJhZjM2MDliYy00Mjg3LTRjYTYtYTAwNC02NjU1ZTNlMzQ5ZTEiLCJ2ZXJzaW9uIjoxLCJjbG91ZF9pbmZvIjp7ImF1dGhfdHlwZSI6MCwiY3JlZGVudGlhbHNfbmFtZSI6IjQ3MDk0ZDY0LTM4MGEtNGEyNy1hN2RlLTA5OTcyYzAxNjY4ZSIsInN0YWdpbmdfZGlyIjoiL2RhdGEvZ2NwL215X3JlcG9fZ2NwLWFmMzYwOWJjLTQyODctNGNhNi1hMDA0LTY2NTVlM2UzNDllMSIsInByb3ZpZGVyIjoiZ2NwIiwicmVnaW9uIjoiIn0sImNyZWF0aW9uX3RpbWUiOiIyMDIxLTExLTE3VDE0OjI0OjQ4Ljg5ODY1Nzc4Ni0wODowMCIsInVwZGF0ZV90aW1lIjoiMjAyMS0xMS0xN1QxNDoyNDo0OC44OTg2NTc3ODYtMDg6MDAifQ==","Rev":"g2gCbgUA7vsJ2w5sAAAAAWgCbQAAACBkNGJkNjZlYmZkMmZkNmI5MzNlZjVjNWUxOWViZTRjNGgCYQFuBQAw/QnbDmo="}}}
      2021-11-18T11:08:10.568-08:00 INFO (ClockKeeper) Adding job {"cluster": "self", "repository": "my_repo_s3", "task": "backup-1"}
      2021-11-18T11:08:10.681-08:00 INFO (ClockKeeper) Adding job {"cluster": "self", "repository": "my_repo_s3", "task": "merge"}
      

      Server Logs:
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.137.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.155.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.157.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.5.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.67.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.69.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.104.70.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.106.100.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.106.188.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.108.103.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.120.245.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.121.117.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.121.3.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.123.27.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.123.28.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.96.148.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.96.252.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.96.253.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.97.119.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.97.121.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.97.122.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.97.242.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.98.135.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.99.11.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.99.20.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.99.21.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1637262812/collectinfo-2021-11-18T191335-ns_1%40172.23.99.25.zip

      Attachments

        1. backup-gcp.log
          8.01 MB
        2. backup-gcp.staged.log
          8.01 MB
        3. backup-s3.log
          4.56 MB
        4. Screen Shot 2021-11-18 at 11.25.33 AM.png
          Screen Shot 2021-11-18 at 11.25.33 AM.png
          142 kB

        Issue Links

          For Gerrit Dashboard: MB-49646
          # Subject Branch Project Status CR V

          Activity

            People

              arunkumar Arunkumar Senthilnathan (Inactive)
              arunkumar Arunkumar Senthilnathan (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty