Couchbase Server / MB-53883

[BP 6.6.6][System test upgrade] - Backups fail and Merges skipped post upgrade from 7.0.3 -> 7.1.0

Details

    • Untriaged
    • Centos 64-bit
    • 1
    • Unknown

    Description

      Script to Repro
      1. Run CC longevity on 7.0.3 for 2-3 days

      ./sequoia -client 172.23.104.254:2375 -provider file:centos_second_cluster.yml -test tests/integration/cheshirecat/test_cheshirecat_kv_gsi_coll_xdcr_backup_sgw_fts_itemct_txns_eventing_cbas_scale3.yml -scope tests/integration/cheshirecat/scope_cheshirecat_with_backup.yml -scale 3 -repeat 0 -log_level 0 -version 7.0.3-7031 -skip_setup=false -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true
      

      2. Change the encryption level to strict.
      3. Upgrade the entire 7.0.3 cluster to 7.1 using online upgrade strategies of swap rebalance and graceful failover/recovery.

      After the upgrade to 7.1, the following error appeared on the backup nodes.
      172.23.96.254 : backup

      2022-03-20T17:00:34.826-07:00 WARN (Worker) Task failed {"cluster": "self", "repositoryID": "my_repo", "state": "active", "taskName": "backup-1", "err": "exit status 1", "cbmErr": "exit status 1: failed to execute cluster operations: failed to transfer Analytics metadata: failed to get Analytics metadata: failed to get analytics metadata: failed to execute request: failed to execute request: exhausted retry count after 3 retries, last error: internal server error executing 'GET' request to '/api/v1/backup': {\n\t\"version\": 2,\n\t\"requestID\": \"ec88b5b8-e63d-404a-8555-a526a824cd9a\",\n\t\"errors\": [{ \n\t\t\"code\": 23001,\t\t\"msg\": \"Analytics Service requires a rebalance to be operational\"\t} \n\t],\n\t\"status\": \"fatal\",\n\t\"metrics\": {\n\t\t\"elapsedTime\": \"12.643782ms\",\n\t\t\"executionTime\": \"11.156596ms\",\n\t\t\"resultCount\": 0,\n\t\t\"resultSize\": 0,\n\t\t\"processedObjects\": 0,\n\t\t\"errorCount\": 1\n\t}\n}\n"}
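      The root cause in the log line above is buried in a JSON document embedded in the `cbmErr` string. The following is a minimal sketch for pulling out the Analytics error code and message. It assumes the log line has the shape shown above (a text prefix followed by one JSON object whose `cbmErr` field ends with the Analytics response body); the function name `analytics_errors` is hypothetical and not part of any Couchbase tool.

```python
import json


def analytics_errors(log_line: str) -> list:
    """Return (code, msg) pairs from the Analytics response embedded in cbmErr.

    Assumes the first '{' in the line starts the backup service's JSON payload,
    and the first '{' inside cbmErr starts the Analytics response body.
    """
    # The structured part of the log line begins at the first opening brace.
    outer = json.loads(log_line[log_line.index("{"):])
    cbm_err = outer["cbmErr"]
    # The Analytics response is appended to the error chain as a JSON document.
    inner = json.loads(cbm_err[cbm_err.index("{"):])
    return [(err["code"], err["msg"]) for err in inner.get("errors", [])]
```

      For the log line above this yields code 23001 with the message "Analytics Service requires a rebalance to be operational".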
      

      See task history

      cbcollect_info attached.

          Activity

            murtadha.hubail Murtadha Hubail added a comment - I discussed this backport with Michael Blow. Because the fix was not backported to 7.0.4 and the backport patches from 7.1 are fairly involved, we are not going to do the backport. The workaround, described in the original ticket, is simply to invoke a rebalance from the CLI or the REST API.
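
            A rough sketch of the REST API workaround follows. It assumes the standard cluster manager endpoints `GET /pools/default` (to list nodes) and `POST /controller/rebalance` (with `knownNodes` and `ejectedNodes` form fields). The helper names, base URL, and credentials are hypothetical. With the encryption level set to strict, the request probably has to go to the TLS port, and certificate trust must be configured by the caller; neither is handled here.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_rebalance_body(pools_default: dict) -> bytes:
    """Build the form body for POST /controller/rebalance.

    Every node reported by /pools/default is kept (knownNodes) and
    nothing is ejected, so this is a plain rebalance of the current cluster.
    """
    known = [node["otpNode"] for node in pools_default["nodes"]]
    params = {"knownNodes": ",".join(known), "ejectedNodes": ""}
    return urllib.parse.urlencode(params).encode("ascii")


def trigger_rebalance(base_url: str, user: str, password: str) -> int:
    """Fetch the node list and start a rebalance; returns the HTTP status."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}

    # Step 1: read the current node list from the cluster manager.
    req = urllib.request.Request(f"{base_url}/pools/default", headers=headers)
    with urllib.request.urlopen(req) as resp:
        pools_default = json.load(resp)

    # Step 2: ask the cluster manager to rebalance across those nodes.
    req = urllib.request.Request(
        f"{base_url}/controller/rebalance",
        data=build_rebalance_body(pools_default),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

            The CLI route mentioned in the comment would be the `couchbase-cli rebalance` command, which performs the same operation without hand-built requests.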

            People

              Balakumaran.Gopal Balakumaran Gopal
              Balakumaran.Gopal Balakumaran Gopal