  1. Couchbase Kubernetes
  2. K8S-1690

Backup/restore with Operator doesn't find the repo after changing the backup resource config


Details

    • Bug
    • Resolution: Fixed
    • Major
    • None
    • None
    • kubernetes, operator
    • None
    • 1

    Description

      The Restore resource cannot start because reconciliation fails with the following error:

      {"level":"error","ts":1601331434.5607915,"logger":"cluster","msg":"Reconciliation failed","cluster":"default/cb-example","error":"no corresponding CouchbaseBackup Repo found"
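To spot this failure in a larger operator log, something like the following could be used. This is only a sketch: the helper names `parse_operator_log` and `is_missing_repo_error` are made up for illustration, and tolerating a missing closing brace is an assumption based on how the line was pasted above.

```python
import json


def parse_operator_log(line: str) -> dict:
    """Parse one JSON-formatted operator log line into a dict.

    The line quoted in this report has no closing brace, so a missing
    '}' is tolerated (assumption: the line was cut off when pasted).
    """
    text = line.strip()
    if not text.endswith("}"):
        text += "}"
    return json.loads(text)


def is_missing_repo_error(entry: dict) -> bool:
    """Return True for the 'no corresponding CouchbaseBackup Repo found' failure."""
    return (
        entry.get("level") == "error"
        and entry.get("msg") == "Reconciliation failed"
        and "no corresponding CouchbaseBackup Repo found" in entry.get("error", "")
    )


# The log line exactly as quoted in this report.
line = (
    '{"level":"error","ts":1601331434.5607915,"logger":"cluster",'
    '"msg":"Reconciliation failed","cluster":"default/cb-example",'
    '"error":"no corresponding CouchbaseBackup Repo found"'
)
entry = parse_operator_log(line)
print(entry["cluster"], is_missing_repo_error(entry))
```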

      Steps to reproduce:

      1) Create a backup resource as per the attached backup.yaml, using the couchbase/operator-backup:6.5.1-111 image.

      Events: 
        Type    Reason     Age    From               Message 
        ----    ------     ----   ----               ------- 
        Normal  Scheduled  3m29s  default-scheduler  Successfully assigned default/my-backup-full-1601331000-tc7c4 to minikube 
        Normal  Pulling    3m29s  kubelet, minikube  Pulling image "couchbase/operator-backup:6.5.1-111" 
        Normal  Pulled     41s    kubelet, minikube  Successfully pulled image "couchbase/operator-backup:6.5.1-111" 
        Normal  Created    37s    kubelet, minikube  Created container cbbackupmgr-full 
        Normal  Started    37s    kubelet, minikube  Started container cbbackupmgr-full

      2) Let the backup run once. The logs show that the backup ran successfully:

       

      2020-09-28 22:12:56,415 - root - INFO - epoch: 1601331176 
      2020-09-28 22:12:56,415 - root - INFO - timestamp: 2020-09-28T22_12_56 
      2020-09-28 22:12:56,415 - root - INFO - Namespace(backup_ret='24.00', cacert=None, cluster='cb-example', config='true', end=None, log_ret='24.00', mode='backup', repo=None, start=None, strategy='full_incremental', verbosity='INFO') 
      2020-09-28 22:12:56,415 - root - INFO - start logRetention check 
      2020-09-28 22:12:56,416 - root - INFO - removed 0 logs in /data/scriptlogs/incremental 
      2020-09-28 22:12:56,416 - root - INFO - mode: BACKUP 
      2020-09-28 22:12:56,416 - root - INFO - Perform CONFIG: new Repo to be created 
      2020-09-28 22:12:56,416 - root - INFO - Strategy: full_incremental 
      2020-09-28 22:12:56,416 - root - INFO - Perform FULL BACKUP 
      rm: cannot remove '/data/backups/lock.lk': No such file or directory 
      2020-09-28 22:12:56,421 - root - INFO - performing config as backup archive was just created 
      2020-09-28 22:12:56,422 - root - INFO - config true, config needs to be performed 
      2020-09-28 22:12:56,422 - root - INFO - attempting to create repo: cb-example-2020-09-28T22_12_56 
      2020-09-28 22:12:56,437 - root - INFO - b'Backup repository `cb-example-2020-09-28T22_12_56` created successfully in archive `/data/backups`\n' 
      2020-09-28 22:12:56,437 - root - INFO - attempt to write status 
      2020-09-28 22:12:56,437 - root - INFO - status should have been hopefully written to 
      2020-09-28 22:12:56,437 - root - INFO - attempting to query K8S objects 
      2020-09-28 22:12:56,438 - root - INFO - k8s config loaded 
      2020-09-28 22:12:56,438 - root - INFO - working in namespace: default 
      2020-09-28 22:12:56,476 - root - INFO - BACKUP start 
      2020-09-28 22:12:56,477 - root - INFO - backing up to: cb-example-2020-09-28T22_12_56 
      2020-09-28 22:13:00,649 - root - INFO - Starting Backup cleanup 
      2020-09-28 22:13:00,651 - root - INFO - Time 2020-09-28 22:13:00.651335, Retention 1 day, 0:00:00, Threshold 2020-09-27 22:13:00.651335 
      2020-09-28 22:13:00,651 - root - INFO - Considering repository cb-example-2020-09-28T22_12_56 ... 
      2020-09-28 22:13:00,651 - root - INFO - Creation time within threshold, continuing 
      2020-09-28 22:13:00,671 - root - INFO - Exiting Script, success. Output: {"location": "2020-09-28T22_12_56.526623889Z", "duration_seconds": "4.082307783", "avg_data_transfer_rate_bytes_sec": 796108, "total_items": 7303, "total_items_size_bytes": 3250585, "buckets": {"beer-sample": {"mutations_backedup": "7303", "mutations_failed": "0", "deletions_backedup": "0", "deletions_failed": "0"}}} 
      Directory '/data/scriptlogs' created successfully 
      Directory '/data/scriptlogs/incremental' created successfully 
      Directory '/data/scriptlogs/full_only' created successfully 
      Directory '/data/scriptlogs/restore' created successfully
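Since the repo name does appear in the backup pod logs, it can be recovered from there for comparison with what the backup resource reports. A minimal sketch, assuming the two log messages seen above ("attempting to create repo:" and "backing up to:") are the only places the name is printed; `repo_names` is a hypothetical helper, not part of the backup image.

```python
import re
from typing import List

# Log messages from the backup script that carry the repo name
# (taken from the log output quoted in this report).
REPO_PATTERNS = (
    re.compile(r"attempting to create repo: (\S+)"),
    re.compile(r"backing up to: (\S+)"),
)


def repo_names(log_text: str) -> List[str]:
    """Return the distinct repo names mentioned in the log, in order of appearance."""
    found: List[str] = []
    for line in log_text.splitlines():
        for pattern in REPO_PATTERNS:
            match = pattern.search(line)
            if match and match.group(1) not in found:
                found.append(match.group(1))
    return found


sample = """\
2020-09-28 22:12:56,422 - root - INFO - attempting to create repo: cb-example-2020-09-28T22_12_56
2020-09-28 22:12:56,476 - root - INFO - BACKUP start
2020-09-28 22:12:56,477 - root - INFO - backing up to: cb-example-2020-09-28T22_12_56
"""
print(repo_names(sample))
```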
       
      

      3) However, describing the backup resource doesn't show the repo name:

      • Note: this seems to happen after the backup resource config is updated (for example, by changing the schedule); from that point on the repo info is no longer shown.

        kubectl describe couchbasebackups my-backup                                               
        Name:         my-backup 
        Namespace:    default 
        Labels:       cluster=cb-backup 
        Annotations:  <none> 
        API Version:  couchbase.com/v2 
        Kind:         CouchbaseBackup 
        Metadata: 
          Creation Timestamp:  2020-09-28T22:06:32Z 
          Generation:          5 
          Managed Fields: 
            API Version:  couchbase.com/v2 
            Fields Type:  FieldsV1 
            fieldsV1: 
              f:metadata: 
                f:labels: 
                  .: 
                  f:cluster: 
              f:spec: 
                .: 
                f:backOffLimit: 
                f:backupRetention: 
                f:failedJobsHistoryLimit: 
                f:full: 
                  .: 
                  f:schedule: 
                f:incremental: 
                  .: 
                  f:schedule: 
                f:logRetention: 
                f:size: 
                f:strategy: 
                f:successfulJobsHistoryLimit: 
            Manager:         kubectl 
            Operation:       Update 
            Time:            2020-09-28T22:13:48Z 
          Resource Version:  24795 
          Self Link:         /apis/couchbase.com/v2/namespaces/default/couchbasebackups/my-backup 
          UID:               af185437-df7d-478e-8c69-0b06739c7487 
        Spec: 
          Back Off Limit:             4 
          Backoff Limit:              2 
          Backup Retention:           24h 
          Failed Jobs History Limit:  10 
          Full: 
            Schedule:  */5 3 * * * 
          Incremental: 
            Schedule:                     */5 2 * * * 
          Log Retention:                  24h 
          Size:                           5Gi 
          Strategy:                       full_incremental 
          Successful Jobs History Limit:  10 
        Events: 
          Type    Reason           Age   From  Message 
          ----    ------           ----  ----  ------- 
          Normal  BackupStarted    10m         Backup `my-backup` started 
          Normal  BackupCompleted  10m         Backup `my-backup` completed
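The describe output above has no Status section at all. A quick way to check for the repo programmatically might look like the sketch below, fed with the parsed output of `kubectl get couchbasebackups my-backup -o json`. It assumes the repo name is normally reported under `status.repo`; that field name is an assumption here, and `backup_repo` is a hypothetical helper.

```python
from typing import Optional


def backup_repo(resource: dict) -> Optional[str]:
    """Return the repo name recorded in the resource status, or None if absent.

    Assumption: the operator reports the repo under status.repo.
    """
    status = resource.get("status") or {}
    return status.get("repo") or None


# Mirrors the resource described above: spec present, no status.
described = {
    "kind": "CouchbaseBackup",
    "metadata": {"name": "my-backup", "namespace": "default"},
    "spec": {"strategy": "full_incremental"},
}

# Hypothetical healthy resource, for contrast only.
hypothetical = dict(described, status={"repo": "cb-example-2020-09-28T22_12_56"})

print(backup_repo(described))
print(backup_repo(hypothetical))
```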

        4) Creating a restore resource (restore.yaml) produces the following error:

        {"level":"error","ts":1601331434.5607915,"logger":"cluster","msg":"Reconciliation failed","cluster":"default/cb-example","error":"no corresponding CouchbaseBackup Repo found"

        5) Running an incremental backup after the first backup still doesn't show the repo:

        2020-09-28 22:26:17,056 - root - INFO - epoch: 1601331977 
        2020-09-28 22:26:17,056 - root - INFO - timestamp: 2020-09-28T22_26_17 
        2020-09-28 22:26:17,056 - root - INFO - Namespace(backup_ret='24.00', cacert=None, cluster='cb-example', config='false', end=None, log_ret='24.00', mode='backup', repo=None, start=None, strategy='full_incremental', verbosity='INFO') 
        2020-09-28 22:26:17,056 - root - INFO - start logRetention check 
        2020-09-28 22:26:17,056 - root - INFO - removed 0 logs in /data/scriptlogs/incremental 
        2020-09-28 22:26:17,056 - root - INFO - mode: BACKUP 
        2020-09-28 22:26:17,056 - root - INFO - Strategy: full_incremental 
        2020-09-28 22:26:17,056 - root - INFO - Perform INCREMENTAL BACKUP 
        mkdir: cannot create directory ‘/data/backups’: File exists 
        rm: cannot remove '/data/backups/lock.lk': No such file or directory 
        2020-09-28 22:26:17,060 - root - INFO - backup archive /data/backups already exists 
        2020-09-28 22:26:17,060 - root - WARNING - an incremental job was first run. 
        2020-09-28 22:26:17,060 - root - WARNING - please make sure you alter your schedules so a full backup is performed first 
        2020-09-28 22:26:17,060 - root - INFO - attempting to query K8S objects 
        2020-09-28 22:26:17,061 - root - INFO - k8s config loaded 
        2020-09-28 22:26:17,061 - root - INFO - working in namespace: default 
        2020-09-28 22:26:17,103 - root - INFO - BACKUP start 
        2020-09-28 22:26:17,103 - root - INFO - backing up to: cb-example-2020-09-28T22_12_56 
        2020-09-28 22:26:21,130 - root - INFO - Starting Backup cleanup 
        2020-09-28 22:26:21,130 - root - INFO - Time 2020-09-28 22:26:21.130923, Retention 1 day, 0:00:00, Threshold 2020-09-27 22:26:21.130923 
        2020-09-28 22:26:21,131 - root - INFO - Considering repository cb-example-2020-09-28T22_12_56 ... 
        2020-09-28 22:26:21,131 - root - INFO - Creation time within threshold, continuing 
        2020-09-28 22:26:21,166 - root - INFO - Exiting Script, success. Output: {"location": "2020-09-28T22_26_17.153256303Z", "duration_seconds": "3.939892963", "avg_data_transfer_rate_bytes_sec": 7280, "total_items": 0, "total_items_size_bytes": 28672, "buckets": {"beer-sample": {"mutations_backedup": "0", "mutations_failed": "0", "deletions_backedup": "0", "deletions_failed": "0"}}}
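The final "Exiting Script, success. Output:" line carries a JSON summary of the run. A small sketch for pulling it out, assuming the log line is available as a single unwrapped line; `backup_summary` is a hypothetical helper written for this report.

```python
import json
from typing import Optional

MARKER = "Exiting Script, success. Output: "


def backup_summary(line: str) -> Optional[dict]:
    """Return the JSON summary that follows the success marker, or None if absent."""
    _, marker, payload = line.partition(MARKER)
    if not marker:
        return None
    return json.loads(payload)


# The incremental run's final log line from this report, joined into one line.
line = (
    '2020-09-28 22:26:21,166 - root - INFO - Exiting Script, success. Output: '
    '{"location": "2020-09-28T22_26_17.153256303Z", "duration_seconds": "3.939892963", '
    '"avg_data_transfer_rate_bytes_sec": 7280, "total_items": 0, '
    '"total_items_size_bytes": 28672, "buckets": {"beer-sample": '
    '{"mutations_backedup": "0", "mutations_failed": "0", '
    '"deletions_backedup": "0", "deletions_failed": "0"}}}'
)
summary = backup_summary(line)
print(summary["location"], summary["total_items"])
```

For the incremental run above this reports zero items, which is consistent with no documents having changed since the full backup.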

        It seems that the backup resource shows the repo name while the backup is running, but no longer shows it once the backup has finished.

       

      cbopinfo is attached

      Attachments

        1. backup.yaml
          0.4 kB
        2. cbopinfo-20200925T132537+0000.tar.gz
          2.45 MB
        3. cbopinfo-20200928T155936-0700.tar.gz
          775 kB
        4. cbopinfo-20201001T132628-0700.tar.gz
          790 kB
        5. cbopinfo-20201001T132628-0700.tar.gz
          790 kB
        6. describerestore.out
          3 kB
        7. restore.yaml
          0.2 kB
        8. restorelogs.out
          9 kB
        9. restorelogs.out
          9 kB

          People

            daniel.ma Daniel Ma (Inactive)
            tin.tran Tin Tran (Inactive)