  1. Couchbase Kubernetes
  2. K8S-3605

Upgrade swap rebalance is re-tried with different params on operator pod deletion


Details

    • Bug
    • Resolution: Unresolved
    • Major
    • 2.8.0
    • 2.7.0
    • operator
    • Initial Cluster version : 7.6.0-2176
      Upgrade Cluster version : 7.6.1-3200
      Kubernetes Version : v1.30.0
      CAO and operator : 2.7.0 built locally
      Environment : Kind cluster
    • 17 -Timetrap
    • 2

    Description

      Cluster Setup

      • Kind cluster locally run on Mac
      • 7 nodes with services kv, index, n1ql
      • 6 buckets
      • Initial Cluster version : 7.6.0-2176
      • Upgrade Cluster version : 7.6.1-3200

      Steps taken in the scenario

      • Created a cluster
      • Created 6 buckets
      • Issued an upgrade from 7.6.0-2176 to 7.6.1-3200 using swap rebalance
      • After one pod was upgraded successfully, stopped/failed the swap rebalance for the second pod upgrade. (In that rebalance, cb-example-0002 was being ejected and cb-example-0008 was the upgraded pod being added.)
      • Also deleted the operator pod.
      • When the new operator pod came up, the rebalance was retried with different parameters:
      • cb-example-0000 was ejected and cb-example-0008 was added as the upgraded pod.

      Rebalance before operator pod restart

       

      Rebalance post operator pod restart

      Issue

      • After the operator pod was killed, the upgrade stopped behaving like the swap rebalance that was intended.
      • If 3 buckets were already swapped between cb-example-0002 and cb-example-0008 in the first rebalance, which then failed, the remaining 3 should be rebalanced between the same pods.
      • Because a different pod is ejected on the retry, the data already on cb-example-0008 has to be deleted and a new swap rebalance started. This nullifies the advantages of the swap rebalance.
      • When the number of buckets is high and the data size is in terabytes, this causes a huge performance deterioration.
      • The rebalance should be retried with the same configuration, that is, the same ejected pod and the same added pod.
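
      The expected behaviour can be sketched in Go as below. This is only an illustration of the requested selection logic, not the operator's actual code: the SwapAttempt type, the nextSwap function, and the idea that the interrupted attempt is available to the restarted operator (for example, recorded somewhere durable or derived from cluster state) are all assumptions. The pod list is an illustrative subset of the cluster in this report.

```go
package main

import "fmt"

// SwapAttempt records which pod a swap rebalance ejects and which it adds.
// Hypothetical type for illustration only; not an operator type.
type SwapAttempt struct {
	Eject string
	Add   string
}

// contains reports whether name is present in pods.
func contains(pods []string, name string) bool {
	for _, p := range pods {
		if p == name {
			return true
		}
	}
	return false
}

// nextSwap chooses the pair for the next upgrade swap rebalance.
// interrupted is the last attempt that did not complete (nil if none).
// oldPods lists pods still on the old version, in the order a fresh
// selection would consider them. newPod is the name used for a fresh add.
// The boolean result is false when there is nothing left to upgrade.
func nextSwap(interrupted *SwapAttempt, oldPods []string, newPod string) (SwapAttempt, bool) {
	// Resume the interrupted pair if its ejection target still exists,
	// so data already moved to the added pod is not thrown away.
	if interrupted != nil && contains(oldPods, interrupted.Eject) {
		return *interrupted, true
	}
	if len(oldPods) == 0 {
		return SwapAttempt{}, false
	}
	// No usable interrupted attempt: fall back to a fresh selection.
	return SwapAttempt{Eject: oldPods[0], Add: newPod}, true
}

func main() {
	// Assumed state after the operator restart in this report.
	prev := &SwapAttempt{Eject: "cb-example-0002", Add: "cb-example-0008"}
	oldPods := []string{"cb-example-0000", "cb-example-0002"}

	s, _ := nextSwap(prev, oldPods, "cb-example-0008")
	fmt.Printf("eject=%s add=%s\n", s.Eject, s.Add)
	// prints: eject=cb-example-0002 add=cb-example-0008
}
```

      Calling nextSwap with a nil interrupted attempt on the same list selects cb-example-0000, which matches the pairing observed after the operator restart.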

      Operator logs : https://cb-engineering.s3.amazonaws.com/K8S-3605/cbopinfo-20240801T185041+0530.tar.gz

      Cluster logs : 
      https://cb-engineering.s3.amazonaws.com/K8S-3605/collectinfo-2024-08-01T132027-ns_1%40cb-example-0007.cb-example.default.svc.zip
      https://cb-engineering.s3.amazonaws.com/K8S-3605/collectinfo-2024-08-01T132027-ns_1%40cb-example-0008.cb-example.default.svc.zip
      https://cb-engineering.s3.amazonaws.com/K8S-3605/collectinfo-2024-08-01T132027-ns_1%40cb-example-0009.cb-example.default.svc.zip
      https://cb-engineering.s3.amazonaws.com/K8S-3605/collectinfo-2024-08-01T132027-ns_1%40cb-example-0010.cb-example.default.svc.zip
      https://cb-engineering.s3.amazonaws.com/K8S-3605/collectinfo-2024-08-01T132027-ns_1%40cb-example-0011.cb-example.default.svc.zip
      https://cb-engineering.s3.amazonaws.com/K8S-3605/collectinfo-2024-08-01T132027-ns_1%40cb-example-0012.cb-example.default.svc.zip
      Cluster deployment : https://cb-engineering.s3.amazonaws.com/K8S-3605/couchbase-cluster.yaml

      Bucket deployment : https://cb-engineering.s3.amazonaws.com/K8S-3605/couchbase-buckets.yaml

      Cluster upgrade deployment : https://cb-engineering.s3.amazonaws.com/K8S-3605/couchbase-cluster-upgrade.yaml


      The cao tool and operator images were built locally on this commit:

      commit c6c620990c0d8a42f11b9081da4722dbf2e72595 (HEAD -> 2.7.x, origin/2.7.x)
      Author: usamah jassat <usamah.jassat@couchbase.com>
      Date:   Thu Jul 25 15:20:23 2024 +0100

          K8S-3540: Rename delta recovery remnants

          Change-Id: Idd31502edc841972a8f10d3af1156b959fc9b44b
          Reviewed-on: https://review.couchbase.org/c/couchbase-operator/+/213311
          Reviewed-by: Justin Ashworth <justin.ashworth@couchbase.com>
          Tested-by: Build Bot <build@couchbase.com>
      


          People

            yusuf.ramzan Yusuf Ramzan
            raghav.sk Raghav S K