Details
Description
Cluster Setup
- Kind cluster running locally on a Mac
- 5 nodes with all services
- 1 bucket
- Initial Cluster version : 7.2.2-6401
- Upgrade Cluster version : 7.2.3-6705
- Downgrade Cluster version : 7.2.2-6401
Steps taken in the scenario
- Created a cluster.
- Created 1 bucket.
- Issued an upgrade from 7.2.2-6401 to 7.2.3-6705 using swap rebalance.
- The swap rebalance of cb-example-0001 with cb-example-0005 completed.
- Hibernated the cluster.
- Woke up the cluster with 7.2.2 as the image instead of 7.2.3.
- The cluster never recovered.
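The wake-up step asks for an older image than the one the upgrade was moving to, so it is effectively a downgrade in the middle of a rolling upgrade. A minimal sketch of how the two version strings could be compared to flag this; `parse_version` and `is_downgrade` are hypothetical helpers written for illustration and are not taken from the operator code.

```python
import re

def parse_version(tag):
    """Parse '7.2.3-6705' or '7.2.2' into (major, minor, patch).

    The build number after '-' is optional and ignored here (assumption:
    only the release triple matters for spotting a downgrade).
    """
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:-\d+)?", tag)
    if m is None:
        raise ValueError("unrecognised version tag: %r" % tag)
    return tuple(int(part) for part in m.groups())

def is_downgrade(current, requested):
    """Return True when the requested version is older than the current one."""
    return parse_version(requested) < parse_version(current)

# Versions from this report: upgrade target vs. image used on wake-up.
print(is_downgrade("7.2.3-6705", "7.2.2-6401"))
```

With the versions from this scenario the check reports a downgrade, which is the condition the wake-up step introduces.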
The operator goes into a loop, repeatedly logging the following:
{"level":"debug","ts":"2024-08-07T10:45:31Z","logger":"api","msg":"http","cluster":"default/cb-example","method":"GET","url":"http://cb-example-0000.cb-example.default.svc:8091/pools/default","status":"200 OK","time_ms":4.649042}
{"level":"error","ts":"2024-08-07T10:45:31Z","logger":"cluster","msg":"Failed to update members","cluster":"default/cb-example","error":"error extracting image verion","stacktrace":"github.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).runReconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:523\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).Update\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:608\ngithub.com/couchbase/couchbase-operator/pkg/controller.(*CouchbaseClusterReconciler).Reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/controller/controller.go:90\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227"}
{"level":"error","ts":"2024-08-07T10:45:31Z","logger":"cluster","msg":"Failed to rotate expired certificates","cluster":"default/cb-example","error":"TLS invalid: Attempted to check if certifiates are expired but TLS was never initialized","stacktrace":"github.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).runReconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:548\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).Update\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:608\ngithub.com/couchbase/couchbase-operator/pkg/controller.(*CouchbaseClusterReconciler).Reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/controller/controller.go:90\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227"}
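To confirm that the operator is stuck repeating the same failures, the structured log lines can be tallied by message and error. The following is a rough sketch that assumes the operator log is available as one JSON object per line, as in the entries above; `summarize_errors` is a hypothetical helper, not part of the operator or the cao tool.

```python
import json
from collections import Counter

def summarize_errors(lines):
    """Count error-level entries, keyed by (msg, error)."""
    counts = Counter()
    for raw in lines:
        # Tolerate a stray trailing '|' left over from table formatting.
        text = raw.strip().rstrip("|").strip()
        if not text:
            continue
        try:
            entry = json.loads(text)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if isinstance(entry, dict) and entry.get("level") == "error":
            counts[(entry.get("msg"), entry.get("error"))] += 1
    return counts

# Usage (the file name is an assumption):
# with open("operator.log") as fh:
#     for (msg, err), n in summarize_errors(fh).most_common():
#         print(n, msg, "->", err)
```

A small number of distinct keys with steadily growing counts would indicate the reconcile loop described here.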
Operator logs : https://cb-engineering.s3.amazonaws.com/K8S-3609/collectinfo-2024-08-07T104800-ns_1%40cb-example-0000.cb-example.default.svc.zip
Cluster logs : https://cb-engineering.s3.amazonaws.com/K8S-3609/cbopinfo-20240807T161714+0530.tar.gz
Couchbase deployment : https://cb-engineering.s3.amazonaws.com/K8S-3609/couchbase-cluster.yaml
The cao tool and operator images were built locally from this commit:
commit f752305ba8574b4464efb7abb009a52a5560fc1b (HEAD -> 2.7.x, origin/2.7.x)
Author: Yusuf Ramzan <yusuf.ramzan@couchbase.com>
Date: Mon Aug 5 14:50:00 2024 +0100

    K8S-3598 Fixed not all nodes ready for rebalance

    Change-Id: I78e2c3fc76ad8d848e86dac836469e75cfc92683
    Reviewed-on: https://review.couchbase.org/c/couchbase-operator/+/213742
    Tested-by: Build Bot <build@couchbase.com>
    Reviewed-by: <usamah.jassat@couchbase.com>