Details
-
Bug
-
Resolution: Fixed
-
Critical
-
2.7.0
-
Initial Cluster version : 7.6.0-2176
Upgrade Cluster version : 7.6.1-3200
Kubernetes Version : v1.30.0
CAO and operator : 2.7.0 built locally
Environment : Kind cluster
-
15 - First Frontier, 16 - Killing Time
-
2
Description
Cluster Setup
- Kind cluster running locally on a Mac
- 3 nodes in cluster
- 8 buckets
- Initial Cluster version : 7.6.0-2176
- Upgrade Cluster version : 7.6.1-3200
Steps taken in the scenario
- Created a cluster
- Created 8 buckets
- Issued an upgrade with delta recovery
- When one of the nodes was picked up for failover, manually deleted the pod
$ kubectl delete pod cb-example-0001
- The upgrade then continues with the newly restarted pod
Issue
{"level":"info","ts":"2024-07-24T17:34:04Z","msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"couchbase-controller","object":{"name":"cb-example","namespace":"default"},"namespace":"default","name":"cb-example","reconcileID":"c7cc48b9-9718-4b66-8e42-f88780a6fd3e"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x70 pc=0x11fee90]

goroutine 16 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
    sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:116 +0x1a4
panic({0x146f460?, 0x26d88e0?})
    runtime/panic.go:770 +0x124
github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).checkOrchestratorOnLatestVersion.func1()
    github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1299 +0x1a0
github.com/couchbase/couchbase-operator/pkg/util/retryutil.Retry({0x1988c30, 0x4001026000}, 0xdf8475800?, 0x400015eab8)
    github.com/couchbase/couchbase-operator/pkg/util/retryutil/retryutil.go:14 +0x70
github.com/couchbase/couchbase-operator/pkg/util/retryutil.RetryFor(0x0?, 0x4000cd8ab8)
    github.com/couchbase/couchbase-operator/pkg/util/retryutil/retryutil.go:30 +0x5c
github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).checkOrchestratorOnLatestVersion(0x40010265b0?, 0x400080b5c0?, {0x400102bd39?, 0x4000da1ad0?})
    github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1306 +0x50
github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).recreateAndRebalanceNode(0x4000e2fa40, 0x40007000e0, {0x1993d18, 0x4000c5c900}, {0x400102bd39, 0x5}, 0x0?)
    github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1340 +0x258
github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).handleInPlaceUpgrade(0x4000e2fa40, 0x40007000e0, 0x40007c7c80, {0x400102bd39, 0x5})
    github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1414 +0x32c
github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).handleUpgradeNode(0x4000e2fa40, 0x40007000e0)
    github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1579 +0x5c4
github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).exec(0x4000e2fa40, 0x40007000e0)
    github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:323 +0x164
github.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcileMembers(...)
    github.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:266
github.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcile(0x40007000e0)
    github.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:173 +0x734
github.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).runReconcile(0x40007000e0)
    github.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:544 +0x440
github.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).Update(0x40007000e0, 0x4000b93608)
    github.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:591 +0x2d0
github.com/couchbase/couchbase-operator/pkg/controller.(*CouchbaseClusterReconciler).Reconcile(0x40000e2c00, {0x0?, 0x0?}, {{{0x4000b11610, 0x7}, {0x4000b11606, 0xa}}})
    github.com/couchbase/couchbase-operator/pkg/controller/controller.go:90 +0x560
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x198d340?, {0x1988b88?, 0x4000da18c0?}, {{{0x4000b11610?, 0xb?}, {0x4000b11606?, 0x0?}}})
    sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:119 +0x8c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0x40003c8960, {0x1988bc0, 0x40003622d0}, {0x1526a20, 0x4000a986c0})
    sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:316 +0x2dc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0x40003c8960, {0x1988bc0, 0x40003622d0})
    sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266 +0x198
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
    sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227 +0x74
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 103
    sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:223 +0x404
Operator logs :
Before restart with panic - https://cb-engineering.s3.amazonaws.com/K8S-3596/operator.log
Post restart - https://cb-engineering.s3.amazonaws.com/K8S-3596/cbopinfo-20240724T232125+0530.tar.gz
Cluster logs :
https://cb-engineering.s3.amazonaws.com/K8S-3596/collectinfo-2024-07-24T175020-ns_1%40cb-example-0000.cb-example.default.svc.zip
https://cb-engineering.s3.amazonaws.com/K8S-3596/collectinfo-2024-07-24T175020-ns_1%40cb-example-0001.cb-example.default.svc.zip
https://cb-engineering.s3.amazonaws.com/K8S-3596/collectinfo-2024-07-24T175020-ns_1%40cb-example-0002.cb-example.default.svc.zip
The cao tool and operator images were built locally on this commit:
commit c2e920ddbcfa9b4819d47ad81d0a35c359dd1dc6 (HEAD -> master, origin/master, origin/HEAD)
Author: usamah jassat <usamah.jassat@couchbase.com>
Date: Wed Jul 17 15:11:19 2024 +0100

    K8S-3581: don't attempt backend migration when rebalance required

    Change-Id: I2d2b6d6d4f8dbb0a30db5bd54a05631d17631eee
    Reviewed-on: https://review.couchbase.org/c/couchbase-operator/+/212890
    Reviewed-by: Yusuf Ramzan <yusuf.ramzan@couchbase.com>
    Tested-by: Build Bot <build@couchbase.com>
For Gerrit Dashboard: K8S-3596

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
213929,1 | K8S-3596: Add nil check on orchestratorMember before checking its version | master | couchbase-operator | ABANDONED | 0 | +1 |
214377,6 | K8S-3596: Update orchestrator name check with contains and add nil safeguard | master | couchbase-operator | MERGED | +2 | +1 |