  1. Couchbase Kubernetes
  2. K8S-3596

panic: SIGSEGV (invalid memory address or nil pointer dereference) in operator code


Details

    • 15 - First Frontier, 16 - Killing Time
    • 2

    Description

      Cluster Setup

      • Kind cluster locally run on Mac
      • 3 nodes in cluster
      • 8 buckets
      • Initial Cluster version : 7.6.0-2176
      • Upgrade Cluster version : 7.6.1-3200

      Steps taken in the scenario

      • Created a cluster
      • Created 8 buckets
      • Issued an upgrade with delta recovery
      • When one of the nodes was picked up for failover, manually deleted the pod:

      $ kubectl delete pod cb-example-0001

      • Operator fails with a panic and restarts.

      • The upgrade then goes on with the new restarted pod

      Issue

      {"level":"info","ts":"2024-07-24T17:34:04Z","msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"couchbase-controller","object":{"name":"cb-example","namespace":"default"},"namespace":"default","name":"cb-example","reconcileID":"c7cc48b9-9718-4b66-8e42-f88780a6fd3e"}
      panic: runtime error: invalid memory address or nil pointer dereference [recovered]
              panic: runtime error: invalid memory address or nil pointer dereference
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x70 pc=0x11fee90]

      goroutine 16 [running]:
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
              sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:116 +0x1a4
      panic({0x146f460?, 0x26d88e0?})
              runtime/panic.go:770 +0x124
      github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).checkOrchestratorOnLatestVersion.func1()
              github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1299 +0x1a0
      github.com/couchbase/couchbase-operator/pkg/util/retryutil.Retry({0x1988c30, 0x4001026000}, 0xdf8475800?, 0x400015eab8)
              github.com/couchbase/couchbase-operator/pkg/util/retryutil/retryutil.go:14 +0x70
      github.com/couchbase/couchbase-operator/pkg/util/retryutil.RetryFor(0x0?, 0x4000cd8ab8)
              github.com/couchbase/couchbase-operator/pkg/util/retryutil/retryutil.go:30 +0x5c
      github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).checkOrchestratorOnLatestVersion(0x40010265b0?, 0x400080b5c0?, {0x400102bd39?, 0x4000da1ad0?})
              github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1306 +0x50
      github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).recreateAndRebalanceNode(0x4000e2fa40, 0x40007000e0, {0x1993d18, 0x4000c5c900}, {0x400102bd39, 0x5}, 0x0?)
              github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1340 +0x258
      github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).handleInPlaceUpgrade(0x4000e2fa40, 0x40007000e0, 0x40007c7c80, {0x400102bd39, 0x5})
              github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1414 +0x32c
      github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).handleUpgradeNode(0x4000e2fa40, 0x40007000e0)
              github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1579 +0x5c4
      github.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).exec(0x4000e2fa40, 0x40007000e0)
              github.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:323 +0x164
      github.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcileMembers(...)
              github.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:266
      github.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcile(0x40007000e0)
              github.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:173 +0x734
      github.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).runReconcile(0x40007000e0)
              github.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:544 +0x440
      github.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).Update(0x40007000e0, 0x4000b93608)
              github.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:591 +0x2d0
      github.com/couchbase/couchbase-operator/pkg/controller.(*CouchbaseClusterReconciler).Reconcile(0x40000e2c00, {0x0?, 0x0?}, {{{0x4000b11610, 0x7}, {0x4000b11606, 0xa}}})
              github.com/couchbase/couchbase-operator/pkg/controller/controller.go:90 +0x560
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x198d340?, {0x1988b88?, 0x4000da18c0?}, {{{0x4000b11610?, 0xb?}, {0x4000b11606?, 0x0?}}})
              sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:119 +0x8c
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0x40003c8960, {0x1988bc0, 0x40003622d0}, {0x1526a20, 0x4000a986c0})
              sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:316 +0x2dc
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0x40003c8960, {0x1988bc0, 0x40003622d0})
              sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266 +0x198
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
              sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227 +0x74
      created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 103
              sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:223 +0x404
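
      The trace places the fault in the closure that checkOrchestratorOnLatestVersion passes to retryutil.Retry (nodereconcile.go:1299), and the fault address 0x70 is consistent with reading a field at a small offset from a nil struct pointer. The operator source at that line is not shown in this ticket, so the Go sketch below only illustrates the general class of bug and one possible guard. Every name in it (member, lookupOrchestrator, retry, checkOrchestratorVersion) is hypothetical and is not the operator's API.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// member is a hypothetical stand-in for the operator's view of a cluster node.
type member struct {
	name    string
	version string
}

// lookupOrchestrator is a hypothetical lookup. It returns nil when the pod
// backing the requested member no longer exists (for example, after a manual
// "kubectl delete pod").
func lookupOrchestrator(members map[string]*member, name string) *member {
	return members[name]
}

// retry is a simplified, hypothetical retry helper: it calls fn until fn
// reports done, fn returns an error, or the attempts are used up.
func retry(attempts int, interval time.Duration, fn func() (bool, error)) error {
	for i := 0; i < attempts; i++ {
		done, err := fn()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		time.Sleep(interval)
	}
	return errors.New("retry: attempts exhausted")
}

// checkOrchestratorVersion waits for the named member to report the wanted
// version. The nil check turns a missing member into an ordinary error.
func checkOrchestratorVersion(members map[string]*member, name, want string) error {
	return retry(3, 10*time.Millisecond, func() (bool, error) {
		m := lookupOrchestrator(members, name)
		if m == nil {
			// Without this guard, m.version below would panic with
			// "invalid memory address or nil pointer dereference".
			return false, fmt.Errorf("member %q not found", name)
		}
		return m.version == want, nil
	})
}

func main() {
	members := map[string]*member{
		"cb-example-0000": {name: "cb-example-0000", version: "7.6.1-3200"},
	}
	// Member present and on the wanted version.
	fmt.Println(checkOrchestratorVersion(members, "cb-example-0000", "7.6.1-3200"))
	// Member missing, as if its pod had been deleted mid-upgrade.
	fmt.Println(checkOrchestratorVersion(members, "cb-example-0001", "7.6.1-3200"))
}
```

      In this sketch a missing member aborts the retry with an error so the caller can requeue the reconcile. Returning (false, nil) instead would keep polling until the member reappears. Which behaviour suits the operator is a design decision this ticket does not settle.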


      Operator logs:

      Before restart (with panic): https://cb-engineering.s3.amazonaws.com/K8S-3596/operator.log

      After restart: https://cb-engineering.s3.amazonaws.com/K8S-3596/cbopinfo-20240724T232125+0530.tar.gz

      Cluster logs:
      https://cb-engineering.s3.amazonaws.com/K8S-3596/collectinfo-2024-07-24T175020-ns_1%40cb-example-0000.cb-example.default.svc.zip
      https://cb-engineering.s3.amazonaws.com/K8S-3596/collectinfo-2024-07-24T175020-ns_1%40cb-example-0001.cb-example.default.svc.zip
      https://cb-engineering.s3.amazonaws.com/K8S-3596/collectinfo-2024-07-24T175020-ns_1%40cb-example-0002.cb-example.default.svc.zip


      The cao tool and operator images were built locally from this commit:

      commit c2e920ddbcfa9b4819d47ad81d0a35c359dd1dc6 (HEAD -> master, origin/master, origin/HEAD)
      Author: usamah jassat <usamah.jassat@couchbase.com>
      Date:   Wed Jul 17 15:11:19 2024 +0100

          K8S-3581: don't attempt backend migration when rebalance required

          Change-Id: I2d2b6d6d4f8dbb0a30db5bd54a05631d17631eee
          Reviewed-on: https://review.couchbase.org/c/couchbase-operator/+/212890
          Reviewed-by: Yusuf Ramzan <yusuf.ramzan@couchbase.com>
          Tested-by: Build Bot <build@couchbase.com>


          People

            ben.mottershead Ben Mottershead
            raghav.sk Raghav S K