Couchbase Kubernetes / K8S-3125

Failure to rebalance during recovery of ephemeral bucket

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 2.5.0
    • None
    • None
    • None
    • 2.5.0 - Time of our Lives
    • 2

    Description

      The test TestAutoRecoveryEpehemeralWithNoAutofailover is failing while waiting for a rebalance event:

      Error Message: context deadline exceeded: failed to wait for event RebalanceStarted/A rebalance has been started to balance data across the cluster

      There's a rebalance failure in the logs, which might be related:

      {"level":"info","ts":1690442522.2835317,"logger":"cluster","msg":"Reconciliation failed","cluster":"test-sft6t/test-couchbase-b7k7c","error":"failed to rebalance: timeout: unexpected rebalance error","stack":"github.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).verifyRebalance.func1\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/rebalance.go:282\ngithub.com/couchbase/couchbase-operator/pkg/util/retryutil.Retry\n\tgithub.com/couchbase/couchbase-operator/pkg/util/retryutil/retryutil.go:14\ngithub.com/couchbase/couchbase-operator/pkg/util/retryutil.RetryFor\n\tgithub.com/couchbase/couchbase-operator/pkg/util/retryutil/retryutil.go:30\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).verifyRebalance\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/rebalance.go:307\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).rebalance\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/rebalance.go:246\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).handleRebalance\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1085\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).exec\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:307\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcileMembers\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:265\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:175\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).runReconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:490\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).Update\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:533\ngithub.com/couchbase/couchbase-operator/pkg/controller.(*CouchbaseClusterReconciler).Reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/controller/controller.go:91\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.13.1/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.13.1/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.13.1/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.13.1/pkg/internal/controller/controller.go:234"}

      {"level":"info","ts":1690442522.2903638,"logger":"cluster","msg":"Resource updated","cluster":"test-sft6t/test-couchbase-b7k7c","diff":"  (\n  \t\"\"\"\n  \t... // 32 identical lines\n  \t  type: Balanced\n  \t- lastTransitionTime: \"2023-07-27T07:20:09Z\"\n- \t  lastUpdateTime: \"2023-07-27T07:20:09Z\"\n- \t  message: 'reconcile was blocked from running: waiting for pod failover'\n+ \t  lastUpdateTime: \"2023-07-27T07:22:02Z\"\n+ \t  message: 'failed to rebalance: timeout: unexpected rebalance error'\n  \t  reason: ErrorEncountered\n  \t  status: \"True\"\n  \t... // 23 identical lines\n  \t\"\"\"\n  )\n"}

      CC: Justin Ashworth 

          People

            abhi.bose Abhi Bose (Inactive)
            gilad.kalchheim Gilad Kalchheim