  1. Couchbase Kubernetes
  2. K8S-2681

New Operator Instance is unauthorised to update Cluster status.


Details

    • Bug
    • Resolution: Not a Bug
    • Minor
    • None
    • 2.3.0
    • operator
    • None
    • 1

    Description

      This might not be a bug, since the test case passes and the cause may be a simple race condition, but I saw unexpected messages in the logs that were not reported in the Operator logs.

      TestCase: TestKillOperatorAndUpdateClusterConfig

      Couchbase Server Version: 7.0.3

      Operator logs attached (not sure if they are of value, because they don't contain any helpful data).

      Error messages in the Operator logs while the test case was running: 

      {"level":"error","ts":1647517815.3176866,"msg":"error retrieving resource lock test-89qxf/couchbase-operator: Unauthorized\n","stacktrace":"k8s.io/client-go/tools/leaderelection.(*LeaderElector).renew.func1.1\n\tk8s.io/client-go@v0.23.2/tools/leaderelection/leaderelection.go:272\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\tk8s.io/apimachinery@v0.23.2/pkg/util/wait/wait.go:220\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\tk8s.io/apimachinery@v0.23.2/pkg/util/wait/wait.go:233\nk8s.io/apimachinery/pkg/util/wait.WaitForWithContext\n\tk8s.io/apimachinery@v0.23.2/pkg/util/wait/wait.go:660\nk8s.io/apimachinery/pkg/util/wait.poll\n\tk8s.io/apimachinery@v0.23.2/pkg/util/wait/wait.go:594\nk8s.io/apimachinery/pkg/util/wait.PollImmediateUntilWithContext\n\tk8s.io/apimachinery@v0.23.2/pkg/util/wait/wait.go:545\nk8s.io/apimachinery/pkg/util/wait.PollImmediateUntil\n\tk8s.io/apimachinery@v0.23.2/pkg/util/wait/wait.go:536\nk8s.io/client-go/tools/leaderelection.(*LeaderElector).renew.func1\n\tk8s.io/client-go@v0.23.2/tools/leaderelection/leaderelection.go:271\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\tk8s.io/apimachinery@v0.23.2/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\tk8s.io/apimachinery@v0.23.2/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tk8s.io/apimachinery@v0.23.2/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\tk8s.io/apimachinery@v0.23.2/pkg/util/wait/wait.go:90\nk8s.io/client-go/tools/leaderelection.(*LeaderElector).renew\n\tk8s.io/client-go@v0.23.2/tools/leaderelection/leaderelection.go:268\nk8s.io/client-go/tools/leaderelection.(*LeaderElector).Run\n\tk8s.io/client-go@v0.23.2/tools/leaderelection/leaderelection.go:212\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startLeaderElection.func3\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/manager/internal.go:642"}
       
      {"level":"info","ts":1647517817.2948174,"msg":"failed to renew lease test-89qxf/couchbase-operator: timed out waiting for the condition\n"}
       
      {"level":"error","ts":1647517817.2948787,"msg":"error received after stop sequence was engaged","error":"leader election lost","stacktrace":"sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1\n\tsigs.k8s.io/controller-runtime@v0.11.0/pkg/manager/internal.go:541"}
       
      {"level":"debug","ts":1647517817.2949321,"logger":"events","msg":"Normal","object":{"kind":"ConfigMap","apiVersion":"v1"},"reason":"LeaderElection","message":"couchbase-operator-b774b44c6-lh8xp_74c90ce5-83ab-45d2-9a27-45b59af17463 stopped leading"}
       
      {"level":"debug","ts":1647517817.29499,"logger":"events","msg":"Normal","object":{"kind":"Lease","namespace":"test-89qxf","name":"couchbase-operator","uid":"b3a54459-dbee-44a7-aa4f-302b671e1107","apiVersion":"coordination.k8s.io/v1","resourceVersion":"4989"},"reason":"LeaderElection","message":"couchbase-operator-b774b44c6-lh8xp_74c90ce5-83ab-45d2-9a27-45b59af17463 stopped leading"}
       
      {"level":"error","ts":1647517817.3014927,"msg":"Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\".16dd28b9d4fb91b1\", GenerateName:\"\", Namespace:\"default\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:\"\", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"ConfigMap\", Namespace:\"\", Name:\"\", UID:\"\", APIVersion:\"v1\", ResourceVersion:\"\", FieldPath:\"\"}, Reason:\"LeaderElection\", Message:\"couchbase-operator-b774b44c6-lh8xp_74c90ce5-83ab-45d2-9a27-45b59af17463 stopped leading\", Source:v1.EventSource{Component:\"couchbase-operator-b774b44c6-lh8xp_74c90ce5-83ab-45d2-9a27-45b59af17463\", Host:\"\"}, FirstTimestamp:time.Date(2022, time.March, 17, 11, 50, 17, 294786993, time.Local), LastTimestamp:time.Date(2022, time.March, 17, 11, 50, 17, 294786993, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'Unauthorized' (will not retry!)\n","stacktrace":"k8s.io/client-go/tools/record.recordToSink\n\tk8s.io/client-go@v0.23.2/tools/record/event.go:216\nk8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartRecordingToSink.func1\n\tk8s.io/client-go@v0.23.2/tools/record/event.go:194\nk8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartEventWatcher.func1\n\tk8s.io/client-go@v0.23.2/tools/record/event.go:311"}
       
      {"level":"error","ts":1647517817.3104205,"msg":"Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\"couchbase-operator.16dd28b9d4fbb521\", GenerateName:\"\", Namespace:\"test-89qxf\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:\"\", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"Lease\", Namespace:\"test-89qxf\", Name:\"couchbase-operator\", UID:\"b3a54459-dbee-44a7-aa4f-302b671e1107\", APIVersion:\"coordination.k8s.io/v1\", ResourceVersion:\"4989\", FieldPath:\"\"}, Reason:\"LeaderElection\", Message:\"couchbase-operator-b774b44c6-lh8xp_74c90ce5-83ab-45d2-9a27-45b59af17463 stopped leading\", Source:v1.EventSource{Component:\"couchbase-operator-b774b44c6-lh8xp_74c90ce5-83ab-45d2-9a27-45b59af17463\", Host:\"\"}, FirstTimestamp:time.Date(2022, time.March, 17, 11, 50, 17, 294796065, time.Local), LastTimestamp:time.Date(2022, time.March, 17, 11, 50, 17, 294796065, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'Unauthorized' (will not retry!)\n","stacktrace":"k8s.io/client-go/tools/record.recordToSink\n\tk8s.io/client-go@v0.23.2/tools/record/event.go:216\nk8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartRecordingToSink.func1\n\tk8s.io/client-go@v0.23.2/tools/record/event.go:194\nk8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartEventWatcher.func1\n\tk8s.io/client-go@v0.23.2/tools/record/event.go:311"} 
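
      To check how often these messages show up in a saved Operator log, a minimal sketch along the following lines may help. It assumes the log holds one JSON object per line, as in the excerpt above, and that "ts" is a Unix timestamp in seconds. The function name find_unauthorized and the shortened sample entries are hypothetical and only for illustration.

```python
import json
from datetime import datetime, timezone


def find_unauthorized(lines):
    """Return (UTC time, level, short message) for entries mentioning 'Unauthorized'."""
    hits = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip fragments that are not complete JSON objects
        if not isinstance(entry, dict):
            continue
        msg = entry.get("msg", "")
        if "Unauthorized" not in msg:
            continue
        ts = entry.get("ts")
        if isinstance(ts, (int, float)):
            # Assumption: "ts" is seconds since the Unix epoch.
            when = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
        else:
            when = "unknown"
        # Keep only the start of the message; rejected-event messages are very long.
        hits.append((when, entry.get("level", ""), msg.strip()[:80]))
    return hits


# Shortened sample entries modelled on the excerpt above (stack traces omitted).
sample = [
    '{"level":"error","ts":1647517815.3176866,"msg":"error retrieving resource lock test-89qxf/couchbase-operator: Unauthorized\\n"}',
    '{"level":"info","ts":1647517817.2948174,"msg":"failed to renew lease test-89qxf/couchbase-operator: timed out waiting for the condition\\n"}',
]

for when, level, msg in find_unauthorized(sample):
    print(when, level, msg)
```

      For a real log, the same function can be fed an open file object instead of the sample list. Counting the hits across repeated runs would give a rough idea of how often the issue occurs.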

      Jenkins job (green, because the test did not fail): http://qa.sc.couchbase.com/view/Cloud/job/k8s-cbop-gke-pipeline/553/console

      Note that the logs attached to that job, if you download them, will not contain all of this information.

      I saw these messages 3-4 times in the 30 times I've run this test on my local machine, so this is hopefully a race condition and will be hard to reproduce.

      I'm happy to mark this ticket as Not a Bug, but it would be great if you could confirm that as well.

      Actual Operator logs of the Jenkins job: actualLogs.


          People

            simon.murray Simon Murray
            prateek.kumar Prateek Kumar (Inactive)