Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: kubernetes
    • Labels: None

    Description

      I was able to run a Couchbase cluster on minikube following the instructions in our documentation. After deleting the operator and CRD with the commands below, I created the operator again (kubectl create -f operator.yaml). The operator was created but never changed to the Running state, and the logs show resource lock errors. Please see the attached logs.

      • kubectl delete deployment couchbase-operator
      • kubectl delete crd couchbaseclusters.couchbase.com
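
      For reference, the full reproduction sequence, roughly (a sketch; assumes operator.yaml from the documented install):

        kubectl delete deployment couchbase-operator
        kubectl delete crd couchbaseclusters.couchbase.com
        kubectl create -f operator.yaml
        kubectl get pods   # operator pod never reaches Running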

      Below is a related Jira issue where a similar problem was not reproducible.

      https://issues.couchbase.com/browse/K8S-92

      Attachments


        Activity

          simon.murray Simon Murray added a comment -

          Looks like your Kubernetes API isn't responding. I'd have a look at the logs for the control plane and see if that shows the problem:

          [Fri 28 Sep 09:55:21 BST 2018] simon@symphony ~/go/src/github.com/couchbase/couchbase-operator kubectl get po -n kube-system
          NAME                                    READY   STATUS    RESTARTS   AGE
          etcd-minikube                           1/1     Running   0          9d
          kube-addon-manager-minikube             1/1     Running   13         38d
          kube-apiserver-minikube                 1/1     Running   0          9d
          kube-controller-manager-minikube        1/1     Running   0          9d
          kube-dns-86f4d74b45-fj66k               3/3     Running   50         38d
          kube-proxy-zrmth                        1/1     Running   0          9d
          kube-scheduler-minikube                 1/1     Running   0          9d
          kubernetes-dashboard-5498ccf677-9dv59   1/1     Running   37         38d
          storage-provisioner                     1/1     Running   37         38d
          

          I'd try the server logs for kube-apiserver-minikube to start with. If those look okay, make sure you can reach the API endpoint via the virtual IP 10.96.0.1, as it may be a network problem.
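
          A couple of commands that may help here (a sketch; the debug image is just an example, and any HTTP response from the VIP, even a 401/403, shows it is reachable):

          kubectl -n kube-system logs kube-apiserver-minikube
          # connectivity check from inside the cluster
          kubectl run apiserver-check --rm -it --restart=Never \
            --image=curlimages/curl -- curl -k https://10.96.0.1:443/version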


          karthik.vijayraghavan Karthik Vijayraghavan (Inactive) added a comment -

          Thank you Simon. I ran the command kubectl get po -n kube-system and got the output below:

          etcd-minikube                           1/1       Running            0          23h
          kube-addon-manager-minikube             1/1       Running            1          2d
          kube-apiserver-minikube                 1/1       Running            0          23h
          kube-controller-manager-minikube        1/1       Running            0          23h
          kube-dns-86f4d74b45-4xm2r               2/3       CrashLoopBackOff   437        2d
          kube-proxy-rtmxd                        0/1       Terminating        0          2d
          kube-scheduler-minikube                 1/1       Running            15         2d
          kubernetes-dashboard-5498ccf677-txnnv   0/1       CrashLoopBackOff   155        2d
          storage-provisioner                     0/1       CrashLoopBackOff   89         2d
          

          The operator did not move to the Running state because of an issue with kube-dns. I had to flush the DNS on my Mac using the following commands (this varies by macOS version):

          sudo killall -HUP mDNSResponder;
          sudo killall mDNSResponderHelper;
          sudo dscacheutil -flushcache;say MacOS DNS cache has been cleared
          

          After flushing the DNS, I deleted the operator, stopped minikube, started it again, and was able to create the operator successfully.
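
          For reference, a rough sketch of the recovery sequence described above:

          kubectl delete deployment couchbase-operator
          minikube stop
          minikube start
          kubectl create -f operator.yaml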

          anuj.sahni Anuj Sahni added a comment - edited

          I have a similar resource lock issue where kube-dns is cycling through an endless loop. I tried flushing DNS, deleting the operator, stopping minikube, and starting minikube, but the same issue still persists.

           

          $ kubectl get pod  -n kube-system 
          NAME                                    READY   STATUS             RESTARTS   AGE
          coredns-c4cffd6dc-klzlv                 1/1     Running            29         23h
          etcd-minikube                           1/1     Running            0          4m
          kube-addon-manager-minikube             1/1     Running            2          9h
          kube-apiserver-minikube                 1/1     Running            0          4m
          kube-controller-manager-minikube        1/1     Running            0          4m
          kube-dns-86f4d74b45-mr7d6               2/3     Running            84         23h
          kube-proxy-6hrmf                        1/1     Running            0          3m
          kube-scheduler-minikube                 1/1     Running            1          16m
          kubernetes-dashboard-6f4cfc5d87-tspw5   0/1     CrashLoopBackOff   38         23h
          storage-provisioner                     0/1     CrashLoopBackOff   35         23h
           
           
          $ kubectl logs couchbase-operator-7fcbdf8f47-2k7v7
          time="2018-09-29T04:54:38Z" level=info msg="couchbase-operator v1.0.0 (release)" module=main
          time="2018-09-29T04:54:38Z" level=info msg="Obtaining resource lock" module=main
          time="2018-09-29T04:54:38Z" level=info msg="Starting event recorder" module=main
          time="2018-09-29T04:54:38Z" level=info msg="Attempting to be elected the couchbase-operator leader" module=main
          E0929 04:55:08.280043       1 leaderelection.go:224] error retrieving resource lock default/couchbase-operator: Get https://10.96.0.1:443/api/v1/namespaces/default/endpoints/couchbase-operator: dial tcp 10.96.0.1:443: i/o timeout
          E0929 04:55:41.733591       1 leaderelection.go:224] error retrieving resource lock default/couchbase-operator: Get https://10.96.0.1:443/api/v1/namespaces/default/endpoints/couchbase-operator: dial tcp 10.96.0.1:443: i/o timeout
          E0929 04:56:15.993343       1 leaderelection.go:224] error retrieving resource lock default/couchbase-operator: Get https://10.96.0.1:443/api/v1/namespaces/default/endpoints/couchbase-operator: dial tcp 10.96.0.1:443: i/o timeout
          E0929 04:56:49.589367       1 leaderelection.go:224] error retrieving resource lock default/couchbase-operator: Get https://10.96.0.1:443/api/v1/namespaces/default/endpoints/couchbase-operator: dial tcp 10.96.0.1:443: i/o timeout
          E0929 04:57:22.641679       1 leaderelection.go:224] error retrieving resource lock default/couchbase-operator: Get https://10.96.0.1:443/api/v1/namespaces/default/endpoints/couchbase-operator: dial tcp 10.96.0.1:443: i/o timeout
          E0929 04:57:55.661951       1 leaderelection.go:224] error retrieving resource lock default/couchbase-operator: Get https://10.96.0.1:443/api/v1/namespaces/default/endpoints/couchbase-operator: dial tcp 10.96.0.1:443: i/o timeout
          E0929 04:58:29.313115       1 leaderelection.go:224] error retrieving resource lock default/couchbase-operator: Get https://10.96.0.1:443/api/v1/namespaces/default/endpoints/couchbase-operator: dial tcp 10.96.0.1:443: i/o timeout
          

          Any help resolving the kube-dns issue would be appreciated.
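
          For reference, one way to dig further into the failing kube-dns pod (a sketch; the pod name is taken from the listing above, and the kubedns container name assumes the standard kube-dns deployment):

          kubectl -n kube-system describe pod kube-dns-86f4d74b45-mr7d6
          kubectl -n kube-system logs kube-dns-86f4d74b45-mr7d6 -c kubedns --previous
          kubectl -n kube-system get events --sort-by=.lastTimestamp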

          mikew Mike Wiederhold [X] (Inactive) added a comment -

          I'm going to close this issue since it is a problem with the Kubernetes environment and not the operator. If you are still seeing this issue, please send an email to the Kubernetes mailing list; if someone has seen it before, they can recommend steps to get your cluster back into a healthy state.


          People

            Assignee: mikew Mike Wiederhold [X] (Inactive)
            Reporter: karthik.vijayraghavan Karthik Vijayraghavan (Inactive)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes
