Couchbase Kubernetes / K8S-513

RZA: Anti-affinity is not working based on number of K8S nodes


Details

    • Type: Bug
    • Status: Closed
    • Priority: Test Blocker
    • Resolution: Fixed
    • Affects Version: 1.0.0
    • Fix Version: 1.0.0
    • Component: testing

    Description

      TestCase: TestRzaAntiAffinityOn

      4 K8S nodes are present with 3 server groups defined.

      With anti-affinity (AA) turned on for a 4-node Couchbase cluster, the 4 pods should be scheduled one per K8S node, irrespective of the server groups.
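
      A quick way to verify the expected placement on such a setup (a sketch; pod names follow the test-couchbase-* pattern used later in this ticket):

      kubectl get nodes        # expect 4 schedulable K8S nodes
      kubectl get pods -o wide # with AA on, each test-couchbase-* pod should land on a distinct NODE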

        Activity

          simon.murray Simon Murray added a comment -

          This is working as designed.  You are not allowed to schedule a pod on the master node due to the:

          node-role.kubernetes.io/master: ""

          label.  The test even checks for the member creation event to fail:

          message: New member test-couchbase-cd7cs-0003 creation failed

          Which it does.  Not entirely sure why this is a bug?
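
          A minimal way to confirm this on the node in question (node name taken from the pod listing later in this ticket; the grep is just a convenience):

          kubectl get node k8s-master-1 --show-labels          # shows the node-role.kubernetes.io/master label
          kubectl describe node k8s-master-1 | grep -i taint   # shows any NoSchedule taint on the master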

           

          ashwin.govindarajulu Ashwin Govindarajulu added a comment -

          But the normal AA=on case is scheduling pods on the master node as well.

          Master node scheduling is allowed if the taint dedicated=master:NoSchedule on the master node is removed.
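
          For reference, removing that taint would be something like (a sketch; the trailing '-' is what deletes the taint):

          kubectl taint nodes k8s-master-1 dedicated=master:NoSchedule-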

          Pod scheduling status during AA without server groups defined (test case: TestAntiAffinityOn):

          Ashwins-MacBook-Pro:couchbase-operator]$ kubectl get pods -o wide
          NAME                                  READY     STATUS    RESTARTS   AGE       IP          NODE
          couchbase-operator-5bdf548959-jj5n8   1/1       Running   0          3m        10.42.0.1   k8s-worker-1
          test-couchbase-7hptl-0000             1/1       Running   0          3m        10.44.0.1   k8s-worker-3
          test-couchbase-7hptl-0001             1/1       Running   0          2m        10.36.0.1   k8s-worker-2
          test-couchbase-7hptl-0002             1/1       Running   0          1m        10.42.0.2   k8s-worker-1
          test-couchbase-7hptl-0003             1/1       Running   0          1m        10.32.0.3   k8s-master-1

          Attaching log file cbopinfo-20180806T144634+0530_OnlyAA.tar.gz

          simon.murray Simon Murray added a comment -

          Even with the taint removed, your problem is that the master is in RzaGroup-2; however, the scheduler sorts the names and picks the one with the "smallest" name, in this case RzaGroup-1.  The reason is that the operator will always do the same thing regardless of the ordering of the zones in the configuration.  So we are working as designed.

          We have no visibility of the number of tagged nodes.  In fact, in most deployments we wouldn't be given access to this information, as we could then see the entire cluster, what it's labeled as, what Docker images are installed on it, and so on.  This is far too big a security risk to let us have access in the common case.
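
          A rough way to see which node carries which RzaGroup label (this greps whatever label key holds the group name, so it makes no assumption about the exact key):

          kubectl get nodes --show-labels | grep -i rzagroup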

          ashwin.govindarajulu Ashwin Govindarajulu added a comment -

          Fixed the test case accordingly. The test case passes now.

          So closing the bug.


          People

            Assignee: ashwin.govindarajulu Ashwin Govindarajulu
            Reporter: ashwin.govindarajulu Ashwin Govindarajulu