Couchbase Kubernetes / K8S-421

Ability to Balance in Nodes on Provision Failure


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: not-targeted
    • Component/s: kubernetes
    • Labels:

      Description

      • Set anti-affinity to true in couchbase-cluster.yaml.
      • The cluster has one master node and two worker nodes.

      [root@ip-172-31-7-110 couchbase-operator]# kubectl get nodes
      NAME                                         STATUS    ROLES     AGE       VERSION
      ip-172-31-1-197.us-east-2.compute.internal   Ready     <none>    57d       v1.10.2
      ip-172-31-6-25.us-east-2.compute.internal    Ready     <none>    57d       v1.10.2
      ip-172-31-7-110.us-east-2.compute.internal   Ready     master    57d       v1.10.2

      • There are three pods to be scheduled, as specified in the servers section of couchbase-cluster.yaml (a sketch of the surrounding spec follows this snippet):

      servers:
          - size: 3
            name: all_services
            services:
              - data
              - index
              - query
              - search
              - eventing
              - analytics
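
      For context, here is a minimal sketch of how the anti-affinity flag and the server class above fit together in the CouchbaseCluster resource. Field names follow the v1 CouchbaseCluster spec as we understand it; values such as the image version and cluster name are illustrative and not taken from this ticket:

      apiVersion: couchbase.com/v1
      kind: CouchbaseCluster
      metadata:
        name: cb-example
      spec:
        baseImage: couchbase/server
        version: enterprise-5.5.0
        # With anti-affinity enabled, at most one Couchbase pod is scheduled per
        # Kubernetes node, so size: 3 requires at least three schedulable workers.
        antiAffinity: true
        servers:
          - size: 3
            name: all_services
            services:
              - data
              # ... remaining services as listed above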

      • Since anti-affinity is set to true, there is no worker node left for the third pod, and it fails to be scheduled, as expected. The operator logs also print messages indicating this behavior:

      time="2018-06-29T05:24:57Z" level=info msg="Finish reconciling" cluster-name=cb-example module=cluster
      time="2018-06-29T05:24:57Z" level=error msg="failed to reconcile: Failed to add new node to cluster: unable to schedule pod: 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't satisfy existing pods anti-affinity rules." cluster-name=cb-example module=cluster

      • The first two pods are scheduled properly and are up and running:

      [root@ip-172-31-7-110 couchbase-operator]# kubectl get pods -o wide
      NAME                                  READY     STATUS    RESTARTS   AGE       IP          NODE
      cb-example-0000                       1/1       Running   0          12m       10.44.0.5   ip-172-31-6-25.us-east-2.compute.internal
      cb-example-0001                       1/1       Running   0          12m       10.36.0.2   ip-172-31-1-197.us-east-2.compute.internal
      couchbase-operator-5d7dfb795f-wthfr   1/1       Running   0          2d        10.36.0.1   ip-172-31-1-197.us-east-2.compute.internal

      • However, after logging into the Couchbase Web Console, it appears that the operator left the cluster in an inconsistent state (a rebalance is pending), as shown in the attached screenshot.
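
      The same pending-rebalance state can also be confirmed without the UI by querying Couchbase Server from one of the running pods. This is an illustrative sketch: the admin credentials and port are placeholders, and it assumes couchbase-cli is on the PATH inside the server container:

      # Lists cluster nodes; a node reported as inactiveAdded has been added to the
      # cluster but not yet rebalanced in.
      kubectl exec cb-example-0000 -- couchbase-cli server-list \
        -c 127.0.0.1:8091 -u Administrator -p password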

      Question: Since two of the three pods are scheduled successfully given node availability, isn't the operator expected to manage those two pods correctly? Is the cluster expected to be left in this state?

        Attachments


          Activity

          Simon Murray added a comment:

          We need to be very careful here. Say we were to rename a server class, the second node add failed, and we continued with the rebalance and ejection: we'd be down to one node and in a world of pain.

          Given this potential for failure, we should probably think very hard about this, and certainly defer it to >1.0.0.

          Simon Murray added a comment:

          Moving to the backlog as it's entirely too risky at the moment.

          Lynn Straus added a comment:

          Per review in the July 24 K8s meeting:

          1. Simon Murray to consider adding a message asking the customer to refer to the log file, if warranted. This may be a separate minor enhancement ticket; likely post-1.0.

          2. Simon Murray and Eric Schneider to look at the documentation around scaling to see whether it should be enhanced and/or should ask customers to refer to the log files.

          3. Added the releasenote label to this ticket.

          4. Future enhancement consideration: improve customer messaging regarding system status. This is post-1.0; either use this ticket or clone it to track the enhancement.

          Mike Wiederhold (Inactive) added a comment:

          Description for release notes:

          Known Issue: If a cluster is scaling up and not enough Kubernetes nodes are present, the operator will not rebalance the cluster, even if some nodes can be added. Users should always ensure that sufficient resources are present in their Kubernetes cluster before scaling a Couchbase cluster.

          Workaround: None.

          Eric Schneider (Inactive) added a comment:

          Description for release notes:

          Summary: Known issue: if a cluster is scaled up when not enough Kubernetes nodes are present, the Operator will not rebalance the cluster, even if some nodes can be added.

          Workaround: You should ensure that sufficient resources are present in the Kubernetes cluster before scaling a Couchbase cluster.
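
          As a practical illustration of that workaround (not an official procedure): with anti-affinity enabled, the number of Ready, schedulable worker nodes must be at least the requested server size before scaling. The grep filters below assume the ROLES column distinguishes the master, as in the node listing above:

          # Count Ready worker nodes (excludes the tainted master); the result must be
          # >= servers[].size before increasing it.
          kubectl get nodes --no-headers | grep -vw master | grep -cw Ready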

          Simon Murray added a comment:

          I'm calling this not a bug, but possibly a feature request to allow a half-working cluster.


            People

            Assignee:
            Simon Murray
            Reporter:
            Sindhura Palakodety (Inactive)
            Votes:
            0
            Watchers:
            5

              Dates

              Created:
              Updated:

                Gerrit Reviews

                There are no open Gerrit changes
