Details
Bug
Resolution: Won't Fix
Major
1.0.0
K8s cluster running on AWS
Description
A node was deleted from a Couchbase cluster running on Kubernetes (k8s) on AWS.
$ kubectl get pods --watch
NAME                                  READY   STATUS        RESTARTS   AGE
cb-op-aks-demo-0000                   1/1     Running       0          1d
cb-op-aks-demo-0001                   0/1     Terminating   0          1d
cb-op-aks-demo-0002                   1/1     Running       0          1d
cb-op-aks-demo-0003                   1/1     Running       0          1d
cb-op-aks-demo-0004                   1/1     Running       0          1d
couchbase-operator-5566bd4b67-rmllv   1/1     Running       0          1d
As expected, a new pod with the same name is spun up and added to the cluster, and a rebalance is performed.
cb-op-aks-demo-0001                   0/1     Terminating         0          1d
cb-op-aks-demo-0001                   0/1     Terminating         0          1d
cb-op-aks-demo-0001                   0/1     Pending             0          0s
cb-op-aks-demo-0001                   0/1     Pending             0          0s
cb-op-aks-demo-0001                   0/1     ContainerCreating   0          0s
cb-op-aks-demo-0001                   0/1     Running             0          19s
cb-op-aks-demo-0001                   1/1     Running             0          30s
^C
$ kubectl get pods
NAME                                  READY   STATUS    RESTARTS   AGE
cb-op-aks-demo-0000                   1/1     Running   0          1d
cb-op-aks-demo-0001                   1/1     Running   0          2m
cb-op-aks-demo-0002                   1/1     Running   0          1d
cb-op-aks-demo-0003                   1/1     Running   0          1d
cb-op-aks-demo-0004                   1/1     Running   0          1d
couchbase-operator-5566bd4b67-rmllv   1/1     Running   0          1d
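The replacement is visible in the AGE column: cb-op-aks-demo-0001 is minutes old while its peers are a day old. The following is a minimal Python sketch of that check, assuming the five-column `kubectl get pods` table shown above. The function names (`parse_age`, `parse_pods`, `recently_replaced`) and the one-hour threshold are hypothetical choices for illustration, not part of kubectl or the operator.

```python
import re

# Seconds per unit for the short age format printed by `kubectl get pods`
# (for example 30s, 2m, 1d). Assumption: only these four units appear.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_age(age):
    """Convert an age such as '2m', '1d' or '2m26s' to seconds."""
    parts = re.findall(r"(\d+)([smhd])", age)
    if not parts:
        raise ValueError(f"unrecognized age: {age!r}")
    return sum(int(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def parse_pods(listing):
    """Parse a NAME/READY/STATUS/RESTARTS/AGE table into a list of dicts."""
    pods = []
    for line in listing.splitlines():
        # Drop stray '|' tokens such as terminal pane borders from a paste.
        fields = [token for token in line.split() if token != "|"]
        if len(fields) != 5 or fields[0] == "NAME":
            continue  # skip headers, blank lines and anything unexpected
        name, ready, status, restarts, age = fields
        pods.append({
            "name": name,
            "ready": ready,
            "status": status,
            "restarts": int(restarts),
            "age_s": parse_age(age),
        })
    return pods


def recently_replaced(pods, max_age_s=3600):
    """Return names of pods younger than max_age_s (hypothetical threshold)."""
    return [pod["name"] for pod in pods if pod["age_s"] < max_age_s]


LISTING = """\
NAME READY STATUS RESTARTS AGE
cb-op-aks-demo-0000 1/1 Running 0 1d
cb-op-aks-demo-0001 1/1 Running 0 2m
cb-op-aks-demo-0002 1/1 Running 0 1d
cb-op-aks-demo-0003 1/1 Running 0 1d
cb-op-aks-demo-0004 1/1 Running 0 1d
couchbase-operator-5566bd4b67-rmllv 1/1 Running 0 1d
"""

print(recently_replaced(parse_pods(LISTING)))
```

Run against the listing above, only cb-op-aks-demo-0001 falls under the threshold. Newer kubectl versions may print RESTARTS with extra text, which this sketch would skip rather than parse.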
The operator logs show the same events and actions:
time="2018-09-13T17:40:13Z" level=info msg="failed nodes: cb-op-aks-demo-0001" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:13Z" level=info msg="is rebalancing: false" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:13Z" level=info msg="needs rebalance: true" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:15Z" level=info msg="An auto-failover has taken place" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:15Z" level=info msg="Creating a pod (cb-op-aks-demo-0001) running Couchbase enterprise-5.5.1" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:42Z" level=error msg="failed to reconcile: recovering node http://cb-op-aks-demo-0001.cb-op-aks-demo.default.svc:8091" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:50Z" level=info msg="server config data: cb-op-aks-demo-0000,cb-op-aks-demo-0002" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:50Z" level=info msg="running members: cb-op-aks-demo-0003,cb-op-aks-demo-0004,cb-op-aks-demo-0000,cb-op-aks-demo-0001,cb-op-aks-demo-0002" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:50Z" level=info msg="cluster membership: cb-op-aks-demo-0002,cb-op-aks-demo-0003,cb-op-aks-demo-0004,cb-op-aks-demo-0000,cb-op-aks-demo-0001" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:50Z" level=info msg="active nodes: cb-op-aks-demo-0000,cb-op-aks-demo-0002,cb-op-aks-demo-0003,cb-op-aks-demo-0004" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:50Z" level=info msg="add back: cb-op-aks-demo-0001" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:50Z" level=info msg="is rebalancing: false" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:50Z" level=info msg="needs rebalance: true" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:52Z" level=info msg="Add back node `cb-op-aks-demo-0001` is being marked for delta recovery" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:56Z" level=info msg="Rebalance progress: 0.000000" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:41:00Z" level=info msg="Rebalance progress: 7.741292" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:41:04Z" level=info msg="Rebalance progress: 32.897566" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:41:08Z" level=info msg="Rebalance progress: 58.060721" cluster-name=cb-op-aks-demo module=cluster
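The log lines above are key=value pairs with quoted values, so the error and the rebalance progress can be pulled out mechanically. The following is a rough Python sketch, assuming only the line format visible in this excerpt. The helper names (`parse_line`, `errors`, `rebalance_progress`) are hypothetical and not part of the operator.

```python
import re

# Matches key=value pairs where the value is either double-quoted
# (backslash escapes allowed) or a bare run of non-space characters.
_PAIR = re.compile(r'([\w-]+)=(?:"((?:[^"\\]|\\.)*)"|(\S*))')


def parse_line(line):
    """Split one log line into a dict of its key=value fields."""
    fields = {}
    for key, quoted, bare in _PAIR.findall(line):
        fields[key] = quoted if quoted else bare
    return fields


def errors(lines):
    """Return (time, msg) for every line logged at level=error."""
    parsed = (parse_line(line) for line in lines)
    return [(f["time"], f["msg"]) for f in parsed if f.get("level") == "error"]


def rebalance_progress(lines):
    """Return (time, percent) for every 'Rebalance progress' message."""
    progress = []
    for line in lines:
        fields = parse_line(line)
        match = re.fullmatch(r"Rebalance progress: ([\d.]+)", fields.get("msg", ""))
        if match:
            progress.append((fields["time"], float(match.group(1))))
    return progress


# A few lines copied from the operator log in this report.
LOG = '''\
time="2018-09-13T17:40:15Z" level=info msg="An auto-failover has taken place" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:42Z" level=error msg="failed to reconcile: recovering node http://cb-op-aks-demo-0001.cb-op-aks-demo.default.svc:8091" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:52Z" level=info msg="Add back node `cb-op-aks-demo-0001` is being marked for delta recovery" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:40:56Z" level=info msg="Rebalance progress: 0.000000" cluster-name=cb-op-aks-demo module=cluster
time="2018-09-13T17:41:00Z" level=info msg="Rebalance progress: 7.741292" cluster-name=cb-op-aks-demo module=cluster
'''.splitlines()

for when, message in errors(LOG):
    print("error at", when, "->", message)
for when, percent in rebalance_progress(LOG):
    print("rebalance", when, percent)
```

On this excerpt the only error-level entry is the "failed to reconcile" line at 17:40:42Z, which is followed by the add-back and rebalance messages.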