Description
If you manually rebalance a node out of the cluster, the operator keeps checking for its existence but never adds it back in.
Eventually the round-robin of REST requests hits the removed node and fails with 404s, because that node is no longer part of a cluster.
Snippet of operator logs:
time="2017-12-31T00:42:16Z" level=info msg="Start reconciling" cluster-name=cb-example module=cluster
time="2017-12-31T00:42:16Z" level=info msg="server config all_services: cb-example-0000,cb-example-0001" cluster-name=cb-example module=cluster
time="2017-12-31T00:42:16Z" level=info msg="running members: cb-example-0000,cb-example-0001,cb-example-0002" cluster-name=cb-example module=cluster
time="2017-12-31T00:42:16Z" level=info msg="cluster membership: cb-example-0001,cb-example-0002,cb-example-0000" cluster-name=cb-example module=cluster
time="2017-12-31T00:42:16Z" level=info msg="active nodes: cb-example-0000,cb-example-0001" cluster-name=cb-example module=cluster
time="2017-12-31T00:42:16Z" level=info msg="unknown nodes: cb-example-0002" cluster-name=cb-example module=cluster
time="2017-12-31T00:42:16Z" level=info msg="Finish reconciling" cluster-name=cb-example module=cluster
time="2017-12-31T00:42:24Z" level=info msg="Start reconciling" cluster-name=cb-example module=cluster
time="2017-12-31T00:42:24Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:42:29Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:42:34Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:42:39Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:42:44Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:42:49Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:42:54Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:42:59Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:04Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:09Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:14Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:19Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:24Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:29Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:34Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:39Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:44Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:49Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:54Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:43:59Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:44:04Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:44:09Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
time="2017-12-31T00:44:14Z" level=warning msg="cluster status: failed with error Code: 404, Error: ...retrying" cluster-name=cb-example module=retryutil
Issue Links
- is duplicated by
  - K8S-106 TestNodeManualFailover is failing due to a 404 error when trying to connect to the first node (Resolved)
We don't allow manual removal of nodes and have no plans to support it right now. We can revisit this topic in the future, but at the moment we want Kubernetes to control the cluster entirely. Users should generally not make any changes to the cluster other than updating the Kubernetes configuration.