Details
Description
As observed in the MB-32921 logs, the NC is unable to decrypt its key due to the openssl env issue. As a result, the NC crashes while attempting to join the cluster and enters a restart loop. After 42+ hours and 12K+ repeated rejoins, the CC exits with OOM: GC overhead limit exceeded.
Each registration loop produces the following log sequence on the CC:
2019-02-03T01:32:39.545-08:00 WARN CBAS.cluster.NodeManager [Worker:ClusterController] +addNode: 0d54f53fda3750763cc6e6518cf63ece
2019-02-03T01:32:39.545-08:00 WARN CBAS.cluster.NodeManager [Worker:ClusterController] Node '0d54f53fda3750763cc6e6518cf63ece' is already registered; failing the node then re-registering.
2019-02-03T01:32:39.545-08:00 INFO CBAS.cluster.NodeManager [Worker:ClusterController] 0d54f53fda3750763cc6e6518cf63ece considered dead
2019-02-03T01:32:39.545-08:00 INFO CBAS.bootstrap.ClusterLifecycleListener [Worker:ClusterController] NC: 0d54f53fda3750763cc6e6518cf63ece left
2019-02-03T01:32:39.545-08:00 INFO CBAS.utils.ClusterStateManager [Worker:ClusterController] Removing configuration parameters for node id 0d54f53fda3750763cc6e6518cf63ece
2019-02-03T01:32:39.545-08:00 INFO CBAS.utils.ClusterStateManager [Worker:ClusterController] ignoring update to same cluster state of ACTIVE
2019-02-03T01:32:39.545-08:00 INFO CBAS.cluster.NodeManager [Worker:ClusterController] adding node to registry
2019-02-03T01:32:39.545-08:00 INFO CBAS.cluster.NodeManager [Worker:ClusterController] updating cluster capacity
2019-02-03T01:32:39.545-08:00 INFO CBAS.work.RegisterNodeWork [Worker:ClusterController] registered node: 0d54f53fda3750763cc6e6518cf63ece
2019-02-03T01:32:39.545-08:00 INFO CBAS.bootstrap.ClusterLifecycleListener [Worker:ClusterController] NC: 0d54f53fda3750763cc6e6518cf63ece joined
2019-02-03T01:32:39.545-08:00 INFO CBAS.utils.ClusterStateManager [Worker:ClusterController] Registering configuration parameters for node id 0d54f53fda3750763cc6e6518cf63ece
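To confirm how many times a node went through this loop, the re-registration warnings in the CC log can be counted per node id. The following is a minimal sketch, assuming the log lines keep the format shown in the excerpt above; the function name `count_reregistrations` and the log file path in the usage comment are hypothetical and not part of the product.

```python
import re
from collections import Counter

# Matches the NodeManager warning logged each time an already-registered
# node tries to join again (format taken from the excerpt above).
REREGISTER = re.compile(
    r"WARN CBAS\.cluster\.NodeManager \[Worker:ClusterController\] "
    r"Node '([^']+)' is already registered"
)


def count_reregistrations(lines):
    """Return a Counter mapping node id to its number of re-registration warnings."""
    counts = Counter()
    for line in lines:
        match = REREGISTER.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Usage sketch (the path is an assumption; substitute the actual CC log file):
#
#     with open("analytics_info.log") as log:
#         for node_id, n in count_reregistrations(log).most_common():
#             print(node_id, n)
```

A count in the thousands for a single node id over a short period would be consistent with the restart loop described in this ticket.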
Issue Links
- is caused by: MB-33500 [CX] IPCHandles are leaked on node rebalance out (Closed)