-> Started with a 7.1 cluster of four nodes: 172.23.137.248, 172.23.137.249, 172.23.137.251, 172.23.137.253
(the Analytics service was running on .248 and .249).
The cluster was using x509 certs and had TLS enforced.
-> Failed over the .248 node, then added it back and rebalanced.
The rebalance appeared to complete, but the UI's Rebalance button became enabled again. The UI logs showed:
Analytics Service unable to successfully rebalance 4007071453d0c97827b5cbe0b0339b1a due to 'java.lang.IllegalStateException: timed out waiting for all nodes to join & cluster active (missing nodes: [c9240354a56716c75704bbbdbcbd547e], state: UNUSABLE)'; see analytics_info.log for details
On .249, analytics_error.log shows:
2022-01-20T04:02:07.827-08:00 ERRO CBAS.rebalance.Rebalance [Executor-10:ClusterController] Rebalance 4007071453d0c97827b5cbe0b0339b1a failed
java.lang.IllegalStateException: timed out waiting for all nodes to join & cluster active (missing nodes: [c9240354a56716c75704bbbdbcbd547e], state: UNUSABLE)
    at com.couchbase.analytics.control.rebalance.Rebalance.ensureNodesClusterActive(Rebalance.java:526) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
    at com.couchbase.analytics.control.rebalance.Rebalance.adjustClusterBeforeRebalance(Rebalance.java:683) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
    at com.couchbase.analytics.control.rebalance.Rebalance.doRebalance(Rebalance.java:205) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
    at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:166) [cbas-server-7.1.0-2108.jar:7.1.0-2108]
    at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:84) [cbas-server-7.1.0-2108.jar:7.1.0-2108]
    at com.couchbase.analytics.runtime.WriteLockCallable.call(WriteLockCallable.java:27) [cbas-connector-7.1.0-2108.jar:7.1.0-2108]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:829) [?:?]
2022-01-20T04:02:08.101-08:00 ERRO CBAS.servlet.RebalanceServlet [HttpExecutor(port:9111)-8] Rebalance 4007071453d0c97827b5cbe0b0339b1a failed
java.lang.IllegalStateException: timed out waiting for all nodes to join & cluster active (missing nodes: [c9240354a56716c75704bbbdbcbd547e], state: UNUSABLE)
    at com.couchbase.analytics.control.rebalance.Rebalance.ensureNodesClusterActive(Rebalance.java:526) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
    at com.couchbase.analytics.control.rebalance.Rebalance.adjustClusterBeforeRebalance(Rebalance.java:683) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
    at com.couchbase.analytics.control.rebalance.Rebalance.doRebalance(Rebalance.java:205) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
    at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:166) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
    at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:84) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
    at com.couchbase.analytics.runtime.WriteLockCallable.call(WriteLockCallable.java:27) ~[cbas-connector-7.1.0-2108.jar:7.1.0-2108]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:829) [?:?]
2022-01-20T04:07:40.248-08:00 ERRO CBAS.tcp.TCPEndpoint [TCPEndpoint IO Thread [/0:0:0:0:0:0:0:0:9116]] Unexpected tcp io error in connection TCPConnection[Remote Address: /172.23.137.248:59942 Local Address: /0:0:0:0:0:0:0:0:9116]
org.apache.hyracks.api.exceptions.NetException: Socket Closed
    at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.driveReaderStateMachine(MultiplexedConnection.java:360) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
    at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.notifyIOReady(MultiplexedConnection.java:119) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
    at org.apache.hyracks.net.protocols.tcp.TCPEndpoint$IOThread.run(TCPEndpoint.java:199) [hyracks-net-7.1.0-2108.jar:7.1.0-2108]
2022-01-20T04:07:40.248-08:00 ERRO CBAS.tcp.TCPEndpoint [TCPEndpoint IO Thread [null]] Unexpected tcp io error in connection TCPConnection[Remote Address: /172.23.137.248:9117 Local Address: null]
org.apache.hyracks.api.exceptions.NetException: Socket Closed
    at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.driveReaderStateMachine(MultiplexedConnection.java:360) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
    at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.notifyIOReady(MultiplexedConnection.java:119) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
    at org.apache.hyracks.net.protocols.tcp.TCPEndpoint$IOThread.run(TCPEndpoint.java:199) [hyracks-net-7.1.0-2108.jar:7.1.0-2108]
2022-01-20T04:07:40.249-08:00 ERRO CBAS.tcp.TCPEndpoint [TCPEndpoint IO Thread [/0:0:0:0:0:0:0:0:9116]] Unexpected tcp io error in connection TCPConnection[Remote Address: /172.23.137.248:9116 Local Address: /0:0:0:0:0:0:0:0:9116]
org.apache.hyracks.api.exceptions.NetException: Socket Closed
    at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.driveReaderStateMachine(MultiplexedConnection.java:360) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
    at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.notifyIOReady(MultiplexedConnection.java:119) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
    at org.apache.hyracks.net.protocols.tcp.TCPEndpoint$IOThread.run(TCPEndpoint.java:199) [hyracks-net-7.1.0-2108.jar:7.1.0-2108]
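For triage, the missing node IDs and the reported cluster state can be pulled out of the timeout message quoted in this ticket with a small script. The following is only a sketch: it assumes the message keeps the exact wording seen here, and the names `TIMEOUT_RE` and `parse_rebalance_timeout` are hypothetical helpers, not part of any Couchbase tooling.

```python
import re

# Pattern for the timeout message as it appears in this ticket.
# Assumption: the wording is stable; other builds may format it differently.
TIMEOUT_RE = re.compile(
    r"timed out waiting for all nodes to join & cluster active "
    r"\(missing nodes: \[(?P<nodes>[^\]]*)\], state: (?P<state>\w+)\)"
)


def parse_rebalance_timeout(line):
    """Return (missing_node_ids, cluster_state), or None if the line does not match."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    # Node IDs are comma-separated inside the square brackets.
    nodes = [n.strip() for n in match.group("nodes").split(",") if n.strip()]
    return nodes, match.group("state")


# Sample line copied from the analytics_error.log excerpt in this ticket.
sample = (
    "java.lang.IllegalStateException: timed out waiting for all nodes to join "
    "& cluster active (missing nodes: [c9240354a56716c75704bbbdbcbd547e], "
    "state: UNUSABLE)"
)
print(parse_rebalance_timeout(sample))
```

Running it over each line of analytics_error.log on every Analytics node would show whether the same node ID is reported missing everywhere.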