Couchbase Server / MB-50482

java.lang.IllegalStateException: timed out waiting for all nodes to join & cluster active (missing nodes: [c9240354a56716c75704bbbdbcbd547e], state: UNUSABLE)

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 7.1.0
    • 7.1.0
    • analytics
    • CentOS 7 64-bit; CB EE 7.1.0-2108
    • Untriaged
    • CentOS 64-bit
    • 1
    • Unknown
    • CX Sprint 280

    Description

      Summary
      -> Started with a 7.1 cluster of four nodes: 172.23.137.248, 172.23.137.249, 172.23.137.251, 172.23.137.253
      (.248 and .249 were running the analytics service).
      The cluster was using x509 certs and had TLS enforced.

      -> Failed over the .248 node and added it back.
      The rebalance appeared to complete, but the UI's rebalance button became enabled again. The UI logs showed:

      Analytics Service unable to successfully rebalance 4007071453d0c97827b5cbe0b0339b1a due to 'java.lang.IllegalStateException: timed out waiting for all nodes to join & cluster active (missing nodes: [c9240354a56716c75704bbbdbcbd547e], state: UNUSABLE)'; see analytics_info.log for details
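
For triage, the missing node IDs and the cluster state can be pulled out of this message programmatically. The following is a minimal sketch, not part of the product: `parse_rebalance_timeout` and `TIMEOUT_RE` are hypothetical names, and the assumption that multiple missing nodes would appear comma-separated inside the brackets is a guess based on the single-node message above.

```python
import re

# Pattern for the Analytics rebalance timeout message quoted above.
# Assumption: several missing nodes would be listed comma-separated.
TIMEOUT_RE = re.compile(
    r"timed out waiting for all nodes to join & cluster active "
    r"\(missing nodes: \[(?P<nodes>[^\]]*)\], state: (?P<state>\w+)\)"
)


def parse_rebalance_timeout(message):
    """Return (missing_node_ids, cluster_state), or None if there is no match."""
    match = TIMEOUT_RE.search(message)
    if match is None:
        return None
    # Split the bracketed list and drop empty entries.
    nodes = [n.strip() for n in match.group("nodes").split(",") if n.strip()]
    return nodes, match.group("state")
```

Applied to the UI log line above, this should yield the single node ID c9240354a56716c75704bbbdbcbd547e together with the state UNUSABLE.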

      analytics_error.log on .249:

      2022-01-20T04:02:07.827-08:00 ERRO CBAS.rebalance.Rebalance [Executor-10:ClusterController] Rebalance 4007071453d0c97827b5cbe0b0339b1a failed
      java.lang.IllegalStateException: timed out waiting for all nodes to join & cluster active (missing nodes: [c9240354a56716c75704bbbdbcbd547e], state: UNUSABLE)
          at com.couchbase.analytics.control.rebalance.Rebalance.ensureNodesClusterActive(Rebalance.java:526) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
          at com.couchbase.analytics.control.rebalance.Rebalance.adjustClusterBeforeRebalance(Rebalance.java:683) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
          at com.couchbase.analytics.control.rebalance.Rebalance.doRebalance(Rebalance.java:205) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
          at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:166) [cbas-server-7.1.0-2108.jar:7.1.0-2108]
          at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:84) [cbas-server-7.1.0-2108.jar:7.1.0-2108]
          at com.couchbase.analytics.runtime.WriteLockCallable.call(WriteLockCallable.java:27) [cbas-connector-7.1.0-2108.jar:7.1.0-2108]
          at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
          at java.lang.Thread.run(Thread.java:829) [?:?]
      2022-01-20T04:02:08.101-08:00 ERRO CBAS.servlet.RebalanceServlet [HttpExecutor(port:9111)-8] Rebalance 4007071453d0c97827b5cbe0b0339b1a failed
      java.lang.IllegalStateException: timed out waiting for all nodes to join & cluster active (missing nodes: [c9240354a56716c75704bbbdbcbd547e], state: UNUSABLE)
          at com.couchbase.analytics.control.rebalance.Rebalance.ensureNodesClusterActive(Rebalance.java:526) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
          at com.couchbase.analytics.control.rebalance.Rebalance.adjustClusterBeforeRebalance(Rebalance.java:683) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
          at com.couchbase.analytics.control.rebalance.Rebalance.doRebalance(Rebalance.java:205) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
          at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:166) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
          at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:84) ~[cbas-server-7.1.0-2108.jar:7.1.0-2108]
          at com.couchbase.analytics.runtime.WriteLockCallable.call(WriteLockCallable.java:27) ~[cbas-connector-7.1.0-2108.jar:7.1.0-2108]
          at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
          at java.lang.Thread.run(Thread.java:829) [?:?]
      2022-01-20T04:07:40.248-08:00 ERRO CBAS.tcp.TCPEndpoint [TCPEndpoint IO Thread [/0:0:0:0:0:0:0:0:9116]] Unexpected tcp io error in connection TCPConnection[Remote Address: /172.23.137.248:59942 Local Address: /0:0:0:0:0:0:0:0:9116]
      org.apache.hyracks.api.exceptions.NetException: Socket Closed
          at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.driveReaderStateMachine(MultiplexedConnection.java:360) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
          at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.notifyIOReady(MultiplexedConnection.java:119) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
          at org.apache.hyracks.net.protocols.tcp.TCPEndpoint$IOThread.run(TCPEndpoint.java:199) [hyracks-net-7.1.0-2108.jar:7.1.0-2108]
      2022-01-20T04:07:40.248-08:00 ERRO CBAS.tcp.TCPEndpoint [TCPEndpoint IO Thread [null]] Unexpected tcp io error in connection TCPConnection[Remote Address: /172.23.137.248:9117 Local Address: null]
      org.apache.hyracks.api.exceptions.NetException: Socket Closed
          at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.driveReaderStateMachine(MultiplexedConnection.java:360) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
          at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.notifyIOReady(MultiplexedConnection.java:119) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
          at org.apache.hyracks.net.protocols.tcp.TCPEndpoint$IOThread.run(TCPEndpoint.java:199) [hyracks-net-7.1.0-2108.jar:7.1.0-2108]
      2022-01-20T04:07:40.249-08:00 ERRO CBAS.tcp.TCPEndpoint [TCPEndpoint IO Thread [/0:0:0:0:0:0:0:0:9116]] Unexpected tcp io error in connection TCPConnection[Remote Address: /172.23.137.248:9116 Local Address: /0:0:0:0:0:0:0:0:9116]
      org.apache.hyracks.api.exceptions.NetException: Socket Closed
          at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.driveReaderStateMachine(MultiplexedConnection.java:360) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
          at org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.notifyIOReady(MultiplexedConnection.java:119) ~[hyracks-net-7.1.0-2108.jar:7.1.0-2108]
          at org.apache.hyracks.net.protocols.tcp.TCPEndpoint$IOThread.run(TCPEndpoint.java:199) [hyracks-net-7.1.0-2108.jar:7.1.0-2108]
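
To get a quick overview of an excerpt like the one above, the ERRO entries can be counted per logger while skipping the stack-trace lines. This is a rough sketch that assumes every entry starts with an ISO-8601 timestamp followed by the level and the logger name, as in the lines shown here; `count_errors_by_logger` and `ENTRY_RE` are hypothetical names.

```python
import re
from collections import Counter

# Entry header: "<timestamp> <LEVEL> <logger> ..." as seen in analytics_error.log above.
ENTRY_RE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2}T\S+)\s+(?P<level>[A-Z]+)\s+(?P<logger>\S+)\s"
)


def count_errors_by_logger(lines):
    """Count ERRO entries per logger; exception and 'at ...' lines are ignored."""
    counts = Counter()
    for line in lines:
        match = ENTRY_RE.match(line)
        if match and match.group("level") == "ERRO":
            counts[match.group("logger")] += 1
    return counts
```

Run against the excerpt above, this would separate the two rebalance failures (CBAS.rebalance.Rebalance and CBAS.servlet.RebalanceServlet) from the three later CBAS.tcp.TCPEndpoint "Socket Closed" errors.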

       

            People

              sumedh.basarkod Sumedh Basarkod (Inactive)
              sumedh.basarkod Sumedh Basarkod (Inactive)