Couchbase Server / MB-30056

Analytics is unavailable in clusters with multiple nodes

    Description

      Build 5.5.0-2879. Changelog:

      CHANGELOG for asterix-opt
       
       * Commit: b1b31c18a9979128af5b19cd2a611bb3a841cc8f in build: 5.5.0-2879
         Update tests for MB-22933
         
         Change-Id: I02623f0a38cd3527cbb66ed309c60d7616da5d01
         Reviewed-on: http://review.couchbase.org/95323
         Tested-by: Build Bot <build@couchbase.com>
         Reviewed-by: Murtadha Hubail <Murtadha.hubail@couchbase.com>
         
      CHANGELOG for cbas
       
       * Commit: 340fbc0c851982642159986119da837ba2a67c1b in build: 5.5.0-2879
         MB-22933: bind to all interfaces
         
         Change-Id: I90c58dc604c69be67107934df03bc7eff265479b
         Reviewed-on: http://review.couchbase.org/95320
         Tested-by: Michael Blow <michael.blow@couchbase.com>
         Reviewed-by: Murtadha Hubail <Murtadha.hubail@couchbase.com>
      

      Setup:

      • 2 KV nodes
      • 2 Analytics nodes (172.23.96.8 and 172.23.96.9)

      The Analytics service on 172.23.96.8 is unreachable and keeps restarting:

      2018-06-11T09:18:40.307-07:00 INFO CBAS.nc.NodeControllerService [main] addCc: /0.0.0.0:9112
      2018-06-11T09:18:40.313-07:00 WARN CBAS.impl.IPCConnectionManager [IPC Network Listener Thread [/0.0.0.0:9115]] Exception finishing channel connect
      java.net.ConnectException: Connection refused
              at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_162]
              at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_162]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.finishConnect(IPCConnectionManager.java:328) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.doRun(IPCConnectionManager.java:291) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.run(IPCConnectionManager.java:196) [hyracks-ipc.jar:5.5.0-2879]
      2018-06-11T09:18:40.316-07:00 WARN CBAS.impl.IPCConnectionManager [main] Connection to /0.0.0.0:9112 failed; retrying (retry attempt 1 of 5) after 100ms
      2018-06-11T09:18:40.417-07:00 WARN CBAS.impl.IPCConnectionManager [IPC Network Listener Thread [/0.0.0.0:9115]] Exception finishing channel connect
      java.net.ConnectException: Connection refused
              at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_162]
              at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_162]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.finishConnect(IPCConnectionManager.java:328) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.doRun(IPCConnectionManager.java:291) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.run(IPCConnectionManager.java:196) [hyracks-ipc.jar:5.5.0-2879]
      2018-06-11T09:18:40.418-07:00 WARN CBAS.impl.IPCConnectionManager [main] Connection to /0.0.0.0:9112 failed; retrying (retry attempt 2 of 5) after 150ms
      2018-06-11T09:18:40.569-07:00 WARN CBAS.impl.IPCConnectionManager [IPC Network Listener Thread [/0.0.0.0:9115]] Exception finishing channel connect
      java.net.ConnectException: Connection refused
              at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_162]
              at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_162]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.finishConnect(IPCConnectionManager.java:328) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.doRun(IPCConnectionManager.java:291) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.run(IPCConnectionManager.java:196) [hyracks-ipc.jar:5.5.0-2879]
      2018-06-11T09:18:40.570-07:00 WARN CBAS.impl.IPCConnectionManager [main] Connection to /0.0.0.0:9112 failed; retrying (retry attempt 3 of 5) after 225ms
      2018-06-11T09:18:40.796-07:00 WARN CBAS.impl.IPCConnectionManager [IPC Network Listener Thread [/0.0.0.0:9115]] Exception finishing channel connect
      java.net.ConnectException: Connection refused
              at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_162]
              at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_162]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.finishConnect(IPCConnectionManager.java:328) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.doRun(IPCConnectionManager.java:291) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.run(IPCConnectionManager.java:196) [hyracks-ipc.jar:5.5.0-2879]
      2018-06-11T09:18:40.797-07:00 WARN CBAS.impl.IPCConnectionManager [main] Connection to /0.0.0.0:9112 failed; retrying (retry attempt 4 of 5) after 337ms
      2018-06-11T09:18:41.135-07:00 WARN CBAS.impl.IPCConnectionManager [IPC Network Listener Thread [/0.0.0.0:9115]] Exception finishing channel connect
      java.net.ConnectException: Connection refused
              at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_162]
              at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_162]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.finishConnect(IPCConnectionManager.java:328) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.doRun(IPCConnectionManager.java:291) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.run(IPCConnectionManager.java:196) [hyracks-ipc.jar:5.5.0-2879]
      2018-06-11T09:18:41.136-07:00 WARN CBAS.impl.IPCConnectionManager [main] Connection to /0.0.0.0:9112 failed; retrying (retry attempt 5 of 5) after 505ms
      2018-06-11T09:18:41.642-07:00 WARN CBAS.impl.IPCConnectionManager [IPC Network Listener Thread [/0.0.0.0:9115]] Exception finishing channel connect
      java.net.ConnectException: Connection refused
              at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_162]
              at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_162]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.finishConnect(IPCConnectionManager.java:328) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.doRun(IPCConnectionManager.java:291) [hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.run(IPCConnectionManager.java:196) [hyracks-ipc.jar:5.5.0-2879]
      2018-06-11T09:18:41.643-07:00 ERRO CBAS.control.AnalyticsDriver [main] Exiting AnalyticsDriver due to exception
      org.apache.hyracks.ipc.exceptions.IPCException: java.io.IOException: Connection failed to /0.0.0.0:9112
              at org.apache.hyracks.ipc.impl.IPCSystem.getHandle(IPCSystem.java:99) ~[hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCSystem.getHandle(IPCSystem.java:88) ~[hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCSystem.getHandle(IPCSystem.java:74) ~[hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.ReconnectingIPCHandle.<init>(ReconnectingIPCHandle.java:42) ~[hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCSystem.getHandle(IPCSystem.java:94) ~[hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.control.nc.NodeControllerService.addCc(NodeControllerService.java:366) ~[hyracks-control-nc.jar:5.5.0-2879]
              at org.apache.hyracks.control.nc.NodeControllerService.start(NodeControllerService.java:301) ~[hyracks-control-nc.jar:5.5.0-2879]
              at com.couchbase.analytics.control.AnalyticsDriver.startService(AnalyticsDriver.java:120) ~[cbas-server.jar:5.5.0-2879]
              at com.couchbase.analytics.control.AnalyticsDriver.main(AnalyticsDriver.java:92) [cbas-server.jar:5.5.0-2879]
      Caused by: java.io.IOException: Connection failed to /0.0.0.0:9112
              at org.apache.hyracks.ipc.impl.IPCConnectionManager.getIPCHandle(IPCConnectionManager.java:129) ~[hyracks-ipc.jar:5.5.0-2879]
              at org.apache.hyracks.ipc.impl.IPCSystem.getHandle(IPCSystem.java:97) ~[hyracks-ipc.jar:5.5.0-2879]
              ... 8 more
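
The retry delays in the log above (100, 150, 225, 337 and 505 ms) are consistent with a 1.5x backoff truncated to whole milliseconds at each step. The sketch below reproduces that schedule. It is inferred from the log output only, not taken from the Hyracks source, and the function name and parameters are hypothetical.

```python
def backoff_delays(initial_ms=100, attempts=5):
    """Return retry delays that grow by 1.5x, truncated to whole milliseconds.

    Assumption: the multiplier and the truncation are inferred from the
    delays printed in the log, not from the actual IPCConnectionManager code.
    """
    delays = []
    delay = initial_ms
    for _ in range(attempts):
        delays.append(delay)
        delay = delay * 3 // 2  # 1.5x using integer arithmetic (rounds down)
    return delays


schedule = backoff_delays()
print(schedule)
print(sum(schedule), "ms spent waiting in total")
```

Under this assumption the five retries add up to roughly 1.3 seconds of waiting, which fits the log timestamps: the first connection attempt is logged at 09:18:40.307 and AnalyticsDriver exits at 09:18:41.643.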
      

      # netstat -lpnt | grep "java\|cbas"
      tcp        0      0 0.0.0.0:9115            0.0.0.0:*               LISTEN      158359/java         
      tcp        0      0 0.0.0.0:9116            0.0.0.0:*               LISTEN      158359/java         
      tcp        0      0 0.0.0.0:9117            0.0.0.0:*               LISTEN      158359/java         
      tcp        0      0 0.0.0.0:9118            0.0.0.0:*               LISTEN      158359/java         
      tcp        0      0 127.0.0.1:9122          0.0.0.0:*               LISTEN      158347/cbas  
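
The log shows the node controller dialing /0.0.0.0:9112, while the netstat output above lists no listener on port 9112 on this host. A minimal sketch for checking this from captured `netstat -lpnt` output follows. The helper name is hypothetical, and the reading that the node is dialing a wildcard address that resolves to itself, rather than the other Analytics node, is an assumption and not a confirmed root cause.

```python
def listening_ports(netstat_output):
    """Collect the local TCP ports in LISTEN state from `netstat -lpnt` output."""
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state, pid/program
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[5] == "LISTEN":
            ports.add(int(fields[3].rsplit(":", 1)[1]))
    return ports


# Output captured on 172.23.96.8, copied from this ticket.
NODE_8 = """\
tcp        0      0 0.0.0.0:9115            0.0.0.0:*               LISTEN      158359/java
tcp        0      0 0.0.0.0:9116            0.0.0.0:*               LISTEN      158359/java
tcp        0      0 0.0.0.0:9117            0.0.0.0:*               LISTEN      158359/java
tcp        0      0 0.0.0.0:9118            0.0.0.0:*               LISTEN      158359/java
tcp        0      0 127.0.0.1:9122          0.0.0.0:*               LISTEN      158347/cbas
"""

ports = listening_ports(NODE_8)
print(sorted(ports))
print("9112 listening locally:", 9112 in ports)
```

With nothing bound to 9112 on 172.23.96.8, a connection attempt to that port on the local host would be refused, which matches the repeated `java.net.ConnectException: Connection refused` entries.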
      

      172.23.96.9 is reachable, but it reports the cluster state as "UNUSABLE":

      curl -s http://Administrator:password@172.23.96.9:8095/analytics/cluster | jq '.state'
      "UNUSABLE"
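
For scripted checks, the same request can be sketched in Python with the standard library. The endpoint, port and `state` field are the ones used in the curl command above. The function names are hypothetical, and the response is assumed to be a JSON object with a top-level `state` key, as the `jq '.state'` filter implies.

```python
import base64
import json
import urllib.request


def extract_state(body):
    """Return the top-level "state" field from an /analytics/cluster response body."""
    return json.loads(body)["state"]


def fetch_cluster_state(host, user, password, port=8095, timeout=5):
    """Query /analytics/cluster with HTTP basic auth and return the cluster state."""
    url = f"http://{host}:{port}/analytics/cluster"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_state(response.read().decode("utf-8"))


# Example call against the node from this ticket (requires network access):
# print(fetch_cluster_state("172.23.96.9", "Administrator", "password"))
```

Keeping the parsing in `extract_state` separate from the HTTP call makes it possible to check the parsing against a saved response without a running cluster.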
      

      # netstat -lpnt | grep "java\|cbas"
      tcp        0      0 0.0.0.0:9110            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:43030           0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:9111            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:9112            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:42009           0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:9113            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:9114            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:9115            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:9116            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:9117            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:9118            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:8095            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:9119            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:9120            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 0.0.0.0:9121            0.0.0.0:*               LISTEN      49653/java          
      tcp        0      0 127.0.0.1:9122          0.0.0.0:*               LISTEN      49639/cbas      
      

      Single-node clusters do not exhibit this issue.

            People

              michael.blow Michael Blow
              pavelpaulau Pavel Paulau (Inactive)