  1. Couchbase Server
  2. MB-46809

[CX] Analytics keeps restarting during global recovery when it fails to communicate with remote cluster


Details

    • Untriaged
    • 1
    • Yes
    • CX Sprint 250

    Description

      When Analytics starts global recovery and cannot communicate with the remote cluster, recovery fails and the service shuts down. It then restarts and fails again, repeating indefinitely.

      Steps to reproduce:

      • Create a remote cluster.
      • In the local cluster, create a remote link to the remote cluster.
      • Create an Analytics collection on a remote bucket.
      • Stop the local cluster.
      • Make the remote cluster unreachable from the local cluster.
      • Start the local cluster and observe that Analytics keeps restarting.
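
      Before starting the local cluster, it can help to confirm that the remote cluster really is unreachable. The following is a minimal sketch of such a check, not part of Couchbase: the class and method names are hypothetical, and the host and port are the ones that appear in the log in this report.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class RemoteReachability {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers both a connect timeout and a refused connection.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port taken from the log in this report; adjust for your setup.
        System.out.println(isReachable("172.23.104.199", 8091, 3000));
    }
}
```

      If the check reports false from a local cluster node, the reproduction precondition is met.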

       

      2021-06-08T13:49:49.440-07:00 ERRO CBAS.bootstrap.HttpClientProvider [Executor-8:ClusterController] http request failed on final attempt: 1/1
      org.apache.http.conn.ConnectTimeoutException: Connect to 172.23.104.199:8091 [/172.23.104.199] failed: connect timed out
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151) ~[httpclient-4.5.13.jar:4.5.13]
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376) ~[httpclient-4.5.13.jar:4.5.13]
        at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393) ~[httpclient-4.5.13.jar:4.5.13]
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236) ~[httpclient-4.5.13.jar:4.5.13]
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[httpclient-4.5.13.jar:4.5.13]
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[httpclient-4.5.13.jar:4.5.13]
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[httpclient-4.5.13.jar:4.5.13]
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.13.jar:4.5.13]
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[httpclient-4.5.13.jar:4.5.13]
        at com.couchbase.analytics.auth.CbRemoteLinkHelper.execute(CbRemoteLinkHelper.java:98) ~[cbas-server.jar:7.0.1-0000]
        at com.couchbase.analytics.auth.CbRemoteLinkHelper.execute(CbRemoteLinkHelper.java:86) ~[cbas-server.jar:7.0.1-0000]
        at com.couchbase.analytics.bootstrap.NsServerHelper.executeRemoteClusterRequest(NsServerHelper.java:1134) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.NsServerHelper.executeRemoteClusterRequest(NsServerHelper.java:1156) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.NsServerHelper.executeRemoteClusterRequest(NsServerHelper.java:1148) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.NsServerHelper.lambda$getExternalBuckets$0(NsServerHelper.java:322) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.auth.CbRemoteLinkHelper.remoteClusterExecution(CbRemoteLinkHelper.java:183) ~[cbas-server.jar:7.0.1-0000]
        at com.couchbase.analytics.bootstrap.NsServerHelper.getExternalBuckets(NsServerHelper.java:321) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.NsServerHelper.getBuckets(NsServerHelper.java:339) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.AnalyticsGlobalRecoveryManager.reconnectBucket(AnalyticsGlobalRecoveryManager.java:127) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.AnalyticsGlobalRecoveryManager.doRecovery(AnalyticsGlobalRecoveryManager.java:90) ~[cbas-server.jar:<dev build>]
        at org.apache.asterix.hyracks.bootstrap.GlobalRecoveryManager.recover(GlobalRecoveryManager.java:125) ~[asterix-app.jar:7.0.1-0000]
        at org.apache.asterix.hyracks.bootstrap.GlobalRecoveryManager.lambda$startGlobalRecovery$0(GlobalRecoveryManager.java:102) ~[asterix-app.jar:7.0.1-0000]
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
        at java.util.concurrent.FutureTask.run(Unknown Source) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
        at java.lang.Thread.run(Unknown Source) [?:?]
      Caused by: java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?]
        at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source) ~[?:?]
        at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source) ~[?:?]
        at java.net.AbstractPlainSocketImpl.connect(Unknown Source) ~[?:?]
        at java.net.SocksSocketImpl.connect(Unknown Source) ~[?:?]
        at java.net.Socket.connect(Unknown Source) ~[?:?]
        at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75) ~[httpclient-4.5.13.jar:4.5.13]
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142) ~[httpclient-4.5.13.jar:4.5.13]
        ... 26 more
      2021-06-08T13:49:49.443-07:00 FATA CBAS.bootstrap.GlobalRecoveryManager [Executor-8:ClusterController] Global recovery failed. Shutting down...
      com.couchbase.analytics.common.exceptions.AnalyticsHyracksException: CBAS0055: Cannot connect to host 172.23.104.199:8091 for link Default.externalcb: java.net.SocketTimeoutException: connect timed out
        at com.couchbase.analytics.auth.CbRemoteLinkHelper.execute(CbRemoteLinkHelper.java:122) ~[cbas-server.jar:7.0.1-0000]
        at com.couchbase.analytics.auth.CbRemoteLinkHelper.execute(CbRemoteLinkHelper.java:86) ~[cbas-server.jar:7.0.1-0000]
        at com.couchbase.analytics.bootstrap.NsServerHelper.executeRemoteClusterRequest(NsServerHelper.java:1134) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.NsServerHelper.executeRemoteClusterRequest(NsServerHelper.java:1156) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.NsServerHelper.executeRemoteClusterRequest(NsServerHelper.java:1148) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.NsServerHelper.lambda$getExternalBuckets$0(NsServerHelper.java:322) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.auth.CbRemoteLinkHelper.remoteClusterExecution(CbRemoteLinkHelper.java:183) ~[cbas-server.jar:7.0.1-0000]
        at com.couchbase.analytics.bootstrap.NsServerHelper.getExternalBuckets(NsServerHelper.java:321) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.NsServerHelper.getBuckets(NsServerHelper.java:339) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.AnalyticsGlobalRecoveryManager.reconnectBucket(AnalyticsGlobalRecoveryManager.java:127) ~[cbas-server.jar:<dev build>]
        at com.couchbase.analytics.bootstrap.AnalyticsGlobalRecoveryManager.doRecovery(AnalyticsGlobalRecoveryManager.java:90) ~[cbas-server.jar:<dev build>]
        at org.apache.asterix.hyracks.bootstrap.GlobalRecoveryManager.recover(GlobalRecoveryManager.java:125) ~[asterix-app.jar:7.0.1-0000]
        at org.apache.asterix.hyracks.bootstrap.GlobalRecoveryManager.lambda$startGlobalRecovery$0(GlobalRecoveryManager.java:102) ~[asterix-app.jar:7.0.1-0000]
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
        at java.util.concurrent.FutureTask.run(Unknown Source) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
        at java.lang.Thread.run(Unknown Source) [?:?]
      2021-06-08T13:49:49.498-07:00 INFO CBAS.bootstrap.AnalyticsNCApplication [ShutdownHook-9e8fad0513ca9ac46cca5f43816af08e] Stopping Couchbase Analytics driver
      2021-06-08T13:49:51.710-07:00 ERRO CBAS.lifecycle.LifeCycleComponentManager [ShutdownHook-9e8fad0513ca9ac46cca5f43816af08e] Stopping instance
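
      The log shows a connect timeout on one remote link propagating out of recovery and triggering a shutdown. As an illustration only, and not the actual Analytics code or its eventual fix, the sketch below shows one way a recovery loop could record an unreachable link and carry on instead of aborting. The TolerantRecovery class, the LinkConnector interface, and the link name Local.default are hypothetical; Default.externalcb is the link named in the log.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.List;

public class TolerantRecovery {
    /** Hypothetical stand-in for whatever reconnects a bucket over a link. */
    public interface LinkConnector {
        void reconnect(String linkName) throws IOException;
    }

    /**
     * Attempts to reconnect every link. A link that cannot be reached is
     * recorded and skipped rather than aborting the whole recovery.
     * Returns the names of the links that failed to reconnect.
     */
    public static List<String> recover(List<String> links, LinkConnector connector) {
        List<String> failed = new ArrayList<>();
        for (String link : links) {
            try {
                connector.reconnect(link);
            } catch (IOException e) {
                // SocketTimeoutException is an IOException, so timeouts land here.
                failed.add(link);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Simulate the situation in the log: the remote link times out.
        LinkConnector connector = link -> {
            if (link.equals("Default.externalcb")) {
                throw new SocketTimeoutException("connect timed out");
            }
        };
        List<String> failed = recover(List.of("Local.default", "Default.externalcb"), connector);
        System.out.println("Links left disconnected: " + failed);
    }
}
```

      With this shape, recovery completes for reachable links, and the caller decides how to handle the failed ones, for example by retrying them later.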
      

          People

            umang.agrawal Umang
            ali.alsuliman Ali Alsuliman