Couchbase Server / MB-57737

Race between replica failure messages and waiting for replica acks


Details

    • Untriaged
    • 0
    • Unknown
    • Analytics Sprint 22

    Description

      As seen in CBSE-14663, when a network failure happens right before some acks from replicas are received, the committing thread can wait indefinitely for those acks. The thread dump shows it blocked inside LogManagerWithReplication.appendToLogTail:

      "app//org.apache.asterix.transaction.management.service.logging.LogManagerWithReplication.lambda$appendToLogTail$0(LogManagerWithReplication.java:98)",
      "app//org.apache.asterix.transaction.management.service.logging.LogManagerWithReplication$$Lambda$1269/0x0000000100b02440.run(Unknown Source)",
      "app//org.apache.hyracks.api.util.InvokeUtil.doUninterruptibly(InvokeUtil.java:60)",
      "app//org.apache.asterix.transaction.management.service.logging.LogManagerWithReplication.appendToLogTail(LogManagerWithReplication.java:89)",
      "app//org.apache.asterix.transaction.management.service.logging.LogManagerWithReplication.log(LogManagerWithReplication.java:70)",
      "app//org.apache.asterix.transaction.management.service.transaction.TransactionManager.commitTransaction(TransactionManager.java:86)",
      "app//org.apache.asterix.metadata.MetadataNode.commitTransaction(MetadataNode.java:201)",
      "app//org.apache.asterix.metadata.MetadataManager.commitTransaction(MetadataManager.java:150)",
      "app//com.couchbase.analytics.metadata.BucketEventsListener.persistBuckets(BucketEventsListener.java:422)",
      "app//com.couchbase.analytics.metadata.BucketEventsListener.persistBuckets(BucketEventsListener.java:407)",
      "app//com.couchbase.analytics.metadata.BucketEventsListener.setUuid(BucketEventsListener.java:348)",
      "app//com.couchbase.analytics.lang.ConnectLinkStatement.combine(ConnectLinkStatement.java:360)",
      "app//com.couchbase.analytics.lang.ConnectLinkStatement.doConnectInner(ConnectLinkStatement.java:769)",
      "app//com.couchbase.analytics.lang.ConnectLinkStatement.doConnect(ConnectLinkStatement.java:713)",
      "app//com.couchbase.analytics.metadata.BucketEventsListener.doConnect(BucketEventsListener.java:453)",
      "app//com.couchbase.analytics.metadata.BucketEventsListener.compileAndStartJob(BucketEventsListener.java:439)",
      "app//org.apache.asterix.app.active.ActiveEntityEventsListener.doStart(ActiveEntityEventsListener.java:411)",
      "app//com.couchbase.analytics.metadata.BucketEventsListener.doResume(BucketEventsListener.java:472)",
      "app//org.apache.asterix.app.active.RecoveryTask.resumeOrRecover(RecoveryTask.java:92)",
      "app//org.apache.asterix.app.active.ActiveEntityEventsListener.resume(ActiveEntityEventsListener.java:658)",
      "app//org.apache.asterix.app.active.ActiveNotificationHandler.resumeOrHalt(ActiveNotificationHandler.java:281)",
      "app//org.apache.asterix.app.active.ActiveNotificationHandler.resume(ActiveNotificationHandler.java:253)",
      "app//com.couchbase.analytics.control.rebalance.ReplicasRecoveryAttempt.resumeActive(ReplicasRecoveryAttempt.java:108)",
      "app//com.couchbase.analytics.control.rebalance.ReplicasRecoveryAttempt.run(ReplicasRecoveryAttempt.java:76)",
      "java.base@11.0.17/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)",
      "java.base@11.0.17/java.util.concurrent.FutureTask.run(FutureTask.java:264)",
      "java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)",
      "java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)",
      "java.base@11.0.17/java.lang.Thread.run(Thread.java:829)"
      
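      The race described above can be pictured with a minimal sketch. This is a hypothetical illustration, not the actual LogManagerWithReplication code: the class `ReplicaAckTracker` and its methods `expectAcks`, `ackReceived`, `replicaFailed`, and `awaitAcks` are assumed names invented for this example. It shows one possible way to avoid an unbounded wait: remember failed replicas under the same lock that guards the pending acks, so a failure reported before the acks are registered is not lost, and bound the wait with a timeout.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch only; names and structure are assumptions,
// not the real Asterix/Couchbase Analytics implementation.
class ReplicaAckTracker {
    private final Object lock = new Object();
    private final Set<String> failedReplicas = new HashSet<>();
    private final Set<String> pendingAcks = new HashSet<>();

    // Register the replicas we expect an ack from.
    void expectAcks(Set<String> replicas) {
        synchronized (lock) {
            for (String replica : replicas) {
                // A failure that arrived before registration must not
                // leave a pending entry that nobody will ever clear.
                if (!failedReplicas.contains(replica)) {
                    pendingAcks.add(replica);
                }
            }
        }
    }

    void ackReceived(String replica) {
        synchronized (lock) {
            pendingAcks.remove(replica);
            lock.notifyAll();
        }
    }

    void replicaFailed(String replica) {
        synchronized (lock) {
            failedReplicas.add(replica);
            pendingAcks.remove(replica);
            lock.notifyAll();
        }
    }

    // Returns true if every expected ack arrived or its replica failed,
    // false if the timeout elapsed or the thread was interrupted.
    boolean awaitAcks(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (lock) {
            while (!pendingAcks.isEmpty()) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    return false;
                }
                try {
                    lock.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }
}
```

      In this sketch a caller that gets `false` back from `awaitAcks` can re-check replica state or fail the operation instead of blocking forever. Whether a timeout, a failed-replica set, or some other mechanism is the right fix for this issue is not stated in the report.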


            People

              murtadha.hubail Murtadha Hubail
              murtadha.hubail Murtadha Hubail