Couchbase Server / MB-35139

[System Test] : Disconnect link local caused a crash on an analytics node

    Description

      Build : 6.5.0-3748
      Test : -test tests/analytics/test_analytics_rebalance.yml -scope tests/analytics/scope_analytics_rebalance.yml
      Scale : 4
      Iteration : 1st

      In the system test, a crash was observed on the analytics node 172.23.96.214 right after the step that disconnects the link.

      The "Disconnect link Local" step completed at 2019-07-16T18:45:33. A few seconds earlier, the JVM on 172.23.96.214 halted:
      2019-07-16T18:45:28.949-07:00 FATA CBAS.util.ExitUtil [org.apache.hyracks.api.rewriter.runtime.SuperActivity:JID:3.1939:TAID:TID:ANID:ODID:2:0:6:0:0] JVM halting with status 103; thread dump at halt:

      Just before the halt, analytics_warn.log on the same node shows the following:

      2019-07-16T18:45:27.282-07:00 WARN CBAS.runtime.BucketOperatorNodePushable [Executor-926:ceee097da8a62da38f0114b3c5582329:JID:3.1939:TAID:TID:ANID:ODID:0:0:15:0:SuperActivityOperatorNodePushable:BucketOperatorNodePushable:(Default.Local.default(CouchbaseMetadataExtension))[15]:BucketOperatorDescriptor] Failure during data ingestion
      org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.InterruptedException
      	at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:51) ~[hyracks-api.jar:6.5.0-3748]
      	at org.apache.hyracks.comm.channels.NetworkOutputChannel.nextFrame(NetworkOutputChannel.java:91) ~[hyracks-comm.jar:6.5.0-3748]
      	at org.apache.hyracks.control.nc.partitions.PipelinedPartition.nextFrame(PipelinedPartition.java:82) ~[hyracks-control-nc.jar:6.5.0-3748]
      	at com.couchbase.analytics.runtime.ProgressFrameTupleAppender.forward(ProgressFrameTupleAppender.java:93) ~[cbas-connector.jar:6.5.0-3748]
      	at com.couchbase.analytics.runtime.ProgressFrameTupleAppender.write(ProgressFrameTupleAppender.java:73) ~[cbas-connector.jar:6.5.0-3748]
      	at com.couchbase.analytics.runtime.ProgressFrameTupleAppender.flush(ProgressFrameTupleAppender.java:105) ~[cbas-connector.jar:6.5.0-3748]
      	at org.apache.hyracks.dataflow.common.comm.io.AbstractFrameAppender.flush(AbstractFrameAppender.java:124) ~[hyracks-dataflow-common.jar:6.5.0-3748]
      	at com.couchbase.analytics.runtime.ProgressPartitionDataWriter.flush(ProgressPartitionDataWriter.java:175) ~[cbas-connector.jar:6.5.0-3748]
      	at org.apache.hyracks.dataflow.common.comm.io.AbstractFrameAppender.flush(AbstractFrameAppender.java:117) ~[hyracks-dataflow-common.jar:6.5.0-3748]
      	at org.apache.hyracks.algebricks.runtime.operators.std.AssignRuntimeFactory$1.nextFrame(AssignRuntimeFactory.java:126) ~[algebricks-runtime.jar:6.5.0-3748]
      	at org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$1.nextFrame(AlgebricksMetaOperatorDescriptor.java:150) ~[algebricks-runtime.jar:6.5.0-3748]
      	at org.apache.hyracks.dataflow.common.comm.util.FrameUtils.flushFrame(FrameUtils.java:50) ~[hyracks-dataflow-common.jar:6.5.0-3748]
      	at org.apache.hyracks.dataflow.std.base.AbstractReplicateOperatorDescriptor$ReplicatorMaterializerActivityNode$1.nextFrame(AbstractReplicateOperatorDescriptor.java:142) ~[hyracks-dataflow-std.jar:6.5.0-3748]
      	at org.apache.hyracks.dataflow.common.comm.io.AbstractFrameAppender.write(AbstractFrameAppender.java:93) ~[hyracks-dataflow-common.jar:6.5.0-3748]
      	at org.apache.asterix.external.dataflow.TupleForwarder.complete(TupleForwarder.java:51) ~[asterix-external-data.jar:6.5.0-3748]
      	at org.apache.asterix.external.dataflow.FeedRecordDataFlowController.finish(FeedRecordDataFlowController.java:169) ~[asterix-external-data.jar:6.5.0-3748]
      	at org.apache.asterix.external.dataflow.FeedRecordDataFlowController.start(FeedRecordDataFlowController.java:113) ~[asterix-external-data.jar:6.5.0-3748]
      	at org.apache.asterix.external.dataset.adapter.FeedAdapter.start(FeedAdapter.java:38) ~[asterix-external-data.jar:6.5.0-3748]
      	at com.couchbase.analytics.runtime.BucketOperatorNodePushable.start(BucketOperatorNodePushable.java:52) [cbas-connector.jar:6.5.0-3748]
      	at org.apache.asterix.active.ActiveSourceOperatorNodePushable.initialize(ActiveSourceOperatorNodePushable.java:102) [asterix-active.jar:6.5.0-3748]
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:212) [hyracks-api.jar:6.5.0-3748]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
      	at java.lang.Thread.run(Thread.java:834) [?:?]
      Caused by: java.lang.InterruptedException
      	at java.lang.Object.wait(Native Method) ~[?:?]
      	at java.lang.Object.wait(Object.java:328) ~[?:?]
      	at org.apache.hyracks.comm.channels.NetworkOutputChannel.nextFrame(NetworkOutputChannel.java:85) ~[hyracks-comm.jar:6.5.0-3748]
      	... 23 more
      2019-07-16T18:45:27.282-07:00 FATA CBAS.runtime.DcpUpdateCallback [org.apache.hyracks.api.rewriter.runtime.SuperActivity:JID:3.1939:TAID:TID:ANID:ODID:2:0:6:0:0] Restarting process to ensure data integrity
      org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.InterruptedException
      	at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:51) ~[hyracks-api.jar:6.5.0-3748]
      	at org.apache.hyracks.control.nc.io.IoRequest.queue(IoRequest.java:105) ~[hyracks-control-nc.jar:6.5.0-3748]
      	at org.apache.hyracks.control.nc.io.IoRequest.read(IoRequest.java:76) ~[hyracks-control-nc.jar:6.5.0-3748]
      	at org.apache.hyracks.control.nc.io.IOManager.asyncRead(IOManager.java:318) ~[hyracks-control-nc.jar:6.5.0-3748]
      	at org.apache.hyracks.control.nc.io.IOManager.syncRead(IOManager.java:249) ~[hyracks-control-nc.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.common.buffercache.AbstractBufferedFileIOManager.readToBuffer(AbstractBufferedFileIOManager.java:250) ~[hyracks-storage-common.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.common.file.BufferedFileHandle.read(BufferedFileHandle.java:75) ~[hyracks-storage-common.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.common.buffercache.BufferCache.read(BufferCache.java:549) ~[hyracks-storage-common.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.common.buffercache.BufferCache.tryRead(BufferCache.java:522) ~[hyracks-storage-common.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.common.buffercache.BufferCache.pin(BufferCache.java:192) ~[hyracks-storage-common.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.am.btree.impls.DiskBTree.searchDown(DiskBTree.java:170) ~[hyracks-storage-am-btree.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.am.btree.impls.DiskBTree.search(DiskBTree.java:95) ~[hyracks-storage-am-btree.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.am.btree.impls.DiskBTree.access$000(DiskBTree.java:44) ~[hyracks-storage-am-btree.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.am.btree.impls.DiskBTree$DiskBTreeAccessor.search(DiskBTree.java:243) ~[hyracks-storage-am-btree.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTreePointSearchCursor.doHasNext(LSMBTreePointSearchCursor.java:82) ~[hyracks-storage-am-lsm-btree.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.common.EnforcedIndexCursor.hasNext(EnforcedIndexCursor.java:69) ~[hyracks-storage-common.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTreeSearchCursor.doHasNext(LSMBTreeSearchCursor.java:60) ~[hyracks-storage-am-lsm-btree.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.common.EnforcedIndexCursor.hasNext(EnforcedIndexCursor.java:69) ~[hyracks-storage-common.jar:6.5.0-3748]
      	at org.apache.asterix.runtime.operators.LSMPrimaryUpsertOperatorNodePushable$1.process(LSMPrimaryUpsertOperatorNodePushable.java:159) [asterix-runtime.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.processFrame(LSMHarness.java:854) ~[hyracks-storage-am-lsm-common.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.batchOperate(LSMHarness.java:705) [hyracks-storage-am-lsm-common.jar:6.5.0-3748]
      	at org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.batchOperate(LSMTreeIndexAccessor.java:214) [hyracks-storage-am-lsm-common.jar:6.5.0-3748]
      	at org.apache.asterix.runtime.operators.LSMPrimaryUpsertOperatorNodePushable.nextFrame(LSMPrimaryUpsertOperatorNodePushable.java:323) [asterix-runtime.jar:6.5.0-3748]
      	at org.apache.asterix.external.feed.dataflow.SyncFeedRuntimeInputHandler.nextFrame(SyncFeedRuntimeInputHandler.java:46) [asterix-external-data.jar:6.5.0-3748]
      	at org.apache.asterix.external.operators.FeedMetaStoreNodePushable.nextFrame(FeedMetaStoreNodePushable.java:151) [asterix-external-data.jar:6.5.0-3748]
      	at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:401) [hyracks-control-nc.jar:6.5.0-3748]
      	at org.apache.hyracks.control.nc.Task.run(Task.java:335) [hyracks-control-nc.jar:6.5.0-3748]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
      	at java.lang.Thread.run(Thread.java:834) [?:?]
      Caused by: java.lang.InterruptedException
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1261) ~[?:?]
      	at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:317) ~[?:?]
      	at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:364) ~[?:?]
      	at org.apache.hyracks.control.nc.io.IoRequest.queue(IoRequest.java:103) ~[hyracks-control-nc.jar:6.5.0-3748]
      	... 28 more
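
      The second trace above bottoms out in java.util.concurrent.ArrayBlockingQueue.put, which fails in ReentrantLock.lockInterruptibly. The sketch below reproduces only that JDK behaviour in isolation: put() on a thread whose interrupt flag is already set throws InterruptedException even when the queue has free space. It is not Couchbase or Hyracks code. The class name, method name, and queue contents are hypothetical, and it makes no claim about what interrupted the ingestion thread in this run.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical standalone demo; not part of Couchbase Analytics or Hyracks.
class InterruptedPutDemo {

    /**
     * Sets the interrupt flag on the current thread and then calls put().
     * Returns true if put() threw InterruptedException.
     */
    static boolean putThrowsWhenInterrupted() {
        // Capacity 1 and empty, so put() would not need to block for space.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);

        // Simulate a cancellation that arrives before the I/O request is queued.
        Thread.currentThread().interrupt();
        try {
            // put() acquires its lock with lockInterruptibly(), which checks
            // the interrupt flag first and throws if it is set.
            queue.put("io-request");
            return false;
        } catch (InterruptedException e) {
            // Throwing InterruptedException clears the thread's interrupt flag.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("put interrupted: " + putThrowsWhenInterrupted());
    }
}
```

      In the log above, the equivalent exception is wrapped in a HyracksDataException by IoRequest.queue and reaches DcpUpdateCallback, which logs "Restarting process to ensure data integrity" before the JVM halts with status 103.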
      

            People

              mihir.kamdar Mihir Kamdar (Inactive)