Couchbase Server / MB-27999

Analytics System Test: 504 error and analytics NC down during a longevity test that had been running for 10 days

Details

    • Untriaged
    • No
    • CX Sprint 91, CX Sprint 93

    Description

      Ran the simple longevity test for 10 days. There are 4 KV buckets; each has one corresponding CBAS bucket and one non-filtered dataset.

      The test runs with continuous KV ops (creates, updates, and deletes).

      The environment is live if you want to debug:
      172.23.108.231 (kv)
      172.23.108.232 (kv)
      172.23.108.233 (analytics)
      172.23.108.234 (analytics)

      The UI shows: Error retrieving buckets state, contacting analytics service returned status: -1

      CC node analytics log says: analytics.log:org.apache.hyracks.api.exceptions.HyracksException: Node 597a18c441ceeb3ace29fba5544ff07c not live
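
For triage, the ID of the node reported as not live can be pulled out of log lines like the one above so it can be matched against the node IDs in the NC logs. The following is a minimal sketch, not part of the product: the class name `NotLiveNodeScanner` and the method `notLiveNodeId` are hypothetical, and the pattern assumes only the message wording quoted in this report.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical triage helper; not part of Couchbase or Hyracks.
public class NotLiveNodeScanner {
    // Assumes the message text seen in the CC log: "Node <hex id> not live".
    private static final Pattern NOT_LIVE = Pattern.compile("Node ([0-9a-f]+) not live");

    /** Returns the node ID if the line reports a node as not live, otherwise empty. */
    public static Optional<String> notLiveNodeId(String logLine) {
        Matcher m = NOT_LIVE.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // The CC log line quoted in this report.
        String line = "analytics.log:org.apache.hyracks.api.exceptions.HyracksException: "
                + "Node 597a18c441ceeb3ace29fba5544ff07c not live";
        System.out.println(notLiveNodeId(line).orElse("no match"));
    }
}
```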

      Exception in the analytics log on the NC node:

      2018-02-06T18:26:21.126-08:00 WARN CBAS.work.NotifyTaskFailureWork [Worker:325581374ffcef84a21ed2c4d5c31bca] 325581374ffcef84a21ed2c4d5c31bca is sending a notification to cc that task TAID:TID:ANID:ODID:2:0:0:0 has failed
      org.apache.hyracks.api.exceptions.HyracksDataException: HYR0003: java.lang.InterruptedException
              at org.apache.hyracks.control.common.utils.ExceptionUtils.setNodeIds(ExceptionUtils.java:68) ~[hyracks-control-common-0.3.3-SNAPSHOT.jar:0.3.3-SNAPSHOT]
              at org.apache.hyracks.control.nc.Task.run(Task.java:368) ~[hyracks-control-nc-0.3.3-SNAPSHOT.jar:0.3.3-SNAPSHOT]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_152]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_152]
              at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_152]
      Caused by: java.lang.InterruptedException
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1302) ~[?:1.8.0_152]
              at java.util.concurrent.Semaphore.acquire(Semaphore.java:467) ~[?:1.8.0_152]
              at org.apache.hyracks.control.nc.Task.run(Task.java:326) ~[hyracks-control-nc-0.3.3-SNAPSHOT.jar:0.3.3-SNAPSHOT]
              ... 3 more
      2018-02-06T18:26:21.142-08:00 INFO CBAS.adapter.CouchbaseConnector [Executor-15:325581374ffcef84a21ed2c4d5c31bca:BucketOperatorNodePushable:(Default.bucket1(CouchbaseMetadataExtension))[1]:BucketOperatorDescriptor] bucket1:325581374ffcef84a21ed2c4d5c31bca:1 stopping...
      2018-02-06T18:26:21.142-08:00 INFO CBAS.adapter.CouchbaseConnector [Executor-15:325581374ffcef84a21ed2c4d5c31bca:BucketOperatorNodePushable:(Default.bucket1(CouchbaseMetadataExtension))[1]:BucketOperatorDescriptor] bucket1:325581374ffcef84a21ed2c4d5c31bca:1 disconnecting...
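
The "Caused by" frames above show the task thread inside `Semaphore.acquire()` when the `InterruptedException` was raised. The sketch below reproduces only that JDK behavior in isolation: a thread waiting on a semaphore with no permits is woken with `InterruptedException` when another thread interrupts it. It is not Hyracks code, the class name `SemaphoreInterruptDemo` is hypothetical, and it says nothing about what interrupted the task in this run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustration of JDK semantics only; names here are hypothetical.
public class SemaphoreInterruptDemo {
    /** Returns true if the worker's acquire() ended with InterruptedException. */
    public static boolean interruptBlockedAcquire() throws InterruptedException {
        Semaphore noPermits = new Semaphore(0);          // acquire() can never succeed
        AtomicBoolean interrupted = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            started.countDown();
            try {
                noPermits.acquire();                     // blocks: no permits available
            } catch (InterruptedException e) {
                interrupted.set(true);                   // same exception type as in the trace
            }
        });
        worker.start();
        started.await();                                 // worker is running
        worker.interrupt();                              // acquire() throws whether the interrupt
                                                         // lands before or during the wait
        worker.join();
        return interrupted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("interrupted while waiting: " + interruptBlockedAcquire());
    }
}
```

In the trace, the exception is then wrapped in a `HyracksDataException` (HYR0003) and reported to the CC as a task failure, so the interrupt itself is a symptom; the component that issued it still needs to be identified from the logs.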
      

      From the logs, ingestion also appears to have been interrupted:

      2018-02-06T18:26:21.390-08:00 INFO CBAS.adapter.CouchbaseConnector [Executor-15:325581374ffcef84a21ed2c4d5c31bca:BucketOperatorNodePushable:(Default.bucket1(CouchbaseMetadataExtension))[1]:BucketOperatorDescriptor] bucket1:325581374ffcef84a21ed2c4d5c31bca:1 closed...
      2018-02-06T18:26:21.391-08:00 INFO CBAS.dataflow.FeedRecordDataFlowController [Executor-15:325581374ffcef84a21ed2c4d5c31bca:BucketOperatorNodePushable:(Default.bucket1(CouchbaseMetadataExtension))[1]:BucketOperatorDescriptor] State is being set from STARTED to STOPPED
      2018-02-06T18:26:21.391-08:00 WARN CBAS.runtime.BucketOperatorNodePushable [Executor-15:325581374ffcef84a21ed2c4d5c31bca:BucketOperatorNodePushable:(Default.bucket1(CouchbaseMetadataExtension))[1]:BucketOperatorDescriptor] Failure during data ingestion
      org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.InterruptedException
              at org.apache.hyracks.comm.channels.NetworkOutputChannel.nextFrame(NetworkOutputChannel.java:79) ~[hyracks-comm-0.3.3-SNAPSHOT.jar:0.3.3-SNAPSHOT]
              at org.apache.hyracks.control.nc.partitions.PipelinedPartition.nextFrame(PipelinedPartition.java:82) ~[hyracks-control-nc-0.3.3-SNAPSHOT.jar:0.3.3-SNAPSHOT]
              at com.couchbase.analytics.runtime.ProgressFrameTupleAppender.forward(ProgressFrameTupleAppender.java:91) ~[cbas-connector-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
              at com.couchbase.analytics.runtime.ProgressFrameTupleAppender.write(ProgressFrameTupleAppender.java:71) ~[cbas-connector-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
              at org.apache.hyracks.dataflow.common.comm.util.FrameUtils.appendToWriter(FrameUtils.java:159) ~[hyracks-dataflow-common-0.3.3-SNAPSHOT.jar:0.3.3-SNAPSHOT]
              at com.couchbase.analytics.runtime.ProgressPartitionDataWriter.doNextFrame(ProgressPartitionDataWriter.java:140) ~[cbas-connector-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
              at com.couchbase.analytics.runtime.ProgressPartitionDataWriter.nextFrame(ProgressPartitionDataWriter.java:125) ~[cbas-connector-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
              at org.apache.hyracks.dataflow.common.comm.util.FrameUtils.flushFrame(FrameUtils.java:50) ~[hyracks-dataflow-common-0.3.3-SNAPSHOT.jar:0.3.3-SNAPSHOT]
              at org.apache.hyracks.dataflow.std.base.AbstractReplicateOperatorDescriptor$ReplicatorMaterializerActivityNode$1.nextFrame(AbstractReplicateOperatorDescriptor.java:142) ~[hyracks-dataflow-std-0.3.3-SNAPSHOT.jar:0.3.3-SNAPSHOT]
      
      

          People

            ritesh.agarwal Ritesh Agarwal