Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-23626

[CBAS] Clustered setup : CBAS Cluster became unusable after flushing the KV bucket

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • CBAS DP2
    • CBAS DP2
    • analytics
    • None
    • Untriaged
    • Yes
    • CX Sprint 47

    Description

      Build 662 (RC1)

      In the testcase where after initial ingestion, KV bucket is flushed, once on a clustered CBAS setup, there were errors, and then the CBAS cluster became unusable. It is only once that this issue was seen. Upon re-run, the error wasnt seen. Also, on a single node setup on Linux and Windows, this error wasnt seen.

      2017-03-30T04:50:37.737-07:00 INFO CBAS.work.WorkQueue [Worker:ClusterController] Executing: JobCleanup: JobId@JID:259 Status@FAILURE Exceptions@[org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: Cannot remove index while it is open. (Dataset reference count = 1, Operation tracker number of active operations = 0)]
      2017-03-30T04:50:37.737-07:00 INFO CBAS.work.JobCleanupWork [Worker:ClusterController] Cleanup for JobRun with id: JID:259
      2017-03-30T04:50:37.737-07:00 INFO CBAS.work.WorkQueue [Worker:ClusterController] Executing: JobCleanup: JobId@JID:258 Status@TERMINATED
      2017-03-30T04:50:37.737-07:00 INFO CBAS.work.JobCleanupWork [Worker:ClusterController] Cleanup for JobRun with id: JID:258
      2017-03-30T04:50:37.738-07:00 INFO CBAS.work.WorkQueue [Worker:ClusterController] Executing: JobletCleanupNotification
      2017-03-30T04:50:37.738-07:00 INFO CBAS.active.ActiveLifecycleListener [Worker:ClusterController] NO NEED TO NOTIFY JOB FINISH!
      2017-03-30T04:50:37.742-07:00 INFO CBAS.impl.IPCSystem [IPC Network Listener Thread [/0.0.0.0:48415]] Exception in message MSG[852:688:3:org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: Cannot remove index while it is open. (Dataset reference count = 1, Operation tracker number of active operations = 0)]
      2017-03-30T04:50:37.742-07:00 WARN CBAS.work.WorkQueue [Worker:ClusterController] Work JobletCleanupNotification waited 0 times (~0ms), blocked 1 times (~0ms)
      2017-03-30T04:50:37.742-07:00 INFO CBAS.work.WorkQueue [Worker:ClusterController] Executing: JobletCleanupNotification
      2017-03-30T04:50:37.742-07:00 INFO CBAS.active.ActiveLifecycleListener [Worker:ClusterController] NO NEED TO NOTIFY JOB FINISH!
      2017-03-30T04:50:37.743-07:00 INFO CBAS.work.WorkQueue [Worker:ClusterController] Executing: JobStart
      2017-03-30T04:50:37.744-07:00 INFO CBAS.dataset.DatasetDirectoryService [Worker:ClusterController] DatasetDirectoryService notified of new job JID:260
      2017-03-30T04:50:37.744-07:00 INFO CBAS.executor.ActivityClusterPlanner [Worker:ClusterController] Plan for org.apache.hyracks.api.job.ActivityCluster@492a90cd
      2017-03-30T04:50:37.744-07:00 INFO CBAS.executor.ActivityClusterPlanner [Worker:ClusterController] Built 1 Task Clusters
      2017-03-30T04:50:37.744-07:00 INFO CBAS.executor.ActivityClusterPlanner [Worker:ClusterController] Tasks: [TID:ANID:ODID:0:0:0]
      2017-03-30T04:50:37.745-07:00 INFO CBAS.work.WorkQueue [Worker:ClusterController] Executing: GetResultPartitionLocations: JobId@JID:258 ResultSetId@RSID:0 Known@null
      2017-03-30T04:50:37.745-07:00 INFO CBAS.work.WorkQueue [Worker:ClusterController] Executing: WaitForJobCompletion
      2017-03-30T04:50:37.749-07:00 INFO CBAS.work.WorkQueue [Worker:ClusterController] Executing: TaskComplete: [1[JID:260:TAID:TID:ANID:ODID:0:0:0:0]
      2017-03-30T04:50:37.749-07:00 INFO CBAS.work.WorkQueue [Worker:ClusterController] Executing: JobCleanup: JobId@JID:260 Status@TERMINATED
      2017-03-30T04:50:37.749-07:00 INFO CBAS.work.JobCleanupWork [Worker:ClusterController] Cleanup for JobRun with id: JID:260
      2017-03-30T04:50:37.750-07:00 INFO CBAS.work.WorkQueue [Worker:ClusterController] Executing: JobletCleanupNotification
      2017-03-30T04:50:37.750-07:00 INFO CBAS.active.ActiveLifecycleListener [Worker:ClusterController] NO NEED TO NOTIFY JOB FINISH!
      2017-03-30T04:50:37.757-07:00 ERRO CBAS.metadata.BucketEventsListener [Executor-10:ClusterController] Failed to delete a dataset{"nodeGroupName":"DEFAULT_NG_ALL_NODES","hints":{},"datasetName":"default_ds","recordTypeDataverseName":"Metadata","datasetDetails":"org.apache.asterix.metadata.entities.InternalDatasetDetails@31a3b31a","pendingOp":"Pending No Operation","metaTypeName":"DCPMeta","datasetId":"137","compactionPolicyFactory":"prefix","compactionPolicyProperties":{"max-mergable-component-size":"1073741824","max-tolerance-component-count":"5"},"metaTypeDataverseName":"Metadata","dataverseName":"Default","datasetType":"INTERNAL","recordTypeName":"AnyObject"}
      org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: Cannot remove index while it is open. (Dataset reference count = 1, Operation tracker number of active operations = 0)
      	at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:50) ~[hyracks-api-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at org.apache.hyracks.control.common.utils.ExceptionUtils.setNodeIds(ExceptionUtils.java:62) ~[hyracks-control-common-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at org.apache.hyracks.control.nc.Task.run(Task.java:330) ~[hyracks-control-nc-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101]
      	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
      Caused by: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: Cannot remove index while it is open. (Dataset reference count = 1, Operation tracker number of active operations = 0)
      	at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_101]
      	at java.util.concurrent.FutureTask.get(FutureTask.java:192) [?:1.8.0_101]
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:222) ~[hyracks-api-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:84) ~[hyracks-api-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at org.apache.hyracks.control.nc.Task.run(Task.java:273) ~[hyracks-control-nc-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	... 3 more
      Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Cannot remove index while it is open. (Dataset reference count = 1, Operation tracker number of active operations = 0)
      	at org.apache.asterix.common.context.DatasetLifecycleManager.unregister(DatasetLifecycleManager.java:138) ~[asterix-common-0.9.1-SNAPSHOT.jar:0.9.1-SNAPSHOT]
      	at org.apache.hyracks.storage.am.common.dataflow.IndexDataflowHelper.destroy(IndexDataflowHelper.java:138) ~[hyracks-storage-am-common-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at org.apache.hyracks.storage.am.common.dataflow.IndexDropOperatorNodePushable.initialize(IndexDropOperatorNodePushable.java:53) ~[hyracks-storage-am-common-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:86) ~[hyracks-api-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$2(SuperActivityOperatorNodePushable.java:216) ~[hyracks-api-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_101]
      	... 3 more
      2017-03-30T04:50:37.762-07:00 INFO CBAS.utils.ClusterStateManager [Executor-10:ClusterController] Cluster State is now UNUSABLE
      2017-03-30T04:50:47.778-07:00 INFO CBAS.server.QueryServiceServlet [HttpExecutor(port:8095)-2] {"host":"172.23.108.232:8095","path":"/analytics/service","statement":"select count(*) from default_ds;","pretty":true,"mode":"null","clientContextID":"null"}
      2017-03-30T04:50:47.778-07:00 ERRO CBAS.apache.asterix [HttpExecutor(port:8095)-2] Cannot execute request, cluster is UNUSABLE
      java.lang.IllegalStateException: Cannot execute request, cluster is UNUSABLE
      	at org.apache.asterix.api.http.server.QueryServiceServlet.handleRequest(QueryServiceServlet.java:424) [asterix-app-0.9.1-SNAPSHOT.jar:0.9.1-SNAPSHOT]
      	at org.apache.asterix.api.http.server.QueryServiceServlet.post(QueryServiceServlet.java:87) [asterix-app-0.9.1-SNAPSHOT.jar:0.9.1-SNAPSHOT]
      	at org.apache.hyracks.http.server.AbstractServlet.handle(AbstractServlet.java:78) [hyracks-http-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at org.apache.hyracks.http.server.HttpRequestHandler.handle(HttpRequestHandler.java:70) [hyracks-http-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at org.apache.hyracks.http.server.HttpRequestHandler.call(HttpRequestHandler.java:55) [hyracks-http-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at org.apache.hyracks.http.server.HttpRequestHandler.call(HttpRequestHandler.java:36) [hyracks-http-0.3.1-SNAPSHOT.jar:0.3.1-SNAPSHOT]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_101]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101]
      	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
      

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            Abdullah.Alamoudi Abdullah Alamoudi [X] (Inactive)
            mihir.kamdar Mihir Kamdar (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty