Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-46462

[CX] unable to drop dataset with corrupted storage

    XMLWordPrintable

Details

    • Untriaged
    • 1
    • Unknown
    • CX Sprint 248, CX Sprint 260

    Description

      1. Create single node cluster
      2. Map beer-sample into analytics
      3. Disconnect link Local
        • note, this won't be necessary once 7.0.0 fixes for MB-46403 are in
      4. Delete the beer-sample directory from I/O device
      5. Drop collection, fails w/ 500 (see below)

      Marking as critical, because AFAICT, the only way to get out of this condition is to failover & remove the node from the cluster, or rebalance it out if it is the only analytics node.

      It seems reasonable that a DROP DATASET should not fail on corrupt / missing storage of the DATASET. Obviously if there is metadata corruption, that would be another matter.

      2021-05-22T08:52:28.675-04:00 INFO CBAS.utils.DatasetUtil [QueryTranslator:a6c83050-bc77-4891-ac0b-531b6d8e2b7c] DROP DATASET: {"bucketName":"beer-sample","whereClause":null,"nodeGroupName":"beer-sample/_default._default","hints":{},"bucketDataverseName":{"canonicalForm":"beer-sample/_default","parts":["beer-sample","_default"],"partCount":2},"scopeName":"_default","datasetName":"_default","recordTypeDataverseName":{"canonicalForm":"Metadata","parts":["Metadata"],"partCount":1},"datasetDetails":"org.apache.asterix.metadata.entities.InternalDatasetDetails@7f9d9603","pendingOp":"Pending No Operation","collectionName":"_default","metaTypeName":"DCPMeta","rebalanceCount":0,"datasetId":"101","compactionPolicyFactory":"concurrent","compactionPolicyProperties":{"max-component-count":"30","min-merge-component-count":"3","max-merge-component-count":"10","size-ratio":"1.2"},"metaTypeDataverseName":{"canonicalForm":"Metadata","parts":["Metadata"],"partCount":1},"dataverseName":{"canonicalForm":"beer-sample/_default","parts":["beer-sample","_default"],"partCount":2},"bucketLinkName":"Local","datasetType":"INTERNAL","collectionId":"0x0","compressionScheme":"snappy","recordTypeName":"AnyObject"}
      2021-05-22T08:52:28.681-04:00 INFO CBAS.context.GlobalVirtualBufferCache [QueryTranslator:a6c83050-bc77-4891-ac0b-531b6d8e2b7c] Registered metadata index {"class" : "LSMBTree", "dir" : "/Users/michaelblow/dev/master/ns_server/data/n_0/datadir/@analytics/v_iodevice_0/storage/partition_0/Metadata/Index/0/Index", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}, {"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}], "disk" : 2, "num-scheduled-flushes":0, "current-memory-component":0} to the global VBC
      2021-05-22T08:52:28.681-04:00 INFO CBAS.context.GlobalVirtualBufferCache [QueryTranslator:a6c83050-bc77-4891-ac0b-531b6d8e2b7c] Registered metadata index {"class" : "LSMBTree", "dir" : "/Users/michaelblow/dev/master/ns_server/data/n_0/datadir/@analytics/v_iodevice_0/storage/partition_0/Metadata/Dataset/0/Dataset", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}, {"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}], "disk" : 2, "num-scheduled-flushes":0, "current-memory-component":0} to the global VBC
      2021-05-22T08:52:28.741-04:00 INFO CBAS.dataflow.IndexDataflowHelper [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:0:0] Dropping index storage/partition_0/beer-sample/^_default/_default/0/_default on node 0fa315401d95eb5ce7e967e57f5fa545
      2021-05-22T08:52:28.741-04:00 INFO CBAS.dataflow.IndexDataflowHelper [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:2:0] Dropping index storage/partition_2/beer-sample/^_default/_default/0/_default on node 0fa315401d95eb5ce7e967e57f5fa545
      2021-05-22T08:52:28.741-04:00 INFO CBAS.dataflow.IndexDataflowHelper [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:1:0] Dropping index storage/partition_1/beer-sample/^_default/_default/0/_default on node 0fa315401d95eb5ce7e967e57f5fa545
      2021-05-22T08:52:28.768-04:00 INFO CBAS.impls.AbstractLSMIndex [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:0:0] Deactivating the index: {"class" : "LSMBTree", "dir" : "/Users/michaelblow/dev/master/ns_server/data/n_0/datadir/@analytics/v_iodevice_0/storage/partition_0/beer-sample/^_default/_default/0/_default", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}, {"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}], "disk" : 0, "num-scheduled-flushes":0, "current-memory-component":0}. STARTED
      2021-05-22T08:52:28.768-04:00 INFO CBAS.impls.AbstractLSMIndex [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:0:0] Deactivating the disk components of: {"class" : "LSMBTree", "dir" : "/Users/michaelblow/dev/master/ns_server/data/n_0/datadir/@analytics/v_iodevice_0/storage/partition_0/beer-sample/^_default/_default/0/_default", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}, {"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}], "disk" : 0, "num-scheduled-flushes":0, "current-memory-component":0}
      2021-05-22T08:52:28.768-04:00 INFO CBAS.impls.AbstractLSMIndex [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:0:0] Deallocating memory components of: {"class" : "LSMBTree", "dir" : "/Users/michaelblow/dev/master/ns_server/data/n_0/datadir/@analytics/v_iodevice_0/storage/partition_0/beer-sample/^_default/_default/0/_default", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}, {"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}], "disk" : 0, "num-scheduled-flushes":0, "current-memory-component":0}
      2021-05-22T08:52:28.768-04:00 INFO CBAS.impls.AbstractLSMIndex [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:0:0] Deactivating the index: {"class" : "LSMBTree", "dir" : "/Users/michaelblow/dev/master/ns_server/data/n_0/datadir/@analytics/v_iodevice_0/storage/partition_0/beer-sample/^_default/_default/0/_default", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}, {"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}], "disk" : 0, "num-scheduled-flushes":0, "current-memory-component":0}. COMPLETED
      2021-05-22T08:52:28.772-04:00 ERRO CBAS.dataflow.IndexDataflowHelper [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:1:0] index storage/partition_1/beer-sample/^_default/_default/0/_default does not exist
      2021-05-22T08:52:28.773-04:00 WARN CBAS.nc.Task [SA:JID:0.23:TAID:TID:ANID:ODID:0:0:1:0:0] Task failed with exception
      org.apache.hyracks.api.exceptions.HyracksDataException: HYR0104: Index does not exist
      	at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:57) ~[hyracks-api.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.storage.am.common.dataflow.IndexDataflowHelper.readIndex(IndexDataflowHelper.java:77) ~[hyracks-storage-am-common.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.storage.am.common.dataflow.IndexDataflowHelper.destroy(IndexDataflowHelper.java:99) ~[hyracks-storage-am-common.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.storage.am.common.dataflow.IndexDropOperatorNodePushable.dropIndex(IndexDropOperatorNodePushable.java:82) ~[hyracks-storage-am-common.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.storage.am.common.dataflow.IndexDropOperatorNodePushable.initialize(IndexDropOperatorNodePushable.java:71) ~[hyracks-storage-am-common.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:227) ~[hyracks-api.jar:0.3.6-SNAPSHOT]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
      	at java.lang.Thread.run(Thread.java:834) [?:?]
      2021-05-22T08:52:28.773-04:00 WARN CBAS.work.NotifyTaskFailureWork [Worker:0fa315401d95eb5ce7e967e57f5fa545] task TAID:TID:ANID:ODID:0:0:1:0 has failed
      org.apache.hyracks.api.exceptions.HyracksDataException: HYR0104: Index does not exist
      	at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:70) ~[hyracks-api.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.api.util.ExceptionUtils.setNodeIds(ExceptionUtils.java:70) ~[hyracks-api.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.control.nc.Task.run(Task.java:390) ~[hyracks-control-nc.jar:0.3.6-SNAPSHOT]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
      	at java.lang.Thread.run(Thread.java:834) ~[?:?]
      2021-05-22T08:52:28.774-04:00 WARN CBAS.work.TaskFailureWork [Worker:ClusterController] Executing task failure work for TaskFailure: [JID:0.23:TAID:TID:ANID:ODID:0:0:1:0:0fa315401d95eb5ce7e967e57f5fa545]
      org.apache.hyracks.api.exceptions.HyracksDataException: HYR0104: Index does not exist
      	at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:70) ~[hyracks-api.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.api.util.ExceptionUtils.setNodeIds(ExceptionUtils.java:70) ~[hyracks-api.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.control.nc.Task.run(Task.java:390) ~[hyracks-control-nc.jar:0.3.6-SNAPSHOT]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
      	at java.lang.Thread.run(Thread.java:834) ~[?:?]
      2021-05-22T08:52:28.774-04:00 INFO CBAS.work.JobCleanupWork [Worker:ClusterController] Cleanup for job: JID:0.23
      2021-05-22T08:52:28.775-04:00 INFO CBAS.work.AbortTasksWork [Worker:0fa315401d95eb5ce7e967e57f5fa545] Aborting Tasks: JID:0.23:[TAID:TID:ANID:ODID:0:0:2:0]
      2021-05-22T08:52:28.799-04:00 INFO CBAS.impls.AbstractLSMIndex [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:2:0] Deactivating the index: {"class" : "LSMBTree", "dir" : "/Users/michaelblow/dev/master/ns_server/data/n_0/datadir/@analytics/v_iodevice_2/storage/partition_2/beer-sample/^_default/_default/0/_default", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}, {"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}], "disk" : 0, "num-scheduled-flushes":0, "current-memory-component":0}. STARTED
      2021-05-22T08:52:28.799-04:00 INFO CBAS.impls.AbstractLSMIndex [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:2:0] Deactivating the disk components of: {"class" : "LSMBTree", "dir" : "/Users/michaelblow/dev/master/ns_server/data/n_0/datadir/@analytics/v_iodevice_2/storage/partition_2/beer-sample/^_default/_default/0/_default", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}, {"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}], "disk" : 0, "num-scheduled-flushes":0, "current-memory-component":0}
      2021-05-22T08:52:28.799-04:00 INFO CBAS.impls.AbstractLSMIndex [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:2:0] Deallocating memory components of: {"class" : "LSMBTree", "dir" : "/Users/michaelblow/dev/master/ns_server/data/n_0/datadir/@analytics/v_iodevice_2/storage/partition_2/beer-sample/^_default/_default/0/_default", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}, {"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}], "disk" : 0, "num-scheduled-flushes":0, "current-memory-component":0}
      2021-05-22T08:52:28.799-04:00 INFO CBAS.impls.AbstractLSMIndex [SAO:JID:0.23:TAID:TID:ANID:ODID:0:0:2:0] Deactivating the index: {"class" : "LSMBTree", "dir" : "/Users/michaelblow/dev/master/ns_server/data/n_0/datadir/@analytics/v_iodevice_2/storage/partition_2/beer-sample/^_default/_default/0/_default", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}, {"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}], "disk" : 0, "num-scheduled-flushes":0, "current-memory-component":0}. COMPLETED
      2021-05-22T08:52:28.804-04:00 INFO CBAS.work.JobCleanupWork [Worker:ClusterController] Cleanup for job: JID:0.23
      2021-05-22T08:52:28.804-04:00 ERRO CBAS.translator.QueryTranslator [QueryTranslator:a6c83050-bc77-4891-ac0b-531b6d8e2b7c] failed to drop dataset; executing compensating operations
      org.apache.hyracks.api.exceptions.HyracksDataException: HYR0104: Index does not exist
      	at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:70) ~[hyracks-api.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.api.util.ExceptionUtils.setNodeIds(ExceptionUtils.java:70) ~[hyracks-api.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.control.nc.Task.run(Task.java:390) ~[hyracks-control-nc.jar:0.3.6-SNAPSHOT]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
      	at java.lang.Thread.run(Thread.java:834) [?:?]
      2021-05-22T08:52:28.807-04:00 INFO CBAS.utils.DatasetUtil [QueryTranslator:a6c83050-bc77-4891-ac0b-531b6d8e2b7c] DROP DATASET: {"bucketName":"beer-sample","whereClause":null,"nodeGroupName":"beer-sample/_default._default","hints":{},"bucketDataverseName":{"canonicalForm":"beer-sample/_default","parts":["beer-sample","_default"],"partCount":2},"scopeName":"_default","datasetName":"_default","recordTypeDataverseName":{"canonicalForm":"Metadata","parts":["Metadata"],"partCount":1},"datasetDetails":"org.apache.asterix.metadata.entities.InternalDatasetDetails@7f9d9603","pendingOp":"Pending No Operation","collectionName":"_default","metaTypeName":"DCPMeta","rebalanceCount":0,"datasetId":"101","compactionPolicyFactory":"concurrent","compactionPolicyProperties":{"max-component-count":"30","min-merge-component-count":"3","max-merge-component-count":"10","size-ratio":"1.2"},"metaTypeDataverseName":{"canonicalForm":"Metadata","parts":["Metadata"],"partCount":1},"dataverseName":{"canonicalForm":"beer-sample/_default","parts":["beer-sample","_default"],"partCount":2},"bucketLinkName":"Local","datasetType":"INTERNAL","collectionId":"0x0","compressionScheme":"snappy","recordTypeName":"AnyObject"}
      2021-05-22T08:52:28.916-04:00 WARN CBAS.apache.asterix [Executor-10:ClusterController] HYR0104: Index does not exist
      org.apache.hyracks.api.exceptions.HyracksDataException: HYR0104: Index does not exist
      	at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:70) ~[hyracks-api.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.api.util.ExceptionUtils.setNodeIds(ExceptionUtils.java:70) ~[hyracks-api.jar:0.3.6-SNAPSHOT]
      	at org.apache.hyracks.control.nc.Task.run(Task.java:390) ~[hyracks-control-nc.jar:0.3.6-SNAPSHOT]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
      	at java.lang.Thread.run(Thread.java:834) [?:?]
      	Suppressed: org.apache.asterix.common.exceptions.ACIDException: Transaction TxnId:59 doesn't exist.
      		at org.apache.asterix.transaction.management.service.transaction.TransactionManager.getTransactionContext(TransactionManager.java:74) ~[asterix-transactions.jar:0.9.6-SNAPSHOT]
      		at org.apache.asterix.metadata.MetadataNode.modifyMetadataIndex(MetadataNode.java:470) ~[asterix-metadata.jar:0.9.6-SNAPSHOT]
      		at org.apache.asterix.metadata.MetadataNode.deleteTupleFromIndex(MetadataNode.java:768) ~[asterix-metadata.jar:0.9.6-SNAPSHOT]
      		at org.apache.asterix.metadata.MetadataNode.dropDataset(MetadataNode.java:659) ~[asterix-metadata.jar:0.9.6-SNAPSHOT]
      		at org.apache.asterix.metadata.MetadataManager.dropDataset(MetadataManager.java:271) ~[asterix-metadata.jar:0.9.6-SNAPSHOT]
      		at org.apache.asterix.metadata.entities.Dataset.drop(Dataset.java:367) ~[asterix-metadata.jar:0.9.6-SNAPSHOT]
      		at com.couchbase.analytics.metadata.ShadowDataset.drop(ShadowDataset.java:173) ~[cbas-connector.jar:<dev build>]
      		at org.apache.asterix.app.translator.QueryTranslator.doDropDataset(QueryTranslator.java:1722) ~[asterix-app.jar:0.9.6-SNAPSHOT]
      		at com.couchbase.analytics.lang.CBStatementExecutor.doDropDataset(CBStatementExecutor.java:784) ~[cbas-connector.jar:<dev build>]
      		at org.apache.asterix.app.translator.QueryTranslator.handleDatasetDropStatement(QueryTranslator.java:1661) ~[asterix-app.jar:0.9.6-SNAPSHOT]
      		at com.couchbase.analytics.lang.CBStatementExecutor.handleDatasetDropStatement(CBStatementExecutor.java:770) ~[cbas-connector.jar:<dev build>]
      		at org.apache.asterix.app.translator.QueryTranslator.compileAndExecute(QueryTranslator.java:359) ~[asterix-app.jar:0.9.6-SNAPSHOT]
      		at org.apache.asterix.app.message.ExecuteStatementRequestMessage.handle(ExecuteStatementRequestMessage.java:170) ~[asterix-app.jar:0.9.6-SNAPSHOT]
      		at org.apache.asterix.messaging.CCMessageBroker.receivedMessage(CCMessageBroker.java:64) ~[asterix-app.jar:0.9.6-SNAPSHOT]
      		at org.apache.hyracks.control.cc.work.ApplicationMessageWork.lambda$notifyMessageBroker$0(ApplicationMessageWork.java:68) ~[hyracks-control-cc.jar:0.3.6-SNAPSHOT]
      		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
      		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
      		at java.lang.Thread.run(Thread.java:834) [?:?]
      

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            People

              umang.agrawal Umang
              michael.blow Michael Blow
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty