  1. Couchbase Server
  2. MB-62121

Failed during startup due to Unequal number of trees and filters found in v_iodevice_3/storage/partition_67

Details

    Description

      2024-05-31T16:13:39.859+00:00 ERRO CBAS.message.RegistrationTasksResponseMessage [Executor-8:7b5426716aa8ced8788f457785a64498] Failed during startup task
      org.apache.hyracks.api.exceptions.HyracksDataException: HYR0087: Unequal number of trees and filters found in /var/cb-cache/@analytics/v_iodevice_3/storage/partition_67/Default/Default/remote_y26vh_volCollection_3_bbrad/0/remote_y26vh_volCollection_3_bbrad
              at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:57) ~[hyracks-api.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTreeFileManager.cleanupAndGetValidFiles(LSMBTreeFileManager.java:100) ~[hyracks-storage-am-lsm-btree.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMIndex.loadDiskComponents(AbstractLSMIndex.java:206) ~[hyracks-storage-am-lsm-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMIndex.activate(AbstractLSMIndex.java:200) ~[hyracks-storage-am-lsm-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.column.impls.lsm.LSMColumnBTree.activate(LSMColumnBTree.java:89) ~[hyracks-storage-am-lsm-btree-column.jar:1.0.0-2117]
              at org.apache.asterix.common.context.DatasetLifecycleManager.open(DatasetLifecycleManager.java:220) ~[asterix-common.jar:1.0.0-2117]
              at com.couchbase.analytics.bootstrap.AnalyticsLocalRecoveryManager.cleanUp(AnalyticsLocalRecoveryManager.java:95) ~[columnar-server.jar:1.0.0-2117]
              at com.couchbase.analytics.bootstrap.AnalyticsLocalRecoveryManager.startLocalRecovery(AnalyticsLocalRecoveryManager.java:58) ~[columnar-server.jar:1.0.0-2117]
              at org.apache.asterix.app.nc.task.LocalRecoveryTask.perform(LocalRecoveryTask.java:45) ~[asterix-app.jar:1.0.0-2117]
              at org.apache.asterix.app.replication.message.RegistrationTasksResponseMessage.handle(RegistrationTasksResponseMessage.java:63) ~[asterix-app.jar:1.0.0-2117]
              at org.apache.asterix.messaging.NCMessageBroker.lambda$receivedMessage$0(NCMessageBroker.java:108) ~[asterix-app.jar:1.0.0-2117]
              at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
              at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
              at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
              at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
              at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
      2024-05-31T16:13:39.862+00:00 INFO CBAS.util.ExitUtil [ShutdownWatchdog] starting shutdown watchdog- system will halt if shutdown is not completed within 600 seconds
      2024-05-31T16:13:39.862+00:00 WARN CBAS.util.ExitUtil [JVM exit thread] JVM exiting with status 2; bye!
      java.lang.Throwable: exit callstack
              at org.apache.hyracks.util.ExitUtil.exit(ExitUtil.java:92) ~[hyracks-util.jar:1.0.0-2117]
              at org.apache.asterix.app.replication.message.RegistrationTasksResponseMessage.handle(RegistrationTasksResponseMessage.java:90) ~[asterix-app.jar:1.0.0-2117]
              at org.apache.asterix.messaging.NCMessageBroker.lambda$receivedMessage$0(NCMessageBroker.java:108) ~[asterix-app.jar:1.0.0-2117]
              at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
              at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
              at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
              at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
              at java.base/java.lang.Thread.run(Thread.java:840) ~[?:?]
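
      A possible way to inspect the on-disk state behind HYR0087 is to list the index directory named in the exception and compare tree files against filter files. The sketch below is only a triage aid, not the check that LSMBTreeFileManager performs. The "_b" (B-tree) and "_f" (bloom filter) suffixes are an assumption about the component file naming and should be verified against the actual directory listing; the function names are hypothetical.

```python
import os

# ASSUMPTION: component files end in "_b" (B-tree) or "_f" (bloom filter).
# Verify these suffixes against a real listing before trusting the result.
TREE_SUFFIX = "_b"
FILTER_SUFFIX = "_f"


def count_component_files(index_dir, tree_suffix=TREE_SUFFIX, filter_suffix=FILTER_SUFFIX):
    """Count regular files in index_dir by assumed component type."""
    counts = {"trees": 0, "filters": 0, "other": 0}
    for name in os.listdir(index_dir):
        if not os.path.isfile(os.path.join(index_dir, name)):
            continue
        if name.endswith(tree_suffix):
            counts["trees"] += 1
        elif name.endswith(filter_suffix):
            counts["filters"] += 1
        else:
            counts["other"] += 1
    return counts


def unpaired_components(index_dir, tree_suffix=TREE_SUFFIX, filter_suffix=FILTER_SUFFIX):
    """Return component prefixes that have a tree without a filter, or the reverse."""
    trees, filters = set(), set()
    for name in os.listdir(index_dir):
        if not os.path.isfile(os.path.join(index_dir, name)):
            continue
        if name.endswith(tree_suffix):
            trees.add(name[: -len(tree_suffix)])
        elif name.endswith(filter_suffix):
            filters.add(name[: -len(filter_suffix)])
    # Symmetric difference: prefixes present in exactly one of the two sets.
    return sorted(trees ^ filters)
```

      Running unpaired_components() on the affected node against the directory from the exception (/var/cb-cache/@analytics/v_iodevice_3/storage/partition_67/Default/Default/remote_y26vh_volCollection_3_bbrad/0/remote_y26vh_volCollection_3_bbrad) would show which components, if any, are missing a counterpart under the assumed naming.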
      

      Also seeing:

      2024-05-31T10:16:28.243+00:00 ERRO CBAS.impls.LSMHarness [Executor-5222:7b5426716aa8ced8788f457785a64498] MERGE operation failed on {"class" : "LSMColumnBTree", "dir" : "/var/cb-cache/@analytics/v_iodevice_7/storage/partition_87/Default/Default/remote_y26vh_volCollection_4_blkkr/0/remote_y26vh_volCollection_4_blkkr", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"[969,969]", "index":{"class":"BTree","file":"storage/partition_87/Default/Default/remote_y26vh_volCollection_4_blkkr/0/remote_y26vh_volCollection_4_blkkr_virtual_0"}}, {"class":"LSMBTreeMemoryComponent", "state":"READABLE_WRITABLE", "writers":0, "readers":0, "pendingFlushes":0, "id":"[970,970]", "index":{"class":"BTree","file":"storage/partition_87/Default/Default/remote_y26vh_volCollection_4_blkkr/0/remote_y26vh_volCollection_4_blkkr_virtual_1"}}], "disk" : 7, "num-scheduled-flushes":0, "current-memory-component":1}
      org.apache.hyracks.api.exceptions.HyracksDataException: java.net.SocketException: Connection reset
              at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:49) ~[hyracks-api.jar:1.0.0-2117]
              at org.apache.asterix.cloud.clients.aws.s3.S3CloudClient.read(S3CloudClient.java:138) ~[asterix-cloud.jar:1.0.0-2117]
              at org.apache.asterix.cloud.AbstractCloudIOManager.cloudRead(AbstractCloudIOManager.java:175) ~[asterix-cloud.jar:1.0.0-2117]
              at org.apache.hyracks.cloud.buffercache.context.DefaultCloudReadContext.readAndPersistIfEmpty(DefaultCloudReadContext.java:110) ~[hyracks-cloud.jar:1.0.0-2117]
              at org.apache.hyracks.cloud.buffercache.context.DefaultCloudReadContext.readAndPersistPage(DefaultCloudReadContext.java:82) ~[hyracks-cloud.jar:1.0.0-2117]
              at org.apache.hyracks.cloud.buffercache.context.DefaultCloudReadContext.processHeader(DefaultCloudReadContext.java:77) ~[hyracks-cloud.jar:1.0.0-2117]
              at org.apache.hyracks.storage.common.file.CompressedBufferedFileHandle.read(CompressedBufferedFileHandle.java:62) ~[hyracks-storage-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.common.buffercache.BufferCache.read(BufferCache.java:571) ~[hyracks-storage-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.common.buffercache.BufferCache.tryRead(BufferCache.java:544) ~[hyracks-storage-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.common.buffercache.BufferCache.pin(BufferCache.java:214) ~[hyracks-storage-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.common.buffercache.BufferCache.pin(BufferCache.java:176) ~[hyracks-storage-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.column.impls.btree.ColumnBTreeRangeSearchCursor.pin(ColumnBTreeRangeSearchCursor.java:293) ~[hyracks-storage-am-lsm-btree-column.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.column.impls.lsm.tuples.ColumnMultiBufferProvider.readNext(ColumnMultiBufferProvider.java:119) ~[hyracks-storage-am-lsm-btree-column.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.column.impls.lsm.tuples.ColumnMultiBufferProvider.reset(ColumnMultiBufferProvider.java:68) ~[hyracks-storage-am-lsm-btree-column.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.column.impls.lsm.tuples.AbstractColumnTupleReference.reset(AbstractColumnTupleReference.java:146) ~[hyracks-storage-am-lsm-btree-column.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.column.impls.btree.ColumnBTreeRangeSearchCursor.setCursorPosition(ColumnBTreeRangeSearchCursor.java:159) ~[hyracks-storage-am-lsm-btree-column.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.column.impls.btree.ColumnBTreeRangeSearchCursor.fetchNextLeafPage(ColumnBTreeRangeSearchCursor.java:97) ~[hyracks-storage-am-lsm-btree-column.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.column.impls.btree.ColumnBTreeRangeSearchCursor.doHasNext(ColumnBTreeRangeSearchCursor.java:109) ~[hyracks-storage-am-lsm-btree-column.jar:1.0.0-2117]
              at org.apache.hyracks.storage.common.EnforcedIndexCursor.hasNext(EnforcedIndexCursor.java:69) ~[hyracks-storage-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.common.impls.LSMIndexSearchCursor.pushIntoQueueFromCursorAndReplaceThisElement(LSMIndexSearchCursor.java:194) ~[hyracks-storage-am-lsm-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTreeRangeSearchCursor.pushOutputElementIntoQueueIfNeeded(LSMBTreeRangeSearchCursor.java:215) ~[hyracks-storage-am-lsm-btree.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTreeRangeSearchCursor.checkPriorityQueue(LSMBTreeRangeSearchCursor.java:189) ~[hyracks-storage-am-lsm-btree.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.common.impls.LSMIndexSearchCursor.doHasNext(LSMIndexSearchCursor.java:144) ~[hyracks-storage-am-lsm-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.common.EnforcedIndexCursor.hasNext(EnforcedIndexCursor.java:69) ~[hyracks-storage-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.doMerge(LSMBTree.java:330) ~[hyracks-storage-am-lsm-btree.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMIndex.merge(AbstractLSMIndex.java:917) ~[hyracks-storage-am-lsm-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.doIo(LSMHarness.java:566) [hyracks-storage-am-lsm-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.merge(LSMHarness.java:608) [hyracks-storage-am-lsm-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.merge(LSMTreeIndexAccessor.java:128) [hyracks-storage-am-lsm-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:52) [hyracks-storage-am-lsm-common.jar:1.0.0-2117]
              at org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:33) [hyracks-storage-am-lsm-common.jar:1.0.0-2117]
              at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
              at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
              at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
              at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
      

      QE Test

      sudo guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/couchbase_columnar_volume.ini -p bucket_storage=magma,bucket_eviction_policy=fullEviction,rerun=False -t aGoodDoctor.goldfish.GoldfishVolume.Columnar.test_rebalance,num_items=1000000000,num_buckets=1,bucket_names=GleamBook,bucket_type=membase,iterations=2,batch_size=1000,sdk_timeout=60,log_level=debug,infra_log_level=debug,rerun=False,skip_cleanup=True,key_size=18,randomize_doc_size=False,randomize_value=True,maxttl=10,pc=20,gsi_nodes=3,cbas_nodes=3,fts_nodes=3,kv_nodes=3,n1ql_nodes=2,kv_disk=1000,n1ql_disk=50,gsi_disk=500,fts_disk=1000,cbas_disk=1000,kv_compute=m5.4xlarge,gsi_compute=m5.4xlarge,n1ql_compute=m5.4xlarge,fts_compute=m5.4xlarge,cbas_compute=m5.4xlarge,mutation_perc=20,key_type=CircularKey,capella_run=true,services=data,rebl_services=,max_rebl_nodes=27,provider=AWS,region=us-east-1,type=GP3,size=1000,ops_rate=100000,skip_teardown_cleanup=true,wait_timeout=28800,index_timeout=28800,runtype=columnar1,skip_init=false,rebl_ops_rate=10000,collections=5,gtm=true,valType=Hotel,expiry=true,v_scaling=true,h_scaling=true,horizontal_scale=1,clients_per_db=20,track_failures=False,onPremMongo=False,num_clusters=1,remoteCouchbase=True,steady_state_workload_sleep=0,loop=2 -m rest'
      

            People

              ritesh.agarwal Ritesh Agarwal