Couchbase Server / MB-45665

[System Test] : Seeing "fatal error occured: Storage corrupted and unrecoverable" errors after recovery


Details

    Description

      Build : 7.0.0-4916
      Test : -test tests/2i/cheshirecat/test_idx_clusterops_cheshire_cat.yml -scope tests/2i/cheshirecat/scope_idx_cheshire_cat_dgm.yml
      Scale : 2
      Iteration : 2nd

      To get unblocked from MB-45631, the indexer process was killed on both index nodes (172.23.107.5 and 172.23.97.216). The rebalance failed as expected, and the test continued.

      On 172.23.107.5, the indexer started at 2021-04-13T09:06:50:

      2021-04-13T09:06:50.914-07:00 [Info] Indexer started with command line: [/opt/couchbase/bin/indexer -adminPort=9100 -scanPort=9101 -httpPort=9102 -streamInitPort=9103 -streamCatchupPort=9104 -streamMaintPort=9105 --httpsPort=19102 --certFile=/opt/couchbase/var/lib/couchbase/config/memcached-cert.pem --keyFile=/opt/couchbase/var/lib/couchbase/config/memcached-key.pem -vbuckets=1024 -cluster=127.0.0.1:8091 -storageDir=/data/@2i -diagDir=/opt/couchbase/var/lib/couchbase/crash -logDir=/opt/couchbase/var/lib/couchbase/logs -nodeUUID=a623a555cd47065c9ec0a97470824b44 -ipv6=false -isEnterprise=true]
      

      While recovery was ongoing, the following errors were seen in the logs:

      2021-04-13T09:07:06.260-07:00 [Error] plasmaSlice:NewplasmaSlice Id 0xf9fb70 IndexInstId 6836294379681358584 fatal error occured: Unable to initialize /data/@2i/bucket3_idx2_T2JTr4t_6836294379681358584_3.index/docIndex, err = Shard /data/@2i/shards/shard3(3) : fatal: bucket3/idx3_PFDO2Dk/Backstore#7556686072913129312:0 : fatal: found missing page. lastPg maxItem (item key:<ud>(doc_39456)</ud> val:<ud>((nil))</ud> sn:1 insert: false). pg minItem (item key:<ud>(doc_3894)</ud> val:<ud>((nil))</ud> sn:1 insert: false)
      2021-04-13T09:07:06.260-07:00 [Error] plasmaSlice:NewplasmaSlice Id 0 IndexInstId 6836294379681358584 PartitionId 3 fatal error occured: Storage corrupted and unrecoverable
      2021-04-13T09:07:06.260-07:00 [Error] Indexer:: initPartnInstance storage corruption for indexInst
              InstId: 6836294379681358584
              Defn: DefnId: 12233027107853559480 Name: idx2_T2JTr4t Using: plasma Bucket: bucket3 Scope/Id: scope_1/9 Collection/Id: coll_12/1b IsPrimary: false NumReplica: 3 InstVersion: 0
                      SecExprs: <ud>([`free_breakfast` `type` `free_parking` array_count(`public_likes`) `price` `country`])</ud>
                      Desc: [false false false false false false]
                      PartitionScheme: KEY
                      HashScheme: CRC32 PartitionKeys: [(meta().`id`)] WhereExpr: <ud>()</ud> RetainDeletedXATTR: false
              State: INDEX_STATE_ACTIVE
              RState: RebalActive
              Stream: MAINT_STREAM
              Version: 2
              ReplicaId: 0
              PartitionContainer: &{map[2:{2 2 [:9105]} 3:{3 0 [:9105]}] 1024 3 341 KEY 0} partnDefn {3 0 [:9105]}
      2021-04-13T09:07:06.260-07:00 [Info] Indexer::initPartnInstance Initialized Partition:
               Index: 6836294379681358584 Partition: PartitionId: 2 Endpoints: [:9105]
      ...
      ...
      ...
      2021-04-13T09:07:11.779-07:00 [Error] plasmaSlice:NewplasmaSlice Id 0xf9fb70 IndexInstId 10484286148004174088 fatal error occured: Unable to initialize /data/@2i/bucket3_idx1_QVkmK_10484286148004174088_0.index/docIndex, err = Shard /data/@2i/shards/shard3(3) : fatal: bucket3/idx3_PFDO2Dk/Backstore#7556686072913129312:0 : fatal: found missing page. lastPg maxItem (item key:<ud>(doc_37564)</ud> val:<ud>((nil))</ud> sn:1 insert: false). pg minItem (item key:<ud>(doc_37206)</ud> val:<ud>((nil))</ud> sn:1 insert: false)
      2021-04-13T09:07:11.779-07:00 [Error] plasmaSlice:NewplasmaSlice Id 0 IndexInstId 10484286148004174088 PartitionId 0 fatal error occured: Storage corrupted and unrecoverable
      2021-04-13T09:07:11.779-07:00 [Error] Indexer:: initPartnInstance storage corruption for indexInst
              InstId: 10484286148004174088
              Defn: DefnId: 16849351108918044691 Name: idx1_QVkmK Using: plasma Bucket: bucket3 Scope/Id: scope_1/9 Collection/Id: coll_7/16 IsPrimary: false NumReplica: 3 InstVersion: 0
                      SecExprs: <ud>([`country` (distinct (array ((`r`.`ratings`).`Check in / front desk`) for `r` in `reviews` end)) array_count(`public_likes`) array_count(`reviews`) `type` `phone` `price` `email` `address` `name` `url`])</ud>
                      Desc: [false false false true false false false false false false false]
                      PartitionScheme: SINGLE
                      HashScheme: CRC32 PartitionKeys: [] WhereExpr: <ud>()</ud> RetainDeletedXATTR: false
              State: INDEX_STATE_ACTIVE
              RState: RebalActive
              Stream: MAINT_STREAM
              Version: 0
              ReplicaId: 1
              PartitionContainer: &{map[0:{0 0 [:9105]}] 1024 1 1024 SINGLE 0} partnDefn {0 0 [:9105]}
      2021-04-13T09:07:11.781-07:00 [Info] Indexer::initFromPersistedState Starting cleanup for PartitionId: 0 Endpoints: [:9105]
      2021-04-13T09:07:11.781-07:00 [Info] Indexer::forceCleanupIndexPartition 10484286148004174088 0 mark metadata as deleted
      2021-04-13T09:07:11.781-07:00 [Info] ClustMgr:handleCleanupPartition
              Message: MsgCleanupPartition
              Type: 67
              Index defn Id: 16849351108918044691
              Index inst Id: 10484286148004174088
              Index partition Id: 0
              Index replica Id: 1
              Update Status Only: true
      2021-04-13T09:07:11.781-07:00 [Info] LifecycleMgr.DeleteOrPruneIndexInstance() : index defnId 16849351108918044691 instance id 10484286148004174088 real instance id 0 partitions [0]
      2021-04-13T09:07:11.781-07:00 [Info] LifecycleMgr.PruneIndexPartition() : index defnId 16849351108918044691 instance 10484286148004174088 partitions [0]
      2021-04-13T09:07:11.781-07:00 [Info] LifecycleMgr.DeleteIndexInstance() : index defnId 16849351108918044691 instance id 10484286148004174088
      2021-04-13T09:07:11.781-07:00 [Info] LifecycleMgr.DeleteIndexInstance() : there is only a single instance.  Delete index 16849351108918044691
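For triage, it can help to list every index instance and partition that the indexer flagged as corrupted during recovery. The following is a minimal sketch that assumes the log line layout shown in the excerpt above; the function name `find_corrupt_partitions` and the `SAMPLE` list are hypothetical and are not part of any Couchbase tooling. Note that the pattern deliberately matches the log's own spelling, "occured".

```python
import re

# Matches the "Storage corrupted and unrecoverable" lines as they appear in
# the excerpt above. The exact layout is an assumption based on that excerpt.
CORRUPT_RE = re.compile(
    r"^(?P<ts>\S+) \[Error\] plasmaSlice:NewplasmaSlice .*?"
    r"IndexInstId (?P<inst>\d+) PartitionId (?P<partn>\d+) "
    r"fatal error occured: Storage corrupted and unrecoverable"
)


def find_corrupt_partitions(lines):
    """Return (timestamp, index instance id, partition id) for each match."""
    hits = []
    for line in lines:
        m = CORRUPT_RE.match(line.strip())
        if m:
            hits.append((m.group("ts"), int(m.group("inst")), int(m.group("partn"))))
    return hits


# Hypothetical sample built from lines quoted in this report.
SAMPLE = [
    "2021-04-13T09:07:06.260-07:00 [Error] plasmaSlice:NewplasmaSlice Id 0 "
    "IndexInstId 6836294379681358584 PartitionId 3 "
    "fatal error occured: Storage corrupted and unrecoverable",
    "2021-04-13T09:07:11.779-07:00 [Error] plasmaSlice:NewplasmaSlice Id 0 "
    "IndexInstId 10484286148004174088 PartitionId 0 "
    "fatal error occured: Storage corrupted and unrecoverable",
    "2021-04-13T09:07:11.781-07:00 [Info] ClustMgr:handleCleanupPartition",
]

if __name__ == "__main__":
    for ts, inst, partn in find_corrupt_partitions(SAMPLE):
        print(ts, inst, partn)
```

To scan a real log, pass an open file handle instead of `SAMPLE`, for example `find_corrupt_partitions(open(path))`, where `path` points at the indexer log under the `-logDir` directory from the command line above.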
      

          People

            jliang John Liang
            mihir.kamdar Mihir Kamdar (Inactive)

              Gerrit Reviews

                There is 1 open Gerrit change
