Couchbase Server: MB-51363

[Volume Test] Indexer crashed during rebalance - "panic: Timestamp mismatch between snapshot"

Details

    Description

      Build : 7.1.0-2451
      Test : -test tests/2i/cheshirecat/test_idx_cc_vol_10K.yml -scope tests/2i/cheshirecat/scope_idx_cc_vol_10K.yml
      Scale : 5
      Iteration : 1st

      While KV node 172.23.96.14 was being removed from the cluster, the indexer on 172.23.120.74 crashed at 2022-03-09T15:26:26 with the following error, and the rebalance failed as a result.

      2022-03-09T15:26:25.253-08:00 [Warn] TransferToken6b:b:3d:43:94:5a:78:6e Detected Invalid State Change Notification. Token Id Rebalancer::checkValidNotifyStateDest: Local State TransferTokenMerge Metakv State TransferTokenInProgress
      panic: Timestamp mismatch between snapshot
       target bucket: bucket5, scopeId: , collectionId: , vbuckets: 1024 Crc64: 15109669837870858611 snapType FORCE_COMMIT OpenOSOSnap false -
      ...
      ...
      ...
      goroutine 126044172 [running]:
      github.com/couchbase/indexing/secondary/common.CrashOnError(...)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/common/util.go:464
      github.com/couchbase/indexing/secondary/indexer.(*Rebalancer).destTokenToMergeOrReadyLOCKED.func1(0xc02f5c4b00, 0xc035889a14, 0x23, 0xc0001f2420, 0x20, 0xc0001f2880, 0x20, 0xc0001f29a0, 0x20, 0xc0001f2b00, ...)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/rebalancer.go:2119 +0x657
      created by github.com/couchbase/indexing/secondary/indexer.(*Rebalancer).destTokenToMergeOrReadyLOCKED
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/rebalancer.go:2089 +0xf8
      Initializing write barrier = 8000
      2022-03-09T15:26:26.318-08:00 [Info] Indexer started with command line: [/opt/couchbase/bin/indexer -adminPort=9100 -scanPort=9101 -httpPort=9102 -streamInitPort=9103 -streamCatchupPort=9104 -streamMaintPort=9105 --httpsPort=19102 --certFile=/opt/couchbase/var/lib/couchbase/config/certs/chain.pem --keyFile=/opt/couchbase/var/lib/couchbase/config/certs/pkey.pem --caFile=/opt/couchbase/var/lib/couchbase/config/certs/ca.pem -ipv4=required -ipv6=optional -vbuckets=1024 -cluster=127.0.0.1:8091 -storageDir=/data/@2i -diagDir=/opt/couchbase/var/lib/couchbase/crash -logDir=/opt/couchbase/var/lib/couchbase/logs -nodeUUID=f8f36f8ee5a463bdd312c2359f5b833d -isEnterprise=true]
      

      The symptom in this issue is similar to MB-50006, which was fixed recently.

      Nodes with index service : 172.23.106.138, 172.23.120.58, 172.23.120.74, 172.23.120.75, 172.23.120.77, 172.23.120.81, 172.23.123.31, 172.23.123.32, 172.23.123.33, 172.23.96.243, 172.23.96.254, 172.23.97.105, 172.23.97.110, 172.23.97.112, 172.23.97.148
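
To check whether other index nodes hit the same panic, a log excerpt like the one above can be scanned mechanically. The following is a minimal triage sketch, not part of the indexer or of any Couchbase tooling: the names `find_panics` and `PanicReport` are hypothetical, and it assumes only the log shape quoted in this report (timestamped `[Level]` lines, a `panic:` line, and Go stack frames beginning with `github.com/`).

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Timestamp prefix as seen in the log lines quoted in this report,
# e.g. "2022-03-09T15:26:25.253-08:00 [Warn] ...".
TS_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{2}:\d{2}) \[\w+\]"
)
# Trailing argument list of a Go stack frame, e.g. "(...)" or "(0xc0..., ...)".
ARGS_RE = re.compile(r"\([^()]*\)$")


class PanicReport(NamedTuple):
    """Hypothetical container for one panic found in a log."""
    message: str                   # text after "panic:"
    last_timestamp: Optional[str]  # last timestamped line seen before the panic
    frames: List[str]              # function names from the stack trace


def find_panics(lines: Iterable[str]) -> List[PanicReport]:
    """Collect each panic with the preceding timestamp and its stack frames."""
    reports: List[PanicReport] = []
    last_ts: Optional[str] = None
    current: Optional[PanicReport] = None
    for raw in lines:
        line = raw.strip()
        match = TS_RE.match(line)
        if match:
            last_ts = match.group(1)
            current = None  # a regular log line ends the panic dump
        elif line.startswith("panic:"):
            current = PanicReport(line[len("panic:"):].strip(), last_ts, [])
            reports.append(current)
        elif current is not None and line.startswith("github.com/"):
            # Drop the argument list, keep the fully qualified function name.
            current.frames.append(ARGS_RE.sub("", line))
    return reports


if __name__ == "__main__":
    # Abbreviated sample built from the lines quoted in this report.
    sample = [
        "2022-03-09T15:26:25.253-08:00 [Warn] Detected Invalid State Change Notification.",
        "panic: Timestamp mismatch between snapshot",
        "goroutine 126044172 [running]:",
        "github.com/couchbase/indexing/secondary/common.CrashOnError(...)",
        "github.com/couchbase/indexing/secondary/indexer.(*Rebalancer).destTokenToMergeOrReadyLOCKED.func1(0xc02f5c4b00, ...)",
        "2022-03-09T15:26:26.318-08:00 [Info] Indexer started with command line: [...]",
    ]
    for report in find_panics(sample):
        print(report.last_timestamp, report.message)
        for frame in report.frames:
            print("   ", frame)
```

Run against the indexer log from each node listed above, a scan like this would show which nodes panicked and whether the top frames match `destTokenToMergeOrReadyLOCKED`, which helps when comparing against MB-50006.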

          People

            varun.velamuri Varun Velamuri
            mihir.kamdar Mihir Kamdar (Inactive)