Couchbase Server / MB-50688

[System Test] Indexer crash with error "unexpected fault address 0x7f602942f2ad"

Details

    Description

      Build : 7.1.0-2187
      Test : -test tests/2i/neo/test_neo_idx_clusterops_recovery.yml -scope tests/2i/neo/scope_neo_plasma_idx_dgm.yml
      Scale : 2
      Iteration : 2nd

      In the GSI system test, one step kills the indexer process on a node while a rebalance is in progress, so that the rebalance fails and the retry can kick in. The indexer process was killed on 172.23.97.216. Immediately afterwards, the indexer process crashed on another indexer node, 172.23.107.4, with the following stack trace:

      2022-01-29T12:24:31.950-08:00 [Info] serviceChangeNotifier: received PoolChangeNotification
      2022-01-29T12:24:32.508-08:00 [Info] serviceChangeNotifier: received PoolChangeNotification
      unexpected fault address 0x7f602942f2ad
      fatal error: fault
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x7f602942f2ad pc=0xdb63c0]
       
      goroutine 40947908 [running]:
      runtime.throw(0x13c5142, 0x5)
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/panic.go:1117 +0x72 fp=0xc020489608 sp=0xc0204895d8 pc=0x43ee32
      runtime.sigpanic()
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/signal_unix.go:741 +0x268 fp=0xc020489640 sp=0xc020489608 pc=0x456288
      github.com/couchbase/plasma.(*item).Sn(...)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/item.go:74
      github.com/couchbase/plasma.(*statsCollector).Adjust(0xc00e2573c0, 0x15a5538, 0x7f5ebedec6dc, 0xc011c3a458)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/mvcc.go:235 +0x140 fp=0xc020489680 sp=0xc020489640 pc=0xdb63c0
      github.com/couchbase/plasma.(*pdMergeIterator).fetchMin(0xc011c3a400)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/iterator.go:651 +0x20a fp=0xc0204896d0 sp=0xc020489680 pc=0xda33ca
      github.com/couchbase/plasma.(*pdMergeIterator).Init(0xc011c3a400)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/iterator.go:605 +0x58 fp=0xc0204896e8 sp=0xc0204896d0 pc=0xda3098
      github.com/couchbase/plasma.(*Iterator).initPgIterator(0xc00d582780, 0x7f5faf714c40, 0x7f5fb3adfee0)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/iterator.go:169 +0x2e2 fp=0xc0204897b0 sp=0xc0204896e8 pc=0xda1a82
      github.com/couchbase/plasma.(*Iterator).Seek(0xc00d582780, 0x7f5fb3adfee0, 0xf, 0x800)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/iterator.go:217 +0x9c fp=0xc020489800 sp=0xc0204897b0 pc=0xda1efc
      github.com/couchbase/plasma.(*MVCCIterator).Seek(0xc00917b440, 0xc00ce43000, 0xf, 0x800)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/mvcc.go:310 +0x105 fp=0xc020489898 sp=0xc020489800 pc=0xdb69e5
      github.com/couchbase/indexing/secondary/indexer.(*plasmaSnapshot).Iterate(0xc06fcdc500, 0x15a5030, 0xc016b0adc8, 0x15abcd0, 0xc00b38a540, 0x15abcd0, 0xc00b38a570, 0x3, 0x1446808, 0xc01faea300, ...)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/plasma_slice.go:2940 +0x6f0 fp=0xc020489978 sp=0xc020489898 pc=0xfaf8b0
      github.com/couchbase/indexing/secondary/indexer.(*plasmaSnapshot).Range(0xc06fcdc500, 0x15a5030, 0xc016b0adc8, 0x15abcd0, 0xc00b38a540, 0x15abcd0, 0xc00b38a570, 0x3, 0xc01faea300, 0x12fc6c0, ...)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/plasma_slice.go:2892 +0xb8 fp=0xc0204899e8 sp=0xc020489978 pc=0xfaf0b8
      github.com/couchbase/indexing/secondary/indexer.scanSingleSlice(0xc01742a580, 0x15abcd0, 0xc00b38a540, 0x15abcd0, 0xc00b38a570, 0x3, 0x13c962f, 0xb, 0xc00f3e63c0, 0x1, ...)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/scan_scatter.go:206 +0x242 fp=0xc020489b18 sp=0xc0204899e8 pc=0x101b842
      github.com/couchbase/indexing/secondary/indexer.scanOne(0xc01742a580, 0x15abcd0, 0xc00b38a540, 0x15abcd0, 0xc00b38a570, 0x3, 0x13c962f, 0xb, 0xc00f3e63c0, 0x1, ...)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/scan_scatter.go:148 +0x125 fp=0xc020489c28 sp=0xc020489b18 pc=0x101b3c5
      github.com/couchbase/indexing/secondary/indexer.scatter(0xc01742a580, 0x15abcd0, 0xc00b38a540, 0x15abcd0, 0xc00b38a570, 0x3, 0x13c962f, 0xb, 0xc00f3e63c0, 0x1, ...)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/scan_scatter.go:55 +0xcf fp=0xc020489cd8 sp=0xc020489c28 pc=0x101a74f
      github.com/couchbase/indexing/secondary/indexer.(*IndexScanSource).Routine(0xc0093dab60, 0x0, 0x0)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/scan_pipeline.go:339 +0xe6b fp=0xc020489f58 sp=0xc020489cd8 pc=0xfff82b
      github.com/couchbase/indexing/secondary/pipeline.(*Pipeline).runIt.func1(0xc010583dc0, 0xc00acae010)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/pipeline/pipeline.go:75 +0x38 fp=0xc020489fd0 sp=0xc020489f58 pc=0xeb8198
      runtime.goexit()
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/asm_amd64.s:1371 +0x1 fp=0xc020489fd8 sp=0xc020489fd0 pc=0x478481
      created by github.com/couchbase/indexing/secondary/pipeline.(*Pipeline).runIt
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/pipeline/pipeline.go:74 +0x66
      

      This test has been run regularly, almost every week, but we have not seen this issue before. The test was last run with 7.1.0-2079. I don't see any changes between 2079 and 2187 that could cause this crash, so I am assuming this is not a regression and that we are only discovering it now.

      Indexer nodes : 172.23.107.2, 172.23.107.3, 172.23.107.4, 172.23.107.5, 172.23.97.216, 172.23.97.217

          People

            mihir.kamdar Mihir Kamdar (Inactive)