Details
Bug
Resolution: Fixed
Critical
7.1.0
Untriaged
1
No
Description
Build : 7.1.0-2187
Test : -test tests/2i/neo/test_neo_idx_clusterops_recovery.yml -scope tests/2i/neo/scope_neo_plasma_idx_dgm.yml
Scale : 2
Iteration : 2nd
In the GSI system test, one step kills the indexer process on a node while a rebalance is in progress, so that the rebalance fails and the retry can kick in. The indexer process was killed on 172.23.97.216. Immediately afterwards, the indexer process on another indexer node, 172.23.107.4, crashed with the following stack trace:
2022-01-29T12:24:31.950-08:00 [Info] serviceChangeNotifier: received PoolChangeNotification
2022-01-29T12:24:32.508-08:00 [Info] serviceChangeNotifier: received PoolChangeNotification
unexpected fault address 0x7f602942f2ad
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x1 addr=0x7f602942f2ad pc=0xdb63c0]

goroutine 40947908 [running]:
runtime.throw(0x13c5142, 0x5)
    /home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/panic.go:1117 +0x72 fp=0xc020489608 sp=0xc0204895d8 pc=0x43ee32
runtime.sigpanic()
    /home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/signal_unix.go:741 +0x268 fp=0xc020489640 sp=0xc020489608 pc=0x456288
github.com/couchbase/plasma.(*item).Sn(...)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/item.go:74
github.com/couchbase/plasma.(*statsCollector).Adjust(0xc00e2573c0, 0x15a5538, 0x7f5ebedec6dc, 0xc011c3a458)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/mvcc.go:235 +0x140 fp=0xc020489680 sp=0xc020489640 pc=0xdb63c0
github.com/couchbase/plasma.(*pdMergeIterator).fetchMin(0xc011c3a400)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/iterator.go:651 +0x20a fp=0xc0204896d0 sp=0xc020489680 pc=0xda33ca
github.com/couchbase/plasma.(*pdMergeIterator).Init(0xc011c3a400)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/iterator.go:605 +0x58 fp=0xc0204896e8 sp=0xc0204896d0 pc=0xda3098
github.com/couchbase/plasma.(*Iterator).initPgIterator(0xc00d582780, 0x7f5faf714c40, 0x7f5fb3adfee0)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/iterator.go:169 +0x2e2 fp=0xc0204897b0 sp=0xc0204896e8 pc=0xda1a82
github.com/couchbase/plasma.(*Iterator).Seek(0xc00d582780, 0x7f5fb3adfee0, 0xf, 0x800)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/iterator.go:217 +0x9c fp=0xc020489800 sp=0xc0204897b0 pc=0xda1efc
github.com/couchbase/plasma.(*MVCCIterator).Seek(0xc00917b440, 0xc00ce43000, 0xf, 0x800)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/mvcc.go:310 +0x105 fp=0xc020489898 sp=0xc020489800 pc=0xdb69e5
github.com/couchbase/indexing/secondary/indexer.(*plasmaSnapshot).Iterate(0xc06fcdc500, 0x15a5030, 0xc016b0adc8, 0x15abcd0, 0xc00b38a540, 0x15abcd0, 0xc00b38a570, 0x3, 0x1446808, 0xc01faea300, ...)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/plasma_slice.go:2940 +0x6f0 fp=0xc020489978 sp=0xc020489898 pc=0xfaf8b0
github.com/couchbase/indexing/secondary/indexer.(*plasmaSnapshot).Range(0xc06fcdc500, 0x15a5030, 0xc016b0adc8, 0x15abcd0, 0xc00b38a540, 0x15abcd0, 0xc00b38a570, 0x3, 0xc01faea300, 0x12fc6c0, ...)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/plasma_slice.go:2892 +0xb8 fp=0xc0204899e8 sp=0xc020489978 pc=0xfaf0b8
github.com/couchbase/indexing/secondary/indexer.scanSingleSlice(0xc01742a580, 0x15abcd0, 0xc00b38a540, 0x15abcd0, 0xc00b38a570, 0x3, 0x13c962f, 0xb, 0xc00f3e63c0, 0x1, ...)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/scan_scatter.go:206 +0x242 fp=0xc020489b18 sp=0xc0204899e8 pc=0x101b842
github.com/couchbase/indexing/secondary/indexer.scanOne(0xc01742a580, 0x15abcd0, 0xc00b38a540, 0x15abcd0, 0xc00b38a570, 0x3, 0x13c962f, 0xb, 0xc00f3e63c0, 0x1, ...)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/scan_scatter.go:148 +0x125 fp=0xc020489c28 sp=0xc020489b18 pc=0x101b3c5
github.com/couchbase/indexing/secondary/indexer.scatter(0xc01742a580, 0x15abcd0, 0xc00b38a540, 0x15abcd0, 0xc00b38a570, 0x3, 0x13c962f, 0xb, 0xc00f3e63c0, 0x1, ...)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/scan_scatter.go:55 +0xcf fp=0xc020489cd8 sp=0xc020489c28 pc=0x101a74f
github.com/couchbase/indexing/secondary/indexer.(*IndexScanSource).Routine(0xc0093dab60, 0x0, 0x0)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/scan_pipeline.go:339 +0xe6b fp=0xc020489f58 sp=0xc020489cd8 pc=0xfff82b
github.com/couchbase/indexing/secondary/pipeline.(*Pipeline).runIt.func1(0xc010583dc0, 0xc00acae010)
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/pipeline/pipeline.go:75 +0x38 fp=0xc020489fd0 sp=0xc020489f58 pc=0xeb8198
runtime.goexit()
    /home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/asm_amd64.s:1371 +0x1 fp=0xc020489fd8 sp=0xc020489fd0 pc=0x478481
created by github.com/couchbase/indexing/secondary/pipeline.(*Pipeline).runIt
    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/pipeline/pipeline.go:74 +0x66
This test has been run regularly, almost every week, and this issue has not been seen before. The test was last run with 7.1.0-2079. I don't see any changes between 2079 and 2187 that could cause this crash, so I am assuming this is not a regression but an existing issue that is only being discovered now.
Indexer nodes : 172.23.107.2, 172.23.107.3, 172.23.107.4, 172.23.107.5, 172.23.97.216, 172.23.97.217