Details
Bug
Resolution: Unresolved
Critical
5.5.3
Untriaged
1
Unknown
Description
The related CBSE-8368 involves a problematic index. During recovery of this index, there is a crash-restart loop with the following stack trace:
2020-03-03T07:33:42.815+00:00 [Info] ServiceMgr::GetCurrentTopology [0 0 0 0 0 0 0 1]
unexpected fault address 0x0
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x6c1e35]

goroutine 731 [running]:
runtime.throw(0xf8d3c4, 0x5)
    /home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/panic.go:566 +0x95 fp=0xc4281c3308 sp=0xc4281c32e8
runtime.sigpanic()
    /home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/sigpanic_unix.go:27 +0x288 fp=0xc4281c3360 sp=0xc4281c3308
sync/atomic.LoadUint64(0x414e6c696e6622, 0x414e6c696e65736c)
    /home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/sync/atomic/asm_amd64.s:102 +0x5 fp=0xc4281c3368 sp=0xc4281c3360
github.com/couchbase/nitro/skiplist.(*Node).getNext(0x414e6c696e6573, 0x9, 0x414e6c696e6573, 0x0)
    goproj/src/github.com/couchbase/nitro/skiplist/node_amd64.go:108 +0x4d fp=0xc4281c3388 sp=0xc4281c3368
github.com/couchbase/nitro/skiplist.(*Skiplist).GetRangeSplitItems(0xc42011e9c0, 0x1, 0x0, 0x0, 0x1)
    goproj/src/github.com/couchbase/nitro/skiplist/skiplist.go:419 +0x1ad fp=0xc4281c3428 sp=0xc4281c3388
github.com/couchbase/plasma.(*Plasma).GetRangePartitions(0xc424de5400, 0x1, 0x0, 0x0, 0x0)
    goproj/src/github.com/couchbase/plasma/page_visitor.go:89 +0x163 fp=0xc4281c34d8 sp=0xc4281c3428
github.com/couchbase/plasma.(*Plasma).PageVisitor(0xc424de5400, 0xc4283c2c40, 0x1, 0xc42483f740, 0xc42483c5c0)
    goproj/src/github.com/couchbase/plasma/page_visitor.go:33 +0x79 fp=0xc4281c3598 sp=0xc4281c34d8
github.com/couchbase/plasma.(*Plasma).doRecovery(0xc424de5400, 0xc42483f420, 0x21)
    goproj/src/github.com/couchbase/plasma/plasma.go:982 +0x47d fp=0xc4281c3670 sp=0xc4281c3598
github.com/couchbase/plasma.New(0x1e, 0x12c, 0x5, 0x4, 0x106c5d8, 0x106c5a0, 0x106c5e0, 0xc424840620, 0xc424840630, 0x106c5a0, ...)
    goproj/src/github.com/couchbase/plasma/plasma.go:598 +0x11b0 fp=0xc4281c3d80 sp=0xc4281c3670
github.com/couchbase/indexing/secondary/indexer.(*plasmaSlice).initStores.func2(0xc4248405f0, 0xc4236c6d80, 0xc424840610, 0xc423d32a80)
    goproj/src/github.com/couchbase/indexing/secondary/indexer/plasma_slice.go:310 +0x91 fp=0xc4281c3f80 sp=0xc4281c3d80
runtime.goexit()
    /home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4281c3f88 sp=0xc4281c3f80
created by github.com/couchbase/indexing/secondary/indexer.(*plasmaSlice).initStores
    goproj/src/github.com/couchbase/indexing/secondary/indexer/plasma_slice.go:315 +0x28f1
This continued for about one month, after which a recovery got stuck while recovering the same index. The goroutine dump shows that the hang occurs in a similar area to the crash:
1 @ 0xcda020 0xcdc05d 0xa92983 0xa92279 0xa9c6dd 0xa9a6b0 0x63ec61 0x465931
# 0xcda020 github.com/couchbase/nitro/skiplist.(*Node).getNext+0x0 goproj/src/github.com/couchbase/nitro/skiplist/node_amd64.go:104
# 0xcdc05c github.com/couchbase/nitro/skiplist.(*Skiplist).GetRangeSplitItems+0x1ac goproj/src/github.com/couchbase/nitro/skiplist/skiplist.go:419
# 0xa92982 github.com/couchbase/plasma.(*Plasma).GetRangePartitions+0x162 goproj/src/github.com/couchbase/plasma/page_visitor.go:89
# 0xa92278 github.com/couchbase/plasma.(*Plasma).PageVisitor+0x78 goproj/src/github.com/couchbase/plasma/page_visitor.go:33
# 0xa9c6dc github.com/couchbase/plasma.(*Plasma).doRecovery+0x47c goproj/src/github.com/couchbase/plasma/plasma.go:982
# 0xa9a6af github.com/couchbase/plasma.New+0x11af goproj/src/github.com/couchbase/plasma/plasma.go:598
# 0x63ec60 github.com/couchbase/indexing/secondary/indexer.(*plasmaSlice).initStores.func2+0x90 goproj/src/github.com/couchbase/indexing/secondary/indexer/plasma_slice.go:310
Logs show that the issue is happening for the backstore of the problematic index.
Also, this index appears to have very large keys (~145K).
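One detail in the crash trace may be worth a closer look: the receiver passed to (*Node).getNext is 0x414e6c696e6573, which does not resemble the other heap pointers in the trace (0xc42...). The sketch below is only a diagnostic aid, not part of nitro or plasma, and all helper names in it are hypothetical. It renders that value as bytes in both byte orders and checks whether it is a canonical x86-64 address, assuming 48-bit virtual addressing. Read most-significant byte first, the value comes out as ".ANlines", which looks like text rather than an address. That might point to the node pointer having been overwritten with text-like data, but this is a hypothesis, not a confirmed root cause.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// printable replaces every byte outside the printable ASCII range with '.'.
func printable(b []byte) string {
	out := make([]byte, len(b))
	for i, c := range b {
		if c >= 0x20 && c <= 0x7e {
			out[i] = c
		} else {
			out[i] = '.'
		}
	}
	return string(out)
}

// asciiBE renders v most-significant byte first (the order it is written in hex).
func asciiBE(v uint64) string {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], v)
	return printable(buf[:])
}

// asciiLE renders v in little-endian order, i.e. as the bytes sit in x86-64 memory.
func asciiLE(v uint64) string {
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], v)
	return printable(buf[:])
}

// isCanonical48 reports whether v is a canonical address under 48-bit
// virtual addressing (an assumption): bits 63..47 must be all 0 or all 1.
func isCanonical48(v uint64) bool {
	top := v >> 47
	return top == 0 || top == 0x1ffff
}

func main() {
	// Receiver of (*Node).getNext and the address passed to LoadUint64,
	// both copied from the crash trace.
	const nodePtr = uint64(0x414e6c696e6573)
	const loadAddr = uint64(0x414e6c696e6622)
	// Receiver of GetRangeSplitItems, shown for comparison as a plausible heap pointer.
	const skiplistPtr = uint64(0xc42011e9c0)

	fmt.Printf("node pointer %#x: BE=%q LE=%q canonical=%v\n",
		nodePtr, asciiBE(nodePtr), asciiLE(nodePtr), isCanonical48(nodePtr))
	fmt.Printf("skiplist pointer %#x: canonical=%v\n",
		skiplistPtr, isCanonical48(skiplistPtr))
	fmt.Printf("LoadUint64 address - node pointer = %#x\n", loadAddr-nodePtr)
}
```

The address passed to sync/atomic.LoadUint64 is the same value plus 0xaf, so it appears to be derived from the bad node pointer rather than being independently corrupt. A non-canonical address would also be consistent with the signal being reported with code=0x80 and addr=0x0 instead of the faulting address.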