Details
-
Bug
-
Resolution: Duplicate
-
Critical
-
7.2.1
-
7.2.1-5878 (upgraded from 7.1.5-3877)
-
Untriaged
-
0
-
Unknown
Description
This is a 7-node cluster (3 KV + 2 Index + 2 Query nodes). The cluster is running with a very low index resident ratio (RR) of 0 to 1%. Unfortunately, the nodes were not responsive, and I could collect logs for only one of the index nodes.
https://cb-engineering.s3.amazonaws.com/SysTest24Jul_1/collectinfo-2023-07-26T112447-ns_1%40svc-d-node-008.x6xy5nt4xj5p-vn.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTest24Jul_1/collectinfo-2023-07-26T112447-ns_1%40svc-d-node-009.x6xy5nt4xj5p-vn.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTest24Jul_1/collectinfo-2023-07-26T112447-ns_1%40svc-d-node-010.x6xy5nt4xj5p-vn.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTest24Jul_1/collectinfo-2023-07-26T112447-ns_1%40svc-i-node-013.x6xy5nt4xj5p-vn.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTest24Jul_1/collectinfo-2023-07-26T112447-ns_1%40svc-q-node-011.x6xy5nt4xj5p-vn.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTest24Jul_1/collectinfo-2023-07-26T112447-ns_1%40svc-q-node-014.x6xy5nt4xj5p-vn.sandbox.nonprod-project-avengers.com.zip
Supportal snapshot ->
https://supportal.couchbase.com/customer/systest_24jul_1/cluster/015c70994c958293105e7e7c7b45d071
2023-07-26T11:09:47.862+00:00 [Info] default6/idx3_CvN3/Mainstore#10548483258518200711:1 Plasma: SMR reclaim pending is higher than expected: pending = 19 KB (expected = 12 KB), wCtxCnt = 11, objCnt 3, changed reclaimList flush threshold from 5 to 0, changed reclaimSize flush threshold from 1 KB to 0 KB.
2023-07-26T11:09:47.863+00:00 [Info] default6/idx3_CvN3/Mainstore#10903970996576475501:4 Plasma: SMR reclaim pending is higher than expected: pending = 70 KB (expected = 12 KB), wCtxCnt = 6, objCnt 7, changed reclaimList flush threshold from 0 to 0, changed reclaimSize flush threshold from 1 KB to 1 KB.
2023-07-26T11:09:47.926+00:00 [Warn] AutofailoverServiceManager::HealthCheck: Slow heartbeat 2.106421341s. priorTime: 2023-07-26 11:09:45.820183389 +0000 UTC m=+136955.683533216, callTime: 2023-07-26 11:09:47.92660473 +0000 UTC m=+136957.789954557, healthInfo: {DiskFailures:0}
2023-07-26T11:09:48.068+00:00 [Info] default8/idx8_XW91S5K4YY_idxprefix/Mainstore#3684940844532413095:1 Plasma: SMR reclaim pending is higher than expected: pending = 43 KB (expected = 12 KB), wCtxCnt = 10, objCnt 19, changed reclaimList flush threshold from 1 to 0, changed reclaimSize flush threshold from 0 KB to 0 KB.
2023-07-26T11:09:48.083+00:00 [Error] PeerPipe.doRecieve() : ecounter error when received mesasage from Peer 10.0.0.204:41666. Error = read tcp4 10.0.0.204:9100->10.0.0.204:41666: i/o timeout. Kill Pipe.
2023-07-26T11:09:48.083+00:00 [Error] PeerListener.handleConnection error in authfn Server Error : SyncProxy.listen(): channel closed. Terminate for conn 10.0.0.204:9100:10.0.0.204:41666
2023-07-26T11:09:48.189+00:00 [Info] default6/idx1_T30z/Mainstore#14406419825967849331:4 Plasma: SMR reclaim pending is higher than expected: pending = 76 KB (expected = 12 KB), wCtxCnt = 8, objCnt 13, changed reclaimList flush threshold from 2 to 0, changed reclaimSize flush threshold from 1 KB to 1 KB.
2023-07-26T11:09:48.258+00:00 [Info] default8/idx12_rA0XDr/Mainstore#6150617141440904616:3 Plasma: SMR reclaim pending is higher than expected: pending = 70 KB (expected = 12 KB), wCtxCnt = 12, objCnt 11, changed reclaimList flush threshold from 0 to 0, changed reclaimSize flush threshold from 0 KB to 0 KB.
2023-07-26T11:09:48.599+00:00 [Info] default2/idx9_n8CtS/Mainstore#8671557997859106209:2 Plasma: Warning: not enough memory to hold records in memory. MemStats: {"memory_size":902995,"memory_size_index":770171,"buf_memused":23714463,"mvcc_purge_ratio":1.00000,"resident_ratio":0.00000,"alloc_size":644425855,"free_size":642752689,"items_count":1985086,"recs_in_mem":0,"reclaimed":642752689,"reclaim_pending":0}
2023-07-26T11:09:48.640+00:00 [Info] AutofailoverServiceManager::IsSafe: Called with nodeUUIDs [4a5c0a660929b26912d512e460777943]
2023-07-26T11:09:48.695+00:00 [Info] default8/idx3_E6CCEDY1DJ_idxprefix/Mainstore#10757568671253448994:0 Plasma: SMR reclaim pending is higher than expected: pending = 27 KB (expected = 12 KB), wCtxCnt = 11, objCnt 5, changed reclaimList flush threshold from 0 to 0, changed reclaimSize flush threshold from 0 KB to 0 KB.
fatal error: runtime: out of memory

runtime stack:
runtime.throw({0x135df5b?, 0x21d8260?})
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/panic.go:1047 +0x5d fp=0x7fca1a1f2d20 sp=0x7fca1a1f2cf0 pc=0x43dd1d
runtime.sysMapOS(0xc3bc000000, 0x400000?)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/mem_linux.go:187 +0x11b fp=0x7fca1a1f2d68 sp=0x7fca1a1f2d20 pc=0x41ef7b
runtime.sysMap(0x21c0a40?, 0x433a7a?, 0x21d0bd8?)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/mem.go:142 +0x35 fp=0x7fca1a1f2d98 sp=0x7fca1a1f2d68 pc=0x41e955
runtime.(*mheap).grow(0x21c0a40, 0x2000?)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/mheap.go:1522 +0x252 fp=0x7fca1a1f2e10 sp=0x7fca1a1f2d98 pc=0x42f1b2
runtime.(*mheap).allocSpan(0x21c0a40, 0x1, 0x0, 0x52?)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/mheap.go:1243 +0x1b7 fp=0x7fca1a1f2ea8 sp=0x7fca1a1f2e10 pc=0x42e8f7
runtime.(*mheap).alloc.func1()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/mheap.go:961 +0x65 fp=0x7fca1a1f2ef0 sp=0x7fca1a1f2ea8 pc=0x42e3a5
runtime.systemstack()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/asm_amd64.s:496 +0x49 fp=0x7fca1a1f2ef8 sp=0x7fca1a1f2ef0 pc=0x470f89

goroutine 231963 [running]:
runtime.systemstack_switch()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/asm_amd64.s:463 fp=0xc00a171258 sp=0xc00a171250 pc=0x470f20
runtime.(*mheap).alloc(0x437910?, 0xdd0546?, 0x0?)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/mheap.go:955 +0x65 fp=0xc00a1712a0 sp=0xc00a171258 pc=0x42e2e5
runtime.(*mcentral).grow(0xc0d0d203f0?)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/mcentral.go:246 +0x57 fp=0xc00a1712e0 sp=0xc00a1712a0 pc=0x41e2b7
runtime.(*mcentral).cacheSpan(0x21d1558)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/mcentral.go:166 +0x306 fp=0xc00a171338 sp=0xc00a1712e0 pc=0x41e106
runtime.(*mcache).refill(0x7fd5723316b8, 0xc?)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.20.4/go/src/runtime/mcache.go:182 +0x152 fp=0xc00a171378 sp=0xc00a171338 pc=0x41d852
runtime.(*mcache).nextFree(0x7fd5723316b8, 0xc)
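
For triage, the MemStats payload in the Plasma "not enough memory" warning in the log excerpt can be pulled out and inspected per index. The following is a minimal sketch, assuming the text after `MemStats:` is valid JSON as it appears in that line; `parse_memstats` is a hypothetical helper written for illustration and is not part of any Couchbase tool.

```python
import json
import re

# Log line copied verbatim from the indexer log excerpt in this ticket.
LINE = (
    '2023-07-26T11:09:48.599+00:00 [Info] '
    'default2/idx9_n8CtS/Mainstore#8671557997859106209:2 Plasma: Warning: '
    'not enough memory to hold records in memory. MemStats: '
    '{"memory_size":902995,"memory_size_index":770171,"buf_memused":23714463,'
    '"mvcc_purge_ratio":1.00000,"resident_ratio":0.00000,'
    '"alloc_size":644425855,"free_size":642752689,"items_count":1985086,'
    '"recs_in_mem":0,"reclaimed":642752689,"reclaim_pending":0}'
)


def parse_memstats(line):
    """Return the MemStats dict from a Plasma warning line, or None if absent.

    Hypothetical helper: assumes the payload after 'MemStats:' is one JSON object.
    """
    match = re.search(r"MemStats:\s*(\{.*\})", line)
    if match is None:
        return None
    return json.loads(match.group(1))


stats = parse_memstats(LINE)
if stats is not None:
    # Print only the fields relevant to the low resident ratio noted in the description.
    for key in ("resident_ratio", "recs_in_mem", "items_count"):
        print(key, "=", stats[key])
```

Running the same helper over every line of indexer.log that contains `MemStats:` would give a per-index view of resident ratio in the seconds before the out-of-memory crash.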
Attachments
Issue Links
- relates to MB-57814 [System Test on cloud] Index/Query nodes getting failed over and added back frequently (Closed)