Details
- Type: Bug
- Resolution: Fixed
- Priority: Blocker
- Affects Version/s: 5.5.0
- Triage: Untriaged
- Is this a Regression?: No
Description
Build: 5.5.0-2211
In the system test for secondary indexing, the following steps are performed:
1. 6-node cluster: 2 KV nodes, 1 query node, and 3 indexer nodes.
2. 4 buckets, with 4 indexes on each of them, including 1 partitioned index.
3. Start constant KV ops.
4. Start constant queries, including aggregate pushdown queries.
5. Leave the system idle for a few minutes.
6. Rebalance in another indexer node.
7. Rebalance out another indexer node.
A failure has been observed twice at this step: the indexer on the node added in Step 6 fails. Here is the error shown in the diag logs:
Service 'indexer' exited with status 134. Restarting. Messages:
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/select.go:423 +0x1235 fp=0xc465b97b88 sp=0xc465b97928
runtime.selectgo(0xc465b97c38)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/select.go:238 +0x1c fp=0xc465b97bb0 sp=0xc465b97b88
github.com/couchbase/indexing/secondary/indexer.(*Rebalancer).tokenMergeOrReady.func1(0xc4211df600, 0xc49dfcd994, 0x24, 0xc4bf0a5880, 0x20, 0xc4bf0a58a0, 0x20, 0xc4bf0a58c0, 0x20, 0xc4bf0a58e0, ...)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/rebalancer.go:788 +0x480 fp=0xc465b97d00 sp=0xc465b97bb0
runtime.goexit()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc465b97d08 sp=0xc465b97d00
created by github.com/couchbase/indexing/secondary/indexer.(*Rebalancer).tokenMergeOrReady
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/rebalancer.go:829 +0x279
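For context on the exit code: status 134 follows the usual Unix convention of 128 + signal number, i.e. SIGABRT (6), which the Go runtime raises on a fatal error or unrecovered panic. A quick way to confirm the convention:

```shell
# A process killed by SIGABRT reports exit status 128 + 6 = 134,
# matching the status logged for the indexer process above.
sh -c 'kill -ABRT $$'
echo $?   # prints 134
```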
One observation: even though the UI logs showed the message "Rebalance completed successfully" for Step 6, overall progress remained stuck at 99.4% for more than 2 minutes after that message appeared.
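The stalled progress is consistent with the stack trace above, where a goroutine created by tokenMergeOrReady is blocked in a select. A minimal sketch of that wait pattern (hypothetical names, not the actual indexer code): a goroutine selects between a retry ticker and a cancel channel, and if the awaited condition never becomes true and cancel never fires, it blocks in select indefinitely.

```go
package main

import (
	"fmt"
	"time"
)

// waitForMerge sketches the wait loop suggested by the stack trace:
// a goroutine selects between a periodic retry tick and a cancel
// channel. If merged() never returns true and cancel is never
// closed, the goroutine stays blocked in select forever, which
// would show up as stalled rebalance progress.
func waitForMerge(merged func() bool, cancel <-chan struct{}) string {
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if merged() {
				return "merged"
			}
		case <-cancel:
			return "cancelled"
		}
	}
}

func main() {
	cancel := make(chan struct{})
	// Simulate a merge condition that becomes true after ~30 ms.
	deadline := time.Now().Add(30 * time.Millisecond)
	fmt.Println(waitForMerge(func() bool { return time.Now().After(deadline) }, cancel))
}
```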
cbcollect_info output is attached.
The cluster is currently available for debugging if needed. It may be repurposed over the weekend.
http://172.23.104.18:8091/