Details
-
Bug
-
Resolution: Fixed
-
Critical
-
7.1.0
-
Untriaged
-
-
1
-
No
Description
Build : 7.1.0-2416
Test : -test tests/integration/neo/test_neo_couchstore_milestone4.yml -scope tests/integration/neo/scope_couchstore.yml
Scale : 3
Iteration : 3rd
A step in the longevity test is to perform a multi node fail over for KV nodes, and then rebalance them out.
[2022-03-04T21:14:49-08:00, sequoiatools/cbutil:bbd5d9] /cbinit.py 172.23.106.100 root couchbase stop
|
[2022-03-04T21:15:14-08:00, sequoiatools/cmd:bc0cc6] 10
|
[2022-03-04T21:15:30-08:00, sequoiatools/cbutil:c62b83] /cbinit.py 172.23.120.107 root couchbase stop
|
[2022-03-04T21:15:40-08:00, sequoiatools/cmd:fad5eb] 10
|
[2022-03-04T21:15:56-08:00, sequoiatools/cbutil:85b1b7] /cbinit.py 172.23.99.11 root couchbase stop
|
[2022-03-04T21:16:15-08:00, sequoiatools/cmd:102979] 10
|
[2022-03-04T21:16:31-08:00, sequoiatools/couchbase-cli:7.1:1336ff] rebalance -c 172.23.108.103:8091 -u Administrator -p password
|
Rebalance is stuck for ~60 hrs at the indexer phase as 2 indexes are stuck in Moving state.
{
|
"bucket" : "bucket7",
|
"collection" : "coll_3",
|
"completion" : 100,
|
"definition" : "CREATE INDEX `idx10_xXgf` ON `bucket7`.`scope_2`.`coll_3`((all (array (all (array flatten_keys(`n`, `v`) for `n` : `v` in (`r`.`ratings`) end)) for `r` in `reviews` end))) PARTITION BY hash((meta().`id`)) WITH { \"defer_build\":true, \"nodes\":[ \"172.23.104.67:8091\",\"172.23.120.245:8091\",\"172.23.96.251:8091\",\"172.23.96.252:8091\" ], \"num_replica\":1, \"num_partition\":2 }",
|
"defnId" : 5451174261260198682,
|
"hosts" : [
|
"172.23.104.67:8091",
|
"172.23.96.251:8091",
|
"172.23.96.252:8091"
|
],
|
"indexName" : "idx10_xXgf",
|
"indexType" : "plasma",
|
"instId" : 10518299366001305340,
|
"lastScanTime" : "Sat Mar 5 05:37:44 PST 2022",
|
"name" : "idx10_xXgf",
|
"numPartition" : 3,
|
"numReplica" : 1,
|
"partitionMap" : {
|
"172.23.104.67:8091" : [
|
1
|
],
|
"172.23.96.251:8091" : [
|
2
|
],
|
"172.23.96.252:8091" : [
|
1
|
]
|
},
|
"partitioned" : true,
|
"progress" : 100,
|
"replicaId" : 0,
|
"scheduled" : false,
|
"scope" : "scope_2",
|
"secExprs" : [
|
"(all (array (all (array flatten_keys(`n`, `v`) for `n` : `v` in (`r`.`ratings`) end)) for `r` in `reviews` end))"
|
],
|
"stale" : false,
|
"status" : "Moving"
|
}
|
{
|
"bucket" : "bucket7",
|
"collection" : "coll_3",
|
"completion" : 100,
|
"definition" : "CREATE INDEX `idx1_eLiKR` ON `bucket7`.`scope_2`.`coll_3`(`country`,(distinct (array ((`r`.`ratings`).`Check in / front desk`) for `r` in `reviews` end)),array_count(`public_likes`),array_count(`reviews`) DESC,`type`,`phone`,`price`,`email`,`address`,`name`,`url`) PARTITION BY hash((meta().`id`)) WITH { \"defer_build\":true, \"nodes\":[ \"172.23.104.155:8091\",\"172.23.104.67:8091\",\"172.23.120.245:8091\",\"172.23.96.251:8091\",\"172.23.96.252:8091\" ], \"num_replica\":2, \"num_partition\":3 }",
|
"defnId" : 12484504482944935375,
|
"hosts" : [
|
"172.23.104.155:8091",
|
"172.23.120.245:8091",
|
"172.23.96.252:8091"
|
],
|
"indexName" : "idx1_eLiKR",
|
"indexType" : "plasma",
|
"instId" : 8445560715018320344,
|
"lastScanTime" : "Sat Mar 5 05:37:43 PST 2022",
|
"name" : "idx1_eLiKR",
|
"numPartition" : 4,
|
"numReplica" : 2,
|
"partitionMap" : {
|
"172.23.104.155:8091" : [
|
3,
|
2
|
],
|
"172.23.120.245:8091" : [
|
1
|
],
|
"172.23.96.252:8091" : [
|
1
|
]
|
},
|
"partitioned" : true,
|
"progress" : 100,
|
"replicaId" : 0,
|
"scheduled" : false,
|
"scope" : "scope_2",
|
"secExprs" : [
|
"`country`",
|
"(distinct (array ((`r`.`ratings`).`Check in / front desk`) for `r` in `reviews` end))",
|
"array_count(`public_likes`)",
|
"array_count(`reviews`)",
|
"`type`",
|
"`phone`",
|
"`price`",
|
"`email`",
|
"`address`",
|
"`name`",
|
"`url`"
|
],
|
"stale" : false,
|
"status" : "Moving"
|
}
|
On indexer node 172.23.96.252, following can be seen in the logs after the start of the rebalance:
root@k8s-worker-2 logs]# zgrep "bucket7:scope_2:coll_3" indexer.log* | grep "2022-03-04T21" | grep -e "\[Warn" -e "\[Erro"
|
indexer.log.4.gz:2022-03-04T21:40:20.444-08:00 [Warn] StreamState::updateHWT Received Partial Last Snapshot in HWT KeyspaceId bucket7:scope_2:coll_3 StreamId INIT_STREAM vbucket 546 Snapshot 0-154757 Seqno 154757 Vbuuid 130479846032254 lastSnap 1-1 lastSnapSeqno 154756
|
indexer.log.4.gz:2022-03-04T21:40:22.950-08:00 [Error] KVSender::sendDelKeyspacesRequest Unexpected Error During Del Buckets Request Projector 172.23.104.70:9999 Topic INIT_STREAM_TOPIC_11e8efedc8e58fadd12f9682c371f3ab KeyspaceIds [bucket7:scope_2:coll_3]. Err genServer.closed
|
indexer.log.4.gz:2022-03-04T21:40:22.950-08:00 [Error] KVSender::deleteKeyspacesFromStream INIT_STREAM [bucket7:scope_2:coll_3] Error Received genServer.closed from 172.23.104.70:9999
|
indexer.log.4.gz:2022-03-04T21:40:22.960-08:00 [Error] KVSender::sendDelKeyspacesRequest Unexpected Error During Del Buckets Request Projector 172.23.105.107:9999 Topic INIT_STREAM_TOPIC_11e8efedc8e58fadd12f9682c371f3ab KeyspaceIds [bucket7:scope_2:coll_3]. Err genServer.closed
|
indexer.log.4.gz:2022-03-04T21:40:22.960-08:00 [Error] KVSender::deleteKeyspacesFromStream INIT_STREAM [bucket7:scope_2:coll_3] Error Received genServer.closed from 172.23.105.107:9999
|
indexer.log.4.gz:2022-03-04T21:40:22.986-08:00 [Error] KVSender::sendDelKeyspacesRequest Unexpected Error During Del Buckets Request Projector 172.23.108.103:9999 Topic INIT_STREAM_TOPIC_11e8efedc8e58fadd12f9682c371f3ab KeyspaceIds [bucket7:scope_2:coll_3]. Err genServer.closed
|
indexer.log.4.gz:2022-03-04T21:40:22.986-08:00 [Error] KVSender::deleteKeyspacesFromStream INIT_STREAM [bucket7:scope_2:coll_3] Error Received genServer.closed from 172.23.108.103:9999
|
indexer.log.4.gz:2022-03-04T21:40:23.000-08:00 [Error] KVSender::sendDelKeyspacesRequest Unexpected Error During Del Buckets Request Projector 172.23.121.117:9999 Topic INIT_STREAM_TOPIC_11e8efedc8e58fadd12f9682c371f3ab KeyspaceIds [bucket7:scope_2:coll_3]. Err genServer.closed
|
indexer.log.4.gz:2022-03-04T21:40:23.000-08:00 [Error] KVSender::deleteKeyspacesFromStream INIT_STREAM [bucket7:scope_2:coll_3] Error Received genServer.closed from 172.23.121.117:9999
|
indexer.log.4.gz:2022-03-04T21:40:23.006-08:00 [Error] KVSender::sendDelKeyspacesRequest Unexpected Error During Del Buckets Request Projector 172.23.96.253:9999 Topic INIT_STREAM_TOPIC_11e8efedc8e58fadd12f9682c371f3ab KeyspaceIds [bucket7:scope_2:coll_3]. Err genServer.closed
|
indexer.log.4.gz:2022-03-04T21:40:23.006-08:00 [Error] KVSender::deleteKeyspacesFromStream INIT_STREAM [bucket7:scope_2:coll_3] Error Received genServer.closed from 172.23.96.253:9999
|
indexer.log.4.gz:2022-03-04T21:40:23.012-08:00 [Error] KVSender::sendDelKeyspacesRequest Unexpected Error During Del Buckets Request Projector 172.23.97.121:9999 Topic INIT_STREAM_TOPIC_11e8efedc8e58fadd12f9682c371f3ab KeyspaceIds [bucket7:scope_2:coll_3]. Err genServer.closed
|
indexer.log.4.gz:2022-03-04T21:40:23.012-08:00 [Error] KVSender::deleteKeyspacesFromStream INIT_STREAM [bucket7:scope_2:coll_3] Error Received genServer.closed from 172.23.97.121:9999
|
indexer.log.4.gz:2022-03-04T21:40:23.019-08:00 [Error] KVSender::sendDelKeyspacesRequest Unexpected Error During Del Buckets Request Projector 172.23.97.122:9999 Topic INIT_STREAM_TOPIC_11e8efedc8e58fadd12f9682c371f3ab KeyspaceIds [bucket7:scope_2:coll_3]. Err genServer.closed
|
indexer.log.4.gz:2022-03-04T21:40:23.019-08:00 [Error] KVSender::deleteKeyspacesFromStream INIT_STREAM [bucket7:scope_2:coll_3] Error Received genServer.closed from 172.23.97.122:9999
|
indexer.log.4.gz:2022-03-04T21:40:23.026-08:00 [Error] KVSender::sendDelKeyspacesRequest Unexpected Error During Del Buckets Request Projector 172.23.99.21:9999 Topic INIT_STREAM_TOPIC_11e8efedc8e58fadd12f9682c371f3ab KeyspaceIds [bucket7:scope_2:coll_3]. Err genServer.closed
|
indexer.log.4.gz:2022-03-04T21:40:23.026-08:00 [Error] KVSender::deleteKeyspacesFromStream INIT_STREAM [bucket7:scope_2:coll_3] Error Received genServer.closed from 172.23.99.21:9999
|
indexer.log.4.gz:2022-03-04T21:40:23.040-08:00 [Error] KVSender::sendDelKeyspacesRequest Unexpected Error During Del Buckets Request Projector 172.23.99.25:9999 Topic INIT_STREAM_TOPIC_11e8efedc8e58fadd12f9682c371f3ab KeyspaceIds [bucket7:scope_2:coll_3]. Err genServer.closed
|
indexer.log.4.gz:2022-03-04T21:40:23.040-08:00 [Error] KVSender::deleteKeyspacesFromStream INIT_STREAM [bucket7:scope_2:coll_3] Error Received genServer.closed from 172.23.99.25:9999
|
Similar errors and warnings can be seen on other indexer nodes as well - 172.23.104.155, 172.23.120.245, 172.23.104.67, 172.23.96.251
Logs attached are from ~3/4/22 11 PM. Let me know if you need logs from a later time stamp.