Details
Bug
Resolution: Fixed
Critical
7.1.0
Untriaged
1
Unknown
KV 2021-Oct-21, KV 2021-Nov
Description
Build : 7.1.0-1524
Test : -test tests/integration/neo/test_neo_couchstore_milestone2.yml -scope tests/integration/neo/scope_couchstore.yml
Scale : 3
Iteration : 2nd
In the 2nd iteration of the longevity test, a swap rebalance operation for index nodes started at 2021-10-19T18:19:34:
[2021-10-19T18:19:18-07:00, sequoiatools/couchbase-cli:7.0:231375] server-add -c 172.23.108.103:8091 --server-add https://172.23.104.67 -u Administrator -p password --server-add-username Administrator --server-add-password password --services index
[2021-10-19T18:19:34-07:00, sequoiatools/couchbase-cli:7.0:49dd8c] rebalance -c 172.23.108.103:8091 --server-remove 172.23.104.69 -u Administrator -p password
The rebalance has now been stuck for 20 hours in the index phase because one index is in the Moving state:
{
  "bucket" : "bucket7",
  "collection" : "coll_1",
  "completion" : 100,
  "definition" : "CREATE INDEX `idx3_b31z` ON `bucket7`.`scope_3`.`coll_1`(`free_breakfast`,`free_parking`,`country`,`city`) PARTITION BY hash((meta().`id`)) WITH { \"defer_build\":true, \"nodes\":[ \"172.23.104.67:8091\",\"172.23.104.69:8091\",\"172.23.105.111:8091\",\"172.23.120.245:8091\",\"172.23.121.117:8091\",\"172.23.96.252:8091\",\"172.23.96.253:8091\" ], \"num_replica\":3, \"num_partition\":5 }",
  "defnId" : 15684706469183385686,
  "hosts" : [
    "172.23.104.67:8091",
    "172.23.104.69:8091",
    "172.23.105.111:8091",
    "172.23.120.245:8091",
    "172.23.96.252:8091",
    "172.23.96.253:8091"
  ],
  "indexName" : "idx3_b31z",
  "indexType" : "plasma",
  "instId" : 8555298990214263616,
  "lastScanTime" : "NA",
  "name" : "idx3_b31z",
  "numPartition" : 6,
  "numReplica" : 3,
  "partitionMap" : {
    "172.23.104.67:8091" : [
      4
    ],
    "172.23.104.69:8091" : [
      4
    ],
    "172.23.105.111:8091" : [
      3
    ],
    "172.23.120.245:8091" : [
      1
    ],
    "172.23.96.252:8091" : [
      2
    ],
    "172.23.96.253:8091" : [
      5
    ]
  },
  "partitioned" : true,
  "progress" : 100,
  "replicaId" : 0,
  "scheduled" : false,
  "scope" : "scope_3",
  "secExprs" : [
    "`free_breakfast`",
    "`free_parking`",
    "`country`",
    "`city`"
  ],
  "stale" : false,
  "status" : "Moving"
}
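In the status entry above, partition 4 is listed under both 172.23.104.67 (the node being added) and 172.23.104.69 (the node being removed). A minimal Python sketch for spotting such overlaps follows. The helper name `partitions_on_multiple_hosts` is hypothetical and not part of any Couchbase tooling, and the sketch assumes the `partitionMap` has already been loaded as a dict in the shape shown above. It only reports overlaps; it does not decide whether an overlap is a fault or simply an in-flight move.

```python
from collections import defaultdict


def partitions_on_multiple_hosts(partition_map):
    """Return {partition_id: sorted hosts} for partitions listed on more than one host.

    Hypothetical helper: partition_map maps "host:port" to a list of partition ids,
    as in the "partitionMap" field of the index status entry.
    """
    hosts_by_partition = defaultdict(list)
    for host, partitions in partition_map.items():
        for pid in partitions:
            hosts_by_partition[pid].append(host)
    return {
        pid: sorted(hosts)
        for pid, hosts in hosts_by_partition.items()
        if len(hosts) > 1
    }


# partitionMap copied from the index status entry in this report
partition_map = {
    "172.23.104.67:8091": [4],
    "172.23.104.69:8091": [4],
    "172.23.105.111:8091": [3],
    "172.23.120.245:8091": [1],
    "172.23.96.252:8091": [2],
    "172.23.96.253:8091": [5],
}

overlaps = partitions_on_multiple_hosts(partition_map)
print(overlaps)
```

For the entry in this report, the only overlap is partition 4, which is the same partition named in the Rebalancer::waitForIndexBuild log lines.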
According to the stats, 318 mutations have been pending for this index on 172.23.104.67 for a long time. According to the indexer logs, the index has been in the CATCHUP state since 2021-10-19T20:08:48:
2021-10-19T20:08:48.386-07:00 [Info] Rebalancer::waitForIndexBuild: Index: bucket7:scope_3:coll_1:idx3_b31z State: INDEX_STATE_CATCHUP Pending: 316 EstTime: 0 Partitions: [4] Destination: 127.0.0.1:9102
2021-10-19T20:08:51.404-07:00 [Info] Rebalancer::waitForIndexBuild: Index: bucket7:scope_3:coll_1:idx3_b31z State: INDEX_STATE_CATCHUP Pending: 316 EstTime: 0 Partitions: [4] Destination: 127.0.0.1:9102
2021-10-19T20:08:52.159-07:00 [Info] Rebalancer::waitForIndexBuild: Index: bucket7:scope_3:coll_1:idx3_b31z State: INDEX_STATE_CATCHUP Pending: 316 EstTime: 0 Partitions: [4] Destination: 127.0.0.1:9102
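To confirm from a larger log excerpt that the pending count is not moving, the waitForIndexBuild lines can be parsed and compared. The following is a rough Python sketch, not an official tool: the regular expression is inferred only from the three log lines quoted in this report, the function names `parse_wait_lines` and `is_stalled` are hypothetical, and the samples are assumed to belong to a single index. The three quoted lines span only a few seconds, so a real check would want samples over a much longer window.

```python
import re

# Pattern inferred from the log lines quoted in this report (an assumption,
# not a documented log format).
LINE_RE = re.compile(
    r"^(?P<ts>\S+) \[Info\] Rebalancer::waitForIndexBuild: "
    r"Index: (?P<index>\S+) State: (?P<state>\S+) Pending: (?P<pending>\d+)"
)


def parse_wait_lines(lines):
    """Return a list of (timestamp, index, state, pending) for matching lines."""
    samples = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            samples.append((m["ts"], m["index"], m["state"], int(m["pending"])))
    return samples


def is_stalled(samples):
    """True if there are at least two samples and (state, pending) never changes."""
    return len(samples) >= 2 and len({(s[2], s[3]) for s in samples}) == 1


log_lines = [
    "2021-10-19T20:08:48.386-07:00 [Info] Rebalancer::waitForIndexBuild: Index: bucket7:scope_3:coll_1:idx3_b31z State: INDEX_STATE_CATCHUP Pending: 316 EstTime: 0 Partitions: [4] Destination: 127.0.0.1:9102",
    "2021-10-19T20:08:51.404-07:00 [Info] Rebalancer::waitForIndexBuild: Index: bucket7:scope_3:coll_1:idx3_b31z State: INDEX_STATE_CATCHUP Pending: 316 EstTime: 0 Partitions: [4] Destination: 127.0.0.1:9102",
    "2021-10-19T20:08:52.159-07:00 [Info] Rebalancer::waitForIndexBuild: Index: bucket7:scope_3:coll_1:idx3_b31z State: INDEX_STATE_CATCHUP Pending: 316 EstTime: 0 Partitions: [4] Destination: 127.0.0.1:9102",
]

samples = parse_wait_lines(log_lines)
stalled = is_stalled(samples)
print(len(samples), stalled)
```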
This does not appear to be related to MB-49031, as I couldn't find the keyword "committed harakiri" in the indexer logs on any of the indexer nodes.