Details
- Bug
- Resolution: Fixed
- Critical
- Cheshire-Cat
- Untriaged
- 1
- Yes
Description
Build : 7.0.0-5295
Test : -test tests/integration/cheshirecat/test_cheshirecat_kv_gsi_coll_xdcr_backup_sgw_fts_itemct_txns_eventing_cbas_scale3.yml -scope tests/integration/cheshirecat/scope_cheshirecat_with_backup.yml
Scale : 3
Iteration : 5th (Day 5)
The rebalance operation to add a new index node, 172.23.123.24, to the cluster failed.
This appears to be an intermittent issue. The same failure was recently reported and fixed via MB-46039.
From test console
[2021-06-15T18:42:18-07:00, sequoiatools/couchbase-cli:7.0:7b1dbd] server-add -c 172.23.97.74:8091 --server-add https://172.23.123.24 -u Administrator -p password --server-add-username Administrator --server-add-password password --services index
[2021-06-15T18:42:34-07:00, sequoiatools/couchbase-cli:7.0:563a15] rebalance -c 172.23.97.74:8091 -u Administrator -p password
→
Error occurred on container - sequoiatools/couchbase-cli:7.0:[rebalance -c 172.23.97.74:8091 -u Administrator -p password]
docker logs 563a15
docker start 563a15
Unable to display progress bar on this os
ERROR: Rebalance failed. See logs for detailed reason. You can try again.
[2021-06-15T19:12:31-07:00, sequoiatools/cmd:9a04e4] 60
From error.log on 172.23.106.134 :
[ns_server:error,2021-06-15T19:12:23.994-07:00,ns_1@172.23.106.134:service_rebalancer-index<0.29734.2887>:service_rebalancer:run_rebalance_worker:119]Worker terminated abnormally: {'EXIT',<0.28245.2887>,
    {rebalance_failed,
        {service_error,
            <<"Post http://127.0.0.1:9102/createIndexRebalance: EOF">>}}}
[user:error,2021-06-15T19:12:23.996-07:00,ns_1@172.23.106.134:<0.17645.1347>:ns_orchestrator:log_rebalance_completion:1416]Rebalance exited with reason {service_rebalance_failed,index,
    {worker_died,
        {'EXIT',<0.28245.2887>,
            {rebalance_failed,
                {service_error,
                    <<"Post http://127.0.0.1:9102/createIndexRebalance: EOF">>}}}}}.
Rebalance Operation Id = 6058b85561144335d30164f8c1a96327
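For triage across many runs, the failing service and the underlying error can be pulled out of the "Rebalance exited with reason" entry mechanically. The following is a minimal sketch only: `summarize_rebalance_failure` is a hypothetical helper, not part of ns_server or any Couchbase tool, and it assumes the multi-line Erlang term has first been joined onto a single line, as in the sample string.

```python
import re

# Sample taken from the error.log excerpt above, joined onto one line.
LOG_LINE = (
    "[user:error,2021-06-15T19:12:23.996-07:00,ns_1@172.23.106.134:<0.17645.1347>:"
    "ns_orchestrator:log_rebalance_completion:1416]Rebalance exited with reason "
    "{service_rebalance_failed,index,{worker_died,{'EXIT',<0.28245.2887>,"
    "{rebalance_failed,{service_error,"
    '<<"Post http://127.0.0.1:9102/createIndexRebalance: EOF">>}}}}}.'
)


def summarize_rebalance_failure(line):
    """Hypothetical helper: return (timestamp, node, service, error) or None."""
    # Log header: [category:level,timestamp,node:...
    header = re.match(r"\[[\w:]+,([^,]+),([^:]+):", line)
    # Service named in the service_rebalance_failed tuple.
    service = re.search(r"\{service_rebalance_failed,(\w+),", line)
    # Error text carried as an Erlang binary inside service_error.
    error = re.search(r'\{service_error,\s*<<"(.*?)">>', line)
    if not (header and service and error):
        return None
    return header.group(1), header.group(2), service.group(1), error.group(1)


if __name__ == "__main__":
    print(summarize_rebalance_failure(LOG_LINE))
```

Lines that do not contain both the `service_rebalance_failed` and `service_error` tuples return `None`, so the helper can be run over a whole log without special-casing other entries.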
From the rebalance report :
"index":{
    "totalProgress":69.18429003021149,
    "perNodeProgress":{
        "ns_1@172.23.97.110":0.6918429003021148,
        "ns_1@172.23.96.243":0.6918429003021148,
        "ns_1@172.23.123.24":0.6918429003021148,
        "ns_1@172.23.97.105":0.6918429003021148,
        "ns_1@172.23.120.75":0.6918429003021148,
        "ns_1@172.23.97.148":0.6918429003021148,
        "ns_1@172.23.120.58":0.6918429003021148
    },
    "startTime":"2021-06-15T18:42:48.907-07:00",
    "completedTime":false,
    "timeTaken":1775138
}
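The report values can be cross-checked against the log timestamps. The sketch below assumes `timeTaken` is in milliseconds, which the report itself does not state. Under that assumption the index rebalance ran for roughly 29.6 minutes and stopped at about 19:12:24, which lines up with the 19:12:23.994 entry in error.log. It also shows that `perNodeProgress` is a fraction while `totalProgress` is a percentage. Only one node is kept in the sample to keep it short.

```python
import json
from datetime import datetime, timedelta

# Trimmed copy of the "index" section of the rebalance report above.
REPORT = json.loads("""
{"index": {
    "totalProgress": 69.18429003021149,
    "perNodeProgress": {"ns_1@172.23.123.24": 0.6918429003021148},
    "startTime": "2021-06-15T18:42:48.907-07:00",
    "completedTime": false,
    "timeTaken": 1775138
}}
""")


def failure_time(service_report):
    """Start time plus elapsed time, ASSUMING timeTaken is in milliseconds."""
    start = datetime.fromisoformat(service_report["startTime"])
    return start + timedelta(milliseconds=service_report["timeTaken"])


index = REPORT["index"]
stopped_at = failure_time(index)
minutes = index["timeTaken"] / 60000  # milliseconds to minutes (assumed unit)
node_fraction = index["perNodeProgress"]["ns_1@172.23.123.24"]

if __name__ == "__main__":
    print(stopped_at.isoformat(timespec="milliseconds"), round(minutes, 1))
    print(node_fraction * 100, index["totalProgress"])
```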
On indexer node 172.23.123.24, the indexer logs show the following:
2021-06-15T19:12:20.974-07:00 [Info] Rebalancer::decodeTransferToken TransferToken TransferToken67:2b:33:90:cf:8e:7e:4b MasterId: a826a4733e9644442e2517288e82a0d8 SourceId: be30ddb96b59e6b70c07733a0155e0d6 (172.23.120.58:8091) DestId: e4198e9e98e43788fae35314ade88f0a (172.23.97.105:8091) RebalId: 5c70774072149bca4d20d7ec2f0ec364 State: TransferTokenCreated BuildSource: Dcp TransferMode: Move Error: Post http://127.0.0.1:9102/createIndexRebalance: EOF InstId: 16276776005385447829 RealInstId: 14031018094979839685 Partitions: [2] Versions: [4] Inst:
    InstId: 14031018094979839685
    Defn: DefnId: 7702856850988250970 Name: idx3_ftdQ Using: plasma Bucket: bucket4 Scope/Id: scope_1/9 Collection/Id: coll_2/f IsPrimary: false NumReplica: 2 InstVersion: 4
    SecExprs: <ud>([`free_breakfast` `free_parking` `country` `city`])</ud>
    Desc: [false false false false]
    PartitionScheme: KEY
    HashScheme: CRC32 PartitionKeys: [(meta().`id`)] WhereExpr: <ud>()</ud> RetainDeletedXATTR: false
    State: INDEX_STATE_ACTIVE
    RState: RebalActive
    Stream: NIL_STREAM
    Version: 3
    ReplicaId: 1
    PartitionContainer: <nil>
2021-06-15T19:12:20.974-07:00 [Error] Rebalancer::processTokenAsMaster Detected TransferToken in Error state MasterId: a826a4733e9644442e2517288e82a0d8 SourceId: be30ddb96b59e6b70c07733a0155e0d6 (172.23.120.58:8091) DestId: e4198e9e98e43788fae35314ade88f0a (172.23.97.105:8091) RebalId: 5c70774072149bca4d20d7ec2f0ec364 State: TransferTokenCreated BuildSource: Dcp TransferMode: Move Error: Post http://127.0.0.1:9102/createIndexRebalance: EOF InstId: 16276776005385447829 RealInstId: 14031018094979839685 Partitions: [2] Versions: [4] Inst:
    InstId: 14031018094979839685
    Defn: DefnId: 7702856850988250970 Name: idx3_ftdQ Using: plasma Bucket: bucket4 Scope/Id: scope_1/9 Collection/Id: coll_2/f IsPrimary: false NumReplica: 2 InstVersion: 4
    SecExprs: <ud>([`free_breakfast` `free_parking` `country` `city`])</ud>
    Desc: [false false false false]
    PartitionScheme: KEY
    HashScheme: CRC32 PartitionKeys: [(meta().`id`)] WhereExpr: <ud>()</ud> RetainDeletedXATTR: false
    State: INDEX_STATE_ACTIVE
    RState: RebalActive
    Stream: NIL_STREAM
    Version: 3
    ReplicaId: 1
    PartitionContainer: <nil>
. Abort.
2021-06-15T19:12:20.974-07:00 [Info] Rebalancer::doFinish Cleanup Post http://127.0.0.1:9102/createIndexRebalance: EOF
2021-06-15T19:12:20.975-07:00 [Info] Rebalancer::processDropIndexQueue Done Received
2021-06-15T19:12:20.975-07:00 [Info] Rebalancer::observeRebalance exiting err <nil>
2021-06-15T19:12:20.975-07:00 [Info] Rebalancer::updateProgress Done Received
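The TransferToken lines carry the source node, destination node, token state, and error in one long record, which is hard to read by eye. The sketch below extracts those fields with a regular expression. `parse_transfer_token` and the pattern are hypothetical, written against the field order seen in this excerpt only. They are not an indexer API and may not match other log versions. The sample line is truncated after the first `InstId`.

```python
import re

# Head of the [Error] line from the indexer log excerpt above.
TOKEN_LINE = (
    "2021-06-15T19:12:20.974-07:00 [Error] Rebalancer::processTokenAsMaster "
    "Detected TransferToken in Error state "
    "MasterId: a826a4733e9644442e2517288e82a0d8 "
    "SourceId: be30ddb96b59e6b70c07733a0155e0d6 (172.23.120.58:8091) "
    "DestId: e4198e9e98e43788fae35314ade88f0a (172.23.97.105:8091) "
    "RebalId: 5c70774072149bca4d20d7ec2f0ec364 "
    "State: TransferTokenCreated BuildSource: Dcp TransferMode: Move "
    "Error: Post http://127.0.0.1:9102/createIndexRebalance: EOF "
    "InstId: 16276776005385447829"
)

# Assumes the field order shown in this excerpt.
TOKEN_PATTERN = re.compile(
    r"SourceId: \w+ \((?P<source>[\d.]+:\d+)\) "
    r"DestId: \w+ \((?P<dest>[\d.]+:\d+)\) "
    r"RebalId: (?P<rebal_id>\w+) "
    r"State: (?P<state>\w+) .*?"
    r"Error: (?P<error>.*?) InstId:"
)


def parse_transfer_token(line):
    """Hypothetical helper: return the token's key fields as a dict, or None."""
    match = TOKEN_PATTERN.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for key, value in parse_transfer_token(TOKEN_LINE).items():
        print(f"{key}: {value}")
```

Applied to the line above, this shows that the failing token was moving a partition from 172.23.120.58 to 172.23.97.105 and was still in state TransferTokenCreated when the createIndexRebalance POST returned EOF.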
Issue Links
- Clones: MB-46945 [System Test] : Rebalance failure due to reason service_rebalance_failed,index - Post http://127.0.0.1:9102/createIndexRebalance: EOF (Closed)