Details
-
Bug
-
Resolution: Duplicate
-
Major
-
None
-
6.6.5
-
None
-
Enterprise Edition 6.6.5 build 10104
-
Untriaged
-
Centos 64-bit
-
1
-
Unknown
Description
Steps to Reproduce
1. Create a 6-node cluster with 3 KV, 2 indexing, and 1 N1QL nodes.
2. Create buckets, load data, and create indexes. Push the buckets into DGM and ensure the indexes are in DGM as well. Start running queries in the background with request_plus consistency.
3. Run the following script, written to validate MB-53057. In an infinite loop it kills memcached (on 172.23.100.34), waits for auto failover (AF) to kick in, does a full recovery, and then rebalances.
#!/bin/bash
while :
do
    echo "killing memcached..."
    kill -9 $(pidof memcached)
    echo "Waiting for auto failover to kick in..."
    sleep 180
    echo "Listing node status post Auto failover..."
    /opt/couchbase/bin/couchbase-cli server-list -c localhost:8091 --username Administrator --password password
    sleep 30
    echo "Starting full recovery..."
    /opt/couchbase/bin/couchbase-cli recovery -c localhost:8091 --username Administrator --password password --server-recovery 172.23.100.34:8091 --recovery-type full
    sleep 30
    echo "Starting Rebalance after recovering a failed over node..."
    /opt/couchbase/bin/couchbase-cli rebalance -c localhost:8091 --username Administrator --password password
    sleep 4000
    echo "Listing rebalance status..."
    /opt/couchbase/bin/couchbase-cli rebalance-status -c localhost:8091 --username Administrator --password password
    sleep 30
    echo "Listing node status post rebalance..."
    /opt/couchbase/bin/couchbase-cli server-list -c localhost:8091 --username Administrator --password password
    sleep 300
done
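The script above relies on fixed sleeps (180 s for auto failover, 4000 s for the rebalance). A possible alternative, shown only as a sketch, is to poll a command until its output matches a pattern or a timeout expires. `wait_for_output` is a hypothetical helper and not part of couchbase-cli. The pattern to match against `server-list` or `rebalance-status` output is an assumption that has to be checked against what the CLI actually prints on this build, so it is left as a placeholder in the commented usage.

```bash
#!/bin/bash
# wait_for_output TIMEOUT_SECS INTERVAL_SECS PATTERN COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until its stdout matches the
# extended regex PATTERN. Returns 0 on a match, 1 once TIMEOUT_SECS
# has elapsed without a match.
wait_for_output() {
    local timeout=$1 interval=$2 pattern=$3
    shift 3
    local deadline=$(( SECONDS + timeout ))
    while :; do
        # grep -q decides the pipeline's exit status (pipefail is not set).
        if "$@" 2>/dev/null | grep -Eq -- "$pattern"; then
            return 0
        fi
        if (( SECONDS >= deadline )); then
            return 1
        fi
        sleep "$interval"
    done
}

# Possible use in place of "sleep 180" (PATTERN is a placeholder, not
# verified couchbase-cli output):
#   wait_for_output 600 10 "$PATTERN" \
#       /opt/couchbase/bin/couchbase-cli server-list -c localhost:8091 \
#       --username Administrator --password password
```

Polling would shorten each loop iteration when failover or rebalance finishes early, and it would surface a stuck step through the non-zero return code instead of silently moving on.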
172.23.106.159
-bash-4.2# grep rollbackAllToZero *
indexer.log:2022-07-28T23:21:27.611-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test6
indexer.log:2022-07-28T23:25:12.092-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test7
grep: rebalance: Is a directory
-bash-4.2#
-bash-4.2# grep "RESPONSE status:(error" indexer.log
2022-07-28T20:52:09.953-07:00 [Info] SCAN##6 RESPONSE status:(error = Indexer rollback), requestId: a2a386e2-c8be-4e09-b3e3-08f7d07b7664
2022-07-28T21:34:01.142-07:00 [Info] SCAN##33 RESPONSE status:(error = Index scan timed out), requestId: 8176078c-a052-49d2-8ed2-e17c7fc39e11
2022-07-28T21:37:30.653-07:00 [Info] SCAN##34 RESPONSE status:(error = Index scan timed out), requestId: 06f18bfe-3121-4160-95dc-4910d9044dbf
2022-07-28T21:43:30.666-07:00 [Info] SCAN##35 RESPONSE status:(error = Index scan timed out), requestId: 06f18bfe-3121-4160-95dc-4910d9044dbf
2022-07-28T21:56:50.734-07:00 [Info] SCAN##36 RESPONSE status:(error = Index scan timed out), requestId: e2771988-044c-42ee-8be6-dd087450b1f8
2022-07-28T21:58:50.746-07:00 [Info] SCAN##37 RESPONSE status:(error = Index scan timed out), requestId: e2771988-044c-42ee-8be6-dd087450b1f8
2022-07-28T22:05:10.760-07:00 [Info] SCAN##38 RESPONSE status:(error = Index scan timed out), requestId: 784fbd84-f159-4c97-83bc-ecea954b0570
2022-07-28T22:07:10.772-07:00 [Info] SCAN##39 RESPONSE status:(error = Index scan timed out), requestId: 784fbd84-f159-4c97-83bc-ecea954b0570
2022-07-28T22:13:30.786-07:00 [Info] SCAN##40 RESPONSE status:(error = Index scan timed out), requestId: fb1c5f7d-b40c-4a10-b29f-93d906062552
2022-07-28T22:17:30.798-07:00 [Info] SCAN##41 RESPONSE status:(error = Index scan timed out), requestId: fb1c5f7d-b40c-4a10-b29f-93d906062552
2022-07-28T22:19:50.810-07:00 [Info] SCAN##42 RESPONSE status:(error = Index scan timed out), requestId: ff34ed61-1a6d-4b1e-9c9f-4e03392133de
2022-07-28T22:25:50.823-07:00 [Info] SCAN##43 RESPONSE status:(error = Index scan timed out), requestId: ff34ed61-1a6d-4b1e-9c9f-4e03392133de
2022-07-28T22:28:30.836-07:00 [Info] SCAN##44 RESPONSE status:(error = Index scan timed out), requestId: 1eedd0ca-72cb-489b-9b69-89cde7a369ec
2022-07-28T22:55:31.234-07:00 [Info] SCAN##53 RESPONSE status:(error = Index scan timed out), requestId: fe917f09-c561-4796-8b33-567c11c73d1c
2022-07-28T22:57:31.245-07:00 [Info] SCAN##54 RESPONSE status:(error = Index scan timed out), requestId: fe917f09-c561-4796-8b33-567c11c73d1c
2022-07-28T23:24:18.134-07:00 [Info] SCAN##68 RESPONSE status:(error = Indexer rollback), requestId: f6fe61c3-c77c-4fa5-8852-44c9704f6f1e
2022-07-28T23:28:27.462-07:00 [Info] SCAN##69 RESPONSE status:(error = Index scan timed out), requestId: f6fe61c3-c77c-4fa5-8852-44c9704f6f1e
172.23.106.163
-bash-4.2# grep rollbackAllToZero *
indexer.log:2022-07-28T23:24:27.242-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test7
grep: rebalance: Is a directory
-bash-4.2#
-bash-4.2# grep "RESPONSE status:(error" indexer.log
2022-07-28T23:22:18.924-07:00 [Info] SCAN##112 RESPONSE status:(error = Indexer rollback), requestId: aaee71c1-8153-45f2-a10d-5b30b542d882
2022-07-28T23:24:27.449-07:00 [Info] SCAN##116 RESPONSE status:(error = Indexer rollback), requestId: f6fe61c3-c77c-4fa5-8852-44c9704f6f1e
2022-07-28T23:26:27.460-07:00 [Info] SCAN##117 RESPONSE status:(error = Index scan timed out), requestId: f6fe61c3-c77c-4fa5-8852-44c9704f6f1e
-bash-4.2#
cbcollect_info attached.
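For comparing the two indexer nodes, the grep output above can be condensed with a small sketch like the following. Both function names are hypothetical. The parsing assumes only the log line shapes quoted in this ticket (`RESPONSE status:(error = ...)` and `StorageMgr::rollbackAllToZero <stream> <name>`); other indexer versions may format these lines differently. Passing explicit log files also avoids the "grep: rebalance: Is a directory" noise from `grep ... *`.

```bash
#!/bin/bash
# summarize_scan_errors LOGFILE...
# Prints "<count> <error text>" for each distinct scan error,
# most frequent first.
summarize_scan_errors() {
    grep -h 'RESPONSE status:(error' "$@" \
        | sed -E 's/.*\(error = ([^)]*)\).*/\1/' \
        | sort | uniq -c | sort -rn \
        | sed -E 's/^ *([0-9]+) /\1 /'
}

# list_rollbacks_to_zero LOGFILE...
# Prints "<timestamp> <stream> <name>" for each rollbackAllToZero event,
# where <name> is the last field of the log line (test6, test7 above).
list_rollbacks_to_zero() {
    grep -h 'StorageMgr::rollbackAllToZero' "$@" \
        | awk '{ print $1, $(NF-1), $NF }'
}

# Example, run from the directory that holds indexer.log:
#   summarize_scan_errors indexer.log
#   list_rollbacks_to_zero indexer.log
```

Putting the rollback timestamps next to the error counts makes it easier to see which scan errors cluster around a rollback to zero and which occur during the long stretches between them.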
Attachments
Issue Links
- duplicates
  - MB-53084 Index rollback to zero on memcached OOM kill (Closed)
- is duplicated by
  - MB-53180 [6.6.5 build 10104] - Primary Index rollback to zero after KV node auto failover (Closed)
  - MB-53186 [6.6.5 build 10104] - Multiple primary Indexes rollback to zero after KV node auto failover (Closed)
  - MB-53236 [6.6.5-15322] : RollbackAll to zero seen in the indexer logs (Closed)
- relates to
  - MB-53189 Set failover table branch point based on in-memory state (Open)