Details

Type: Bug
Resolution: Duplicate
Priority: Major
Fix Version/s: 6.6.5, 7.1.2
Affects Version/s: 6.6.5
Component/s: secondary-index
Labels:
None
Environment:
Enterprise Edition 6.6.5 build 10104

Triage:
Untriaged
Operating System:
Centos 64-bit
Story Points:
1
Is this a Regression?:
No

Description

Steps to Repro
1. Create a 6-node cluster with 3 KV, 2 index, and 1 N1QL (query) nodes.
2. Create buckets, data, and indexes. Push the buckets into DGM and ensure the indexes are in DGM as well. Start running queries in the background with request_plus consistency.
3. Run the following script, used to validate ~~MB-53057~~, on 172.23.100.34. In an infinite loop it kills memcached, waits for auto failover (AF) to kick in, performs a full recovery, and then rebalances.

#!/bin/bash
# Repro loop: kill memcached on this node, wait for auto failover,
# fully recover the node, then rebalance it back into the cluster.

while :
do
    echo "killing memcached..."
    kill -9 $(pidof memcached)

    # Give auto failover time to trigger before inspecting the cluster.
    echo "Waiting for auto failover to kick in..."
    sleep 180

    echo "Listing node status post Auto failover..."
    /opt/couchbase/bin/couchbase-cli server-list -c localhost:8091 --username Administrator --password password
    sleep 30

    # Mark the failed-over node for full recovery.
    echo "Starting full recovery..."
    /opt/couchbase/bin/couchbase-cli recovery -c localhost:8091 --username Administrator --password password --server-recovery 172.23.100.34:8091 --recovery-type full
    sleep 30

    echo "Starting Rebalance after recovering a failed over node..."
    /opt/couchbase/bin/couchbase-cli rebalance -c localhost:8091 --username Administrator --password password
    sleep 4000

    echo "Listing rebalance status..."
    /opt/couchbase/bin/couchbase-cli rebalance-status -c localhost:8091 --username Administrator --password password
    sleep 30

    echo "Listing node status post rebalance..."
    /opt/couchbase/bin/couchbase-cli server-list -c localhost:8091 --username Administrator --password password

    # Pause before starting the next iteration.
    sleep 300
done
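
If memcached is not running when an iteration starts, `pidof memcached` prints nothing and `kill -9` is called without a PID. A minimal sketch of a guard for that case follows. The helper name `kill_if_running` is hypothetical and is not part of the script that was run for this ticket.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original test script):
# send SIGKILL to every process with the given name, or report
# that nothing was found and return 1 without calling kill.
kill_if_running() {
    local name="$1"
    local pids
    pids=$(pidof "$name") || {
        echo "$name is not running; skipping kill" >&2
        return 1
    }
    # Intentionally unquoted: pidof may print several space-separated PIDs.
    kill -9 $pids
}
```

In the loop above, `kill_if_running memcached` could stand in for the bare `kill -9` line.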

This is exactly the same test as the one in ~~MB-53180~~. However, in this case it seems that two primary indexes were rolled back to zero.

172.23.106.159 : index

/opt/couchbase/var/lib/couchbase/logs/indexer.log:2022-07-29T05:07:37.832-07:00 [Info] StorageMgr::handleRollback Rollback Index: 10943164515644793993 PartitionId: 0 SliceId: 0 To Zero

/opt/couchbase/var/lib/couchbase/logs/indexer.log:2022-07-29T05:07:37.832-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test4

/opt/couchbase/var/lib/couchbase/logs/indexer.log:2022-07-29T05:07:37.940-07:00 [Info] StorageMgr::handleRollback Rollback Index: 10943164515644793993 PartitionId: 0 SliceId: 0 To Zero

172.23.106.163 : index

/opt/couchbase/var/lib/couchbase/logs/indexer.log:2022-07-29T05:09:02.679-07:00 [Info] StorageMgr::handleRollback Rollback Index: 3721238277937800766 PartitionId: 0 SliceId: 0 To Zero

/opt/couchbase/var/lib/couchbase/logs/indexer.log:2022-07-29T05:09:02.679-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test2

/opt/couchbase/var/lib/couchbase/logs/indexer.log:2022-07-29T05:09:02.784-07:00 [Info] StorageMgr::handleRollback Rollback Index: 3721238277937800766 PartitionId: 0 SliceId: 0 To Zero
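
A minimal sketch for pulling the affected index IDs out of an indexer log, assuming the `StorageMgr::handleRollback ... To Zero` line format shown above. The function name `list_rollbacks_to_zero` is hypothetical; the default path is the one that appears in the excerpts.

```shell
#!/bin/bash
# Hypothetical helper: print the distinct index IDs that were rolled
# back to zero, one per line, based on handleRollback log entries.
# Usage: list_rollbacks_to_zero [path-to-indexer.log]
list_rollbacks_to_zero() {
    local log="${1:-/opt/couchbase/var/lib/couchbase/logs/indexer.log}"
    grep 'StorageMgr::handleRollback Rollback Index:' "$log" \
        | grep 'To Zero' \
        | sed -E 's/.*Rollback Index: ([0-9]+) .*/\1/' \
        | sort -u
}
```

Run on each index node, this lists one ID per rolled-back index, however many times the rollback was logged.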

cbcollect_info attached.

Attachments

Issue Links

duplicates

MB-53172 [6.6.5 build 10104] - Secondary Index rollback to zero after KV node auto failover

Resolved

MB-53084 Index rollback to zero on memcached OOM kill

Closed

Activity

People

Assignee: Balakumaran Gopal

Reporter: Balakumaran Gopal

Votes: 0

Watchers: 5

Dates

Created: 29/Jul/22 6:54 AM

Updated: 29/Jul/22 3:40 PM

Resolved: 29/Jul/22 10:25 AM

Gerrit Reviews

There are no open Gerrit changes

[6.6.5 build 10104] - Multiple primary Indexes rollback to zero after KV node auto failover
