Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Fix Version/s: Cheshire-Cat, 5.5.6, 6.0.4, 6.6.1
Triage: Untriaged
Description
1. During rebalance of a partitioned index, individual partitions move between nodes.
e.g. partitions P1 and P2 are moving from nodeA to nodeB and nodeC.
2. Once the partitions have successfully moved to nodeB/nodeC, they need to be cleaned up on nodeA. The cleanup involves both metadata cleanup and actual partition data cleanup.
3. After the metadata cleanup, a tombstone is created to make sure the actual partition cleanup can happen even if the indexer crashes.
In this example, 2 tombstone records will get created (one for P1 and one for P2). Each tombstone has its own instanceId in the PendingDelete state.
4. These tombstones get deleted by the indexer on the next restart.
Also, if a partition with a tombstone is recreated as part of a subsequent rebalance (e.g. P1 or P2 moving back onto nodeA), the tombstone for that partition is cleaned up to make sure the indexer doesn't delete a valid partition after restart.
5. The tombstone cleanup mechanism has a bug due to which some tombstones can skip cleanup if there are multiple tombstones for the same index: the function returns prematurely after it has found the first tombstone (see the sketch after this list).
https://github.com/couchbase/indexing/blob/d17aa55f4fd2ec421b563b913862c34481265905/secondary/manager/topology.go#L545
6. In the presence of such tombstones, any indexer restart can lead to partition data being cleaned up for a valid partition on the node.
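The early-return pattern can be illustrated with a minimal, self-contained sketch in Go. The types and names below are hypothetical simplifications, not the actual structures in topology.go; they only show why returning after the first PendingDelete match leaves a second tombstone (e.g. P2's) behind, and what a full scan looks like instead.

package main

import "fmt"

// Simplified stand-ins for the real topology structures; names are
// hypothetical and only illustrate the shape of the bug.
type IndexInst struct {
	InstID int
	State  string // "PendingDelete" marks a tombstone
}

type IndexDefn struct {
	DefnID int
	Insts  []IndexInst
}

// buggyRemoveTombstone mirrors the reported defect: it returns as soon as
// the first tombstone for the index is removed, so a second tombstone
// (one each was created for P1 and P2) survives the cleanup pass.
func buggyRemoveTombstone(defn *IndexDefn) {
	for i, inst := range defn.Insts {
		if inst.State == "PendingDelete" {
			defn.Insts = append(defn.Insts[:i], defn.Insts[i+1:]...)
			return // premature return: later tombstones are skipped
		}
	}
}

// removeAllTombstones keeps only non-tombstone instances, so every
// PendingDelete record for the index is removed in a single pass.
func removeAllTombstones(defn *IndexDefn) {
	kept := defn.Insts[:0]
	for _, inst := range defn.Insts {
		if inst.State != "PendingDelete" {
			kept = append(kept, inst)
		}
	}
	defn.Insts = kept
}

func main() {
	defn := &IndexDefn{
		DefnID: 1,
		Insts: []IndexInst{
			{InstID: 101, State: "PendingDelete"}, // tombstone for P1
			{InstID: 102, State: "PendingDelete"}, // tombstone for P2
		},
	}
	buggyRemoveTombstone(defn)
	fmt.Println("after buggy cleanup:", len(defn.Insts), "tombstone(s) left") // prints 1
	removeAllTombstones(defn)
	fmt.Println("after full cleanup:", len(defn.Insts), "tombstone(s) left") // prints 0
}

With the full scan, a single cleanup pass removes the tombstones for both P1 and P2, so a later restart finds no stale PendingDelete records to act on.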
Issue Links
- is a backport of MB-42160: Multiple partition tombstones for an index during rebalance can lead to partition cleanup on restart (Closed)
Steps to reproduce:
1. 1 KV+N1QL node and 2 index service nodes (I1 and I2) on 6.6.1.
2. Load 100k documents in the bucket.
3. Create 5 partitioned indexes with 16 partitions each (a scripted example follows these steps).
4. Rebalance in 2 more index service nodes (I3 and I4), together in a single rebalance.
5. Rebalance out I3 and I4, together in a single rebalance.
6. Restart the indexer process on I1 and I2.
7. The indexer will end up cleaning up some of the partitions and restarting the DCP stream from 0.
This can be verified from the following log line (notice RestartTS is nil) or from high indexer CPU utilization.
2020-10-14T00:16:11.295-04:00 [Info] Indexer::startBucketStream Stream: MAINT_STREAM Bucket: merchant_trans SessionId 1 RestartTS <nil>
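For reference, the index creation in step 3 can be scripted. Below is a minimal sketch using the Couchbase Go SDK (gocb v2); the connection string, credentials, and indexed field names are placeholders, and num_partition in the WITH clause sets the partition count per index.

package main

import (
	"fmt"
	"log"

	"github.com/couchbase/gocb/v2"
)

func main() {
	// Placeholder connection details; adjust for the test cluster.
	cluster, err := gocb.Connect("couchbase://127.0.0.1", gocb.ClusterOptions{
		Authenticator: gocb.PasswordAuthenticator{
			Username: "Administrator",
			Password: "password",
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Step 3: five partitioned indexes, 16 partitions each, hash-partitioned
	// on the document key. Field names are placeholders; the bucket name
	// matches the one in the log line above.
	for i := 1; i <= 5; i++ {
		stmt := fmt.Sprintf(
			"CREATE INDEX idx_part_%d ON `merchant_trans`(field_%d) "+
				"PARTITION BY HASH(META().id) WITH {\"num_partition\": 16}", i, i)
		res, err := cluster.Query(stmt, nil)
		if err != nil {
			log.Fatalf("creating index %d: %v", i, err)
		}
		res.Close()
	}
}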