Couchbase Server / MB-42108

[BP 6.6.1] Multiple partition tombstones for an index during rebalance can lead to partition cleanup on restart


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Cheshire-Cat, 5.5.6, 6.0.4, 6.6.1
    • Fix Version/s: 6.6.1
    • Component/s: secondary-index
    • Triage:
      Untriaged
    • Story Points:
      1
    • Is this a Regression?:
      No

      Description

      1. During rebalance of a partitioned index, individual partitions move between nodes.
      e.g., partitions P1 and P2 are moving from nodeA to nodeB and nodeC.

      2. Once the partitions have successfully moved to nodeB/nodeC, they need to be cleaned up from nodeA. The cleanup involves both metadata cleanup and actual partition data cleanup.

      3. After metadata cleanup, a tombstone is created to make sure the actual partition cleanup can happen even if the indexer crashes.

      In this example, 2 tombstone records will get created (one for P1 and one for P2). Each tombstone has its own instanceId with PendingDelete state.

      4. These tombstones get deleted by the indexer on the next restart.
      Also, if a partition with a tombstone is recreated as part of a subsequent rebalance (e.g., P1 or P2 moving back to nodeA), the indexer cleans up the tombstone for that partition to make sure it doesn't delete a valid partition after restart.

      5. The tombstone cleanup mechanism has a bug: some tombstones can skip cleanup when there are multiple tombstones for the same index, because the function returns prematurely after it has found the first tombstone.
      https://github.com/couchbase/indexing/blob/d17aa55f4fd2ec421b563b913862c34481265905/secondary/manager/topology.go#L545

      6. In the presence of such tombstones, any indexer restart can lead to partition data being cleaned up for a valid partition on the node.
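The early-return defect and its fix can be sketched with a simplified model of the topology metadata. The types, function names, and states below are illustrative only, not the actual code in secondary/manager/topology.go:

```go
package main

import "fmt"

// instance is a simplified stand-in for an index instance entry in the
// topology metadata; a tombstone is an instance in PendingDelete state.
type instance struct {
	instId     int
	state      string // e.g. "PendingDelete" for a tombstone
	partitions []int
}

type topology struct {
	instances []instance
}

// removePartitionBuggy mirrors the reported defect: the loop returns as
// soon as the first tombstone is found, so any later tombstone for the
// same index keeps a stale reference to the partition.
func (t *topology) removePartitionBuggy(part int) {
	for i := range t.instances {
		if t.instances[i].state == "PendingDelete" {
			t.instances[i].partitions = remove(t.instances[i].partitions, part)
			return // bug: remaining tombstones are never examined
		}
	}
}

// removePartitionFixed evaluates every tombstone, matching the fix
// "evaluate all tombstones for an index when removing partitions".
func (t *topology) removePartitionFixed(part int) {
	for i := range t.instances {
		if t.instances[i].state == "PendingDelete" {
			t.instances[i].partitions = remove(t.instances[i].partitions, part)
		}
	}
}

func remove(parts []int, part int) []int {
	var out []int
	for _, p := range parts {
		if p != part {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Two tombstones for the same index, as created when P1 and P2
	// moved off nodeA in separate partition movements.
	t := topology{instances: []instance{
		{instId: 1, state: "PendingDelete", partitions: []int{1}},
		{instId: 2, state: "PendingDelete", partitions: []int{1, 2}},
	}}
	t.removePartitionBuggy(1)
	fmt.Println(t.instances[1].partitions) // [1 2]: stale reference to partition 1 survives
}
```

With the buggy variant, the second tombstone still claims partition 1, so a restart can delete the valid partition; the fixed variant removes the partition from every tombstone.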


            Activity

            Deepkaran Salooja added a comment -

            Steps to reproduce:

            1. 1 KV+N1QL node and 2 index service nodes (I1 and I2) on 6.6.1.
            2. Load 100k documents in the bucket.
            3. Create 5 partitioned indexes with 16 partitions each.
            4. Rebalance in 2 more index service nodes (I3 and I4), together in a single rebalance.
            5. Rebalance out I3 and I4, together in a single rebalance.
            6. Restart the Indexer process on I1 and I2.
            7. The Indexer will end up cleaning some of the partitions and restarting the DCP stream from 0.
            This can be checked from the following log line (notice RestartTS is nil) or from high indexer CPU utilization.

            2020-10-14T00:16:11.295-04:00 [Info] Indexer::startBucketStream Stream: MAINT_STREAM Bucket: merchant_trans SessionId 1 RestartTS <nil>
            

            Deepkaran Salooja added a comment -

            There will be 2 changes as part of the fix:

            1. The fix for the actual issue, to make sure all tombstones get cleaned up for a partitioned index.
            2. Even with 1, an existing cluster that has already run into this issue (i.e. it has the bad tombstones) cannot be safely upgraded with failover/recovery. There is another fix which allows such a cluster to be upgraded with the failover/recovery mechanism.

            For testing of the fix:
            1. Reproduce the issue with the steps mentioned above. With the fix, the issue should no longer be reproducible.
            2. Test upgrade by first reproducing the issue on a cluster with an older build, then use failover/recovery to upgrade the cluster. The expectation is that after failover/recovery, none of the index nodes needs to restart the DCP stream from 0. The bad tombstones would be identified and cleaned up without cleaning any valid partition.
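The second, defensive change ("skip cleanup of valid partition even if tombstone exists") can be sketched as follows. The types and the safeToClean helper are hypothetical, shown only to illustrate the check, not the actual indexer code:

```go
package main

import "fmt"

// inst is a simplified stand-in for an index instance entry:
// "Active" for a live instance, "PendingDelete" for a tombstone.
type inst struct {
	state      string
	partitions []int
}

// safeToClean returns the tombstoned partitions that no active instance
// owns. A partition referenced by both a tombstone and an active
// instance is treated as valid and skipped, so a stale tombstone can no
// longer cause a valid partition's data to be deleted on restart.
func safeToClean(instances []inst) []int {
	active := map[int]bool{}
	for _, in := range instances {
		if in.state == "Active" {
			for _, p := range in.partitions {
				active[p] = true
			}
		}
	}
	var clean []int
	seen := map[int]bool{}
	for _, in := range instances {
		if in.state == "PendingDelete" {
			for _, p := range in.partitions {
				if !active[p] && !seen[p] {
					clean = append(clean, p)
					seen[p] = true
				}
			}
		}
	}
	return clean
}

func main() {
	instances := []inst{
		{state: "Active", partitions: []int{1, 3}},
		{state: "PendingDelete", partitions: []int{1, 2}}, // stale tombstone for P1
	}
	fmt.Println(safeToClean(instances)) // [2]: P1 is valid and survives
}
```

This is what makes failover/recovery upgrades safe even when bad tombstones already exist in the metadata.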

            Couchbase Build Team added a comment -

            Build couchbase-server-6.6.1-9136 contains indexing commit 49b01d6 with commit message:
            MB-42108 [BP 6.6.1] skip cleanup of valid partition even if tombstone exists

            Couchbase Build Team added a comment -

            Build couchbase-server-6.6.1-9136 contains indexing commit cf9416a with commit message:
            MB-42108 [BP 6.6.1] evaluate all tombstones for an index when removing partitions

            Couchbase Build Team added a comment -

            Build couchbase-server-6.6.0-7922 contains indexing commit 777ac53 with commit message:
            MB-42108 [BP 6.6.1] skip cleanup of valid partition even if tombstone exists

            Couchbase Build Team added a comment -

            Build couchbase-server-6.6.0-7922 contains indexing commit eb743dc with commit message:
            MB-42108 [BP 6.6.1] evaluate all tombstones for an index when removing partitions

            Girish Benakappa added a comment -

            Verified with 6.6.1-9192 and 6.6.0-7924. No issues seen with the tests: https://docs.google.com/document/d/1QWiyp5EAv_BHSQGjn7koyvfKMnF3d1l_nQhP2ufL0hM/edit#heading=h.hcywjh2uumv8
            No issues seen with related regression jobs.


              People

              Assignee:
              Mihir Kamdar
              Reporter:
              Deepkaran Salooja
              Votes:
              0
              Watchers:
              6
