  1. Couchbase Server
  2. MB-54414

[BP 6.6.X] XDCR - AdvFilter upgrade happens pre-emptively leading to missed documents


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 6.6.0, 6.6.1, 6.6.2, 7.0.0, 6.6.3, 7.1.0, 7.0.3, 7.0.2, 7.0.1, 6.6.5, 6.6.4, 7.1.1, 7.0.4, 7.1.2, 7.1.4, 7.0.5
    • Fix Version/s: 6.6.6
    • Component/s: XDCR
    • Triage: Untriaged
    • 1
    • No

    Description

      XDCR advanced filtering was introduced in 6.5.0. Customers running a Server version prior to 6.5.0 would not be using advanced filtering, but rather the traditional key-based regex filtering.

      As part of the 6.5.0 implementation, upgrade code was added that automatically converts a key-based regex filter into an advanced filter. However, this code did not check the cluster version before upgrading the filter. As a result, when one node is upgraded from <6.5.0 to >=6.5.0 and rejoins a cluster that is still pre-6.5, the replication's filter is automatically upgraded to REGEXP_CONTAINS(meta.id(), "<filter>") and persisted.
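
The missing check described above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the actual goxdcr change: `Version`, `advFilterMinCompat`, and `UpgradeKeyFilter` are hypothetical names, and the upgraded expression simply mirrors the form quoted in this description.

```go
package main

import "fmt"

// Version is a hypothetical (major, minor) cluster compatibility pair.
type Version struct{ Major, Minor int }

// AtLeast reports whether v is the same as or newer than other.
func (v Version) AtLeast(other Version) bool {
	if v.Major != other.Major {
		return v.Major > other.Major
	}
	return v.Minor >= other.Minor
}

// advFilterMinCompat is the first release with advanced filtering
// (6.5, per the description above).
var advFilterMinCompat = Version{Major: 6, Minor: 5}

// UpgradeKeyFilter converts a key-based regex filter into the advanced
// form only when the whole cluster's compatibility version supports it.
// The boolean result reports whether an upgrade took place.
// Hypothetical helper: escaping of quotes inside keyFilter is not handled.
func UpgradeKeyFilter(keyFilter string, clusterCompat Version) (string, bool) {
	if !clusterCompat.AtLeast(advFilterMinCompat) {
		// Mixed-mode cluster: keep the key-based filter untouched.
		return keyFilter, false
	}
	return `REGEXP_CONTAINS(meta.id(), "` + keyFilter + `")`, true
}

func main() {
	for _, compat := range []Version{{6, 0}, {6, 6}} {
		filter, upgraded := UpgradeKeyFilter("^KU", compat)
		fmt.Printf("compat %d.%d: filter=%s upgraded=%t\n",
			compat.Major, compat.Minor, filter, upgraded)
	}
}
```

The design point is that the decision keys off the cluster-wide compatibility version rather than the version of the node running the upgrade code.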

      A subsequent pipeline restart causes the upgraded filter to be applied on nodes that predate advanced filtering. Those nodes then filter out mutations incorrectly, which results in data loss.

      This should have been tested and caught back in the 6.5.0 timeframe, but it seems unlikely that such a test was run.

      A simple test case could suffice:

      1. Start a 2-node source cluster on 6.0.x.
      2. Create a replication with a filter such as "^KU".
      3. Rebalance one node out, upgrade it to 6.6.X, and rebalance it back in.
      4. The filter gets upgraded. The node on 6.0.X will show the advanced filter instead of the original key-based one.
      5. The pipeline will restart because of the 6.6.X node that rebalanced back in.
      6. Further mutations will not be replicated, even if they match the original filter of "^KU".
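
To illustrate why step 6 drops matching documents, the sketch below assumes that a pre-6.5 node treats whatever filter string is persisted as a plain regular expression applied to the document key. That interpretation is an assumption made for illustration, and `legacyKeyMatch` is a hypothetical helper, not goxdcr code.

```go
package main

import (
	"fmt"
	"regexp"
)

// legacyKeyMatch mimics, under the assumption stated above, a pre-6.5
// node that applies the persisted filter string as a regex on the key.
// A filter that does not compile is treated as matching nothing.
func legacyKeyMatch(filter, key string) bool {
	re, err := regexp.Compile(filter)
	if err != nil {
		return false
	}
	return re.MatchString(key)
}

func main() {
	original := "^KU"
	upgraded := `REGEXP_CONTAINS(meta.id(), "^KU")`

	for _, key := range []string{"KU123", "AB456"} {
		fmt.Printf("key=%s original=%t upgraded=%t\n",
			key,
			legacyKeyMatch(original, key),
			legacyKeyMatch(upgraded, key))
	}
}
```

Under this assumption, a key such as "KU123" passes the original filter but fails the upgraded string, because the key never contains the literal expression text.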

      The upgrade docs at https://docs.couchbase.com/server/current/install/upgrade.html say that 6.6.X is the intermediate release, so this needs to be fixed at least in 6.6.x.
      However, backporting the fix to 7.0.x or 7.1.x should also be considered, given that once 6.6.X is out of the picture, we will need to support upgrading to those versions as the baseline.


          Activity

            build-team (Couchbase Build Team) added a comment:
            Build couchbase-server-6.6.6-10536 contains goxdcr commit 41c8238 with commit message:
            MB-54414: code review comments

            build-team (Couchbase Build Team) added a comment:
            Build couchbase-server-6.6.6-10536 contains goxdcr commit fdc1c03 with commit message:
            MB-54414: implement cooldown to prevent hitting pools/default endpoint too much

            build-team (Couchbase Build Team) added a comment:
            Build couchbase-server-6.6.6-10536 contains goxdcr commit 4e69f88 with commit message:
            MB-54414: advanced filter upgrade fixes

            build-team (Couchbase Build Team) added a comment:
            Build couchbase-server-6.6.6-10536 contains goxdcr commit 6f65d06 with commit message:
            MB-54414: XDCR to implement ability to get cluster compatibility locally

            People

              ayush.nayyar Ayush Nayyar
              neil.huang Neil Huang
