[BP 7.1.X] XDCR - AdvFilter upgrade happens pre-emptively leading to missed documents

Description

XDCR Advanced Filtering was introduced in 6.5.0. Customers running Server versions prior to 6.5.0 are not using Advanced Filtering, but rather the traditional key-based regex filtering.

As part of the 6.5.0 implementation, upgrade code was added that automatically converts a key-based regex filter into an advanced filter. However, the upgrade code did not check the cluster version before upgrading the filter. This means that when one node is upgraded from <6.5.0 to >=6.5.0 and rejoins a cluster that is still pre-6.5, the replication's filter is automatically upgraded to REGEXP_CONTAINS(meta.id(), "<filter>") and persisted.
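
The missing check can be sketched as follows. This is a minimal illustration, not the goxdcr fix: the `version` type, `allNodesSupportAdvFilter`, and `maybeUpgradeFilter` are hypothetical names, and the only facts taken from this ticket are the 6.5 threshold and the shape of the upgraded expression.

```go
package main

import "fmt"

// version is a hypothetical (major, minor) pair used only for this sketch.
type version struct{ major, minor int }

// atLeast reports whether v is the same as or newer than o.
func (v version) atLeast(o version) bool {
	return v.major > o.major || (v.major == o.major && v.minor >= o.minor)
}

// advFilterMin is the release that introduced Advanced Filtering (6.5).
var advFilterMin = version{6, 5}

// allNodesSupportAdvFilter is true only when every node is on 6.5 or later.
// An empty node list is treated as "not safe to upgrade".
func allNodesSupportAdvFilter(nodes []version) bool {
	if len(nodes) == 0 {
		return false
	}
	for _, n := range nodes {
		if !n.atLeast(advFilterMin) {
			return false
		}
	}
	return true
}

// maybeUpgradeFilter returns the filter to persist and whether it was upgraded.
// The key-based regex is left untouched while any pre-6.5 node remains.
func maybeUpgradeFilter(keyRegex string, nodes []version) (string, bool) {
	if !allNodesSupportAdvFilter(nodes) {
		return keyRegex, false
	}
	// Expression shape copied from the ticket text above.
	return `REGEXP_CONTAINS(meta.id(), "` + keyRegex + `")`, true
}

func main() {
	mixed := []version{{6, 0}, {6, 6}}    // one node still on 6.0.x
	upgraded := []version{{6, 6}, {6, 6}} // all nodes on 6.6.x

	f, ok := maybeUpgradeFilter("^KU", mixed)
	fmt.Println(f, ok)
	f, ok = maybeUpgradeFilter("^KU", upgraded)
	fmt.Println(f, ok)
}
```

The design choice is to gate on the lowest version in the cluster rather than on the local node's version, so a single upgraded node cannot persist a filter that its peers cannot interpret.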

A subsequent pipeline restart causes the upgraded filter to be applied on nodes that are still running a pre-Advanced Filtering version. Mutations are then incorrectly filtered out, which causes data loss.
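
To illustrate why matching documents get dropped, here is a small sketch. It assumes the pre-6.5 node treats the persisted filter string as a plain key regex, and it uses Go's `regexp` package as a stand-in for whatever engine the older node uses. The helper name `keyRegexReplicates` and the sample keys are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// keyRegexReplicates mimics key-based filtering: a document is replicated
// only if its key matches the filter when the filter is read as a regex.
// A filter that does not compile replicates nothing and returns the error.
func keyRegexReplicates(filter, docKey string) (bool, error) {
	re, err := regexp.Compile(filter)
	if err != nil {
		return false, err
	}
	return re.MatchString(docKey), nil
}

func main() {
	original := "^KU"
	// Upgraded expression in the form given in this ticket.
	upgraded := `REGEXP_CONTAINS(meta.id(), "^KU")`

	for _, key := range []string{"KU123", "AB456"} {
		before, _ := keyRegexReplicates(original, key)
		after, _ := keyRegexReplicates(upgraded, key)
		fmt.Printf("key=%s original=%v upgraded=%v\n", key, before, after)
	}
}
```

Under that assumption, a key such as "KU123" passes the original filter but not the upgraded string, because the whole expression is now the pattern and the key does not contain it.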

This should have been tested and caught back in the 6.5.0 timeframe, but it seems unlikely that such a test was run.

A simple test case could suffice:

  1. Start cluster on 6.0.x, 2-node source

  2. Create replication with a filter such as "^KU"

  3. Rebalance one node out. Upgrade to 6.6.X, rebalance back in.

  4. Filter gets upgraded. The node on 6.0.X will show the advanced filter instead of original key-based one.

  5. Pipeline will restart because of the 6.6.X node that rebalanced back in

  6. Further mutations, even if they match the original filter of "^KU", will not be replicated.

The upgrade docs (https://docs.couchbase.com/server/current/install/upgrade.html) say that 6.6.X is the intermediate release, so this needs to be fixed at least in 6.6.x.
However, backporting the fix to 7.0.x or 7.1.x should be considered, given that once 6.6.X is out of the picture, we'll need to support upgrading to those versions as the baseline.

Environment

None

Link to Log File, atop/blg, CBCollectInfo, Core dump

None

Release Notes Description

None

Attachments

1

Activity

CB robot December 6, 2022 at 12:12 PM

Build couchbase-server-8.0.0-1185 contains indexing commit 086671b with commit message:
Use sync=true to get stats to avoid sporadic failures

CB robot December 6, 2022 at 5:54 AM

Build couchbase-server-7.5.0-3371 contains indexing commit 086671b with commit message:
Use sync=true to get stats to avoid sporadic failures

Varun Velamuri December 5, 2022 at 11:30 PM

Due to a typo, the patch MB-54416 Use sync=true to get stats to avoid sporadic failures got into this MB. Apologies for that.

CB robot November 10, 2022 at 6:47 PM

Build couchbase-server-7.1.3-3478 contains goxdcr commit 4a5ae58 with commit message:
: fix locking issues for XDCRCompTopologySvc

CB robot November 10, 2022 at 6:38 PM

Build couchbase-server-7.2.0-5018 contains goxdcr commit 4a5ae58 with commit message:
: fix locking issues for XDCRCompTopologySvc

Fixed
Details

Is this a Regression?

No

Triage

Untriaged

Created November 4, 2022 at 4:45 AM
Updated February 24, 2025 at 7:05 PM
Resolved November 8, 2022 at 11:11 PM