Details

Type: Epic
Resolution: Unresolved
Priority: Major
Fix Version/s: Morpheus
Affects Version/s: None
Component/s: XDCR
Labels:
None

Epic Name:
XDCR Conflict Logging
Epic Status:
To Do
Story Points:
0

Description

Multiple customers have asked that XDCR log whenever it resolves a conflict (MB-15561).

By a conflict, customers basically mean that the two documents being compared were updated on different clusters. For example, the same doc ID was updated by an application connected to cluster1 and, at the same time (within the conflict window), by an application connected to cluster2.

This is problematic because XDCR does not detect conflicts: every replication is a conflict resolution, in which one document wins over another without any understanding of causality. For MB-15561 there is a PRD for logging whenever the target cluster wins a conflict. But even in the simple case of one-way replication in an active-passive scenario, this could produce (potentially many) false positives whenever an old mutation is sent from the source more than once because of a backfill, network instability or latency, a resend after errors, or a complex topology. If a customer then has questions, someone would have to investigate what was happening in the environment and make a best guess at why the target document won.

Keeping document history information (modification time and originating cluster) in version vectors was a way of detecting conflicts for custom conflict resolution. Even without the custom conflict resolution functionality, however, we should be able to use version vectors (already developed in beta) simply to detect true conflicts, so that we can log when one occurs.

This means we would need to enable the HLV (hybrid logical vector) feature in XDCR. When a conflict is detected, we would log the true conflict and then resolve it using the configured conflict resolution mode (Sequence or Timestamp). Customers would not have to provide a custom merge function to resolve the conflict. The ability to detect true conflicts would allow us to log that event, including the two documents that were in conflict.
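To illustrate what detecting a true conflict means here, the following is a minimal sketch of a version vector comparison. It assumes a simplified vector that maps a cluster id to the latest version seen from that cluster; the actual HLV format used by XDCR is not described in this ticket, and the names `Ordering` and `compare_version_vectors` are hypothetical.

```python
from enum import Enum


class Ordering(Enum):
    EQUAL = "equal"
    SOURCE_NEWER = "source_newer"
    TARGET_NEWER = "target_newer"
    CONFLICT = "conflict"


def compare_version_vectors(source: dict[str, int], target: dict[str, int]) -> Ordering:
    """Compare two simplified version vectors (cluster id -> latest known version).

    Assumption: a missing cluster id counts as version 0.
    """
    clusters = source.keys() | target.keys()
    source_ahead = any(source.get(c, 0) > target.get(c, 0) for c in clusters)
    target_ahead = any(target.get(c, 0) > source.get(c, 0) for c in clusters)
    if source_ahead and target_ahead:
        # Each side has seen an update the other has not: a true conflict.
        return Ordering.CONFLICT
    if source_ahead:
        return Ordering.SOURCE_NEWER
    if target_ahead:
        return Ordering.TARGET_NEWER
    return Ordering.EQUAL


# Both clusters updated the document independently: a true conflict.
concurrent = compare_version_vectors(
    {"cluster1": 2, "cluster2": 1},
    {"cluster1": 1, "cluster2": 2},
)

# A resent old mutation: the target already contains it, so this is not a conflict.
resend = compare_version_vectors({"cluster1": 1}, {"cluster1": 2})
```

Under these assumptions, the resend case is the kind of false positive that logging every target win would report, while only the concurrent case would be logged as a true conflict.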

Basic requirements would be:

- Detect true conflicts (the HLV feature makes this possible)
- Log or store the document IDs, document bodies, and conflicting history information in a system collection/conflict bucket, in a way that makes the information easy to access programmatically (for example via APIs) so that it can be acted on quickly if needed
- Resolve the conflict using the currently configured conflict resolution mode (Sequence or Timestamp), so that no custom way of resolving the conflicting documents is needed
- Allow online (no downtime) enablement of this feature (enabling HLV, logging true conflicts, etc.)
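The first three requirements can be sketched as one flow: detect, log, then resolve with the configured mode. This is only an illustration under stated assumptions. The `Doc` fields, the `replicate` function, the in-memory `conflict_log` list (standing in for a system collection or conflict bucket), and the simplified tie-break order in `source_wins` are all hypothetical, not the XDCR implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    body: dict
    cas: int        # timestamp-like value, compared first in Timestamp mode
    rev_seqno: int  # per-document revision counter, compared first in Sequence mode
    hlv: dict[str, int] = field(default_factory=dict)  # cluster id -> version (simplified)


def is_true_conflict(a: Doc, b: Doc) -> bool:
    """True when each document has seen an update the other has not."""
    clusters = a.hlv.keys() | b.hlv.keys()
    a_ahead = any(a.hlv.get(c, 0) > b.hlv.get(c, 0) for c in clusters)
    b_ahead = any(b.hlv.get(c, 0) > a.hlv.get(c, 0) for c in clusters)
    return a_ahead and b_ahead


def source_wins(source: Doc, target: Doc, mode: str) -> bool:
    """Simplified resolution: compare the primary field, then the other as a tie-break."""
    if mode == "Sequence":
        return (source.rev_seqno, source.cas) > (target.rev_seqno, target.cas)
    if mode == "Timestamp":
        return (source.cas, source.rev_seqno) > (target.cas, target.rev_seqno)
    raise ValueError(f"unknown conflict resolution mode: {mode}")


def replicate(source: Doc, target: Doc, mode: str, conflict_log: list) -> Doc:
    """Log a record only for true conflicts, then resolve with the configured mode."""
    if is_true_conflict(source, target):
        conflict_log.append({
            "doc_id": source.doc_id,
            "source": {"body": source.body, "hlv": source.hlv},
            "target": {"body": target.body, "hlv": target.hlv},
            "mode": mode,
        })
    return source if source_wins(source, target, mode) else target
```

The point of the design is that detection and resolution stay separate: the HLV comparison decides only whether to write a log record, and the winner is still chosen by the existing mode, so no custom merge function is involved.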

The drawbacks of enabling HLV and true-conflict detection would be a performance penalty (for detecting conflicts) and an increase in document metadata size (to keep version history for each document).

Note 1: Some performance tests were done with version vectors (aka hybrid logical vectors, HLV) a long time ago. I don't recall the exact numbers, but I believe the performance penalty was less than ~5% for replication without any conflicts. When there are conflicts, each conflict (or the rate of conflicts) incurs an additional performance penalty.

Note 2: If a document can be modified by 3 different clusters within an hour, the size overhead for a 2KB document would be around 5.5% (from a previous estimate). The storage size increase (caused by the larger metadata) would therefore be dynamic as well, depending on the number and rate of modifications to the same document in different clusters.
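As a rough illustration of what the 5.5% figure means in bytes, the arithmetic can be written out as below. It assumes "2KB" means 2048 bytes; the function name and the one-million-document scale-up are hypothetical and only show how the estimate would grow with document count.

```python
def metadata_overhead_bytes(doc_size_bytes: int, overhead_percent: float) -> float:
    """Extra bytes per document for a given percentage overhead."""
    return doc_size_bytes * overhead_percent / 100


doc_size = 2 * 1024  # assumption: "2KB" is 2048 bytes
per_doc = metadata_overhead_bytes(doc_size, 5.5)

# Hypothetical scale-up: one million such documents, expressed in MiB.
per_million_mib = per_doc * 1_000_000 / (1024 ** 2)

print(f"{per_doc:.2f} extra bytes per document")
print(f"{per_million_mib:.1f} MiB extra per million documents")
```

That works out to about 113 extra bytes per document, or a little over 107 MiB per million documents of that size, if every document carried the full history from the estimate.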

There would also be additional overhead because the document history information needs to be pruned regularly; otherwise, each document could become quite large.
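One possible shape for such pruning is sketched below. The policy is an assumption for illustration only: the ticket does not say how XDCR prunes history. Here, entries older than a time window are dropped, and the most recent entry is always kept so that the document never loses its current version. The name `prune_history` and the time units are hypothetical.

```python
def prune_history(history: dict[str, int], now: int, window: int) -> dict[str, int]:
    """Drop history entries older than `window`, always keeping the newest one.

    `history` maps a cluster id to the time of that cluster's last modification.
    """
    if not history:
        return {}
    latest_cluster = max(history, key=history.get)
    return {
        cluster: modified_at
        for cluster, modified_at in history.items()
        if now - modified_at <= window or cluster == latest_cluster
    }


# With a one-hour window (3600 s), the stale cluster1 entry is removed.
pruned = prune_history({"cluster1": 100, "cluster2": 4000, "cluster3": 3900}, now=4000, window=3600)
```

A shorter window keeps metadata smaller but narrows the period in which a true conflict can still be recognized, so the window length is a trade-off between storage overhead and detection coverage.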

Attachments

Issue Links

relates to

MB-15561 Log when XDCR resolves a conflict

Open

MB-62214 [UI] XDCR Conflict Logging Feature

Open

MB-60961 XDCR - Decrease HLV size via CAS delta computation

Closed

links to

Design Doc

PRD

People

Assignee: Neil Huang

Reporter: Hyun-Ju Vega

Votes: 0

Watchers: 8

Dates

Created: 05/Oct/23 3:51 PM

Updated: 7 hours ago

XDCR: Detect conflicts to be able to log when a true conflict is resolved
