Details

Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: Mobile 3.0, Mobile 3.1
Component/s: documentation
Labels: None
Epic Link: DOCS Sync Gateway Bugs
Story Points: 0

Description

The affected pages are the Sync Gateway 3.0 and 3.1 documentation pages:

https://docs.couchbase.com/sync-gateway/3.0/deploy-dr-cluster.html

https://docs.couchbase.com/sync-gateway/3.1/deploy-dr-cluster.html (current doc page)

Problems
Reading this doc page, I think there are two major issues in step 6 of the Activation section:

1. FIRST ISSUE: Wrong cluster

6. Flush all buckets in the new Primary cluster; this prevents any writes coming into the Primary cluster that had not been replicated when XDCR was stopped.

The beginning of the sentence is wrong. It should read “Flush all buckets in the (old) Primary cluster” or “Flush all buckets in the new Disaster Recovery cluster”.

Note that this other (quite similar) doc page, https://docs.couchbase.com/sync-gateway/current/setting-up-dr-cluster.html, does NOT have this first issue. It states:

Flush all replicated buckets in the Primary cluster; as a precaution against any spurious writes coming into the Primary cluster that had not been replicated when XDCR was stopped.

2. SECOND ISSUE: Questionable justification
The justification given for the flush does not hold up, in my view. If you rely on the XDCR timestamp-based conflict resolution strategy, then even supposing some data “had not been replicated when XDCR was stopped”, there will be no XDCR syncing issue between the NEW Primary cluster and the NEW Disaster Recovery cluster.

I cannot see where spurious writes could come from: any write made in the NEW Primary cluster is now synced to the NEW Disaster Recovery cluster.
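To make the reasoning above concrete, here is a minimal sketch of a last-write-wins merge keyed on a timestamp. It is a simplified model only, not XDCR's actual implementation: the `Doc` type, the integer `ts` field, and the `lww_merge` helper are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    value: str
    ts: int  # hypothetical last-modified timestamp (simplified; not XDCR's real metadata)


def lww_merge(a, b):
    """Merge two key -> Doc maps, keeping the doc with the newer timestamp per key."""
    merged = dict(a)
    for key, doc in b.items():
        if key not in merged or doc.ts > merged[key].ts:
            merged[key] = doc
    return merged


# Old Primary still holds writes that were never replicated before XDCR stopped.
old_primary = {
    "k1": Doc("stale write, never replicated", ts=100),
    "k2": Doc("only on old primary", ts=90),
}
# New Primary (the former DR cluster) has taken newer writes since failover.
new_primary = {"k1": Doc("written after failover", ts=200)}

converged = lww_merge(new_primary, old_primary)
```

In this toy model the merge converges to the same state regardless of direction, and the newer post-failover write on `k1` wins without any flush. Whether the leftover `k2` document on the old Primary counts as a problem is exactly the question step 6 should justify.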

In short: does step 6 still apply? I would say no. I would recommend removing step 6 unless there is a real justification (please provide details) for flushing buckets on the NEW Disaster Recovery cluster.

I would also add that, if a flush really is needed (which I doubt), then:

- Re-populating the NEW Disaster Recovery cluster from scratch will take some time (seconds or minutes if the customer does not have much data, which is acceptable).
- Re-creating GSI indexes (not to mention FTS indexes) could take hours, if not days.

This means that, once the first DR cluster has been activated, customers must operate for hours or days without a ready DR cluster (i.e. the “NEW Disaster Recovery cluster”), which does not seem acceptable.

These two issues and my proposal were discussed on the Slack #mobile-support channel:
https://couchbase.slack.com/archives/C0N7DCWM8/p1704382413701649

Thanks.

Attachments

Screenshot 2024-01-04 at 16.06.23.png
139 kB
04/Jan/24 8:24 AM

People

Assignee: Elliot Hunter

Reporter: Fabrice Leray

Votes: 0

Watchers: 2

Dates

Created: 04/Jan/24 8:32 AM

Updated: 08/Apr/24 3:23 AM

Mobile : fix "Using XDCR for Disaster Recovery in Mobile Clusters" doc pages
