Details
- Improvement
- Resolution: Unresolved
- Major
- None
- None
Description
Raising this as a high-level idea/suggestion/improvement.
Our current replicator design is built to ensure consistency in the case of channels (or other filters) changing. We do this by hashing those channels (I'm mostly going to refer to channels here, but the same applies to other filters) into the checkpoint ID. As such, if the channels change, we expect to start from 0 (or from the last time we replicated that exact set).
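To make that concrete, here's a toy sketch of the idea (Python; the names and hashing details are illustrative, not actual CBL internals): the checkpoint ID bakes in a hash of the channel set, so a different set means a different checkpoint and a restart from zero.

```python
# Toy model of "channels hashed into the checkpoint ID".
# Not the real CBL scheme - purely illustrative.
import hashlib

def checkpoint_id(cbl_uuid: str, sg_url: str, channels: list[str]) -> str:
    # Sort so that channel *order* doesn't change the identity,
    # but channel *membership* does.
    filter_hash = hashlib.sha256(",".join(sorted(channels)).encode()).hexdigest()[:8]
    return f"{cbl_uuid}:{sg_url}:{filter_hash}"

a = checkpoint_id("uuid-1", "wss://sg.example/db", ["metadata", "location_1"])
b = checkpoint_id("uuid-1", "wss://sg.example/db", ["metadata", "location_2"])
# a != b: changing the channel set produces a new checkpoint ID,
# so replication starts over for the new set.
```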
This approach has a slight downside in that it makes it a little difficult to have a "one true replicator" within your app - if you add to and remove from a list of channels on an ad-hoc basis, you need to keep rechecking all the documents you've already got.
Taking a particularly bad case - imagine a metadata channel containing ~10k docs that are generally considered to be needed by everyone, and therefore almost everyone replicates this channel. Add to this specific location_N channels, where a user might be replicating a handful which changes frequently ("today, you're assigned to locations 1, 19, 127..." etc.). In this case, having one replicator with [metadata, location_1, location_19, location_127] will recheck all of metadata even if it only actually needs to pick up 1 change in location_127 - we do of course get the saving that we don't need to pull those documents again, but the checking will still be a substantial overhead.
A handful of thoughts on this:
- Sequential Replicators as a 1st/2nd class pattern within CBL.
- This is actually a pattern I've adopted in the past. At an app level, you often want a single concept of "replicated" or "not replicated", and that doesn't fit well with having multiple discrete replicators (a MetadataReplicator and a LocationsReplicator, for example). Instead (or rather, as a refinement of that) it's nice to daisy-chain replicators. This way, you can easily have a MetadataReplicator which is expected to never/very infrequently change channels, followed by a more dynamic LocationsReplicator which can change channels with minimal overhead.
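A minimal sketch of the daisy-chain pattern (Python; the Replicator class here is a stand-in, not the CBL API - in real CBL you'd hook a change listener and watch for the IDLE state):

```python
# Hypothetical daisy-chain: start the next replicator when the previous
# one goes idle. Stand-in classes only - not Couchbase Lite's API.
from typing import Callable, Optional

class Replicator:
    def __init__(self, name: str, on_idle: Optional[Callable[[], None]] = None):
        self.name = name
        self.on_idle = on_idle  # fired when this replicator catches up
        self.started = False

    def start(self) -> None:
        self.started = True
        # Toy model: pretend we catch up immediately, then go idle.
        if self.on_idle:
            self.on_idle()

order: list[str] = []
# The dynamic, frequently-changing replicator runs second...
locations = Replicator("locations", on_idle=lambda: order.append("locations"))
# ...chained off the stable metadata replicator, which runs first.
metadata = Replicator(
    "metadata",
    on_idle=lambda: (order.append("metadata"), locations.start()),
)
metadata.start()
```

The app-level win is that "fully replicated" becomes "the last replicator in the chain went idle", rather than tracking N independent replicators.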
- Discrete Channel Replications
- I suspect this would have perf implications, but if CBL were to demux the channels and replicate each one individually, it could checkpoint each channel as its own replicator.
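A sketch of what the demux buys you (Python, illustrative only): each channel gets its own checkpoint, so adding or removing one channel leaves the others' checkpoints untouched.

```python
# Toy model of per-channel demux: one checkpoint ID per channel rather
# than one per channel *set*. Names/hashing are illustrative only.
import hashlib

def per_channel_checkpoints(cbl_uuid: str, sg_url: str,
                            channels: list[str]) -> dict[str, str]:
    return {
        ch: f"{cbl_uuid}:{sg_url}:" + hashlib.sha256(ch.encode()).hexdigest()[:8]
        for ch in channels
    }

cps = per_channel_checkpoints("uuid-1", "wss://sg.example/db",
                              ["metadata", "location_1"])
# Swapping location_1 for location_127 later would not disturb the
# metadata entry, so metadata is not rechecked from zero.
```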
- Itemised Checkpoints
- Rather than checkpointing for a given set of channels, define the checkpoint based on the usual other params (CBL_UUID, SG_URL, etc.) and maintain a checkpoint per channel within it. Obviously, this makes the checkpoints larger, and implies an upper bound on the number of channels you can fit into a checkpoint...
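Roughly, the shape would be one checkpoint document holding a per-channel sequence watermark (Python sketch; this is not the real CBL checkpoint format):

```python
# Hypothetical itemised checkpoint: one document keyed by the usual
# params, with a last-sequence entry per channel. Illustrative only.
checkpoint = {
    "client": "CBL_UUID:SG_URL",
    "channels": {"metadata": 10342, "location_1": 57},
}

def update_channel_seq(cp: dict, channel: str, seq: int) -> None:
    # Only the named channel's watermark moves; all others are untouched,
    # so adding a channel never resets progress on the existing ones.
    cp["channels"][channel] = max(cp["channels"].get(channel, 0), seq)

update_channel_seq(checkpoint, "location_127", 3)
```

The size concern falls out directly: the checkpoint grows linearly with the number of channels, which is what implies the upper bound mentioned above.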
- Encourage channel sets at the user level.
- We already have the functionality for this in Sync Gateway - simply have users replicate with */no filter and assign/unassign channels to/from that user as needed. However, this is costly on the SG side, and limiting in that even if you allow users to self-assign channels, it's effectively a hard filter at the SG level - e.g. I can't go and grab the odd document in another channel; I need to add the channel to my user and effectively grab all of it.
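For context, assigning channels at the user level would typically be done by an app server against the Sync Gateway admin REST API (PUT on the database's user endpoint with an admin_channels list). Shown here only as the request payload; see the SG admin REST docs for the exact endpoint shape:

```python
# Payload an app server might PUT to SG's admin API to (re)assign a
# user's channels. Channel names are from the example above.
import json

def assign_channels_payload(channels: list[str]) -> str:
    # Sorted purely for deterministic output.
    return json.dumps({"admin_channels": sorted(channels)})

body = assign_channels_payload(["metadata", "location_1", "location_19"])
```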
- More Flags! (combined with any of the above)
- Not always the best option, but picking any of these options and allowing it as a non-default mode via ReplicatorConfig feels reasonable. Being able to set FilterOptimise.STATIC vs FilterOptimise.DYNAMIC keeps the same default behaviour, but provides the dev with a potential benefit. Somewhat similar to High/Low IO Priority in Server.