Couchbase Mobile / CM-403

Efficiency improvements for channel set changes


Details

    Description

      Raising this as a high-level idea/suggestion/improvement.

      Our current replicator design is built to ensure consistency in the case of channels (or other filters) changing. We do this by hashing these channels (or other filters - I'm mostly going to refer to channels here though) into the checkpoint ID. As such, if the channels change, we expect to start from 0 (or the last time we replicated that set).
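
      For illustration, a rough sketch of why that happens (plain Java, not the actual CBL checkpoint code - the derivation below is a hypothetical stand-in): because the channel set is folded into the digest that names the checkpoint, any change to the set resolves to a different checkpoint document and therefore a restart from 0.

      // Conceptual sketch only - not the real CBL checkpoint derivation.
      // The point: the channel set is part of the hash, so a different set
      // of channels resolves to a different checkpoint ID.
      import java.nio.charset.StandardCharsets;
      import java.security.MessageDigest;
      import java.util.ArrayList;
      import java.util.Collections;
      import java.util.List;

      final class CheckpointIdSketch {
          static String checkpointId(String localUuid, String remoteUrl, List<String> channels) {
              try {
                  MessageDigest sha = MessageDigest.getInstance("SHA-1");
                  sha.update(localUuid.getBytes(StandardCharsets.UTF_8));
                  sha.update(remoteUrl.getBytes(StandardCharsets.UTF_8));
                  List<String> sorted = new ArrayList<>(channels);
                  Collections.sort(sorted);                          // order-independent
                  for (String ch : sorted) sha.update(ch.getBytes(StandardCharsets.UTF_8));
                  StringBuilder hex = new StringBuilder("cp-");
                  for (byte b : sha.digest()) hex.append(String.format("%02x", b));
                  return hex.toString();
              } catch (Exception e) {
                  throw new IllegalStateException(e);
              }
          }
      }

      With this, checkpointId(uuid, url, [metadata, location_1]) and checkpointId(uuid, url, [metadata, location_2]) are unrelated IDs, which is exactly the "start from 0 unless we've replicated that exact set before" behaviour described above.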

      This approach has a slight downside in that it makes it a little difficult to have a "one true replicator" within your app - if you add to and remove from a list of channels on an ad-hoc basis, you need to keep rechecking all the documents you've already got.

      Taking a particularly bad case - imagine having a metadata channel which contains ~10k docs of metadata that are generally considered to be needed by everyone, and therefore almost everyone replicates this channel. Add to this specific channels for location_N, where a user might be replicating a handful that changes frequently ("today, you're assigned to locations 1, 19, 127..." etc.). In this case, having one replicator with [metadata, location_1, location_19, location_127] will recheck all of metadata even if it only actually needs to pick up 1 change in location_127 - we do of course get the saving that we don't need to pull those documents, but the checking will still be a substantial overhead.
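
      To make that concrete, a sketch of the single-replicator setup using the Couchbase Lite Java API (class/method names as in the Android SDK around 2.x/3.x; platform initialisation is omitted and the endpoint URL and channel names are placeholders):

      import com.couchbase.lite.Database;
      import com.couchbase.lite.Replicator;
      import com.couchbase.lite.ReplicatorConfiguration;
      import com.couchbase.lite.URLEndpoint;

      import java.net.URI;
      import java.util.Arrays;

      public class OneTrueReplicator {
          public static void main(String[] args) throws Exception {
              Database database = new Database("appdb");
              URLEndpoint endpoint = new URLEndpoint(new URI("wss://sg.example.com:4984/appdb"));

              // Day 1: one replicator covering the shared metadata channel plus today's locations.
              ReplicatorConfiguration config = new ReplicatorConfiguration(database, endpoint);
              config.setChannels(Arrays.asList("metadata", "location_1", "location_19", "location_127"));
              Replicator replicator = new Replicator(config);
              replicator.start();

              // Day 2: the assignment changes. The channel set - and with it the checkpoint ID -
              // changes, so this run re-checks the ~10k metadata docs even though only the
              // location channels actually differ.
              replicator.stop();
              config.setChannels(Arrays.asList("metadata", "location_2"));
              replicator = new Replicator(config);
              replicator.start();
          }
      }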

      A handful of thoughts on this:

      • Sequential Replicators as a 1st/2nd class pattern within CBL.
        • This is actually a pattern I've adopted in the past. At an app level, you'd often want to have a concept of "replicated" or "not replicated", and this doesn't fit well with having multiple discrete replicators (a MetadataReplicator and a LocationsReplicator, for example). Instead (or rather, as a refinement of that), it's nice to daisy-chain replicators. This way, you can easily have a MetadataReplicator which is expected to never/very infrequently change channels, and a more dynamic LocationsReplicator which can change channels with minimal overhead (see the first sketch after this list).
      • Discrete Channel Replications
        • Suspect that this would have perf implications, but if CBL were to demux the channels and replicate each one individually, it could checkpoint each channel as its own replicator.
      • Itemised Checkpoints
        • Rather than checkpointing for a given set of channels, define the checkpoint based on the usual other params (CBL_UUID, SG_URL, etc.) and maintain a checkpoint per channel within that. Obviously, this makes the checkpoints larger, and implies an upper bound on the number of channels you can fit into a checkpoint... (see the second sketch after this list).
      • Encourage channel sets at the user level.
        • We already have the functionality for this in Sync Gateway - simply have users replicate with */no filter as normal, and assign/unassign channels from that user as needed. However, this is costly on the SG side, and limiting in that even if you allow users to self-assign channels, it's effectively a hard filter at the SG level - e.g. I can't go and grab the odd document from another channel; I need to add the channel to my user and effectively grab all of it.
      • More Flags! (combined with any of the above)
        • Not always the best option, but picking any of these options and allowing it as a non-default mode via a ReplicatorConfig option feels reasonable. Being able to set FilterOptimise.STATIC vs FilterOptimise.DYNAMIC allows us to keep the same behaviour by default, but gives the dev a potential benefit. Somewhat similar to High/Low IO Priority in Server.
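
      As a sketch of the daisy-chaining idea from the first bullet (same Java API caveats as above; the activity-level enum name is the 3.x one and differs in older SDKs): the metadata replicator keeps a stable channel set, so its checkpoint survives, and only the small locations replicator restarts from 0 when the location assignment changes.

      import com.couchbase.lite.Database;
      import com.couchbase.lite.Replicator;
      import com.couchbase.lite.ReplicatorActivityLevel;
      import com.couchbase.lite.ReplicatorConfiguration;
      import com.couchbase.lite.URLEndpoint;

      import java.util.Collections;
      import java.util.List;

      public class SequentialReplicators {
          public static void start(Database db, URLEndpoint endpoint, List<String> todaysLocations) {
              // Stable channel set: its checkpoint survives day-to-day location changes.
              ReplicatorConfiguration metadataCfg = new ReplicatorConfiguration(db, endpoint);
              metadataCfg.setChannels(Collections.singletonList("metadata"));
              Replicator metadataReplicator = new Replicator(metadataCfg);

              // Volatile channel set: only this (much smaller) replication restarts from 0.
              ReplicatorConfiguration locationsCfg = new ReplicatorConfiguration(db, endpoint);
              locationsCfg.setChannels(todaysLocations);
              Replicator locationsReplicator = new Replicator(locationsCfg);

              // Daisy-chain: kick off the locations pull once the metadata pull has finished,
              // so the app still sees a single "replicated / not replicated" progression.
              metadataReplicator.addChangeListener(change -> {
                  if (change.getStatus().getActivityLevel() == ReplicatorActivityLevel.STOPPED
                          && change.getStatus().getError() == null) {
                      locationsReplicator.start();
                  }
              });
              metadataReplicator.start();
          }
      }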

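      And a rough sketch of what an itemised checkpoint (third bullet) could look like if progress were tracked per channel under one checkpoint keyed only on the usual params - purely a hypothetical shape, not a proposal for the actual checkpoint schema:

      import java.util.HashMap;
      import java.util.Map;

      // Hypothetical itemised checkpoint: one remote sequence per channel under a single
      // checkpoint keyed by CBL_UUID + SG_URL. Adding or removing a channel touches only
      // that channel's entry; the others keep their progress.
      final class ItemisedCheckpoint {
          private final Map<String, Long> lastSeqByChannel = new HashMap<>();

          long lastSequence(String channel) {
              return lastSeqByChannel.getOrDefault(channel, 0L);   // unknown channel -> start from 0
          }

          void recordProgress(String channel, long remoteSeq) {
              lastSeqByChannel.merge(channel, remoteSeq, Math::max);
          }

          void removeChannel(String channel) {
              lastSeqByChannel.remove(channel);                    // channel dropped from the set
          }
      }

      The map grows linearly with the number of channels, which is where the upper bound on how many channels fit into one checkpoint comes from.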

          People

            priya.rajagopal Priya Rajagopal
            James Flather (Inactive)
            Votes: 0
            Watchers: 3

