Couchbase Gateway / CBG-4016

[3.1.9 backport] nextSequenceGreaterThan should update to current _sync:seq before releasing sequences

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 3.1.9
    • 3.1.0
    • SyncGateway
    • Security Level: Public
    • None
    • CBG Sprint 152
    • 1

    Description

      nextSequenceGreaterThan currently releases sequences based on the highest sequence currently reserved by that node.  In most cases this is appropriate - if the node is under load this value will be up to date with _sync:seq, and if it's been idle for 1.5 seconds it will also be up to date (previous allocation will have been released).

      However, within that 1.5 second window it's possible for _sync:seq to be moved ahead by another node under heavy load, such that the last allocated sequence on the current node is behind _sync:seq by thousands or tens of thousands.  In this scenario nextSequenceGreaterThan will increment _sync:seq by the gap between the target existingSequence and the node's stale last allocated sequence, even though _sync:seq is actually higher than the target existingSequence passed to the function.

      If multiple writes for a document are occurring on multiple low-load nodes (close to) concurrently, the problem can be amplified.  Consider the following, for low-load nodes c1, c2, c3 and high-volume node i1.

      1. Allocators for c1, c2 and c3 have reserved sequences c1:(101-110), c2:(111-120), c3:(121-130) at time t
      2. Node i1 allocates 10000 sequences in 1 second, moving _sync:seq to 10130
      3. Node i1 updates doc foo, setting sequence to 10000
      4. Node c1 updates doc foo.  Triggers a call to nextSequenceGreaterThan(10000), which results in incrementing _sync:seq by 9890 (10000 - 110). Updates foo, setting sequence to 19890
      5. Node c2 updates doc foo.  Triggers a call to nextSequenceGreaterThan(19890), which results in incrementing _sync:seq by 19770 (19890 - 120).  Updates foo, setting sequence to 39660
      6. Node c3 updates doc foo.  Triggers a call to nextSequenceGreaterThan(39660), which results in incrementing _sync:seq by 39530 (39660 - 130).  Updates foo, setting sequence to 79190

      This can result in near-exponential growth in released sequences, continuing until every node with a stale allocator has written the document.
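The amplification can be recomputed with a small simulation. This is only a sketch of the scenario as described, not Sync Gateway code: the `node`, `step`, `buggyNext` and `simulate` names are hypothetical, and it follows the ticket's simplified accounting, in which the gap is measured from the node's stale local maximum and the document's new sequence is taken to be the target plus that gap.

```go
package main

import "fmt"

// node is a hypothetical, simplified model of one node's sequence allocator.
type node struct {
	name string
	max  uint64 // highest sequence this node has reserved (stale under low load)
}

// step records what one write released and where doc foo's sequence ended up.
type step struct {
	node     string
	released uint64
	docSeq   uint64
}

// buggyNext models the pre-fix behaviour in the ticket's simplified terms:
// the gap is computed from the node's stale local max rather than from the
// current _sync:seq, and _sync:seq is incremented by that whole gap.
func buggyNext(n node, syncSeq *uint64, existing uint64) step {
	gap := existing - n.max // assumes existing > n.max, as in the scenario
	*syncSeq += gap
	return step{node: n.name, released: gap, docSeq: existing + gap}
}

// simulate replays the scenario: i1 has already moved _sync:seq to 10130
// and written doc foo at sequence 10000; c1, c2 and c3 then write in turn.
func simulate() ([]step, uint64) {
	syncSeq := uint64(10130)
	docSeq := uint64(10000)
	nodes := []node{{name: "c1", max: 110}, {name: "c2", max: 120}, {name: "c3", max: 130}}
	var steps []step
	for _, n := range nodes {
		s := buggyNext(n, &syncSeq, docSeq)
		docSeq = s.docSeq
		steps = append(steps, s)
	}
	return steps, syncSeq
}

func main() {
	steps, syncSeq := simulate()
	for _, s := range steps {
		fmt.Printf("%s released %d sequences, doc foo now at %d\n", s.node, s.released, s.docSeq)
	}
	fmt.Println("final _sync:seq:", syncSeq)
}
```

Each write roughly doubles the number of released sequences, because the target passed to the next node already includes the previous node's over-correction.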

      This bug is generally rare, as it's unusual to trigger multiple successful writes to the same document within the sequence allocation batch window.  An exception may be a document that's being frequently updated via import (many times per second) and has connected clients on each node replicating the updates.  In this situation the connected clients may trigger on-demand import: they are notified of the imported update, but when they attempt to replicate it, they detect the doc has been updated on the server and do an on-demand import.  If the document is being rapidly updated on the server, this could result in concurrent updates to the same document from multiple nodes and sequence allocators.

      The proposed fix is to release the current allocator batch and fetch the current value of _sync:seq when calling nextSequenceGreaterThan, and only release additional sequences when _sync:seq is still less than the target existingSequence.  This will limit the correction to sequence values that weren't allocated in the normal way in the cluster, and exclude cases that are only allocator latency.
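A minimal sketch of the proposed behaviour, under stated assumptions: `syncSeqCounter` is a hypothetical in-memory stand-in for the _sync:seq document, `allocator` is a hypothetical simplification of a node's batch state, and the fast path (serving the request from an unused sequence already in the local batch) is omitted. It is meant to illustrate the ordering of the steps, not the actual Sync Gateway implementation.

```go
package main

import "fmt"

// syncSeqCounter is a hypothetical in-memory stand-in for _sync:seq.
type syncSeqCounter struct{ value uint64 }

func (c *syncSeqCounter) get() uint64 { return c.value }

func (c *syncSeqCounter) incr(n uint64) uint64 {
	c.value += n
	return c.value
}

// allocator is a hypothetical simplification of a node's batch state:
// sequences last+1..max are reserved locally but not yet used.
type allocator struct {
	last, max uint64
}

// nextSequenceGreaterThan returns a sequence above existing, along with the
// number of sequences that had to be released as unused.
func (a *allocator) nextSequenceGreaterThan(c *syncSeqCounter, existing uint64) (seq, released uint64) {
	// 1. Release whatever is left of the current local batch.
	released = a.max - a.last

	// 2. Fetch the current _sync:seq instead of trusting the stale local max.
	current := c.get()

	// 3. Only when the shared counter is still behind the target is there a
	//    real gap to release; otherwise the lag was just allocator latency.
	if current < existing {
		gap := existing - current
		c.incr(gap)
		released += gap
	}

	// 4. Reserve one new sequence, now guaranteed to be above existing.
	seq = c.incr(1)
	a.last, a.max = seq, seq
	return seq, released
}

func main() {
	// Node c1 from the scenario: batch 101-110 unused, _sync:seq at 10130.
	c := &syncSeqCounter{value: 10130}
	c1 := &allocator{last: 100, max: 110}
	seq, released := c1.nextSequenceGreaterThan(c, 10000)
	fmt.Println(seq, released) // prints "10131 10"
}
```

With c1's state from the example, only the 10 sequences of the stale batch are released, because _sync:seq (10130) is already above the target (10000).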

          People

            adamf Adam Fraser