Couchbase Server / MB-38269

Projector goes into a stream termination loop while trying to stream a document of nearly 20 MB



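    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 7.0.0
    • Affects Version/s: 6.0.3, 6.5.1, 6.5.0
    • Component/s: secondary-index
    • Labels: None
    • Triage: Untriaged
    • Is this a Regression?: Unknown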


      For any document, the 20 MB limit applies to the document body plus user xattrs.
      Memcached reserves a separate 1 MB buffer for system xattrs.
      When streaming a document to a DCP consumer, KV sends body + user xattrs + system xattrs.
      Hence, consider a document whose body is 19.9 MB. If the 1 MB system xattr buffer is fully used, KV will send 19.9 + 1 = 20.9 MB to the consumer.
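
      The size arithmetic above can be sketched in Go as follows. This is only an illustration: the constant and function names are hypothetical, and the sizes are taken from the description (20 MB for body + user xattrs, 1 MB for system xattrs), expressed in bytes as powers of two.

```go
package main

import "fmt"

const (
	// Limit on document body + user xattrs, as stated in the description.
	maxDocSize = 20 * 1024 * 1024
	// Buffer memcached reserves for system xattrs, as stated in the description.
	sysXattrReserve = 1 * 1024 * 1024
)

// worstCasePayload returns the largest value KV could stream for a single
// document if both the document limit and the system xattr buffer are full.
func worstCasePayload() int {
	return maxDocSize + sysXattrReserve
}

func main() {
	fmt.Println("document limit (bytes):    ", maxDocSize)
	fmt.Println("worst-case payload (bytes):", worstCasePayload())
	// A consumer that compares the whole payload against maxDocSize will
	// reject documents that KV itself considers valid.
	fmt.Println("payload can exceed limit:  ", worstCasePayload() > maxDocSize)
}
```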

      Actual behaviour:

      When projector receives this mutation, it extracts the payload (which also includes system xattrs) and compares the payload size to a hard-coded limit of 20 MB (20971520 bytes).


      For the document described above, this condition is met, and projector logs the following message:

      2020-02-27T19:30:23.716+00:00 [Error] DCPT[secidx:proj-sxoprd_posentities-MAINT_STREAM_TOPIC_ea678d87a86f96705627397d634ec781-1339850182609080814/1] doReceive(): 20976104 is too big (max 20971520)

      More importantly, projector then terminates this DCP stream and tries to recreate it, which puts it in a loop:

       2020-02-27T19:30:23.716+00:00 [Info] DCPT[secidx:proj-sxoprd_posentities-MAINT_STREAM_TOPIC_ea678d87a86f96705627397d634ec781-1339850182609080814/1] ##45fe ... stopped
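
      To make the failure concrete, here is a minimal Go sketch of a size check that would produce the error text seen in the log. It is not projector's actual code: `checkPayloadSize` and `maxPayloadSize` are hypothetical names, and the strict `>` comparison is an assumption. The only values taken from the report are the limit (20971520) and the logged payload size (20976104).

```go
package main

import "fmt"

// maxPayloadSize mirrors the "max 20971520" value in the logged error.
const maxPayloadSize = 20 * 1024 * 1024

// checkPayloadSize is a hypothetical helper. It rejects any payload larger
// than maxPayloadSize without separating system xattrs from the body.
func checkPayloadSize(size int) error {
	if size > maxPayloadSize {
		return fmt.Errorf("%d is too big (max %d)", size, maxPayloadSize)
	}
	return nil
}

func main() {
	// 20976104 is the payload size reported in the error log line.
	if err := checkPayloadSize(20976104); err != nil {
		fmt.Println("doReceive():", err)
	}
}
```

      The logged payload is 20976104 - 20971520 = 4584 bytes over the limit, which fits comfortably within the 1 MB system xattr buffer.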

      Expected behaviour:

      1. Checking the document size at the consumer level is redundant, because KV already filters oversized documents before sending them.

      2. Even if a consumer does check the size, it should distinguish between document body size and system xattr size, and reject only mutations whose body size exceeds 20 MB (although this, again, would be dead code).

      3. Most importantly, the way projector handles this document is wrong. Views currently just log the document ID and skip the document. Projector, on the other hand, goes into a loop of recreating DCPT streams. As a result, it streams no further sequence numbers from that vbucket, which stalls index builds and eventually request_plus queries, causing an outage.

      Instead, projector should also simply skip such a mutation.





              ajay.bhullar Ajay Bhullar
              abhishek.jindal Abhishek Jindal


