Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-42270

OSO mode not sending down streamEnd

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 7.0.0
    • Cheshire-Cat
    • XDCR
    • None

    Description

      XDCR starts a stream from DCP and asks it for from seqno 0 to a certain seqEnd.
      Without OSO mode, things seem to be fine. Given an example vb, here are the logs from memcached:

      neil.huang@NeilsMacbookPro:~/source/couchbase/ns_server/logs/n_0$ grep "vb:493" memcached.log.000000.txt | grep backfill
      2020-10-23T20:43:31.099066-07:00 INFO 63: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:euPfwUgtMxeBtuGvZZy-xA== - (vb:493) ActiveStream::scheduleBackfill_UNLOCKED register cursor with name "eq_dcpq:xdcr:dcp_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:euPfwUgtMxeBtuGvZZy-xA==" backfill:true, seqno:7
      2020-10-23T20:43:31.099071-07:00 INFO 63: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:euPfwUgtMxeBtuGvZZy-xA== - (vb:493) Scheduling backfill from 1 to 6, reschedule flag : False
      2020-10-23T20:43:31.100238-07:00 INFO 63: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:euPfwUgtMxeBtuGvZZy-xA== - (vb:493) ActiveStream::registerCursor name "eq_dcpq:xdcr:dcp_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:euPfwUgtMxeBtuGvZZy-xA==", backfill:false, seqno:8
      2020-10-23T20:43:31.111637-07:00 INFO 63: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:euPfwUgtMxeBtuGvZZy-xA== - (vb:493) ActiveStream::transitionState: Transitioning from backfilling to in-memory
      2020-10-23T20:43:50.240856-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:0CxWZWz1jVhBAb7BUeBeMQ== - (vb:493) Creating stream with start seqno 0 and end seqno 49; requested end seqno was 49, collections-manifest uid:none, filter:1 sid:none
      2020-10-23T20:43:50.240860-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:0CxWZWz1jVhBAb7BUeBeMQ== - (vb:493) ActiveStream::scheduleBackfill_UNLOCKED register cursor with name "eq_dcpq:xdcr:dcp_backfill_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:0CxWZWz1jVhBAb7BUeBeMQ==" backfill:true, seqno:50
      2020-10-23T20:43:50.240862-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:0CxWZWz1jVhBAb7BUeBeMQ== - (vb:493) Scheduling backfill from 1 to 49, reschedule flag : False
      2020-10-23T20:43:50.242725-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:0CxWZWz1jVhBAb7BUeBeMQ== - (vb:493) ActiveStream::markDiskSnapshot: Sending disk snapshot with start 0, end 49, and high completed none, max visible none
      2020-10-23T20:43:50.242732-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:0CxWZWz1jVhBAb7BUeBeMQ== - (vb:493) ActiveStream::registerCursor name "eq_dcpq:xdcr:dcp_backfill_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:0CxWZWz1jVhBAb7BUeBeMQ==", backfill:true, seqno:50
      2020-10-23T20:43:50.248169-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:0CxWZWz1jVhBAb7BUeBeMQ== - (vb:493) Backfill complete, 1 items read from disk, 14 from memory, last seqno read: 42, pendingBackfill : False
      2020-10-23T20:43:50.249931-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:0CxWZWz1jVhBAb7BUeBeMQ== - (vb:493) ActiveStream::transitionState: Transitioning from backfilling to in-memory
      2020-10-23T20:43:50.249934-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_11971573298d68ae8159a102cb89d342/B1/
       
      vvvvvvvvvv
       
      B2_127.0.0.1:12000_0:0CxWZWz1jVhBAb7BUeBeMQ== - (vb:493) Stream closing, sent until seqno 49 remaining items 0, reason: The stream ended due to all items being streamed
       
      ^^^^^^^^^^^ 
      <XDCR receives a STREAMEND message just fine without OSO. Presumably corresponding to the above message>
       
      2020-10-23T20:43:50.250305-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_11971573298d68ae8159a102cb89d342/B1/B2_127.0.0.1:12000_0:0CxWZWz1jVhBAb7BUeBeMQ== - (vb:493) Stream closed, 16 items sent from backfill phase, 0 items sent from memory phase, 49 was last seqno sent
      

      However, given the same exact scenario where the only variance is XDCR requesting OSO, XDCR doesn't hear back the streamEND message and waits forever.

      neil.huang@NeilsMacbookPro:~/source/couchbase/ns_server/logs/n_0$ grep "vb:493" memcached.log.000000.txt  | grep backfill
      2020-10-23T20:45:16.576417-07:00 INFO 62: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:-SOcNcY4aeo6opp1wIh58A== - (vb:493) ActiveStream::scheduleBackfill_UNLOCKED register cursor with name "eq_dcpq:xdcr:dcp_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:-SOcNcY4aeo6opp1wIh58A==" backfill:true, seqno:7
      2020-10-23T20:45:16.576423-07:00 INFO 62: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:-SOcNcY4aeo6opp1wIh58A== - (vb:493) Scheduling backfill from 1 to 6, reschedule flag : False
      2020-10-23T20:45:16.577797-07:00 INFO 62: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:-SOcNcY4aeo6opp1wIh58A== - (vb:493) ActiveStream::registerCursor name "eq_dcpq:xdcr:dcp_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:-SOcNcY4aeo6opp1wIh58A==", backfill:false, seqno:8
      2020-10-23T20:45:16.582783-07:00 INFO 62: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:-SOcNcY4aeo6opp1wIh58A== - (vb:493) ActiveStream::transitionState: Transitioning from backfilling to in-memory
      2020-10-23T20:45:37.223467-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:PReGyLcM58khlYDYaWKAiw== - (vb:493) Creating stream with start seqno 0 and end seqno 49; requested end seqno was 49, collections-manifest uid:none, filter:1 sid:none
      2020-10-23T20:45:37.223478-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:PReGyLcM58khlYDYaWKAiw== - (vb:493) ActiveStream::scheduleBackfill_UNLOCKED register cursor with name "eq_dcpq:xdcr:dcp_backfill_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:PReGyLcM58khlYDYaWKAiw==" backfill:true, seqno:50
      2020-10-23T20:45:37.223493-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:PReGyLcM58khlYDYaWKAiw== - (vb:493) Scheduling OSO backfill for cid:0xb lastReadSeqno:0 curChkSeqno:50
      2020-10-23T20:45:37.225325-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:PReGyLcM58khlYDYaWKAiw== - (vb:493) ActiveStream::registerCursor name "eq_dcpq:xdcr:dcp_backfill_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:PReGyLcM58khlYDYaWKAiw==", backfill:true, seqno:50
      2020-10-23T20:45:37.225334-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:PReGyLcM58khlYDYaWKAiw== - (vb:493) ActiveStream::markOSODiskSnapshot: Sent snapshot begin marker, cursor requested:49 curChkSeqno:50
      2020-10-23T20:45:37.232832-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:PReGyLcM58khlYDYaWKAiw== - (vb:493) OSO Backfill complete, 1 items read from disk, 14 from memory, lastReadSeqno:45, pendingBackfill:False
      2020-10-23T20:45:37.232859-07:00 INFO 81: (B1) DCP (Producer) eq_dcpq:xdcr:dcp_backfill_c3faf3bcfdae0f7bafa0fa2cf9ba65bb/B1/B2_127.0.0.1:12000_0:PReGyLcM58khlYDYaWKAiw== - (vb:493) ActiveStream::transitionState: Transitioning from backfilling to in-memory
       
      <nothing here - stream end should have come by now>
      

      Not sure what I'm doing wrong or if it's an issue in KV? I've attached cbcollect for the source node if you could so kindly help. Thanks.

      Attachments

        Issue Links

          For Gerrit Dashboard: MB-42270
          # Subject Branch Project Status CR V

          Activity

            People

              neil.huang Neil Huang
              neil.huang Neil Huang
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty