Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-45907

Eventing: Data miss match with SBM + multiple kv rebalance

    XMLWordPrintable

Details

    Description

      Build: 7.0.0-5017

      Test

      ./testrunner -i /tmp/testexec.18313.ini -p get-cbcollect-info=True,GROUP=bucket_op,skip_log_scan=False,bucket_storage=couchstore -t eventing.eventing_rebalance_collection.EventingRebalanceCollection.test_eventing_rebalance_with_multiple_kv_nodes,doc-per-day=20,dataset=default,nodes_init=5,services_init=kv-kv-kv-eventing-index:n1ql,groups=simple,reset_services=True,GROUP=bucket_op,java_sdk_client=True,get-cbcollect-info=True,skip_log_scan=False,bucket_storage=couchstore 
       
       
      Exception: missing data in destination bucket. Current : 94233 Expected : 120960  dcp_backlog : 0  TIMERS_IN_PAST : 0 lcb_exceptions : {'172.23.122.171': {}, '172.23.122.172': {}}

      Seeing no lcb exceptions in the stats

      Attachments

        1. 172.23.122.170-20210424-2153-diag.zip
          6.70 MB
        2. 172.23.122.171-20210424-2153-diag.zip
          13.87 MB
        3. 172.23.122.172-20210424-2153-diag.zip
          13.11 MB
        4. 172.23.122.173-20210424-2153-diag.zip
          12.29 MB
        5. 172.23.122.174-20210424-2153-diag.zip
          14.88 MB
        6. 172.23.122.175-20210424-2153-diag.zip
          12.66 MB
        7. 172.23.122.176-20210424-2153-diag.zip
          13.23 MB
        8. 172.23.122.177-20210424-2153-diag.zip
          11.77 MB
        9. mb-45907_testlogs.log
          579 kB
        10. mb-54907_stats
          67 kB
        11. Test_logs.txt
          1.02 MB
        12. test_output.log
          398 kB

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            I see LCB_ERR_BUCKET_NOT_FOUND errors on node 172.23.122.171

            2021-04-24T21:50:47.230-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,cccp L:187 I:1872108005] <NOHOST:NOPORT> (CTX=(nil),) Could not get configuration: LCB_ERR_BUCKET_NOT_FOUND (210)
            2021-04-24T21:50:47.230-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,confmon L:211 I:1872108005] Provider 'CCCP' failed: LCB_ERR_BUCKET_NOT_FOUND (210)
            2021-04-24T21:50:47.231-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,server L:961 I:1489768050] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7faed00dc870,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system
            2021-04-24T21:50:47.231-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,server L:818 I:1489768050] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7faed00dc870,IX=0) Failing command (pkt=0x7faecc0238f0, opaque=12413, opcode=0x0) with error LCB_ERR_BUCKET_NOT_FOUND (210)
            2021-04-24T21:50:47.236-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,server L:961 I:1872108005] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7faed0126e90,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system
            2021-04-24T21:50:47.236-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,server L:818 I:1872108005] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7faed0126e90,IX=0) Failing command (pkt=0x7faec802d740, opaque=12413, opcode=0x0) with error LCB_ERR_BUCKET_NOT_FOUND (210)

            and on node 172.23.122.172

            2021-04-24T21:50:47.517-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:961 I:2437775077] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f56280dc870,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system
            2021-04-24T21:50:47.517-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:818 I:2437775077] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f56280dc870,IX=0) Failing command (pkt=0x7f5620035fa0, opaque=12449, opcode=0x0) with error LCB_ERR_BUCKET_NOT_FOUND (210)
            2021-04-24T21:50:47.517-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:818 I:2437775077] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f56280dc870,IX=0) Failing command (pkt=0x7f5620036010, opaque=12450, opcode=0xb5) with error LCB_ERR_BUCKET_NOT_FOUND (210)
            2021-04-24T21:50:47.517-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,cccp L:187 I:2437775077] <NOHOST:NOPORT> (CTX=(nil),) Could not get configuration: LCB_ERR_BUCKET_NOT_FOUND (210)
            2021-04-24T21:50:47.517-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,confmon L:211 I:2437775077] Provider 'CCCP' failed: LCB_ERR_BUCKET_NOT_FOUND (210)
            2021-04-24T21:50:47.534-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:961 I:3930440849] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f5628126e90,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system
            2021-04-24T21:50:47.534-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:818 I:3930440849] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f5628126e90,IX=0) Failing command (pkt=0x7f5618025620, opaque=12449, opcode=0x0) with error LCB_ERR_BUCKET_NOT_FOUND (210)
            2021-04-24T21:50:47.534-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:818 I:3930440849] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f5628126e90,IX=0) Failing command (pkt=0x7f5618025690, opaque=12450, opcode=0xb5) with error LCB_ERR_BUCKET_NOT_FOUND (210)
            2021-04-24T21:50:47.534-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,cccp L:187 I:3930440849] <NOHOST:NOPORT> (CTX=(nil),) Could not get configuration: LCB_ERR_BUCKET_NOT_FOUND (210)
            2021-04-24T21:50:47.534-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,confmon L:211 I:3930440849] Provider 'CCCP' failed: LCB_ERR_BUCKET_NOT_FOUND (210)
            

            Clearly related to MB-44324

            vinayaka.kamath Vinayaka Kamath (Inactive) added a comment - I see LCB_ERR_BUCKET_NOT_FOUND errors on node 172.23.122.171 2021-04-24T21:50:47.230-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,cccp L:187 I:1872108005] <NOHOST:NOPORT> (CTX=(nil),) Could not get configuration: LCB_ERR_BUCKET_NOT_FOUND (210) 2021-04-24T21:50:47.230-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,confmon L:211 I:1872108005] Provider 'CCCP' failed: LCB_ERR_BUCKET_NOT_FOUND (210) 2021-04-24T21:50:47.231-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,server L:961 I:1489768050] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7faed00dc870,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system 2021-04-24T21:50:47.231-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,server L:818 I:1489768050] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7faed00dc870,IX=0) Failing command (pkt=0x7faecc0238f0, opaque=12413, opcode=0x0) with error LCB_ERR_BUCKET_NOT_FOUND (210) 2021-04-24T21:50:47.236-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,server L:961 I:1872108005] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7faed0126e90,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system 2021-04-24T21:50:47.236-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:113727] [lcb,server L:818 I:1872108005] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7faed0126e90,IX=0) Failing command (pkt=0x7faec802d740, opaque=12413, opcode=0x0) with error LCB_ERR_BUCKET_NOT_FOUND (210) and on node 172.23.122.172 2021-04-24T21:50:47.517-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:961 I:2437775077] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f56280dc870,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system 2021-04-24T21:50:47.517-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:818 I:2437775077] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f56280dc870,IX=0) Failing command (pkt=0x7f5620035fa0, opaque=12449, opcode=0x0) with error LCB_ERR_BUCKET_NOT_FOUND (210) 2021-04-24T21:50:47.517-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:818 I:2437775077] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f56280dc870,IX=0) Failing command (pkt=0x7f5620036010, opaque=12450, opcode=0xb5) with error LCB_ERR_BUCKET_NOT_FOUND (210) 2021-04-24T21:50:47.517-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,cccp L:187 I:2437775077] <NOHOST:NOPORT> (CTX=(nil),) Could not get configuration: LCB_ERR_BUCKET_NOT_FOUND (210) 2021-04-24T21:50:47.517-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,confmon L:211 I:2437775077] Provider 'CCCP' failed: LCB_ERR_BUCKET_NOT_FOUND (210) 2021-04-24T21:50:47.534-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:961 I:3930440849] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f5628126e90,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system 2021-04-24T21:50:47.534-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:818 I:3930440849] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f5628126e90,IX=0) Failing command (pkt=0x7f5618025620, opaque=12449, opcode=0x0) with error LCB_ERR_BUCKET_NOT_FOUND (210) 2021-04-24T21:50:47.534-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,server L:818 I:3930440849] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f5628126e90,IX=0) Failing command (pkt=0x7f5618025690, opaque=12450, opcode=0xb5) with error LCB_ERR_BUCKET_NOT_FOUND (210) 2021-04-24T21:50:47.534-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,cccp L:187 I:3930440849] <NOHOST:NOPORT> (CTX=(nil),) Could not get configuration: LCB_ERR_BUCKET_NOT_FOUND (210) 2021-04-24T21:50:47.534-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] [lcb,confmon L:211 I:3930440849] Provider 'CCCP' failed: LCB_ERR_BUCKET_NOT_FOUND (210) Clearly related to MB-44324

            Vinayaka Kamath why we are not seeing these errors in lcb stats? Do we have a regression there?

            vikas.chaudhary Vikas Chaudhary added a comment - Vinayaka Kamath  why we are not seeing these errors in lcb stats? Do we have a regression there?

            Vikas Chaudhary lcb_exception_stats are non zero

            2021-04-24T21:49:37.903-07:00 [Info] Consumer::processStatsEvents [worker_sbm_0:/tmp/127.0.0.1:8091_0_2698172145.sock:107807] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:510 DCP_MUTATION:2919 V8 events: thr_map:1 latency_stats:600 curl_latency_stats:600 *lcb_exception_stats:600* log_level:1 thr_count:1 v8_init:1 v8_load:1 vb_map:2 execution_stats:600 failure_stats:600 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]
            2021-04-24T21:49:50.075-07:00 [Info] Consumer::processStatsEvents [worker_bucket_op_0:/tmp/127.0.0.1:8091_0_3698201870.sock:107682] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: curl_latency_stats:660 log_level:1 v8_init:1 vb_map:2 execution_stats:660 failure_stats:660 latency_stats:660 thr_map:1 thr_count:1 v8_load:1 *lcb_exception_stats:660* Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]
            2021-04-24T21:49:53.898-07:00 [Info] Consumer::processStatsEvents [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: vb_map:2 execution_stats:600 lcb_exception_stats:600 curl_latency_stats:600 log_level:1 thr_count:1 v8_init:1 v8_load:1 failure_stats:600 latency_stats:600 thr_map:1 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]
            2021-04-24T21:50:06.010-07:00 [Info] Consumer::processStatsEvents [worker_curl_0:/tmp/127.0.0.1:8091_0_2876695861.sock:107709] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: v8_load:1 vb_map:2 execution_stats:660 failure_stats:660 latency_stats:660 curl_latency_stats:660 thr_count:1 thr_map:1 v8_init:1 lcb_exception_stats:660 log_level:1 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]
            2021-04-24T21:50:21.942-07:00 [Info] Consumer::processStatsEvents [worker_n1ql_0:/tmp/127.0.0.1:8091_0_15129118.sock:107782] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: v8_load:1 vb_map:2 latency_stats:660 curl_latency_stats:660 thr_map:1 thr_count:1 v8_init:1 lcb_exception_stats:660 log_level:1 execution_stats:660 failure_stats:660 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]
            2021-04-24T21:50:37.903-07:00 [Info] Consumer::processStatsEvents [worker_sbm_0:/tmp/127.0.0.1:8091_0_2698172145.sock:107807] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:510 DCP_MUTATION:2919 V8 events: vb_map:2 execution_stats:660 failure_stats:660 2021-04-24T21:49:37.903-07:00 [Info] Consumer::processStatsEvents [worker_sbm_0:/tmp/127.0.0.1:8091_0_2698172145.sock:107807] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:510 DCP_MUTATION:2919 V8 events: thr_map:1 latency_stats:600 curl_latency_stats:600 lcb_exception_stats:600 log_level:1 thr_count:1 v8_init:1 v8_load:1 vb_map:2 execution_stats:600 failure_stats:600 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]
            2021-04-24T21:49:50.075-07:00 [Info] Consumer::processStatsEvents [worker_bucket_op_0:/tmp/127.0.0.1:8091_0_3698201870.sock:107682] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: curl_latency_stats:660 log_level:1 v8_init:1 vb_map:2 execution_stats:660 failure_stats:660 latency_stats:660 thr_map:1 thr_count:1 v8_load:1 lcb_exception_stats:660 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]
            2021-04-24T21:49:53.898-07:00 [Info] Consumer::processStatsEvents [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: vb_map:2 execution_stats:600 lcb_exception_stats:600 curl_latency_stats:600 log_level:1 thr_count:1 v8_init:1 v8_load:1 failure_stats:600 latency_stats:600 thr_map:1 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]
            2021-04-24T21:50:06.010-07:00 [Info] Consumer::processStatsEvents [worker_curl_0:/tmp/127.0.0.1:8091_0_2876695861.sock:107709] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: v8_load:1 vb_map:2 execution_stats:660 failure_stats:660 latency_stats:660 curl_latency_stats:660 thr_count:1 thr_map:1 v8_init:1 lcb_exception_stats:660 log_level:1 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]
            2021-04-24T21:50:21.942-07:00 [Info] Consumer::processStatsEvents [worker_n1ql_0:/tmp/127.0.0.1:8091_0_15129118.sock:107782] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: v8_load:1 vb_map:2 latency_stats:660 curl_latency_stats:660 thr_map:1 thr_count:1 v8_init:1 lcb_exception_stats:660 log_level:1 execution_stats:660 failure_stats:660 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]
            2021-04-24T21:50:37.903-07:00 [Info] Consumer::processStatsEvents [worker_sbm_0:/tmp/127.0.0.1:8091_0_2698172145.sock:107807] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:510 DCP_MUTATION:2919 V8 events: vb_map:2 execution_stats:660 failure_stats:660 lcb_exception_stats:660 log_level:1 thr_count:1 v8_init:1 v8_load:1 thr_map:1 latency_stats:660 curl_latency_stats:660 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]log_level:1 thr_count:1 v8_init:1 v8_load:1 thr_map:1 latency_stats:660 curl_latency_stats:660 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]
            

            I see the same values in Test_logs.txt file in the attachment too. For example

            "event_processing_stats": {
                        "adhoc_timer_response_received": 2,
                        "agg_messages_sent_to_worker": 4370,
                        "agg_queue_memory": 0,
                        "agg_queue_memory_cap": 53477376,
                        "agg_queue_size": 0,
                        "agg_queue_size_cap": 100000,
                        "agg_timer_feedback_queue_cap": 500,
                        "dcp_mutation": 26982,
                        "dcp_snapshot": 512,
                        "dcp_stream_req_counter": 512,
                        "dcp_streamreq": 512,
                        "execution_stats": 667,
                        "failure_stats": 667,
                        "latency_stats": 667,
                        "lcb_exception_stats": 667,
                        "log_level": 1,
                        "num_processed_events": 512,
                        "processed_events_size": 49152,
                        "thr_count": 1,
                        "thr_map": 1,
                        "v8_init": 1,
                        "v8_load": 1
                    },

            Edit: Sorry, I think I misunderstood your question there Vikas, I am looking into it right now. Will update the ticket with information. Thanks!

            vinayaka.kamath Vinayaka Kamath (Inactive) added a comment - - edited Vikas Chaudhary lcb_exception_stats are non zero 2021-04-24T21:49:37.903-07:00 [Info] Consumer::processStatsEvents [worker_sbm_0:/tmp/127.0.0.1:8091_0_2698172145.sock:107807] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:510 DCP_MUTATION:2919 V8 events: thr_map:1 latency_stats:600 curl_latency_stats:600 *lcb_exception_stats:600* log_level:1 thr_count:1 v8_init:1 v8_load:1 vb_map:2 execution_stats:600 failure_stats:600 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023] 2021-04-24T21:49:50.075-07:00 [Info] Consumer::processStatsEvents [worker_bucket_op_0:/tmp/127.0.0.1:8091_0_3698201870.sock:107682] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: curl_latency_stats:660 log_level:1 v8_init:1 vb_map:2 execution_stats:660 failure_stats:660 latency_stats:660 thr_map:1 thr_count:1 v8_load:1 *lcb_exception_stats:660* Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023] 2021-04-24T21:49:53.898-07:00 [Info] Consumer::processStatsEvents [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: vb_map:2 execution_stats:600 lcb_exception_stats:600 curl_latency_stats:600 log_level:1 thr_count:1 v8_init:1 v8_load:1 failure_stats:600 latency_stats:600 thr_map:1 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023] 2021-04-24T21:50:06.010-07:00 [Info] Consumer::processStatsEvents [worker_curl_0:/tmp/127.0.0.1:8091_0_2876695861.sock:107709] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: v8_load:1 vb_map:2 execution_stats:660 failure_stats:660 latency_stats:660 curl_latency_stats:660 thr_count:1 thr_map:1 v8_init:1 lcb_exception_stats:660 log_level:1 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023] 2021-04-24T21:50:21.942-07:00 [Info] Consumer::processStatsEvents [worker_n1ql_0:/tmp/127.0.0.1:8091_0_15129118.sock:107782] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: v8_load:1 vb_map:2 latency_stats:660 curl_latency_stats:660 thr_map:1 thr_count:1 v8_init:1 lcb_exception_stats:660 log_level:1 execution_stats:660 failure_stats:660 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023] 2021-04-24T21:50:37.903-07:00 [Info] Consumer::processStatsEvents [worker_sbm_0:/tmp/127.0.0.1:8091_0_2698172145.sock:107807] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:510 DCP_MUTATION:2919 V8 events: vb_map:2 execution_stats:660 failure_stats:660 2021-04-24T21:49:37.903-07:00 [Info] Consumer::processStatsEvents [worker_sbm_0:/tmp/127.0.0.1:8091_0_2698172145.sock:107807] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:510 DCP_MUTATION:2919 V8 events: thr_map:1 latency_stats:600 curl_latency_stats:600 lcb_exception_stats:600 log_level:1 thr_count:1 v8_init:1 v8_load:1 vb_map:2 execution_stats:600 failure_stats:600 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023] 2021-04-24T21:49:50.075-07:00 [Info] Consumer::processStatsEvents [worker_bucket_op_0:/tmp/127.0.0.1:8091_0_3698201870.sock:107682] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: curl_latency_stats:660 log_level:1 v8_init:1 vb_map:2 execution_stats:660 failure_stats:660 latency_stats:660 thr_map:1 thr_count:1 v8_load:1 lcb_exception_stats:660 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023] 2021-04-24T21:49:53.898-07:00 [Info] Consumer::processStatsEvents [worker_timers_0:/tmp/127.0.0.1:8091_0_2026117502.sock:107835] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: vb_map:2 execution_stats:600 lcb_exception_stats:600 curl_latency_stats:600 log_level:1 thr_count:1 v8_init:1 v8_load:1 failure_stats:600 latency_stats:600 thr_map:1 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023] 2021-04-24T21:50:06.010-07:00 [Info] Consumer::processStatsEvents [worker_curl_0:/tmp/127.0.0.1:8091_0_2876695861.sock:107709] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: v8_load:1 vb_map:2 execution_stats:660 failure_stats:660 latency_stats:660 curl_latency_stats:660 thr_count:1 thr_map:1 v8_init:1 lcb_exception_stats:660 log_level:1 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023] 2021-04-24T21:50:21.942-07:00 [Info] Consumer::processStatsEvents [worker_n1ql_0:/tmp/127.0.0.1:8091_0_15129118.sock:107782] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:512 DCP_MUTATION:26982 V8 events: v8_load:1 vb_map:2 latency_stats:660 curl_latency_stats:660 thr_map:1 thr_count:1 v8_init:1 lcb_exception_stats:660 log_level:1 execution_stats:660 failure_stats:660 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023] 2021-04-24T21:50:37.903-07:00 [Info] Consumer::processStatsEvents [worker_sbm_0:/tmp/127.0.0.1:8091_0_2698172145.sock:107807] DCP events: DCP_STREAMREQ:512 DCP_SNAPSHOT:510 DCP_MUTATION:2919 V8 events: vb_map:2 execution_stats:660 failure_stats:660 lcb_exception_stats:660 log_level:1 thr_count:1 v8_init:1 v8_load:1 thr_map:1 latency_stats:660 curl_latency_stats:660 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023]log_level:1 thr_count:1 v8_init:1 v8_load:1 thr_map:1 latency_stats:660 curl_latency_stats:660 Timer events: Doc: 0, vbs owned len: 512 vbs owned: [512-1023] I see the same values in Test_logs.txt file in the attachment too. For example "event_processing_stats": { "adhoc_timer_response_received": 2, "agg_messages_sent_to_worker": 4370, "agg_queue_memory": 0, "agg_queue_memory_cap": 53477376, "agg_queue_size": 0, "agg_queue_size_cap": 100000, "agg_timer_feedback_queue_cap": 500, "dcp_mutation": 26982, "dcp_snapshot": 512, "dcp_stream_req_counter": 512, "dcp_streamreq": 512, "execution_stats": 667, "failure_stats": 667, "latency_stats": 667, "lcb_exception_stats": 667, "log_level": 1, "num_processed_events": 512, "processed_events_size": 49152, "thr_count": 1, "thr_map": 1, "v8_init": 1, "v8_load": 1 }, Edit: Sorry, I think I misunderstood your question there Vikas, I am looking into it right now. Will update the ticket with information. Thanks!

            Vinayaka Kamath we have exception details as , it contains error codes as well 

            "lcb_exception_stats": {}, 

            vikas.chaudhary Vikas Chaudhary added a comment - Vinayaka Kamath  we have exception details as , it contains error codes as well  "lcb_exception_stats": {},

            Vikas Chaudhary Can you run this test with the latest build? Please close the ticket if it's not present in the latest run. Eventing has forwarded the manifest to libcouchbase 3.1.2.

            Coming back to the lcb_exception_stats, it seems like all the LCB_ERR_BUCKET_NOT_FOUND are actually logged by libcouchbase and none by eventing. Seems like they are struck in a retry loop from the logs. The latest build should have the fix for that too.

            vinayaka.kamath Vinayaka Kamath (Inactive) added a comment - Vikas Chaudhary Can you run this test with the latest build? Please close the ticket if it's not present in the latest run. Eventing has forwarded the manifest to libcouchbase 3.1.2. Coming back to the lcb_exception_stats, it seems like all the LCB_ERR_BUCKET_NOT_FOUND are actually logged by libcouchbase and none by eventing. Seems like they are struck in a retry loop from the logs. The latest build should have the fix for that too.

            Vinayaka Kamath we have started a run on 7.0.0-5046

            vikas.chaudhary Vikas Chaudhary added a comment - Vinayaka Kamath we have started a run on 7.0.0-5046

            Vinayaka Kamath it passed on rerun. Why "lcb_exception_stats": {} are empty. can you confirm there is no issue with stats?

            vikas.chaudhary Vikas Chaudhary added a comment - Vinayaka Kamath  it passed on rerun. Why "lcb_exception_stats": {} are empty. can you confirm there is no issue with stats?

            Build couchbase-server-7.0.0-5092 contains eventing commit 42dcdf8 with commit message:
            MB-45907: Update LCB stats when Bucket Connect Fails

            build-team Couchbase Build Team added a comment - Build couchbase-server-7.0.0-5092 contains eventing commit 42dcdf8 with commit message: MB-45907 : Update LCB stats when Bucket Connect Fails
            chanabasappa.ghali Chanabasappa Ghali added a comment - - edited

            Verified with Enterprise Edition 7.0.0 build 5092

            I am still seeing below exception

            Exception: missing data in destination bucket. Current : 0 Expected : 40320  dcp_backlog : 0  TIMERS_IN_PAST : 0 lcb_exceptions : {'10.112.206.104': {}}

            chanabasappa.ghali Chanabasappa Ghali added a comment - - edited Verified with Enterprise Edition 7.0.0 build 5092 I am still seeing below exception Exception: missing data in destination bucket. Current : 0 Expected : 40320  dcp_backlog : 0  TIMERS_IN_PAST : 0 lcb_exceptions : {'10.112.206.104': {}}

            In the logs, I see that the bucket is getting deleted(or SDKs are returning that the bucket is deleted) about the same time as undeploy of n1ql and timers handlers.

            2021-05-04T22:11:11.111-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: n1ql current state: 1 requested status for deployment: false processing: false
            2021-05-04T22:11:11.111-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: n1ql Begin undeploy process. Current state: 1
            2021-05-04T22:11:11.111-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: n1ql enabled, settings change requesting undeployment
            2021-05-04T22:11:11.111-07:00 [Info] SuperSupervisor::deleteFromLocallyDeployedApps [0] Function: n1ql deleting from locally deployed apps map
            2021-05-04T22:11:11.111-07:00 [Info] SuperSupervisor::undeployFunctionCallback [0] Function: timers response from server:  resp: &{200 OK 200 HTTP/1.1 1 1 map[Content-Length:[0] Date:[Wed, 05 May 2021 05:11:11 GMT] Status:[0]] 0xc00105e3c0 0 [] false false map[] 0xc0004c2e00 <nil>}
            

            2021-05-04T22:11:11.121-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: timers Begin undeploy process. Current state: 1
            2021-05-04T22:11:11.121-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: timers enabled, settings change requesting undeployment
            2021-05-04T22:11:11.121-07:00 [Info] SuperSupervisor::deleteFromLocallyDeployedApps [0] Function: timers deleting from locally deployed apps map
            2021-05-04T22:11:11.129-07:00 [Info] SuperSupervisor::deleteFromDeployedApps [0] Function: timers deleting from deployed apps map
            2021-05-04T22:11:11.129-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: timers undeployment done
            2021-05-04T22:11:11.253-07:00 [Error] Unable to connect with bucket "src_bucket"
            2021-05-04T22:11:11.253-07:00 [Error] Consumer::dcpEventsRemainingToProcess [worker_n1ql_0:/tmp/127.0.0.1:8091_0_3213650520.sock:19323] Failed to fetch get_all_vb_seqnos, err: No bucket named src_bucket
            2021-05-04T22:11:11.259-07:00 [Error] Unable to connect with bucket "src_bucket"
            2021-05-04T22:11:11.259-07:00 [Error] Consumer::dcpEventsRemainingToProcess [worker_sbm_0:/tmp/127.0.0.1:8091_0_2795806267.sock:19284] Failed to fetch get_all_vb_seqnos, err: No bucket named src_bucket
            

            This is followed by a lot of connection failures to the source bucket in both gocb and lcb:

            2021-05-04T22:11:11.462-07:00 [Error] Unable to connect with bucket "src_bucket"
            2021-05-04T22:11:11.463-07:00 [Error] Consumer::dcpEventsRemainingToProcess [worker_bucket_op_0:/tmp/127.0.0.1:8091_0_1017459917.sock:19243] Failed to fetch get_all_vb_seqnos, err: No bucket named src_bucket
            2021-05-04T22:11:11.642-07:00 [Info] serviceChangeNotifier: received PoolChangeNotification
            2021-05-04T22:11:11.660-07:00 [Error] Unable to connect with bucket "src_bucket"
            2021-05-04T22:11:11.660-07:00 [Error] Consumer::dcpEventsRemainingToProcess [worker_curl_0:/tmp/127.0.0.1:8091_0_31498936.sock:19304] Failed to fetch get_all_vb_seqnos, err: No bucket named src_bucket
            

            2021-05-04T22:11:12.296-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:3637500671] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f16301a04c0,IX=2) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system
            2021-05-04T22:11:12.297-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:3637500671] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f163019ebb0,IX=1) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system
            2021-05-04T22:11:12.297-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:2692714925] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f16300e6620,IX=2) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system
            2021-05-04T22:11:12.305-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:2692714925] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f16300e5c40,IX=1) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system
            2021-05-04T22:11:12.306-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:3637500671] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f16301a6090,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system
            2021-05-04T22:11:12.306-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:2692714925] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f163004ef70,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system
            

            We can also see that the current connections we had opened with the buckets have also been shutdown:

            2021-05-04T22:11:12.273-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:1108 I:3637500671] <10.112.206.102:11210> (CTX=0x7f1630129570,memcached,SRV=0x7f163019ebb0,IX=1) Got socket error LCB_ERR_SOCKET_SHUTDOWN (1025)
            2021-05-04T22:11:12.273-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,connection L:485 I:3637500671] <10.112.206.102:11210> (SOCK=ebb2360296b71299) Starting. Timeout=5000000us
            2021-05-04T22:11:12.273-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:1108 I:3637500671] <10.112.206.103:11210> (CTX=0x7f163023a7f0,memcached,SRV=0x7f16301a04c0,IX=2) Got socket error LCB_ERR_SOCKET_SHUTDOWN (1025)
            2021-05-04T22:11:12.273-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,connection L:485 I:3637500671] <10.112.206.103:11210> (SOCK=34cd6a6ec17d57c1) Starting. Timeout=5000000us
            2021-05-04T22:11:12.274-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:1108 I:2692714925] <10.112.206.102:11210> (CTX=0x7f1630109950,memcached,SRV=0x7f16300e5c40,IX=1) Got socket error LCB_ERR_SOCKET_SHUTDOWN (1025)
            2021-05-04T22:11:12.274-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,connection L:485 I:2692714925] <10.112.206.102:11210> (SOCK=838e9f3e985013f0) Starting. Timeout=5000000us
            2021-05-04T22:11:12.274-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:1108 I:2692714925] <10.112.206.103:11210> (CTX=0x7f16301e4ef0,memcached,SRV=0x7f16300e6620,IX=2) Got socket error LCB_ERR_SOCKET_SHUTDOWN (1025)
            

            Concurrently, I see a shutdown in the memcached logs too

            2021-05-04T22:11:40.266407-07:00 INFO Received shutdown request
            2021-05-04T22:11:40.266440-07:00 INFO Initiating graceful shutdown.
            2021-05-04T22:11:40.266442-07:00 INFO Stop all buckets
            2021-05-04T22:11:40.266448-07:00 INFO Shutting down Prometheus exporter
            2021-05-04T22:11:40.317389-07:00 INFO Shutting down audit daemon
            2021-05-04T22:11:40.317896-07:00 INFO Shutting down client worker threads
            2021-05-04T22:11:40.317954-07:00 INFO Stopping worker thread 0
            2021-05-04T22:11:40.318053-07:00 INFO Stopping worker thread 1
            2021-05-04T22:11:40.318156-07:00 INFO Stopping worker thread 2
            2021-05-04T22:11:40.318206-07:00 INFO Stopping worker thread 3
            2021-05-04T22:11:40.319026-07:00 INFO Releasing server sockets
            2021-05-04T22:11:40.319077-07:00 INFO Shutting down IPv4 interface: 0.0.0.0:11210
            2021-05-04T22:11:40.319118-07:00 INFO Shutting down IPv4 interface: 0.0.0.0:11209
            2021-05-04T22:11:40.319192-07:00 INFO Shutting down IPv4 interface: 0.0.0.0:11207
            2021-05-04T22:11:40.319197-07:00 INFO Shutting down IPv4 interface: 0.0.0.0:11206
            2021-05-04T22:11:40.319201-07:00 INFO Shutting down IPv6 interface: [::]:11210
            2021-05-04T22:11:40.319206-07:00 INFO Shutting down IPv6 interface: [::]:11209
            2021-05-04T22:11:40.319210-07:00 INFO Shutting down IPv6 interface: [::]:11207
            2021-05-04T22:11:40.319213-07:00 INFO Shutting down IPv6 interface: [::]:11206
            2021-05-04T22:11:40.319219-07:00 INFO Releasing bucket resources
            2021-05-04T22:11:40.319240-07:00 INFO Shutting down RBAC subsystem
            2021-05-04T22:11:40.319246-07:00 INFO Releasing thread resources
            2021-05-04T22:11:40.319289-07:00 INFO Shutting down executor pool 
            2021-05-04T22:11:40.319813-07:00 INFO Releasing signal handlers
            2021-05-04T22:11:40.319819-07:00 INFO Shutting down external authentication service
            2021-05-04T22:11:40.319894-07:00 INFO Release cached SSL contexts
            2021-05-04T22:11:40.319896-07:00 INFO Shutting down SASL server
            2021-05-04T22:11:40.319897-07:00 INFO Deinitialising tracing
            2021-05-04T22:11:40.319983-07:00 INFO Shutting down all engines
            2021-05-04T22:11:40.320060-07:00 INFO Removing breakpad
            2021-05-04T22:11:40.320066-07:00 INFO Shutting down OpenSSL
            2021-05-04T22:11:40.320073-07:00 INFO Shutting down libevent
            2021-05-04T22:11:40.320084-07:00 INFO Shutting down logger extension
            2021-05-04T22:11:40.320295-07:00 INFO ---------- Closing logfile
            2021-05-04T22:11:40.353242-07:00 INFO ---------- Opening logfile: /opt/couchbase/var/lib/couchbase/logs/memcached.log.000000.txt
            2021-05-04T22:11:40.353969-07:00 INFO Couchbase version 7.0.0-5092 starting.
            2021-05-04T22:11:40.353983-07:00 INFO recalculate_max_connections: {"max_fds":200000,"max_connections":65000,"system_connections":5000,"engine_fds":133982}
            2021-05-04T22:11:40.354027-07:00 INFO Breakpad enabled. Minidumps will be written to '/opt/couchbase/var/lib/couchbase/crash'
            2021-05-04T22:11:40.354222-07:00 INFO Using SLA configuration: {"COMPACT_DB":{"slow":"1800 s"},"CREATE_BUCKET":{"slow":"5 s"},"DELETE_BUCKET":{"slow":"10 s"},"SELECT_BUCKET":{"slow":"10 ms"},"SEQNO_PERSISTENCE":{"slow":"30 s"},"comment":"Current MCBP SLA configuration","default":{"slow":"500 ms"},"version":1}
            

            I see a PrepareTopologyChange after I start seeing the LCB_ERR_BUCKET_NOT_FOUND errors

            2021-05-04T22:11:12.456-07:00 [Info] ServiceMgr::PrepareTopologyChange change: service.TopologyChange{ID:"a638566cbdd32a41b90cb11b7db11679", CurrentTopologyRev:service.Revision(nil), Type:"topology-change-rebalance", KeepNodes:[]struct { NodeInfo service.NodeInfo "json:\"nodeInfo\""; RecoveryType service.RecoveryType "json:\"recoveryType\"" }{}, EjectNodes:[]service.NodeInfo{service.NodeInfo{NodeID:"88674c4c66bb1af41616ae62f507d464", Priority:0, Opaque:interface {}(nil)}}}
            2021-05-04T22:11:12.456-07:00 [Info] SuperSupervisor::RebalanceStatus [0] Rebalance status from all running applications: map[string]bool{}
            2021-05-04T22:11:12.456-07:00 [Info] ServiceMgr::PrepareTopologyChange ejectNodeUUIDs: [88674c4c66bb1af41616ae62f507d464] keepNodeUUIDs: []
            2021-05-04T22:11:12.456-07:00 [Info] ServiceMgr::PrepareTopologyChange m.rebalanceID: a638566cbdd32a41b90cb11b7db11679 m.servers: [88674c4c66bb1af41616ae62f507d464]
            2021-05-04T22:11:12.456-07:00 [Error] SuperSupervisor::NotifyPrepareTopologyChange [0] 0 eventing nodes supplied as keepNodes
            2021-05-04T22:11:12.456-07:00 [Info] ServiceMgr::PrepareTopologyChange completed, isBalanced: true
            

            Chanabasappa Ghali Would it be possible to get the output log of the test attached to the ticket to get a better idea of what was going on?

            vinayaka.kamath Vinayaka Kamath (Inactive) added a comment - In the logs, I see that the bucket is getting deleted(or SDKs are returning that the bucket is deleted) about the same time as undeploy of n1ql and timers handlers. 2021-05-04T22:11:11.111-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: n1ql current state: 1 requested status for deployment: false processing: false 2021-05-04T22:11:11.111-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: n1ql Begin undeploy process. Current state: 1 2021-05-04T22:11:11.111-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: n1ql enabled, settings change requesting undeployment 2021-05-04T22:11:11.111-07:00 [Info] SuperSupervisor::deleteFromLocallyDeployedApps [0] Function: n1ql deleting from locally deployed apps map 2021-05-04T22:11:11.111-07:00 [Info] SuperSupervisor::undeployFunctionCallback [0] Function: timers response from server: resp: &{200 OK 200 HTTP/1.1 1 1 map[Content-Length:[0] Date:[Wed, 05 May 2021 05:11:11 GMT] Status:[0]] 0xc00105e3c0 0 [] false false map[] 0xc0004c2e00 <nil>} 2021-05-04T22:11:11.121-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: timers Begin undeploy process. Current state: 1 2021-05-04T22:11:11.121-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: timers enabled, settings change requesting undeployment 2021-05-04T22:11:11.121-07:00 [Info] SuperSupervisor::deleteFromLocallyDeployedApps [0] Function: timers deleting from locally deployed apps map 2021-05-04T22:11:11.129-07:00 [Info] SuperSupervisor::deleteFromDeployedApps [0] Function: timers deleting from deployed apps map 2021-05-04T22:11:11.129-07:00 [Info] SuperSupervisor::SettingsChangeCallback [0] Function: timers undeployment done 2021-05-04T22:11:11.253-07:00 [Error] Unable to connect with bucket "src_bucket" 2021-05-04T22:11:11.253-07:00 [Error] Consumer::dcpEventsRemainingToProcess [worker_n1ql_0:/tmp/127.0.0.1:8091_0_3213650520.sock:19323] Failed to fetch get_all_vb_seqnos, err: No bucket named src_bucket 2021-05-04T22:11:11.259-07:00 [Error] Unable to connect with bucket "src_bucket" 2021-05-04T22:11:11.259-07:00 [Error] Consumer::dcpEventsRemainingToProcess [worker_sbm_0:/tmp/127.0.0.1:8091_0_2795806267.sock:19284] Failed to fetch get_all_vb_seqnos, err: No bucket named src_bucket This is followed by a lot of connection failures to the source bucket in both gocb and lcb: 2021-05-04T22:11:11.462-07:00 [Error] Unable to connect with bucket "src_bucket" 2021-05-04T22:11:11.463-07:00 [Error] Consumer::dcpEventsRemainingToProcess [worker_bucket_op_0:/tmp/127.0.0.1:8091_0_1017459917.sock:19243] Failed to fetch get_all_vb_seqnos, err: No bucket named src_bucket 2021-05-04T22:11:11.642-07:00 [Info] serviceChangeNotifier: received PoolChangeNotification 2021-05-04T22:11:11.660-07:00 [Error] Unable to connect with bucket "src_bucket" 2021-05-04T22:11:11.660-07:00 [Error] Consumer::dcpEventsRemainingToProcess [worker_curl_0:/tmp/127.0.0.1:8091_0_31498936.sock:19304] Failed to fetch get_all_vb_seqnos, err: No bucket named src_bucket 2021-05-04T22:11:12.296-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:3637500671] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f16301a04c0,IX=2) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system 2021-05-04T22:11:12.297-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:3637500671] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f163019ebb0,IX=1) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system 2021-05-04T22:11:12.297-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:2692714925] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f16300e6620,IX=2) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system 2021-05-04T22:11:12.305-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:2692714925] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f16300e5c40,IX=1) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system 2021-05-04T22:11:12.306-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:3637500671] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f16301a6090,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system 2021-05-04T22:11:12.306-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:961 I:2692714925] <NOHOST:NOPORT> (CTX=(nil),,SRV=0x7f163004ef70,IX=0) Connection attempt failed. Received LCB_ERR_BUCKET_NOT_FOUND (210) from libcouchbase, received 0 from operating system We can also see that the current connections we had opened with the buckets have also been shutdown: 2021-05-04T22:11:12.273-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:1108 I:3637500671] <10.112.206.102:11210> (CTX=0x7f1630129570,memcached,SRV=0x7f163019ebb0,IX=1) Got socket error LCB_ERR_SOCKET_SHUTDOWN (1025) 2021-05-04T22:11:12.273-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,connection L:485 I:3637500671] <10.112.206.102:11210> (SOCK=ebb2360296b71299) Starting. Timeout=5000000us 2021-05-04T22:11:12.273-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:1108 I:3637500671] <10.112.206.103:11210> (CTX=0x7f163023a7f0,memcached,SRV=0x7f16301a04c0,IX=2) Got socket error LCB_ERR_SOCKET_SHUTDOWN (1025) 2021-05-04T22:11:12.273-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,connection L:485 I:3637500671] <10.112.206.103:11210> (SOCK=34cd6a6ec17d57c1) Starting. Timeout=5000000us 2021-05-04T22:11:12.274-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:1108 I:2692714925] <10.112.206.102:11210> (CTX=0x7f1630109950,memcached,SRV=0x7f16300e5c40,IX=1) Got socket error LCB_ERR_SOCKET_SHUTDOWN (1025) 2021-05-04T22:11:12.274-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,connection L:485 I:2692714925] <10.112.206.102:11210> (SOCK=838e9f3e985013f0) Starting. Timeout=5000000us 2021-05-04T22:11:12.274-07:00 [Info] eventing-consumer [worker_timers_0:/tmp/127.0.0.1:8091_0_2569735366.sock:19262] [lcb,server L:1108 I:2692714925] <10.112.206.103:11210> (CTX=0x7f16301e4ef0,memcached,SRV=0x7f16300e6620,IX=2) Got socket error LCB_ERR_SOCKET_SHUTDOWN (1025) Concurrently, I see a shutdown in the memcached logs too 2021-05-04T22:11:40.266407-07:00 INFO Received shutdown request 2021-05-04T22:11:40.266440-07:00 INFO Initiating graceful shutdown. 2021-05-04T22:11:40.266442-07:00 INFO Stop all buckets 2021-05-04T22:11:40.266448-07:00 INFO Shutting down Prometheus exporter 2021-05-04T22:11:40.317389-07:00 INFO Shutting down audit daemon 2021-05-04T22:11:40.317896-07:00 INFO Shutting down client worker threads 2021-05-04T22:11:40.317954-07:00 INFO Stopping worker thread 0 2021-05-04T22:11:40.318053-07:00 INFO Stopping worker thread 1 2021-05-04T22:11:40.318156-07:00 INFO Stopping worker thread 2 2021-05-04T22:11:40.318206-07:00 INFO Stopping worker thread 3 2021-05-04T22:11:40.319026-07:00 INFO Releasing server sockets 2021-05-04T22:11:40.319077-07:00 INFO Shutting down IPv4 interface: 0.0.0.0:11210 2021-05-04T22:11:40.319118-07:00 INFO Shutting down IPv4 interface: 0.0.0.0:11209 2021-05-04T22:11:40.319192-07:00 INFO Shutting down IPv4 interface: 0.0.0.0:11207 2021-05-04T22:11:40.319197-07:00 INFO Shutting down IPv4 interface: 0.0.0.0:11206 2021-05-04T22:11:40.319201-07:00 INFO Shutting down IPv6 interface: [::]:11210 2021-05-04T22:11:40.319206-07:00 INFO Shutting down IPv6 interface: [::]:11209 2021-05-04T22:11:40.319210-07:00 INFO Shutting down IPv6 interface: [::]:11207 2021-05-04T22:11:40.319213-07:00 INFO Shutting down IPv6 interface: [::]:11206 2021-05-04T22:11:40.319219-07:00 INFO Releasing bucket resources 2021-05-04T22:11:40.319240-07:00 INFO Shutting down RBAC subsystem 2021-05-04T22:11:40.319246-07:00 INFO Releasing thread resources 2021-05-04T22:11:40.319289-07:00 INFO Shutting down executor pool 2021-05-04T22:11:40.319813-07:00 INFO Releasing signal handlers 2021-05-04T22:11:40.319819-07:00 INFO Shutting down external authentication service 2021-05-04T22:11:40.319894-07:00 INFO Release cached SSL contexts 2021-05-04T22:11:40.319896-07:00 INFO Shutting down SASL server 2021-05-04T22:11:40.319897-07:00 INFO Deinitialising tracing 2021-05-04T22:11:40.319983-07:00 INFO Shutting down all engines 2021-05-04T22:11:40.320060-07:00 INFO Removing breakpad 2021-05-04T22:11:40.320066-07:00 INFO Shutting down OpenSSL 2021-05-04T22:11:40.320073-07:00 INFO Shutting down libevent 2021-05-04T22:11:40.320084-07:00 INFO Shutting down logger extension 2021-05-04T22:11:40.320295-07:00 INFO ---------- Closing logfile 2021-05-04T22:11:40.353242-07:00 INFO ---------- Opening logfile: /opt/couchbase/var/lib/couchbase/logs/memcached.log.000000.txt 2021-05-04T22:11:40.353969-07:00 INFO Couchbase version 7.0.0-5092 starting. 2021-05-04T22:11:40.353983-07:00 INFO recalculate_max_connections: {"max_fds":200000,"max_connections":65000,"system_connections":5000,"engine_fds":133982} 2021-05-04T22:11:40.354027-07:00 INFO Breakpad enabled. Minidumps will be written to '/opt/couchbase/var/lib/couchbase/crash' 2021-05-04T22:11:40.354222-07:00 INFO Using SLA configuration: {"COMPACT_DB":{"slow":"1800 s"},"CREATE_BUCKET":{"slow":"5 s"},"DELETE_BUCKET":{"slow":"10 s"},"SELECT_BUCKET":{"slow":"10 ms"},"SEQNO_PERSISTENCE":{"slow":"30 s"},"comment":"Current MCBP SLA configuration","default":{"slow":"500 ms"},"version":1} I see a PrepareTopologyChange after I start seeing the LCB_ERR_BUCKET_NOT_FOUND errors 2021-05-04T22:11:12.456-07:00 [Info] ServiceMgr::PrepareTopologyChange change: service.TopologyChange{ID:"a638566cbdd32a41b90cb11b7db11679", CurrentTopologyRev:service.Revision(nil), Type:"topology-change-rebalance", KeepNodes:[]struct { NodeInfo service.NodeInfo "json:\"nodeInfo\""; RecoveryType service.RecoveryType "json:\"recoveryType\"" }{}, EjectNodes:[]service.NodeInfo{service.NodeInfo{NodeID:"88674c4c66bb1af41616ae62f507d464", Priority:0, Opaque:interface {}(nil)}}} 2021-05-04T22:11:12.456-07:00 [Info] SuperSupervisor::RebalanceStatus [0] Rebalance status from all running applications: map[string]bool{} 2021-05-04T22:11:12.456-07:00 [Info] ServiceMgr::PrepareTopologyChange ejectNodeUUIDs: [88674c4c66bb1af41616ae62f507d464] keepNodeUUIDs: [] 2021-05-04T22:11:12.456-07:00 [Info] ServiceMgr::PrepareTopologyChange m.rebalanceID: a638566cbdd32a41b90cb11b7db11679 m.servers: [88674c4c66bb1af41616ae62f507d464] 2021-05-04T22:11:12.456-07:00 [Error] SuperSupervisor::NotifyPrepareTopologyChange [0] 0 eventing nodes supplied as keepNodes 2021-05-04T22:11:12.456-07:00 [Info] ServiceMgr::PrepareTopologyChange completed, isBalanced: true Chanabasappa Ghali Would it be possible to get the output log of the test attached to the ticket to get a better idea of what was going on?

            Chanabasappa Ghali is it possible to get a quick rerun on build 5102, we made recent changes to LCB command retry on it.

            vinayaka.kamath Vinayaka Kamath (Inactive) added a comment - Chanabasappa Ghali is it possible to get a quick rerun on build 5102, we made recent changes to LCB command retry on it.

            With Enterprise Edition 7.0.0 build 5102, same exception

            Exception: missing data in destination bucket. Current : 0 Expected : 40320  dcp_backlog : 0  TIMERS_IN_PAST : 0 lcb_exceptions : {'10.112.206.104': {}}

            chanabasappa.ghali Chanabasappa Ghali added a comment - With Enterprise Edition 7.0.0 build 5102, same exception Exception: missing data in destination bucket. Current : 0 Expected : 40320  dcp_backlog : 0  TIMERS_IN_PAST : 0 lcb_exceptions : {'10.112.206.104': {}}
            ankit.prabhu Ankit Prabhu added a comment -

            Chanabasappa Ghali, In the latest run, mutation phase of test failed with unable to access jar file

            Error: Unable to access jarfile java_sdk_client/collections/target/javaclient/javaclient.jar
            

            2021-05-05 10:56:40 | INFO | MainProcess | Cluster_Thread | [task.execute_for_collection] java -jar java_sdk_client/collections/target/javaclient/javaclient.jar -i 10.112.206.101 -u Administrator -p password -b src_bucket -s _default -c _default -n 40320 -pc 100 -pu 0 -pd 0 -l uniform -dsn 1 -dpx doc_ -dt Person -de 0 -ds 500 -ac False -st 0 -en 40319 -o False -sd False Error: Unable to access jarfile java_sdk_client/collections/target/javaclient/javaclient.jar [<FrameSummary file /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py, line 890 in _bootstrap>, <FrameSummary file /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py, line 932 in _bootstrap_inner>, <FrameSummary file lib/tasks/taskmanager.py, line 34 in run>, <FrameSummary file lib/tasks/task.py, line 102 in step>, <FrameSummary file lib/tasks/task.py, line 6436 in execute>, <FrameSummary file lib/tasks/task.py, line 6424 in execute_for_collection>, <FrameSummary file lib/tasks/future.py, line 265 in set_exception>] Wed May 5 10:56:40 2021 2021-05-05 10:56:43 | INFO | MainProcess | test_thread | [rest_client.rebalance_reached] rebalance reached >100% in 3.006394863128662 seconds 2021-05-05 10:57:10 | INFO | MainProcess | Cluster_Thread | [task.execute_for_collection] java -jar java_sdk_client/collections/target/javaclient/javaclient.jar -i 10.112.206.101 -u Administrator -p password -b src_bucket -s scope_1 -c coll_1 -n 40320 -pc 100 -pu 0 -pd 0 -l uniform -dsn 1 -dpx doc_ -dt Person -de 0 -ds 500 -ac False -st 0 -en 40319 -o False -sd False Error: Unable to access jarfile java_sdk_client/collections/target/javaclient/javaclient.jar [<FrameSummary file /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py, line 890 in _bootstrap>, <FrameSummary file /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py, line 932 in _bootstrap_inner>, <FrameSummary file lib/tasks/taskmanager.py, line 34 in run>, <FrameSummary file lib/tasks/task.py, line 102 in step>, <FrameSummary file lib/tasks/task.py, line 6436 in execute>, <FrameSummary file lib/tasks/task.py, line 6424 in execute_for_collection>, <FrameSummary file lib/tasks/future.py, line 265 in set_exception>]
            

             Could you please re run this test after fixing that error.

            ankit.prabhu Ankit Prabhu added a comment - Chanabasappa Ghali , In the latest run, mutation phase of test failed with unable to access jar file Error: Unable to access jarfile java_sdk_client/collections/target/javaclient/javaclient.jar 2021-05-05 10:56:40 | INFO | MainProcess | Cluster_Thread | [task.execute_for_collection] java -jar java_sdk_client/collections/target/javaclient/javaclient.jar -i 10.112.206.101 -u Administrator -p password -b src_bucket -s _default -c _default -n 40320 -pc 100 -pu 0 -pd 0 -l uniform -dsn 1 -dpx doc_ -dt Person -de 0 -ds 500 -ac False -st 0 -en 40319 -o False -sd False Error: Unable to access jarfile java_sdk_client/collections/target/javaclient/javaclient.jar [<FrameSummary file /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py, line 890 in _bootstrap>, <FrameSummary file /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py, line 932 in _bootstrap_inner>, <FrameSummary file lib/tasks/taskmanager.py, line 34 in run>, <FrameSummary file lib/tasks/task.py, line 102 in step>, <FrameSummary file lib/tasks/task.py, line 6436 in execute>, <FrameSummary file lib/tasks/task.py, line 6424 in execute_for_collection>, <FrameSummary file lib/tasks/future.py, line 265 in set_exception>] Wed May 5 10:56:40 2021 2021-05-05 10:56:43 | INFO | MainProcess | test_thread | [rest_client.rebalance_reached] rebalance reached >100% in 3.006394863128662 seconds 2021-05-05 10:57:10 | INFO | MainProcess | Cluster_Thread | [task.execute_for_collection] java -jar java_sdk_client/collections/target/javaclient/javaclient.jar -i 10.112.206.101 -u Administrator -p password -b src_bucket -s scope_1 -c coll_1 -n 40320 -pc 100 -pu 0 -pd 0 -l uniform -dsn 1 -dpx doc_ -dt Person -de 0 -ds 500 -ac False -st 0 -en 40319 -o False -sd False Error: Unable to access jarfile java_sdk_client/collections/target/javaclient/javaclient.jar [<FrameSummary file /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py, line 890 in _bootstrap>, <FrameSummary file /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py, line 932 in _bootstrap_inner>, <FrameSummary file lib/tasks/taskmanager.py, line 34 in run>, <FrameSummary file lib/tasks/task.py, line 102 in step>, <FrameSummary file lib/tasks/task.py, line 6436 in execute>, <FrameSummary file lib/tasks/task.py, line 6424 in execute_for_collection>, <FrameSummary file lib/tasks/future.py, line 265 in set_exception>]  Could you please re run this test after fixing that error.

            Exception: missing data in destination bucket. Current : 38032 Expected : 40320  dcp_backlog : 0  TIMERS_IN_PAST : 0 lcb_exceptions : {'10.112.206.104': {}}

            chanabasappa.ghali Chanabasappa Ghali added a comment - Exception: missing data in destination bucket. Current : 38032 Expected : 40320  dcp_backlog : 0  TIMERS_IN_PAST : 0 lcb_exceptions : {'10.112.206.104': {}}
            ankit.prabhu Ankit Prabhu added a comment -

            Looking at test log the data mismatch happened for curl test handler.

            2021-05-05 21:17:00 | INFO | MainProcess | test_thread | [basetestcase.sleep] sleep for 30 secs. Waiting for handler code dst_bucket.scope_1.coll_3 to complete bucket operations... Current : 38032 Expected : 40320 ...
            

            If there is some exception while sending/receiving request then function skips writing to destination scope.

            function OnUpdate(doc, meta) {
                var request = {
                    path : 'job/test_suite_executor/api/json?tree=jobs[component]'
                }
                try {
                    var response = curl("GET", server, request);
                    log('response body received from server:', response.body);
                    log('response headers received from server:', response.headers);
                    log('response status received from server:', response.status);
                    var res= new Uint8Array(response.body);
                    if(response.status == 200){
                        var result= couchbase.insert(dst_bucket,meta,response.body);
                        log(result);
                    }
                    else{
                        var result= couchbase.insert(dst_bucket,meta,response.status);
                        log(result);
                    }
                }
                catch (e) {
                    log('error:', e);
                    }
            }
             
             
             
            function OnDelete(meta) {
                var request = {
                    path : 'job/test_suite_executor/api/json?tree=jobs[component]'
                };
                try {
                    var response = curl("GET", server, request);
                    log('response body received from server:', response.body);
                    log('response headers received from server:', response.headers);
                    log('response status received from server:', response.status);
                    var res= new Uint8Array(response.body);
                    var doc_meta={"id":meta.id};
                    var result = couchbase.delete(dst_bucket,doc_meta);
                    log(result);
                }
                catch (e) {
                    log('error:', e);
                    }
            }
            

            Looking at the curl function stats eventing executed all 40320 mutations and since function is in try catch block it won't increment on_update_failure if there is some exception. From stats there are curl failure count as 2286 and curl timeout 2 which gives total of 38032 mutations written to dst collection.

              "app_worker_setting_events_lost": 0,
                        "bkt_ops_cas_mismatch_count": 0,
                        "bucket_op_exception_count": 0,
                        "checkpoint_failure_count": 0,
                        "curl_failure_count": 2286,
                        "curl_max_resp_size_exceeded": 0,
                        "curl_non_200_response": 0,
                        "curl_timeout_count": 2,
                        "dcp_events_lost": 0,
                        "debugger_events_lost": 0,
                        "delete_events_lost": 0,
                        "mutation_events_lost": 0,
                        "n1ql_op_exception_count": 0,
                        "timeout_count": 0,
                        "timer_callback_missing_counter": 0,
                        "timer_context_size_exceeded_counter": 0,
                        "timer_events_lost": 0
            

            Function is logging curl error to application logs.
            Chanabasappa Ghali Could you please check application logs of curl function to check why these curl failure happened?

            One such error is unable to resolve host.

               "last_log": "<ud>\"error:\" {\"message\":\"Unable to perform the request: Couldn't resolve host name\",\"stack\":\"Error\\n    at OnUpdate (curl.js:7:21)\"} </ud>"
            

            ankit.prabhu Ankit Prabhu added a comment - Looking at test log the data mismatch happened for curl test handler. 2021-05-05 21:17:00 | INFO | MainProcess | test_thread | [basetestcase.sleep] sleep for 30 secs. Waiting for handler code dst_bucket.scope_1.coll_3 to complete bucket operations... Current : 38032 Expected : 40320 ... If there is some exception while sending/receiving request then function skips writing to destination scope. function OnUpdate(doc, meta) { var request = { path : 'job/test_suite_executor/api/json?tree=jobs[component]' } try { var response = curl("GET", server, request); log('response body received from server:', response.body); log('response headers received from server:', response.headers); log('response status received from server:', response.status); var res= new Uint8Array(response.body); if(response.status == 200){ var result= couchbase.insert(dst_bucket,meta,response.body); log(result); } else{ var result= couchbase.insert(dst_bucket,meta,response.status); log(result); } } catch (e) { log('error:', e); } }       function OnDelete(meta) { var request = { path : 'job/test_suite_executor/api/json?tree=jobs[component]' }; try { var response = curl("GET", server, request); log('response body received from server:', response.body); log('response headers received from server:', response.headers); log('response status received from server:', response.status); var res= new Uint8Array(response.body); var doc_meta={"id":meta.id}; var result = couchbase.delete(dst_bucket,doc_meta); log(result); } catch (e) { log('error:', e); } } Looking at the curl function stats eventing executed all 40320 mutations and since function is in try catch block it won't increment on_update_failure if there is some exception. From stats there are curl failure count as 2286 and curl timeout 2 which gives total of 38032 mutations written to dst collection. "app_worker_setting_events_lost": 0, "bkt_ops_cas_mismatch_count": 0, "bucket_op_exception_count": 0, "checkpoint_failure_count": 0, "curl_failure_count": 2286, "curl_max_resp_size_exceeded": 0, "curl_non_200_response": 0, "curl_timeout_count": 2, "dcp_events_lost": 0, "debugger_events_lost": 0, "delete_events_lost": 0, "mutation_events_lost": 0, "n1ql_op_exception_count": 0, "timeout_count": 0, "timer_callback_missing_counter": 0, "timer_context_size_exceeded_counter": 0, "timer_events_lost": 0 Function is logging curl error to application logs. Chanabasappa Ghali Could you please check application logs of curl function to check why these curl failure happened? One such error is unable to resolve host. "last_log": "<ud>\"error:\" {\"message\":\"Unable to perform the request: Couldn't resolve host name\",\"stack\":\"Error\\n at OnUpdate (curl.js:7:21)\"} </ud>"

            verified with 7.0.0-5102 build and issue is not observed

            chanabasappa.ghali Chanabasappa Ghali added a comment - verified with 7.0.0-5102 build and issue is not observed

            People

              chanabasappa.ghali Chanabasappa Ghali
              vikas.chaudhary Vikas Chaudhary
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty