  1. Couchbase Server
  2. MB-28414

Eventing misses some mutations with non-default vbucket counts (535, 1001, etc.) and they go into dcp_backlog


Details

    • Bug
    • Resolution: Declined
    • Major
    • 7.0.0
    • 5.5.0
    • eventing
    • 5.5.0-1992
    • Untriaged
    • Centos 64-bit
    • No

    Description

      Scripts to Repro

      ./testrunner -i /tmp/testexec.16849.ini get-cbcollect-info=True -t eventing.eventing_sanity.EventingSanity.test_doc_timer_events_from_handler_code_with_bucket_ops,nodes_init=4,services_init=kv-eventing-index-n1ql,dataset=default,groups=simple,reset_services=True,skip_cleanup=True,vbuckets=1001
      ./testrunner -i /tmp/testexec.16849.ini get-cbcollect-info=True -t eventing.eventing_sanity.EventingSanity.test_cron_timer_events_from_handler_code_with_n1ql,nodes_init=4,services_init=kv-eventing-index-n1ql,dataset=default,groups=simple,reset_services=True,skip_cleanup=True,vbuckets=535
      

       
      Let me clarify that the goal of the non-default vbucket tests is not to run with different vbucket counts between 4 and 1024 on the assumption that customers will do the same. We know that customers are not encouraged to use non-default vbucket counts, and with the exception of Mac builds (64 vbuckets, again not used in production), all other builds default to 1024 vbuckets.

      The goal of these tests is to exercise vbucket ownership assignment across eventing nodes under different scenarios, e.g. when we have a prime number of eventing nodes and cannot distribute the vbuckets equally among them. We have test cases with 3 and 5 eventing nodes. Using a prime number of eventing nodes larger than 5 would be overkill in functional tests.
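To illustrate the uneven split these tests target, here is a minimal Python sketch of dividing a vbucket count into contiguous, near-equal ranges. The function name `split_vbuckets` and the rule of giving the remainder to the lowest-numbered owners are assumptions made for illustration, not the eventing planner's actual code; the sketch happens to reproduce the per-worker ranges reported under `vb_distribution_stats` in the Stats section for 1001 vbuckets and 3 workers.

```python
def split_vbuckets(vb_count, num_owners):
    """Split vbuckets 0..vb_count-1 into contiguous, near-equal ranges.

    Hypothetical helper: the first (vb_count % num_owners) owners each get
    one extra vbucket. Returns a list of inclusive (start, end) tuples.
    """
    if num_owners <= 0:
        raise ValueError("num_owners must be positive")
    base, extra = divmod(vb_count, num_owners)
    ranges = []
    start = 0
    for i in range(num_owners):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# 1001 vbuckets cannot be divided evenly among 3 owners.
for start, end in split_vbuckets(1001, 3):
    print(f"[{start}-{end}]")
```

With 1001 vbuckets and 3 owners the ranges hold 334, 334, and 333 vbuckets, so one owner always carries fewer than the others.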

      Please feel free to reduce the priority of this bug if you think this line of testing is not a high priority for now. At the same time, if you think running with different vbucket counts could help uncover bugs in rebalance tests (or any other tests, for that matter), let me know; as of now we run only sanity tests with non-default values.

      Attached automation logs and cbcollect_info logs.

      Stats:

      [
          {
              "event_processing_stats": {
                  "ADHOC_DOC_TIMER_RESPONSES_RECEIVED": 96, 
                  "AGG_MESSAGES_SENT_TO_WORKER": 4922, 
                  "DCP_MUTATION": 2022, 
                  "DCP_MUTATION_SENT_TO_WORKER": 2016, 
                  "DCP_SNAPSHOT": 70, 
                  "DCP_STREAMREQ": 1001, 
                  "DOC_TIMER_EVENTS": 1409, 
                  "DOC_TIMER_RESPONSES_RECEIVED": 2016, 
                  "EXECUTION_STATS": 369, 
                  "FAILURE_STATS": 369, 
                  "HANDLER_CODE": 3, 
                  "LATENCY_STATS": 369, 
                  "LCB_EXCEPTION_STATS": 369, 
                  "LOG_LEVEL": 3, 
                  "SOURCE_MAP": 3, 
                  "THR_COUNT": 3, 
                  "THR_MAP": 3, 
                  "V8_INIT": 3, 
                  "V8_LOAD": 3
              }, 
              "events_remaining": {
                  "dcp_backlog": 1919
              }, 
              "execution_stats": {
                  "agg_queue_size": 0, 
                  "cron_timer_msg_counter": 0, 
                  "dcp_delete_msg_counter": 0, 
                  "dcp_mutation_msg_counter": 2016, 
                  "doc_timer_create_failure": 2010, 
                  "doc_timer_msg_counter": 1409, 
                  "doc_timer_responses_sent": 2016, 
                  "enqueued_cron_timer_msg_counter": 0, 
                  "enqueued_dcp_delete_msg_counter": 0, 
                  "enqueued_dcp_mutation_msg_counter": 2016, 
                  "enqueued_doc_timer_msg_counter": 1409, 
                  "feedback_queue_size": 0, 
                  "messages_parsed": 4913, 
                  "on_delete_failure": 0, 
                  "on_delete_success": 0, 
                  "on_update_failure": 0, 
                  "on_update_success": 2016
              }, 
              "failure_stats": {
                  "bucket_op_exception_count": 0, 
                  "checkpoint_failure_count": 408, 
                  "n1ql_op_exception_count": 0, 
                  "timeout_count": 0
              }, 
              "function_name": "Function_46311861_test_doc_timer_events_from_handler_code_with_bucket_ops", 
              "lcb_exception_stats": {}, 
              "planner_stats": [
                  {
                      "host_name": "172.23.107.134:8096", 
                      "start_vb": 0, 
                      "vb_count": 1001
                  }
              ], 
              "vb_distribution_stats": {
                  "172.23.107.134:8096": {
                      "worker_Function_46311861_test_doc_timer_events_from_handler_code_with_bucket_ops_0": "[0-333]", 
                      "worker_Function_46311861_test_doc_timer_events_from_handler_code_with_bucket_ops_1": "[334-667]", 
                      "worker_Function_46311861_test_doc_timer_events_from_handler_code_with_bucket_ops_2": "[668-1000]"
                  }
              }, 
              "worker_pids": {
                  "worker_Function_46311861_test_doc_timer_events_from_handler_code_with_bucket_ops_0": 23338, 
                  "worker_Function_46311861_test_doc_timer_events_from_handler_code_with_bucket_ops_1": 23344, 
                  "worker_Function_46311861_test_doc_timer_events_from_handler_code_with_bucket_ops_2": 23351
              }
          }
      ] 
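One possible way to read the counters above is sketched below in Python. The helper `summarize_mutation_gap` is hypothetical, and treating `DCP_MUTATION - DCP_MUTATION_SENT_TO_WORKER` as "received but not forwarded to a worker" is an assumption about what these counters mean, not something confirmed in this ticket. The sample dictionary copies only the relevant values from the stats dump.

```python
def summarize_mutation_gap(stats):
    """Pull the counters relevant to missed mutations out of one stats entry.

    Assumption: DCP_MUTATION counts mutations received over DCP and
    DCP_MUTATION_SENT_TO_WORKER counts those forwarded to a worker.
    """
    eps = stats["event_processing_stats"]
    ex = stats["execution_stats"]
    return {
        "not_forwarded": eps["DCP_MUTATION"] - eps["DCP_MUTATION_SENT_TO_WORKER"],
        "handled": ex["on_update_success"] + ex["on_update_failure"],
        "dcp_backlog": stats["events_remaining"]["dcp_backlog"],
        "timer_create_failures": ex["doc_timer_create_failure"],
    }


# Values copied from the stats dump in this ticket.
sample = {
    "event_processing_stats": {
        "DCP_MUTATION": 2022,
        "DCP_MUTATION_SENT_TO_WORKER": 2016,
    },
    "events_remaining": {"dcp_backlog": 1919},
    "execution_stats": {
        "doc_timer_create_failure": 2010,
        "on_update_failure": 0,
        "on_update_success": 2016,
    },
}

print(summarize_mutation_gap(sample))
```

On these numbers the sketch reports 6 mutations not forwarded, 2016 handled, a backlog of 1919, and 2010 doc timer creation failures.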
      

      Attachments

        1. test_7.zip
          21.00 MB
        2. test_8.zip
          22.66 MB

          People

            vikas.chaudhary Vikas Chaudhary
            Balakumaran.Gopal Balakumaran Gopal