Couchbase Server / MB-45378

Memcached becomes unresponsive during doc_gets + mcstat reset execution in loop

Details

    Description

      Build: 7.0.0-4023

      Scenario:

      1. Load initial docs into the bucket
      2. Perform continuous reads
      3. Run 'mcstat reset' in a loop, in parallel with the reads
      4. Run the 'cbstats timings' command to read the current values
      5. Validate that there is no crash while the stats are being reset continuously
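
The steps above can be sketched as a small harness that runs one command in a background loop while another runs in the foreground, and treats a per-command timeout as a sign that the server has stopped responding. This is a minimal sketch, not the QE test itself: `run_repeatedly` and `reset_while_reading_timings` are hypothetical names, and the argument vectors for `mcstat reset` and `cbstats timings` are left to the caller because the report does not list the options used.

```python
import subprocess
import threading


def run_repeatedly(cmd, iterations, timeout_s, stop_event=None):
    """Run cmd up to `iterations` times and return the number of successful runs.

    Raises subprocess.TimeoutExpired if a single run exceeds timeout_s,
    which is how this sketch models an unresponsive server.
    """
    completed = 0
    for _ in range(iterations):
        if stop_event is not None and stop_event.is_set():
            break
        subprocess.run(cmd, check=True, capture_output=True, timeout=timeout_s)
        completed += 1
    return completed


def reset_while_reading_timings(reset_cmd, timings_cmd, iterations, timeout_s=30):
    """Run reset_cmd in a background thread while timings_cmd runs in the foreground."""
    stop = threading.Event()
    result = {}

    def reset_loop():
        try:
            result["resets"] = run_repeatedly(reset_cmd, iterations, timeout_s, stop)
        except (subprocess.TimeoutExpired, subprocess.CalledProcessError) as exc:
            # Keep the failure so the caller can inspect it after join().
            result["reset_error"] = exc

    worker = threading.Thread(target=reset_loop)
    worker.start()
    try:
        result["timings"] = run_repeatedly(timings_cmd, iterations, timeout_s)
    finally:
        # Always stop and join the background loop, even if the foreground loop failed.
        stop.set()
        worker.join()
    return result
```

A caller would pass the full `mcstat ... reset` and `cbstats ... timings` command lines as lists of strings. A `subprocess.TimeoutExpired` from either loop would correspond to the hang described in the observation below.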
      

      Observation:

      After roughly 240K gets, memcached becomes unresponsive, followed by auto-failover of the target node (.211).

      From debug.log:

      [ns_server:debug,2021-03-31T12:27:11.550-07:00,ns_1@172.23.105.211:ns_memcached-default<0.7137.0>:ns_memcached:handle_info:751]Got {'EXIT',<0.7149.0>,
                  {{badmatch,{error,timeout}},
                   [{mc_client_binary,stats_recv,4,
                                      [{file,"src/mc_client_binary.erl"},
                                       {line,168}]},
                    {mc_client_binary,stats,4,
                                      [{file,"src/mc_client_binary.erl"},
                                       {line,419}]},
                    {ns_memcached,do_handle_call,3,
                                  [{file,"src/ns_memcached.erl"},{line,472}]},
                    {ns_memcached,worker_loop,3,
                                  [{file,"src/ns_memcached.erl"},{line,245}]},
                    {proc_lib,init_p_do_apply,3,
                              [{file,"proc_lib.erl"},{line,249}]}]}}. Exiting.
      [user:info,2021-03-31T12:27:11.552-07:00,ns_1@172.23.105.211:ns_memcached-default<0.7137.0>:ns_memcached:do_terminate:802]Control connection to memcached on 'ns_1@172.23.105.211' disconnected. Check logs for details.
      [ns_server:debug,2021-03-31T12:27:11.552-07:00,ns_1@172.23.105.211:ns_memcached-default<0.7137.0>:ns_memcached:terminate:765]Terminated.
      [error_logger:error,2021-03-31T12:27:11.552-07:00,ns_1@172.23.105.211:ns_memcached-default<0.7137.0>:ale_error_logger_handler:do_log:107]
      =========================ERROR REPORT=========================
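
To locate the same failure in other debug.log files, the two signatures visible in the excerpt above (the `{badmatch,{error,timeout}}` exit and the "Control connection to memcached" message) can be searched for and attributed to the most recent entry header. This is a rough sketch: `find_memcached_timeouts`, the header pattern, and the reason labels are assumptions derived only from the lines quoted in this report, not from a documented log format.

```python
import re

# Header of a log entry as it appears in the excerpt, e.g.
# [ns_server:debug,2021-03-31T12:27:11.550-07:00,ns_1@172.23.105.211:...]
ENTRY_HEADER = re.compile(
    r"^\s*\[(?P<category>\w+):(?P<level>\w+),(?P<timestamp>[^,]+),(?P<node>[^:]+):"
)

# Substrings taken verbatim from the excerpt, with hypothetical labels.
SIGNATURES = [
    ("{badmatch,{error,timeout}}", "stats timeout"),
    ("Control connection to memcached", "control connection lost"),
]


def find_memcached_timeouts(lines):
    """Return (timestamp, node, reason) for each line matching a known signature."""
    hits = []
    current = None  # (timestamp, node) of the entry currently being scanned
    for line in lines:
        header = ENTRY_HEADER.match(line)
        if header:
            current = (header.group("timestamp"), header.group("node"))
        if current is None:
            continue  # skip text that appears before the first entry header
        for needle, reason in SIGNATURES:
            if needle in line:
                hits.append((current[0], current[1], reason))
    return hits
```

Continuation lines of a multi-line entry, such as the stack trace, inherit the timestamp and node of the header that precedes them.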
      

      Last working build: 7.0.0-3814

      QE testcase:

      epengine.basic_ops.basic_ops.test_MB_41510,nodes_init=3,num_items=100000,replicas=1,sdk_timeout=60,batch_size=1000,process_concurrency=6
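
For scripting around this test case, the spec line above can be split into the test path and its parameters. This is a small sketch that assumes the line is a comma-separated test path followed by `key=value` pairs, as it appears here. `parse_testcase` is a hypothetical helper and not part of the test framework.

```python
def parse_testcase(spec):
    """Split 'module.path.test_name,key=value,...' into (test_path, params)."""
    test_path, *pairs = spec.strip().split(",")
    params = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        # Convert plain non-negative integers; leave everything else as a string.
        params[key] = int(value) if value.isdigit() else value
    return test_path, params
```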

      Attachments

        1. bt-all-threads.txt (124 kB)
        2. kill-sanitizers.log.34785 (31 kB)
        3. sanitizers.log.34523 (18 kB)
        4. test.log.2021-04-08_11 (5.08 MB)
        5. test.log-1.2021-04-08_11 (5.08 MB)

            People

              owend Daniel Owen
              ashwin.govindarajulu Ashwin Govindarajulu