Couchbase Server / MB-41712

Memcached hangs forever trying to shut down the scrubber

Details

    • Untriaged
    • 1
    • Unknown
    • KV Sprint 2020-Oct

    Description

      The memcached process was seen hanging after a CV run on Windows, with the backtrace below and just one thread remaining:

      0:000> k
       # Child-SP          RetAddr           Call Site
      00 000000af`57cff168 00007ffc`5e76bd75 ntdll!NtWaitForAlertByThreadId+0x14
      01 000000af`57cff170 00007ffc`5b6d8c78 ntdll!RtlSleepConditionVariableSRW+0xf5
      02 000000af`57cff1f0 00007ffc`4cfb53dd KERNELBASE!SleepConditionVariableSRW+0x28
      03 000000af`57cff230 00007ffc`4cf92004 msvcp140!__crtSleepConditionVariableSRW+0x11 [d:\agent\_work\1\s\src\vctools\crt\github\stl\src\winapisupp.cpp @ 482] 
      04 000000af`57cff270 00007ffc`4cf91eaf msvcp140!Concurrency::details::stl_condition_variable_win7::wait_for+0x14 [d:\agent\_work\1\s\src\vctools\crt\github\stl\src\primitives.h @ 179] 
      05 000000af`57cff2a0 00007ffc`4d69cb34 msvcp140!do_wait+0x93 [d:\agent\_work\1\s\src\vctools\crt\github\stl\src\cond.cpp @ 66] 
      06 (Inline Function) --------`-------- default_engine!std::_Cnd_timedwaitX+0x10 [c:\program files (x86)\microsoft visual studio\2017\professional\vc\tools\msvc\14.16.27023\include\thr\xthread @ 103] 
      07 (Inline Function) --------`-------- default_engine!std::condition_variable::wait_until+0x24 [c:\program files (x86)\microsoft visual studio\2017\professional\vc\tools\msvc\14.16.27023\include\mutex @ 775] 
      08 (Inline Function) --------`-------- default_engine!std::condition_variable::_Wait_until1+0x3b [c:\program files (x86)\microsoft visual studio\2017\professional\vc\tools\msvc\14.16.27023\include\mutex @ 822] 
      09 (Inline Function) --------`-------- default_engine!std::condition_variable::wait_for+0x7a [c:\program files (x86)\microsoft visual studio\2017\professional\vc\tools\msvc\14.16.27023\include\mutex @ 742] 
      0a 000000af`57cff310 00007ffc`4d69c925 default_engine!EngineManager::waitForScrubberToBeIdle+0x114 [c:\jenkins\workspace\kv_engine-windows-master\kv_engine\engines\default_engine\engine_manager.cc @ 98] 
      0b 000000af`57cff3b0 00007ffc`4d699b2e default_engine!EngineManager::shutdown+0x155 [c:\jenkins\workspace\kv_engine-windows-master\kv_engine\engines\default_engine\engine_manager.cc @ 142] 
      0c (Inline Function) --------`-------- default_engine!EngineManager::{dtor}+0x8 [c:\jenkins\workspace\kv_engine-windows-master\kv_engine\engines\default_engine\engine_manager.cc @ 41] 
      0d 000000af`57cff420 00007ffc`5ad90eb3 default_engine!std::default_delete<EngineManager>::operator()+0x1e [c:\program files (x86)\microsoft visual studio\2017\professional\vc\tools\msvc\14.16.27023\include\memory @ 2084] 
      0e 000000af`57cff460 00007ffc`5ad8da0b ucrtbase!malloc_base+0x1d3
      0f 000000af`57cff4c0 00007ffc`5ad89ed4 ucrtbase!__crt_seh_guarded_call<int>::operator()<<lambda_7777bce6b2f8c936911f934f8298dc43>,<lambda_f03950bc5685219e0bcd2087efbe011e> & __ptr64,<lambda_3883c3dff614d5e0c5f61bb1ac94921c> >+0x3b
      10 000000af`57cff4f0 00007ffc`4d6aae92 ucrtbase!execute_onexit_table+0x34
      11 000000af`57cff520 00007ffc`4d6aafdc default_engine!dllmain_crt_process_detach+0x52 [d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\dll_dllmain.cpp @ 108] 
      12 000000af`57cff550 00007ffc`5e729d9f default_engine!dllmain_dispatch+0xe8 [d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\dll_dllmain.cpp @ 212] 
      13 000000af`57cff5b0 00007ffc`5e70806b ntdll!LdrpCallInitRoutine+0x4b
      14 000000af`57cff610 00007ffc`5e707d94 ntdll!LdrShutdownProcess+0x14b
      15 000000af`57cff720 00007ffc`5dfdce6a ntdll!RtlExitUserProcess+0xb4
      16 000000af`57cff750 00007ffc`5ad8694d kernel32!ExitProcessImplementation+0xa
      17 000000af`57cff780 00007ffc`5ad868df ucrtbase!exit+0xed
      18 000000af`57cff7b0 00007ff6`3609bb1f ucrtbase!exit+0x7f
      19 000000af`57cff800 00007ff6`3609bd6b memcached!<lambda_4d1eb2070d36a9f66ee8d456cd273e56>::operator()+0x13f [c:\jenkins\workspace\kv_engine-windows-master\kv_engine\daemon\parent_monitor.cc @ 70] 
      1a 000000af`57cff890 00007ff6`3609bc89 memcached!std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_4d1eb2070d36a9f66ee8d456cd273e56> >,std::default_delete<std::tuple<<lambda_4d1eb2070d36a9f66ee8d456cd273e56> > > > >::_Go+0xb [c:\program files (x86)\microsoft visual studio\2017\professional\vc\tools\msvc\14.16.27023\include\thr\xthread @ 229] 
      1b 000000af`57cff8d0 00007ffc`5ad9cab0 memcached!`fmt::v5::basic_writer<fmt::v5::back_insert_range<fmt::v5::internal::basic_buffer<char> > >::write_double<double>'::`2'::write_inf_or_nan_t::operator()+0x49
      1c 000000af`57cff900 00007ffc`5dfc8364 ucrtbase!o__realloc_base+0x60
      1d 000000af`57cff930 00007ffc`5e765e91 kernel32!BaseThreadInitThunk+0x14
      1e 000000af`57cff960 00000000`00000000 ntdll!RtlUserThreadStart+0x21
      

      It looks like it is stuck on the cond.wait_for() line of EngineManager::waitForScrubberToBeIdle() (marked >>> below).

      void EngineManager::waitForScrubberToBeIdle(std::unique_lock<std::mutex>& lck) {
          if (!lck.owns_lock()) {
              throw std::logic_error("EngineManager::waitForScrubberToBeIdle: Lock must be held");
          }
          while (!scrubberTask.isIdle()) {
              auto& task = scrubberTask;
              // There is a race for the isIdle call, and I don't want to solve it
              // by using a mutex as that would result in the use of trying to
              // acquire multiple locks (which is a highway to deadlocks ;-)
              //
              // The scrubber does *not* hold the mutex for the scrubber while calling
              // notify on this condition variable, and the state is then Scrubbing.
              // That means that this thread will wake, grab the mutex and check
              // the state which is still Scrubbing and go back to sleep (depending
              // on the scheduling order)..
      >>>    cond.wait_for(lck,
                            std::chrono::milliseconds(10),
                            [&task] {
                  return task.isIdle();
              });
          }
      }
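
The loop above only exits once scrubberTask.isIdle() returns true. The backtrace shows the wait running from the DLL's process-detach path with a single thread left, so one plausible reading is that nothing remains to make the task idle and the 10 ms wait_for repeats indefinitely. A minimal sketch of a bounded wait follows, assuming an overall deadline is acceptable during shutdown. FakeScrubberTask and waitForIdleWithDeadline are hypothetical names used for illustration only; they are not part of kv_engine and this is not the fix that was applied.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for the scrubber task; only isIdle() mirrors the
// snippet in the description.
struct FakeScrubberTask {
    bool idle = false;
    bool isIdle() const { return idle; }
};

// Waits until the task is idle or until `timeout` has elapsed, whichever
// comes first. Returns true if the task is idle, false if the deadline
// passed while the task was still busy.
bool waitForIdleWithDeadline(std::unique_lock<std::mutex>& lck,
                             std::condition_variable& cond,
                             const FakeScrubberTask& task,
                             std::chrono::milliseconds timeout) {
    if (!lck.owns_lock()) {
        throw std::logic_error("waitForIdleWithDeadline: Lock must be held");
    }
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    // wait_until with a predicate re-checks the predicate after every wakeup
    // and returns the predicate's final value once the deadline has passed,
    // so no outer while loop is needed.
    return cond.wait_until(lck, deadline, [&task] { return task.isIdle(); });
}
```

With a helper of this shape, the caller can decide what to do when the wait returns false (for example, log and skip the remaining cleanup) rather than blocking process exit.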
      

          People

            richard.demellow Richard deMellow
            richard.demellow Richard deMellow