Couchbase Server / MB-35413

TSan: Data race in Timings::get_or_create_timing_histogram

Details

    • Untriaged
    • Unknown
    • KV-Engine MH Beta part 2

    Description

      Reported by ThreadSanitizer (TSan) when running a 4-node cluster_run and performing an initial rebalance with 2 replicas.

      Steps to Reproduce

      1. Build with ThreadSanitizer:

        make -j8 EXTRA_CMAKE_OPTIONS="-D CB_THREADSANITIZER=1"
        

      2. Start a 4-node cluster with 2 replicas, with TSan configured to write its reports to a log file:

        TSAN_OPTIONS="log_path=sanitizers.log second_deadlock_stack=1" ./cluster_run --nodes 4
        ./cluster_connect -n4 -r2
        

      3. Send SyncWrites in a loop to vBucket 0:

        kv_engine/engines/ep/management$ ./sync_repl.py 127.0.0.1:12000 Administrator asdasd default loop_setD key value 1000000 1 
        

      Expected Results
      Empty sanitizers.log files (no TSan issues reported).

      Actual Results

      WARNING: ThreadSanitizer: data race (pid=1085)
        Read of size 8 at 0x0000009bad18 by thread T7 (mutexes: write M2377):
          #0 std::__uniq_ptr_impl<Hdr1sfMicroSecHistogram, std::default_delete<Hdr1sfMicroSecHistogram> >::_M_ptr() const /usr/local/include/c++/7.3.0/bits/unique_ptr.h:147 (memcached+0x0000004bf317)
          #1 std::unique_ptr<Hdr1sfMicroSecHistogram, std::default_delete<Hdr1sfMicroSecHistogram> >::get() const /usr/local/include/c++/7.3.0/bits/unique_ptr.h:337 (memcached+0x0000004bf317)
          #2 std::unique_ptr<Hdr1sfMicroSecHistogram, std::default_delete<Hdr1sfMicroSecHistogram> >::operator bool() const /usr/local/include/c++/7.3.0/bits/unique_ptr.h:351 (memcached+0x0000004bf317)
          #3 bool std::operator==<Hdr1sfMicroSecHistogram, std::default_delete<Hdr1sfMicroSecHistogram> >(std::unique_ptr<Hdr1sfMicroSecHistogram, std::default_delete<Hdr1sfMicroSecHistogram> > const&, decltype(nullptr)) /usr/local/include/c++/7.3.0/bits/unique_ptr.h:690 (memcached+0x0000004bf317)
          #4 Timings::get_or_create_timing_histogram(unsigned char) /home/couchbase/couchbase/kv_engine/daemon/timings.cc:146 (memcached+0x0000004bf317)
          #5 Timings::collect(cb::mcbp::ClientOpcode, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) /home/couchbase/couchbase/kv_engine/daemon/timings.cc:42 (memcached+0x0000004bf4aa)
      ...
       
        Previous write of size 8 at 0x0000009bad18 by thread T8 (mutexes: write M2371, write M2392):
          #0 std::enable_if<std::__and_<std::__not_<std::__is_tuple_like<Hdr1sfMicroSecHistogram*> >, std::is_move_constructible<Hdr1sfMicroSecHistogram*>, std::is_move_assignable<Hdr1sfMicroSecHistogram*> >::value, void>::type std::swap<Hdr1sfMicroSecHistogram*>(Hdr1sfMicroSecHistogram*&, Hdr1sfMicroSecHistogram*&) /usr/local/include/c++/7.3.0/bits/move.h:199 (memcached+0x0000004bf3cf)
          #1 std::unique_ptr<Hdr1sfMicroSecHistogram, std::default_delete<Hdr1sfMicroSecHistogram> >::reset(Hdr1sfMicroSecHistogram*) /usr/local/include/c++/7.3.0/bits/unique_ptr.h:374 (memcached+0x0000004bf3cf)
          #2 std::unique_ptr<Hdr1sfMicroSecHistogram, std::default_delete<Hdr1sfMicroSecHistogram> >::operator=(std::unique_ptr<Hdr1sfMicroSecHistogram, std::default_delete<Hdr1sfMicroSecHistogram> >&&) /usr/local/include/c++/7.3.0/bits/unique_ptr.h:283 (memcached+0x0000004bf3cf)
          #3 Timings::get_or_create_timing_histogram(unsigned char) /home/couchbase/couchbase/kv_engine/daemon/timings.cc:149 (memcached+0x0000004bf3cf)
          #4 Timings::collect(cb::mcbp::ClientOpcode, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) /home/couchbase/couchbase/kv_engine/daemon/timings.cc:42 (memcached+0x0000004bf4aa)
      ...
       
        Location is global 'all_buckets' of size 879104 at 0x0000009ba980 (memcached+0x0000009bad18)
       
        Mutex M2377 (0x7b8c00000388) created at:
          #0 pthread_mutex_lock <null> (libtsan.so.0+0x00000003bbbf)
          #1 __gthread_mutex_lock /usr/local/include/c++/7.3.0/x86_64-pc-linux-gnu/bits/gthr-default.h:748 (memcached+0x0000004bb837)
          #2 std::mutex::lock() /usr/local/include/c++/7.3.0/bits/std_mutex.h:103 (memcached+0x0000004bb837)
          #3 phosphor::MutexEventGuard<std::mutex>::MutexEventGuard(phosphor::tracepoint_info const*, phosphor::tracepoint_info const*, bool, std::mutex&, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) /home/couchbase/couchbase/phosphor/include/phosphor/scoped_event_guard.h:93 (memcached+0x0000004bb837)
          #4 thread_libevent_process /home/couchbase/couchbase/kv_engine/daemon/thread.cc:300 (memcached+0x0000004bb837)
          #5 event_persist_closure /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/libevent/libevent-prefix/src/libevent/event.c:1580 (libevent_core.so.2.1.8+0x000000017856)
          #6 event_process_active_single_queue /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/libevent/libevent-prefix/src/libevent/event.c:1639 (libevent_core.so.2.1.8+0x000000017856)
          #7 CouchbaseThread::run() /home/couchbase/couchbase/platform/src/cb_pthreads.cc:58 (libplatform_so.so.0.1.0+0x00000000a217)
          #8 platform_thread_wrap /home/couchbase/couchbase/platform/src/cb_pthreads.cc:71 (libplatform_so.so.0.1.0+0x00000000a217)
          #9 <null> <null> (libtsan.so.0+0x00000002843b)
       
        Mutex M2371 (0x7b8c00000638) created at:
          #0 pthread_mutex_lock <null> (libtsan.so.0+0x00000003bbbf)
          #1 __gthread_mutex_lock /usr/local/include/c++/7.3.0/x86_64-pc-linux-gnu/bits/gthr-default.h:748 (memcached+0x0000004bb837)
          #2 std::mutex::lock() /usr/local/include/c++/7.3.0/bits/std_mutex.h:103 (memcached+0x0000004bb837)
          #3 phosphor::MutexEventGuard<std::mutex>::MutexEventGuard(phosphor::tracepoint_info const*, phosphor::tracepoint_info const*, bool, std::mutex&, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) /home/couchbase/couchbase/phosphor/include/phosphor/scoped_event_guard.h:93 (memcached+0x0000004bb837)
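
      Both stacks end in Timings::get_or_create_timing_histogram: the read at timings.cc:146 is a null check on a std::unique_ptr, and the write at timings.cc:149 is a move-assignment into the same pointer. The two threads hold different mutexes (M2377 versus M2371 and M2392), so nothing orders the accesses. That is the shape of an unsynchronised check-then-create. The sketch below shows one possible way to make this kind of lazy creation race-free using an atomic pointer and compare-exchange. It is not the actual Timings code, nor the fix applied for this issue: Histogram, LazyHistograms and get_or_create are hypothetical names, and Histogram is a trivial stand-in for Hdr1sfMicroSecHistogram.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for Hdr1sfMicroSecHistogram; it only counts samples.
struct Histogram {
    std::atomic<uint64_t> samples{0};
    void add() { samples.fetch_add(1, std::memory_order_relaxed); }
};

// Racy shape suggested by the trace (unsynchronised check, then assign):
//     if (slot == nullptr) { slot = std::make_unique<Histogram>(); }
//     return *slot;

// Assumed alternative: one atomic raw pointer per opcode, published with
// compare-exchange so that exactly one thread's histogram is installed.
class LazyHistograms {
public:
    LazyHistograms() {
        for (auto& slot : slots) {
            slot.store(nullptr, std::memory_order_relaxed);
        }
    }

    ~LazyHistograms() {
        for (auto& slot : slots) {
            delete slot.load(std::memory_order_relaxed);
        }
    }

    LazyHistograms(const LazyHistograms&) = delete;
    LazyHistograms& operator=(const LazyHistograms&) = delete;

    Histogram& get_or_create(uint8_t opcode) {
        auto& slot = slots[opcode];
        Histogram* h = slot.load(std::memory_order_acquire);
        if (h == nullptr) {
            auto fresh = std::make_unique<Histogram>();
            if (slot.compare_exchange_strong(h,
                                             fresh.get(),
                                             std::memory_order_acq_rel,
                                             std::memory_order_acquire)) {
                // This thread won the race; the slot now owns the object.
                h = fresh.release();
            }
            // Otherwise another thread installed its histogram first:
            // h was updated to that pointer and fresh is freed on scope exit.
        }
        assert(h != nullptr);
        return *h;
    }

private:
    // One slot for every possible uint8_t opcode value.
    std::array<std::atomic<Histogram*>, 256> slots;
};
```

      Holding a single mutex around both the null check and the assignment would also close the race. The compare-exchange variant only avoids taking a lock on the hot path once a histogram already exists.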
      

            People

              drigby Dave Rigby (Inactive)