Couchbase Server / MB-49832

Data race in magma::TaskQueue::Enqueue

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Not a Bug
    • Neo
    • Neo
    • storage-engine
    • Untriaged
    • 1
    • Unknown

    Description

      Summary

      When running a TAF test under cluster_run, the following race is observed in Magma's task management:

       WARNING: ThreadSanitizer: data race (pid=27242)
         Write of size 8 at 0x000120e0bd88 by thread T23 (mutexes: write M36503, write M546342934386906912):
           #0 magma::TaskQueue::Enqueue(magma::Task const&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> >) <null>:2 (memcached:x86_64+0x10069da72)
           #1 magma::ThreadPool::Schedule(magma::ThreadPool::WorkerType, magma::Task const&) <null>:2 (memcached:x86_64+0x1006a1f68)
           #2 magma::KVStoreConfig::ScheduleTask(std::__1::function<void ()>, magma::WaitGroup&) config.cc:205 (memcached:x86_64+0x100586343)
           #3 magma::KVStore::flushMemTables(magma::WAL*, magma::WALOffset, magma::FlushMode, magma::BlockingMode) <null>:2 (memcached:x86_64+0x1006658b2)
           #4 magma::KVStore::FlushMemTables(magma::WAL*, magma::FlushMode, magma::BlockingMode) <null>:2 (memcached:x86_64+0x10066613f)
           #5 magma::Magma::Impl::tryWriteCacheFlush(magma::BlockingMode, magma::FlushMode) db.cc:1129 (memcached:x86_64+0x1005a0391)
           #6 magma::Magma::Impl::WriteDocs(unsigned short, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const&, unsigned int, std::__1::function<void (magma::Magma::WriteOperation const&, bool, magma::Slice)>, std::__1::function<std::__1::pair<magma::Status, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const*> ()>) db_write.cc:50 (memcached:x86_64+0x1005c2335)
           #7 magma::Magma::WriteDocs(unsigned short, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const&, unsigned int, std::__1::function<void (magma::Magma::WriteOperation const&, bool, magma::Slice)>, std::__1::function<std::__1::pair<magma::Status, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const*> ()>) db_write.cc:59 (memcached:x86_64+0x1005c2635)
           #8 MagmaMemoryTrackingProxy::WriteDocs(unsigned short, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const&, unsigned int, std::__1::function<void (magma::Magma::WriteOperation const&, bool, magma::Slice)>, std::__1::function<std::__1::pair<magma::Status, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const*> ()>) magma-memory-tracking-proxy.cc:362 (memcached:x86_64+0x10025c010)
           #9 MagmaKVStore::saveDocs(MagmaKVStoreTransactionContext&, VB::Commit&, kvstats_ctx&) magma-kvstore.cc:1411 (memcached:x86_64+0x10025b1ea)
           #10 MagmaKVStore::commit(std::__1::unique_ptr<TransactionContext, std::__1::default_delete<TransactionContext> >, VB::Commit&) magma-kvstore.cc:707 (memcached:x86_64+0x100255b28)
           #11 EPBucket::commit(KVStoreIface&, std::__1::unique_ptr<TransactionContext, std::__1::default_delete<TransactionContext> >, VB::Commit&) ep_bucket.cc:927 (memcached:x86_64+0x10041ceae)
           #12 EPBucket::flushVBucket_UNLOCKED(LockedVBucketPtr) ep_bucket.cc:799 (memcached:x86_64+0x10041b78b)
           #13 EPBucket::flushVBucket(Vbid) ep_bucket.cc:376 (memcached:x86_64+0x10041a1fa)
           #14 Flusher::flushVB() flusher.cc:285 (memcached:x86_64+0x1003e31ef)
           #15 Flusher::step(GlobalTask*) flusher.cc:200 (memcached:x86_64+0x1003e2c8d)
           #16 FlusherTask::run() tasks.cc:28 (memcached:x86_64+0x100356896)
           #17 FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::'lambda0'()::operator()() const folly_executorpool.cc:189 (memcached:x86_64+0x10070651c)
           #18 void folly::detail::function::FunctionTraits<void ()>::callSmall<FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::'lambda0'()>(folly::detail::function::Data&) Function.h:387 (memcached:x86_64+0x100706169)
           #19 folly::ThreadPoolExecutor::runTask(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) ThreadPoolExecutor.cpp:97 (memcached:x86_64+0x1008b81f8)
           #20 folly::CPUThreadPoolExecutor::threadRun(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>) CPUThreadPoolExecutor.cpp:265 (memcached:x86_64+0x100893540)
           #21 void folly::detail::function::FunctionTraits<void ()>::callSmall<std::__1::__bind<void (folly::ThreadPoolExecutor::*)(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*, std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>&> >(folly::detail::function::Data&) Function.h:387 (memcached:x86_64+0x1008bdc18)
           #22 CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()::operator()() folly_executorpool.cc:47 (memcached:x86_64+0x1007037c4)
           #23 void folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()>(folly::detail::function::Data&) Function.h:401 (memcached:x86_64+0x100703681)
           #24 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()> >(void*) thread:291 (memcached:x86_64+0x10070423f)
       
         Previous write of size 8 at 0x000120e0bd88 by thread T46 (mutexes: write M5066554427358624):
           #0 magma::TimedTask::Complete(bool) <null>:2 (memcached:x86_64+0x10069f0ab)
           #1 std::__1::__shared_ptr_emplace<magma::TimedTask, std::__1::allocator<magma::TimedTask> >::__on_zero_shared() <null>:2 (memcached:x86_64+0x1006afd36)
           #2 magma::TaskWorker::loop(void*) <null>:2 (memcached:x86_64+0x10069efe1)
           #3 platform_thread_wrap(void*) cb_pthreads.cc:64 (memcached:x86_64+0x10085ff2c)
       
         Mutex M36503 (0x0001120eb000) created at:
           #0 pthread_mutex_trylock <null>:3 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x2dfbe)
           #1 std::__1::mutex::try_lock() <null>:2 (libc++.1.dylib:x86_64+0x393f6)
           #2 Flusher::flushVB() flusher.cc:285 (memcached:x86_64+0x1003e31ef)
           #3 Flusher::step(GlobalTask*) flusher.cc:200 (memcached:x86_64+0x1003e2c8d)
           #4 FlusherTask::run() tasks.cc:28 (memcached:x86_64+0x100356896)
           #5 FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::'lambda0'()::operator()() const folly_executorpool.cc:189 (memcached:x86_64+0x10070651c)
           #6 void folly::detail::function::FunctionTraits<void ()>::callSmall<FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::'lambda0'()>(folly::detail::function::Data&) Function.h:387 (memcached:x86_64+0x100706169)
           #7 folly::ThreadPoolExecutor::runTask(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) ThreadPoolExecutor.cpp:97 (memcached:x86_64+0x1008b81f8)
           #8 folly::CPUThreadPoolExecutor::threadRun(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>) CPUThreadPoolExecutor.cpp:265 (memcached:x86_64+0x100893540)
           #9 void folly::detail::function::FunctionTraits<void ()>::callSmall<std::__1::__bind<void (folly::ThreadPoolExecutor::*)(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*, std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>&> >(folly::detail::function::Data&) Function.h:387 (memcached:x86_64+0x1008bdc18)
           #10 CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()::operator()() folly_executorpool.cc:47 (memcached:x86_64+0x1007037c4)
           #11 void folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()>(folly::detail::function::Data&) Function.h:401 (memcached:x86_64+0x100703681)
           #12 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()> >(void*) thread:291 (memcached:x86_64+0x10070423f)
       
         Mutex M546342934386906912 is already destroyed.
       
         Mutex M5066554427358624 is already destroyed.
       
         Thread T23 (tid=1479076, running) created by main thread at:
           #0 pthread_create <null>:3 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x2cd8d)
           #1 std::__1::thread::thread<folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'(), void>(folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()&&) thread:307 (memcached:x86_64+0x1007040d4)
           #2 folly::NamedThreadFactory::newThread(folly::Function<void ()>&&) NamedThreadFactory.h:37 (memcached:x86_64+0x100703c20)
           #3 CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&) folly_executorpool.cc:42 (memcached:x86_64+0x100703559)
           #4 folly::ThreadPoolExecutor::addThreads(unsigned long) ThreadPoolExecutor.cpp:215 (memcached:x86_64+0x1008b8f12)
      

      Magma SHA: 832cef05c
      KV_engine SHA: eeb315140

      Steps to Reproduce

      1. Build CB Server with TSan enabled - e.g.

        mkdir build && cd build
        cmake -G Ninja -DCB_THREADSANITIZER=1 -DCMAKE_INSTALL_PREFIX=../install ..
        ninja install
        

      2. Start a local 3 node cluster:

        COUCHBASE_NUM_VBUCKETS=64 COUCHBASE_CPU_COUNT=4 ./cluster_run --nodes=3 --dont-rename
        

      3. Run the following TAF test:

        JAVA_HOME=/usr/local/opt/openjdk\@8 nice ./gradlew testrunner -P jython=$(which jython) -P 'args=-i b/resources/dev-3-nodes.ini rerun=False,disk_optimized_thread_settings=True,get-cbcollect-info=True,GROUP=rebalance_out -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_rebalance_out,nodes_init=3,nodes_out=1,bucket_spec=magma_dgm.10_percent_dgm.5_node_1_replica_magma_256_single_bucket,doc_size=256,randomize_value=True,data_load_spec=volume_test_load_with_CRUD_on_collections,data_load_stage=during,quota_percent=80,skip_validations=False,GROUP=rebalance_out,skip_cleanup=True,skip_collections_cleanup=True,log_level=debug,infra_log_level=debug'
        

      Check the ns_server/logs directory for WARNING: ThreadSanitizer messages.
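
To make that check quicker, here is a minimal sketch of a shell helper that lists every file under a log directory containing TSan warnings, along with how many it contains. The function name `tsan_report_counts` is hypothetical and not part of any Couchbase tooling; it assumes only a `grep` that supports `-r` and `-c`.

```shell
# Hypothetical helper (not part of Couchbase tooling): print "file:count" for
# each file under the given directory that has at least one TSan warning.
tsan_report_counts() {
    log_dir="$1"
    # grep -rc prints "file:count" for every file it scans;
    # drop the files whose count is zero and sort the rest by name.
    grep -rc "WARNING: ThreadSanitizer" "$log_dir" | grep -v ':0$' | sort
}
```

It would be called as `tsan_report_counts ns_server/logs` from the directory that holds the cluster_run logs. No output means no warnings were found.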

          Activity

            drigby Dave Rigby added a comment -

            FWIW I've attached the sanitizers.log from my run today: sanitizers.log.46853

            There do seem to be a couple of different variants of this data race: one triggered via flushMemTables, another via magma::ASyncFileRemover::operator().
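
            One rough way to see how often each variant shows up in a sanitizers log is to count the reports whose stack mentions a given symbol. The sketch below is an assumption about how one might do that, not an existing tool: `count_reports_with_symbol` is a hypothetical name, and it treats every `WARNING: ThreadSanitizer` line as the start of a new report.

```shell
# Hypothetical helper: count TSan reports in a log file whose text mentions
# the given symbol (plain substring match, case-sensitive).
count_reports_with_symbol() {
    symbol="$1"
    log_file="$2"
    awk -v sym="$symbol" '
        # A new report starts: close the previous one, counting it if it matched.
        /WARNING: ThreadSanitizer/ { if (hit) n++; hit = 0; in_report = 1; next }
        # Inside a report, remember whether any line mentions the symbol.
        in_report && index($0, sym) { hit = 1 }
        # Close the last report and print the total (0 if nothing matched).
        END { if (hit) n++; print n + 0 }
    ' "$log_file"
}
```

            For example, `count_reports_with_symbol 'KVStore::flushMemTables' sanitizers.log.46853` and `count_reports_with_symbol 'ASyncFileRemover' sanitizers.log.46853` would give one count per variant.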

            drigby Dave Rigby added a comment - - edited

            I've managed to get a backtrace with line numbers: sanitizers.log.59880

            ==================
            WARNING: ThreadSanitizer: data race (pid=59861)
              Write of size 8 at 0x0001318d6b48 by thread T22 (mutexes: write M665969799437594240, write M605171204462813120):
                #0 magma::TaskQueue::Enqueue(magma::Task const&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> >) worker.cc:63 (memcached:x86_64+0x1006b49c2)
                #1 magma::TaskQueue::Enqueue(magma::Task const&) worker.cc:35 (memcached:x86_64+0x1006b4698)
                #2 magma::ThreadPool::Schedule(magma::ThreadPool::WorkerType, magma::Task const&) thread_pool.cc:146 (memcached:x86_64+0x1006b9ed1)
                #3 magma::KVStoreConfig::ScheduleTask(std::__1::function<void ()>, magma::WaitGroup&) config.cc:206 (memcached:x86_64+0x10058ba13)
                #4 magma::KVStore::flushMemTables(magma::WAL*, magma::WALOffset, magma::FlushMode, magma::BlockingMode) <null>:2 (memcached:x86_64+0x100662452)
                #5 magma::KVStore::FlushMemTables(magma::WAL*, magma::FlushMode, magma::BlockingMode) <null>:2 (memcached:x86_64+0x100662cdf)
                #6 magma::Magma::Impl::tryWriteCacheFlush(magma::BlockingMode, magma::FlushMode) db.cc:1129 (memcached:x86_64+0x1005a74b1)
                #7 magma::Magma::Impl::WriteDocs(unsigned short, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const&, unsigned int, std::__1::function<void (magma::Magma::WriteOperation const&, bool, magma::Slice)>, std::__1::function<std::__1::pair<magma::Status, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const*> ()>) db_write.cc:50 (memcached:x86_64+0x1005c9455)
                #8 magma::Magma::WriteDocs(unsigned short, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const&, unsigned int, std::__1::function<void (magma::Magma::WriteOperation const&, bool, magma::Slice)>, std::__1::function<std::__1::pair<magma::Status, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const*> ()>) db_write.cc:59 (memcached:x86_64+0x1005c9755)
                #9 MagmaMemoryTrackingProxy::WriteDocs(unsigned short, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const&, unsigned int, std::__1::function<void (magma::Magma::WriteOperation const&, bool, magma::Slice)>, std::__1::function<std::__1::pair<magma::Status, std::__1::vector<magma::Magma::WriteOperation, std::__1::allocator<magma::Magma::WriteOperation> > const*> ()>) magma-memory-tracking-proxy.cc:362 (memcached:x86_64+0x10025a370)
                #10 MagmaKVStore::saveDocs(MagmaKVStoreTransactionContext&, VB::Commit&, kvstats_ctx&) magma-kvstore.cc:1415 (memcached:x86_64+0x10025954a)
                #11 MagmaKVStore::commit(std::__1::unique_ptr<TransactionContext, std::__1::default_delete<TransactionContext> >, VB::Commit&) magma-kvstore.cc:711 (memcached:x86_64+0x100253e88)
                #12 EPBucket::commit(KVStoreIface&, std::__1::unique_ptr<TransactionContext, std::__1::default_delete<TransactionContext> >, VB::Commit&) ep_bucket.cc:939 (memcached:x86_64+0x10041afce)
                #13 EPBucket::flushVBucket_UNLOCKED(LockedVBucketPtr) ep_bucket.cc:811 (memcached:x86_64+0x10041984c)
                #14 EPBucket::flushVBucket(Vbid) ep_bucket.cc:376 (memcached:x86_64+0x1004182ba)
                #15 Flusher::flushVB() flusher.cc:285 (memcached:x86_64+0x1003e12ef)
                #16 Flusher::step(GlobalTask*) flusher.cc:200 (memcached:x86_64+0x1003e0d8d)
                #17 FlusherTask::run() tasks.cc:28 (memcached:x86_64+0x100354586)
                #18 FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::'lambda0'()::operator()() const folly_executorpool.cc:189 (memcached:x86_64+0x10070521c)
                #19 void folly::detail::function::FunctionTraits<void ()>::callSmall<FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::'lambda0'()>(folly::detail::function::Data&) Function.h:387 (memcached:x86_64+0x100704e69)
                #20 folly::ThreadPoolExecutor::runTask(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) ThreadPoolExecutor.cpp:97 (memcached:x86_64+0x1008b6ef8)
                #21 folly::CPUThreadPoolExecutor::threadRun(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>) CPUThreadPoolExecutor.cpp:265 (memcached:x86_64+0x100892240)
                #22 void folly::detail::function::FunctionTraits<void ()>::callSmall<std::__1::__bind<void (folly::ThreadPoolExecutor::*)(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*, std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>&> >(folly::detail::function::Data&) Function.h:387 (memcached:x86_64+0x1008bc918)
                #23 CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()::operator()() folly_executorpool.cc:47 (memcached:x86_64+0x1007024c4)
                #24 void folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()>(folly::detail::function::Data&) Function.h:401 (memcached:x86_64+0x100702381)
                #25 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()> >(void*) thread:291 (memcached:x86_64+0x100702f3f)
             
              Previous write of size 8 at 0x0001318d6b48 by thread T35 (mutexes: write M979814399056112480):
                #0 magma::TimedTask::Complete(bool) worker.cc:136 (memcached:x86_64+0x1006b5ffb)
                #1 std::__1::__shared_ptr_emplace<magma::TimedTask, std::__1::allocator<magma::TimedTask> >::__on_zero_shared() memory:3318 (memcached:x86_64+0x1006b7516)
                #2 magma::TaskWorker::loop(void*) worker.cc:160 (memcached:x86_64+0x1006b5f31)
                #3 platform_thread_wrap(void*) cb_pthreads.cc:64 (memcached:x86_64+0x10085ec2c)
             
              Mutex M665969799437594240 is already destroyed.
             
              Mutex M605171204462813120 is already destroyed.
             
              Mutex M979814399056112480 is already destroyed.
             
              Thread T22 (tid=777413, running) created by main thread at:
                #0 pthread_create <null>:3 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x2cd8d)
                #1 std::__1::thread::thread<folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'(), void>(folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()&&) thread:307 (memcached:x86_64+0x100702dd4)
                #2 folly::NamedThreadFactory::newThread(folly::Function<void ()>&&) NamedThreadFactory.h:37 (memcached:x86_64+0x100702920)
                #3 CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&) folly_executorpool.cc:42 (memcached:x86_64+0x100702259)
                #4 folly::ThreadPoolExecutor::addThreads(unsigned long) ThreadPoolExecutor.cpp:215 (memcached:x86_64+0x1008b7c12)
                #5 folly::ThreadPoolExecutor::setNumThreads(unsigned long) ThreadPoolExecutor.cpp:197 (memcached:x86_64+0x1008b78f8)
                #6 folly::CPUThreadPoolExecutor::CPUThreadPoolExecutor(unsigned long, std::__1::shared_ptr<folly::ThreadFactory>) CPUThreadPoolExecutor.cpp:81 (memcached:x86_64+0x10088f907)
                #7 FollyExecutorPool::FollyExecutorPool(unsigned long, ThreadPoolConfig::ThreadCount, ThreadPoolConfig::ThreadCount, unsigned long, unsigned long) folly_executorpool.cc:718 (memcached:x86_64+0x1006f51ee)
                #8 ExecutorPool::create(ExecutorPool::Backend, unsigned long, ThreadPoolConfig::ThreadCount, ThreadPoolConfig::ThreadCount, unsigned long, unsigned long) executorpool.cc:39 (memcached:x86_64+0x1006f4338)
                #9 memcached_main(int, char**) memcached.cc:980 (memcached:x86_64+0x1000d6aef)
                #10 main main.cc:30 (memcached:x86_64+0x100005223)
             
              Thread T35 (tid=783944, running) created by thread T26 at:
                #0 pthread_create <null>:3 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x2cd8d)
                #1 cb_create_named_thread(_opaque_pthread_t**, void (*)(void*), void*, int, char const*) cb_pthreads.cc:102 (memcached:x86_64+0x10085eaed)
                #2 magma::TaskWorker::Start() worker.cc:124 (memcached:x86_64+0x1006b5da8)
                #3 magma::ThreadPool::addWorker(magma::ThreadPool::WorkerType) thread_pool.cc:122 (memcached:x86_64+0x1006b850b)
                #4 magma::ThreadPool::ThreadPool(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, unsigned long) thread_pool.cc:41 (memcached:x86_64+0x1006b7f93)
                #5 magma::ThreadPool::Create(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, unsigned long) thread_pool.cc:54 (memcached:x86_64+0x1006b89a2)
                #6 magma::Magma::Impl::Open() db.cc:142 (memcached:x86_64+0x1005a48d4)
                #7 magma::Magma::Open() db.cc:174 (memcached:x86_64+0x1005a4dc8)
                #8 MagmaKVStore::MagmaKVStore(MagmaKVStoreConfig&) magma-kvstore.cc:622 (memcached:x86_64+0x100251fcf)
                #9 MagmaKVStore::MagmaKVStore(MagmaKVStoreConfig&) magma-kvstore.cc:498 (memcached:x86_64+0x100253610)
                #10 KVStoreFactory::create(KVStoreConfig&) kvstore.cc:204 (memcached:x86_64+0x10036473a)
                #11 KVShard::KVShard(Configuration&, unsigned short, unsigned short) kvshard.cc:44 (memcached:x86_64+0x100386af0)
                #12 KVShard::KVShard(Configuration&, unsigned short, unsigned short) kvshard.cc:30 (memcached:x86_64+0x100386e30)
                #13 VBucketMap::VBucketMap(KVBucket&) vbucketmap.cc:26 (memcached:x86_64+0x10033ef12)
                #14 VBucketMap::VBucketMap(KVBucket&) vbucketmap.cc:21 (memcached:x86_64+0x10033f260)
                #15 KVBucket::KVBucket(EventuallyPersistentEngine&) kv_bucket.cc:287 (memcached:x86_64+0x10036a3df)
                #16 EPBucket::EPBucket(EventuallyPersistentEngine&) ep_bucket.cc:244 (memcached:x86_64+0x10041654d)
                #17 EventuallyPersistentEngine::makeBucket(Configuration&) ep_engine.cc:6872 (memcached:x86_64+0x1004494aa)
                #18 EventuallyPersistentEngine::initialize(char const*) ep_engine.cc:2193 (memcached:x86_64+0x1004486bc)
                #19 BucketManager::create(Cookie&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, BucketType) buckets.cc:211 (memcached:x86_64+0x10015b528)
                #20 std::__1::__function::__func<CreateRemoveBucketCommandContext::create()::$_0, std::__1::allocator<CreateRemoveBucketCommandContext::create()::$_0>, void ()>::operator()() functional:1727 (memcached:x86_64+0x1000a4b4b)
                #21 OneShotTask::run() one_shot_task.h:50 (memcached:x86_64+0x1000a53c5)
                #22 FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::'lambda0'()::operator()() const folly_executorpool.cc:189 (memcached:x86_64+0x10070521c)
                #23 void folly::detail::function::FunctionTraits<void ()>::callSmall<FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::'lambda0'()>(folly::detail::function::Data&) Function.h:387 (memcached:x86_64+0x100704e69)
                #24 folly::ThreadPoolExecutor::runTask(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) ThreadPoolExecutor.cpp:97 (memcached:x86_64+0x1008b6ef8)
                #25 folly::CPUThreadPoolExecutor::threadRun(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>) CPUThreadPoolExecutor.cpp:265 (memcached:x86_64+0x100892240)
                #26 void folly::detail::function::FunctionTraits<void ()>::callSmall<std::__1::__bind<void (folly::ThreadPoolExecutor::*)(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*, std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>&> >(folly::detail::function::Data&) Function.h:387 (memcached:x86_64+0x1008bc918)
                #27 CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()::operator()() folly_executorpool.cc:47 (memcached:x86_64+0x1007024c4)
                #28 void folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()>(folly::detail::function::Data&) Function.h:401 (memcached:x86_64+0x100702381)
                #29 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()> >(void*) thread:291 (memcached:x86_64+0x100702f3f)
             
            SUMMARY: ThreadSanitizer: data race worker.cc:63 in magma::TaskQueue::Enqueue(magma::Task const&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> >)
            ==================
            

            This was with magma SHA 77e2acf64.

            It seems that something about how the different parts of Magma are linked results in line-number debug information not being included in the final binary. I made the following modification (compiling the affected source files again specifically for libmagma) to get the above backtrace:

            diff --git a/CMakeLists.txt b/CMakeLists.txt
            index e76fec30b..0bc0ce2d6 100644
            --- a/CMakeLists.txt
            +++ b/CMakeLists.txt
            @@ -202,10 +202,10 @@ cb_enable_unity_build(MAGMA_SYNC_LIB)
             cb_enable_unity_build(MAGMA_UTIL_LIB)
             cb_enable_unity_build(MAGMA_TEST_SUPPORT_LIB)
             
            -add_library(magma $<TARGET_OBJECTS:MAGMA_CORE_LIB> $<TARGET_OBJECTS:MAGMA_SYNC_LIB> $<TARGET_OBJECTS:MAGMA_UTIL_LIB>)
            +add_library(magma $<TARGET_OBJECTS:MAGMA_CORE_LIB> $<TARGET_OBJECTS:MAGMA_SYNC_LIB> ${MAGMA_UTIL_SRCS})
             
             target_compile_definitions(magma PRIVATE $<$<BOOL:${LZ4_FOUND}>:LZ4_SUPPORT> $<$<BOOL:${LIBURING_FOUND}>:LIBURING_SUPPORT>)
            -target_link_libraries(magma ${MAGMA_LINK_LIBRARIES})
            +target_link_libraries(magma ${MAGMA_LINK_LIBRARIES}  ep-engine_magma_common)
             
             if (INSTALL_HEADER_FILES)
                    INSTALL(FILES include/libmagma/magma.h DESTINATION include/libmagma)
            

std::__1::__function::__func<CreateRemoveBucketCommandContext::create()::$_0, std::__1::allocator<CreateRemoveBucketCommandContext::create()::$_0>, void ()>::operator()() functional:1727 (memcached:x86_64+0x1000a4b4b) #21 OneShotTask::run() one_shot_task.h:50 (memcached:x86_64+0x1000a53c5) #22 FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::'lambda0'()::operator()() const folly_executorpool.cc:189 (memcached:x86_64+0x10070521c) #23 void folly::detail::function::FunctionTraits<void ()>::callSmall<FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::'lambda0'()>(folly::detail::function::Data&) Function.h:387 (memcached:x86_64+0x100704e69) #24 folly::ThreadPoolExecutor::runTask(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) ThreadPoolExecutor.cpp:97 (memcached:x86_64+0x1008b6ef8) #25 folly::CPUThreadPoolExecutor::threadRun(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>) CPUThreadPoolExecutor.cpp:265 (memcached:x86_64+0x100892240) #26 void folly::detail::function::FunctionTraits<void ()>::callSmall<std::__1::__bind<void (folly::ThreadPoolExecutor::*)(std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*, std::__1::shared_ptr<folly::ThreadPoolExecutor::Thread>&> >(folly::detail::function::Data&) Function.h:387 (memcached:x86_64+0x1008bc918) #27 CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()::operator()() folly_executorpool.cc:47 (memcached:x86_64+0x1007024c4) #28 void folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()>(folly::detail::function::Data&) Function.h:401 (memcached:x86_64+0x100702381) #29 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::'lambda'()> >(void*) thread:291 
(memcached:x86_64+0x100702f3f)   SUMMARY: ThreadSanitizer: data race worker.cc:63 in magma::TaskQueue::Enqueue(magma::Task const&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> >) ================== This was with magma SHA 77e2acf64. It seems that something to do with how the different parts of Magma are linked is resulting in debug information for line numbers not getting including in the final binary. I made the following modification (to just compile the affected source files again specifically for libmagma ) to get the above backtrace: diff --git a/CMakeLists.txt b/CMakeLists.txt index e76fec30b..0bc0ce2d6 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -202,10 +202,10 @@ cb_enable_unity_build(MAGMA_SYNC_LIB) cb_enable_unity_build(MAGMA_UTIL_LIB) cb_enable_unity_build(MAGMA_TEST_SUPPORT_LIB) -add_library(magma $<TARGET_OBJECTS:MAGMA_CORE_LIB> $<TARGET_OBJECTS:MAGMA_SYNC_LIB> $<TARGET_OBJECTS:MAGMA_UTIL_LIB>) +add_library(magma $<TARGET_OBJECTS:MAGMA_CORE_LIB> $<TARGET_OBJECTS:MAGMA_SYNC_LIB> ${MAGMA_UTIL_SRCS}) target_compile_definitions(magma PRIVATE $<$<BOOL:${LZ4_FOUND}>:LZ4_SUPPORT> $<$<BOOL:${LIBURING_FOUND}>:LIBURING_SUPPORT>) -target_link_libraries(magma ${MAGMA_LINK_LIBRARIES}) +target_link_libraries(magma ${MAGMA_LINK_LIBRARIES} ep-engine_magma_common) if (INSTALL_HEADER_FILES) INSTALL(FILES include/libmagma/magma.h DESTINATION include/libmagma)

            Dave Rigby I've not had much luck making progress.

            On the TAF front, I've been able to download and install VirtualBox and Vagrant. The directions could be much clearer for someone who has never worked with these tools before. Also, they are missing the step to update the Security preferences to allow software from Oracle America so that Vagrant can work. Using vagrant/neo-testing/ubuntu18, I was able to get the nodes up. The directions you gave say I just run cluster_run. Again, having never run this, I know it's much more than just going into the ns_server directory and running cluster_run. I don't see in the steps which config files to change to get cluster_run to work.

            I'm no longer able to get a TSan build to work on Mac. I do the following:

             mkdir build

             cd build

             cmake -G Ninja -DCB_THREADSANITIZER=1 -DCMAKE_INSTALL_PREFIX=../install ..

             ninja install

            But it won't compile.

            FAILED: cbft/CMakeFiles/cbft-bleve /Users/scottlashley/latest/build/cbft/CMakeFiles/cbft-bleve 

            cd /Users/scottlashley/latest/cbft && /usr/local/Cellar/cmake/3.21.4/bin/cmake -D GOEXE=/Users/scottlashley/.cbdepscache/exploded/x86_64/go-1.13.7/go/bin/go -D GOVERSION=1.13.7 -D GO_BINARY_DIR=/Users/scottlashley/latest/build/gopkg/go-1.13.7 -D CMAKE_C_COMPILER=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -D REPOSYNC=/Users/scottlashley/latest/tlm/cmake/Modules/../../.. -D CB_PRODUCTION_BUILD=OFF -D CGO_CFLAGS=-fsanitize=thread -D CGO_LDFLAGS=-fsanitize=thread -D GCFLAGS= -D "GOTAGS=server cbftx" -D GOBUILDMODE=default -D LDFLAGS= -D PACKAGE=github.com/couchbase/cbft/cmd/cbft-bleve -D OUTPUT=/Users/scottlashley/latest/build/cbft/cbft-bleve -D CGO_INCLUDE_DIRS= -D CGO_LIBRARY_DIRS= -D CB_GO_CODE_COVERAGE=0 -D CB_GO_RACE_DETECTOR=0 -D CB_ADDRESSSANITIZER=OFF -D CB_UNDEFINEDSANITIZER=OFF -D CB_THREADSANITIZER=1 -P /Users/scottlashley/latest/tlm/cmake/Modules/go-modbuild.cmake

            Error running go build for package github.com/couchbase/cbft/cmd/cbft-bleve!

            I believe I have a fix for this if you're willing to try it.

            https://review.couchbase.org/c/magma/+/167289 

            scott.lashley Scott Lashley added a comment

            Dave Rigby 

            With this patch, the only ThreadSanitizer error I see on Mac is from sigar_proc_time_get. I was never able to reproduce the WriteDocs -> taskmgr issue. Sarath made the following comment...
             
            Sarath Lakshman 
            @Scott Lashley Regarding MB-49832:
            I suspect TSan is reporting an invalid case.
            According to the trace, the race occurs between:
            Previous write of size 8 at 0x00010e0f3318 by thread T38 (mutexes: write M547187359256359728):
            #0 magma::TimedTask::Complete(bool) worker.cc:139 (memcached:x86_64+0x1006b5feb)
            std::shared_ptr<Task> currTask;
            137 {
            138 std::lock_guard lock(mutex);
            -> 139 std::swap(currTask, task);
            140 }
            and:
            #0 magma::TaskQueue::Enqueue(magma::Task const&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> >) worker.cc:66 (memcached:x86_64+0x1006b49b9)
            -> 66 auto timedTask = std::make_shared<TimedTask>(taskPtr, now + interval);
            A TimedTask cannot be in Complete() before it has even been allocated.
            I wonder if the mutex held within the TimedTask is misleading TSan.
            When we call Complete(), the mutex is held.
            But when the task is assigned in the TimedTask constructor, the mutex is not held.
            At this point, I think the defect should be closed as a false positive TSan error.

            scott.lashley Scott Lashley added a comment
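            The locking pattern Sarath describes can be sketched as follows. This is a minimal illustration only, not the Magma source: the Task struct, its onComplete callback, the HasTask() helper and the MakeTimedTask() wrapper are hypothetical names introduced for this example, and the real TimedTask also takes a deadline argument (now + interval) that is omitted here. It shows the constructor storing the task without the mutex, which is safe because no other thread can reference an object that is still being constructed, and Complete() swapping the task out while the mutex is held.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for magma::Task; the real type is not shown in this ticket.
struct Task {
    std::function<void(bool)> onComplete;
};

// Hypothetical sketch of the TimedTask locking pattern described in the comment.
class TimedTask {
public:
    // The task is stored without taking the mutex. No other thread can hold a
    // reference to this object until the constructor has returned.
    explicit TimedTask(std::shared_ptr<Task> t) : task(std::move(t)) {}

    // Detach the task while holding the mutex, then invoke its callback after
    // the lock has been released. Returns true if this call detached a task.
    bool Complete(bool cancelled) {
        std::shared_ptr<Task> currTask;
        {
            std::lock_guard<std::mutex> lock(mutex);
            std::swap(currTask, task);
        }
        if (!currTask) {
            return false;  // already completed by an earlier call
        }
        if (currTask->onComplete) {
            currTask->onComplete(cancelled);
        }
        return true;
    }

    // Helper for this example only: reports whether a task is still attached.
    bool HasTask() {
        std::lock_guard<std::mutex> lock(mutex);
        return task != nullptr;
    }

private:
    std::mutex mutex;
    std::shared_ptr<Task> task;
};

// Mirrors the allocation at worker.cc:66 in the trace, minus the deadline.
inline std::shared_ptr<TimedTask> MakeTimedTask(std::shared_ptr<Task> taskPtr) {
    return std::make_shared<TimedTask>(std::move(taskPtr));
}
```

            In this sketch a second call to Complete() on the same object finds no task and returns false, so the only unlocked write to the task member is the one in the constructor.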

            Comments from Sarath...

            I suspect TSan is reporting an invalid case.
            According to the trace, the race occurs between:
            Previous write of size 8 at 0x00010e0f3318 by thread T38 (mutexes: write M547187359256359728):
            #0 magma::TimedTask::Complete(bool) worker.cc:139 (memcached:x86_64+0x1006b5feb)
            std::shared_ptr<Task> currTask;
            137 {
            138 std::lock_guard lock(mutex);
            -> 139 std::swap(currTask, task);
            140 }
            and:
            #0 magma::TaskQueue::Enqueue(magma::Task const&, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000l> >) worker.cc:66 (memcached:x86_64+0x1006b49b9)
            -> 66 auto timedTask = std::make_shared<TimedTask>(taskPtr, now + interval);
            A TimedTask cannot be in Complete() before it has even been allocated.
            I wonder if the mutex held within the TimedTask is misleading TSan.
            When we call Complete(), the mutex is held.
            But when the task is assigned in the TimedTask constructor, the mutex is not held.
            At this point, I think the defect should be closed as a false positive TSan error.

            scott.lashley Scott Lashley added a comment

            People

              scott.lashley Scott Lashley
              drigby Dave Rigby