Couchbase Server / MB-30148

TSan Intermittent error in 'mem stats' test, comp_active variant.

Details

    • Task
    • Resolution: Fixed
    • Major
    • 5.5.0
    • 5.5.0
    • couchbase-bucket
    • None

    Description

      Seen during CV job: http://cv.jenkins.couchbase.com/job/kv_engine-threadsanitizer-master/1583/consoleFull#-90645809661882284-c5b1-40af-8076-4f8cb2d12fb1

      WARNING: ThreadSanitizer: heap-use-after-free (pid=5347)
        Read of size 1 at 0x7b7400040609 by main thread:
          #0 memcmp <null> (libtsan.so.0+0x000000043643)
          #1 check_key_value(engine_interface*, engine_interface_v1*, char const*, char const*, unsigned long, unsigned short) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/tests/ep_testsuite_common.cc:468 (ep_testsuite.so+0x0000000975ad)
          #2 test_mem_stats /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/tests/ep_testsuite.cc:2036 (ep_testsuite.so+0x00000002c5c6)
          #3 execute_test /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/programs/engine_testapp/engine_testapp.cc:1102 (engine_testapp+0x00000041b0ba)
          #4 main /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/programs/engine_testapp/engine_testapp.cc:1499 (engine_testapp+0x00000041c5b2)
       
        Previous write of size 8 at 0x7b7400040608 by thread T11 (mutexes: write M3231945710371016):
          #0 operator delete(void*) <null> (libtsan.so.0+0x00000006a7b4)
          #1 Blob::operator delete(void*) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/blob.h:124 (ep.so+0x0000001959c2)
          #2 Blob::Deleter::operator()(TaggedPtr<Blob>) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/blob.h:137 (ep.so+0x0000001959c2)
          #3 SingleThreadedRCPtr<Blob, TaggedPtr<Blob>, Blob::Deleter>::swap(TaggedPtr<Blob>) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/atomic.h:362 (ep.so+0x0000001959c2)
          #4 SingleThreadedRCPtr<Blob, TaggedPtr<Blob>, Blob::Deleter>::reset(TaggedPtr<Blob>) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/atomic.h:298 (ep.so+0x0000001959c2)
          #5 StoredValue::replaceValue(TaggedPtr<Blob>) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/stored-value.h:540 (ep.so+0x0000001959c2)
          #6 StoredValue::storeCompressedBuffer(cb::const_char_buffer) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/stored-value.cc:362 (ep.so+0x0000001959c2)
          #7 HashTable::storeCompressedBuffer(cb::const_char_buffer, StoredValue&) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/hash_table.cc:620 (ep.so+0x00000012be5c)
          #8 ItemCompressorVisitor::visit(HashTable::HashBucketLock const&, StoredValue&) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/item_compressor_visitor.cc:52 (ep.so+0x000000139780)
          #9 HashTable::pauseResumeVisit(HashTableVisitor&, HashTable::Position&) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/hash_table.cc:753 (ep.so+0x000000127b6c)
          #10 PauseResumeVBAdapter::visit(VBucket&) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/vb_visitors.cc:36 (ep.so+0x0000001a81b1)
          #11 KVBucket::pauseResumeVisit(PauseResumeVBVisitor&, KVBucketIface::Position&) /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/kv_bucket.cc:2177 (ep.so+0x000000158ec9)
          #12 ItemCompressorTask::run() /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/item_compressor.cc:70 (ep.so+0x000000138204)
          #13 ExecutorThread::run() /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/executorthread.cc:146 (ep.so+0x00000011da74)
          #14 launch_executor_thread /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/kv_engine/engines/ep/src/executorthread.cc:34 (ep.so+0x00000011e0ae)
          #15 CouchbaseThread::run() /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/platform/src/cb_pthreads.cc:59 (libplatform_so.so.0.1.0+0x000000009cc9)
          #16 platform_thread_wrap /home/couchbase/jenkins/workspace/kv_engine-threadsanitizer-master/platform/src/cb_pthreads.cc:72 (libplatform_so.so.0.1.0+0x000000009cc9)
          #17 <null> <null> (libtsan.so.0+0x000000024feb)
       
        Mutex M3231945710371016 is already destroyed.
      

      The issue appears to be in the test helper get_item_info. It returns the item_info via the info out-parameter (which includes the pointer to, and size of, the Blob). However, the object which actually holds a ref-count on the Blob (the EngineErrorItemPair returned from get()) is destroyed when get_item_info returns. As such, any caller of get_item_info which then accesses the Blob is racy:

      bool get_item_info(ENGINE_HANDLE *h, ENGINE_HANDLE_V1 *h1, item_info *info,
                         const char* key, uint16_t vb) {
          auto ret = get(h, h1, NULL, key, vb);
          if (ret.first != cb::engine_errc::success) {
              return false;
          }
          if (!h1->get_item_info(h, ret.second.get(), info)) {
              fprintf(stderr, "get_item_info failed\n");
              return false;
          }
       
          return true;
      }
      

      This looks to be a long-latent bug; however, the introduction of active compression (which has a background thread reallocating Blobs) has exposed it.
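
      One possible shape for a fix, sketched here with toy types rather than the real kv_engine API: have the helper hand back the owning reference together with the info, so the Blob cannot be freed while the caller is still reading it. Blob, ItemInfo, ItemHolder, Store and get_item_info_owned below are all hypothetical names invented for illustration, and std::shared_ptr stands in for the engine's ref-counted item. The actual change made for this issue may well differ.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the engine types; NOT the real kv_engine API.
struct Blob {
    std::string data;
};

struct ItemInfo {
    const char* value = nullptr; // points into the Blob's storage
    std::size_t size = 0;
};

// Owns a reference on the Blob, playing the role of the item returned by get().
using ItemHolder = std::shared_ptr<const Blob>;

// Toy "engine" holding a single document.
struct Store {
    std::shared_ptr<const Blob> current;

    ItemHolder get() const {
        return current;
    }

    // Models the compressor task swapping in a freshly allocated Blob.
    void replaceValue(std::string v) {
        current = std::make_shared<const Blob>(Blob{std::move(v)});
    }
};

// Returns the owning reference alongside the info. The pointer in ItemInfo
// stays valid for as long as the caller keeps the returned pair alive.
std::pair<ItemHolder, ItemInfo> get_item_info_owned(const Store& store) {
    ItemHolder item = store.get();
    ItemInfo info;
    if (item) {
        info.value = item->data.data();
        info.size = item->data.size();
    }
    return {std::move(item), info};
}
```

      With this shape, a caller such as check_key_value would keep the returned pair in scope while comparing the value, so a concurrent replaceValue only drops the store's reference and the old Blob is freed once the caller is done. This sketch only addresses lifetime; it is single-threaded and says nothing about the locking the real hash table uses.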

          People

            drigby Dave Rigby (Inactive)
            drigby Dave Rigby (Inactive)