Couchbase Server / MB-19280

ep-engine: data race on CouchKVStore::dbFileRevMap

Details

    • Untriaged
    • Unknown

    Description

      Reported by ThreadSanitizer. CouchKVStore maintains dbFileRevMap, a map from vBucket ID to a counter. Some of the stats functions (e.g. doDcpVbTakeoverStats) read it without taking a lock that guards the map, so there is a potential data race.
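      For illustration only, here is a minimal sketch of one common way to make this kind of lock-free read safe: store each per-vBucket counter as a std::atomic. The names RevisionTable, bump, get and run_demo are hypothetical and do not come from ep-engine, and this is not necessarily what the linked Watson change does.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-vBucket counter table.
// This is NOT the ep-engine class; it only illustrates the pattern.
class RevisionTable {
public:
    explicit RevisionTable(std::size_t numVBuckets) : revs_(numVBuckets) {
        for (auto& r : revs_) {
            r.store(0);
        }
    }

    // Writer side (the flusher/commit path in the report).
    void bump(std::uint16_t vbid) {
        assert(vbid < revs_.size());
        revs_[vbid].fetch_add(1);
    }

    // Reader side (the stats path): no mutex is needed because the
    // load is atomic, so a concurrent bump() is not a data race.
    std::uint64_t get(std::uint16_t vbid) const {
        assert(vbid < revs_.size());
        return revs_[vbid].load();
    }

private:
    // The vector is sized once in the constructor and never resized,
    // so only the individual elements are accessed concurrently.
    std::vector<std::atomic<std::uint64_t>> revs_;
};

// Runs one writer and one reader concurrently on vBucket 0 and
// returns the final counter value once both threads have joined.
std::uint64_t run_demo(std::uint64_t iterations) {
    RevisionTable table(4);
    std::uint64_t lastSeen = 0;  // written only by the reader thread

    std::thread writer([&table, iterations] {
        for (std::uint64_t i = 0; i < iterations; ++i) {
            table.bump(0);
        }
    });
    std::thread reader([&table, &lastSeen, iterations] {
        for (std::uint64_t i = 0; i < iterations; ++i) {
            lastSeen = table.get(0);
        }
    });

    writer.join();
    reader.join();
    assert(lastSeen <= iterations);
    return table.get(0);
}
```

      Calling run_demo(1000) from a test harness built with -fsanitize=thread should not produce a report for the counter accesses. The sketch assumes the container itself is never resized while threads are running; if it can be, a mutex or reader-writer lock around the whole map would still be needed.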

      I'm not sure of the potential impact of this bug. The racy read happens in getNumPersistedDeletes, whose result feeds the on_disk_deletes and estimate stats in the dcp-vbtakeover stats group:

      size_t vb_items = vb->getNumItems(epstore->getItemEvictionPolicy());
      size_t del_items = epstore->getRWUnderlying(vbid)->
                                         getNumPersistedDeletes(vbid);
      size_t chk_items = vb_items > 0 ?
                         vb->checkpointManager.getNumOpenChkItems() : 0;
      add_casted_stat("status", "does_not_exist", add_stat, cookie);
      add_casted_stat("on_disk_deletes", del_items, add_stat, cookie);
      add_casted_stat("vb_items", vb_items, add_stat, cookie);
      add_casted_stat("chk_items", chk_items, add_stat, cookie);
      add_casted_stat("estimate", vb_items + del_items, add_stat, cookie);
      

      Those stats are read by ns_server during rebalance. I haven't traced them all the way through ns_server to see what the consequences of incorrect values could be.

          WARNING: ThreadSanitizer: data race (pid=14070)
            Write of size 8 at 0x7d9000002000 by thread T7 (mutexes: write M12364):
              #0 CouchKVStore::saveDocs(unsigned short, unsigned long, _doc**, _docinfo**, unsigned long, KVStatsCtx&) ep-engine/src/couch-kvstore/couch-kvstore.cc:1932:9 (ep.so+0x000000146628)
              #1 CouchKVStore::commit2couchstore(Callback<KVStatsCtx>*) ep-engine/src/couch-kvstore/couch-kvstore.cc:1808:34 (ep.so+0x00000013fcb7)
              #2 CouchKVStore::commit(Callback<KVStatsCtx>*) ep-engine/src/couch-kvstore/couch-kvstore.cc:1095:13 (ep.so+0x00000013f941)
              #3 EventuallyPersistentStore::commit(unsigned short) ep-engine/src/ep.cc:3351:13 (ep.so+0x00000008a0f6)
          
            Previous read of size 8 at 0x7d9000002000 by main thread (mutexes: write M18926):
              #0 CouchKVStore::getNumPersistedDeletes(unsigned short) ep-engine/src/couch-kvstore/couch-kvstore.cc:2239:23 (ep.so+0x000000147e0f)
              #1 EventuallyPersistentEngine::doDcpVbTakeoverStats(void const*, void (*)(char const*, unsigned short, char const*, unsigned int, void const*), std::string&, unsigned short) ep-engine/src/ep_engine.cc:5721:28 (ep.so+0x0000000b196e)
              #2 EventuallyPersistentEngine::getStats(void const*, char const*, int, void (*)(char const*, unsigned short, char const*, unsigned int, void const*)) ep-engine/src/ep_engine.cc:4672:14 (ep.so+0x0000000b069f)
      

      Note: already fixed in watson: http://review.couchbase.org/55888

            People

              drigby Dave Rigby (Inactive)