Couchbase Server / MB-53922

Ephemeral purger can delete a StoredValue which is still referenced

Details

    • Untriaged
    • 1
    • Unknown

    Description

      One example of this issue occurs with durable writes (which come in via KVBucket::set).

      The issue requires a particular hash-table (HT) state on entry to the set path: roughly, both an existing prepare and a committed item exist for the key. With the HT in that state, the following then triggers the issue.

      At this point the problem is that a StoredValue has been marked as stale while, at the same time, htRes still references the stale object.

      Problems will now occur if EphTombstoneStaleItemDeleter runs during the lifetime of htRes.

      For example:

      • Whilst the set of k1 is still executing, i.e. htRes has two non-owning pointers, the EphTombstoneStaleItemDeleter wakes up and runs.
      • The EphTombstoneStaleItemDeleter walks the ephemeral linked-list looking for objects that are marked stale, and deletes each one it finds.
      • When htRes then destructs, it uses the deleted object, causing a number of issues.
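
      The interleaving above can be replayed in a small standalone model. This is only a sketch and not the Couchbase code: ModelStoredValue, ModelSeqList, purgeStale and referenceDanglesAfterPurge are hypothetical names, the list owns its items through std::shared_ptr, and the htRes stand-in is a std::weak_ptr so that the dangling reference can be observed safely rather than dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <string>

// Hypothetical stand-in for a StoredValue; only the fields needed here.
struct ModelStoredValue {
    std::string key;
    bool stale = false;
};

// Hypothetical stand-in for the ephemeral linked-list, which owns the items.
using ModelSeqList = std::list<std::shared_ptr<ModelStoredValue>>;

// Models the stale-item deleter: walk the list and delete anything marked
// stale. Returns the number of items removed.
std::size_t purgeStale(ModelSeqList& seqList) {
    std::size_t purged = 0;
    for (auto it = seqList.begin(); it != seqList.end();) {
        if ((*it)->stale) {
            it = seqList.erase(it); // last owner released, object destroyed
            ++purged;
        } else {
            ++it;
        }
    }
    return purged;
}

// Replays the interleaving. Returns true if the observer (the htRes
// stand-in) is left referring to an object that no longer exists.
bool referenceDanglesAfterPurge() {
    ModelSeqList seqList;
    seqList.push_back(std::make_shared<ModelStoredValue>());
    seqList.front()->key = "k1";

    // Stand-in for htRes: observes the item but does not own it.
    std::weak_ptr<ModelStoredValue> htResPending = seqList.front();

    // The set path marks the item stale while the observer is still alive.
    seqList.front()->stale = true;
    assert(!htResPending.expired());

    // The deleter runs before the observer is destroyed.
    purgeStale(seqList);

    // A raw non-owning pointer would now dangle; weak_ptr reports it instead.
    return htResPending.expired();
}
```

      Calling referenceDanglesAfterPurge() from a test or from main returns true in this model, which corresponds to htRes outliving the object it points at. With raw non-owning pointers, as in the report, the same ordering results in a use of freed memory.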

      Note this bug quickly becomes evident if something like the following code is added after the call to processSet:

      // note isStale1 is a temp function that exposes the stale bit to StoredValue
      if (htRes.pending && htRes.pending->isStale1()) {
          std::stringstream ss;
          ss << *htRes.pending;
          LOG_CRITICAL("StoredValue after processSet is stale {}", ss.str());
      }
      

      When running a 2-node cluster with a 100% durable-write workload, this message is printed often.

      Note this bug has been seen to cause a crash of memcached because the following exception gets thrown:

      CRITICAL Caught unhandled std::exception-derived exception. what(): CollectionID: invalid value:2
      

      This is because the htRes destructor path tried to get the prefix of the deleted stored-value and found that the value is no longer valid.

          People

            jwalker Jim Walker