Couchbase Server / MB-54801

Ephemeral stale item purger pausing can lead to accessing dangling pointer

Details

    • Untriaged
    • 0
    • Unknown
    • KV 2023-2

    Description

      TL;DR: the ephemeral stale item purger can pause in the "middle" of a chain of stale items and then, on resuming, purge a stale item which is still pointed to by an older stale item.


      Scenario:

      • Store document with key A (version A1)
      • Stale item purger runs, and pauses at a seqno after A1
      • Backfill begins
      • Update A. A1 is covered by the backfill, so:
        • A1 marked stale, points to A2
        • A2 appended to end of seqlist
      • Backfill ends
      • New Backfill begins
      • Update A again
        • A2 marked stale, points to A3
        • A3 appended to end of seqlist
      • Backfill ends
      • Stale item purger resumes, purges A2

      Purger paused somewhere between the seqnos of A1 and A2
                  v
      A1 (stale) -> A2 (stale) -> A3
      A2 is purged...
      A1 (stale) -> XXXXXXXXXX -> A3
      

      A1 now has a dangling replacement ptr.
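
      The scenario above can be modelled with a small, self-contained sketch. This is not kv_engine code: `Entry`, `store`, `purgeStaleAfter` and `findDanglingReplacement` are hypothetical names, every update is assumed to be covered by a backfill (so the old version is always kept and marked stale), and the replacement link is stored as a seqno rather than a raw pointer so that the sketch can detect the dangling link without undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <string>

// Hypothetical, simplified stand-in for one entry in the sequence list.
struct Entry {
    std::string key;
    int64_t seqno;
    bool stale;
    // Seqno of the newer version that replaced this entry (0 = none).
    // The real list holds a pointer; a seqno keeps this model free of UB.
    int64_t replacementSeqno;
};

using SeqList = std::list<Entry>;

// Append a new version of `key`. Any live older version is kept in the list,
// marked stale and linked to the new version (assumption: a backfill covers it).
int64_t store(SeqList& list, const std::string& key) {
    assert(!key.empty());
    const int64_t seqno = list.empty() ? 1 : list.back().seqno + 1;
    for (auto& e : list) {
        if (e.key == key && !e.stale) {
            e.stale = true;
            e.replacementSeqno = seqno;
        }
    }
    list.push_back(Entry{key, seqno, false, 0});
    return seqno;
}

// Model of a purger resuming after a pause: only entries with a seqno greater
// than `resumeSeqno` are visited, and every stale one visited is purged.
std::size_t purgeStaleAfter(SeqList& list, int64_t resumeSeqno) {
    std::size_t purged = 0;
    for (auto it = list.begin(); it != list.end();) {
        if (it->seqno > resumeSeqno && it->stale) {
            it = list.erase(it);
            ++purged;
        } else {
            ++it;
        }
    }
    return purged;
}

// Return the seqno of the first stale entry whose replacement is no longer
// in the list (the "dangling" link), or 0 if every link is intact.
int64_t findDanglingReplacement(const SeqList& list) {
    for (const auto& e : list) {
        if (!e.stale) {
            continue;
        }
        bool found = false;
        for (const auto& other : list) {
            if (other.seqno == e.replacementSeqno) {
                found = true;
            }
        }
        if (!found) {
            return e.seqno;
        }
    }
    return 0;
}
```

      Storing A, storing an unrelated key, then updating A twice and calling `purgeStaleAfter(list, 1)` removes A2 only; `findDanglingReplacement` then reports A1 (seqno 1), whose replacement link refers to an entry that no longer exists.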

      A future backfill scanning the seqlist may need to decide whether the stale A1 should be included in the backfill. To do this, it checks whether A1's replacement (A2) is also in the backfill range. If it is, A1 must not be included, because A2 will also be present and a backfill must never contain two versions of the same document. If it is not, A1 is the only version of the document in the range, so it must be included.

      This means reading the seqno of A2, potentially dereferencing a dangling ptr.
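
      A minimal sketch of that inclusion rule, to show where the replacement pointer gets dereferenced. The `Version` struct and `includeInBackfill` are hypothetical names for illustration only; this is not the implementation of `itrRangeContainsAnUpdatedVersion()` that appears in the trace below, and the backfill range is assumed to be described by its end seqno alone.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of one version of a document in the seqlist.
struct Version {
    int64_t seqno;
    bool stale;
    const Version* replacement;  // newer version; only meaningful when stale
};

// Decide whether `item` should be emitted by a backfill ending at `rangeEnd`.
// A live item is always emitted. A stale item is emitted only if its
// replacement lies beyond the range, i.e. it is the only version in range.
bool includeInBackfill(const Version& item, int64_t rangeEnd) {
    if (!item.stale) {
        return true;
    }
    assert(item.replacement != nullptr);
    // This read is the problem case: if the replacement has already been
    // purged, the pointer dangles and the read is a use-after-free.
    return item.replacement->seqno > rangeEnd;
}
```

      With a chain A1 -> A2 -> A3 at seqnos 1, 2 and 3, a backfill ending at seqno 2 skips A1 (A2 is in range), while a backfill ending at seqno 1 includes it. Both answers depend on A2 still being valid memory.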

      This could lead to a crash, or an incorrect backfill (including too few or too many versions of a document).

      This is likely to date back to the introduction of ephemeral buckets.

      Small repro test, ASAN complaint:

       [ RUN      ] EphemeralVBucketTest.HopefullyNotASegfault
        OSV @0x6080000e86a0 ... W.R.Cm. temp:    seq:1 rev:1 cas:1670518613598076928 key:"cid:0x0:padding, size:8" exp:0 age:0 fc:4 vallen:8 val age:0 :"
        padding" prepareSeqno: 0
        =================================================================
        ==14680==ERROR: AddressSanitizer: heap-use-after-free on address 0x6080000e89b8 at pc 0x00000264f59a bp 0x7ffcbb8cc650 sp 0x7ffcbb8cc648
        READ of size 8 at 0x6080000e89b8 thread T0
            #0 0x264f599 in std::__atomic_base<long>::load(std::memory_order) const /opt/gcc-10.2.0/lib/gcc/x86_64-pc-linux-gnu/10.2.0/../../../../include/c++/10.2.0/bits/atomic_base.h:426:9
            #1 0x264f599 in cb::RelaxedAtomic<long>::load() const /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../platform/include/relaxed_atomic.h:42:22
            #2 0x2cab198 in BasicLinkedList::RangeIteratorLL::itrRangeContainsAnUpdatedVersion() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/src/linked_list.cc:587:52
            #3 0x2caa255 in BasicLinkedList::RangeIteratorLL::operator++() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/src/linked_list.cc:531:14
            #4 0x2cbb2a5 in SequenceList::RangeIterator::operator++() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/src/seqlist.cc:29:5
            #5 0x40ad240 in EphemeralVBucketTest_HopefullyNotASegfault_Test::TestBody() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/tests/module_tests/ephemeral_vb_test.cc:430:9
            #6 0x4d3095e in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2469:14
            #7 0x4cf4646 in testing::Test::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2508:5
            #8 0x4cf5fe9 in testing::TestInfo::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2684:11
            #9 0x4cf7366 in testing::TestSuite::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2816:28
            #10 0x4d0ffcb in testing::internal::UnitTestImpl::RunAllTests() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:5338:44
            #11 0x4d35abf in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2469:14
            #12 0x4d0f18e in testing::UnitTest::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:4925:10
            #13 0x407953d in main /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/tests/module_tests/ep_unit_tests_main.cc:175:16
            #14 0x7fc4e3a06c86 in __libc_start_main /build/glibc-CVJwZb/glibc-2.27/csu/../csu/libc-start.c:310
            #15 0x25454f9 in _start (/home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/kv_engine/ep-engine_ep_unit_tests+0x25454f9)
        
        0x6080000e89b8 is located 24 bytes inside of 83-byte region [0x6080000e89a0,0x6080000e89f3)
        freed by thread T0 here:
            #0 0x25ed7a2 in operator delete(void*, unsigned long) (/home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/kv_engine/ep-engine_ep_unit_tests+0x25ed7a2)
            #1 0x2cae11d in std::unique_ptr<OrderedStoredValue, std::default_delete<OrderedStoredValue> >::~unique_ptr() /opt/gcc-10.2.0/lib/gcc/x86_64-pc-linux-gnu/10.2.0/../../../../include/c++/10.2.0/bits/unique_ptr.h:361:4
            #2 0x2ca5e71 in BasicLinkedList::purgeListElem(boost::intrusive::list_iterator<boost::intrusive::mhtraits<OrderedStoredValue, boost::intrusive::list_member_hook<>, &(OrderedStoredValue::seqno_hook)>, false>, bool) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/src/linked_list.cc:423:1
            #3 0x2ca4f00 in BasicLinkedList::purgeTombstones(long, std::function<bool (DocKey const&, long)>, std::function<bool ()>) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/src/linked_list.cc:278:18
            #4 0x2b35ffa in EphemeralVBucket::purgeStaleItems(std::function<bool ()>) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/src/ephemeral_vb.cc:374:35
            #5 0x40acea7 in EphemeralVBucketTest_HopefullyNotASegfault_Test::TestBody() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/tests/module_tests/ephemeral_vb_test.cc:422:5
            #6 0x4d3095e in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2469:14
            #7 0x4cf4646 in testing::Test::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2508:5
            #8 0x4cf5fe9 in testing::TestInfo::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2684:11
            #9 0x4cf7366 in testing::TestSuite::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2816:28
            #10 0x4d0ffcb in testing::internal::UnitTestImpl::RunAllTests() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:5338:44
            #11 0x4d35abf in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2469:14
            #12 0x4d0f18e in testing::UnitTest::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:4925:10
            #13 0x407953d in main /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/tests/module_tests/ep_unit_tests_main.cc:175:16
            #14 0x7fc4e3a06c86 in __libc_start_main /build/glibc-CVJwZb/glibc-2.27/csu/../csu/libc-start.c:310
        
        previously allocated by thread T0 here:
            #0 0x25ecb3d in operator new(unsigned long) (/home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/kv_engine/ep-engine_ep_unit_tests+0x25ecb3d)
            #1 0x2cddb47 in OrderedStoredValueFactory::operator()(Item const&, std::unique_ptr<StoredValue, TaggedPtrDeleter<StoredValue, StoredValue::Deleter> >) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/src/stored_value_factories.cc:44:18
            #2 0x2ba7362 in HashTable::unlocked_addNewStoredValue(HashTable::HashBucketLock const&, Item const&) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/src/hash_table.cc:450:14
            #3 0x2b36acf in EphemeralVBucket::updateStoredValue(HashTable::HashBucketLock const&, StoredValue&, Item const&, VBQueueItemCtx const&, bool) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/src/ephemeral_vb.cc:439:24
            #4 0x2d25ad1 in VBucket::processSetInner(HashTable::FindUpdateResult&, StoredValue*&, Item&, unsigned long, bool, bool, VBQueueItemCtx const&, cb::StoreIfStatus, bool) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/src/vbucket.cc:3394:21
            #5 0x2d2129f in VBucket::processSet(HashTable::FindUpdateResult&, StoredValue*&, Item&, unsigned long, bool, bool, VBQueueItemCtx const&, cb::StoreIfStatus, bool) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/src/vbucket.cc
            #6 0x4bbe4f1 in VBucketTestBase::public_processSet(Item&, unsigned long, VBQueueItemCtx const&) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/tests/module_tests/vbucket_test.cc:230:15
            #7 0x4bbdf8a in VBucketTestBase::setOne(StoredDocKeyT<std::allocator> const&, int) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/tests/module_tests/vbucket_test.cc:171:12
            #8 0x40ac7f6 in EphemeralVBucketTest_HopefullyNotASegfault_Test::TestBody() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/tests/module_tests/ephemeral_vb_test.cc:413:9
            #9 0x4d3095e in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2469:14
            #10 0x4cf4646 in testing::Test::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2508:5
            #11 0x4cf5fe9 in testing::TestInfo::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2684:11
            #12 0x4cf7366 in testing::TestSuite::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2816:28
            #13 0x4d0ffcb in testing::internal::UnitTestImpl::RunAllTests() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:5338:44
            #14 0x4d35abf in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:2469:14
            #15 0x4d0f18e in testing::UnitTest::Run() /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../third_party/googletest/googletest/src/gtest.cc:4925:10
            #16 0x407953d in main /home/couchbase/jenkins/workspace/v_engine.ASan-UBSan_cheshire-cat/build/../kv_engine/engines/ep/tests/module_tests/ep_unit_tests_main.cc:175:16
            #17 0x7fc4e3a06c86 in __libc_start_main /build/glibc-CVJwZb/glibc-2.27/csu/../csu/libc-start.c:310
        
        SUMMARY: AddressSanitizer: heap-use-after-free /opt/gcc-10.2.0/lib/gcc/x86_64-pc-linux-gnu/10.2.0/../../../../include/c++/10.2.0/bits/atomic_base.h:426:9 in std::__atomic_base<long>::load(std::memory_order) const
        Shadow bytes around the buggy address:
          0x0c10800150e0: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 fa
          0x0c10800150f0: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 03 fa
          0x0c1080015100: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 fa
          0x0c1080015110: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 02
          0x0c1080015120: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 fa
        =>0x0c1080015130: fa fa fa fa fd fd fd[fd]fd fd fd fd fd fd fd fa
          0x0c1080015140: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fa
          0x0c1080015150: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 03 fa
          0x0c1080015160: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fa
          0x0c1080015170: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
          0x0c1080015180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        Shadow byte legend (one shadow byte represents 8 application bytes):
          Addressable:           00
          Partially addressable: 01 02 03 04 05 06 07 
          Heap left redzone:       fa
          Freed heap region:       fd
          Stack left redzone:      f1
          Stack mid redzone:       f2
          Stack right redzone:     f3
          Stack after return:      f5
          Stack use after scope:   f8
          Global redzone:          f9
          Global init order:       f6
          Poisoned by user:        f7
          Container overflow:      fc
          Array cookie:            ac
          Intra object redzone:    bb
          ASan internal:           fe
          Left alloca redzone:     ca
          Right alloca redzone:    cb
          Shadow gap:              cc
        ==14680==ABORTING
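
      The addresses in the report can be cross-checked with a little arithmetic. The sketch below assumes ASan's default x86-64 Linux shadow mapping, shadow = (addr >> 3) + 0x7fff8000; the helper names are hypothetical. It shows that the faulting address sits 24 bytes into the 83-byte freed region and maps to the bracketed `[fd]` shadow byte, which is the eighth byte of the row starting at 0x0c1080015130.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: default ASan shadow offset on x86-64 Linux.
const uint64_t kShadowOffset = 0x7fff8000ULL;

// One shadow byte describes 8 application bytes.
uint64_t shadowAddress(uint64_t addr) {
    return (addr >> 3) + kShadowOffset;
}

// Byte offset of `addr` from the start of the heap region containing it.
uint64_t offsetInRegion(uint64_t addr, uint64_t regionStart) {
    assert(addr >= regionStart);
    return addr - regionStart;
}

// Values copied from the report above.
const uint64_t kFaultAddr = 0x6080000e89b8ULL;
const uint64_t kRegionStart = 0x6080000e89a0ULL;
const uint64_t kRegionEnd = 0x6080000e89f3ULL;  // exclusive
```

      The 83-byte region needs 11 shadow bytes (ten full groups of 8 bytes plus one partial group), which matches the run of eleven `fd` bytes on the marked row.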
      

          People

            james.harrison James Harrison (Inactive)