Details
-
Bug
-
Resolution: Duplicate
-
Critical
-
None
-
7.2.0
-
7.2.0-5211
-
Untriaged
-
Centos 64-bit
-
-
0
-
Unknown
Description
Steps To Recreate:
- Create a 3-node cluster.
- Create a magma bucket with bucket_history_retention_seconds=86400 and bucket_history_retention_bytes=99636764160 (vbuckets = 16, replicas = 2).
- Create 14 collections (total collection count is 15, including the default collection).
- After creating the collections, update the collection history setting to true.
- Create 5 million docs in each collection.
- Upsert all the documents three times.
- Total data on disk is close to 300GB (hence history starts getting cleared).
- Now perform continuous dedupe mutations (for 10000 docs, 300 iterations).
- Keep killing memcached (sleep between two memcached kills is 60 to 90 seconds; before the next SIGKILL the test waits for cluster warmup to finish).
- While data loading and SIGKILLs are going on, keep deleting and recreating five collections (each recreated with the same name; sleep between two deletes is 60 to 90 seconds).
- Observed: memcached crashed in DefragmentVisitor::visit(HashTable::HashBucketLock const&, StoredValue&).
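The kill-and-wait cadence described in the steps above can be sketched roughly as follows. This is only an illustration, not the actual testrunner code: `crash_loop`, `kill_memcached` and `wait_for_warmup` are hypothetical names, and the two callbacks stand in for whatever the test uses to send SIGKILL and to poll for warmup completion.

```python
import random
import time


def crash_loop(kill_memcached, wait_for_warmup, iterations,
               min_sleep=60, max_sleep=90, sleep=time.sleep, rng=random):
    """Kill memcached repeatedly, waiting for warmup before the next kill.

    kill_memcached and wait_for_warmup are hypothetical callbacks supplied
    by the caller; they are not part of any real test framework API.
    """
    delays = []
    for _ in range(iterations):
        kill_memcached()       # assumption: sends SIGKILL to memcached on a node
        wait_for_warmup()      # assumption: blocks until cluster warmup finishes
        delay = rng.uniform(min_sleep, max_sleep)  # 60 to 90 seconds per the steps
        sleep(delay)
        delays.append(delay)
    return delays
```

Passing `sleep` and `rng` as parameters keeps the loop easy to dry-run without real delays.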
The core dump below was found on node 172.23.107.221 at 8:22:26 PM. Before the core dump, memcached was SIGKILLed on this node at 8:19:52 PM.
[Thu Mar 2 20:22:26 2023] NonIoPool1[49857]: segfault at 121e7be04 ip 000000000080fae0 sp 00007f9295fe8ee0 error 4 in memcached[400000+a77000]
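For readers less familiar with the kernel's segfault line, the sketch below decodes its fields. It assumes the usual x86 page-fault error-code layout (bit 0: page was present, bit 1: write access, bit 2: user mode) and that the bracketed pair is the mapping's start address and size. The function name `decode_segfault` is made up for this illustration.

```python
import re

# Matches the x86 Linux "segfault at ..." kernel log format seen above.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<image>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the decoded fields of a kernel segfault log line."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    err = int(m["err"], 16)
    return {
        "fault_address": int(m["addr"], 16),
        "image": m["image"],
        # Offset of the faulting instruction from the start of the mapping.
        "ip_offset_in_image": int(m["ip"], 16) - int(m["base"], 16),
        "page_present": bool(err & 1),   # bit 0: 0 means page not mapped
        "write_access": bool(err & 2),   # bit 1: 0 means the access was a read
        "user_mode": bool(err & 4),      # bit 2: 1 means fault in user mode
    }
```

Under those assumptions, `error 4` here reads as a user-mode read of an unmapped page. The fault address 0x121e7be04 is 4 bytes past the `this` pointer 0x121e7be00 shown for `valueSize` in frame #2 of the backtrace.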
BackTrace:
(gdb) bt full
|
#0 load (__m=std::memory_order_seq_cst, this=0x121e7be04)
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/blob.h:63
|
No locals.
|
#1 operator std::__atomic_base<unsigned int>::__int_type (this=0x121e7be04)
|
at /opt/gcc-10.2.0/include/c++/10.2.0/bits/atomic_base.h:289
|
No locals.
|
#2 valueSize (this=0x121e7be00) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/blob.h:63
|
No locals.
|
#3 valuelen (this=0x7f923dd781c0)
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/stored-value.h:575
|
No locals.
|
#4 DefragmentVisitor::visit(HashTable::HashBucketLock const&, StoredValue&) ()
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/defragmenter_visitor.cc:42
|
value_len = <optimized out>
|
#5 0x00000000006ee7cd in HashTable::pauseResumeVisit(HashTableVisitor&, HashTable::Position&) ()
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/hash_table.cc:1288
|
tmp = 0x48cdf9e8dd00
|
lh = {bucketNum = 1077, htLock = {_M_device = 0x7f92b474feb8, _M_owns = true}}
|
v = <optimized out>
|
paused = false
|
lh = {_M_device = 0x7f92b474f800, _M_owns = false}
|
lock = 43
|
hash_bucket = 1077
|
|
#6 0x000000000072ee6b in PauseResumeVBAdapter::visit (this=0x7f92943aa9d0, vb=...)
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/vb_visitors.cc:51
|
ht_start = {ht_size = 0, lock = 0, hash_bucket = 0}
|
#7 0x00000000006fe682 in KVBucket::pauseResumeVisit(PauseResumeVBVisitor&, KVBucketIface::Position&, VBucketFilter*) ()
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/kv_bucket.cc:2380
|
paused = <optimized out>
|
vb = {<std::__shared_ptr<VBucket, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<VBucket, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x7f92743cd700, _M_refcount = {
|
_M_pi = 0x7f92fc692c60}}, <No data fields>}
|
vbid = {vbid = 2}
|
#8 0x000000000080e795 in DefragmenterTask::defrag() ()
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/ep_engine.h:661
|
currentFragStats = {allocatedBytes = 685207056, residentBytes = 967065600}
|
visitor = @0x7f92fc23b2e0: <error reading variable>
|
start = {__d = {__r = 43046279973754380}}
|
end = <optimized out>
|
completed = <optimized out>
|
#9 0x000000000080f8c8 in DefragmenterTask::run() ()
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/defragmenter.cc:56
|
phosphor_internal_category_enabled_53 = {_M_b = {_M_p = 0x0}, static is_always_lock_free = <optimized out>}
|
phosphor_internal_category_enabled_temp_53 = <optimized out>
|
phosphor_internal_tpi_53 = {category = 0x0, name = 0x0, type = phosphor::AsyncStart, argument_names = {_M_elems = {
|
0x0, 0x0}}, argument_types = {_M_elems = {phosphor::is_bool, phosphor::is_bool}}}
|
phosphor_internal_guard_53 = {tpi = 0x107a840 <DefragmenterTask::run()::phosphor_internal_tpi_53>, enabled = true,
|
arg1 = {<No data fields>}, arg2 = {<No data fields>}, start = {__d = {__r = 43046279973612654}}}
|
sleepTime = <optimized out>
|
#10 0x0000000000ab63e9 in GlobalTask::execute(std::basic_string_view<char, std::char_traits<char> >) ()
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/globaltask.cc:98
|
|
guard = {previous = 0x0}
|
start = <optimized out>
|
runAgain = <optimized out>
|
#11 0x0000000000aafaaa in FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}::operator()() const (
|
__closure=0x7f9295fe9650)
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:309
|
runAgain = <optimized out>
|
proxy = @0x7f92fc45b230: <error reading variable>
|
#12 0x0000000000ab779e in operator() (this=0x7f9295fe9650)
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/cancellable_cpu_executor.cc:42
|
fn = @0x7f9295fe9650: {<folly::detail::function::FunctionTraits<void()>> = {<No data fields>}, data_ = {
|
big = 0x7f92fc45b230, tiny = {
|
__data = "0\262E\374\222\177\000\000@\227\376\225\222\177", '\000' <repeats 11 times>, "\320M\374\222\177\000\000/\000\000\000\000\000\000\000\355a\300\000\000\000\000", __align = {<No data fields>}}},
|
call_ = 0xaaffe0 <folly::detail::function::FunctionTraits<void ()>::callSmall<FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}>(folly::detail::function::Data&)>,
|
exec_ = 0xaae4d0 <folly::detail::function::execSmall<FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}>(folly::detail::function::Op, folly::detail::function::Data*, folly::detail::function::Data)>}
|
#13 CancellableCPUExecutor::add(GlobalTask*, folly::Function<void ()>)::{lambda()#1}::operator()() const ()
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/cancellable_cpu_executor.cc:42
|
task = {storage_ = {{emptyState = -48 '\320', value = {task = 0x7f9294683dd0,
|
func = {<folly::detail::function::FunctionTraits<void()>> = {<No data fields>}, data_ = {big = 0x7f92fc45b230,
|
tiny = {
|
__data = "0\262E\374\222\177\000\000@\227\376\225\222\177", '\000' <repeats 11 times>, "\320M\374\222\177\000\000/\000\000\000\000\000\000\000\355a\300\000\000\000\000", __align = {<No data fields>}}},
|
call_ = 0xaaffe0 <folly::detail::function::FunctionTraits<void ()>::callSmall<FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}>(folly::detail::function::Data&)>,
|
exec_ = 0xaae4d0 <folly::detail::function::execSmall<FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}>(folly::detail::function::Op, folly::detail::function::Data*, folly::detail::function::Data)>}}}, hasValue = true}}
|
|
this = <optimized out>
|
#14 0x0000000000c157c0 in operator() (this=0x7f9295fe9840)
|
at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/Function.h:416
|
fn = @0x7f9295fe9840: {<folly::detail::function::FunctionTraits<void()>> = {<No data fields>}, data_ = {
|
big = 0x7f930814ac00, tiny = {
|
__data = "\000\254\024\b\223\177\000\000\320wG\f\223\177\000\000\060\000\000\000\000\000\000\000\301\302\000\000\000\000\000\000\020\000\000\000\000\000\000\000\020\231\376\225\222\177\000", __align = {<No data fields>}}},
|
call_ = 0xab7b20 <folly::detail::function::FunctionTraits<void()>::callSmall<CancellableCPUExecutor::add(GlobalTask*, folly::Func)::<lambda()> >(folly::detail::function::Data &)>,
|
exec_ = 0xab70d0 <folly::detail::function::execSmall<CancellableCPUExecutor::add(GlobalTask*, folly::Func)::<lambda()> >(folly::detail::function::Op, folly::detail::function::Data *, folly::detail::function::Data *)>}
|
|
#15 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) (this=this@entry=0x7f930814ad00, thread=...,
|
task=task@entry=<unknown type in /usr/lib/debug/opt/couchbase/bin/memcached-7.2.0-5211.x86_64.debug, CU 0xa8cf48a, DIE 0xa9533bf>)
|
at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/executors/ThreadPoolExecutor.cpp:97
|
rctx = {
|
prev_ = {<std::__shared_ptr<folly::RequestContext, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<folly::RequestContext, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x0, _M_refcount = {
|
_M_pi = 0x0}}, <No data fields>}}
|
startTime = {__d = {__r = 43046279973607835}}
|
stats = {expired = false, waitTime = {__r = 581136}, runTime = {__r = 0}, enqueueTime = {__d = {
|
__r = 43046279973026699}}, requestId = 0}
|
#16 0x0000000000c0025a in folly::CPUThreadPoolExecutor::threadRun (this=0x7f930814ad00, thread=...)
|
at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/executors/CPUThreadPoolExecutor.cpp:265
|
task = {storage_ = {{emptyState = 0 '\000', value = {<folly::ThreadPoolExecutor::Task> = {
|
func_ = {<folly::detail::function::FunctionTraits<void()>> = {<No data fields>}, data_ = {
|
big = 0x7f930814ac00, tiny = {
|
__data = "\000\254\024\b\223\177\000\000\320wG\f\223\177\000\000\060\000\000\000\000\000\000\000\301\302\000\000\000\000\000\000\020\000\000\000\000\000\000\000\020\231\376\225\222\177\000", __align = {<No data fields>}}},
|
call_ = 0xab7b20 <folly::detail::function::FunctionTraits<void()>::callSmall<CancellableCPUExecutor::add(GlobalTask*, folly::Func)::<lambda()> >(folly::detail::function::Data &)>,
|
exec_ = 0xab70d0 <folly::detail::function::execSmall<CancellableCPUExecutor::add(GlobalTask*, folly::Func)::<lambda()> >(folly::detail::function::Op, folly::detail::function::Data *, folly::detail::function::Data *)>}, enqueueTime_ = {
|
__d = {__r = 43046279973026699}}, expiration_ = {__r = 0},
|
expireCallback_ = {<folly::detail::function::FunctionTraits<void()>> = {<No data fields>}, data_ = {
|
big = 0xc2c1, tiny = {
|
__data = "\301\302\000\000\000\000\000\000\273g\246", '\000' <repeats 13 times>, "_\276&\f\223\177\000\000p\312~\266\222\177\000\000@VG\f\223\177\000", __align = {<No data fields>}}}, call_ = 0x46752f
|
<folly::detail::function::FunctionTraits<void ()>::uninitCall(folly::detail::function::Data&)>, exec_ = 0x0},
|
context_ = {<std::__shared_ptr<folly::RequestContext, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<folly::RequestContext, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x0, _M_refcount = {
|
_M_pi = 0x0}}, <No data fields>}}, poison = false, priority_ = 0 '\000',
|
queueObserverPayload_ = 140269492333952}}, hasValue = true}}
|
guard = {list_ = {forbid = true, prev = 0x0, curr = {name = {static npos = <optimized out>,
|
b_ = 0xcdaacb "CPUThreadPoolExecutor", e_ = 0xcdaae0 ""}}}}
|
#17 0x0000000000c18779 in __invoke_impl<void, void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&> (__t=<optimized out>,
|
__f=<optimized out>) at /usr/local/include/c++/7.3.0/bits/invoke.h:73
|
No locals.
|
#18 __invoke<void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&> (__fn=<optimized out>)
|
at /usr/local/include/c++/7.3.0/bits/invoke.h:95
|
No locals.
|
#19 __call<void, 0, 1> (__args=<optimized out>, this=<optimized out>) at /usr/local/include/c++/7.3.0/functional:467
|
No locals.
|
#20 operator()<> (this=<optimized out>) at /usr/local/include/c++/7.3.0/functional:551
|
No locals.
|
#21 folly::detail::function::FunctionTraits<void ()>::callBig<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)> >(folly::detail::function::Data&) (p=...)
|
at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/Function.h:401
|
fn = <optimized out>
|
#22 0x0000000000aaf7a4 in operator() (this=0x7f93085f1fc0)
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:49
|
fn = @0x7f93085f1fc0: <error reading variable>
|
#23 operator() (__closure=0x7f93085f1fc0)
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:49
|
threadNameOpt = {storage_ = {{emptyState = -96 '\240', value = {static npos = 18446744073709551615,
|
_M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7f9295fe99a0 "NonIoPool1"}, _M_string_length = 10, {_M_local_buf = "NonIoPool1\000\000\000\000\000",
|
_M_allocated_capacity = 8029725099528449870}}}, hasValue = true}}
|
func = <error reading variable func (Cannot access memory at address 0x7f93085f1fc0)>
|
#24 folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(folly::detail::function::Data&) (p=...)
|
at /home/couchbase/jenkins/workspace/couchbase-server-unix/server_build/tlm/deps/folly.exploded/include/folly/Function.h:401
|
fn = @0x7f93085f1fc0: <error reading variable>
|
#25 0x00007f930a032d40 in execute_native_thread_routine ()
|
at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/src/c++11/thread.cc:80
|
No locals.
|
#26 0x00007f930be3aea5 in start_thread () from /lib64/libpthread.so.0
|
No symbol table info available.
|
#27 0x00007f930977bb0d in clone () from /lib64/libc.so.6
|
No symbol table info available.
|
QE-TEST:
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/temp_vol.ini -p bucket_storage=magma,bucket_ram_quota=1024,init_loading=True,bucket_eviction_policy=fullEviction,rerun=False -t storage.magma.magma_crash_recovery.MagmaCrashTests.test_crash_during_dedupe,nodes_init=3,skip_cleanup=True,num_items=5000000,doc_size=1024,batch_size=100,sdk_timeout=60,log_level=info,infra_log_level=info,key_size=12,num_collections=15,ops_rate=20000,key_type=RandomKey,vbuckets=16,replicas=2,test_itr=3,bucket_history_retention_seconds=86400,bucket_history_retention_bytes=99636764160,standard_buckets=1,magma_buckets=1,num_scopes=1,autoCompactionDefined=true,meta_purge_interval=120,randomize_value=True,num_collections_to_drop=5 -m rest'
Issue Links
- duplicates MB-55709 [CDC] Memcached crashed in Cookie::initialize(cb::mcbp::Header const&, bool) () at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/daemon/cookie.cc (Closed)