Details
- Bug
- Resolution: Fixed
- Major
- 7.6.0
- Triaged
- 0
- Unknown
Description
I'm running cluster_run and hitting this crash with a simple data-loading workload. It happens on both Magma and Couchstore buckets. I've attached the cbcollect from the Couchstore bucket repro.
Note that I encountered this during some fusion testing. However, the fusion code only changes Magma, not Couchstore. Since I'm seeing the crash on a Couchstore bucket as well, I decided to log this issue.
These are the relevant commits I'm on:
root@ubu20-se39:~/rohan/kv_engine# git log -1
WARNING: terminal is not fully functional
commit 19335ef69c8a8a0dbc4c8d0eeb5b2bf21ceb48a7 (HEAD -> master, m/master, couchbase/master)
Merge: f35208740 8f2389b3c
Author: Gerrit Code Review <gerrit@4edf0475e841>
Date: Fri Jan 26 15:41:05 2024 +0000

    Merge "Merge commit trinity/67934e940 into master"

root@ubu20-se39:~/rohan/couchstore# git log -1
WARNING: terminal is not fully functional
commit 86f3d74bd0017d48aedb77051ad386d9082dfeaf (HEAD, m/master, couchbase/trinity, couchbase/master)
Author: Trond Norbye <trond.norbye@gmail.com>
Date: Tue Jan 2 08:22:33 2024 +0100

    MB-59041: Verify the existence rather than size

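To make sure that no Magma changes are getting in the way, I also checked out the master branch of Magma. I still see the issue.
The crash is below. See frame 9, where bytesToEvict is implausibly large (18446744055833456664, just under 2^64), and frame 8, where gsl::narrow<long, unsigned long> throws gsl::narrowing_error on that value: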
Core was generated by `/root/rohan/install/bin/memcached -C /data/rohan/cluster_run/data/n_0/config/me'.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 22228 (LWP 3095395)]
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f4212c7e859 in __GI_abort () at abort.c:79
#2 0x00007f42130588d1 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00000000011c629e in backtrace_terminate_handler ()
    at /root/rohan/kv_engine/utilities/terminate_handler.cc:88
#4 0x00007f421306437c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007f42130643e7 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007f4213064699 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00000000012fb396 in __cxxabiv1::__cxa_throw (thrownException=0x7f3e100019c0,
    type=0x1aa77b8 <typeinfo for gsl::narrowing_error>, destructor=
    0x702bec <gsl::narrowing_error::~narrowing_error()>)
    at /home/couchbase/jenkins/cbdeps-ws/deps/packages/build/folly/folly.debug-prefix/src/folly.debug/folly/experimental/exception_tracer/ExceptionTracerLib.cpp:106
#8 0x0000000000cd7382 in gsl::narrow<long, unsigned long> (u=18446744055833456664)
    at /root/rohan/third_party/gsl-lite/include/gsl/gsl-lite.hpp:2215
#9 0x0000000000d26f22 in StrictQuotaItemPager::getEvictionRatios (this=0x7f41e436c910,
    kvBuckets=std::vector of length 1, capacity 1 = {...}, bytesToEvict=18446744055833456664)
    at /root/rohan/kv_engine/engines/ep/src/item_pager.cc:111
#10 0x0000000000d27b98 in StrictQuotaItemPager::schedulePagingVisitors (this=0x7f41e436c910,
    bytesToEvict=18446744055833456664) at /root/rohan/kv_engine/engines/ep/src/item_pager.cc:291
#11 0x0000000000d27377 in ItemPager::runPager (this=0x7f41e436c970, manuallyNotified=true)
    at /root/rohan/kv_engine/engines/ep/src/item_pager.cc:180
#12 0x0000000000d499a9 in StrictQuotaItemPager::runInner (this=0x7f41e436c910, manuallyNotified=true)
    at /root/rohan/kv_engine/engines/ep/src/item_pager.h:146
#13 0x0000000000da914a in EpNotifiableTask::run (this=0x7f41e436c910)
    at /root/rohan/kv_engine/engines/ep/src/ep_task.cc:56
#14 0x00000000010b2981 in GlobalTask::execute (this=0x7f41e436c910, threadName="NonIoPool0")
    at /root/rohan/kv_engine/executor/globaltask.cc:79
#15 0x0000000000da8fec in EpTask::execute (this=0x7f41e436c910, threadName="NonIoPool0")
    at /root/rohan/kv_engine/engines/ep/src/ep_task.cc:43
#16 0x00000000010b8526 in FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}::operator()() const (
    __closure=0x7f3f097ee4d0) at /root/rohan/kv_engine/executor/folly_executorpool.cc:163
#17 0x00000000010c3ed5 in folly::detail::function::FunctionTraits<void ()>::callSmall<FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}>(folly::detail::function::Data&) (p=...)
    at /root/rohan/build/tlm/deps/folly.exploded/include/folly/Function.h:363
#18 0x00000000010ba697 in folly::detail::function::FunctionTraits<void ()>::operator()() (this=0x7f3f097ee4d0)
    at /root/rohan/build/tlm/deps/folly.exploded/include/folly/Function.h:392
#19 0x00000000010a343e in operator() (__closure=0x7f3f097ee700) at /root/rohan/kv_engine/executor/cancellable_cpu_executor.cc:42
#20 0x00000000010b4b61 in folly::detail::function::FunctionTraits<void()>::callSmall<CancellableCPUExecutor::add(GlobalTask*, folly::Func)::<lambda()> >(folly::detail::function::Data &) (p=...)
    at /root/rohan/build/tlm/deps/folly.exploded/include/folly/Function.h:363
#21 0x00000000010ba697 in folly::detail::function::FunctionTraits<void ()>::operator()() (this=0x7f3f097ee700)
    at /root/rohan/build/tlm/deps/folly.exploded/include/folly/Function.h:392
#22 0x00000000012efcd4 in folly::ThreadPoolExecutor::runTask (this=0x7f420f4f9500,
    thread=<error reading variable: Cannot access memory at address 0x7f40c290fbe8>, task=...)
    at /home/couchbase/jenkins/cbdeps-ws/deps/packages/build/folly/folly.debug-prefix/src/folly.debug/folly/executors/ThreadPoolExecutor.cpp:98
#23 0x00000000012c5fb1 in folly::CPUThreadPoolExecutor::threadRun (this=0x7f420f4f9500,
    thread=<error reading variable: Cannot access memory at address 0x7f40c290fbe8>)
    at /home/couchbase/jenkins/cbdeps-ws/deps/packages/build/folly/folly.debug-prefix/src/folly.debug/folly/executors/CPUThreadPoolExecutor.cpp:306
#24 0x00000000012f8dde in std::__invoke_impl<void, void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&> (__f=<error reading variable>,
    __t=<error reading variable>) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:73
#25 0x00000000012f82c9 in std::__invoke<void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&> (__fn=<error reading variable>)
    at /opt/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:95
#26 0x00000000012f7475 in std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)>::__call<void, , 0ul, 1ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul>) (this=0x7f40c2998c40, __args=...) at /opt/gcc-10.2.0/include/c++/10.2.0/functional:416
#27 0x00000000012f630e in std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)>::operator()<, void>() (this=0x7f40c2998c40) at /opt/gcc-10.2.0/include/c++/10.2.0/functional:499
#28 0x00000000012f500e in folly::detail::function::FunctionTraits<void ()>::callSmall<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)> >(folly::detail::function::Data&) (p=...) at /home/couchbase/jenkins/cbdeps-ws/deps/packages/build/folly/folly.debug-prefix/src/folly.debug/folly/Function.h:363
#29 0x00000000010ba697 in folly::detail::function::FunctionTraits<void ()>::operator()() (this=0x7f40c2998c40) at /root/rohan/build/tlm/deps/folly.exploded/include/folly/Function.h:392
#30 0x00000000010b7cab in CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}::operator()() (__closure=0x7f40c2998c40) at /root/rohan/kv_engine/executor/folly_executorpool.cc:49
#31 0x00000000010c3b7a in folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(folly::detail::function::Data&) (p=...) at /root/rohan/build/tlm/deps/folly.exploded/include/folly/Function.h:377
#32 0x00000000010ba697 in folly::detail::function::FunctionTraits<void ()>::operator()() (this=0x7f40c290fe70) at /root/rohan/build/tlm/deps/folly.exploded/include/folly/Function.h:392
#33 0x00000000010b7a45 in folly::PriorityThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}::operator()() (__closure=0x7f40c290fe60) at /root/rohan/build/tlm/deps/folly.exploded/include/folly/executors/thread_factory/PriorityThreadFactory.h:52
#34 0x00000000010c3a9d in folly::detail::function::FunctionTraits<void ()>::callBig<folly::PriorityThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(folly::detail::function::Data&) (p=...) at /root/rohan/build/tlm/deps/folly.exploded/include/folly/Function.h:377
#35 0x00000000010ba697 in folly::detail::function::FunctionTraits<void ()>::operator()() (this=0x7f40c28c7590) at /root/rohan/build/tlm/deps/folly.exploded/include/folly/Function.h:392
#36 0x00000000010b6711 in folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}::operator()() (__closure=0x7f40c28c7590) at /root/rohan/build/tlm/deps/folly.exploded/include/folly/executors/thread_factory/NamedThreadFactory.h:40
#37 0x00000000010e9a44 in std::__invoke_impl<void, folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(std::__invoke_other, folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}&&) (__f=...) at /usr/include/c++/10/bits/invoke.h:60
#38 0x00000000010e99ed in std::__invoke<folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(std::__invoke_result&&, (folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}&&)...) (__fn=...) at /usr/include/c++/10/bits/invoke.h:95
#39 0x00000000010e998e in std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this=0x7f40c28c7590) at /usr/include/c++/10/thread:264
#40 0x00000000010e98aa in std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}> >::operator()() (this=0x7f40c28c7590) at /usr/include/c++/10/thread:271
#41 0x00000000010e97b2 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}> > >::_M_run() (this=0x7f40c28c7580) at /usr/include/c++/10/thread:215
#42 0x00007f4213090df4 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#43 0x00007f4213694609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#44 0x00007f4212d7b353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
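For reference, the value passed down through frames 8 to 10 equals 2^64 - 17876094952, so it looks like a negative byte count that wrapped around as an unsigned 64-bit integer. That reading is an assumption; the trace alone does not show where the value was computed. Below is a minimal C++ sketch of why a checked narrowing to a signed 64-bit type rejects such a value. `checkedNarrow` and `distanceBelowWrap` are hypothetical helpers written for illustration only; they are not the gsl-lite or kv_engine code.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical stand-in for a checked narrowing cast (assumption: this is
// NOT the gsl-lite implementation, only the same idea). It refuses any
// unsigned value that cannot be represented as a signed 64-bit integer.
std::int64_t checkedNarrow(std::uint64_t u) {
    const auto maxSigned =
        static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());
    if (u > maxSigned) {
        throw std::range_error("value does not fit in int64_t");
    }
    return static_cast<std::int64_t>(u);
}

// Unsigned arithmetic is modular, so 0 - u yields 2^64 - u for u != 0.
// For a value that wrapped below zero, this recovers its magnitude.
std::uint64_t distanceBelowWrap(std::uint64_t u) {
    return std::uint64_t{0} - u;
}
```

Calling `distanceBelowWrap(18446744055833456664ULL)` gives 17876094952, and `checkedNarrow` on the same value throws instead of silently producing a negative number, which matches the kind of failure seen in frame 8.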
Steps to repro:
- Start cluster_run as usual
- Load some data. In less than a minute, memcached will crash.
I suspect the issue may be specific to my setup, since it is so easy to reproduce that QE or the KV team would have caught it in their testing. I would still appreciate it if someone from the KV team could take a look, as it is blocking some local cluster_run testing for me.
I was also able to reproduce this on an official build two days ago, but I can no longer do so. I tried 7.6.1-3062.
Issue Links
- is caused by MB-60687 Wrong initial value type can result in std::accumulate overflow (Closed)
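The linked issue title refers to a general std::accumulate pitfall: the accumulator and the return value take the type of the initial value, not the element type. The sketch below illustrates that pitfall in isolation. The function names and the chosen types are assumptions made for the example and are not taken from the kv_engine code or from MB-60687 itself.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// The accumulator has the type of the third argument. With a 32-bit
// initial value, every partial sum is converted back to 32 bits, so the
// total wraps modulo 2^32 even though the elements are 64-bit.
std::uint32_t sumWithNarrowInit(const std::vector<std::uint64_t>& values) {
    return std::accumulate(values.begin(), values.end(), std::uint32_t{0});
}

// Same call with a 64-bit initial value: the sum is carried out in
// 64 bits and is not truncated for these inputs.
std::uint64_t sumWithWideInit(const std::vector<std::uint64_t>& values) {
    return std::accumulate(values.begin(), values.end(), std::uint64_t{0});
}
```

With the two elements 3000000000 and 3000000000, `sumWithWideInit` returns 6000000000, while `sumWithNarrowInit` returns 1705032704 (6000000000 modulo 2^32). A total that is wrong in this way can later feed a subtraction that wraps.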