Details
- Bug
- Resolution: Duplicate
- Major
- None
- 7.1.0
- Build: 7.1.0-1277
- 1
Description
Description:
Core dumps were found at the end of two failed tests. Investigate why KV Engine crashed. The logs for these have been collected here:
Commentary:
This may be a duplicate of MB-48384.
Steps to reproduce:
The two tests in which KV Engine crashes each run on a cluster of 3 KV-only nodes with the Magma storage backend.
Judging by the test names and parameters, they load data into collections while a hard failover and subsequent recovery are performed (recovery_type=full in the first test, recovery_type=delta in the second).
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/testexec.38389.ini GROUP=P0_failover_and_recovery_dgm,rerun=False,get-cbcollect-info=True,infra_log_level=critical,log_level=error,bucket_storage=magma,enable_dp=True,upgrade_version=7.1.0-1277 -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_hard_failover_recovery,nodes_init=3,nodes_failover=1,recovery_type=full,bucket_spec=dgm.buckets_for_rebalance_tests,data_load_stage=during,dgm=40,skip_validations=False,GROUP=P0_failover_and_recovery_dgm'
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/testexec.38389.ini GROUP=P0_failover_and_recovery_dgm,rerun=False,get-cbcollect-info=True,infra_log_level=critical,log_level=error,bucket_storage=magma,enable_dp=True,upgrade_version=7.1.0-1277 -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_hard_failover_recovery,nodes_init=3,nodes_failover=1,recovery_type=delta,bucket_spec=dgm.buckets_for_rebalance_tests,data_load_stage=during,dgm=40,skip_validations=False,GROUP=P0_failover_and_recovery_dgm'
What's the problem?
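In both cases the logs show KV Engine crashing and producing a minidump, apparently in the middle of compaction (AFAIK): both backtraces run through CompactTask::run() and MagmaKVStore::compactDBInternal(), and the process aborts via std::terminate after an exception is rethrown in EPBucket::compactionCompletionCallback().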
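One way to check the compaction hypothesis is to look at which frame sits directly above `__cxa_rethrow` in each backtrace. The Python sketch below is not part of the test framework: `parse_frames` and `frame_after_rethrow` are hypothetical helper names, and the sample lines are copied from the first backtrace in the appendix.

```python
import re

# Matches gdb frame lines such as
# "#12 0x0000000000706666 in CompactTask::run() ()"
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+ in (.+)$")


def parse_frames(bt_text):
    """Return {frame_number: description} for every frame line in gdb output."""
    frames = {}
    for line in bt_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames[int(match.group(1))] = match.group(2)
    return frames


def frame_after_rethrow(frames):
    """Return the caller of __cxa_rethrow (the frame that rethrew), or None."""
    for number in sorted(frames):
        if "__cxa_rethrow" in frames[number]:
            return frames.get(number + 1)
    return None


# Four frames copied from the first backtrace in this ticket.
sample = """\
#5 0x00007f7d3cfdb961 in std::terminate () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/eh_terminate.cc:58
#6 0x00007f7d3cfdbc46 in __cxxabiv1::__cxa_rethrow () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/eh_throw.cc:133
#7 0x00000000004c3efe in EPBucket::compactionCompletionCallback(CompactionContext&) [clone .cold] ()
#8 0x00000000008524f2 in MagmaKVStore::compactDBInternal(std::unique_lock<std::mutex>&, std::shared_ptr<CompactionContext>) ()
"""

print(frame_after_rethrow(parse_frames(sample)))
# prints: EPBucket::compactionCompletionCallback(CompactionContext&) [clone .cold] ()
```

Running the same helper over both backtraces should name the same frame, which is consistent with the two crashes sharing one cause.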
Appendix:
Here are some extracts from test.log containing the analysis of the 2 minidumps:
test.log
running: //opt/couchbase/bin/minidump-2-core /opt/couchbase/var/lib/couchbase/crash/f344d4be-47fa-4521-4c835ca8-226ee566.dmp > /opt/couchbase/var/lib/couchbase/crash/f344d4be-47fa-4521-4c835ca8-226ee566.core
running: gdb --batch /opt/couchbase/bin/memcached -c /opt/couchbase/var/lib/couchbase/crash/f344d4be-47fa-4521-4c835ca8-226ee566.core -ex "bt full" -ex quit
172.23.123.119: Stack Trace of first crash - f344d4be-47fa-4521-4c835ca8-226ee566.dmp
Core was generated by `/opt/couchbase/bin/memcached -C /opt/couchbase/var/lib/couchbase/config/memcach'.
#0 0x00007f7d3c685337 in raise () from /lib64/libc.so.6
#0 0x00007f7d3c685337 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f7d3c686a28 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007f7d3cfd063c in __gnu_cxx::__verbose_terminate_handler () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/vterminate.cc:95
terminating = false
t = <optimized out>
#3 0x0000000000a99a3b in backtrace_terminate_handler() ()
No symbol table info available.
#4 0x00007f7d3cfdb8f6 in __cxxabiv1::__terminate(void (*)()) () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/eh_terminate.cc:48
No locals.
#5 0x00007f7d3cfdb961 in std::terminate () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/eh_terminate.cc:58
No locals.
#6 0x00007f7d3cfdbc46 in __cxxabiv1::__cxa_rethrow () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/eh_throw.cc:133
globals = <optimized out>
header = <optimized out>
#7 0x00000000004c3efe in EPBucket::compactionCompletionCallback(CompactionContext&) [clone .cold] ()
No symbol table info available.
#8 0x00000000008524f2 in MagmaKVStore::compactDBInternal(std::unique_lock<std::mutex>&, std::shared_ptr<CompactionContext>) ()
No symbol table info available.
#9 0x0000000000852f76 in MagmaKVStore::compactDB(std::unique_lock<std::mutex>&, std::shared_ptr<CompactionContext>) ()
No symbol table info available.
#10 0x00000000007eb3ba in EPBucket::compactInternal(LockedVBucketPtr&, CompactionConfig&) ()
No symbol table info available.
#11 0x00000000007ec981 in EPBucket::doCompact(Vbid, CompactionConfig&, std::vector<CookieIface const*, std::allocator<CookieIface const*> >&) ()
No symbol table info available.
#12 0x0000000000706666 in CompactTask::run() ()
No symbol table info available.
#13 0x0000000000a0ae52 in GlobalTask::execute() ()
No symbol table info available.
#14 0x0000000000a07f75 in FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}::operator()() const ()
No symbol table info available.
#15 0x0000000000b59b30 in folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) ()
No symbol table info available.
#16 0x0000000000b418ea in folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>) ()
No symbol table info available.
#17 0x0000000000b5cae9 in void folly::detail::function::FunctionTraits<void ()>::callBig<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)> >(folly::detail::function::Data&) ()
No symbol table info available.
#18 0x0000000000a07c04 in void folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(folly::detail::function::Data&) ()
No symbol table info available.
#19 0x00007f7d3d004d40 in execute_native_thread_routine () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/src/c++11/thread.cc:80
No locals.
#20 0x00007f7d3ee24e65 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#21 0x00007f7d3c74d88d in clone () from /lib64/libc.so.6
No symbol table info available.
##############################
running: gdb -p `(pidof memcached)` -ex "thread apply all bt" -ex detach -ex quit
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
...
test.log
172.23.121.222: 1 core dump seen
running: //opt/couchbase/bin/minidump-2-core /opt/couchbase/var/lib/couchbase/crash/53b2e9b9-7a98-45a5-ad5431aa-ef85228a.dmp > /opt/couchbase/var/lib/couchbase/crash/53b2e9b9-7a98-45a5-ad5431aa-ef85228a.core
running: gdb --batch /opt/couchbase/bin/memcached -c /opt/couchbase/var/lib/couchbase/crash/53b2e9b9-7a98-45a5-ad5431aa-ef85228a.core -ex "bt full" -ex quit
172.23.121.222: Stack Trace of first crash - 53b2e9b9-7a98-45a5-ad5431aa-ef85228a.dmp
Core was generated by `/opt/couchbase/bin/memcached -C /opt/couchbase/var/lib/couchbase/config/memcach'.
#0 0x00007fa8df4f9337 in raise () from /lib64/libc.so.6
#0 0x00007fa8df4f9337 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007fa8df4faa28 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007fa8dfe4463c in __gnu_cxx::__verbose_terminate_handler () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/vterminate.cc:95
terminating = false
t = <optimized out>
#3 0x0000000000a99a3b in backtrace_terminate_handler() ()
No symbol table info available.
#4 0x00007fa8dfe4f8f6 in __cxxabiv1::__terminate(void (*)()) () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/eh_terminate.cc:48
No locals.
#5 0x00007fa8dfe4f961 in std::terminate () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/eh_terminate.cc:58
No locals.
#6 0x00007fa8dfe4fc46 in __cxxabiv1::__cxa_rethrow () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/eh_throw.cc:133
globals = <optimized out>
header = <optimized out>
#7 0x00000000004c3efe in EPBucket::compactionCompletionCallback(CompactionContext&) [clone .cold] ()
No symbol table info available.
#8 0x00000000008524f2 in MagmaKVStore::compactDBInternal(std::unique_lock<std::mutex>&, std::shared_ptr<CompactionContext>) ()
No symbol table info available.
#9 0x0000000000852f76 in MagmaKVStore::compactDB(std::unique_lock<std::mutex>&, std::shared_ptr<CompactionContext>) ()
No symbol table info available.
#10 0x00000000007eb3ba in EPBucket::compactInternal(LockedVBucketPtr&, CompactionConfig&) ()
No symbol table info available.
#11 0x00000000007ec981 in EPBucket::doCompact(Vbid, CompactionConfig&, std::vector<CookieIface const*, std::allocator<CookieIface const*> >&) ()
No symbol table info available.
#12 0x0000000000706666 in CompactTask::run() ()
No symbol table info available.
#13 0x0000000000a0ae52 in GlobalTask::execute() ()
No symbol table info available.
#14 0x0000000000a07f75 in FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}::operator()() const ()
No symbol table info available.
#15 0x0000000000b59b30 in folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) ()
No symbol table info available.
#16 0x0000000000b418ea in folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>) ()
No symbol table info available.
#17 0x0000000000b5cae9 in void folly::detail::function::FunctionTraits<void ()>::callBig<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)> >(folly::detail::function::Data&) ()
No symbol table info available.
#18 0x0000000000a07c04 in void folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(folly::detail::function::Data&) ()
No symbol table info available.
#19 0x00007fa8dfe78d40 in execute_native_thread_routine () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/src/c++11/thread.cc:80
No locals.
#20 0x00007fa8e1c98e65 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#21 0x00007fa8df5c188d in clone () from /lib64/libc.so.6
No symbol table info available.
##############################
running: gdb -p `(pidof memcached)` -ex "thread apply all bt" -ex detach -ex quit
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
...
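To repeat the minidump analysis by hand, the two `running:` commands from test.log can be wrapped in a small script. This is a minimal sketch, assuming a POSIX host with the same install paths as in the log. The function names are hypothetical and this is not the test framework's own code.

```python
import subprocess
from pathlib import PurePosixPath

# Paths as they appear in the test.log extracts above.
MINIDUMP_2_CORE = "/opt/couchbase/bin/minidump-2-core"
MEMCACHED = "/opt/couchbase/bin/memcached"


def core_path_for(dump_path):
    """Derive the .core file name that sits next to a .dmp file."""
    return str(PurePosixPath(dump_path).with_suffix(".core"))


def gdb_command(core_path):
    """Build the same batch-mode gdb invocation that test.log shows."""
    return ["gdb", "--batch", MEMCACHED, "-c", core_path,
            "-ex", "bt full", "-ex", "quit"]


def backtrace_from_minidump(dump_path):
    """Convert a minidump to a core file and return gdb's full backtrace."""
    core_path = core_path_for(dump_path)
    # minidump-2-core writes the core to stdout, as in the logged command.
    with open(core_path, "wb") as core_file:
        subprocess.run([MINIDUMP_2_CORE, dump_path],
                       stdout=core_file, check=True)
    result = subprocess.run(gdb_command(core_path),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `backtrace_from_minidump` with one of the `.dmp` paths from the log would be expected to reproduce the corresponding `bt full` output, provided the matching memcached binary is installed on the host.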
Issue Links
- duplicates: MB-48384 [Magma] PurgeSeqno Caught unhandled std::exception-derived exception. what(): Monotonic<m> (unlabelled) invariant failed: new value (0) breaks invariant on current value (Closed)
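The title of MB-48384 describes a monotonic invariant being violated by a new value of 0. As a rough illustration of that kind of check, here is a Python sketch of a value holder that refuses to go backwards. It is an analogy only, not the KV Engine implementation: the class names are hypothetical, the strictly-increasing rule is an assumption, and the current value of 5 used below is made up for the example.

```python
class MonotonicInvariantError(RuntimeError):
    """Raised when an update would move the value backwards (hypothetical)."""


class Monotonic:
    """Holds a value that may only increase (assumed: strictly)."""

    def __init__(self, initial=0):
        self._value = initial

    @property
    def value(self):
        return self._value

    def set(self, new_value):
        # Reject any update that does not move the value forward.
        if new_value <= self._value:
            raise MonotonicInvariantError(
                f"invariant failed: new value ({new_value}) breaks "
                f"invariant on current value ({self._value})")
        self._value = new_value


purge_seqno = Monotonic(5)  # illustrative starting value, not from the logs
try:
    purge_seqno.set(0)      # mirrors the "new value (0)" in the issue title
except MonotonicInvariantError as error:
    print(f"caught: {error}")
```

In this sketch the exception is caught and reported. If nothing caught it, the program would exit with an unhandled exception, which is the analogue of the abort seen in the backtraces.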