Details
-
Bug
-
Resolution: Duplicate
-
Major
-
7.1.0
-
7.1.0-1458
-
Untriaged
-
Centos 64-bit
-
1
-
No
Description
Script to Repro
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/testexec.49251.ini GROUP=swap_rebalance_P0_set0,rerun=False,disk_optimized_thread_settings=True,get-cbcollect-info=True,upgrade_version=7.1.0-1458 -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_swap_rebalance,nodes_init=5,nodes_swap=2,compaction=True,bucket_spec=magma_dgm.5_percent_dgm.5_node_2_replica_magma_512,doc_size=512,randomize_value=True,data_load_stage=during,skip_validations=False,GROUP=swap_rebalance_P0_set0'
Steps to Repro
1. Create a 5 node cluster.
2021-10-10 01:13:26,608 | test | INFO | pool-5-thread-8 | [table_view:display:72] Rebalance Overview
+----------------+----------+-----------------------+---------------+--------------+
| Nodes          | Services | Version               | CPU           | Status       |
+----------------+----------+-----------------------+---------------+--------------+
| 172.23.106.130 | kv       | 7.1.0-1458-enterprise | 2.21216691805 | Cluster node |
| 172.23.104.232 | None     |                       |               | <--- IN ---  |
| 172.23.104.252 | None     |                       |               | <--- IN ---  |
| 172.23.104.76  | None     |                       |               | <--- IN ---  |
| 172.23.104.216 | None     |                       |               | <--- IN ---  |
+----------------+----------+-----------------------+---------------+--------------+
2. Create buckets/scopes/collections/data
2021-10-10 01:40:06,456 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
+---------+-----------+-----------------+----------+------------+-----+----------+-----------+------------+------------+---------------+
| Bucket  | Type      | Storage Backend | Replicas | Durability | TTL | Items    | RAM Quota | RAM Used   | Disk Used  | ARR           |
+---------+-----------+-----------------+----------+------------+-----+----------+-----------+------------+------------+---------------+
| bucket1 | couchbase | couchstore      | 2        | none       | 0   | 50000    | 9.77 GiB  | 250.87 MiB | 319.09 MiB | 100           |
| bucket2 | couchbase | magma           | 2        | none       | 0   | 50000    | 4.88 GiB  | 525.89 MiB | 557.91 MiB | 100           |
| default | couchbase | magma           | 2        | none       | 0   | 32575000 | 2.50 GiB  | 1.77 GiB   | 39.71 GiB  | 3.42467843438 |
+---------+-----------+-----------------+----------+------------+-----+----------+-----------+------------+------------+---------------+
3. Add 2 nodes (172.23.106.129 and 172.23.104.15), remove 2 nodes (172.23.104.216 and 172.23.104.76), and start a swap rebalance.
2021-10-10 01:40:19,831 | test | INFO | pool-5-thread-6 | [table_view:display:72] Rebalance Overview
+----------------+----------+-----------------------+---------------+--------------+
| Nodes          | Services | Version               | CPU           | Status       |
+----------------+----------+-----------------------+---------------+--------------+
| 172.23.104.15  | kv       | 7.1.0-1458-enterprise | 0             | Cluster node |
| 172.23.104.232 | kv       | 7.1.0-1458-enterprise | 8.14217292664 | Cluster node |
| 172.23.106.129 | kv       | 7.1.0-1458-enterprise | 0             | Cluster node |
| 172.23.104.216 | kv       | 7.1.0-1458-enterprise | 9.16887375457 | --- OUT ---> |
| 172.23.104.252 | kv       | 7.1.0-1458-enterprise | 8.40590685346 | Cluster node |
| 172.23.106.130 | kv       | 7.1.0-1458-enterprise | 10.6821921276 | Cluster node |
| 172.23.104.76  | kv       | 7.1.0-1458-enterprise | 8.25444907232 | --- OUT ---> |
+----------------+----------+-----------------------+---------------+--------------+
At this point a minidump, 29e3e32c-fd4c-4398-418dc5bf-529aecdc.dmp, is seen on 172.23.104.232.
grep CRITICAL on 172.23.104.232
Balakumarans-MacBook-Pro-2:cbcollect_info_ns_1@172.23.104.232_20211010-085015 balakumaran.g$ grep CRITICAL memcached.log
2021-10-10T01:49:05.460178-07:00 CRITICAL Detected previous crash
2021-10-10T01:49:05.460225-07:00 CRITICAL Breakpad caught a crash (Couchbase version 7.1.0-1458). Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/29e3e32c-fd4c-4398-418dc5bf-529aecdc.dmp before terminating.
2021-10-10T01:49:05.460236-07:00 CRITICAL Stack backtrace of crashed thread:
2021-10-10T01:49:05.460237-07:00 CRITICAL #0 /opt/couchbase/bin/memcached() [0x400000+0x6d5fc8]
2021-10-10T01:49:05.460239-07:00 CRITICAL #1 /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler12GenerateDumpEPNS0_12CrashContextE+0x3ea) [0x400000+0x7260da]
2021-10-10T01:49:05.460241-07:00 CRITICAL #2 /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler13SignalHandlerEiP9siginfo_tPv+0xb8) [0x400000+0x726418]
2021-10-10T01:49:05.460242-07:00 CRITICAL #3 /lib64/libpthread.so.0() [0x7fb3560a8000+0xf630]
2021-10-10T01:49:05.460244-07:00 CRITICAL #4 /opt/couchbase/bin/memcached() [0x400000+0x56f450]
2021-10-10T01:49:05.460245-07:00 CRITICAL #5 /opt/couchbase/bin/memcached() [0x400000+0x5663cf]
2021-10-10T01:49:05.460257-07:00 CRITICAL #6 /opt/couchbase/bin/memcached() [0x400000+0x5675e4]
2021-10-10T01:49:05.460259-07:00 CRITICAL #7 /opt/couchbase/bin/memcached() [0x400000+0x567fc8]
2021-10-10T01:49:05.460260-07:00 CRITICAL #8 /opt/couchbase/bin/memcached() [0x400000+0x5687cd]
2021-10-10T01:49:05.460264-07:00 CRITICAL #9 /opt/couchbase/bin/memcached() [0x400000+0x568cc8]
2021-10-10T01:49:05.460294-07:00 CRITICAL #10 /opt/couchbase/bin/memcached(_ZN5magma5Magma4Impl11syncKVStoreEtb+0x215) [0x400000+0x511885]
2021-10-10T01:49:05.460296-07:00 CRITICAL #11 /opt/couchbase/bin/memcached() [0x400000+0x511a60]
2021-10-10T01:49:05.460297-07:00 CRITICAL #12 /opt/couchbase/bin/memcached() [0x400000+0x51838e]
2021-10-10T01:49:05.460306-07:00 CRITICAL #13 /opt/couchbase/bin/memcached(_ZN5magma5Magma4Impl14CompactKVStoreEtNS0_9StoreTypeESt8functionIFSt10unique_ptrINS0_18CompactionCallbackESt14default_deleteIS5_EEtEE+0x36e) [0x400000+0x511f0e]
2021-10-10T01:49:05.460308-07:00 CRITICAL #14 /opt/couchbase/bin/memcached(_ZN5magma5Magma4Impl14CompactKVStoreEtNS0_9StoreTypeESt8functionIFSt10unique_ptrINS0_18CompactionCallbackESt14default_deleteIS5_EEtEE+0xff) [0x400000+0x511c9f]
2021-10-10T01:49:05.460310-07:00 CRITICAL #15 /opt/couchbase/bin/memcached(_ZN5magma5Magma4Impl14CompactKVStoreEtRKNS_5SliceES4_St8functionIFSt10unique_ptrINS0_18CompactionCallbackESt14default_deleteIS7_EEtEE+0x65) [0x400000+0x512095]
2021-10-10T01:49:05.460312-07:00 CRITICAL #16 /opt/couchbase/bin/memcached(_ZN5magma5Magma14CompactKVStoreEtRKNS_5SliceES3_St8functionIFSt10unique_ptrINS0_18CompactionCallbackESt14default_deleteIS6_EEtEE+0x6d) [0x400000+0x51239d]
2021-10-10T01:49:05.460336-07:00 CRITICAL #17 /opt/couchbase/bin/memcached() [0x400000+0x47766e]
2021-10-10T01:49:05.460340-07:00 CRITICAL #18 /opt/couchbase/bin/memcached() [0x400000+0x46e780]
2021-10-10T01:49:05.460341-07:00 CRITICAL #19 /opt/couchbase/bin/memcached() [0x400000+0x46ed06]
2021-10-10T01:49:05.460345-07:00 CRITICAL #20 /opt/couchbase/bin/memcached() [0x400000+0x406eca]
2021-10-10T01:49:05.460370-07:00 CRITICAL #21 /opt/couchbase/bin/memcached() [0x400000+0x4089b1]
2021-10-10T01:49:05.460372-07:00 CRITICAL #22 /opt/couchbase/bin/memcached() [0x400000+0x314096]
2021-10-10T01:49:05.460374-07:00 CRITICAL #23 /opt/couchbase/bin/memcached() [0x400000+0x654332]
2021-10-10T01:49:05.460397-07:00 CRITICAL #24 /opt/couchbase/bin/memcached() [0x400000+0x6514d5]
2021-10-10T01:49:05.460426-07:00 CRITICAL #25 /opt/couchbase/bin/memcached() [0x400000+0x7a51e0]
2021-10-10T01:49:05.460427-07:00 CRITICAL #26 /opt/couchbase/bin/memcached() [0x400000+0x78cf9a]
2021-10-10T01:49:05.460429-07:00 CRITICAL #27 /opt/couchbase/bin/memcached() [0x400000+0x7a8199]
2021-10-10T01:49:05.460430-07:00 CRITICAL #28 /opt/couchbase/bin/memcached() [0x400000+0x651164]
2021-10-10T01:49:05.460431-07:00 CRITICAL #29 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fb3541da000+0xcdd40]
2021-10-10T01:49:05.460432-07:00 CRITICAL #30 /lib64/libpthread.so.0() [0x7fb3560a8000+0x7ea5]
2021-10-10T01:49:05.460434-07:00 CRITICAL #31 /lib64/libc.so.6(clone+0x6d) [0x7fb3538f2000+0xfe8dd]
Balakumarans-MacBook-Pro-2:cbcollect_info_ns_1@172.23.104.232_20211010-085015 balakumaran.g$
bt of 29e3e32c-fd4c-4398-418dc5bf-529aecdc.dmp on 172.23.104.232
(gdb) bt
#0 __atomic_add (__val=1, __mem=0x235ca) at /opt/gcc-10.2.0/include/c++/10.2.0/ext/atomicity.h:96
#1 __atomic_add_dispatch (__val=1, __mem=0x235ca) at /opt/gcc-10.2.0/include/c++/10.2.0/ext/atomicity.h:96
#2 _M_add_ref_copy (this=0x235c2) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/shared_ptr_base.h:142
#3 __shared_count (__r=..., this=0x7fb2c5feb548) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/shared_ptr_base.h:740
#4 __shared_ptr (this=0x7fb2c5feb540) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/shared_ptr_base.h:1181
#5 shared_ptr (this=0x7fb2c5feb540) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/shared_ptr.h:149
#6 Checkpoint (this=0x7fb2c5feb540) at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/lsm/checkpoint.h:46
#7 construct<magma::KVStoreCheckpoint, magma::Checkpoint&, magma::Checkpoint&, magma::Checkpoint&> (this=<optimized out>, __p=0x7fb2a098af40) at /opt/gcc-10.2.0/include/c++/10.2.0/ext/new_allocator.h:150
#8 construct<magma::KVStoreCheckpoint, magma::Checkpoint&, magma::Checkpoint&, magma::Checkpoint&> (__p=0x7fb2a098af40, __a=...) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/alloc_traits.h:512
#9 std::vector<magma::KVStoreCheckpoint, std::allocator<magma::KVStoreCheckpoint> >::_M_realloc_insert<magma::Checkpoint&, magma::Checkpoint&, magma::Checkpoint&> (this=this@entry=0x7fb2c5feb700, __position=__position@entry=...)
    at /opt/gcc-10.2.0/include/c++/10.2.0/bits/vector.tcc:449
#10 0x00000000009663cf in emplace_back<magma::Checkpoint&, magma::Checkpoint&, magma::Checkpoint&> (this=0x7fb2c5feb700) at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/kvstore/kvstore.cc:1436
#11 magma::KVStore::getAllCheckpoints(std::vector<magma::KVStoreCheckpoint, std::allocator<magma::KVStoreCheckpoint> >&) () at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/kvstore/kvstore.cc:1436
#12 0x00000000009675e4 in magma::KVStore::verifyCheckpoints() () at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/kvstore/kvstore.cc:1482
#13 0x0000000000967fc8 in magma::KVStore::flushMemTables(magma::WAL*, magma::WALOffset, magma::FlushMode, magma::BlockingMode)::{lambda()#2}::operator()() () at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/kvstore/kvstore.cc:513
#14 0x00000000009687cd in magma::KVStore::flushMemTables(magma::WAL*, magma::WALOffset, magma::FlushMode, magma::BlockingMode) () at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/kvstore/kvstore.cc:524
#15 0x0000000000968cc8 in magma::KVStore::FlushMemTables (this=<optimized out>, wal=<optimized out>, flushMode=<optimized out>, blockMode=<optimized out>) at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/kvstore/kvstore.cc:343
#16 0x0000000000911885 in magma::Magma::Impl::syncKVStore(unsigned short, bool) () at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/db.cc:1133
#17 0x0000000000911a60 in std::_Function_handler<void (), magma::Magma::Impl::CompactKVStore(unsigned short, magma::Magma::StoreType, std::function<std::unique_ptr<magma::Magma::CompactionCallback, std::default_delete<magma::Magma::CompactionCallback> > (unsigned short)>)::{lambda()#2}>::_M_invoke(std::_Any_data const&) () at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/db.cc:740
#18 0x000000000091838e in magma::defer::~defer (this=0x7fb2c5febbb0, __in_chrg=<optimized out>) at /opt/gcc-10.2.0/include/c++/10.2.0/bits/std_function.h:248
#19 0x0000000000911f0e in magma::Magma::Impl::CompactKVStore(unsigned short, magma::Magma::StoreType, std::function<std::unique_ptr<magma::Magma::CompactionCallback, std::default_delete<magma::Magma::CompactionCallback> > (unsigned short)>) ()
    at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/db.cc:753
#20 0x0000000000911c9f in magma::Magma::Impl::CompactKVStore(unsigned short, magma::Magma::StoreType, std::function<std::unique_ptr<magma::Magma::CompactionCallback, std::default_delete<magma::Magma::CompactionCallback> > (unsigned short)>) ()
    at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/db.cc:721
#21 0x0000000000912095 in magma::Magma::Impl::CompactKVStore(unsigned short, magma::Slice const&, magma::Slice const&, std::function<std::unique_ptr<magma::Magma::CompactionCallback, std::default_delete<magma::Magma::CompactionCallback> > (unsigned short)>) () at /home/couchbase/jenkins/workspace/couchbase-server-unix/magma/magma/db.cc:773
#22 0x000000000091239d in magma::Magma::CompactKVStore(unsigned short, magma::Slice const&, magma::Slice const&, std::function<std::unique_ptr<magma::Magma::CompactionCallback, std::default_delete<magma::Magma::CompactionCallback> > (unsigned short)>)
    () at /opt/gcc-10.2.0/include/c++/10.2.0/bits/std_function.h:248
#23 0x000000000087766e in MagmaMemoryTrackingProxy::CompactKVStore(unsigned short, magma::Slice const&, magma::Slice const&, std::function<std::unique_ptr<magma::Magma::CompactionCallback, std::default_delete<magma::Magma::CompactionCallback> > (unsigned short)>) () at /opt/gcc-10.2.0/include/c++/10.2.0/bits/std_function.h:248
#24 0x000000000086e780 in MagmaKVStore::compactDBInternal(std::unique_lock<std::mutex>&, std::shared_ptr<CompactionContext>) () at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/include/memcached/vbucket.h:62
#25 0x000000000086ed06 in MagmaKVStore::compactDB(std::unique_lock<std::mutex>&, std::shared_ptr<CompactionContext>) () at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/kvstore/magma-kvstore/magma-kvstore.cc:2058
#26 0x0000000000806eca in EPBucket::compactInternal(LockedVBucketPtr&, CompactionConfig&) () at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/vbucket.h:2599
#27 0x00000000008089b1 in EPBucket::doCompact(Vbid, CompactionConfig&, std::vector<CookieIface const*, std::allocator<CookieIface const*> >&) () at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/ep_bucket.cc:1359
#28 0x0000000000714096 in CompactTask::run() () at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/tasks.cc:73
#29 0x0000000000a54332 in GlobalTask::execute() () at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/globaltask.cc:68
#30 0x0000000000a514d5 in FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}::operator()() const (__closure=0x7fb2c5fec840) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:189
#31 0x0000000000ba51e0 in operator() (this=0x7fb2c5fec840) at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/Function.h:416
#32 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) (this=0x7fb352299400, thread=...,
    task=<unknown type in /usr/lib/debug/opt/couchbase/bin/memcached.debug, CU 0x5f9cf74, DIE 0x5fe260e>)
    at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/executors/ThreadPoolExecutor.cpp:97
#33 0x0000000000b8cf9a in folly::CPUThreadPoolExecutor::threadRun (this=0x7fb352299400, thread=...)
    at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/executors/CPUThreadPoolExecutor.cpp:265
#34 0x0000000000ba8199 in __invoke_impl<void, void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&> (__t=<optimized out>,
    __f=<optimized out>) at /usr/local/include/c++/7.3.0/bits/invoke.h:73
#35 __invoke<void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&> (__fn=<optimized out>)
    at /usr/local/include/c++/7.3.0/bits/invoke.h:95
#36 __call<void, 0, 1> (__args=<optimized out>, this=<optimized out>) at /usr/local/include/c++/7.3.0/functional:467
#37 operator()<> (this=<optimized out>) at /usr/local/include/c++/7.3.0/functional:551
#38 folly::detail::function::FunctionTraits<void ()>::callBig<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)> >(folly::detail::function::Data&) (p=...) at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/Function.h:401
#39 0x0000000000a51164 in operator() (this=0x7fb3527dd180) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:47
#40 operator() (__closure=0x7fb3527dd180) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:47
#41 folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(folly::detail::function::Data&) (p=...)
    at /home/couchbase/jenkins/workspace/couchbase-server-unix/server_build/tlm/deps/folly.exploded/include/folly/Function.h:401
#42 0x00007fb3542a7d40 in execute_native_thread_routine () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/src/c++11/thread.cc:80
#43 0x00007fb3560afea5 in start_thread (arg=0x7fb2c5ffe700) at pthread_create.c:307
#44 0x00007fb3539f08dd in ioperm () at ../sysdeps/unix/syscall-template.S:81
#45 0x0000000000000000 in ?? ()
(gdb)
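The Breakpad frames in memcached.log are printed as [base+offset], while gdb prints absolute addresses. For the memcached binary in this report the two line up by simple addition: Breakpad frame #5, [0x400000+0x5663cf], is gdb frame #10 at 0x9663cf. A minimal sketch for cross-referencing the two listings is below. The names FRAME_RE and parse_frame are hypothetical helpers written for this illustration and are not part of any Couchbase tooling, and the regular expression assumes the log format shown in the grep output above.

```python
import re

# Matches Breakpad frame lines such as:
#   "... CRITICAL #5 /opt/couchbase/bin/memcached() [0x400000+0x5663cf]"
# Assumption: the format is exactly as it appears in the memcached.log
# excerpt in this report.
FRAME_RE = re.compile(
    r"CRITICAL\s+#(?P<index>\d+)\s+(?P<module>\S+?)\((?P<symbol>[^)]*)\)\s+"
    r"\[0x(?P<base>[0-9a-fA-F]+)\+0x(?P<offset>[0-9a-fA-F]+)\]"
)


def parse_frame(line):
    """Return a dict describing one Breakpad frame, or None for other lines."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    base = int(m.group("base"), 16)
    offset = int(m.group("offset"), 16)
    return {
        "index": int(m.group("index")),
        "module": m.group("module"),
        # Mangled symbol plus offset, or "" when Breakpad printed "()".
        "symbol": m.group("symbol"),
        # Absolute address, comparable with the addresses gdb prints.
        "address": base + offset,
    }


if __name__ == "__main__":
    sample = (
        "2021-10-10T01:49:05.460245-07:00 CRITICAL #5 "
        "/opt/couchbase/bin/memcached() [0x400000+0x5663cf]"
    )
    frame = parse_frame(sample)
    print(frame["index"], hex(frame["address"]))
```

Frames with an empty symbol in the log can then be looked up by address in the gdb backtrace to recover the function name and source line.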
cbcollect_info is attached. We are still experimenting with these tests, so we don't have an exact baseline for when they last passed.
Attachments
Issue Links
- duplicates MB-48707 [Magma] - Minidumps seen during graceful failover + rebalance out + CRUD on collections (Closed)