Details
-
Bug
-
Resolution: Fixed
-
Critical
-
Cheshire-Cat
-
Centos 7; Couchbase Enterprise Build 7.0.0-2351
-
Triaged
-
Centos 64-bit
-
1
-
No
-
KV Sprint 2020-July, KV Sprint 2020-Oct
Description
Script to repro:
./testrunner -i /tmp/testexec.28032.ini GROUP=multi_node_auto_failover,rerun=False -t failover.MultiNodeAutoFailoverTests.MultiNodeAutoFailoverTests.test_autofailover_during_rebalance,get-cbcollect-info=True,replicas=Bucket.ReplicaNum.TWO,log_level=error,nodes_in=0,failover_action=stop_server,maxCount=2,timeout=5,rerun=False,nodes_init=5,GROUP=multi_node_auto_failover,nodes_out=1,num_node_failures=2,bucket_spec=single_bucket.def_scope_fifty_collections,override_spec_params=replicas,infra_log_level=critical
I see a lot of core dumps on the following nodes:
2020-06-14 06:17:31,540 | test | ERROR | MainThread | [basetestcase:check_coredump_exist:612] Node 172.23.123.163 - Core dump seen: 1014
2020-06-14 06:17:32,543 | test | ERROR | MainThread | [basetestcase:check_coredump_exist:612] Node 172.23.105.75 - Core dump seen: 40
2020-06-14 06:17:33,676 | test | ERROR | MainThread | [basetestcase:check_coredump_exist:612] Node 172.23.105.79 - Core dump seen: 2
2020-06-14 06:17:34,244 | test | ERROR | MainThread | [basetestcase:check_coredump_exist:612] Node 172.23.121.94 - Core dump seen: 1016
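For triage, the per-node counts above can be pulled out of the test log with a few lines of Python. This is only a sketch: the regular expression and the helper name `core_dumps_by_node` are my own assumptions and are not part of testrunner.

```python
import re

# The four check_coredump_exist lines quoted in this report.
LOG_LINES = [
    "2020-06-14 06:17:31,540 | test | ERROR | MainThread | [basetestcase:check_coredump_exist:612] Node 172.23.123.163 - Core dump seen: 1014",
    "2020-06-14 06:17:32,543 | test | ERROR | MainThread | [basetestcase:check_coredump_exist:612] Node 172.23.105.75 - Core dump seen: 40",
    "2020-06-14 06:17:33,676 | test | ERROR | MainThread | [basetestcase:check_coredump_exist:612] Node 172.23.105.79 - Core dump seen: 2",
    "2020-06-14 06:17:34,244 | test | ERROR | MainThread | [basetestcase:check_coredump_exist:612] Node 172.23.121.94 - Core dump seen: 1016",
]

# Assumed pattern, derived only from the lines above.
PATTERN = re.compile(r"Node (\S+) - Core dump seen: (\d+)")


def core_dumps_by_node(lines):
    """Return {node_ip: core_dump_count} for every matching log line."""
    counts = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


if __name__ == "__main__":
    counts = core_dumps_by_node(LOG_LINES)
    for node, count in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(f"{node}: {count}")
    print("total:", sum(counts.values()))
```

Two of the four nodes (.163 and .94) account for almost all of the dumps.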
From .94, an example from memcached.log:
2020-06-14T06:16:55.128094-07:00 CRITICAL Caught unhandled std::exception-derived exception. what(): ThrowExceptionUnderflowPolicy current:0 arg:-91
2020-06-14T06:16:55.135448-07:00 CRITICAL *** Fatal error encountered during exception handling ***
2020-06-14T06:16:55.292697-07:00 CRITICAL Breakpad caught a crash (Couchbase version 7.0.0-2351). Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/5e845848-e183-447d-64450dbe-4e37ddcb.dmp before terminating.
2020-06-14T06:16:55.292711-07:00 CRITICAL Stack backtrace of crashed thread:
2020-06-14T06:16:55.292900-07:00 CRITICAL /opt/couchbase/bin/memcached() [0x400000+0x13b82d]
2020-06-14T06:16:55.292911-07:00 CRITICAL /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler12GenerateDumpEPNS0_12CrashContextE+0x3ea) [0x400000+0x150b3a]
2020-06-14T06:16:55.292920-07:00 CRITICAL /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler13SignalHandlerEiP9siginfo_tPv+0xb8) [0x400000+0x150e78]
2020-06-14T06:16:55.292927-07:00 CRITICAL /lib64/libpthread.so.0() [0x7fc922748000+0xf5f0]
2020-06-14T06:16:55.293345-07:00 CRITICAL /lib64/libc.so.6(gsignal+0x37) [0x7fc92237a000+0x36337]
2020-06-14T06:16:55.293515-07:00 CRITICAL /lib64/libc.so.6(abort+0x148) [0x7fc92237a000+0x37a28]
2020-06-14T06:16:55.293555-07:00 CRITICAL /opt/couchbase/bin/../lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x125) [0x7fc922e7d000+0x91195]
2020-06-14T06:16:55.295327-07:00 CRITICAL /opt/couchbase/bin/memcached() [0x400000+0x14c032]
2020-06-14T06:16:55.295378-07:00 CRITICAL /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fc922e7d000+0x8ef86]
2020-06-14T06:16:55.295394-07:00 CRITICAL /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fc922e7d000+0x8efd1]
2020-06-14T06:16:55.295408-07:00 CRITICAL /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fc922e7d000+0x8f213]
2020-06-14T06:16:55.295430-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0x58344]
2020-06-14T06:16:55.295439-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0x22c06e]
2020-06-14T06:16:55.295455-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0x1f3952]
2020-06-14T06:16:55.295462-07:00 CRITICAL /opt/couchbase/bin/../lib/libcouchstore.so() [0x7fc926364000+0x11a5b]
2020-06-14T06:16:55.295467-07:00 CRITICAL /opt/couchbase/bin/../lib/libcouchstore.so() [0x7fc926364000+0x11431]
2020-06-14T06:16:55.295471-07:00 CRITICAL /opt/couchbase/bin/../lib/libcouchstore.so() [0x7fc926364000+0x11431]
2020-06-14T06:16:55.295474-07:00 CRITICAL /opt/couchbase/bin/../lib/libcouchstore.so() [0x7fc926364000+0x123c9]
2020-06-14T06:16:55.295492-07:00 CRITICAL /opt/couchbase/bin/../lib/libcouchstore.so(couchstore_save_documents_and_callback+0x850) [0x7fc926364000+0x26490]
2020-06-14T06:16:55.295500-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0x1ffa78]
2020-06-14T06:16:55.295506-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0x2009aa]
2020-06-14T06:16:55.295511-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0x201045]
2020-06-14T06:16:55.295518-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0xeaa22]
2020-06-14T06:16:55.295531-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0xee559]
2020-06-14T06:16:55.295538-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0x1434cc]
2020-06-14T06:16:55.295543-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0x144699]
2020-06-14T06:16:55.295547-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0x147643]
2020-06-14T06:16:55.295551-07:00 CRITICAL /opt/couchbase/bin/../lib/libep.so() [0x7fc9265c7000+0x13de5f]
2020-06-14T06:16:55.295558-07:00 CRITICAL /opt/couchbase/bin/../lib/libplatform_so.so.0.1.0() [0x7fc925102000+0x10777]
2020-06-14T06:16:55.295566-07:00 CRITICAL /lib64/libpthread.so.0() [0x7fc922748000+0x7e65]
2020-06-14T06:16:55.295598-07:00 CRITICAL /lib64/libc.so.6(clone+0x6d) [0x7fc92237a000+0xfe88d]
bt full from one of the coredumps on .94 is attached: bt_full.txt
Unsure whether this is a duplicate of https://issues.couchbase.com/browse/MB-39864 or of other MBs tracking underflow issues. From frame #8 of the backtrace we can see:
#8 0x00007f1253e5c06e in fetch_add (arg=-91, this=<optimized out>) at /home/couchbase/jenkins/workspace/couchbase-server-unix/platform/include/platform/non_negative_counter.h:125
        current = <optimized out>
        desired = 18446744073709551525
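The `desired` value in that frame is what an unsigned 64-bit add of 0 and -91 wraps to, i.e. 2^64 - 91, which matches the `current:0 arg:-91` in the exception message. A minimal Python sketch of that arithmetic follows. The class and function names are hypothetical stand-ins for illustration; this is a model of a throw-on-underflow counter, not the actual code in non_negative_counter.h.

```python
U64_MODULUS = 2 ** 64  # unsigned 64-bit arithmetic wraps modulo 2**64


class UnderflowError(Exception):
    """Hypothetical stand-in for the exception raised on underflow."""


def wrapped_u64(current, arg):
    """Value an unsigned 64-bit counter would hold after adding arg."""
    return (current + arg) % U64_MODULUS


class NonNegativeCounter:
    """Illustrative model of a counter that refuses to go below zero."""

    def __init__(self, value=0):
        self.value = value

    def fetch_add(self, arg):
        current = self.value
        # A negative arg larger in magnitude than the current value would
        # wrap around to a huge positive number, so raise instead.
        if arg < 0 and -arg > current:
            raise UnderflowError(
                f"ThrowExceptionUnderflowPolicy current:{current} arg:{arg}"
            )
        self.value = wrapped_u64(current, arg)
        return current  # fetch_add returns the previous value


if __name__ == "__main__":
    print(wrapped_u64(0, -91))  # 18446744073709551525, as in frame #8
    counter = NonNegativeCounter(0)
    try:
        counter.fetch_add(-91)
    except UnderflowError as exc:
        print(exc)
```

In other words, the crash means something tried to subtract 91 from a counter that was already 0.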
I have attached the cbcollect_info logs.
Attachments
Issue Links
- is duplicated by
-
MB-42829 [Collections] Flush::saveCollectionStats items:1 + -1, highSeq:1882, diskSize:91 + -92
- Closed
-
MB-42893 [System Test] : Memcached crashed continously on multiple KV nodes - Flush::saveCollectionStats caught exception ThrowExceptionUnderflowPolicy current:0 arg:-1
- Closed
-
MB-39950 [Jepsen] Underflow of diskSize - total size (bytes) of items in this collection on disk
- Closed
-
MB-40041 [Jepsen] Crash in memcached due to "std::exception-derived exception. what(): ThrowExceptionUnderflowPolicy current:0 arg:-1"
- Closed
-
MB-40928 [Jepsen] Linearizability failure during disk-failure tests
- Closed
-
MB-40929 [Jepsen] Crash due to "ThrowExceptionUnderflowPolicy current:0 arg:-1"
- Closed