Details
- Bug
- Resolution: Fixed
- Critical
- 7.6.2
- 7.6.2-3570
- Untriaged
- Linux x86_64
- 0
- Unknown
- March-June 24
Description
Script to Repro
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /data/workspace/debian-p0-collections-vset00-00-auto_reprovision_7.0_P1/testexec.27972.ini -p GROUP=auto_reprovision,rerun=False,upgrade_version=7.6.2-3570,sirius_url=http://172.23.120.103:4000 -t failover.AutoFailoverTests.AutoFailoverTests.test_autofailover_during_rebalance,timeout=1,num_node_failures=1,nodes_in=1,nodes_out=0,auto_reprovision=True,failover_action=restart_server,nodes_init=4,bucket_spec=single_bucket.buckets_all_ephemeral_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,GROUP=auto_reprovision'
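For readability, the `-t` test specification in the command above can be split into the test path and its key=value parameters. The following is only a minimal sketch: `split_test_spec` is a hypothetical helper written for illustration and is not part of testrunner.

```python
def split_test_spec(spec):
    """Split 'module.Class.test,key=value,...' into (test_path, params).

    Hypothetical helper for reading the repro command; not a testrunner API.
    """
    test_path, *pairs = spec.split(",")
    params = {}
    for pair in pairs:
        # partition() tolerates values that themselves contain '='.
        key, _, value = pair.partition("=")
        params[key] = value
    return test_path, params


# The -t argument copied from the repro command above.
spec = (
    "failover.AutoFailoverTests.AutoFailoverTests.test_autofailover_during_rebalance,"
    "timeout=1,num_node_failures=1,nodes_in=1,nodes_out=0,auto_reprovision=True,"
    "failover_action=restart_server,nodes_init=4,"
    "bucket_spec=single_bucket.buckets_all_ephemeral_for_rebalance_tests_more_collections,"
    "data_load_spec=volume_test_load_with_CRUD_on_collections,GROUP=auto_reprovision"
)

test_path, params = split_test_spec(spec)
print(test_path)
for key, value in params.items():
    print(f"  {key} = {value}")
```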
Steps to Repro
1. Set up a 4-node cluster.
2024-04-23 14:09:32,375 | test | INFO | MainThread | [table_view:display:72] Cluster statistics
+----------------+---------+----------+--------+-----------+----------+-------------------------+-------------------+---------------------------------+
| Nodes          | Zone    | Services | CPU    | Mem_total | Mem_free | Swap_mem_used           | Active / Replica  | Version / Config                |
+----------------+---------+----------+--------+-----------+----------+-------------------------+-------------------+---------------------------------+
| 172.23.216.115 | Group 1 | kv       | 20.300 | 3.81 GiB  | 2.96 GiB | 512.00 KiB / 980.00 MiB | 0 / 0             | 7.6.2-3570-enterprise / default |
| 172.23.217.92  | Group 1 | kv       | 22.218 | 3.80 GiB  | 2.76 GiB | 43.25 MiB / 976.00 MiB  | 0 / 0             | 7.6.2-3570-enterprise / default |
| 172.23.216.183 | Group 1 | kv       | 17.825 | 3.81 GiB  | 2.85 GiB | 512.00 KiB / 976.00 MiB | 0 / 0             | 7.6.2-3570-enterprise / default |
| 172.23.216.76  | Group 1 | kv       | 18.674 | 3.81 GiB  | 2.89 GiB | 0.0 Byte / 976.00 MiB   | 0 / 0             | 7.6.2-3570-enterprise / default |
+----------------+---------+----------+--------+-----------+----------+-------------------------+-------------------+---------------------------------+
2. Create an ephemeral bucket with scopes and collections, and load data.
2024-04-23 14:11:14,052 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
+---------+----------------+----------+------+----------+------------+-----+--------+-----------------------+-----------+-----+
| Bucket  | Type / Storage | Replicas | Rank | Vbuckets | Durability | TTL | Items  | RAM Quota / Used      | Disk Used | ARR |
+---------+----------------+----------+------+----------+------------+-----+--------+-----------------------+-----------+-----+
| default | ephemeral / -  | 3        | 0    | -        | none       | 0   | 500000 | 7.81 GiB / 772.92 MiB | 0.0 Byte  | -   |
+---------+----------------+----------+------+----------+------------+-----+--------+-----------------------+-----------+-----+
3. Induce restart_server on node 172.23.216.115
2024-04-23 14:11:44,674 | test | INFO | MainThread | [AutoFailoverTests:test_autofailover_during_rebalance:178] Inducing failure restart_server on nodes [ip:172.23.216.115 port:18091 ssh_username:root]
4. Add a node (172.23.218.121) and rebalance.
2024-04-23 14:11:52,444 | test | INFO | pool-7-thread-2 | [table_view:display:72] Rebalance Overview
+----------------+---------+----------+---------------------------------+---------------+--------------+-----------------------+
| Nodes          | Zone    | Services | Version / Config                | CPU           | Status       | Membership / Recovery |
+----------------+---------+----------+---------------------------------+---------------+--------------+-----------------------+
| 172.23.216.115 | Group 1 | kv       | 7.6.2-3570-enterprise / default | 16.875        | Cluster node | active / none         |
| 172.23.217.92  | Group 1 | kv       | 7.6.2-3570-enterprise / default | 32.6749999996 | Cluster node | active / none         |
| 172.23.216.183 | Group 1 | kv       | 7.6.2-3570-enterprise / default | 16.5249999985 | Cluster node | active / none         |
| 172.23.216.76  | Group 1 | kv       | 7.6.2-3570-enterprise / default | 24.2500000005 | Cluster node | active / none         |
| 172.23.218.121 | None    | kv       |                                 |               | <--- IN ---  |                       |
+----------------+---------+----------+---------------------------------+---------------+--------------+-----------------------+
grep CRITICAL on 172.23.216.115
cbcollect_info_ns_1@172.23.216.115_20240423-211916/78bc26f4-9921-4179-6ddaa497-3dae607f.dmp
2024-04-23T13:22:33.202102-07:00 CRITICAL *** Fatal error encountered during exception handling ***
2024-04-23T13:22:33.202135-07:00 CRITICAL Caught unhandled std::exception-derived exception. what(): GSL: Precondition failure: 'runtimeSecs != 0.0' at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/dcp/active_stream.cc:2200
2024-04-23T13:22:33.481210-07:00 CRITICAL Detected previous crash
2024-04-23T13:22:33.481244-07:00 CRITICAL Breakpad caught a crash (Couchbase version 7.6.2-3570). Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/78bc26f4-9921-4179-6ddaa497-3dae607f.dmp before terminating. Writing dump succeeded: yes
2024-04-23T13:22:33.481247-07:00 CRITICAL Stack backtrace of crashed thread:
2024-04-23T13:22:33.481247-07:00 CRITICAL #0 /opt/couchbase/bin/memcached() [0x400000+0x836081]
2024-04-23T13:22:33.481247-07:00 CRITICAL #1 /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler12GenerateDumpEPNS0_12CrashContextE+0x385) [0x400000+0x854105]
2024-04-23T13:22:33.481248-07:00 CRITICAL #2 /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler13SignalHandlerEiP9siginfo_tPv+0x9f) [0x400000+0x85444f]
2024-04-23T13:22:33.481250-07:00 CRITICAL #3 /lib/x86_64-linux-gnu/libpthread.so.0() [0x7f3f38534000+0x13140]
2024-04-23T13:22:33.481250-07:00 CRITICAL #4 /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x141) [0x7f3f37c98000+0x38ce1]
2024-04-23T13:22:33.481251-07:00 CRITICAL #5 /lib/x86_64-linux-gnu/libc.so.6(abort+0x123) [0x7f3f37c98000+0x22537]
2024-04-23T13:22:33.481251-07:00 CRITICAL #6 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f3f37fd7000+0xa89ab]
2024-04-23T13:22:33.481252-07:00 CRITICAL #7 /opt/couchbase/bin/memcached() [0x400000+0x846226]
2024-04-23T13:22:33.481252-07:00 CRITICAL #8 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f3f37fd7000+0xb82fa]
2024-04-23T13:22:33.481255-07:00 CRITICAL #9 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f3f37fd7000+0xb8365]
2024-04-23T13:22:33.481255-07:00 CRITICAL #10 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f3f37fd7000+0xb85b7]
2024-04-23T13:22:33.481255-07:00 CRITICAL #11 /opt/couchbase/bin/memcached() [0x400000+0x92d285]
2024-04-23T13:22:33.481255-07:00 CRITICAL #12 /opt/couchbase/bin/memcached() [0x400000+0x61f72]
2024-04-23T13:22:33.481256-07:00 CRITICAL #13 /opt/couchbase/bin/memcached() [0x400000+0xe40ca]
2024-04-23T13:22:33.481257-07:00 CRITICAL #14 /opt/couchbase/bin/memcached() [0x400000+0x492d8b]
2024-04-23T13:22:33.481258-07:00 CRITICAL #15 /opt/couchbase/bin/memcached() [0x400000+0x493cae]
2024-04-23T13:22:33.481258-07:00 CRITICAL #16 /opt/couchbase/bin/memcached() [0x400000+0x490bb6]
2024-04-23T13:22:33.481260-07:00 CRITICAL #17 /opt/couchbase/bin/memcached() [0x400000+0x5c59e6]
2024-04-23T13:22:33.481260-07:00 CRITICAL #18 /opt/couchbase/bin/memcached() [0x400000+0x5c5c22]
2024-04-23T13:22:33.481261-07:00 CRITICAL #19 /opt/couchbase/bin/memcached() [0x400000+0x78d76f]
2024-04-23T13:22:33.481261-07:00 CRITICAL #20 /opt/couchbase/bin/memcached() [0x400000+0x4f8a32]
2024-04-23T13:22:33.481262-07:00 CRITICAL #21 /opt/couchbase/bin/memcached() [0x400000+0x78b50a]
2024-04-23T13:22:33.481263-07:00 CRITICAL #22 /opt/couchbase/bin/memcached() [0x400000+0x78e8c3]
2024-04-23T13:22:33.481263-07:00 CRITICAL #23 /opt/couchbase/bin/memcached() [0x400000+0x92927f]
2024-04-23T13:22:33.481264-07:00 CRITICAL #24 /opt/couchbase/bin/memcached() [0x400000+0x911da7]
2024-04-23T13:22:33.481265-07:00 CRITICAL #25 /opt/couchbase/bin/memcached() [0x400000+0x92be1a]
2024-04-23T13:22:33.481266-07:00 CRITICAL #26 /opt/couchbase/bin/memcached() [0x400000+0x787850]
2024-04-23T13:22:33.481266-07:00 CRITICAL #27 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f3f37fd7000+0xe4aa3]
2024-04-23T13:22:33.481267-07:00 CRITICAL #28 /lib/x86_64-linux-gnu/libpthread.so.0() [0x7f3f38534000+0x7ea7]
2024-04-23T13:22:33.481268-07:00 CRITICAL #29 /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f3f37c98000+0xfba2f]
cbcollect_info attached. Had some trouble getting backtraces on the new boxes; working on it.
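Until a symbolized backtrace is available, the raw frames in the CRITICAL lines above can at least be reduced to module, symbol, and base/offset pairs for later symbolization. The following is a minimal sketch, assuming the frame layout is exactly as it appears in this log excerpt; `FRAME_RE` and `parse_frames` are hypothetical names introduced here, not part of any Couchbase tooling.

```python
import re

# Assumed frame layout, taken from the log excerpt in this ticket:
#   <timestamp> CRITICAL #<n> <module>(<symbol+off or empty>) [0x<base>+0x<offset>]
FRAME_RE = re.compile(
    r"CRITICAL\s+#(?P<index>\d+)\s+(?P<module>[^\s(]+)\((?P<symbol>[^)]*)\)\s+"
    r"\[0x(?P<base>[0-9a-fA-F]+)\+0x(?P<offset>[0-9a-fA-F]+)\]"
)


def parse_frames(log_text):
    """Return one dict per backtrace frame found in log_text (hypothetical helper)."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # Not a frame line (e.g. the 'Detected previous crash' line).
        frames.append({
            "index": int(match.group("index")),
            "module": match.group("module"),
            "symbol": match.group("symbol") or None,  # Empty parentheses mean no symbol.
            "base": int(match.group("base"), 16),
            "offset": int(match.group("offset"), 16),
        })
    return frames


# Three frames copied verbatim from the backtrace above.
sample = """\
2024-04-23T13:22:33.481247-07:00 CRITICAL Stack backtrace of crashed thread:
2024-04-23T13:22:33.481247-07:00 CRITICAL #0 /opt/couchbase/bin/memcached() [0x400000+0x836081]
2024-04-23T13:22:33.481250-07:00 CRITICAL #4 /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x141) [0x7f3f37c98000+0x38ce1]
2024-04-23T13:22:33.481251-07:00 CRITICAL #6 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f3f37fd7000+0xa89ab]
"""

frames = parse_frames(sample)
for frame in frames:
    print(f"#{frame['index']:<2} {frame['module']} +{frame['offset']:#x} {frame['symbol'] or ''}")
```

The per-module offsets produced this way could then be handed to a symbolizer together with matching debug symbols for build 7.6.2-3570, which this ticket does not include.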
Issue Links
- relates to
  - MB-54485 Unit test failure in `AllBucketTypes/DurabilityActiveStreamTest.BackfillEmptySnapshotAfterCursorDroppingNoSyncWriteSupport_Delete_Majority/ephemeral_fail_new_data` (Closed)
  - MB-55938 Unit test failure in `AllBucketTypes/DurabilityActiveStreamTest.BackfillEmptySnapshotAfterCursorDroppingNoSyncWriteSupport_Alive_Majority/ephemeral_fail_new_data` (Closed)