Details
- Bug
- Resolution: Duplicate
- Blocker
- Cheshire-Cat
- 6.6.2-9588 -> 7.0.0-5141
- Untriaged
- Centos 64-bit
- 1
- Yes
Description
Steps to Reproduce
1. Run the 6.6.2 longevity test for 3 days:
./sequoia -client 172.23.96.162:2375 -provider file:centos_third_cluster.yml -test tests/integration/test_allFeatures_madhatter_durability.yml -scope tests/integration/scope_Xattrs_Madhatter.yml -scale 3 -repeat 0 -log_level 0 -version 6.6.2-9588 -skip_setup=false -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true
2. The cluster had 27 nodes at the end of the test.
3. Added six 7.0.0 nodes (172.23.105.102, 172.23.105.62, 172.23.106.232, 172.23.106.239, 172.23.106.37, 172.23.106.246) and removed six 6.6.2 nodes (172.23.110.75, 172.23.110.76, 172.23.105.61, 172.23.106.191, 172.23.106.209, 172.23.106.70) to do a swap rebalance of all the services (one of each kind).
4. Failed over 6 nodes, all on 6.6.2, one of which is 172.23.105.29 (eventing). Stopped Couchbase Server on them, upgraded to 7.0.0-5141, performed a recovery, and ran a rebalance. It failed as shown below.
Rebalance exited with reason {service_rebalance_failed,eventing,
{agent_died,<31275.26925.17>,
{lost_connection,
{'ns_1@172.23.105.29',shutdown}}}}.
Rebalance Operation Id = d5decdeac0700bd3bd8609dafc785a5c
Repeatedly retried the rebalance; it kept failing with the same error.
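When retrying manually like this, it helps to confirm no rebalance is still in flight before re-triggering one. A minimal sketch, assuming ns_server's standard /pools/default/rebalanceProgress REST endpoint (which reports status "none" when idle and "running" during a rebalance); the host, credentials, and retry cap below are placeholders, not taken from this cluster:

```python
import json

# Hedged sketch: decide whether it is safe to retry the rebalance.
# The payload shape ("status": "none" when idle, "running" while a
# rebalance is in flight) matches the /pools/default/rebalanceProgress
# endpoint; the retry cap is an assumption for illustration.

def should_retry(progress_json: str, attempts: int, max_attempts: int = 5) -> bool:
    """Retry only while the cluster is idle and attempts remain."""
    status = json.loads(progress_json).get("status")
    return status == "none" and attempts < max_attempts

# In a real run the payload would come from something like:
#   curl -u Administrator:<password> http://172.23.105.102:8091/pools/default/rebalanceProgress
print(should_retry('{"status": "none"}', attempts=3))     # True
print(should_retry('{"status": "running"}', attempts=3))  # False
```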
See https://issues.couchbase.com/browse/MB-46198?focusedCommentId=500912&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-500912 for more details.
Attaching cbcollect_info. This test last passed when we upgraded from 6.6.2-9588 to 7.0.0-5033. Looking for a workaround to get us out of this situation; it took a good 4 days to reach this stage of the upgrade.
172.23.105.102 : rebalance
[user:error,2021-05-11T07:59:12.657-07:00,ns_1@172.23.105.102:<0.31265.5>:ns_orchestrator:log_rebalance_completion:1405]Rebalance exited with reason {service_rebalance_failed,eventing,
[user:error,2021-05-11T08:36:07.522-07:00,ns_1@172.23.105.102:<0.31265.5>:ns_orchestrator:log_rebalance_completion:1405]Rebalance exited with reason {service_rebalance_failed,eventing,
[user:error,2021-05-11T09:00:44.444-07:00,ns_1@172.23.105.102:<0.31265.5>:ns_orchestrator:log_rebalance_completion:1405]Rebalance exited with reason {service_rebalance_failed,eventing,
172.23.105.29 : crash
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T07:46:53.936-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T07:53:03.145-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T07:59:12.653-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T08:05:21.990-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T08:11:31.097-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T08:17:40.313-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T08:23:49.122-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T08:29:58.156-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T08:36:07.517-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T08:42:16.661-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T08:48:25.870-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T08:54:35.387-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T09:00:44.442-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T09:06:54.248-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T09:13:03.517-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T09:19:12.815-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
/opt/couchbase/var/lib/couchbase/logs/info.log:[user:info,2021-05-11T09:25:22.441-07:00,ns_1@172.23.105.29:<0.1059.0>:ns_log:crash_consumption_loop:63]Service 'eventing' exited with status 1. Restarting. Messages:
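The crash-loop entries above recur at a fixed cadence. A quick way to confirm that when eyeballing a long info.log is to diff the timestamps; a minimal sketch, using the first few timestamps copied verbatim from the log above:

```python
from datetime import datetime

# Timestamps taken verbatim from the crash-loop entries above.
timestamps = [
    "2021-05-11T07:46:53.936-07:00",
    "2021-05-11T07:53:03.145-07:00",
    "2021-05-11T07:59:12.653-07:00",
    "2021-05-11T08:05:21.990-07:00",
]

times = [datetime.fromisoformat(t) for t in timestamps]
deltas = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
print([round(d) for d in deltas])  # gaps of roughly 369-370 seconds each
```

The consistent ~6-minute spacing points to a regular crash/restart cycle of the eventing service rather than sporadic failures.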