Details
- Bug
- Resolution: Cannot Reproduce
- Critical
- None
- 7.2.4
- Operating System: Debian GNU/Linux 10 (buster); Couchbase Enterprise Edition build 7.2.4-7059
- Untriaged
- Linux x86_64
- 0
- Unknown
Description
Steps to reproduce
- Created a 4-node KV cluster: 172.23.121.194, 172.23.121.203, 172.23.121.160, 172.23.121.199
- Created an ephemeral bucket named 'default' with replicas=3 and loaded some docs into it
- Disabled auto-failover
- Enabled auto-reprovision
- Induced a "restart_machine" failure on node 172.23.121.194
- A few minutes later, added node 172.23.121.198
- Started a rebalance (a sketch of these steps against the REST API follows this list)
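For reference, the steps above can also be driven manually against the cluster management REST API. The sketch below is only an approximation under stated assumptions: the credentials, RAM quota, and orchestrator address are placeholders rather than values from this run, the machine restart happens out of band, and this is not the TAF test's own implementation.

import requests

ADMIN = ("Administrator", "password")   # placeholder credentials, not from this run
ORCH = "http://172.23.121.199:8091"     # any cluster node can serve the REST API

# Create the ephemeral bucket 'default' with 3 replicas (RAM quota is a placeholder)
requests.post(f"{ORCH}/pools/default/buckets", auth=ADMIN, data={
    "name": "default", "bucketType": "ephemeral",
    "ramQuotaMB": 1024, "replicaNumber": 3,
}).raise_for_status()

# Disable auto-failover, enable auto-reprovision
requests.post(f"{ORCH}/settings/autoFailover", auth=ADMIN,
              data={"enabled": "false"}).raise_for_status()
requests.post(f"{ORCH}/settings/autoReprovision", auth=ADMIN,
              data={"enabled": "true", "maxNodes": 1}).raise_for_status()

# ... restart the machine hosting 172.23.121.194 out of band and wait a few minutes ...

# Add the new node, then rebalance over all known nodes
requests.post(f"{ORCH}/controller/addNode", auth=ADMIN, data={
    "hostname": "172.23.121.198", "user": ADMIN[0],
    "password": ADMIN[1], "services": "kv",
}).raise_for_status()
known = ",".join(n["otpNode"] for n in
                 requests.get(f"{ORCH}/pools/default", auth=ADMIN).json()["nodes"])
requests.post(f"{ORCH}/controller/rebalance", auth=ADMIN,
              data={"knownNodes": known, "ejectedNodes": ""}).raise_for_status()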
Rebalance fails with:
2023-12-10T22:09:08.502-08:00, ns_orchestrator:0:critical:message(ns_1@172.23.121.199) - Rebalance exited with reason {pre_rebalance_janitor_run_failed,"default", {error,wait_for_memcached_failed, ['ns_1@172.23.121.194']}}. Rebalance Operation Id = dab11fbacca77867586d2306c382740f
The issue is not very consistent, so we cannot say whether this is a regression.
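The wait_for_memcached_failed part of the error indicates that the pre-rebalance janitor run could not confirm the data service on ns_1@172.23.121.194 was serving the 'default' bucket again after the restart. When retrying by hand, one way to rule out a plain warmup/readiness race is to poll the bucket's per-node status before starting the rebalance. This is only a sketch, reusing the placeholder credentials from the snippet above; it is not part of the original test.

import time
import requests

ADMIN = ("Administrator", "password")   # placeholder credentials
NODE = "http://172.23.121.199:8091"

def wait_for_bucket_ready(bucket="default", timeout_s=300):
    # Poll the bucket details until every node reports status "healthy"
    statuses = {}
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        info = requests.get(f"{NODE}/pools/default/buckets/{bucket}", auth=ADMIN).json()
        statuses = {n["hostname"]: n.get("status") for n in info.get("nodes", [])}
        if statuses and all(s == "healthy" for s in statuses.values()):
            return statuses
        time.sleep(5)
    raise TimeoutError(f"bucket '{bucket}' not healthy on all nodes: {statuses}")

wait_for_bucket_ready()   # call this before POSTing /controller/rebalance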
TAF script to reproduce
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /data/workspace/debian-p0-collections-vset00-00-auto_reprovision_7.0_P1/testexec.8970.ini GROUP=auto_reprovision,rerun=False,get-cbcollect-info=True,log_level=info,upgrade_version=7.2.4-7059,sirius_url=http://172.23.120.103:4000 -t failover.AutoFailoverTests.AutoFailoverTests.test_rebalance_after_autofailover,timeout=5,num_node_failures=1,nodes_in=1,nodes_out=0,auto_reprovision=True,failover_action=restart_machine,nodes_init=4,can_abort_rebalance=False,bucket_spec=single_bucket.buckets_all_ephemeral_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,GROUP=auto_reprovision'
Job name: debian-collections-auto_reprovision_7.0_P1