Details
Bug
Resolution: Unresolved
Major
None
Morpheus
Enterprise Edition 8.0.0 build 1763
Untriaged
0
Unknown
Description
Steps to repro
1. Create a cluster with 4 data nodes:
172.23.136.111 172.23.136.112 172.23.136.113 172.23.136.114
2. Create an ephemeral bucket named default and load some data.
3. Enable auto-failover (AFO):
Enabled auto-failover with timeout 1 and max count 1 and allowFailoverEphemeralNoReplica = true
4. Restart node 172.23.136.112 to trigger AFO:
Failover completed successfully.
Rebalance Operation Id = b5ee6dce9946a8e7e0ff82e1aaa483eb
5. Trigger a rebalance in which .114 is rebalanced out and .112 (failed over) is ejected.
Logs from .111:
[ns_server:info,2024-08-07T10:44:43.927-07:00,ns_1@172.23.136.111:<0.12099.0>:ns_orchestrator:idle:970]Starting rebalance, KeepNodes = ['ns_1@172.23.136.111','ns_1@172.23.136.113'], EjectNodes = ['ns_1@172.23.136.114'], Failed over and being ejected nodes = ['ns_1@172.23.136.112']; no delta recovery nodes; Operation Id = 9ef7e798834079a3e18560db8b3e152b
[user:info,2024-08-07T10:44:43.928-07:00,ns_1@172.23.136.111:<0.12099.0>:ns_orchestrator:idle:973]Starting rebalance, KeepNodes = ['ns_1@172.23.136.111','ns_1@172.23.136.113'], EjectNodes = ['ns_1@172.23.136.114'], Failed over and being ejected nodes = ['ns_1@172.23.136.112']; no delta recovery nodes; Operation Id = 9ef7e798834079a3e18560db8b3e152b
[rebalance:info,2024-08-07T10:44:43.929-07:00,ns_1@172.23.136.111:<0.35271.0>:ns_rebalancer:drop_old_2i_indexes:1395]Going to drop possible old 2i indexes on nodes []
6. The rebalance exits with an error:
[user:error,2024-08-07T10:45:01.697-07:00,ns_1@172.23.136.111:<0.12099.0>:ns_orchestrator:log_rebalance_completion:1704]Rebalance exited with reason {{badmatch,
{leader_activities_error,
{default,rebalance},
{quorum_lost,
{lease_lost,'ns_1@172.23.136.114'}}}},
[{ns_rebalancer,rebalance,2,
[{file,
"/home/couchbase/jenkins/workspace/couchbase-server-unix/ns_server/apps/ns_server/src/ns_rebalancer.erl"},
{line,496}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,241}]}]}.
Rebalance Operation Id = 9ef7e798834079a3e18560db8b3e152b
[ns_server:warn,2024-08-07T10:45:01.697-07:00,ns_1@172.23.136.111:users_replicator<0.8698.0>:doc_replicator:loop:110
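The auto-failover settings from step 3 can be written down as a form-encoded request body. This is a minimal sketch, not what the test framework actually sends: only the values and the name allowFailoverEphemeralNoReplica come from this ticket, while the other field names (enabled, timeout, maxCount) and the helper name build_afo_settings are assumptions made for illustration. No request is sent.

```python
from urllib.parse import urlencode


def build_afo_settings(timeout_s=1, max_count=1, allow_ephemeral_no_replica=True):
    """Build a form-encoded body for the AFO settings used in step 3.

    Field names other than allowFailoverEphemeralNoReplica are assumed,
    not taken from the ticket.
    """
    fields = {
        "enabled": "true",
        "timeout": str(timeout_s),
        "maxCount": str(max_count),
        "allowFailoverEphemeralNoReplica": "true" if allow_ephemeral_no_replica else "false",
    }
    # dicts keep insertion order, so the body is deterministic
    return urlencode(fields)


body = build_afo_settings()
print(body)
```

Keeping the body construction separate from any HTTP call makes it easy to diff the settings used on the 8.0.0 and 7.6.2 runs.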
------------------------------------------
The same test passes on 7.6.2.
Logs from the working version, from .111:
[ns_server:info,2024-08-07T11:22:16.586-07:00,ns_1@172.23.136.111:<0.7066.0>:ns_orchestrator:idle:927]Starting rebalance, KeepNodes = ['ns_1@172.23.136.111','ns_1@172.23.136.113'], EjectNodes = ['ns_1@172.23.136.114'], Failed over and being ejected nodes = ['ns_1@172.23.136.112']; no delta recovery nodes; Operation Id = 85b1d115d1a0339aabca08835ceeb212
[user:info,2024-08-07T11:22:16.587-07:00,ns_1@172.23.136.111:<0.7066.0>:ns_orchestrator:idle:930]Starting rebalance, KeepNodes = ['ns_1@172.23.136.111','ns_1@172.23.136.113'], EjectNodes = ['ns_1@172.23.136.114'], Failed over and being ejected nodes = ['ns_1@172.23.136.112']; no delta recovery nodes; Operation Id = 85b1d115d1a0339aabca08835ceeb212
[user:info,2024-08-07T11:22:35.305-07:00,ns_1@172.23.136.111:<0.7066.0>:ns_orchestrator:log_rebalance_completion:1661]Rebalance completed successfully.
Rebalance Operation Id = 85b1d115d1a0339aabca08835ceeb212
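The two runs can be compared by pulling the rebalance completion entry out of each log. The sketch below assumes the header layout seen in the excerpts in this ticket ([category:level,timestamp,node:process:module:function:line]message); the regex and the names parse_log_line and rebalance_outcome are hypothetical helpers, not part of ns_server or testrunner.

```python
import re

# Header layout inferred from the log excerpts in this ticket (an assumption).
LOG_HEADER = re.compile(
    r"^\[(?P<category>\w+):(?P<level>\w+),"
    r"(?P<timestamp>[^,]+),"
    r"(?P<node>[^:]+):(?P<process>[^:]*):"
    r"(?P<module>\w+):(?P<function>\w+):(?P<line>\d+)\]"
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the header fields and message as a dict, or None if no match."""
    match = LOG_HEADER.match(line)
    return match.groupdict() if match else None


def rebalance_outcome(lines):
    """Return (outcome, timestamp) from the first rebalance completion entry."""
    for line in lines:
        entry = parse_log_line(line)
        if entry is None or entry["function"] != "log_rebalance_completion":
            continue
        if entry["message"].startswith("Rebalance completed successfully"):
            return "completed", entry["timestamp"]
        if entry["message"].startswith("Rebalance exited with reason"):
            return "exited", entry["timestamp"]
    return "unknown", None


failing = [
    "[user:error,2024-08-07T10:45:01.697-07:00,ns_1@172.23.136.111:<0.12099.0>:"
    "ns_orchestrator:log_rebalance_completion:1704]Rebalance exited with reason {{badmatch,",
]
working = [
    "[user:info,2024-08-07T11:22:35.305-07:00,ns_1@172.23.136.111:<0.7066.0>:"
    "ns_orchestrator:log_rebalance_completion:1661]Rebalance completed successfully.",
]
print(rebalance_outcome(failing))
print(rebalance_outcome(working))
```

Only the first line of a multi-line entry carries the header, so continuation lines of the Erlang error term are skipped by the parser.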
https://cb-engineering.s3.amazonaws.com/test/collectinfo-2024-08-07T182413-ns_1%40172.23.136.111.zip
https://cb-engineering.s3.amazonaws.com/test/collectinfo-2024-08-07T182413-ns_1%40172.23.136.112.zip
https://cb-engineering.s3.amazonaws.com/test/collectinfo-2024-08-07T182413-ns_1%40172.23.136.113.zip
https://cb-engineering.s3.amazonaws.com/test/collectinfo-2024-08-07T182413-ns_1%40172.23.136.114.zip
Script to repro
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/win10-bucket-ops-temp_rebalance_magma1.ini -t failover.AutoFailoverTests.AutoFailoverTests.test_autofailover_during_rebalance,timeout=1,num_node_failures=1,nodes_in=0,nodes_out=1,auto_reprovision=False,failover_action=restart_server,nodes_init=4,override_spec_params=replicas,replicas=0,bucket_spec=single_bucket.buckets_all_ephemeral_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,failover_ephemeral_no_replicas=True,wait_before_failure_induction=0,allow_ephemeral_failover_with_no_replicas=True'
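The -t argument above packs the test path and its parameters into one comma-separated string. As a reading aid, the sketch below splits such a string into a test path and a parameter dictionary. It uses a subset of the parameters from the command, and parse_test_spec is a hypothetical helper: testrunner's own parsing may differ.

```python
def parse_test_spec(spec):
    """Split 'test.path,key=value,...' into (test_path, params)."""
    test_path, *pairs = spec.split(",")
    params = {}
    for pair in pairs:
        # partition keeps any '=' inside the value intact
        key, _, value = pair.partition("=")
        params[key] = value
    return test_path, params


# Subset of the parameters from the command in this ticket.
spec = (
    "failover.AutoFailoverTests.AutoFailoverTests."
    "test_autofailover_during_rebalance,timeout=1,num_node_failures=1,"
    "nodes_in=0,nodes_out=1,failover_action=restart_server,nodes_init=4,"
    "replicas=0,allow_ephemeral_failover_with_no_replicas=True"
)
test_path, params = parse_test_spec(spec)
print(test_path)
for key in ("nodes_init", "replicas", "timeout", "failover_action"):
    print(f"{key} = {params[key]}")
```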