Details
-
Bug
-
Resolution: Not a Bug
-
Critical
-
7.6.0
-
Operating System : Debian
Couchbase Enterprise Edition build 7.6.0-1813
-
Untriaged
-
Linux x86_64
-
-
0
-
Unknown
-
KV 2023-4, Magma-Jan18-2024
Description
Steps to reproduce
1. Created a 7-node KV cluster.
2. Created a Magma bucket named default with 2 replicas.
3. Loaded 100000 items into it.
4. Rebalanced out 5 nodes.
The rebalance fails with the following error:
2023-11-19T20:25:03.690-08:00, ns_orchestrator:0:critical:message(ns_1@172.23.123.44) - Rebalance exited with reason {pre_rebalance_janitor_run_failed,"default", {error,wait_for_memcached_failed, ['ns_1@172.23.107.26']}}. Rebalance Operation Id = 58aed04cccddc2abfde88ea0fabf15ac
The following messages appear in ns_server.debug.log:
[ns_server:info,2023-11-19T20:25:03.689-08:00,ns_1@172.23.123.44:rebalance_agent<0.11108.5>:rebalance_agent:handle_down:290]
Rebalancer process <0.26893.5> died (reason {pre_rebalance_janitor_run_failed, "default", {error, wait_for_memcached_failed, ['ns_1@172.23.107.26']}}).

[ns_server:debug,2023-11-19T20:25:03.689-08:00,ns_1@172.23.123.44:leader_activities<0.10959.5>:leader_activities:handle_activity_down:457]
Activity terminated with reason {shutdown, {async_died, {raised, {exit, {pre_rebalance_janitor_run_failed, "default", {error,wait_for_memcached_failed, ['ns_1@172.23.107.26']}}, [{ns_rebalancer, run_janitor_pre_rebalance,1, [{file,"src/ns_rebalancer.erl"}, {line,700}]}, {lists,foreach_1,2, [{file,"lists.erl"},{line,1442}]}, {ns_rebalancer,rebalance_body,7, [{file,"src/ns_rebalancer.erl"}, {line,483}]}, {async,'-async_init/4-fun-1-',3, [{file,"src/async.erl"}, {line,199}]}]}}}}.
Activity:{activity,<0.26892.5>,#Ref<0.1637097028.2274623492.193245>,default, <<"6e25e56994664fe45c0efcad77caed30">>, [rebalance], majority,[]}

[error_logger:error,2023-11-19T20:25:03.690-08:00,ns_1@172.23.123.44:<0.26889.5>:ale_error_logger_handler:do_log:101]
=========================CRASH REPORT=========================
  crasher:
    initial call: erlang:apply/2
    pid: <0.26889.5>
    registered_name: []
    exception exit: {pre_rebalance_janitor_run_failed,"default", {error,wait_for_memcached_failed, ['ns_1@172.23.107.26']}}
      in function ns_rebalancer:run_janitor_pre_rebalance/1 (src/ns_rebalancer.erl, line 700)
      in call from lists:foreach_1/2 (lists.erl, line 1442)
      in call from ns_rebalancer:rebalance_body/7 (src/ns_rebalancer.erl, line 483)
      in call from async:'-async_init/4-fun-1-'/3 (src/async.erl, line 199)
    ancestors: [<0.11062.5>,ns_orchestrator_child_sup,ns_orchestrator_sup, mb_master_sup,mb_master,leader_registry_sup, leader_services_sup,<0.10956.5>,ns_server_sup, ns_server_nodes_sup,<0.10486.5>,ns_server_cluster_sup, root_sup,<0.155.0>]
    message_queue_len: 0
    messages: []
    links: [<0.11062.5>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 28690
    stack_size: 28
    reductions: 3104
  neighbours:

[user:error,2023-11-19T20:25:03.690-08:00,ns_1@172.23.123.44:<0.11062.5>:ns_orchestrator:log_rebalance_completion:1660]
Rebalance exited with reason {pre_rebalance_janitor_run_failed,"default", {error,wait_for_memcached_failed, ['ns_1@172.23.107.26']}}. Rebalance Operation Id = 58aed04cccddc2abfde88ea0fabf15ac
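
For triage, the bucket and the node that failed the pre-rebalance janitor run can be pulled out of these messages mechanically. The following is a minimal Python sketch, assuming only the message shape seen in the log lines in this ticket; the function name parse_janitor_failure and the regular expression are illustrative and are not part of ns_server or the test framework.

```python
import re

# Matches the failure reason tuple as it appears in the log lines in this ticket.
# Whitespace inside the Erlang term differs between entries, so \s* is allowed
# after every comma. (Assumption: other ns_server versions may format it differently.)
FAILURE_RE = re.compile(
    r'\{pre_rebalance_janitor_run_failed,\s*"(?P<bucket>[^"]+)",\s*'
    r'\{error,\s*(?P<error>\w+),\s*\[(?P<nodes>[^\]]*)\]\}\}'
)


def parse_janitor_failure(line):
    """Return (bucket, error, [nodes]), or None if the line has no such failure."""
    match = FAILURE_RE.search(line)
    if match is None:
        return None
    # Node names are quoted Erlang atoms, e.g. 'ns_1@172.23.107.26'.
    nodes = [
        node.strip().strip("'")
        for node in match.group("nodes").split(",")
        if node.strip()
    ]
    return match.group("bucket"), match.group("error"), nodes


# Message text copied from the ns_orchestrator entry in this ticket.
sample = (
    'Rebalance exited with reason {pre_rebalance_janitor_run_failed,"default", '
    "{error,wait_for_memcached_failed, ['ns_1@172.23.107.26']}}."
)

if __name__ == "__main__":
    print(parse_janitor_failure(sample))
```

Applied to the entries here, this points at bucket default and node ns_1@172.23.107.26, which is where the memcached logs should be checked first.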

Test command:
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /data/workspace/debian-p0-durability-vset00-00-rebalance_out_persist_majority_6.5_P1/testexec.55849.ini num_items=100000,GROUP=P1;durability,durability=PERSIST_TO_MAJORITY,upgrade_version=7.6.0-1813,sirius_url=http://172.23.120.103:4000 -t rebalance_new.rebalance_out.RebalanceOutTests.rebalance_out_with_warming_up,max_verify=100000,value_size=1024,get-cbcollect-info=True,replicas=2,durability=PERSIST_TO_MAJORITY,log_level=info,upgrade_version=7.6.0-1813,GROUP=P1;durability,nodes_init=7,nodes_out=5,num_items=100000,sirius_url=http://172.23.120.103:4000,infra_log_level=info'
Job name: debian-durability_rebalance_out_persist_majority_6.5_P1
Job ref link: http://cb-logs-qe.s3-website-us-west-2.amazonaws.com/7.6.0-1813/jenkins_logs/test_suite_executor-TAF/287297/
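
The -t argument of the test command packs the test path and its parameters into one comma-separated string, which makes it hard to see that nodes_init=7, nodes_out=5 and replicas=2 correspond to the reproduction steps. The sketch below splits such a string into a dictionary. It is an illustration only: parse_test_params is a hypothetical helper, the spec is shortened from the command in this ticket, and letting a later duplicate key overwrite an earlier one is an assumption rather than confirmed framework behavior.

```python
def parse_test_params(spec):
    """Split 'test.path,key=value,...' into (test_path, params).

    Assumption: when a key appears more than once, the last value is kept.
    """
    test_path, *pairs = spec.split(",")
    params = {}
    for pair in pairs:
        # partition splits on the first '=' only, so values containing '=' survive.
        key, _, value = pair.partition("=")
        params[key] = value
    return test_path, params


# Shortened from the -t argument in this ticket.
spec = (
    "rebalance_new.rebalance_out.RebalanceOutTests.rebalance_out_with_warming_up,"
    "max_verify=100000,value_size=1024,replicas=2,durability=PERSIST_TO_MAJORITY,"
    "nodes_init=7,nodes_out=5,num_items=100000"
)

if __name__ == "__main__":
    test_path, params = parse_test_params(spec)
    print(test_path)
    for key in ("nodes_init", "nodes_out", "replicas", "num_items", "durability"):
        print(key, "=", params[key])
```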