Couchbase Server / MB-60085

[Rebalance] : Rebalance exited with reason {pre_rebalance_janitor_run_failed,"default",{error,wait_for_memcached_failed,['ns_1@172.23.121.194']}}

Details

    Description

      Steps to reproduce

      1. Created a 4-node KV cluster: 172.23.121.194, 172.23.121.203, 172.23.121.160, 172.23.121.199
      2. Created an ephemeral bucket named 'default' with replicas=3 and loaded some documents into it
      3. Disabled auto-failover
      4. Enabled auto-reprovision
      5. Induced a "restart_machine" failure on node 172.23.121.194
      6. A few minutes later, added node 172.23.121.198
      7. Started a rebalance
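
For anyone reproducing this by hand rather than through TAF, the cluster-settings part of the steps above (3, 4, 6 and 7) can be sketched as a list of Couchbase REST calls. This is only an illustrative plan, not what the test framework actually runs: the function name `build_requests`, the placeholder credentials, and the `services=kv` choice are assumptions, and nothing is sent over the network. Bucket creation, document loading, and the machine restart are left out.

```python
from urllib.parse import urlencode

# Node addresses taken from the steps above.
INITIAL_NODES = ["172.23.121.194", "172.23.121.203", "172.23.121.160", "172.23.121.199"]
NEW_NODE = "172.23.121.198"


def build_requests(initial_nodes=INITIAL_NODES, new_node=NEW_NODE,
                   user="Administrator", password="password"):
    """Return (method, path, form_body) tuples for steps 3, 4, 6 and 7.

    Hypothetical helper: it only builds the requests and never sends them.
    The user/password defaults are placeholders, not values from the issue.
    """
    # The rebalance call needs every node that should remain in the cluster,
    # written as Erlang node names (ns_1@<ip>).
    known_nodes = ",".join(f"ns_1@{ip}" for ip in initial_nodes + [new_node])
    return [
        # Step 3: disable auto-failover.
        ("POST", "/settings/autoFailover", urlencode({"enabled": "false"})),
        # Step 4: enable auto-reprovision for ephemeral buckets.
        ("POST", "/settings/autoReprovision",
         urlencode({"enabled": "true", "maxNodes": 1})),
        # Step 6: add the new node (after the restart in step 5).
        ("POST", "/controller/addNode",
         urlencode({"hostname": new_node, "user": user,
                    "password": password, "services": "kv"})),
        # Step 7: start the rebalance with no ejected nodes.
        ("POST", "/controller/rebalance",
         urlencode({"knownNodes": known_nodes, "ejectedNodes": ""})),
    ]


if __name__ == "__main__":
    for method, path, body in build_requests():
        print(method, path, body)
```

Each tuple would be sent to port 8091 of an existing cluster node with admin credentials; keeping the plan separate from the transport makes it easy to inspect before running it against a real cluster.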

      Rebalance fails with:

      2023-12-10T22:09:08.502-08:00, ns_orchestrator:0:critical:message(ns_1@172.23.121.199) - Rebalance exited with reason {pre_rebalance_janitor_run_failed,"default",{error,wait_for_memcached_failed,['ns_1@172.23.121.194']}}. Rebalance Operation Id = dab11fbacca77867586d2306c382740f
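
When triaging several runs, it can help to pull the bucket, the error atom, and the affected node out of this message automatically. The following is a minimal sketch that assumes the message keeps the shape shown above; `parse_rebalance_failure` and its regular expression are illustrative and not part of ns_server or TAF.

```python
import re

# The log line from this issue, with the tuple whitespace collapsed.
LOG_LINE = (
    "2023-12-10T22:09:08.502-08:00, "
    "ns_orchestrator:0:critical:message(ns_1@172.23.121.199) - "
    "Rebalance exited with reason "
    "{pre_rebalance_janitor_run_failed,\"default\","
    "{error,wait_for_memcached_failed,['ns_1@172.23.121.194']}}. "
    "Rebalance Operation Id = dab11fbacca77867586d2306c382740f"
)

# \s* between elements tolerates the padding that appears in the raw log.
FAILURE_RE = re.compile(
    r"Rebalance exited with reason \{(?P<reason>\w+),\s*\"(?P<bucket>[^\"]+)\",\s*"
    r"\{error,\s*(?P<error>\w+),\s*\[(?P<nodes>[^\]]*)\]\}\}"
)
OPERATION_RE = re.compile(r"Rebalance Operation Id = ([0-9a-f]+)")


def parse_rebalance_failure(line):
    """Return the failure details as a dict, or None if the line does not match."""
    match = FAILURE_RE.search(line)
    if match is None:
        return None
    operation = OPERATION_RE.search(line)
    return {
        "reason": match.group("reason"),
        "bucket": match.group("bucket"),
        "error": match.group("error"),
        # Node names are single-quoted Erlang atoms inside the list.
        "nodes": re.findall(r"'([^']+)'", match.group("nodes")),
        "operation_id": operation.group(1) if operation else None,
    }


if __name__ == "__main__":
    print(parse_rebalance_failure(LOG_LINE))
```

For the line in this issue, the parsed node is the one that was restarted in step 5 (172.23.121.194), while the message itself was logged by 172.23.121.199.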
      
      

       

      The issue does not reproduce consistently, so I cannot say whether it is a regression.

       


       

      TAF script to reproduce

       

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /data/workspace/debian-p0-collections-vset00-00-auto_reprovision_7.0_P1/testexec.8970.ini GROUP=auto_reprovision,rerun=False,get-cbcollect-info=True,log_level=info,upgrade_version=7.2.4-7059,sirius_url=http://172.23.120.103:4000 -t failover.AutoFailoverTests.AutoFailoverTests.test_rebalance_after_autofailover,timeout=5,num_node_failures=1,nodes_in=1,nodes_out=0,auto_reprovision=True,failover_action=restart_machine,nodes_init=4,can_abort_rebalance=False,bucket_spec=single_bucket.buckets_all_ephemeral_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,GROUP=auto_reprovision'
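
The test specification after `-t` in the command above packs the test path and its parameters into one comma-separated string, which is hard to read at a glance. As a reading aid, the sketch below splits it into a test name and a parameter dictionary. It assumes the spec is the test path followed by `key=value` pairs with no commas inside values; `parse_test_spec` is a hypothetical helper, not TAF's own parser.

```python
# Test specification copied from the -t argument of the command above.
TEST_SPEC = (
    "failover.AutoFailoverTests.AutoFailoverTests.test_rebalance_after_autofailover,"
    "timeout=5,num_node_failures=1,nodes_in=1,nodes_out=0,auto_reprovision=True,"
    "failover_action=restart_machine,nodes_init=4,can_abort_rebalance=False,"
    "bucket_spec=single_bucket.buckets_all_ephemeral_for_rebalance_tests_more_collections,"
    "data_load_spec=volume_test_load_with_CRUD_on_collections,GROUP=auto_reprovision"
)


def parse_test_spec(spec):
    """Split 'test.path,key=value,...' into (test_path, {key: value}).

    Values are kept as strings; no type conversion is attempted.
    """
    test_path, _, rest = spec.partition(",")
    params = {}
    for item in rest.split(","):
        if not item:
            continue  # tolerate a trailing or doubled comma
        key, _, value = item.partition("=")
        params[key] = value
    return test_path, params


if __name__ == "__main__":
    name, params = parse_test_spec(TEST_SPEC)
    print(name)
    for key, value in params.items():
        print(f"  {key} = {value}")
```

The parameters map directly onto the manual steps: `nodes_init=4`, `failover_action=restart_machine`, `auto_reprovision=True`, and `nodes_in=1`.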

       

       

      Job name : debian-collections-auto_reprovision_7.0_P1

      Job ref : http://cb-logs-qe.s3-website-us-west-2.amazonaws.com/7.2.4-7059/jenkins_logs/test_suite_executor-TAF/294830/

          People

            raghav.sk Raghav S K