Couchbase Server / MB-58080

Indexer rebalance failed during PrepareTopologyChange after multiple prior failures due to storage corruption and previous cleanups.


Details

    Description

      Continuing the test from MB-58071:

      1. Stop all workloads and let the rebalance from the last step of MB-58071 finish.
      2. Scale up the cluster to 5-KV, 5-GSI, 2-N1QL. The rebalance failed.

      Rebalance exited with reason {service_rebalance_failed,index,
      {worker_died,
      {'EXIT',<0.24002.381>,
      {{badmatch,
      {error,
      {bad_nodes,index,prepare_rebalance,
      [{'ns_1@svc-i-node-012.z5lvxwpjbkjx2wk3.sandbox.nonprod-project-avengers.com',
      {error,
      {unknown_error,
      <<"indexer rebalance failure - cleanup pending from previous failed/aborted rebalance/failover/move index. please retry the request later.">>}}}]}}},
      [{service_rebalancer,rebalance_worker,1,
      [{file,"src/service_rebalancer.erl"},
      {line,158}]},
      {proc_lib,init_p,3,
      [{file,"proc_lib.erl"},{line,211}]}]}}}}.
      Rebalance Operation Id = d6ca839404b00435ad91621474505656
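The error text above asks the caller to retry later, once the indexer has finished its pending cleanup. A minimal sketch of such a retry loop follows, assuming a hypothetical `start_rebalance` callable that raises `RuntimeError` carrying the indexer's message on failure. Neither the callable nor the attempt and delay values come from this ticket, and the sketch does not address why the cleanup stays pending in this run.

```python
import time

# Substring of the indexer message quoted in the failure above.
CLEANUP_PENDING = "cleanup pending from previous failed/aborted rebalance"


def retry_rebalance(start_rebalance, attempts=5, delay_s=30.0, sleep=time.sleep):
    """Call start_rebalance() until it succeeds or attempts run out.

    Only the "cleanup pending" failure is retried; any other error is
    re-raised immediately. start_rebalance is a hypothetical callable,
    not a Couchbase API.
    """
    for attempt in range(1, attempts + 1):
        try:
            return start_rebalance()
        except RuntimeError as err:
            # Give up on unrelated errors or when the last attempt failed.
            if CLEANUP_PENDING not in str(err) or attempt == attempts:
                raise
            sleep(delay_s * attempt)  # linear backoff between attempts
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.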
      

      Rebalance exited with reason {service_rebalance_failed,index,
      {agent_died,<35111.7809.34>,
      {linked_process_died,<35111.936.36>,
      {'ns_1@svc-i-node-012.z5lvxwpjbkjx2wk3.sandbox.nonprod-project-avengers.com',
      {timeout,
      {gen_server,call,
      [<35111.8041.34>,
      {call,"ServiceAPI.PrepareTopologyChange",
      #Fun<json_rpc_connection.0.69248800>,
      #{timeout => 60000}},
      60000]}}}}}}.
      Rebalance Operation Id = dadc40efa696ff1fae30409ce5a68786
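For triage, the node, the RPC name, and the timeout can be pulled out of a failure message like the one above. The following is a rough sketch using regular expressions; `summarize_failure` is a hypothetical helper, and the patterns assume the message layout shown in this ticket rather than any documented log format.

```python
import re

# Patterns assume the Erlang term layout seen in this ticket's messages.
NODE_RE = re.compile(r"'(ns_1@[^']+)'")
CALL_RE = re.compile(r'\{call,\s*"([^"]+)"')
TIMEOUT_RE = re.compile(r"timeout\s*=>\s*(\d+)")


def summarize_failure(text):
    """Extract node names, the RPC name, and the timeout (ms) if present."""
    call = CALL_RE.search(text)
    timeout = TIMEOUT_RE.search(text)
    return {
        "nodes": sorted(set(NODE_RE.findall(text))),
        "call": call.group(1) if call else None,
        "timeout_ms": int(timeout.group(1)) if timeout else None,
    }
```

Applied to the second failure, this would report the single node `svc-i-node-012`, the call `ServiceAPI.PrepareTopologyChange`, and the 60000 ms timeout.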
      

      Some indexes remain stuck in the moving state indefinitely.

      QE Test

      sudo guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/couchbase_capella_volume_2_new.ini -p bucket_storage=magma,bucket_eviction_policy=fullEviction,rerun=False -t aGoodDoctor.hostedHospital.Murphy.test_rebalance,num_items=100000000,num_buckets=1,bucket_names=GleamBook,bucket_type=membase,iterations=3,batch_size=1000,sdk_timeout=60,log_level=debug,infra_log_level=debug,rerun=False,skip_cleanup=True,key_size=18,randomize_doc_size=False,randomize_value=True,maxttl=10,pc=20,gsi_nodes=3,cbas_nodes=3,fts_nodes=3,kv_nodes=3,n1ql_nodes=2,kv_disk=1000,n1ql_disk=50,gsi_disk=500,fts_disk=1000,cbas_disk=1000,kv_compute=c5.2xlarge,gsi_compute=c5.2xlarge,n1ql_compute=c5.2xlarge,fts_compute=c5.2xlarge,cbas_compute=c5.2xlarge,mutation_perc=20,key_type=CircularKey,capella_run=true,services=data-query-index,rebl_services=data-index,max_rebl_nodes=27,provider=AWS,region=us-east-1,type=GP3,size=1000,collections=10,ops_rate=100000,skip_teardown_cleanup=true,wait_timeout=14400,index_timeout=28800,runtype=dedicated,skip_init=true,rebl_ops_rate=10000,collections=10,expiry=true,vh_scaling=true,horizontal_scale=1 -m rest'
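The `-t` argument list above repeats `collections=10`, and `rerun=False` appears in both the `-p` and `-t` lists. A small sketch for spotting such repeats in a comma-separated `key=value` string is shown below; `duplicate_keys` is a hypothetical helper and not part of the test runner.

```python
from collections import Counter


def duplicate_keys(param_string):
    """Return the sorted keys that occur more than once in 'k=v,k=v,...'.

    Items without '=' (for example the leading test name) are skipped.
    """
    keys = [item.split("=", 1)[0]
            for item in param_string.split(",") if "=" in item]
    return sorted(key for key, count in Counter(keys).items() if count > 1)
```

Whether the runner uses the first or the last occurrence of a repeated key is not stated in this ticket, so a repeat is only worth flagging when the values differ.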
      

          People

            ritesh.agarwal Ritesh Agarwal