  1. Couchbase Server
  2. MB-50172

[4TB, Magma, 2i]: Graceful failover of a KV node followed by full recovery causes rebalance to fail: service_rebalance_failed, index, agent_died


Details

    Description

      1. Create a 4-node cluster.
      2. Create the required buckets and collections.
      3. Create 20000000 items sequentially.
      4. Update 20000000 RandomKey keys to create 50 percent fragmentation.
      5. Create 20000000 items sequentially.
      6. Update 20000000 RandomKey keys to create 50 percent fragmentation.
      7. Rebalance in while loading docs.
      8. Rebalance out while loading docs.
      9. Rebalance in/out while loading docs.
      10. Swap rebalance while loading docs.
      11. Fail over 2 nodes and rebalance those nodes out while loading docs.
      12. Fail over a node and perform full recovery of that node.
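
      Step 12 can be sketched as a sequence of cluster manager REST calls. This is a minimal illustration, not the TAF test's implementation: it only builds the request paths and form bodies, and it assumes the standard Couchbase Server endpoints /controller/startGracefulFailover, /controller/setRecoveryType and /controller/rebalance. The function name and the node names in the usage lines are hypothetical placeholders.

```python
from urllib.parse import urlencode


def failover_full_recovery_requests(otp_node, known_nodes):
    """Return (path, form_body) pairs for step 12, in the order they are sent.

    otp_node    -- OTP name of the KV node to fail over (hypothetical value)
    known_nodes -- OTP names of all nodes that stay in the cluster
    """
    return [
        # 1. Gracefully fail over the KV node. A real client must wait for
        #    this task to finish before continuing.
        ("/controller/startGracefulFailover",
         urlencode({"otpNode": otp_node})),
        # 2. Mark the failed-over node for full recovery.
        ("/controller/setRecoveryType",
         urlencode({"otpNode": otp_node, "recoveryType": "full"})),
        # 3. Rebalance with no ejected nodes so the node rejoins the cluster.
        ("/controller/rebalance",
         urlencode({"knownNodes": ",".join(known_nodes), "ejectedNodes": ""})),
    ]


# Hypothetical node names, for illustration only.
requests_to_send = failover_full_recovery_requests(
    "ns_1@10.0.0.2",
    ["ns_1@10.0.0.1", "ns_1@10.0.0.2", "ns_1@10.0.0.3", "ns_1@10.0.0.4"],
)
for path, body in requests_to_send:
    print("POST", path, body)
```

      In this report, it is the rebalance that follows the recovery that exits with the error shown here.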

        Rebalance exited with reason {service_rebalance_failed,index,
        {agent_died,<26041.6511.0>,
        {linked_process_died,<26041.21454.362>,
        {'ns_1@172.23.106.251',
        {timeout,
        {gen_server,call,
        [<26041.6610.0>,
        {call,"ServiceAPI.PrepareTopologyChange",
        #Fun<json_rpc_connection.0.86436583>,
        #{timeout => 60000}},
        60000]}}}}}}.
        Rebalance Operation Id = f089afb7c34a8a1699b9d5adff62129b
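
        For triage, the nested failure reason can be reduced to the fields that matter: the service that failed, the node named in the term, the RPC that was called, and the timeout that expired. The following is a rough sketch using regular expressions on the text quoted in this report; the helper name summarize_reason is hypothetical, and the patterns are only assumed to fit this particular message shape.

```python
import re

# Failure reason copied from this report.
REASON = """Rebalance exited with reason {service_rebalance_failed,index,
{agent_died,<26041.6511.0>,
{linked_process_died,<26041.21454.362>,
{'ns_1@172.23.106.251',
{timeout,
{gen_server,call,
[<26041.6610.0>,
{call,"ServiceAPI.PrepareTopologyChange",
#Fun<json_rpc_connection.0.86436583>,
#{timeout => 60000}},
60000]}}}}}}."""


def _first(pattern, text):
    """Return the first capture group of pattern in text, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def summarize_reason(reason):
    """Extract service, node, RPC name and timeout (ms) from the reason."""
    timeout = _first(r"timeout\s*=>\s*(\d+)", reason)
    return {
        "service": _first(r"\{service_rebalance_failed,\s*(\w+)", reason),
        "node": _first(r"'(ns_\d+@[^']+)'", reason),
        "rpc": _first(r'\{call,\s*"([^"]+)"', reason),
        "timeout_ms": int(timeout) if timeout is not None else None,
    }


print(summarize_reason(REASON))
```

        Read this way, the message says that the index service's ServiceAPI.PrepareTopologyChange call did not return within 60000 ms, and that the term names node ns_1@172.23.106.251.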
        

      QE Test

      git fetch "https://review.couchbase.org/TAF" refs/changes/79/167879/1 && git checkout FETCH_HEAD
      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/magma_temp_job1.ini -p bucket_storage=magma,bucket_eviction_policy=fullEviction,rerun=False -t aGoodDoctor.Hospital.Murphy.ClusterOpsVolume,nodes_init=4,graceful=True,skip_cleanup=True,num_items=20000000,num_buckets=1,bucket_names=GleamBook,doc_size=1024,bucket_type=membase,eviction_policy=fullEviction,iterations=1,batch_size=1000,sdk_timeout=60,log_level=debug,infra_log_level=debug,rerun=False,skip_cleanup=True,key_size=18,randomize_doc_size=False,randomize_value=True,assert_crashes_on_load=True,num_collections=50,maxttl=10,num_indexes=5,pc=25,index_nodes=2,cbas_nodes=0,fts_nodes=0,ops_rate=60000,ramQuota=10240,doc_ops=create:update:delete:read,rebl_ops_rate=20000,key_type=RandomKey,vbuckets=1024,mutation_perc=30 -m rest'
      

      Attachments

        1. indexer_pprof.log
          87 kB
        2. indexer_pprof2.log
          16.84 MB
        3. metakvcalls_goxdcr_jsevaluator.png
          349 kB
        4. req.png
          35 kB
        5. test_logs.txt
          16.02 MB


            People

              ritesh.agarwal Ritesh Agarwal