  1. Couchbase Server
  2. MB-50172

[4TB, Magma, 2i]: Graceful failover KV node followed by full recovery results in rebalance failed. service_rebalance_failed,index, agent_died


Details

    Description

      1. Create a 4-node cluster.
      2. Create the required buckets and collections.
      3. Create 20000000 items sequentially.
      4. Update 20000000 RandomKey keys to create 50 percent fragmentation.
      5. Create 20000000 items sequentially.
      6. Update 20000000 RandomKey keys to create 50 percent fragmentation.
      7. Rebalance in while loading docs.
      8. Rebalance out while loading docs.
      9. Rebalance in/out while loading docs.
      10. Swap rebalance while loading docs.
      11. Fail over 2 nodes and rebalance those nodes out while loading docs.
      12. Gracefully fail over a KV node, then perform full recovery of that node.
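
For anyone reproducing the last step by hand rather than through the test framework, the sequence can be sketched as three Couchbase REST calls. This is a minimal sketch, not what the test itself runs: it only builds the requests and sends nothing, the node names are hypothetical placeholders, and the helper name `failover_full_recovery_requests` is an illustration. In practice the graceful failover has to finish before the recovery type is set.

```python
from urllib.parse import urlencode


def failover_full_recovery_requests(otp_node, known_nodes):
    """Return (method, path, form_body) tuples for graceful failover,
    full recovery, and the follow-up rebalance. Nothing is sent here."""
    return [
        # 1. Gracefully fail over the node.
        ("POST", "/controller/startGracefulFailover",
         urlencode({"otpNode": otp_node})),
        # 2. Once failover has completed, mark the node for full recovery.
        ("POST", "/controller/setRecoveryType",
         urlencode({"otpNode": otp_node, "recoveryType": "full"})),
        # 3. Rebalance to bring the node back; no nodes are ejected.
        ("POST", "/controller/rebalance",
         urlencode({"knownNodes": ",".join(known_nodes), "ejectedNodes": ""})),
    ]


if __name__ == "__main__":
    # Hypothetical node names, not taken from the failing cluster.
    nodes = ["ns_1@10.0.0.1", "ns_1@10.0.0.4"]
    for method, path, body in failover_full_recovery_requests(nodes[1], nodes):
        print(method, path, body)
```

Each tuple would be sent to the cluster manager port of any node with admin credentials; the failure reported here occurred during the rebalance in the final call.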

        Rebalance exited with reason {service_rebalance_failed,index,
        {agent_died,<26041.6511.0>,
        {linked_process_died,<26041.21454.362>,
        {'ns_1@172.23.106.251',
        {timeout,
        {gen_server,call,
        [<26041.6610.0>,
        {call,"ServiceAPI.PrepareTopologyChange",
        #Fun<json_rpc_connection.0.86436583>,
        #{timeout => 60000}},
        60000]}}}}}}.
        Rebalance Operation Id = f089afb7c34a8a1699b9d5adff62129b
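
The nested Erlang term above is easier to triage once reduced to its key fields: the failing service, the node named in the term, the RPC that was in flight, and the call timeout. The following is a rough sketch using regular expressions rather than a real Erlang term parser; the function name `summarize_rebalance_failure` is hypothetical, and the patterns are only known to fit the shape of this particular message.

```python
import re

# The failure reason exactly as reported above.
REASON = """{service_rebalance_failed,index,
{agent_died,<26041.6511.0>,
{linked_process_died,<26041.21454.362>,
{'ns_1@172.23.106.251',
{timeout,
{gen_server,call,
[<26041.6610.0>,
{call,"ServiceAPI.PrepareTopologyChange",
#Fun<json_rpc_connection.0.86436583>,
#{timeout => 60000}},
60000]}}}}}}."""


def summarize_rebalance_failure(reason):
    """Pull the service, node, RPC name and timeout out of the reason text."""
    service = re.search(r"\{service_rebalance_failed,\s*(\w+)", reason)
    node = re.search(r"'([^']+@[^']+)'", reason)
    rpc = re.search(r'\{call,\s*"([^"]+)"', reason)
    timeout = re.search(r"timeout\s*=>\s*(\d+)", reason)
    return {
        "service": service.group(1) if service else None,
        "node": node.group(1) if node else None,
        "rpc": rpc.group(1) if rpc else None,
        # gen_server:call timeouts are expressed in milliseconds.
        "timeout_ms": int(timeout.group(1)) if timeout else None,
    }


if __name__ == "__main__":
    print(summarize_rebalance_failure(REASON))
```

Read this way, the message says that a `ServiceAPI.PrepareTopologyChange` call for the index service timed out after 60000 ms, with `ns_1@172.23.106.251` as the node named in the term.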
        

      QE Test

      git fetch "https://review.couchbase.org/TAF" refs/changes/79/167879/1 && git checkout FETCH_HEAD
      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/magma_temp_job1.ini -p bucket_storage=magma,bucket_eviction_policy=fullEviction,rerun=False -t aGoodDoctor.Hospital.Murphy.ClusterOpsVolume,nodes_init=4,graceful=True,skip_cleanup=True,num_items=20000000,num_buckets=1,bucket_names=GleamBook,doc_size=1024,bucket_type=membase,eviction_policy=fullEviction,iterations=1,batch_size=1000,sdk_timeout=60,log_level=debug,infra_log_level=debug,rerun=False,skip_cleanup=True,key_size=18,randomize_doc_size=False,randomize_value=True,assert_crashes_on_load=True,num_collections=50,maxttl=10,num_indexes=5,pc=25,index_nodes=2,cbas_nodes=0,fts_nodes=0,ops_rate=60000,ramQuota=10240,doc_ops=create:update:delete:read,rebl_ops_rate=20000,key_type=RandomKey,vbuckets=1024,mutation_perc=30 -m rest'
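
The `-t` argument packs the test path and its parameters into one comma-separated string, which is hard to read at a glance. As a small illustration, the sketch below splits such a string into a dictionary and reports keys that appear more than once. The function name `parse_test_params` is hypothetical and is not part of the test framework, and `SPEC` is only a subset of the parameters in the command above.

```python
def parse_test_params(spec):
    """Split 'test.path,k=v,k=v,...' into (test_path, params, duplicate_keys)."""
    test_path, *pairs = spec.split(",")
    params, duplicates = {}, []
    for pair in pairs:
        key, _, value = pair.partition("=")
        if key in params:
            duplicates.append(key)
        params[key] = value  # In this sketch, the last occurrence wins.
    return test_path, params, duplicates


# A subset of the -t string from the command above.
SPEC = (
    "aGoodDoctor.Hospital.Murphy.ClusterOpsVolume,nodes_init=4,graceful=True,"
    "skip_cleanup=True,num_items=20000000,doc_size=1024,rerun=False,"
    "skip_cleanup=True,key_type=RandomKey"
)

if __name__ == "__main__":
    test_path, params, duplicates = parse_test_params(SPEC)
    print(test_path)
    print("repeated keys:", duplicates)
    # Raw value bytes for one sequential load: num_items * doc_size.
    print(int(params["num_items"]) * int(params["doc_size"]))
```

In the full command `skip_cleanup=True` appears twice in the `-t` string with the same value, so the repetition is harmless here. The product of `num_items` and `doc_size` is 20,480,000,000 bytes of raw values per sequential load, before keys, metadata, and fragmentation.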
      

      Attachments

        1. indexer_pprof.log (87 kB, Yogendra Acharya)
        2. indexer_pprof2.log (16.84 MB, Yogendra Acharya)
        3. metakvcalls_goxdcr_jsevaluator.png (349 kB, Abhishek Jindal)
        4. req.png (35 kB, Artem Stemkovski)
        5. test_logs.txt (16.02 MB, Ritesh Agarwal)

            People

              ritesh.agarwal Ritesh Agarwal
