Couchbase Server / MB-51247

[Magma, 2 replicas, 1% DGM]: XDCR replication appears stuck after rebalance-in is aborted because memcached was OOM-killed by the kernel on the source cluster.


Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 7.1.0
    • 7.1.0
    • XDCR
    • 7.1.0-2388

    Description

      Found this while verifying MB-50240.

      1. Create a 4 node cluster
      2. Create a 2 node XDCR remote cluster
      3. Create buckets with 2 replicas and collections on the source cluster.
      4. Create buckets and collections on the XDCR remote cluster.
      5. Create 10000000 items sequentially
      6. Update 10000000 RandomKey keys to create 50 percent fragmentation
      7. Create 10000000 items sequentially
      8. Update 10000000 RandomKey keys to create 50 percent fragmentation
      9. Rebalance in while loading docs. The rebalance on the source cluster is aborted because memcached on one node is OOM-killed by the kernel. At this point doc ops were still coming in on the destination cluster, but its item count did not change.
      10. Restart the rebalance-in on the source cluster manually. It finishes successfully, replication resumes for a while, and then stops completely.
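
      The symptom in steps 9 and 10 (destination item count not changing while the source is still being loaded) can be checked mechanically. The following is a minimal sketch, not part of the QE test: the function name `replication_stalled`, the `window` parameter, and the sample values are all hypothetical. It assumes the item counts are polled from the destination bucket at a fixed interval by some other means.

```python
from typing import Sequence


def replication_stalled(item_counts: Sequence[int], window: int = 3) -> bool:
    """Return True if the last `window` polls show an unchanged item count.

    `item_counts` is a chronological list of destination-bucket item counts
    (hypothetical input; how the counts are collected is left to the caller).
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    if len(item_counts) < window:
        return False  # not enough samples to decide
    recent = item_counts[-window:]
    return all(count == recent[0] for count in recent)


# Made-up samples: the first series flattens out, the second keeps growing.
print(replication_stalled([9_100_000, 9_400_000, 9_400_000, 9_400_000]))
print(replication_stalled([9_100_000, 9_400_000, 9_700_000, 9_900_000]))
```

      A flat item count is only a hint: with updates or deletes in the workload the count can stay level while replication is still active, so this check fits best while creates are in flight on the source.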

      QE Test

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/magma_temp_job1.ini -p bucket_storage=magma,bucket_eviction_policy=fullEviction,rerun=False -t aGoodDoctor.Hospital.Murphy.ClusterOpsVolume,nodes_init=3,graceful=True,skip_cleanup=True,num_items=10000000,num_buckets=1,bucket_names=GleamBook,doc_size=1024,bucket_type=membase,eviction_policy=fullEviction,iterations=1,batch_size=1000,sdk_timeout=60,log_level=debug,infra_log_level=debug,rerun=False,skip_cleanup=True,key_size=18,randomize_doc_size=False,randomize_value=True,assert_crashes_on_load=True,num_collections=50,maxttl=10,num_indexes=5,pc=25,index_nodes=0,xdcr_collections=50,xdcr_remote_nodes=3,cbas_nodes=0,fts_nodes=0,ops_rate=80000,ramQuota=10240,doc_ops=create:update:delete:read,rebl_ops_rate=20000,key_type=RandomKey,vbuckets=1024,mutation_perc=30,replicas=2 -m rest'
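
      The `-t` argument above packs the test path and its parameters into one comma-separated string, which is hard to read. As a reading aid only, here is a small sketch that splits such a string into the test path and a dictionary. The helper `parse_test_params` is hypothetical and is not the testrunner's own parser. In particular, how testrunner treats keys that appear twice (such as `rerun` and `skip_cleanup` above) is not stated in this ticket; this sketch simply keeps the last value.

```python
from typing import Dict, Tuple


def parse_test_params(spec: str) -> Tuple[str, Dict[str, str]]:
    """Split 'test.path,key=value,...' into the test path and a parameter dict."""
    test_path, *pairs = spec.split(",")
    params: Dict[str, str] = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        params[key] = value  # in this sketch a repeated key keeps its last value
    return test_path, params


# Shortened excerpt of the -t argument from the command above.
spec = (
    "aGoodDoctor.Hospital.Murphy.ClusterOpsVolume,"
    "nodes_init=3,num_items=10000000,replicas=2,key_type=RandomKey"
)
path, params = parse_test_params(spec)
print(path)
for key in ("nodes_init", "num_items", "replicas", "key_type"):
    print(key, "=", params[key])
```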
      

            People

              ritesh.agarwal Ritesh Agarwal
