Couchbase Server / MB-14662

[system tests] Replication runs slowly. src cluster generates a lot of errors: Replication ../AbRegNums/AbRegNums failed. err=map[../AbRegNums/AbRegNums_172.23.105.160:11210_1:Xmem is stuck]


Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • 4.0.0
    • 4.0.0
    • XDCR
    • Security Level: Public
    • None
    • 4.0.0-1869
    • Untriaged
    • Centos 64-bit
    • Unknown

    Description

      steps:
      1. 3 nodes in cluster, 4 buckets. Run the data loader for more than a day.
      2. Set up replication from the SRC cluster to the DEST cluster for all buckets.
      3. Rebalance in at SRC cluster.
      Rebalance in at DEST cluster.

      At this point I found that not all items had been replicated to the DEST cluster. I continued with the next steps while watching the replication:

      4. Graceful Fail Over (rebalance) for a node in SRC cluster, add back (Delta Recovery).
      5. Click failover, Hard Fail Over for a node in SRC cluster A, add back (Full Recovery) and rebalance.
      6. Remove a node in SRC cluster, stop the rebalance. Cancel removing the node and rebalance.
      7. Rebalance out 1 node on SRC cluster.

      I stopped the loader on SRC, and even an hour later items were still being loaded on the DEST cluster.
      Overall, more than 5 hours had passed since I set up replication, yet half of the items had still not been replicated. It usually takes about 30 minutes.

      The logs are spammed with:

      Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.206:11210_1:Xmem is stuck] xdcr000 ns_1@172.23.105.158 10:55:48 - Wed Apr 22, 2015
      Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB started running. xdcr000 ns_1@172.23.105.22 10:55:46 - Wed Apr 22, 2015
      Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.160:11210_1:Xmem is stuck] xdcr000 ns_1@172.23.105.157 10:55:39 - Wed Apr 22, 2015
      Replication 062defc4d1809a1a8e7572418efdfcca/AbRegNums/AbRegNums failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/AbRegNums/AbRegNums_172.23.105.159:11210_0:Xmem is stuck] xdcr000 ns_1@172.23.105.22 10:55:30 - Wed Apr 22, 2015
      Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.206:11210_0:Xmem is stuck] xdcr000 ns_1@172.23.105.22 10:55:07 - Wed Apr 22, 2015
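
To see which replications and destination nodes are hit most often, the "Xmem is stuck" entries can be tallied with a small script. This is only a rough sketch for triage: the regular expression is inferred from the five log lines quoted above, not from any documented log format, and the names `STUCK_RE`, `SAMPLE` and `count_stuck` are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this report (an assumption,
# not a documented format): replication id, then the target host:port that
# appears inside the err=map[...] part just before ":Xmem is stuck".
STUCK_RE = re.compile(
    r"Replication (?P<repl>\S+) failed\. "
    r"err=map\[\S+?_(?P<target>\d+\.\d+\.\d+\.\d+:\d+)_\d+:Xmem is stuck\]"
)

# The five log lines from the description, copied verbatim.
SAMPLE = """\
Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.206:11210_1:Xmem is stuck] xdcr000 ns_1@172.23.105.158 10:55:48 - Wed Apr 22, 2015
Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB started running. xdcr000 ns_1@172.23.105.22 10:55:46 - Wed Apr 22, 2015
Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.160:11210_1:Xmem is stuck] xdcr000 ns_1@172.23.105.157 10:55:39 - Wed Apr 22, 2015
Replication 062defc4d1809a1a8e7572418efdfcca/AbRegNums/AbRegNums failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/AbRegNums/AbRegNums_172.23.105.159:11210_0:Xmem is stuck] xdcr000 ns_1@172.23.105.22 10:55:30 - Wed Apr 22, 2015
Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.206:11210_0:Xmem is stuck] xdcr000 ns_1@172.23.105.22 10:55:07 - Wed Apr 22, 2015
"""


def count_stuck(lines):
    """Count 'Xmem is stuck' failures per (replication id, target host:port)."""
    counts = Counter()
    for line in lines:
        match = STUCK_RE.search(line)
        if match:  # lines such as "started running" are skipped
            counts[(match["repl"], match["target"])] += 1
    return counts


counts = count_stuck(SAMPLE.splitlines())
for (repl, target), n in counts.most_common():
    print(f"{n}  {repl} -> {target}")
```

Run against the full UI log instead of `SAMPLE`, the same tally would show whether the failures concentrate on one destination node or are spread across all of them.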

      src:
      https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.156-20150422-1010-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.157-20150422-1018-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.158-20150422-1014-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.22-20150422-1023-diag.zip
      dest:
      https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.159-20150422-1021-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.160-20150422-1028-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.206-20150422-1026-diag.zip
      https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.207-20150422-1030-diag.zip

            People

              xiaomei Xiaomei Zhang (Inactive)
              andreibaranouski Andrei Baranouski