Details
Description
Steps:
1. 3 nodes in cluster, 4 buckets. Run the data loader for more than a day.
2. Set up replication from the SRC cluster to the DEST cluster for all buckets.
3. Rebalance in at the SRC cluster and rebalance in at the DEST cluster.
At this point I found that not all items had been replicated to the DEST cluster. I continued with the next steps while watching the replication.
4. Graceful failover (with rebalance) of a node in the SRC cluster, then add it back (delta recovery).
5. Hard failover of a node in the SRC cluster, then add it back (full recovery) and rebalance.
6. Remove a node in the SRC cluster and stop the rebalance. Cancel the node removal and rebalance.
7. Rebalance out 1 node on the SRC cluster.
I stopped the loader on SRC, and even an hour later items were still being loaded on the DEST cluster.
In total, more than 5 hours have passed since I set up replication, but half of the items have still not been replicated. It usually takes about 30 minutes.
The logs are spammed with:
Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.206:11210_1:Xmem is stuck] xdcr000 ns_1@172.23.105.158 10:55:48 - Wed Apr 22, 2015
Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB started running. xdcr000 ns_1@172.23.105.22 10:55:46 - Wed Apr 22, 2015
Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.160:11210_1:Xmem is stuck] xdcr000 ns_1@172.23.105.157 10:55:39 - Wed Apr 22, 2015
Replication 062defc4d1809a1a8e7572418efdfcca/AbRegNums/AbRegNums failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/AbRegNums/AbRegNums_172.23.105.159:11210_0:Xmem is stuck] xdcr000 ns_1@172.23.105.22 10:55:30 - Wed Apr 22, 2015
Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.206:11210_0:Xmem is stuck] xdcr000 ns_1@172.23.105.22 10:55:07 - Wed Apr 22, 2015
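To see which destination nodes and buckets the "Xmem is stuck" failures concentrate on, the log lines above can be tallied with a small script. This is only a rough sketch: the regular expression is inferred from the five lines quoted in this ticket, the helper name `count_stuck` is hypothetical, and it assumes that the second component of the replication ID is the source bucket name.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this ticket (an assumption,
# not an official log format): captures the replication ID and the
# destination host:port that appears inside the xmem part name.
STUCK_RE = re.compile(
    r"Replication (?P<repl_id>\S+) failed\. "
    r"err=map\[xmem_\S+?_(?P<target>\d+(?:\.\d+){3}:\d+)_\d+:Xmem is stuck\]"
)


def count_stuck(lines):
    """Count 'Xmem is stuck' failures per destination node and per bucket."""
    by_target = Counter()
    by_bucket = Counter()
    for line in lines:
        match = STUCK_RE.search(line)
        if match is None:
            continue  # e.g. "started running" lines are skipped
        by_target[match.group("target")] += 1
        # Assumption: the replication ID looks like <id>/<source bucket>/<target bucket>.
        by_bucket[match.group("repl_id").split("/")[1]] += 1
    return by_target, by_bucket


# The five log lines from this ticket.
SAMPLE = [
    "Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.206:11210_1:Xmem is stuck] xdcr000 ns_1@172.23.105.158 10:55:48 - Wed Apr 22, 2015",
    "Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB started running. xdcr000 ns_1@172.23.105.22 10:55:46 - Wed Apr 22, 2015",
    "Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.160:11210_1:Xmem is stuck] xdcr000 ns_1@172.23.105.157 10:55:39 - Wed Apr 22, 2015",
    "Replication 062defc4d1809a1a8e7572418efdfcca/AbRegNums/AbRegNums failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/AbRegNums/AbRegNums_172.23.105.159:11210_0:Xmem is stuck] xdcr000 ns_1@172.23.105.22 10:55:30 - Wed Apr 22, 2015",
    "Replication 062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB failed. err=map[xmem_062defc4d1809a1a8e7572418efdfcca/RevAB/RevAB_172.23.105.206:11210_0:Xmem is stuck] xdcr000 ns_1@172.23.105.22 10:55:07 - Wed Apr 22, 2015",
]

if __name__ == "__main__":
    targets, buckets = count_stuck(SAMPLE)
    for target, count in targets.most_common():
        print(target, count)
    for bucket, count in buckets.most_common():
        print(bucket, count)
```

Run against the full UI log rather than this excerpt, the same tally would show whether the failures are spread across all DEST nodes or cluster on a few of them.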
src:
https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.156-20150422-1010-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.157-20150422-1018-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.158-20150422-1014-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.22-20150422-1023-diag.zip
dest:
https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.159-20150422-1021-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.160-20150422-1028-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.206-20150422-1026-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-14661/d002455d/172.23.105.207-20150422-1030-diag.zip
Issue Links
- duplicates: MB-14514 GoXDCR: Pipeline gets reconstructed frequently after rebalance-in, delaying replication (Closed)