This is the expected behavior.
As of now, when XDCR detects topology changes on either the source or the target side, it continues running for a pre-defined period (5 minutes by default), performs a checkpoint, and then restarts. The rationale for this approach is:
1. Replication currently does not have an easy way to handle topology changes without restarting.
2. Replication restart is a resource-heavy operation: a checkpoint needs to be done, DCP connections and memcached connections need to be closed and then re-established, and so on. If we restarted right away after detecting topology changes, we might be restarting so frequently that few resources would be left for actual data replication.
The time window in which replication continues running after detecting topology changes is intentional, to ensure that a reasonable amount of data replication gets done each time before we restart. Replication behavior in this window is suboptimal, though, since replication keeps using the source and target topology it was started with, which has become stale:
1. If a vbucket has been moved in the target cluster, replication will continue to send the mutations in that vbucket to the original owner of the vbucket. This results in NOT_MY_VBUCKET errors, and mutations in that vbucket will not get replicated. This is the behavior seen in this MB, MB-20337. The vbuckets that get moved within the time window are typically a very small portion of all vbuckets, and replication of vbuckets that have not been moved can still proceed. This is why we think keeping replication running is a worthwhile effort.
2. XDCR runtime statistics can become inaccurate when the source topology changes, since statistics collection is based on the stale source topology. This is the behavior seen in
Both these issues will disappear after replication gets restarted. They may resurface if topology changes are still ongoing and replication needs to be restarted again. After topology changes have stopped and replication is restarted for the last time, things will be consistent and there will be no data loss. This is what we observed in our tests.
We understand that this behavior may seem counter-intuitive and problematic to some customers. We have made several parameters configurable so that customers can tune this behavior if that helps. The most useful parameters are:
TopologyChangeCheckInterval – the interval at which we check for source and target topology changes. The default setting is 10 seconds.
MaxTopologyChangeCountBeforeRestart – the number of topology changes seen before replication is restarted. The default setting is 30.
With the default settings, after we see a source or target topology change, we will wait for 30 * 10 = 300 seconds = 5 minutes before we restart replication. This is the reason that "Three to four minutes after the rebalance finishes there is massive spike in XDCR operations where it seems to start working again" in this MB. If needed, the second parameter, and even the first, can be tuned down to reduce the length of the time window where we delay replication restart. Just bear in mind that this can have a negative impact on replication throughput due to the overhead of replication restarts.
An example command to change the parameters is as follows:
curl -X POST -u Administrator:welcome http://127.0.0.1:13000/xdcr/internalSettings -d TopologyChangeCheckInterval=5 -d MaxTopologyChangeCountBeforeRestart=10
Note that the GOXDCR process will get restarted automatically after such a command is issued. The command should be issued before replication is started to avoid interference.