Details
Description
Test setup:
– 5-node source cluster to 5-node destination cluster, unidirectional XDCR, 2 buckets x 500M x 1KB items, 1 replica.
– bucket-1 replicates to bucket-1, bucket-2 replicates to bucket-2.
– 10K mixed (50/50) ops/sec per cluster per bucket.
Initial replication completed without problems; however, several anomalies appeared later:
1. 500 items from bucket-2 on node 172.23.100.19 were never replicated.
2. A non-zero number of optimistically replicated items is reported for both buckets, even though xdcrOptimisticReplicationThreshold is set to 0.
3. Unexpected replication errors are logged, such as "Error with their keys:1 keys with not_my_vbucket errors". No rebalance or failover was in progress at the time.
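
For reference, the behaviour expected in item 2 can be sketched as below. This is only an illustration, not the XDCR code: it assumes the commonly documented rule that a document is replicated optimistically only when its size is strictly below the threshold, and the function name is hypothetical.

```python
def replicated_optimistically(doc_size_bytes: int, threshold_bytes: int) -> bool:
    """Assumed rule: optimistic only if the document is smaller than the threshold."""
    return doc_size_bytes < threshold_bytes


ITEM_SIZE = 1024   # 1KB items, as in the test setup
THRESHOLD = 0      # xdcrOptimisticReplicationThreshold used in this test

# Under the assumed rule, no item can be smaller than 0 bytes,
# so the optimistic counter should stay at zero.
optimistic_count = sum(
    replicated_optimistically(ITEM_SIZE, THRESHOLD) for _ in range(1000)
)
print(optimistic_count)
```

If that rule holds, any non-zero optimistic count with a threshold of 0 points either to the setting not being applied or to the statistic being computed incorrectly.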
Netem network emulation is enabled on both sides:
"netem delay 50ms 2ms loss 0.005% duplicate 0.005% corrupt 0.005%"
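
To reproduce the network conditions, the netem settings above would normally be attached to an interface with `tc qdisc add`. The sketch below only builds the command line; the interface name `eth0` and the helper name are assumptions, since the report does not say which device was used.

```python
import shlex


def netem_command(interface: str, netem_spec: str) -> list:
    """Build a `tc qdisc add` command that attaches the netem spec as root qdisc."""
    return ["tc", "qdisc", "add", "dev", interface, "root"] + shlex.split(netem_spec)


# Spec copied from the report; "eth0" is a placeholder interface name.
SPEC = "netem delay 50ms 2ms loss 0.005% duplicate 0.005% corrupt 0.005%"
cmd = netem_command("eth0", SPEC)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` as root on each node; that step is left out here so the sketch has no side effects.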