ns_server has merged MB-30732 (up to 4 parallel backfills per source node + consider backfill done when persistence has been completed at destination).
This is the new baseline for any Backfill performance test.
couchbase/vulcan tests (toy-build before merging, on Hera):
http://perf.jenkins.couchbase.com/job/hera-hidd/279/ (old backfill)
http://perf.jenkins.couchbase.com/job/hera-hidd/278/ (new backfill)
couchbase/master tests (merged on master, on Titan):
http://perf.jenkins.couchbase.com/job/titan-reb/482/console (old backfill)
http://perf.jenkins.couchbase.com/job/titan-reb/491/console (new backfill)
Note that we have:
- a significant Rebalance speedup on Hera
- no improvement on Titan
My hypothesis is that the speedup on Hera (where the DGM ratio is higher than on Titan) comes from "ns_server waiting for persistence seqno at destination before starting the next vbucket-move" rather than from "4 parallel Backfills". That is because:
- the potential of parallel Backfills cannot be unleashed because of https://issues.couchbase.com/browse/MB-31972
- "waiting for persistence seqno at destination" gives: lower DWQ -> lower mem_used -> ReplicationThrottle triggers less often at Consumer -> Backfill is paused less often at Producer
On Titan (where we have more memory available) the second effect, "waiting for persistence seqno at destination", has no impact.
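
To make the chain above (lower DWQ -> lower mem_used -> fewer throttle triggers -> fewer Backfill pauses) concrete, here is a toy model in Python. It is only a sketch of the hypothesis, not of the actual ns_server / KV-Engine code: the function name, the rates, the quotas and the throttle ratio are all made-up values, and mem_used is crudely assumed to be equal to the DWQ size. It only counts how often the Producer would be paused; it says nothing about total rebalance time.

```python
def count_backfill_pauses(mem_quota, wait_for_persistence,
                          n_vbuckets=8, items_per_vb=100,
                          recv_rate=10, flush_rate=4, throttle_ratio=0.9):
    """Toy model: number of ticks in which the Producer's backfill is paused.

    All parameters are hypothetical. mem_used at the destination is
    modelled as the size of the disk write queue (DWQ).
    """
    dwq = 0            # items received at destination but not yet persisted
    paused_ticks = 0
    for _ in range(n_vbuckets):
        remaining = items_per_vb
        while remaining > 0:
            if dwq >= throttle_ratio * mem_quota:
                # Consumer throttles replication -> Producer pauses backfill
                paused_ticks += 1
            else:
                sent = min(recv_rate, remaining)
                remaining -= sent
                dwq += sent
            # The flusher persists a fixed number of items per tick
            dwq = max(0, dwq - flush_rate)
        if wait_for_persistence:
            # New behaviour: let the destination persist everything
            # before the next vbucket move is started
            while dwq > 0:
                dwq = max(0, dwq - flush_rate)
    return paused_ticks


# "Hera-like" node: small quota, so the DWQ can reach the throttle threshold
hera_old = count_backfill_pauses(mem_quota=200, wait_for_persistence=False)
hera_new = count_backfill_pauses(mem_quota=200, wait_for_persistence=True)

# "Titan-like" node: plenty of memory, the threshold is never reached
titan_old = count_backfill_pauses(mem_quota=10_000, wait_for_persistence=False)
titan_new = count_backfill_pauses(mem_quota=10_000, wait_for_persistence=True)

print("hera :", hera_old, "->", hera_new)
print("titan:", titan_old, "->", titan_new)
```

With these made-up numbers the low-memory case sees pauses only when ns_server does not wait for persistence, while the high-memory case sees no pauses either way, which is the pattern I would expect if the hypothesis holds.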