What you said makes perfect sense. The reason we use revs_diff is to avoid sending a big doc that eventually loses conflict resolution and is discarded at the destination. For deletions there is no such benefit; on the contrary, we end up sending (key+meta) twice for every deletion replicated to the remote side.
We can introduce a threshold parameter on the body size: for any doc smaller than the threshold, we skip revs_diff and send it directly to the remote side; for any doc bigger than that, we still send revs_diff first.
If the parameter is 0, then we need to send revs_diff for all mutations other than deletions (whose body size is 0). This way, deletion is naturally encoded into the parameter, and we do not need to differentiate deletions from other mutations.
If the parameter is infinity (or substantially big), that is equivalent to "optimistic XDCR", meaning we optimistically send all docs without revs_diff, regardless of their size. This way we could even retire the "optimistic XDCR" parameter.
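To make the proposal concrete, here is a minimal Python sketch of the threshold check. It is only an illustration, not the XDCR implementation: `Mutation`, `needs_revs_diff`, and `partition` are hypothetical names, and treating a body size equal to the threshold as "send directly" is an assumption chosen so that a threshold of 0 still lets deletions skip revs_diff.

```python
import math
from dataclasses import dataclass


@dataclass
class Mutation:
    """Hypothetical stand-in for a replicated mutation."""
    key: str
    body_size: int          # bytes; 0 for a deletion
    deleted: bool = False


def needs_revs_diff(mutation, threshold):
    # Assumption: bodies at or below the threshold are sent directly.
    # With threshold 0 only zero-size bodies (deletions) skip revs_diff;
    # with threshold math.inf nothing goes through revs_diff.
    return mutation.body_size > threshold


def partition(mutations, threshold):
    """Split a batch into (send_directly, check_with_revs_diff_first)."""
    direct, check_first = [], []
    for m in mutations:
        if needs_revs_diff(m, threshold):
            check_first.append(m)
        else:
            direct.append(m)
    return direct, check_first


batch = [
    Mutation("a", 0, deleted=True),   # deletion, no body
    Mutation("b", 512),               # small doc
    Mutation("c", 4096),              # large doc
]
direct, check_first = partition(batch, threshold=1024)
print([m.key for m in direct], [m.key for m in check_first])
```

Calling `partition(batch, 0)` leaves only the deletion on the direct path, and `partition(batch, math.inf)` sends everything directly, which corresponds to the two extremes described above.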
Only two questions:
1. How do we get the doc size? Is it encoded in the doc_info? If not, we may want to add it; otherwise we have to pay for another lookup merely to get the doc size.
2. What is a reasonable value for that threshold? I will probably start with something like 1-2K.
Hey Junyi, I just realized that we can make an optimization in XDCR to make the replication of deletion records faster and more efficient.
I might be wrong, but I believe that when XDCR encounters a deletion record, it sends it in the _revs_diff call, and if the target doesn't have it, we then send all the same information again in the bulk docs POST. This is unnecessarily inefficient.
If we skipped the _revs_diff call for deletion records and instead sent them unconditionally in the bulk docs POST, we'd send half as much information and do far fewer background fetches for unidirectional replication and for the first half of bidirectional replication, and we'd do the same amount of data-sending work for the "bounce back" step of bidirectional replication. This is because deleted documents should never have a body, so the amount of information sent in the _revs_diff call and in the bulk docs POST is the same.
If we also unconditionally sent very small bodies for non-deleted documents (say, less than 100 bytes), we'd send less total information and perform much less work for unidirectional replication. For bidirectional replication, the first part would likewise involve less work and less information, and the second part would send only slightly more information. It should be a nice win for small documents as well.
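The trade-off can be checked with some back-of-the-envelope arithmetic. The Python sketch below is a simplified model, not a measurement: it counts only the key+meta and body bytes sent from source to target, ignores HTTP overhead and response sizes, and uses a made-up 60-byte key+meta size. The function names are hypothetical.

```python
def bytes_with_revs_diff(meta_size, body_size, target_has_rev):
    """Bytes sent when _revs_diff is called before the bulk docs POST."""
    sent = meta_size                    # key+meta in the _revs_diff request
    if not target_has_rev:
        sent += meta_size + body_size   # key+meta+body in the bulk docs POST
    return sent


def bytes_direct(meta_size, body_size):
    """Bytes sent when the doc goes straight into the bulk docs POST."""
    return meta_size + body_size


META = 60  # assumed key+meta size in bytes, for illustration only

for label, body in [("deletion", 0), ("small doc", 100)]:
    for target_has_rev in (False, True):   # True models the "bounce back" step
        print(
            label,
            "bounce back" if target_has_rev else "first send",
            "revs_diff:", bytes_with_revs_diff(META, body, target_has_rev),
            "direct:", bytes_direct(META, body),
        )
```

Under these assumptions a deletion costs half as much on the first send and the same on the bounce back, while a 100-byte doc costs less on the first send and somewhat more on the bounce back.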
This optimization can be made without sacrificing backwards compatibility or correctness. Anyway, wanted to just get this idea out there to you. Let me know if I didn't do a good job describing this, or you see something wrong in my reasoning. Thanks!