After seeing XDCR in action, I would like to propose a few enhancements:
-Put certain statistics in the XDCR screen as well as on the graph page:
-Percentage complete/caught up. While replication is backfilling, this would show the number of items already sent to the remote side out of the total in the bucket. Once running, it would show whether there is a significant backlog in the queue
-Items per second to see speed of each stream and in total
-Bandwidth in use. According to one customer, the most important thing to watch with XDCR will be bandwidth, since replication may run over cross-country internet links. They will need to monitor it for each replication stream and in total
-On the graph page of outbound, I would recommend removing "mutations checked", "mutations replicated", "data replication", "active vb reps", "waiting vb reps", "secs in replicating", "secs in checkpointing", "checkpoints issued" and "checkpoints failed". These stats really aren't useful from the perspective of someone trying to monitor or troubleshoot the current state of their cluster.
-On the graph page of outbound, there's a bit of confusion over the difference between "mutations to replicate", "mutations in queue" and "queue size". Unless they show significantly (and usefully) different metrics, I recommend removing all but one
-On the graph page of incoming, recommend putting "total ops/sec" on the far left to line up with the "ops/sec" in the summary section
-"XDCR dest ops per sec" is confusing because this cluster is the "destination" yet the stat implies the other way around. Recommend "Incoming XDCR ops per sec"
-"XDCR docs to replicate" is a little confusing because it doesn't match the name of the same stat in the "outbound" section. Recommend changing "mutations to replicate" to "XDCR docs to replicate"
-Would also be good to see outbound ops/sec in the summary section alongside the number remaining to replicate
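
To make the first group of stats concrete (percentage complete, items per second, bandwidth in use), here is a rough sketch of how they might be derived from two samples of per-stream counters. Everything in it is an assumption for illustration: the `StreamSample` fields, the function names and the stream names are hypothetical and are not taken from ns_server.

```python
from dataclasses import dataclass


@dataclass
class StreamSample:
    """One sample of a replication stream's cumulative counters (hypothetical)."""
    timestamp: float   # seconds since some fixed start
    items_sent: int    # items sent to the remote side so far
    bytes_sent: int    # bytes sent to the remote side so far


def percent_complete(items_sent: int, items_in_bucket: int) -> float:
    """Items already sent as a percentage of the total in the bucket."""
    if items_in_bucket <= 0:
        return 100.0  # nothing to send, so treat the stream as caught up
    return min(100.0, 100.0 * items_sent / items_in_bucket)


def rates(prev: StreamSample, curr: StreamSample) -> tuple[float, float]:
    """Return (items per second, bytes per second) between two samples."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        return 0.0, 0.0  # samples out of order or identical: no rate
    return ((curr.items_sent - prev.items_sent) / elapsed,
            (curr.bytes_sent - prev.bytes_sent) / elapsed)


def totals(per_stream: dict[str, tuple[float, float]]) -> tuple[float, float]:
    """Sum the per-stream rates to get the totals across all streams."""
    return (sum(r[0] for r in per_stream.values()),
            sum(r[1] for r in per_stream.values()))
```

A possible use, with made-up numbers for two streams sampled 10 seconds apart:

```python
per_stream = {
    "stream_a": rates(StreamSample(0.0, 1000, 500_000),
                      StreamSample(10.0, 3000, 1_500_000)),
    "stream_b": rates(StreamSample(0.0, 0, 0),
                      StreamSample(10.0, 500, 250_000)),
}
print(percent_complete(3000, 12000))  # share of the bucket already sent
print(per_stream["stream_a"])         # (items/sec, bytes/sec) for one stream
print(totals(per_stream))             # the same rates summed over all streams
```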
|For Gerrit Dashboard: &For+MB-7432=message:MB-7432|
|24541,3||bp: MB-7432: add replicate rate in outbound XDCR stats||ns_server||Status: ABANDONED||+1||+1|
|24542,1||MB-7432: add latency stat, WIP no review please||ns_server||Status: ABANDONED||0||0|
|24557,1||Merge remote-tracking branch 'couchbase/master' into b202||ns_server||Status: MERGED||+2||+1|
|24647,1||MB-7432: clear latency stats when vb replicator wakes up||ns_server||Status: ABANDONED||+1||+1|
|26278,1||Merge branch 'master' into 2.0.2||ns_server||Status: MERGED||+2||+1|
|40094,5||MB-7432: reimplemented xdcr stats||ns_server||Status: MERGED||+2||+1|