Details
Bug
Resolution: Done
Major
6.5.0
Triaged
Yes
Description
We are observing a decrease of roughly 56% in enterprise cbbackupmgr merge throughput (from 122 MB/sec to 54 MB/sec) between builds 6.5.0-3197 and 6.5.0-3198.
Test
EE merge throughput (Avg. MB/sec). 4 nodes. 100M x 100M docs (overlapping keys)
Results
6.5.0-3197: 122 MB/sec
6.5.0-3198: 54 MB/sec
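For reference, the size of the regression can be checked directly from the two averages above. This is only a minimal arithmetic sketch; the helper name `percent_drop` is made up for illustration and is not part of any perf tooling.

```python
def percent_drop(baseline: float, current: float) -> float:
    """Relative decrease from baseline to current, as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - current) / baseline * 100.0


# Average merge throughput (MB/sec) reported for the two builds.
drop = percent_drop(122.0, 54.0)
print(f"{drop:.1f}% decrease")  # prints "55.7% decrease"
```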
Report
Logs for 6.5.0-3197
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-leto-9901/leto-srv-01.perf.couchbase.com.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-leto-9901/leto-srv-02.perf.couchbase.com.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-leto-9901/leto-srv-03.perf.couchbase.com.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-leto-9901/leto-srv-04.perf.couchbase.com.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-leto-9901/tools.zip
Logs for 6.5.0-3198
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-leto-9900/leto-srv-01.perf.couchbase.com.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-leto-9900/leto-srv-02.perf.couchbase.com.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-leto-9900/leto-srv-03.perf.couchbase.com.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-leto-9900/leto-srv-04.perf.couchbase.com.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-leto-9900/tools.zip
Changelog
http://172.23.123.43:8000/getchangelog?product=couchbase-server&fromb=6.5.0-3197&tob=6.5.0-3198
Comment
It looks like the behavior of backup and merge has changed significantly in this commit:
https://github.com/couchbase/backup/commit/3d59d52be05a411f437c375bd7235638666bd2d4
Looking at the report graphs, all 4 nodes in the cluster are using their disks during merge in 3198, whereas the previous merge (pre-3198) only used disk resources on a single node. With such a significant change, perhaps we need to review the test to align it with the new merge behavior. Otherwise, it seems the merge operation is now routing data directly to the appropriate vbucket instead of to just a single node.
Could you let us know what the new intended merge behavior is?