Details
- Type: Bug
- Resolution: Fixed
- Priority: Critical
- Affects Version: 7.1.0
- Triage: Untriaged
- Environment: CentOS 64-bit
Description
A previous issue, https://issues.couchbase.com/browse/MB-49301, was opened for this but resolved as not a bug. More details on the CPU throttle algorithm would help explain this result and how the throttle is intended to work. When I set the throttle level to 0.95, query throughput drops by about 40%. This performance test normally pushes indexer CPU close to 100%. If the throttle level is set to 0.95, why does CPU usage get throttled down to about 60% instead of to just below 95%? It looks like a lot of CPU cycles are being wasted.
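For illustration only, here is a minimal, self-contained Go sketch of one way a duty-cycle style throttle can settle well below its target: it measures usage once per window and, on any overshoot, idles workers for the whole next window instead of trimming just the excess, so usage oscillates between ~100% and near-idle and averages far under 0.95. All names here (throttleLoop, worker, the injected usage reader) are hypothetical; this is not taken from the indexer source and is only a guess at the kind of mechanism that could produce the observed behaviour.

```go
// Hypothetical duty-cycle throttle sketch (NOT the actual indexer code).
// Shows how correcting overshoot by pausing work for whole windows can drive
// average CPU well below the configured target.
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

const (
	target   = 0.95                   // configured throttle level
	interval = 100 * time.Millisecond // measurement window
)

// paused is read by workers; 1 means "back off for this window".
var paused int32

// throttleLoop compares measured usage against the target once per interval.
// usageFn is a stand-in for reading process CPU time; it is injected here so
// the sketch stays self-contained.
func throttleLoop(usageFn func() float64) {
	for range time.Tick(interval) {
		usage := usageFn() // fraction of available CPU used in the last window
		if usage > target {
			// Naive correction: stop work entirely for the next window
			// instead of trimming only the (usage - target) excess. With
			// usage pinned near 1.0, this alternates between a full-speed
			// window and a near-idle window, so the long-run average lands
			// closer to the midpoint of the two than to the 0.95 target.
			atomic.StoreInt32(&paused, 1)
		} else {
			atomic.StoreInt32(&paused, 0)
		}
	}
}

// worker burns CPU whenever it is not paused.
func worker(done <-chan struct{}) {
	for {
		select {
		case <-done:
			return
		default:
			if atomic.LoadInt32(&paused) == 1 {
				time.Sleep(interval) // skip a whole window
				continue
			}
			// Simulated scan/query work.
			for i := 0; i < 1_000_000; i++ {
				_ = i * i
			}
		}
	}
}

func main() {
	done := make(chan struct{})
	for i := 0; i < 4; i++ {
		go worker(done)
	}
	// Fake usage reader: reports ~1.0 while running and ~0.2 while paused,
	// standing in for a real per-process CPU measurement.
	go throttleLoop(func() float64 {
		if atomic.LoadInt32(&paused) == 1 {
			return 0.2
		}
		return 1.0
	})
	time.Sleep(2 * time.Second)
	close(done)
	fmt.Println("done")
}
```

If the real algorithm corrects in coarse slices like this, the ~60% average CPU would be consistent with alternating busy and idle windows rather than usage being held just under 95%; a description of the actual throttle algorithm would confirm or rule this out.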
Comparing these two tests on 7.1.0-1650:
http://showfast.sc.couchbase.com/#/timeline/Linux/n1ql/Q5_Q7/all
Avg. Query Throughput (queries/sec), CI6, Group By Query (1K matches), MOI, not_bounded, s=1 c=1 i=1
http://perf.jenkins.couchbase.com/job/iris-multi-client/12973/ - 15280 queries/sec
Avg. Query Throughput (queries/sec), CI6, Group By Query (1K matches), MOI, not_bounded, Indexer CPU Throttle 0.95, s=1 c=1 i=1
http://perf.jenkins.couchbase.com/job/iris-multi-client/13027/ - 7651 queries/sec
graph comparison: http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=iris_710-1650_access_0436&snapshot=iris_710-1650_access_afdf
Indexer CPU usage drops from about 4800% to about 3000% (3000/4800 ≈ 62%, roughly the ~60% figure described above and well below the configured 0.95 target).
Logs from the throttled run (.45 is the indexer node):
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-iris-multi-client-13027/172.23.100.45.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-iris-multi-client-13027/172.23.100.55.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-iris-multi-client-13027/172.23.100.70.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-iris-multi-client-13027/172.23.100.71.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-iris-multi-client-13027/172.23.100.72.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-iris-multi-client-13027/172.23.100.73.zip
https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-iris-multi-client-13027/172.23.100.9.zip
Is this really the intended behaviour? If this is the default setting, it could have a negative performance impact when upgrading a cluster.