Details
- Bug
- Status: Closed
- Major
- Resolution: Not a Bug
- 7.1.0
- Triaged
- 1
- Unknown
- KV 2021-Nov
Description
Note: Initially opened for performance variation between runs; that has been addressed, and this ticket now tracks the issue where the replica item count does not reach the active item count.
—
In Magma insert-only tests, we see high performance variation.
http://172.23.123.237/#/timeline/Linux/hidd/S0/all
In the latest runs with build 7.1.0-1558, the throughput changed from 122K to 226K.
Build | Throughput | Job |
---|---|---|
7.1.0-1558 | 122,966 | http://perf.jenkins.couchbase.com/job/rhea-5node2/1538/ |
7.1.0-1558 | 226,157 | http://perf.jenkins.couchbase.com/job/rhea-5node2/1539/ |
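To put the two runs in the table above on a common scale, the gap can be computed directly from the reported numbers. This is only arithmetic on the table values; the dictionary keys are the Jenkins job names, used here purely as labels.

```python
# Throughput values copied from the table above (units as reported by the perf job).
runs = {
    "rhea-5node2/1538": 122_966,
    "rhea-5node2/1539": 226_157,
}

low, high = min(runs.values()), max(runs.values())
ratio = high / low                      # how many times faster the fast run is
spread_pct = (high - low) / low * 100   # increase relative to the slow run

print(f"ratio: {ratio:.2f}x, spread: {spread_pct:.1f}%")
# prints "ratio: 1.84x, spread: 83.9%"
```

In other words, the faster run is roughly 1.84 times the slower one on the same build.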
In the run with the higher throughput, the replica sync rate can't catch up.
There are more sync write flushes after a certain point.
Sarath Lakshman, please take a look. Is there a way we can change checkpoint settings? It looks like the runs can go to different modes (or code paths), even with the same build.
Attachments
Issue Links
- relates to MB-49262 Checkpoint expel stops before low mark (Closed)
Activity
Field | Original Value | New Value |
---|---|---|
Assignee | Sarath Lakshman [ sarath ] | Bo-Chun Wang [ bo-chun.wang ] |
Attachment | Screen Shot 2021-10-28 at 6.14.34 PM.png [ 166442 ] |
Assignee | Bo-Chun Wang [ bo-chun.wang ] | Sarath Lakshman [ sarath ] |
Attachment | Screenshot 2021-11-09 at 2.32.41 PM.png [ 168138 ] |
Attachment | Screenshot 2021-11-09 at 2.35.58 PM.png [ 168139 ] |
Component/s | couchbase-bucket [ 10173 ] |
Assignee | Sarath Lakshman [ sarath ] | Daniel Owen [ owend ] |
Attachment | Unknown.png [ 168141 ] |
Attachment | Unknown.png [ 168141 ] |
Attachment | 2c9be7495d498bf4ae151733781c8069.png [ 168142 ] |
Attachment | fc98c640dcf022cd0036bb64ca36e284.png [ 168143 ] |
Rank | Ranked higher |
Rank | Ranked lower |
Assignee | Daniel Owen [ owend ] | Bo-Chun Wang [ bo-chun.wang ] |
Rank | Ranked higher |
Attachment | Screen Shot 2021-11-09 at 1.57.13 PM.png [ 168257 ] |
Attachment | Screen Shot 2021-11-09 at 2.05.11 PM.png [ 168258 ] |
Assignee | Bo-Chun Wang [ bo-chun.wang ] | Daniel Owen [ owend ] |
Resolution | | Fixed [ 1 ] |
Status | Open [ 1 ] | Resolved [ 5 ] |
Assignee | Daniel Owen [ owend ] | Bo-Chun Wang [ bo-chun.wang ] |
Attachment | Screenshot 2021-11-10 at 15.30.33.png [ 168358 ] |
Attachment | Screenshot 2021-11-10 at 15.41.23.png [ 168360 ] |
Attachment | Screenshot 2021-11-10 at 15.46.40.png [ 168361 ] |
Assignee | Bo-Chun Wang [ bo-chun.wang ] | Daniel Owen [ owend ] |
Assignee | Daniel Owen [ owend ] | Dave Rigby [ drigby ] |
Resolution | Fixed [ 1 ] | |
Status | Resolved [ 5 ] | Reopened [ 4 ] |
Description | Original text, as in the Description above but without the leading note; the wiki source also links http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=rhea_710-1558_access_key_prefix_e3f5&&label=high_insert_rate&snapshot=rhea_710-1558_access_key_prefix_3994&label=low_insert_rate | Same text with the leading note added |
Summary | High performance variation in Magma insert only tests | Replica item count lagging active in Magma insert test |
Assignee | Dave Rigby [ drigby ] | Paolo Cocchi [ paolo.cocchi ] |
Epic Link |
|
Sprint | KV 2021-Nov [ 1866 ] |
Status | Reopened [ 4 ] | In Progress [ 3 ] |
Rank | Ranked lower |
Attachment | MB-49170_build-1729.png [ 170525 ] |
Attachment |
|
Attachment | MB-49170_build-1729.png [ 170526 ] |
Triage | Untriaged [ 10351 ] | Triaged [ 10350 ] |
Resolution | | Not a Bug [ 10200 ] |
Status | In Progress [ 3 ] | Resolved [ 5 ] |
Assignee | Paolo Cocchi [ paolo.cocchi ] | Sarath Lakshman [ sarath ] |
Assignee | Sarath Lakshman [ sarath ] | Bo-Chun Wang [ bo-chun.wang ] |
Status | Resolved [ 5 ] | Closed [ 6 ] |
Could you bisect this to the last good build where the numbers are stable?
It is not obvious whether any Magma or KV change led to the inconsistent throughput.
The higher NumSyncFlushes is a side effect of large writes in the active buckets and a larger write queue when the replica doesn't catch up.
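The bisection requested above could be sketched as a binary search over an ordered list of builds. This is a minimal sketch, not an existing tool: `last_good_build` and the `is_stable` callback are hypothetical names, and the search assumes every build before the regression is stable and every build after it is not. Since this ticket shows two runs of the same build diverging, `is_stable` would likely need to run the test several times per build before deciding.

```python
from typing import Callable, Optional, Sequence, TypeVar

B = TypeVar("B")

def last_good_build(builds: Sequence[B], is_stable: Callable[[B], bool]) -> Optional[B]:
    """Return the newest build for which is_stable() is True, or None.

    Hypothetical helper: `builds` must be sorted oldest to newest, and
    stability must be monotonic (stable builds first, then unstable ones).
    """
    lo, hi = 0, len(builds)  # invariant: builds[:lo] stable, builds[hi:] unstable
    while lo < hi:
        mid = (lo + hi) // 2
        if is_stable(builds[mid]):
            lo = mid + 1     # regression is after mid
        else:
            hi = mid         # regression is at or before mid
    return builds[lo - 1] if lo > 0 else None
```

Each step halves the candidate range, so the number of builds to test grows with the logarithm of the range rather than its length.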