Details

Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: Morpheus
Affects Version/s: 7.1.3
Component/s: couchbase-bucket
Labels:
- candidate-for-trinity

Triage: Untriaged
Story Points: 0
Is this a Regression?: Yes

Description

While measuring the latency of SyncWrites on modest node sizes (EC2 r5.2xlarge, 8 CPU cores), periodic jumps in the worst-case (p100) SyncWrite latency were observed every 10 minutes.

Looking at the tasks which run every 10 minutes, the jumps correlate very directly with when the ExpiryPager is scheduled to run (for each of the 7 buckets on this cluster):

i.e. when the ExpiryPager starts to run for a bucket, the maximum SyncWrite latency suffers.

This appears to be due to contention on the NonIO thread pool: on an 8-core system we create 2 NonIO threads, and the ExpiryPager runs 2 tasks per Bucket, so both NonIO threads can be occupied while the ExpiryPager runs.

Indeed, the latency increase is (almost) entirely eliminated if the number of NonIO threads is increased from 2 to 3, so that there is still a "spare" NonIO thread while the ExpiryPager tasks are running. Note that the thread count was changed at the dotted blue line in the chart:
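The contention argument above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual KV-engine scheduler: it assumes a FIFO, non-preemptive pool, and the function name `queue_delay` and the 50 ms task duration are hypothetical values chosen for illustration, not measurements from this ticket.

```python
import heapq


def queue_delay(num_threads, long_task_durations, arrival_time=0.0):
    """Return how long a newly arriving task waits for a free thread.

    Toy model (assumption): all long-running tasks are queued at time 0
    and are assigned FIFO to whichever thread becomes free first; tasks
    are never preempted.
    """
    # Each heap entry is the time at which one thread becomes free.
    free_at = [0.0] * num_threads
    heapq.heapify(free_at)
    for duration in long_task_durations:
        start = heapq.heappop(free_at)       # earliest available thread
        heapq.heappush(free_at, start + duration)
    # The new task runs as soon as the earliest thread is free.
    return max(0.0, free_at[0] - arrival_time)


# Hypothetical: two pager-style tasks of 50 ms each for one bucket.
pager_tasks = [50.0, 50.0]
delay_two_threads = queue_delay(2, pager_tasks)
delay_three_threads = queue_delay(3, pager_tasks)
```

In this model, two threads are both occupied by the two long tasks, so any other task needing a thread from the same pool waits for one of them to finish, while a third thread leaves one free and the wait drops to zero. That is consistent with the observation that going from 2 to 3 NonIO threads removes most of the latency increase.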

Attachments

Screenshot 2023-01-09 at 14.09.43.png
154 kB
09/Jan/23 6:11 AM
Screenshot 2023-01-09 at 14.10.50.png
328 kB
09/Jan/23 6:13 AM
Screenshot 2023-01-09 at 14.17.26.png
415 kB
09/Jan/23 6:17 AM

Issue Links

relates to:
- MB-41403 Investigate/Improve performance of document expiry throughput (Closed)
- MB-55086 Increase default number of NonIO threads (Closed)

People

Assignee: Daniel Owen
Reporter: Dave Rigby (Inactive)
Votes: 0
Watchers: 2

Dates

Created: 09/Jan/23 6:11 AM
Updated: 04/Dec/23 6:53 AM

Gerrit Reviews

There are no open Gerrit changes

ExpiryPager runs causing spikes in p100 SyncWrite latency
