Details

Type: Improvement
Resolution: Done
Priority: Major
Fix Version/s: 7.1.0
Affects Version/s: 6.5.1, 6.6.0, 6.5.0
Component/s: couchbase-bucket
Labels:
- neo-committed
- request-dev-verify

Epic Link:
KV: Cloud Free Tier
Story Points:
1
Sprint:
KV Sprint 2020-July, KV Sprint 2020-Oct, KV-Engine-Sept-21, KV 2021-Oct-21

Description

Investigation of MB-39618 highlighted that the current busy-polling implementation of SyncWrite timeout checking is very costly, at approximately 3.5% CPU per bucket. On a 10-bucket node with zero op/s, the memcached process consumes 35% CPU:

top - 13:35:30 up 80 days, 22:37,  9 users,  load average: 0.81, 0.64, 0.43
Tasks: 471 total,   1 running, 300 sleeping,   6 stopped,   2 zombie
%Cpu(s):  2.8 us,  0.9 sy,  0.0 ni, 96.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 13197644+total, 15423808 free,  3459340 used, 11309330+buff/cache
KiB Swap: 13416857+total, 13416651+free,     2060 used. 12749011+avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1654 daver     20   0 1978268 260788  13232 S  35.1  0.2 600:52.51 memcached

The issue is that we currently schedule (and run) one DurabilityTimeoutTask per bucket every 25ms. Each time it runs, it must iterate through all VBuckets, checking for any SyncWrites which have timed out. This happens even if the cluster has zero SyncWrites in progress (!).
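To put a rough number on the polling cost, the sketch below counts how many vBucket visits per second the 25ms task implies. It assumes 1024 vBuckets per bucket (consistent with the 10240-task figure later in this description); `pollingVisitsPerSecond` is a hypothetical helper for illustration, not a kv_engine function.

```cpp
#include <cstdint>

// Hypothetical helper: number of vBucket visits per second caused by a
// polling task that scans every vBucket of every bucket once per interval.
constexpr std::uint64_t pollingVisitsPerSecond(std::uint64_t buckets,
                                               std::uint64_t vbucketsPerBucket,
                                               std::uint64_t pollIntervalMs) {
    // 1000 / pollIntervalMs is the number of task runs per second.
    return buckets * vbucketsPerBucket * (1000 / pollIntervalMs);
}

// Assumption: 1024 vBuckets per bucket, 25ms polling interval.
static_assert(pollingVisitsPerSecond(1, 1024, 25) == 40960,
              "one bucket: 40 runs/s * 1024 vBuckets");
static_assert(pollingVisitsPerSecond(10, 1024, 25) == 409600,
              "ten buckets scale linearly");
```

Under these assumptions an idle 10-bucket node performs about 410,000 vBucket checks per second, none of which can find a timed-out SyncWrite.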

A more efficient solution would be a DurabilityTimeoutTask per VBucket, which is scheduled to run when the next SyncWrite in that vBucket will expire. If no SyncWrites are outstanding on that vBucket then no task would be scheduled (and nothing would need to wake up). This should reduce the idle CPU usage to close to zero (or at least make it no longer a function of the number of Buckets).
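A minimal sketch of the per-vBucket idea, assuming each SyncWrite is identified by a seqno and has an absolute deadline. `VBucketTimeoutTracker` and its methods are hypothetical names for illustration only; they are not the kv_engine classes, and a real implementation would also need locking and integration with the executor.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical per-vBucket tracker of outstanding SyncWrite deadlines.
class VBucketTimeoutTracker {
public:
    // Record a SyncWrite (identified by seqno) that expires at `deadline`.
    void addSyncWrite(std::uint64_t seqno, Clock::time_point deadline) {
        deadlines.emplace(deadline, seqno);
    }

    // Remove a SyncWrite that completed before timing out.
    void complete(std::uint64_t seqno) {
        for (auto it = deadlines.begin(); it != deadlines.end(); ++it) {
            if (it->second == seqno) {
                deadlines.erase(it);
                return;
            }
        }
    }

    // When the timeout task should next run. An empty result means there is
    // nothing outstanding, so no task needs to be scheduled at all.
    std::optional<Clock::time_point> nextWakeup() const {
        if (deadlines.empty()) {
            return std::nullopt;
        }
        return deadlines.begin()->first;
    }

    // Called when the task fires: remove and return every SyncWrite whose
    // deadline is at or before `now`.
    std::vector<std::uint64_t> expire(Clock::time_point now) {
        std::vector<std::uint64_t> timedOut;
        const auto end = deadlines.upper_bound(now);
        for (auto it = deadlines.begin(); it != end; ++it) {
            timedOut.push_back(it->second);
        }
        deadlines.erase(deadlines.begin(), end);
        // Everything left must expire strictly after `now`.
        assert(deadlines.empty() || deadlines.begin()->first > now);
        return timedOut;
    }

private:
    // Ordered by deadline so the earliest expiry is always at begin().
    std::multimap<Clock::time_point, std::uint64_t> deadlines;
};
```

The caller would reschedule the task to `nextWakeup()` after every `addSyncWrite`, `complete` or `expire`, and cancel it when the result is empty, so an idle vBucket costs nothing.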

However, such a scheme isn't feasible with our current Executor / scheduler implementation, as I don't believe it would scale to 10,000s of tasks (10 buckets would require up to 10240 tasks - and that's just for DurabilityTimeoutTask).

Facebook's Folly library does have an Executor which claims to scale to this level (using a hashed hierarchical wheel timer). We should investigate whether that is suitable.
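For readers unfamiliar with wheel timers, here is a simplified single-level hashed timing wheel. It is an illustration of the general technique only, not Folly's implementation and not hierarchical; `HashedWheel`, `schedule` and `tick` are hypothetical names. The point it shows is that scheduling a timer is O(1) regardless of how many timers are outstanding, which is why this family of structures can handle tens of thousands of timers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <vector>

// Illustrative single-level hashed timing wheel (NOT Folly's timer).
class HashedWheel {
public:
    explicit HashedWheel(std::size_t slots) : wheel(slots) {
        assert(slots > 0);
    }

    // Schedule timer `id` to fire `delayTicks` ticks from now.
    // O(1): the absolute fire tick is hashed (modulo) into a slot.
    void schedule(std::uint64_t id, std::uint64_t delayTicks) {
        assert(delayTicks >= 1);
        const std::uint64_t fireAt = now + delayTicks;
        wheel[fireAt % wheel.size()].push_back({id, fireAt});
    }

    // Advance time by one tick and return the ids that fire on it.
    // Only the current slot is examined; entries in that slot belonging
    // to a later rotation of the wheel are left in place.
    std::vector<std::uint64_t> tick() {
        ++now;
        std::vector<std::uint64_t> fired;
        auto& slot = wheel[now % wheel.size()];
        for (auto it = slot.begin(); it != slot.end();) {
            if (it->fireAt == now) {
                fired.push_back(it->id);
                it = slot.erase(it);
            } else {
                ++it;
            }
        }
        return fired;
    }

private:
    struct Entry {
        std::uint64_t id;
        std::uint64_t fireAt; // absolute tick at which the timer fires
    };
    std::vector<std::list<Entry>> wheel;
    std::uint64_t now = 0; // current tick
};
```

This sketch is driven by an external caller invoking `tick()`; how a production library drives its wheel, and whether it can sleep through empty slots, is outside what the sketch covers and would be part of the investigation.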

Attachments

Issue Links

depends on

MB-36956 Migrate to Facebook Folly executors for CPU & IO background tasks

Closed

relates to

MB-47920 Cost Effective Low-End Clusters improve resource utilization

Open

MB-42346 CC: Very high CPU usage while under low load (memcached)

Resolved

split from

MB-39618 Memcached is CPU hungry when HPET clock source used

Closed

Gerrit Reviews

For Gerrit Dashboard: MB-39815
#	Subject	Branch	Project	Status	Code-Review	Verified
130419,27	MB-39815: Add event-driven SyncWrite timeout handling	master	kv_engine	MERGED	+2	+1
162085,13	MB-39815: Add basic SyncWrite timeout test (ep_testsuite)	master	kv_engine	MERGED	+2	+1
162102,23	MB-39815: Change durability_timeout_mode to event-driven	master	kv_engine	MERGED	+2	+1
163571,4	MB-39815: Tighten argument checks in PDM::addSyncWrite	master	kv_engine	MERGED	+2	+1
163596,2	MB-39815: Fix typos / missing @param documentation	master	kv_engine	MERGED	+2	+1
165826,2	MB-39815: Adjust VBucketSyncWriteTimeoutTask expected duration	master	kv_engine	MERGED	+2	+1
179209,2	Cleanup: remove 'polling' durability timeout mode	master	kv_engine	MERGED	+2	+1
198493,3	MB-59022: Set engine correctly for VBucketSyncWriteTimeoutTask	master	kv_engine	MERGED	+2	+1