Loading...

XML

Word

Printable

Details

Type: Improvement
Resolution: Unresolved
Priority: Major
Fix Version/s: Morpheus
Affects Version/s: 7.6.0, 7.0.0, 7.1.0, 7.2.0
Component/s: couchbase-bucket
Labels:
None

Story Points:
0

Description

There are several places in KV-Engine where we want to distribute vBuckets into equal-sized sets.

KV shards - #CPUs
ConnStore::vbConnLocks - 32 mutexes
EPBucket::getFlusher - #CPU flushers
EPBucket::getBgFetcher - #CPU BGFetchers
CheckpointMemRecoveryTask::getVbucketsSortedByChkMem - checkpoint removers
... (might be others)

What we have in all of those places is a finite resource which we want to share between vBuckets in a uniform way. For example, if we have 256 vBuckets on a node, we want 1 vbConnLock per 8 vBuckets.

The way we've done this is by taking vbid % numTasks. This is problematic, because the vBuckets IDs on a node are not consequitive. ns_server computes the vBucket map and decides which vBucket IDs to assign to a given node. In practice, there are gaps in the ranges of vBuckets on a node.

Example

Here's an example to illustrate how that affects the task distribution.

Assuming the following 256 active vBuckets are on a node, such that we have sequences of 16 consequitive numbers, then 48 vbids which are on the other nodes in the cluster, then more vbids. This repeats at interval of 64 vbids.

0-15, 64-79, 128-143, 192-207, 256-271 ... 1008-1023

And we look at the ConnStore::vbConnLocks example where we have 32 locks and distribute them across vBuckets.

Given the vbids above, any vbid % 32 will be less than 16. This means that we will only ever use 16 of the 32 locks, because there might be no vBuckets whose number divided by 32 has a remainer above 15.

This problem would become worse with a bucket configured with 128 vBuckets.

Solution

We should make it such that we equally distribute work regardless of the exact vbids assigned to a node.

We can do this in several ways, here are two:

We could track the "index" of the vBucket in the set of vBuckets, such that the lowest vbid is always index 0, the second lowest is index 1, etc, and we can use those indexes in place of the vbid. This seems like it needs manual bookkeeping as vBuckets are created/destroyed, and results in vBucket's assignment changing at runtime.
We could use a hash function - we only require every output value to be ~ equally likely given an input. This will give a static allocation of vBucket -> hash.

I think we should use the second option - a hash function. We can use crc32(vbid) as we already have that in the code base and crc32c's output is uniform across the 32-bit output space. By pre-computing the crc32c for the known vbids, we can optimise away the computation and crc32(vbid) will turn into a table lookup (1024 entries).

'std::hash<Vbid>' could return this value.

Attachments

Gerrit Reviews

- Issue Only
- Show All Reviews
- Show Open Reviews

No reviews matched the request. Check your Options in the drop-down menu of this sections header.

Activity

People

Assignee:: Vesko Karaganev

Reporter:: Vesko Karaganev

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 26/Jul/24 9:16 AM

Updated:: 29/Jul/24 2:03 AM

Gerrit Reviews

There are no open Gerrit changes

Work distribution is non-uniform for per-vBucket tasks

Details

Description

Attachments

Gerrit Reviews

Activity

People

Dates

Gerrit Reviews

PagerDuty