Details
- Bug
- Resolution: Fixed
- Blocker
- 2.0-beta
- None
- Security Level: Public
- None
Description
256 vbuckets won't scale well for large clusters and data sets.
Per the email thread we had, we should try 1260 vbuckets:
>> sizes_cleanly(1260, 30)
=> [2, 3, 4, 5, 6, 7, 9, 10, 12, 14, 15, 18, 20, 21, 28, 30]
>> sizes_cleanly(1680, 30)
=> [2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 15, 16, 20, 21, 24, 28, 30]
>> sizes_cleanly(1440, 30)
=> [2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 30]
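The `sizes_cleanly` helper isn't shown in the thread; presumably it lists the cluster sizes (from 2 up to some maximum) that divide the vbucket count evenly. A minimal sketch under that assumption:

```ruby
# Hypothetical reconstruction of sizes_cleanly: return the cluster sizes in
# 2..max_nodes that divide num_vbuckets with no remainder, i.e. the "happy"
# cluster sizes for which vbuckets can be spread perfectly evenly.
def sizes_cleanly(num_vbuckets, max_nodes)
  (2..max_nodes).select { |n| num_vbuckets % n == 0 }
end

sizes_cleanly(1260, 30)
# => [2, 3, 4, 5, 6, 7, 9, 10, 12, 14, 15, 18, 20, 21, 28, 30]
```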
So 1260 is divisible by 2, 3, 4, 5, 6, 7, 9, 10, etc. These are "happy" cluster sizes because they divide the vbucket count cleanly, so vbuckets can be balanced perfectly across them (e.g. a 7-node cluster gets exactly 1260/7 = 180 vbuckets per node).
1440 doesn't have 7 but has 8 instead, and it doesn't have 14 but has 16 instead.
So they're quite close, but 1260 has more "happy" cluster sizes below 30 than 1440 (16 versus 15), and its happy sizes skew a bit smaller. Because smaller installations are more frequent, it makes sense to prefer 1260.
Interestingly, up to both 50 and 100 nodes they have the same count of "happy" sizes. In general, happy sizes become rarer as cluster sizes grow anyway.
If 1260 is a bit too large for us, we can consider the other values above. And of course we could rank candidates better by weighting each happy cluster size by the probability of seeing it in the real world.
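That weighted ranking could be sketched as follows. The weight function here is an assumption (1/n, on the premise that smaller clusters are more common); a real ranking would use observed deployment-size frequencies.

```ruby
# "Happy" cluster sizes: sizes that divide the vbucket count evenly.
def happy_sizes(num_vbuckets, max_nodes)
  (2..max_nodes).select { |n| num_vbuckets % n == 0 }
end

# Hypothetical score: weight each happy size n by an assumed probability of
# seeing an n-node cluster in the wild (here 1/n), then sum the weights.
def score(num_vbuckets, max_nodes)
  happy_sizes(num_vbuckets, max_nodes).sum { |n| 1.0 / n }
end

# Compare the candidate vbucket counts from the table above.
[1260, 1440, 1680].each { |v| puts "#{v}: #{score(v, 30).round(3)}" }
```

Under this particular weighting 1260 scores above 1440, matching the preference argued above; with different weights the ranking could shift.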