Details
Improvement
Resolution: Unresolved
Major
CheshireCat.Next
None
1
Description
This MB is a placeholder for a potential Planner sample-based size estimation feature.
Planner currently estimates the size of each new index with a heuristic: it assumes the index will take all of the available memory on every indexer node that is not already above some memory saturation threshold. A better estimate should be possible via statistical sampling of documents. For a given precision, the sample size needed is essentially independent of the population size (number of documents):
https://en.wikipedia.org/wiki/Standard_error#Standard_error_of_the_mean
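For most purposes, a sample of about 1,000 documents is sufficient for a tight estimate at high confidence: the standard error of the mean is σ/√n, so at n = 1,000 the 95% confidence interval is roughly ±0.062σ around the sample mean, where σ is the standard deviation of the sampled sizes.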
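As a rough illustration of the idea, the sketch below scales a sampled mean up to a total size and attaches a confidence margin derived from the standard error of the mean. It is not Planner code: the function name `estimate_total_size`, its parameters, and the assumption that each sampled value is the number of bytes one document would contribute to the index are all hypothetical, and it assumes the sample is drawn uniformly at random.

```python
import math
import statistics


def estimate_total_size(sample_sizes, num_docs, z=1.96):
    """Estimate total index bytes from a random sample of per-document sizes.

    Hypothetical helper, not part of Planner.
    sample_sizes: bytes contributed by each sampled document (assumed input).
    num_docs:     total number of documents in the population.
    z:            normal quantile; 1.96 corresponds to about 95% confidence.
    Returns (estimate, margin), meaning the total is estimate +/- margin.
    """
    n = len(sample_sizes)
    if n < 2:
        raise ValueError("need at least two sampled sizes")
    mean = statistics.fmean(sample_sizes)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    sem = statistics.stdev(sample_sizes) / math.sqrt(n)
    # Scale both the mean and its margin by the population size.
    return mean * num_docs, z * sem * num_docs
```

Note that the margin depends on the sample size and the spread of the sampled values, not on `num_docs` relative to the sample, which is why a fixed sample of around 1,000 documents can serve keyspaces of very different sizes.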