Details
Bug
Resolution: Unresolved
Major
Cheshire-Cat
Untriaged
1
Unknown
Description
request_handler.go consolideIndexStatus() consolidates status across all partitions of a given index instance (and also conflates this with consolidating the source and target of an index that is in the process of moving). In doing so it intends to compute the consolidated IndexStatus fields Completion and Progress as the unweighted average across all partitions (and across the source and target of a move), but the computations are incorrect:
s2.Completion = (s2.Completion + status.Completion) / 2
s2.Progress = (s2.Progress + status.Progress) / 2.0
Each update weights the most recently seen partition at 50% of the running average instead of weighting all partitions equally. The weight of every earlier partition is halved each time another one is folded in, so the earlier a partition is seen, the more heavily it is discounted (the first two partitions share the smallest weight). For example, in the final average:
- 3 partitions will be weighted 1/4, 1/4, 1/2
- 4 partitions will be weighted 1/8, 1/8, 1/4, 1/2
- 5 partitions will be weighted 1/16, 1/16, 1/8, 1/4, 1/2
and so on. The correct unweighted average cannot be computed with a fixed 50/50 running update. The number of values seen must be tracked, for example by capturing both the sum and the count and computing the average at the end.
2021-09-21 update: The same problem exists for the avgScanRate (avg_scan_rate) statistic:
scan_coordinator.go handleStats()
for _, pid := range partitions {
	...
	scanRate := float64(numRowsScanned-partnStats.lastNumRowsScanned.Value()) / elapsed
	partnStats.avgScanRate.Set(int64((scanRate + float64(partnStats.avgScanRate.Value())) / 2))
	...
}