Details
-
Bug
-
Resolution: Unresolved
-
Major
-
6.0.0
-
None
-
Untriaged
-
Unknown
Description
If there are doc updates or deletions, scorch currently always performs an "AndNot" operation between two roaring bitmaps to compute a third roaring bitmap that represents the logical "all - excluded" set (all the docs minus the excluded, i.e. obsoleted, docs).
https://github.com/blevesearch/bleve/blob/master/index/scorch/segment/zap/posting.go#L253
But if the cardinality of the excluded roaring bitmap (the obsoleted docNums) is low, it might not be worth allocating and populating a brand-new, ephemeral roaring bitmap, given the memory overhead. Instead, scorch could keep the excluded bitmap as a separate lookup and check each docNum against it.
Experiments would be needed to find the right heuristics and to confirm whether there are actual performance improvements here.
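To make the proposal concrete, here is a minimal sketch of the two strategies side by side. It does not use the roaring library or any scorch code: `docSet` is a toy stand-in for a roaring bitmap (a sorted slice of docNums), and `lowCardinalityThreshold`, `liveDocs` and `collect` are hypothetical names. The cutoff value of 64 is made up; the right value is exactly what the experiments would have to determine.

```go
package main

import (
	"fmt"
	"sort"
)

// docSet is a toy stand-in for a roaring bitmap: a sorted slice of docNums.
type docSet []uint32

// contains reports whether d is in the set, using binary search.
func (s docSet) contains(d uint32) bool {
	i := sort.Search(len(s), func(i int) bool { return s[i] >= d })
	return i < len(s) && s[i] == d
}

// andNot is the eager strategy: it allocates a new set holding
// "all - excluded", analogous to what the ticket describes scorch doing today.
func andNot(all, excluded docSet) docSet {
	out := make(docSet, 0, len(all))
	for _, d := range all {
		if !excluded.contains(d) {
			out = append(out, d)
		}
	}
	return out
}

// lowCardinalityThreshold is an assumed, untuned cutoff for illustration only.
const lowCardinalityThreshold = 64

// liveDocs calls visit for every docNum in all that is not in excluded.
// With few excluded docs it skips the allocation and filters on the fly;
// otherwise it falls back to materializing "all - excluded".
// The return value reports whether the ephemeral set was allocated.
func liveDocs(all, excluded docSet, visit func(uint32)) (materialized bool) {
	if len(excluded) > lowCardinalityThreshold {
		for _, d := range andNot(all, excluded) {
			visit(d)
		}
		return true
	}
	for _, d := range all {
		if !excluded.contains(d) {
			visit(d)
		}
	}
	return false
}

// collect gathers the visited docNums into a slice, for demonstration.
func collect(all, excluded docSet) ([]uint32, bool) {
	var got []uint32
	materialized := liveDocs(all, excluded, func(d uint32) { got = append(got, d) })
	return got, materialized
}

func main() {
	all := docSet{1, 2, 3, 5, 8, 13}
	excluded := docSet{2, 8}
	got, materialized := collect(all, excluded)
	fmt.Println(got, materialized)
}
```

The lazy path trades one allocation for a membership check per visited docNum, so whether it wins depends on how many postings are iterated and how cheap the lookup is in the real bitmap implementation.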
Relatedly, roaring bitmaps should have some "copy-on-write" capabilities, so those should be explored too: structural sharing of existing bitmap instances might be a big win.
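As a rough illustration of what structural sharing could buy, the sketch below implements a toy copy-on-write bitmap. It is not the roaring library's mechanism: `cowBitmap`, `container` and all method names are hypothetical, and the 256-docNum chunk size is chosen only to keep the example small. A clone shares every chunk with its source and copies a chunk only when it is first written.

```go
package main

import "fmt"

// container is a hypothetical fixed-size chunk covering 256 docNums.
type container struct {
	words [4]uint64
}

// cowBitmap maps a chunk key (docNum / 256) to its container. Keys in
// shared may be referenced by another bitmap and must be copied before
// they are mutated.
type cowBitmap struct {
	chunks map[uint32]*container
	shared map[uint32]bool
}

func newCowBitmap() *cowBitmap {
	return &cowBitmap{chunks: map[uint32]*container{}, shared: map[uint32]bool{}}
}

// writable returns a container for key that is private to b, creating it
// or copying a shared one if needed.
func (b *cowBitmap) writable(key uint32) *container {
	c, ok := b.chunks[key]
	if ok && !b.shared[key] {
		return c
	}
	fresh := &container{}
	if ok {
		*fresh = *c // copy-on-write: duplicate only this chunk
	}
	b.chunks[key] = fresh
	delete(b.shared, key)
	return fresh
}

func (b *cowBitmap) add(d uint32) {
	b.writable(d / 256).words[(d%256)/64] |= uint64(1) << (d % 64)
}

func (b *cowBitmap) remove(d uint32) {
	if _, ok := b.chunks[d/256]; !ok {
		return
	}
	b.writable(d / 256).words[(d%256)/64] &^= uint64(1) << (d % 64)
}

func (b *cowBitmap) contains(d uint32) bool {
	c, ok := b.chunks[d/256]
	return ok && c.words[(d%256)/64]&(uint64(1)<<(d%64)) != 0
}

// clone returns a bitmap that shares every container with b. Both sides
// mark the chunks as shared, so whichever writes first makes its own copy.
func (b *cowBitmap) clone() *cowBitmap {
	out := newCowBitmap()
	for k, c := range b.chunks {
		out.chunks[k] = c
		out.shared[k] = true
		b.shared[k] = true
	}
	return out
}

// sharedChunks counts the containers that b and other still have in common.
func (b *cowBitmap) sharedChunks(other *cowBitmap) int {
	n := 0
	for k, c := range b.chunks {
		if other.chunks[k] == c {
			n++
		}
	}
	return n
}

func main() {
	all := newCowBitmap()
	for d := uint32(0); d < 1024; d++ { // fills 4 chunks
		all.add(d)
	}
	live := all.clone() // no chunk data copied yet
	live.remove(300)    // copies only the chunk holding docNum 300
	fmt.Println(all.contains(300), live.contains(300), live.sharedChunks(all))
}
```

In this toy model, deriving the live-docs bitmap from the full bitmap with a handful of exclusions copies only the chunks that contain an excluded docNum. The sharing flags are conservative: after one side copies a chunk, the other side still treats its own copy as shared and will duplicate it on its next write.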