Details
Bug
Resolution: Unresolved
Major
Description
We allow a user to set SetSeqTreeDataBlockSize up to 128KB. This means Magma should accumulate up to 128KB of data before forming a block and then apply block compression to it. For example, if we write two documents of 64KB each (after Snappy per-document compression), they should be placed together in a single block, which is then block-compressed.
However, another setting can prevent SetSeqTreeDataBlockSize from taking effect: MinValueBlockSizeThreshold, which is internally set to 64KB. If a document (after Snappy per-document compression) is larger than this threshold, it goes into its own single-item block, with no block compression applied at all. In the previous example, MinValueBlockSizeThreshold therefore causes the two documents to go into separate blocks, and the intended effect of SetSeqTreeDataBlockSize is lost.
We should set MinValueBlockSizeThreshold to max(64KB, SetSeqTreeDataBlockSize), so that the threshold is never lower than the configured block size.
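To make the interaction concrete, here is a minimal sketch of the block-placement decision described above. It is a simplified model, not Magma's actual code: the names countBlocks, proposedThreshold and demo are hypothetical. The threshold comparison is assumed to be inclusive (>=) so that the 64KB example reproduces; the real comparison in Magma may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kKB = 1024;

// Proposed fix from this ticket: the threshold never drops below the
// configured data block size. (Hypothetical helper name.)
constexpr std::size_t proposedThreshold(std::size_t blockSize) {
    return std::max(64 * kKB, blockSize);
}

// Simplified model of block formation (an assumption, not Magma's code):
// a document at or above the threshold gets its own single-item block;
// smaller documents accumulate until blockSize is reached.
inline std::size_t countBlocks(const std::vector<std::size_t>& docSizes,
                               std::size_t blockSize,
                               std::size_t minValueBlockSizeThreshold) {
    std::size_t blocks = 0;
    std::size_t pending = 0;  // bytes accumulated in the current shared block
    for (std::size_t size : docSizes) {
        if (size >= minValueBlockSizeThreshold) {
            if (pending > 0) {  // flush the partially filled shared block
                ++blocks;
                pending = 0;
            }
            ++blocks;  // standalone block holding only this document
        } else {
            pending += size;
            if (pending >= blockSize) {  // shared block is full
                ++blocks;
                pending = 0;
            }
        }
    }
    if (pending > 0) {
        ++blocks;  // flush the final partial block
    }
    return blocks;
}

// Reproduces the example from the description.
inline void demo() {
    const std::vector<std::size_t> docs = {64 * kKB, 64 * kKB};
    // Current behaviour: fixed 64KB threshold, two single-item blocks.
    assert(countBlocks(docs, 128 * kKB, 64 * kKB) == 2);
    // Proposed behaviour: both documents share one 128KB block.
    assert(countBlocks(docs, 128 * kKB, proposedThreshold(128 * kKB)) == 1);
}
```

With a block size below 64KB, proposedThreshold still returns 64KB, so the existing behaviour for small block sizes is unchanged under this sketch.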