Description
As introduced by MB-9467, there are two limits on the size of documents during indexing:
1) indexer_max_doc_size - documents larger than this value are skipped by the
indexer. A message is logged (with document ID, its size, bucket name, view name, etc.)
when such a document is encountered. A value of 0 means no limit (as was the case
before MB-9467). The current default value is 1048576 bytes (1 MB). This is already a very large
value; documents of that size take a long time to process, which slows down rebalance, etc.
2) max_kv_size_per_doc - maximum total size (in bytes) of the KV pairs that can be emitted for
a single document for a single view. When this limit is exceeded, a message is logged (with
document ID, its size, bucket name, view name, etc.). A value of 0 means no limit (as was
the case before MB-9467). The current default value is 1048576 bytes (1 MB), which is already
too large a value and makes indexing far from efficient.
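To make the semantics of the two limits concrete for whoever writes the docs, here is a minimal Python sketch. It is not the indexer's actual code: the function names, the use of JSON-encoded byte length as the "size", and the choice to drop the emitted pairs when max_kv_size_per_doc is exceeded are all assumptions for illustration (this ticket only states that a message is logged in that case).

```python
import json
import logging

log = logging.getLogger("indexer_sketch")

# Default values as stated in this ticket; 0 means "no limit".
INDEXER_MAX_DOC_SIZE = 1048576
MAX_KV_SIZE_PER_DOC = 1048576


def exceeds(size, limit):
    """A limit of 0 disables the check entirely."""
    return limit != 0 and size > limit


def json_size(value):
    """Assumption: size is measured as the UTF-8 length of the JSON encoding."""
    return len(json.dumps(value).encode("utf-8"))


def index_document(doc_id, doc, map_fn,
                   max_doc_size=INDEXER_MAX_DOC_SIZE,
                   max_kv_size=MAX_KV_SIZE_PER_DOC):
    """Return the KV pairs one view would index for one document (hypothetical helper)."""
    doc_size = json_size(doc)
    if exceeds(doc_size, max_doc_size):
        # indexer_max_doc_size: the document is skipped and a message is logged.
        log.warning("skipping doc %s: size %d > indexer_max_doc_size %d",
                    doc_id, doc_size, max_doc_size)
        return []

    emitted = list(map_fn(doc))
    kv_size = sum(json_size(k) + json_size(v) for k, v in emitted)
    if exceeds(kv_size, max_kv_size):
        # max_kv_size_per_doc: a message is logged; dropping the pairs is an assumption.
        log.warning("doc %s: emitted KV size %d > max_kv_size_per_doc %d",
                    doc_id, kv_size, max_kv_size)
        return []
    return emitted
```

For example, calling index_document with max_doc_size=0 and max_kv_size=0 never skips anything, which matches the "no limit" behaviour described above.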
There is no mention of these anywhere in the documentation at present, and so they can be confusing to users who find that certain (large) documents are inexplicably not indexed.
I note there is an outstanding 3.0 bug (MB-9713) to add REST endpoints for these - currently you have to use a magic diag/eval to change them - but we should at least mention their existence and default values even if the REST API isn't ready yet.
- -
We probably should also document this:
3) function_timeout - maximum time a mapreduce function can take for any one document.
If an invocation takes longer than this, it is aborted. The default limit is 10 seconds.
Setting it to 0 disables the timeout (not recommended).
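The function_timeout behaviour could be illustrated along the same lines. Again, this is only a sketch under assumptions: run_with_timeout and MapTimeout are made-up names, and a Python thread cannot really be killed, so the sketch stops waiting for the result rather than aborting the function the way the real engine does.

```python
import concurrent.futures

# Default from this ticket, in seconds; 0 disables the timeout.
FUNCTION_TIMEOUT = 10


class MapTimeout(Exception):
    """Raised when a map function runs longer than the allowed time (hypothetical)."""


def run_with_timeout(func, doc, timeout_seconds=FUNCTION_TIMEOUT):
    """Run func(doc), giving up once timeout_seconds has elapsed."""
    if timeout_seconds == 0:
        # 0 means no limit: run the function for as long as it takes.
        return func(doc)

    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, doc)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        raise MapTimeout(
            f"function exceeded function_timeout of {timeout_seconds}s"
        ) from None
    finally:
        # Do not block on the worker; it finishes in the background.
        pool.shutdown(wait=False)
```

With timeout_seconds=0 a slow function always runs to completion, which is why the ticket flags that setting as not recommended.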