Description
Currently, we inject PROJECT operators greedily between operators to reduce the size of intermediate results; however, this might still not be efficient enough. To illustrate, consider the following query:
SELECT MAX(r.temp)
FROM ExperDatasetRow e, e.readings r
Plan:
distribute result [$$48] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- DISTRIBUTE_RESULT |UNPARTITIONED|
|
exchange [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
|
project ([$$48]) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- STREAM_PROJECT |UNPARTITIONED|
|
assign [$$48] <- [{"$1": $$50}] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- ASSIGN |UNPARTITIONED|
|
aggregate [$$50] <- [agg-global-sql-max($$53)] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- AGGREGATE |UNPARTITIONED|
|
exchange [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- RANDOM_MERGE_EXCHANGE |PARTITIONED|
|
aggregate [$$53] <- [agg-local-sql-max($$46)] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- AGGREGATE |PARTITIONED|
|
project ([$$46]) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- STREAM_PROJECT |PARTITIONED|
|
assign [$$46] <- [$$r.getField("temp")] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- ASSIGN |PARTITIONED|
|
project ([$$r]) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- STREAM_PROJECT |PARTITIONED|
|
unnest $$r <- scan-collection($$51) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- UNNEST |PARTITIONED|
|
project ([$$51]) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- STREAM_PROJECT |PARTITIONED|
|
assign [$$51] <- [$$e.getField("readings")] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- ASSIGN |PARTITIONED|
|
project ([$$e]) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- STREAM_PROJECT |PARTITIONED|
|
exchange [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
|
data-scan []<-[$$49, $$e] <- FullDataverse.ExperDatasetRow [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- DATASOURCE_SCAN |PARTITIONED|
|
exchange [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
|
empty-tuple-source [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- EMPTY_TUPLE_SOURCE |PARTITIONED|
|
We see that the UNNEST operator is followed by a PROJECT operator:
project ([$$r]) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- STREAM_PROJECT |PARTITIONED|
|
unnest $$r <- scan-collection($$51) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]
|
-- UNNEST |PARTITIONED|
|
Let's say the frame size is 32KB (the current default) and the size of the array "readings" is close to the frame size itself (i.e., a little less than 32KB). Then, UNNEST would flush output frames rapidly, as each frame can only accommodate the array plus one or two of the unnested elements. However, the array itself is not needed at all: it is projected out by project ([$$r]), and only the unnested elements are needed after UNNEST. This is wasteful and is one of the main reasons that unnesting large arrays is so slow.
If the UNNEST itself did not output the array (i.e., only produced the unnested elements), those unnested elements would have the full 32KB to themselves, drastically reducing the number of frames produced by the UNNEST operator.
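A back-of-the-envelope sketch of the frame counts makes the waste concrete. This is not AsterixDB code; it is a hypothetical illustration with assumed sizes (a ~31KB array, 100-byte elements, 10,000 elements) chosen to match the scenario above:

```python
FRAME_SIZE = 32 * 1024   # 32KB, the current default frame size

def frames_needed(tuple_size, num_tuples, frame_size=FRAME_SIZE):
    """Frames produced if each output tuple occupies tuple_size bytes."""
    tuples_per_frame = max(1, frame_size // tuple_size)
    return -(-num_tuples // tuples_per_frame)  # ceiling division

# Assumed sizes, for illustration only.
array_size = 31 * 1024   # the "readings" array: a little under one frame
elem_size = 100          # one unnested element
num_elems = 10_000       # elements in the array across all records

# Today: each UNNEST output tuple still carries the array ($$51)
# alongside the unnested element ($$r), so ~1 tuple fits per frame.
with_array = frames_needed(array_size + elem_size, num_elems)

# Proposed: UNNEST emits only the unnested element, so hundreds of
# tuples fit per frame.
without_array = frames_needed(elem_size, num_elems)

print(with_array, without_array)  # 10000 vs. 31 frames
```

Under these assumed numbers, dropping the array from the UNNEST output cuts the frame count by roughly the ratio of array size to element size, which is where the slowdown on large arrays comes from.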