Details
-
Bug
-
Resolution: Fixed
-
Critical
-
7.2.0
-
Untriaged
-
0
-
Unknown
-
Analytics CBO Sprint 18, Analytics CBO Sprint 19, Analytics CBO Sprint 20, Analytics CBO Sprint 21, Analytics Sprint 24
Description
For the following CH2 query fragment:
SELECT count(*) as revenue
|
FROM orders o, o.o_orderline ol
|
WHERE ol.ol_delivery_d >= '2016-01-01 00:00:00.000000' |
AND ol.ol_delivery_d < '2017-01-01 00:00:00.000000'; |
Cardinality of orders is 300K, there are an average of 10 orderlines per orders, so the number of orderlines is ~3M.
This query returns 320378 orderlines, so the selectivity (actual, not estimated) of the predicate is 320378/3M = 0.106
When we estimate the selectivity using samples, the sample query returns 922 docs, but since we create the sample on orders (whose sample size is 1063), we compute the selectivity as 922/1063 = 0.867, which is clearly incorrect. Since the selectivity is too high, we do not choose an index scan, which leads to poor performance.
The selectivity should be computed against the cardinality of the orderlines in the sample which is 10630, so estimated selectivity should be 922/10630 = 0.0867. When the query has an UNNEST operator, we need to account for an "unnesting factor" in the sample size used for selectivity estimation.
Attachments
Issue Links
- relates to
-
MB-55346 [CX] Add cardinality estimations for UNNEST & Co.
- Resolved
- links to