Description
For predicates involving IN-list, the planner can generate multiple spans for elements in the IN-list such that precise index scan can be performed. For example, assuming an index exists with c1 as the leading index key, the predicate c1 IN [1,2,3] generates 3 spans:
"spans": [ |
{
|
"exact": true, |
"range": [ |
{
|
"high": "1", |
"inclusion": 3, |
"low": "1" |
}
|
]
|
},
|
{
|
"exact": true, |
"range": [ |
{
|
"high": "2", |
"inclusion": 3, |
"low": "2" |
}
|
]
|
},
|
{
|
"exact": true, |
"range": [ |
{
|
"high": "3", |
"inclusion": 3, |
"low": "3" |
}
|
]
|
}
|
],
|
Note that such span generation also works if individual elements of the IN-list use host variables (this requires the number of elements in the IN-list is known at prepare time), for example, predicate c1 IN [$1, $2, $3] generates:
"spans": [ |
{
|
"exact": true, |
"range": [ |
{
|
"high": "$1", |
"inclusion": 3, |
"low": "$1" |
}
|
]
|
},
|
{
|
"exact": true, |
"range": [ |
{
|
"high": "$2", |
"inclusion": 3, |
"low": "$2" |
}
|
]
|
},
|
{
|
"exact": true, |
"range": [ |
{
|
"high": "$3", |
"inclusion": 3, |
"low": "$3" |
}
|
]
|
}
|
],
|
However, if the number of elements in the IN-list is not known at prepare time, and the entire IN-list is passed in as a host variable, then currently the planner does not know how many spans to generate before hand, and thus it generates a single span with the low and high expression generated as ARRAY_MIN($inlist) and ARRAY_MAX($inlist). For predicate c1 IN $inlist, here is the span generated currently:
"spans": [ |
{
|
"exact": true, |
"range": [ |
{
|
"high": "array_max($inlist)", |
"inclusion": 3, |
"low": "array_min($inlist)" |
}
|
]
|
}
|
],
|
This can have performance implications if the range of values between ARRAY_MIN() and ARRAY_MAX() is large and thus a large region of an index needs to be scanned. Depending on circumstance most of the index keys returned from index node to query node would not satisfy the re-evaluation of the IN-list and thus are thrown out. If the IN-list is large (many elements) the reevaluation of the IN-list predicate can also be problematic (which is addressed in a separate improvement MB-25652). The large index scan, and subsequent transport of large number of index key from index node to query node, plus reevaluation of IN-list predicate on each returned index key, all contribute to performance problems in such situations. In fact there has been many customer issues related to performance problems when an IN-list is specified as a host variable, for example CBSE-3859, CBSE-4495.