Summary:
When the following UNNEST query is run via CBQ with a query memory quota in place, it returns partial results and a "Request has exceeded memory quota" error.
When the same query is run with LIMIT n (LIMIT 55 in this case; on a different cluster the equivalent threshold was LIMIT 32), it runs successfully.
When the same query is run with LIMIT n+1 (we used LIMIT 56), it returns the error and usedMemory is significantly higher. There appears to be a point n beyond which usedMemory jumps dramatically, e.g.:
No LIMIT clause:
SELECT route.sourceairport, route.destinationairport, sched.flight, sched.utc
FROM `travel-sample`.inventory.route
UNNEST schedule sched
WHERE sched.day = 1;
"errors": [
|
{
|
"code": 5500,
|
"msg": "Request has exceeded memory quota"
|
}
|
],
|
"status": "errors",
|
"metrics": {
|
"elapsedTime": "32.153754ms",
|
"executionTime": "31.647172ms",
|
"resultCount": 11,
|
"resultSize": 1419,
|
"serviceLoad": 25,
|
"usedMemory": 18446744073709545076,
|
"errorCount": 1
|
}
|
}
|
LIMIT n used (55 in this example):
SELECT route.sourceairport, route.destinationairport, sched.flight, sched.utc
FROM `travel-sample`.inventory.route
UNNEST schedule sched
WHERE sched.day = 1
LIMIT 55;
"status": "success",
|
"metrics": {
|
"elapsedTime": "32.937281ms",
|
"executionTime": "32.851006ms",
|
"resultCount": 55,
|
"resultSize": 7095,
|
"serviceLoad": 25,
|
"usedMemory": 16116
|
}
|
}
|
LIMIT n+1 used (56) (usedMemory is significantly higher):
SELECT route.sourceairport, route.destinationairport, sched.flight, sched.utc
FROM `travel-sample`.inventory.route
UNNEST schedule sched
WHERE sched.day = 1
LIMIT 56;
"errors": [
|
{
|
"code": 5500,
|
"msg": "Request has exceeded memory quota"
|
}
|
],
|
"status": "errors",
|
"metrics": {
|
"elapsedTime": "47.781226ms",
|
"executionTime": "47.781053ms",
|
"resultCount": 56,
|
"resultSize": 7224,
|
"serviceLoad": 25,
|
"usedMemory": 18446744073709502869,
|
"errorCount": 1
|
}
|
}
|
The value of n where the issue starts to occur seems to be different in different clusters.
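As a side note, the oversized usedMemory values in the failing runs are just below 2^64, which looks more like an unsigned 64-bit counter wrapping below zero than like genuine memory usage; this is only an observation about the reported numbers, not a confirmed root cause. A quick arithmetic check (plain Python, no Couchbase dependency):

# Check how far below the 2^64 wrap point the reported usedMemory values sit.
# The constants are copied from the metrics blocks above.
REPORTED = [18446744073709545076, 18446744073709502869]

for value in REPORTED:
    deficit = 2**64 - value  # distance below zero if the counter wrapped
    print(f"{value} = 2^64 - {deficit}")

Both values sit only slightly below 2^64 (6540 and 48747 below it, respectively), suggesting the tracked value went marginally negative and wrapped rather than the quota genuinely being exhausted.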
Steps To Reproduce:
- Spin up a test cluster.
- Deploy the travel-sample bucket.
- In the UI, under Settings -> Advanced Query Settings -> Memory Quota, set the limit to 400 MB (a REST-based alternative is sketched after these setup steps).
- Connect to the query service with cbq:
./cbq -e http://localhost:8091 -u Administrator -p password
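- Alternatively, the memory quota can be set without the UI via the cluster-level query settings endpoint. The endpoint path and the queryMemoryQuota field in the sketch below are assumptions based on the 7.x REST settings API rather than something verified as part of this reproduction, so check them against the documentation for the server version under test:

# Hypothetical alternative to the UI step: set the cluster-level query memory
# quota (in MB) via the cluster manager. The endpoint (/settings/querySettings)
# and field name (queryMemoryQuota) are assumptions to verify for the server
# version under test.
import base64
import urllib.parse
import urllib.request

creds = base64.b64encode(b"Administrator:password").decode()
body = urllib.parse.urlencode({"queryMemoryQuota": 400}).encode()

req = urllib.request.Request(
    "http://localhost:8091/settings/querySettings",
    data=body,
    headers={"Authorization": f"Basic {creds}"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())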
* Execute the following query from CBQ:
SELECT route.sourceairport, route.destinationairport, sched.flight, sched.utc FROM `travel-sample`.inventory.route UNNEST schedule sched WHERE sched.day = 1;
* Observe the query returns an error and `usedMemory` is high, with partial results returned:
"errors": [
|
{
|
"code": 5500,
|
"msg": "Request has exceeded memory quota"
|
}
|
],
|
"status": "errors",
|
"metrics": {
|
"elapsedTime": "29.24446ms",
|
"executionTime": "29.191898ms",
|
"resultCount": 11,
|
"resultSize": 1419,
|
"serviceLoad": 25,
|
"usedMemory": 18446744073709545076,
|
"errorCount": 1
|
}
|
}
|
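* Optionally, issue the same statement directly against the Query REST API so the full response, including the metrics block, can be captured programmatically. A minimal sketch, assuming the query service on its default port 8093 on localhost and the Administrator/password credentials from the cbq step:

# Minimal sketch: run the reproduction statement via the Query REST API and
# print the status and metrics fields from the JSON response. Host, port, and
# credentials are taken from the reproduction steps above.
import base64
import json
import urllib.error
import urllib.parse
import urllib.request

STATEMENT = (
    "SELECT route.sourceairport, route.destinationairport, "
    "sched.flight, sched.utc "
    "FROM `travel-sample`.inventory.route "
    "UNNEST schedule sched "
    "WHERE sched.day = 1;"
)

auth = base64.b64encode(b"Administrator:password").decode()
req = urllib.request.Request(
    "http://localhost:8093/query/service",
    data=urllib.parse.urlencode({"statement": STATEMENT}).encode(),
    headers={"Authorization": f"Basic {auth}"},
)
try:
    resp = urllib.request.urlopen(req)
except urllib.error.HTTPError as err:  # error responses still carry a JSON body
    resp = err
result = json.loads(resp.read())
print(result.get("status"), json.dumps(result.get("metrics", {}), indent=2))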
* Try the same with a high LIMIT clause (I used LIMIT 56) and observe the query also returns the same error:
SELECT route.sourceairport, route.destinationairport, sched.flight, sched.utc
FROM `travel-sample`.inventory.route
UNNEST schedule sched
WHERE sched.day = 1
LIMIT 56;
"errors": [
|
{
|
"code": 5500,
|
"msg": "Request has exceeded memory quota"
|
}
|
],
|
"status": "errors",
|
"metrics": {
|
"elapsedTime": "59.381922ms",
|
"executionTime": "59.326452ms",
|
"resultCount": 56,
|
"resultSize": 7224,
|
"serviceLoad": 25,
|
"usedMemory": 18446744073709502869,
|
"errorCount": 1
|
}
|
}
|
* Run the query again with progressively lower LIMIT values; eventually there is a point n at which the query runs successfully with no error (LIMIT 55 on my cluster):
SELECT route.sourceairport, route.destinationairport, sched.flight, sched.utc FROM `travel-sample`.inventory.route UNNEST schedule sched WHERE sched.day = 1
LIMIT 55;
"status": "success",
|
"metrics": {
|
"elapsedTime": "32.877363ms",
|
"executionTime": "32.822379ms",
|
"resultCount": 55,
|
"resultSize": 7095,
|
"serviceLoad": 25,
|
"usedMemory": 16116
|
}
|
}
|
* I.e. LIMIT 55 works, but LIMIT 56 doesn't.
- Everything below 55 also works, and everything above 56 doesn't.
- The threshold differs between clusters: on another cluster, LIMIT 32 (and below) worked while LIMIT 33 (and higher) showed the issue.
- Reproducing the issue may therefore require testing different values of LIMIT n; a small sweep script is sketched below.
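A minimal sketch of such a sweep, reusing the Query REST API call from the earlier step (query service assumed on localhost:8093 with the Administrator/password credentials used above); it prints the status and usedMemory for each LIMIT value so the threshold n stands out:

# Minimal sketch: sweep LIMIT values over the UNNEST query and report status
# and usedMemory for each, to locate the threshold n on a given cluster.
# Host, port, and credentials match the reproduction steps above.
import base64
import json
import urllib.error
import urllib.parse
import urllib.request

BASE_QUERY = (
    "SELECT route.sourceairport, route.destinationairport, "
    "sched.flight, sched.utc "
    "FROM `travel-sample`.inventory.route "
    "UNNEST schedule sched "
    "WHERE sched.day = 1 "
)
AUTH = base64.b64encode(b"Administrator:password").decode()


def run(limit):
    """Run the query with the given LIMIT and return the parsed response."""
    body = urllib.parse.urlencode({"statement": BASE_QUERY + f"LIMIT {limit};"}).encode()
    req = urllib.request.Request(
        "http://localhost:8093/query/service",
        data=body,
        headers={"Authorization": f"Basic {AUTH}"},
    )
    try:
        resp = urllib.request.urlopen(req)
    except urllib.error.HTTPError as err:  # failed requests still return a JSON body
        resp = err
    return json.loads(resp.read())


for limit in range(1, 101):
    result = run(limit)
    metrics = result.get("metrics", {})
    print(limit, result.get("status"), metrics.get("usedMemory"))

If the bug reproduces, the output should flip from success to errors at the cluster-specific threshold (55/56 in the example above), with usedMemory jumping to a value just below 2^64 at the same point.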