Description
When the query timeout is reached underneath bulkGet, the operation must abort. As it stands, this can make the query service unusable for some time.
If a query timeout is set, we pass it as the deadline to go-couchbase, which in turn sets it as the deadline on the TCP connection (if nothing is set, we pass 2m).
With the recent changes, a ReadTimeout discards the connection (which is right), but the error is then ignored and the operation is retried.
Since the deadline has already been reached, we should not retry. Retrying makes the new connection fail as well and makes things worse: it consumes all CPU resources and even leaves network ports unusable until TIME_WAIT expires.
Repro:
Install travel-sample.
\set -timeout "5ms";
select type, country, x FROM `travel-sample` where type = "airline" limit 10;
The first select will fail. Try again and check query.log.
A few of these errors are expected because we fetch in parallel, but in this case they keep coming.
2021-03-13T18:08:42.200-08:00 [ERROR] Transmit failed in GetBulkAll write tcp 127.0.0.1:57814->127.0.0.1:11210: i/o timeout |
2021-03-13T18:08:42.200-08:00 [ERROR] Transmit failed in GetBulkAll write tcp 127.0.0.1:57816->127.0.0.1:11210: i/o timeout |
2021-03-13T18:08:42.200-08:00 [ERROR] Transmit failed in GetBulkAll write tcp 127.0.0.1:57815->127.0.0.1:11210: i/o timeout |
2021-03-13T18:08:42.209-08:00 [ERROR] Transmit failed in GetBulkAll write tcp 127.0.0.1:57821->127.0.0.1:11210: i/o timeout |
2021-03-13T18:08:42.209-08:00 [ERROR] Transmit failed in GetBulkAll write tcp 127.0.0.1:57819->127.0.0.1:11210: i/o timeout |
2021-03-13T18:08:42.209-08:00 [ERROR] Transmit failed in GetBulkAll write tcp 127.0.0.1:57824->127.0.0.1:11210: i/o timeout |
2021-03-13T18:08:42.209-08:00 [ERROR] Transmit failed in GetBulkAll write tcp 127.0.0.1:57822->127.0.0.1:11210: i/o timeout |
After some time (no more usable ports):
2021-03-13T18:09:31.182-08:00 [INFO] Pool Get returned travel-sample: dial tcp 127.0.0.1:11210: connect: can't assign requested address |
2021-03-13T18:09:31.184-08:00 [INFO] Pool Get returned travel-sample: dial tcp 127.0.0.1:11210: connect: can't assign requested address |
2021-03-13T18:09:31.185-08:00 [INFO] Pool Get returned travel-sample: dial tcp 127.0.0.1:11210: connect: can't assign requested address |
2021-03-13T18:09:31.186-08:00 [INFO] Pool Get returned travel-sample: dial tcp 127.0.0.1:11210: connect: can't assign requested address |
2021-03-13T18:09:31.193-08:00 [INFO] Pool Get returned travel-sample: dial tcp 127.0.0.1:11210: connect: can't assign requested address |
2021-03-13T18:09:31.195-08:00 [INFO] Pool Get returned travel-sample: dial tcp 127.0.0.1:11210: connect: can't assign requested address |
I think this even makes memcached crash
Issue Links
- is caused by MB-30800 go-couchbase does not discard connections that have timeout on read. (Closed)