Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Affects Version/s: 4.0.0, 4.1.2, 4.5.1, 4.6.4, 5.0.1, Cheshire-Cat, 6.0.2, 5.5.5, 6.6.1
- Triage: Untriaged
- 1
- Unknown
Description
Marking it as a query issue, but it affects any component that uses go-couchbase or gomemcached.
When a gomemcached Get() ends prematurely (the most likely cause being a golang timeout),
client/transport.go:getResponse() leaves the response structure uninitialized, which means it carries a 0 status.
That status is not treated as fatal, so the connection is eventually reused and incorrectly reads data left behind on the wire.
For Gerrit Dashboard: MB-42555

| # | Subject | Branch | Project | Status | CR | V |
|---|---|---|---|---|---|---|
| 139570,1 | MB-42555 mark connections as unusable on golang / os socket errors | mad-hatter | gomemcached | MERGED | +2 | +1 |
| 139715,2 | MB-42555 mark connections as unusable on golang / os socket errors | master | gomemcached | MERGED | +2 | +1 |
We still see the error in 6.6.1-9177:
tid: 0 loop: 22
{"elapsedTime": "1.489946ms", "executionTime": "1.300617ms", "resultSize": 0, "resultCount": 0, "errorCount": 2}[
{"msg": "Error performing bulk get operation - cause: read tcp 127.0.0.1:47996->127.0.0.1:11210: i/o timeout", "code": 12008, "retry": true},
{"msg": "Timeout 1ms exceeded", "code": 1080, "retry": true}]
Using Sitaram's script in the CBSE and his repro steps:
- VM: 8 CPU, 8 GB machine
- Single-node cluster, 2 GB data node
- Bucket onebigjson, 2 GB
- Create primary index on onebigjson;
- SELECT META(d).id, d.* FROM onebigjson AS d LIMIT 1;
- The timeout is random, as Marco Greco suggested
- /opt/couchbase/bin/cbworkloadgen -n 127.0.0.1:8091 -u Administrator -p password -j -b onebigjson -i 1000 -s 2000000
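The steps above can be driven by a loop like the following sketch. The query-service URL, the credentials, the 1–10ms timeout range, and the `CB_REPRO` guard are assumptions for illustration, not part of the original script:

```shell
#!/bin/bash
# Hypothetical repro driver: run the query repeatedly with a small random
# request timeout so some executions abandon the response mid-read and
# exercise the connection-reuse path from the bug.

QUERY_URL="http://127.0.0.1:8091/../query/service"
QUERY_URL="http://127.0.0.1:8093/query/service"  # query service REST endpoint

# Pick a random request timeout between 1ms and 10ms.
random_timeout() {
    echo "$(( (RANDOM % 10) + 1 ))ms"
}

# Guarded so the script is inert unless run against a live cluster.
if [ "${CB_REPRO:-0}" = "1" ]; then
    for _ in $(seq 1 100); do
        curl -s -u Administrator:password "$QUERY_URL" \
            --data-urlencode 'statement=SELECT META(d).id, d.* FROM onebigjson AS d LIMIT 1;' \
            --data-urlencode "timeout=$(random_timeout)" > /dev/null
    done
fi
```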