Details
- Bug
- Resolution: Fixed
- Critical
- 6.5.1, 6.6.2, Cheshire-Cat
- 1
Description
Based on a customer setup, it is possible in a very rare case for goxdcr to leak connections and eventually consume all of the system's file descriptors.
(Below are the findings from the customer's case, with the customer references removed.)
This is the code to clean up any failed REST calls, such as to ns_server:
http://src.couchbase.org/source/xref/6.5.1/goproj/src/github.com/couchbase/goxdcr/utils/utils.go#2168-2173
    transport, ok := client.Transport.(*http.Transport)
    if ok {
        if u.IsSeriousNetError(err) {
            logger.Debugf("Encountered %v, close all idle connections for this http client.\n", err)
        }
        transport.CloseIdleConnections()
    }
The suspect part is that the transport may not be set. When client.Transport is nil, the type assertion above fails (ok is false), so CloseIdleConnections() is never called. That is exactly the case for plain http calls (traffic to the local ns_server is not encrypted), where goxdcr doesn't set the transport:
    client = &http.Client{Timeout: base.DefaultHttpTimeout}
If the transport is not set, we are not closing idle connections ourselves and are depending on golang to close them for us.
It just so happens that golang has had an issue, https://github.com/golang/go/issues/28012, that shows how a TCP connection is not closed if the server doesn’t respond.
In particular, a user posted a code snippet that creates the http client exactly the way XDCR does (see https://github.com/golang/go/issues/28012#issuecomment-562290662) and reports that the TCP connection isn’t closed.
This issue was fixed on Dec 11, 2019 for golang 1.14, with the title: “net/http: don't wait indefinitely in Transport for proxy CONNECT response”.
XDCR for 6.5.1 ships with golang 1.11 according to the CMakefile. The original poster of the golang issue was using 1.11 as well.
Issue Links
- is a backport of MB-44128 XDCR TCP connection leak when host does not respond and XDCR retries (Closed)