Couchbase Go SDK / GOCBC-233

Support keep-alive for TCP, HTTP requests

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: library
    • Labels:
      None

      Description

      Support has reported issues running in Azure environments, where the Azure VM idle timeout appears to be killing connections that do not have a keep-alive configured.

      In Sync Gateway 1.4, the fix for this (via go-couchbase) is to:

      1. Call SetKeepAlive and SetKeepAlivePeriod on the memcached TCPConn.
      2. Use the DefaultTransport as the starting point when creating an HTTPClient, to pick up the default DialContext settings. (https://golang.org/pkg/net/http/#RoundTripper)

      I don't believe gocb currently supports either of the above two options.

      For TCP, keep alive isn't being set:
      https://github.com/couchbase/gocbcore/blob/ace0f2ec0d007700390c2eb073a6a7fc9ee58f2d/memdconn.go#L41

      For HTTP, gocb builds a custom transport from scratch that doesn't include DialContext settings:
      https://github.com/couchbase/gocbcore/blob/v7/agent.go#L335

      From previous conversations with Brett Lawson, this may be by design on the TCP side, but I'd like to hear more specifics about why that's not needed to support something like the Azure environment.

          Activity

          adamf Adam Fraser added a comment -

          We're currently looking at a maintenance patch for Sync Gateway 1.4 to fix this on the go-couchbase side. Given that, we also need to have this fixed asap for SG 1.5 (which uses gocb), to avoid regressions when moving from 1.4 to 1.5.

          brett19 Brett Lawson added a comment -

          Hey Adam Fraser,

          Do you know what the reason behind needing Keep Alive is?  The Go SDK performs operations on its TCP sockets regularly (config fetches and such), which should effectively keep them alive.  In terms of our HTTP connections, in all cases they are part of a short-lived pool and are not long-lived connections anyways.

          Cheers, Brett

          adamf Adam Fraser added a comment -

          That sounds reasonable on the TCP side.

          On the HTTP side, isn't the agent reusing the same long-lived http.Client (httpCli) for all requests? In the go-couchbase case, we were seeing Azure close the connection when executing moderately large view queries, so I expect we'd see the same thing when using bucket.executeViewQuery().

          Support currently has an Azure environment they are using to repro this scenario with SG 1.4/go-couchbase - I'll continue trying to get them to test/repro using SG 1.5/gocb to get you more details.

          adamf Adam Fraser added a comment (edited) -

          Asif Kazi has tested SG 1.5 (which uses gocb) in an Azure environment, and is reporting the same problems related to keep-alive we were seeing with go-couchbase/1.4.

          brett19 Brett Lawson added a comment -

          Hey Adam Fraser,

          I believe the following change should resolve the issue with connections being held in the pool for an overly long period of time:
          http://review.couchbase.org/83312 GOCBC-233: Specify timeout values for HTTP connections.

          Cheers, Brett

          adamf Adam Fraser added a comment -

          This has been successfully verified in an Azure environment - should be good to merge.

          build-team Couchbase Build Team added a comment -

          Build 5.1.0-1448 contains gocbcore commit 87479d3ac646baac2385314bfb743bb04928f280 with commit message:
          GOCBC-233: Specify timeout values for HTTP connections.
          https://github.com/couchbase/gocbcore/commit/87479d3ac646baac2385314bfb743bb04928f280

          build-team Couchbase Build Team added a comment -

          Build 5.0.1-4738 contains gocbcore commit 87479d3ac646baac2385314bfb743bb04928f280 with commit message:
          GOCBC-233: Specify timeout values for HTTP connections.
          https://github.com/couchbase/gocbcore/commit/87479d3ac646baac2385314bfb743bb04928f280


            People

            • Assignee:
              brett19 Brett Lawson
              Reporter:
              adamf Adam Fraser