Uploaded image for project: 'Couchbase Go SDK'
  1. Couchbase Go SDK
  2. GOCBC-509

Tons of "network error"s returned when connecting using "couchbases://"

    XMLWordPrintable

Details

    • Bug
    • Resolution: Won't Fix
    • Major
    • None
    • 1.6.0
    • library
    • None

    Description

      Related to CBG-317

      gocb version: d46732ea85f0ca44f82842ca2996bd2a21995172
      gocbcore version: 5905fd813f0ec718a59374725efaab4993560436

       

      Background

      Using gocb to connect to Couchbase Server using the "couchbases://" scheme (with the built-in self-signed cert on server side), and standard RBAC user/password bucket authentication. We're seeing a ton of "network error"s returned for various bucket ops in a QE environment when running Sync Gateway. This can reliably be reproduced by QE in one of their tests.

       

      The gocb trace logging only gives us a little insight, but we don't understand why this is happening, and why it only seems to happen when connecting to Couchbase Server using TLS. If you change the scheme to "couchbase://", the problem goes away.

       

      Unfortunately, given the connection to Couchbase Server is encrypted, we can't see what are in the packets between gocb and Couchbase Server, and I don't have the keys for the self-signed cert to decrypt the packets. It's difficult to tell what's going on.

       

      Looking at a specific example in attached files

      Looking closer at the pcap file in wireshark, with the filter "tcp.stream eq 43", and comparing this against the "sg_trace.log" entries around timestamp "2019-08-01T11:10:50.833-07:00", we can see some correlation between a bunch of TCP RST packets being sent from server to the client, and these "network error"s being returned to Sync Gateway by gocb.

       From sg_trace.log

      2019-08-01T11:10:50.833-07:00 [ERR] gocb: memdClient read failure: read tcp 172.23.104.204:49994->172.23.107.86:11207: read: connection reset by peer -- base.GoCBCoreLogger.Log() at logger_external.go:47
      2019-08-01T11:10:50.833-07:00 [ERR] gocb: Failed to shut down client connection (write tcp 172.23.104.204:49994->172.23.107.86:11207: write: broken pipe) -- base.GoCBCoreLogger.Log() at logger_external.go:47
      2019-08-01T11:10:50.833-07:00 [TRC] gocb: Pipeline client `s61401cnt72.sc.couchbase.com:11207/0xc000328200` client died
      2019-08-01T11:10:50.833-07:00 [TRC] gocb: Pipeline client `s61401cnt72.sc.couchbase.com:11207/0xc000328200` closing consumer 0xc000dac8f0
      2019-08-01T11:10:50.833-07:00 [TRC] gocb: Pipeline client `s61401cnt72.sc.couchbase.com:11207/0xc000328200` fetching new consumer
      2019-08-01T11:10:50.833-07:00 [TRC] gocb: Pipeline client `s61401cnt72.sc.couchbase.com:11207/0xc000328200` waiting for client shutdown
      2019-08-01T11:10:50.833-07:00 [TRC] gocb: Pipeline client `s61401cnt72.sc.couchbase.com:11207/0xc000328200` received client shutdown notification
      2019-08-01T11:10:50.833-07:00 [TRC] gocb: Pipeline Client `s61401cnt72.sc.couchbase.com:11207/0xc000328200` preparing for new client loop
      2019-08-01T11:10:50.833-07:00 [TRC] Bucket: GetWithXattr("sg_3", ...) [4.458546ms]
      2019-08-01T11:10:50.833-07:00 [TRC] gocb: Pipeline Client `s61401cnt72.sc.couchbase.com:11207/0xc000328200` retrieving new client connection for parent 0xc0002ee6e0
      2019-08-01T11:10:50.833-07:00 [TRC] Bucket: Incr("_sync:seq", 1, 1, 0) [1.180804ms]
      2019-08-01T11:10:50.833-07:00 [WRN] Error from Incr in sequence allocator (1) - attempt (1/3): network error -- db.(*sequenceAllocator).incrWithRetry() at sequence_allocator.go:112
      

      Wireshark showing same timeframe

       

      Questions:

      1. We want to understand what these network errors are.
           The gocb trace logs don't reveal too much, and neither does the raw packet capture.
           There's a cbcollect attached, which may shed shome light.

      2. Why is the pipeline client is dying/restarting a lot in general? Is it because of the network errors in 1?

       

       

      If at all possible, we'd like some assistance on this issue from the SDK or server side! Thanks

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            People

              charles.dixon Charles Dixon
              ben.brooks Ben Brooks
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty