Couchbase Server / MB-23423

Memcached connection closed for no apparent reason after a couple minutes


Details

    • Untriaged
    • Unknown

    Description

      XDCR sets up two memcached connections to the same node that it resides on:

      1. one memcached connection to start dcp feed and streams
      2. one memcached connection to collect dcp runtime stats

      Previously, these two connections used different usernames and passwords, and they coexisted without problems.

      1. connection 1 uses bucket name and bucket password
      2. connection 2 uses the local admin user name and password, and then does a selectBucket

      To accommodate the new "getting rid of bucket password" effort in spock, I changed connection 1 to use the same local admin user name and password as connection 2. Afterward, whenever connection 2 was set up, connection 1 was dropped by memcached, and XDCR replication was restarted as a result.

      Please confirm whether this is the designed behavior. If it is, how should XDCR work around it?

      The username and password used in memcached connections are shown below:

      "authClient called with user=@goxdcr, pass=dc3067bd6f6612f01c6a17a5d376d9ff, bucket=default"

      As the following extract from the memcached log shows, shortly after the goxdcr pipeline supervisor set up a new memcached connection at 17:39:27 (i.e., connection 2), the existing DCP connection was dropped.

      2017-03-20T17:37:30.108088-07:00 NOTICE (default) DCP (Producer) eq_dcpq:xdcr:dcp_5e25d8d1ed7b33baf9f11fe717d8e489/default/sasl_127.0.0.1:12000_1:pMLGE5iVBCrWmaKwlAWO7w== - (vb 899) Creating stream with start seqno 0 and end seqno 18446744073709551615
      2017-03-20T17:37:30.108283-07:00 NOTICE (default) DCP (Producer) eq_dcpq:xdcr:dcp_5e25d8d1ed7b33baf9f11fe717d8e489/default/sasl_127.0.0.1:12000_1:pMLGE5iVBCrWmaKwlAWO7w== - (vb 999) Creating stream with start seqno 0 and end seqno 18446744073709551615
      2017-03-20T17:39:27.680689-07:00 NOTICE 48: HELO [Goxdcr PipelineSupervisor  SourceBucket:default TargetBucket:sasl] TCP NODELAY [ 127.0.0.1:59084 - 127.0.0.1:12000 (@goxdcr) ]
      2017-03-20T17:39:45.097186-07:00 NOTICE (default) DCP (Producer) eq_dcpq:xdcr:dcp_5e25d8d1ed7b33baf9f11fe717d8e489/default/sasl_127.0.0.1:12000_0:FP-rKKEutnWP3MTIpteN2Q== - Removing connection 0x1093546c8
      2017-03-20T17:39:45.097219-07:00 NOTICE (default) DCP (Producer) eq_dcpq:xdcr:dcp_5e25d8d1ed7b33baf9f11fe717d8e489/default/sasl_127.0.0.1:12000_0:FP-rKKEutnWP3MTIpteN2Q== - (vb 499) Stream closing, sent until seqno 0 remaining items 0, reason: The stream closed early because the conn was disconnected
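
To make the timing in the extract above easier to reason about, here is a minimal Go sketch that computes the gaps between the logged events. It is not part of goxdcr or memcached; the helper names `logTime` and `gap` are hypothetical, and the log lines are abbreviated so that only their timestamps are parsed. Note that the "Creating stream" lines name a DCP connection ending in `_1`, while the removed connection ends in `_0`, so the second interval is only indicative.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// memcachedLayout matches the timestamp prefix seen in the log extract,
// e.g. 2017-03-20T17:39:27.680689-07:00 (six fractional digits, numeric zone).
const memcachedLayout = "2006-01-02T15:04:05.000000-07:00"

// logTime parses the leading timestamp of one memcached log line.
// Hypothetical helper, written for this illustration only.
func logTime(line string) (time.Time, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return time.Time{}, fmt.Errorf("empty log line")
	}
	return time.Parse(memcachedLayout, fields[0])
}

// gap returns the time elapsed between two log lines.
func gap(earlier, later string) (time.Duration, error) {
	t0, err := logTime(earlier)
	if err != nil {
		return 0, err
	}
	t1, err := logTime(later)
	if err != nil {
		return 0, err
	}
	return t1.Sub(t0), nil
}

func main() {
	// Abbreviated copies of lines from the extract; only the timestamp matters.
	streamCreated := "2017-03-20T17:37:30.108088-07:00 NOTICE (default) DCP (Producer) Creating stream"
	helo := "2017-03-20T17:39:27.680689-07:00 NOTICE 48: HELO [Goxdcr PipelineSupervisor]"
	removed := "2017-03-20T17:39:45.097186-07:00 NOTICE (default) DCP (Producer) Removing connection"

	sinceHelo, err := gap(helo, removed)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	sinceStream, err := gap(streamCreated, removed)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}

	fmt.Println("HELO to connection removal:", sinceHelo)
	fmt.Println("stream creation to connection removal:", sinceStream)
}
```

With the timestamps from the extract, the removal comes 17.416497s after the HELO and 2m14.989098s after the first logged stream creation, which matches the "couple minutes" in the issue title.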

      Note that the DCP connection closing right after the pipeline supervisor set up a new connection may just be a coincidence. I tried disabling the pipeline supervisor altogether, and the DCP connection was still closed for no apparent reason after a couple of minutes.

       

            People

              arunkumar Arunkumar Senthilnathan (Inactive)
              yu Yu Sui (Inactive)