  1. Couchbase Server
  2. MB-4783

Overstressed memcached keeps connections in CLOSE_WAIT even after the client has closed them

    Details

    • Flagged:
      Release Note

      Description

      A user has reported that memcached stats report 10k open connections, but only 1,000 physical connections are open on this box.
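One way to check a mismatch like this is to compare the `curr_connections` value from memcached's `stats` output with the per-state socket counts the kernel reports. The following is a minimal sketch, assuming a Linux host (it parses text in the `/proc/net/tcp` format) and assuming memcached listens on port 11210; the function names `count_states` and `curr_connections` are hypothetical helpers, not part of any Couchbase tool.

```python
from collections import Counter

# Hex state codes as they appear in the "st" column of /proc/net/tcp (Linux).
TCP_STATES = {
    "01": "ESTABLISHED",
    "06": "TIME_WAIT",
    "08": "CLOSE_WAIT",
    "0A": "LISTEN",
}


def count_states(proc_net_tcp: str, local_port: int) -> Counter:
    """Count sockets per TCP state for one local port.

    `proc_net_tcp` is the text content of /proc/net/tcp.
    """
    counts = Counter()
    for line in proc_net_tcp.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_addr, state = fields[1], fields[3]
        port = int(local_addr.rsplit(":", 1)[1], 16)  # port is hex-encoded
        if port == local_port:
            counts[TCP_STATES.get(state, state)] += 1
    return counts


def curr_connections(stats_text: str) -> int:
    """Extract curr_connections from the text of a memcached `stats` reply."""
    for line in stats_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT" and parts[1] == "curr_connections":
            return int(parts[2])
    raise ValueError("curr_connections not found in stats output")
```

Feeding it the contents of `/proc/net/tcp` and the port in question (11210 here is an assumption) gives the CLOSE_WAIT and ESTABLISHED counts to set against the number memcached itself reports.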


        Activity

        farshid Farshid Ghods (Inactive) added a comment -

        moxi closes the connection, but memcached keeps it open in CLOSE_WAIT, so the next connection will be rejected.

        This causes further issues during rebalancing.
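For context, CLOSE_WAIT is the state of a TCP socket whose peer has already sent its FIN while the local application has not yet called close. The sketch below illustrates the general pattern a server needs in order to leave that state promptly: treat an empty read as end-of-stream and close the socket. It is an illustration only, not memcached's actual connection-handling code, and `drain_and_close` is a hypothetical name.

```python
import socket


def drain_and_close(conn: socket.socket) -> int:
    """Read until the peer closes, then close our side as well.

    Returns the number of bytes received before end-of-stream.
    """
    total = 0
    try:
        while True:
            data = conn.recv(4096)
            if not data:  # b"" means the peer has closed its end
                break
            total += len(data)
    finally:
        # Skipping this close is what leaves a socket lingering in CLOSE_WAIT.
        conn.close()
    return total
```

If the process is too busy to get back to the socket and run the read that returns empty, the close is delayed by the same amount, which matches the "overstressed" condition in the title.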

        steve Steve Yen added a comment -

        One workaround, which we used for a customer with a large cluster, was to increase their client-side moxi timeouts to larger values (larger than the longest backfill request).

        Two longer-term solution ideas include:

        1) Unified dispatcher in memcached should help fix this (allows more than one R/O dispatcher for a single bucket).

        2) Allow adaptive/dynamic timeout backoffs in moxi.
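The adaptive timeout idea in item 2 could look roughly like the sketch below: lengthen the timeout after each timed-out request and shrink it back toward the base after successes. This is a hypothetical illustration, not moxi code. The 5 second base matches the moxi default mentioned in this thread, while the class name, the doubling factor, and the 60 second ceiling are assumptions chosen for the example.

```python
class AdaptiveTimeout:
    """Timeout that backs off multiplicatively on failure and recovers on success."""

    def __init__(self, base: float = 5.0, factor: float = 2.0, ceiling: float = 60.0):
        self.base = base        # starting timeout in seconds
        self.factor = factor    # assumed growth/shrink multiplier
        self.ceiling = ceiling  # assumed upper bound in seconds
        self.current = base

    def on_timeout(self) -> float:
        """Grow the timeout after a request timed out, capped at the ceiling."""
        self.current = min(self.current * self.factor, self.ceiling)
        return self.current

    def on_success(self) -> float:
        """Shrink the timeout after a request succeeded, never below the base."""
        self.current = max(self.current / self.factor, self.base)
        return self.current
```

A client would pass `current` as the timeout of its next request and call `on_timeout()` or `on_success()` depending on the outcome, so that a slow server under backfill is given progressively more time instead of being hit with repeated reconnects.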

        steve Steve Yen added a comment -

        3) Change moxi default timeout from 5 secs to ?

        steve Steve Yen added a comment -

        4) When moxi has a timeout, moxi shouldn't close a downstream connection?

        trond Trond Norbye added a comment -

        Do we have any "tap stats" from this node? In our tap implementation we're "reserving" the connection until we believe it's safe to close it from the ep-engine side, and keeping the connections in a "pending close" state. A reconnect to the same tap struct won't immediately release the connection, because we're "cloning" the metadata in ep-engine to let it close itself. This logic has been revised in 1.8.1.

        trond Trond Norbye added a comment -

        We've changed the shutdown logic for connections in 1.8.2 and 2.0.


          People

          • Assignee:
            trond Trond Norbye
            Reporter:
            farshid Farshid Ghods (Inactive)


              Gerrit Reviews

              There are no open Gerrit changes