Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-60795

Memcached stuck on shutdown, waiting for clients to disconnect

    XMLWordPrintable

Details

    • Triaged
    • 0
    • Yes

    Description

      Starting within the last few days, Analytics is seeing test failures due to memcached getting stuck on graceful shutdown. The test expects that memcached will close the client connections on shutdown which it did previously, but now it does not. Even once the connection is detected as dead within Analytics and closed, memcached continues to log periodically it is waiting for the connections to be closed, and does not exit. I was forced to kill -9 memcached to get it to exit.

      2024-02-14T11:41:40.487333-05:00 INFO Gracefully shutting down due to shutdown message or stdin closure
      2024-02-14T11:41:40.550639-05:00 INFO Initiating graceful shutdown. sigint:false sigterm:false
      2024-02-14T11:41:40.550643-05:00 INFO Stop all buckets
      2024-02-14T11:41:40.550645-05:00 INFO Waiting for delete of gbook_users to complete
      2024-02-14T11:41:40.563166-05:00 INFO <none>: Delete bucket [gbook_users]. Notifying engine
      2024-02-14T11:41:40.563182-05:00 INFO (gbook_users) Shutting down all DCP connections in preparation for bucket deletion.
      2024-02-14T11:41:40.563185-05:00 INFO (gbook_users) Shutting down dcp connections!
      ...
      2024-02-14T11:41:40.563389-05:00 INFO 69: (No Engine) DCP (Producer) eq_dcpq:cbas:Local:gbook_users:88cd2fc68bb3fc3f7571f7c0f9f87bf4:1 - Removing connection [ {"ip":"192.168.38.72","port":59802} - {"ip":"192.168.38.72","port":11998} (System, @cbas) ]
      2024-02-14T11:41:40.563547-05:00 INFO 68: (No Engine) DCP (Producer) eq_dcpq:cbas:Local:gbook_users:88cd2fc68bb3fc3f7571f7c0f9f87bf4:0 - Removing connection [ {"ip":"192.168.38.72","port":59801} - {"ip":"192.168.38.72","port":11998} (System, @cbas) ]
      2024-02-14T11:41:40.563569-05:00 INFO <none>: Delete bucket [gbook_users]. Engine ready for shutdown
      2024-02-14T11:41:40.563572-05:00 INFO <none>: Delete bucket [gbook_users]. Wait for 2 clients to disconnect 
      

      after several minutes, the connections are still active:

      $ date; netstat -n | grep 11998
      Wed Feb 14 11:48:24 EST 2024
      tcp4       0      0  192.168.38.72.11998    192.168.38.72.59802    ESTABLISHED
      tcp4       0      0  192.168.38.72.11998    192.168.38.72.59801    ESTABLISHED
      

      Even once our DeadConnectionDetection closes the connections, memcached doesn't exit and continues to list the connections as open:

      2024-02-14T11:49:35.875-05:00 INFO CBAS.events.DeadConnectionDetection [FixerThread:cbas:Local:gbook_users:88cd2fc68bb3fc3f7571f7c0f9f87bf4:1] Running dead connection detection
      2024-02-14T11:49:35.875-05:00 INFO CBAS.conductor.DcpChannel [FixerThread:cbas:Local:gbook_users:88cd2fc68bb3fc3f7571f7c0f9f87bf4:1] Detected dead connection on {"class" : "DcpChannel", "inetAddress" : "/192.168.38.72:11998", "bucket" : "gbook_users", "state" : "CONNECTED"}
      2024-02-14T11:49:35.875-05:00 INFO CBAS.conductor.DcpChannel [FixerThread:cbas:Local:gbook_users:88cd2fc68bb3fc3f7571f7c0f9f87bf4:1] {"class" : "DcpChannel", "inetAddress" : "/192.168.38.72:11998", "bucket" : "gbook_users", "state" : "CONNECTED"} is disconnecting
      2024-02-14T11:49:35.876-05:00 INFO CBAS.events.DeadConnectionDetection [FixerThread:cbas:Local:gbook_users:88cd2fc68bb3fc3f7571f7c0f9f87bf4:0] Running dead connection detection
      2024-02-14T11:49:35.876-05:00 INFO CBAS.conductor.DcpChannel [FixerThread:cbas:Local:gbook_users:88cd2fc68bb3fc3f7571f7c0f9f87bf4:0] Detected dead connection on {"class" : "DcpChannel", "inetAddress" : "/192.168.38.72:11998", "bucket" : "gbook_users", "state" : "CONNECTED"}
      2024-02-14T11:49:35.876-05:00 INFO CBAS.conductor.DcpChannel [FixerThread:cbas:Local:gbook_users:88cd2fc68bb3fc3f7571f7c0f9f87bf4:0] {"class" : "DcpChannel", "inetAddress" : "/192.168.38.72:11998", "bucket" : "gbook_users", "state" : "CONNECTED"} is disconnecting
      2024-02-14T11:49:35.878-05:00 DEBU CBAS.conductor.DcpChannelCloseListener [nioEventLoopGroup-3-3] DCP Connection dropped
      2024-02-14T11:49:35.879-05:00 DEBU CBAS.conductor.DcpChannelCloseListener [nioEventLoopGroup-3-4] DCP Connection dropped
      

      2024-02-14T11:54:40.616786-05:00 INFO <none>: Delete bucket [gbook_users].Still waiting: 2 clients connected: {"68":{"agent_name":"cbas:Local:gbook_users:88cd2fc68","bucket_index":1,"clustermap":{"epoch":-1,"revno":0},"connection":"0x0000000109f3c000","connection_id":"192.168.38.72:59801","cookies":[],"datatype":"json","dcp":true,"dcp_deleted_user_xattr":false,"dcp_no_value":false,"dcp_xattr_aware":false,"features":["xerror","tcp nodelay","collections"],"flow_ctrl_buffer_size":5242880,"internal":true,"last_used":"12 ms","max_reqs_per_event":5,"max_sched_time":"2000","min_sched_time":"0","nevents":5,"peername":{"ip":"192.168.38.72","port":59801},"priority":"Medium","protocol":"memcached","refcount":1,"sasl_enabled":false,"sendqueue":{"actual":0,"last":3781449419508583,"size":0,"term":false},"socket":68,"socket_options":{"so_keepalive":8,"so_linger":"off","so_rcvbuf":261312,"so_sndbuf":7456540,"tcp_keepcnt":3,"tcp_keepidle":360,"tcp_keepintvl":10},"sockname":{"ip":"192.168.38.72","port":11998},"ssl":true,"state":"pending close","thread":3,"total_cpu_time":"233033625","total_recv":5538,"total_send":18031174,"type":"producer","user":{"domain":"local","name":"@cbas"},"yields":0},"69":{"agent_name":"cbas:Local:gbook_users:88cd2fc68","bucket_index":1,"clustermap":{"epoch":-1,"revno":0},"connection":"0x0000000109fcc000","connection_id":"192.168.38.72:59802","cookies":[],"datatype":"json","dcp":true,"dcp_deleted_user_xattr":false,"dcp_no_value":false,"dcp_xattr_aware":false,"features":["xerror","tcp nodelay","collections"],"flow_ctrl_buffer_size":5242880,"internal":true,"last_used":"12 ms","max_reqs_per_event":5,"max_sched_time":"459","min_sched_time":"0","nevents":5,"peername":{"ip":"192.168.38.72","port":59802},"priority":"Medium","protocol":"memcached","refcount":1,"sasl_enabled":false,"sendqueue":{"actual":0,"last":3781449459330708,"size":0,"term":false},"socket":69,"socket_options":{"so_keepalive":8,"so_linger":"off","so_rcvbuf":261312,"so_sndbuf":7456540,"tcp_keepcnt":3,"tcp_keepidle":360,"tcp_keepintvl":10},"sockname":{"ip":"192.168.38.72","port":11998},"ssl":true,"state":"pending close","thread":4,"total_cpu_time":"199525350","total_recv":5538,"total_send":17986113,"type":"producer","user":{"domain":"local","name":"@cbas"},"yields":0}}
      

      I have attached cbcollect_infos.

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            People

              ashwin.govindarajulu Ashwin Govindarajulu
              michael.blow Michael Blow
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                PagerDuty