Details
- Improvement
- Resolution: Unresolved
- Major
- 6.5.1, 6.6.0, Cheshire-Cat
- None
- 1
Description
Currently, once a Couchbase cluster is up and running, we in effect expect it to keep running indefinitely. There are some situations where a more graceful shutdown would be useful.
One scenario is a cluster used for development. In this case, the user would like to stop Couchbase Server on all nodes (presumably with auto-failover not enabled) and then possibly even "turn off" the system it runs on, for example in a VM/cloud environment.
Another scenario is non-HA service maintenance and management. At the moment, there is no good way to shut down a non-HA application gracefully: client connections are dropped without a goodbye. Some services have a grace period, but the KV service does not.
The specific design will require some cross-component collaboration. Some thoughts as a starting point, though:
Upon receiving a SIGTERM or other shutdown request (see also MB-40802), the cluster manager could signal to the KV engine that it should stop accepting frontend traffic. The KV engine would probably do this by returning some kind of error indicating that the node is going offline and will return later, which would signal to internal and external clients that they should gracefully hang up their connections. This would allow replication and disk I/O to complete and the processes to shut down.
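To make the idea above concrete, here is a minimal sketch in Python of a "stop accepting frontend traffic, then drain" gate. It is not Couchbase code: `FrontendGate`, `STATUS_GOING_OFFLINE`, and the other names are hypothetical, and the actual error returned by the KV engine would need to be decided as part of the design.

```python
import signal
import threading

# Hypothetical status values; the real KV error code is not defined in this ticket.
STATUS_OK = "ok"
STATUS_GOING_OFFLINE = "node_going_offline"


class FrontendGate:
    """Tracks whether the node still accepts frontend (client) requests."""

    def __init__(self):
        self.draining = False
        self._in_flight = 0
        self._cond = threading.Condition()

    def stop_accepting(self, signum=None, frame=None):
        # Only flips a flag, so it can also be used as a signal handler.
        self.draining = True

    def handle_request(self, work):
        """Run `work` unless draining; otherwise tell the client to hang up."""
        with self._cond:
            if self.draining:
                return STATUS_GOING_OFFLINE, None
            self._in_flight += 1
        try:
            return STATUS_OK, work()
        finally:
            with self._cond:
                self._in_flight -= 1
                self._cond.notify_all()

    def wait_until_idle(self, timeout=None):
        """Block until in-flight requests finish; True if idle before timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)


def install_sigterm_handler(gate):
    # Must be called from the main thread on a POSIX system.
    signal.signal(signal.SIGTERM, gate.stop_accepting)
```

In this sketch the process would call `install_sigterm_handler(gate)` at startup and, once `gate.draining` is set, call `gate.wait_until_idle()` before letting replication and disk I/O finish and exiting.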
Any clients that had been connected could then gracefully go into a backoff-retry loop for reconnecting, using a linear backoff with a long interval, and return some kind of error indicating that the service is unavailable for planned reasons.
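The client-side behaviour could look roughly like the following Python sketch. The exception name `ServiceUnavailablePlanned`, the function names, and the default delays are all assumptions made for illustration; no SDK currently exposes this API.

```python
import time


class ServiceUnavailablePlanned(Exception):
    """Hypothetical error: the service is down for planned reasons."""


def linear_backoff_delays(base_seconds, step_seconds, attempts):
    # Linear backoff: each delay grows by a fixed step (5, 10, 15, ...).
    return [base_seconds + i * step_seconds for i in range(attempts)]


def reconnect_with_linear_backoff(connect, base_seconds=5.0, step_seconds=5.0,
                                  attempts=5, sleep=time.sleep):
    """Call `connect` until it succeeds, sleeping a linearly growing delay
    after each failure. Raises ServiceUnavailablePlanned when attempts run out."""
    for delay in linear_backoff_delays(base_seconds, step_seconds, attempts):
        try:
            return connect()
        except ConnectionError:
            sleep(delay)
    raise ServiceUnavailablePlanned("node is offline for planned maintenance")
```

The `sleep` parameter is injectable so the retry schedule can be exercised without real waiting; an application would surface `ServiceUnavailablePlanned` to its caller as the "planned outage" error.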
Issue Links
- relates to: MB-40802 Support graceful shutdown after receipt of a SIGTERM signal (on non-Windows systems) (Closed)