From a customer request:
We are obliged to follow a regular OS patching schedule for all our servers and have a maintenance window every Friday night.
How would you recommend we deal with our Couchbase clusters for patching?
From reading the Couchbase 2.0 Manual it looks like we have two options, one being a failover, and the other removing the node then re-adding it.
What steps would you recommend we do when taking a node out to do maintenance on it? We plan to do this during our regular maintenance window when load on the servers would be really light.
And the answer:
Our best practice is a graceful remove and rebalance, so I would recommend that first. If you find it takes too long, you could do a failover instead. The danger with a failover is that some data is left without a replica until the cluster is rebalanced, so an unexpected failure during that time could leave you needing to recover data manually. The graceful remove doesn't carry that risk.
Given that these are VMs, it would actually be best to spin up one or more new nodes and swap them into the cluster. That way you never reduce capacity.
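As a rough illustration of the difference between a plain remove and a swap, here is a minimal Python sketch that only builds the parameters for a single rebalance. It assumes the rebalance is started over the REST API by POSTing `knownNodes` and `ejectedNodes` to `/controller/rebalance` on the admin port; check that endpoint and those field names against the manual for your version. The helper names and the node addresses are hypothetical, and the replacement node is assumed to have been added to the cluster already.

```python
from urllib.parse import urlencode


def rebalance_params(cluster_nodes, remove=(), add=()):
    """Return the form fields for one rebalance.

    cluster_nodes: otpNode names of the nodes currently in the cluster
    remove:        nodes to take out for patching
    add:           replacement nodes that have already been added to the
                   cluster and are waiting for a rebalance
    """
    missing = [n for n in remove if n not in cluster_nodes]
    if missing:
        raise ValueError(f"not in cluster: {missing}")
    # knownNodes lists every node involved, including the ones being ejected.
    known = list(cluster_nodes) + [n for n in add if n not in cluster_nodes]
    return {
        "knownNodes": ",".join(known),
        "ejectedNodes": ",".join(remove),
    }


def form_body(params):
    """URL-encode the fields for an HTTP POST (assumed: /controller/rebalance)."""
    return urlencode(params)


# Hypothetical three-node cluster; the second node is due for patching.
nodes = ["ns_1@10.0.0.1", "ns_1@10.0.0.2", "ns_1@10.0.0.3"]

# Option 1: graceful remove, then rebalance (capacity drops by one node).
remove_only = rebalance_params(nodes, remove=["ns_1@10.0.0.2"])

# Option 2: swap in a fresh VM in the same rebalance (capacity is unchanged).
swap = rebalance_params(nodes, remove=["ns_1@10.0.0.2"], add=["ns_1@10.0.0.4"])
```

In the swap case the new node appears in `knownNodes` while the old one appears in both lists, so the cluster never runs with fewer nodes than it started with. The sketch deliberately stops short of sending the request; adding the node, authenticating, and waiting for the rebalance to finish are left to whatever tooling you already use.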