Details
- Improvement
- Resolution: Fixed
- Major
- None
- 5.0.0
- Security Level: Public
- None
Description
There are many possible causes of node auto-failover, for example:
- Hardware failures, e.g. node, network, or power failure.
- Unreliable communication among the nodes, e.g. network congestion.
- memcached slowdown or crash.
- Out-of-memory problems.
- Forward time jumps caused by an unstable system clock.
- Machine freeze or process scheduling issues.
- Stability problems in a virtualized environment, e.g. unhealthy VMs due to overcommitted CPU or RAM.
Please enhance the messaging (UI and logs/Log tab) shown when a node is automatically failed over so that it indicates which of these issues was responsible. For example, it could state whether missed heartbeats or a not-ready bucket caused the failover.
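To make the request concrete, here is a rough sketch of what a reason-bearing failover message could look like. It is illustrative only and is not taken from the product's code: the `FailoverReason` enum, the `format_failover_message` function, and the message wording are all hypothetical, and the two reason codes simply mirror the examples given in this ticket.

```python
from enum import Enum


class FailoverReason(Enum):
    # Hypothetical reason codes; they mirror the two examples in this ticket.
    MISSED_HEARTBEAT = "missed heartbeats from the node"
    BUCKET_NOT_READY = "bucket(s) not ready"


def format_failover_message(node, reasons, buckets=()):
    """Build one log/UI line naming why a node was automatically failed over."""
    prefix = f"Node {node} was automatically failed over. Reason: "
    if not reasons:
        return prefix + "unknown."
    parts = []
    for reason in reasons:
        text = reason.value
        # Name the affected buckets when that information is available.
        if reason is FailoverReason.BUCKET_NOT_READY and buckets:
            text += " (" + ", ".join(buckets) + ")"
        parts.append(text)
    return prefix + "; ".join(parts) + "."


if __name__ == "__main__":
    print(format_failover_message(
        "n_1@10.0.0.2",
        [FailoverReason.BUCKET_NOT_READY],
        buckets=["default"],
    ))
```

Keeping the reason as a structured value rather than free text would let the UI and the Log tab render the same cause consistently.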
Issue Links
- relates to: MB-20579 Robust Failure Detection Mechanism (Resolved)