Details
Improvement
Resolution: Fixed
Major
None
Description
- The Operator detects that one or more nodes are down.
- Watch the logs to determine whether the down node cannot be auto-failed over:
  - Any event in the autofailover module other than EVENT_NODE_AUTO_FAILOVERED indicates this.
- Check that the persistent volumes of the failed Pods have status = Ready.
  - Quit if any Pod volumes are inaccessible.
- Delete the failed Pod if it still exists in Kubernetes (Pod volumes are not deleted).
- Recreate the Pod with exactly the same name and spec as the failed Pod.
- Wait for the new Pod to become active within the cluster.
- Repeat for all down nodes until all Pods are recovered.
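
The steps above can be sketched as a small in-memory simulation. This is only an illustration of the control flow, not the Operator's actual code: `FakeCluster`, `Pod`, `cannot_auto_failover`, and `recover_down_nodes` are hypothetical names, no real Kubernetes client is called, and only `EVENT_NODE_AUTO_FAILOVERED` and the `Ready` status come from the description. The sketch also assumes that "quit" means aborting the whole recovery pass, and that nothing is done when no qualifying event has been logged.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Event name taken from the ticket; every other name here is hypothetical.
EVENT_NODE_AUTO_FAILOVERED = "EVENT_NODE_AUTO_FAILOVERED"


@dataclass
class Pod:
    name: str
    spec: dict
    volumes: List[str]


@dataclass
class FakeCluster:
    """In-memory stand-in for Kubernetes plus the database cluster."""
    pods: Dict[str, Pod] = field(default_factory=dict)
    volume_status: Dict[str, str] = field(default_factory=dict)
    active: Set[str] = field(default_factory=set)

    def delete_pod(self, name: str) -> None:
        # Only the Pod object is removed; volume_status is left untouched.
        self.pods.pop(name, None)
        self.active.discard(name)

    def create_pod(self, pod: Pod) -> None:
        self.pods[pod.name] = pod

    def wait_until_active(self, name: str) -> bool:
        # A real operator would poll with a timeout; here it succeeds at once.
        self.active.add(name)
        return True


def cannot_auto_failover(events: List[str]) -> bool:
    """True if any autofailover event other than a successful failover was seen."""
    return any(e != EVENT_NODE_AUTO_FAILOVERED for e in events)


def recover_down_nodes(cluster: FakeCluster, down: List[Pod],
                       events: List[str]) -> List[str]:
    """Recreate each down Pod on its existing volumes; return recovered names."""
    if not cannot_auto_failover(events):
        return []  # assumption: no qualifying event means nothing to do

    # Assumption: one inaccessible volume aborts the whole recovery pass.
    for pod in down:
        for vol in pod.volumes:
            if cluster.volume_status.get(vol) != "Ready":
                return []

    recovered = []
    for pod in down:
        cluster.delete_pod(pod.name)  # no-op if the Pod is already gone
        # Same name and spec as the failed Pod, reattached to the same volumes.
        cluster.create_pod(Pod(pod.name, dict(pod.spec), list(pod.volumes)))
        if not cluster.wait_until_active(pod.name):
            break
        recovered.append(pod.name)
    return recovered
```

Calling `recover_down_nodes(cluster, down_pods, events)` with any event other than `EVENT_NODE_AUTO_FAILOVERED` in `events`, and all volumes `Ready`, returns the names of the recreated Pods. Keeping the volume check ahead of any deletion means no Pod is removed unless its data can be reattached.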