Couchbase Kubernetes / K8S-259

Persistent Pod recovery when cluster cannot auto-failover

Details

    • Improvement
    • Resolution: Fixed
    • Major
    • 1.0.0
    • None
    • operator

    Description

      1. The operator detects that one or more nodes are down.
      2. Watch the logs to detect that a down node cannot be auto-failed over.
      3. Any event in the autofailover module other than EVENT_NODE_AUTO_FAILOVERED counts as such a failure.
      4. Check that the persistent volumes of the failed Pods have status = Ready.
      5. Abort recovery if any Pod volume is inaccessible.
      6. Delete the failed Pod if it still exists in Kubernetes (the Pod's volumes are not deleted).
      7. Recreate the Pod with exactly the same name and spec as the failed Pod.
      8. Wait for the new Pod to become active within the cluster.
      9. Repeat for each down node until all Pods are recovered.
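
      The steps above can be sketched as a small recovery loop. This is an illustrative sketch only, not the operator's actual code: FakeCluster, Pod, cannot_auto_failover and recover_down_pods are hypothetical names, the cluster is an in-memory stand-in rather than a real Kubernetes or Couchbase client, and only EVENT_NODE_AUTO_FAILOVERED comes from this ticket. It assumes the operator keeps the last known spec of each Pod so that a Pod already missing from Kubernetes can still be recreated.

```python
from dataclasses import dataclass, field

# Event name taken from the ticket; every other name below is hypothetical.
EVENT_NODE_AUTO_FAILOVERED = "EVENT_NODE_AUTO_FAILOVERED"


@dataclass
class Pod:
    name: str
    spec: dict
    volumes_ready: bool = True  # stands in for "persistent volumes have status = Ready"


@dataclass
class FakeCluster:
    """In-memory stand-in for the Kubernetes and Couchbase APIs."""
    pods: dict = field(default_factory=dict)    # pod name -> Pod
    down: set = field(default_factory=set)      # names of nodes currently down
    events: dict = field(default_factory=dict)  # node name -> autofailover events

    def delete_pod(self, name):
        # Only the Pod object is removed; its volumes are left untouched.
        self.pods.pop(name, None)

    def create_pod(self, pod):
        self.pods[pod.name] = pod

    def wait_until_active(self, name):
        # A real operator would poll the cluster; here the node is simply marked up.
        self.down.discard(name)
        return True


def cannot_auto_failover(events):
    """Steps 2-3: true if any logged event differs from the success event."""
    return any(e != EVENT_NODE_AUTO_FAILOVERED for e in events)


def recover_down_pods(cluster, last_known):
    """Run steps 1-9 and return the names of the recovered Pods.

    last_known maps pod name -> Pod as last seen by the operator.
    """
    # Steps 1-3: down nodes that could not be auto-failed over.
    candidates = [n for n in sorted(cluster.down)
                  if cannot_auto_failover(cluster.events.get(n, []))]
    # Steps 4-5: quit before changing anything if any volume is inaccessible.
    if any(not last_known[n].volumes_ready for n in candidates):
        return []
    recovered = []
    for name in candidates:  # step 9: repeat for every down node
        if name in cluster.pods:  # step 6: delete the failed Pod if it exists
            cluster.delete_pod(name)
        old = last_known[name]
        # Step 7: recreate with the same name and a copy of the same spec.
        cluster.create_pod(Pod(old.name, dict(old.spec), old.volumes_ready))
        if cluster.wait_until_active(name):  # step 8
            recovered.append(name)
    return recovered
```

      Checking every volume before the loop starts means a single inaccessible volume stops recovery before any Pod is deleted, which is one reading of step 5. Checking per Pod inside the loop would be the other reading.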

          People

            tommie Tommie McAfee (Inactive)