  1. Couchbase Kubernetes
  2. K8S-591

Rebalance failure during fail-over reconciliation

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • None
    • couchbase-server
    • None

    Description

      This error happens sometimes without a clear pattern.

      My environment:

      Minishift 3.10, k8s operator 1.0.0

      To reproduce:

      Install the operator and create a 3-node demo cluster.

      Kill one of the Couchbase pods.

      When auto-failover happens, the operator adds a new node and rebalances.

      The rebalance fails, and the operator retries it again and again without success.

      Operator pod logs: operator.log

      Couchbase console logs: cblogs.png

      As a workaround, I have to delete the cluster and create it again. Repeating the process sometimes works, but not always.

      Attachments

        1. cblogs.png (506 kB)
        2. operator.log (39 kB)

        Activity

          simon.murray Simon Murray added a comment -

          Nothing to do with us, it's analytics going haywire.  We've seen this numerous times.

          If you can run `cbopinfo` with server logs enabled, raise a ticket with the relevant team to get the full story.

          Realistically, either cbas needs to 'just work', or NS server needs to tell us that an operation is unsafe to perform. The latter is all well and good for us, but an end user may still do something dangerous in a non-managed environment, so the former is better!

          simon.murray Simon Murray added a comment -

          As for the demo, you can play it safe by removing eventing and analytics from your server class in the configuration; that will be less prone to upsetting the demo gods.
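
A minimal sketch of that change, for illustration only. It assumes the cluster configuration has been loaded into a plain dictionary whose server classes live under a `servers` key, each with a `services` list; those field names, the `all_services` class name, and the `strip_services` helper are assumptions made for this example, not taken from the ticket.

```python
import copy

# Services to drop for the demo. The names are assumed to match the
# entries used in the server class `services` list.
UNWANTED_SERVICES = {"eventing", "analytics"}


def strip_services(cluster_spec, unwanted=UNWANTED_SERVICES):
    """Return a copy of the spec with the unwanted services removed
    from every server class. The input spec is left unmodified."""
    spec = copy.deepcopy(cluster_spec)
    for server_class in spec.get("servers", []):
        server_class["services"] = [
            service
            for service in server_class.get("services", [])
            if service not in unwanted
        ]
    return spec


# Hypothetical 3-node demo server class running every service.
demo_spec = {
    "servers": [
        {
            "name": "all_services",
            "size": 3,
            "services": ["data", "index", "query", "search", "eventing", "analytics"],
        }
    ]
}

trimmed = strip_services(demo_spec)
print(trimmed["servers"][0]["services"])
```

The trimmed server class keeps only the data, index, query and search services, so a fail-over in the demo no longer involves rebalancing analytics or eventing.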


          manuel.hurtado Manuel Hurtado (Inactive) added a comment -

          Thank you for the tip, Simon Murray.

          Eventing and analytics are not really the goal of the demo, so I will make this small sacrifice to them.

          People

            simon.murray Simon Murray
            manuel.hurtado Manuel Hurtado (Inactive)