  1. Couchbase Kubernetes
  2. K8S-512

Getting rebalance incomplete event before the cluster enters balanced condition

Details

    Description

      Testcase: TestNodeUnschedulable

      Getting "rebalance-incomplete" event before the cluster enters the balanced state.

      Operator and Couchbase cluster logs are attached.

      util.go:189: Expected events to be:
                  Type: Normal | Reason: NewMemberAdded | Message: New member test-couchbase-k2zd9-0000 added to cluster
                  Type: Normal | Reason: BucketCreated | Message: A new bucket `default` was created
                  Type: Normal | Reason: NewMemberAdded | Message: New member test-couchbase-k2zd9-0001 added to cluster
                  Type: Normal | Reason: NewMemberAdded | Message: New member test-couchbase-k2zd9-0002 added to cluster
                  Type: Normal | Reason: NewMemberAdded | Message: New member test-couchbase-k2zd9-0003 added to cluster
                  Type: Normal | Reason: NewMemberAdded | Message: New member test-couchbase-k2zd9-0004 added to cluster
                  Type: Normal | Reason: RebalanceStarted | Message: A rebalance has been started to balance data across the cluster
                  Type: Normal | Reason: RebalanceCompleted | Message: A rebalance has completed
       
                  but got:
                  Type: Normal | Reason: NewMemberAdded | Message: New member test-couchbase-k2zd9-0000 added to cluster
                  Type: Normal | Reason: BucketCreated | Message: A new bucket `default` was created
                  Type: Normal | Reason: NewMemberAdded | Message: New member test-couchbase-k2zd9-0001 added to cluster
                  Type: Normal | Reason: NewMemberAdded | Message: New member test-couchbase-k2zd9-0002 added to cluster
                  Type: Normal | Reason: NewMemberAdded | Message: New member test-couchbase-k2zd9-0003 added to cluster
                  Type: Normal | Reason: NewMemberAdded | Message: New member test-couchbase-k2zd9-0004 added to cluster
                  Type: Normal | Reason: RebalanceStarted | Message: A rebalance has been started to balance data across the cluster
                  Type: Normal | Reason: RebalanceIncomplete | Message: A rebalance is incomplete
                  Type: Normal | Reason: RebalanceStarted | Message: A rebalance has been started to balance data across the cluster
                  Type: Normal | Reason: RebalanceCompleted | Message: A rebalance has completed

          Activity

            simon.murray Simon Murray added a comment -

            From the server logs:

            [ns_server:error,2018-08-05T11:19:39.949Z,ns_1@test-couchbase-k2zd9-0000.test-couchbase-k2zd9.ashwin.svc:<0.19219.0>:ns_single_vbucket_mover:spawn_and_wait:105]Got unexpected exit signal {'EXIT',<0.19239.0>,
                                        {bulk_set_vbucket_state_failed,
                                         [{'ns_1@test-couchbase-k2zd9-0002.test-couchbase-k2zd9.ashwin.svc',
                                           {'EXIT',
                                            {{{{{case_clause,
                                                 {error,
                                                  {{{badmatch,
                                                     {error,
                                                      {{badmatch,{error,nxdomain,}}

             

            So this appears to be a DNS issue (note the nxdomain error) and has nothing to do with the operator.  You may want to raise this with the NS server team to double-check.  One suggestion is to retry the connection attempt a few times with a Fibonacci backoff before declaring that you cannot connect. The error is transient, as evidenced by the fact that a subsequent rebalance succeeds.

            mikew Mike Wiederhold [X] (Inactive) added a comment -

            Closing this out since it's due to networking issues in the test environment.

            ashwin.govindarajulu Ashwin Govindarajulu added a comment -

            Closing this, since it relates to network flakiness in the K8S environment.

            People

              ashwin.govindarajulu Ashwin Govindarajulu
              ashwin.govindarajulu Ashwin Govindarajulu