Couchbase Server / MB-62724

[Eventing][K8S] : Rebalance exited with reason <<"failed to aggregate rebalance progress from all eventing nodes, err: map[127.0.0.1:8096:Get \"http://127.0.0.1:8096/getAggRebalanceProgress\": context deadline exceeded">>

    Description

      Steps to reproduce

      1. Created a 3-node cluster on k8s using the operator, with all services enabled.
      2. On one pod, killed memcached in a loop. Multiple failovers and rebalance failures occurred, as expected.
      3. Stopped the memcached kill loop.
      4. Every rebalance after this point fails. The operator keeps retriggering the rebalance, so the failure repeats in a loop.

      Rebalance exited with reason {service_rebalance_failed,eventing,
      {worker_died,
      {'EXIT',<0.3701.18>,
      {task_failed,rebalance,
      {service_error,
      <<"failed to aggregate rebalance progress from all eventing nodes, err: map[127.0.0.1:8096:Get \"http://127.0.0.1:8096/getAggRebalanceProgress\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)]">>}}}}}.
      Rebalance Operation Id = 73d4b9c58925559a9ce54cb6089d2624 

      While the rebalances were failing, I queried several endpoints from inside the pods. Both returned errors:

      curl -X GET http://127.0.0.1:8096/getAggRebalanceProgress -u Administrator:password
      {
       "name": "INTERNAL_SERVER_ERROR",
       "code": 59,
       "attributes": [
        "retry"
       ],
       "description": "Failed to get progress"
      }
       
      curl -X GET http://127.0.0.1:8096/api/v1/status -u Administrator:password
      {
       "name": "INTERNAL_SERVER_ERROR",
       "code": 59,
       "attributes": [
        "retry"
       ],
       "description": "Unable to fetch active Eventing nodes, err: Get \"http://cb-example-0001.cb-example.default.svc:8096/uuid\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
      } 
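Both error bodies above share the same shape and list `retry` under `attributes`. As a minimal sketch for triaging such responses, the Python below parses a body of that shape and reports whether it carries the `retry` attribute. The helper names `is_retryable` and `summarize` are hypothetical, and reading `retry` as "the caller may retry" is an assumption based only on the attribute's name, not on documented Eventing behavior.

```python
import json

# Sample body copied from the getAggRebalanceProgress response above.
SAMPLE = """{
 "name": "INTERNAL_SERVER_ERROR",
 "code": 59,
 "attributes": ["retry"],
 "description": "Failed to get progress"
}"""


def is_retryable(body: str) -> bool:
    """Return True if the error body lists "retry" in its attributes.

    Hypothetical helper: assumes the JSON shape shown in this report.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    attributes = payload.get("attributes")
    return isinstance(attributes, list) and "retry" in attributes


def summarize(body: str) -> str:
    """Render a one-line summary of an error body of the shape above."""
    payload = json.loads(body)
    return "{} (code {}): {} [retryable={}]".format(
        payload.get("name"),
        payload.get("code"),
        payload.get("description"),
        is_retryable(body),
    )


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Feeding it the output of either curl command gives a single line per response, which is easier to compare across pods than the full JSON.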

          People

            santosh.hegde Santosh Hegde
            raghav.sk Raghav S K