  1. Couchbase Server
  2. MB-50716

Query service rebalance failed due to timeout during ServiceAPI.PrepareTopologyChange call with reason 'linked_process_died'


Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 7.1.0
    • 7.1.0
    • query
    • Enterprise Edition 7.1.0 build 2179

    Description

       

      Steps:

      • 111-node cluster running multiple services
      • Two Couchbase buckets
      • One KV node failed over
      • Updated travel-sample bucket to replica=3
      • Performed a full add-back of the failed node
      • Triggered rebalance
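      The steps above can be sketched as a sequence of cluster REST calls. This is only an illustrative outline, not the reporter's actual test script: the node names and `build_repro_requests` are hypothetical, a hard failover through `/controller/failOver` is assumed (the report does not say how the node was failed over), and the parameters would be sent as form-encoded POST bodies to the cluster manager.

```python
def build_repro_requests(failed_node, known_nodes, bucket="travel-sample"):
    """Return the (method, path, params) tuples that mirror the reported steps.

    failed_node and known_nodes are otpNode names (for example
    "ns_1@10.0.0.2"); the values used here are placeholders, not taken
    from the report.
    """
    return [
        # Fail over one KV node (assumption: hard failover).
        ("POST", "/controller/failOver", {"otpNode": failed_node}),
        # Raise the replica count of the bucket to 3.
        ("POST", f"/pools/default/buckets/{bucket}", {"replicaNumber": 3}),
        # Mark the failed-over node for full recovery (full add-back).
        ("POST", "/controller/setRecoveryType",
         {"otpNode": failed_node, "recoveryType": "full"}),
        # Trigger the rebalance across all known nodes, ejecting none.
        ("POST", "/controller/rebalance",
         {"knownNodes": ",".join(known_nodes), "ejectedNodes": ""}),
    ]


if __name__ == "__main__":
    nodes = ["ns_1@10.0.0.1", "ns_1@10.0.0.2"]  # hypothetical nodes
    for method, path, params in build_repro_requests(nodes[1], nodes):
        print(method, path, params)
```

      Building the requests as plain data keeps the order of operations visible and lets the sequence be reviewed before anything is sent to a real cluster.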

      Observation:

      Rebalance failed with the following reason:

      Rebalance exited with reason {service_rebalance_failed,n1ql,
      {agent_died,<33863.30199.52>,
      {linked_process_died,<33863.14699.864>,
      {'ns_1@172.23.122.178',
      {timeout,
      {gen_server,call,
      [<33863.3087.53>,
      {call,"ServiceAPI.PrepareTopologyChange",
      #Fun<json_rpc_connection.0.86436583>,
      #{timeout => 60000}},
      60000]}}}}}}.
      Rebalance Operation Id = f821e08ef755342990a10bc0f2641714
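      The nested term above names the failing service (n1ql), the node, the RPC that timed out, and the timeout in milliseconds (60000, that is 60 seconds). A minimal sketch of pulling those fields out of such a reason string follows; `parse_rebalance_failure` is a hypothetical helper written for this report, and the regular expressions are fitted to this one message rather than to any documented log format.

```python
import re

# Failure reason copied from the report above.
REASON = r"""{service_rebalance_failed,n1ql,
{agent_died,<33863.30199.52>,
{linked_process_died,<33863.14699.864>,
{'ns_1@172.23.122.178',
{timeout,
{gen_server,call,
[<33863.3087.53>,
{call,"ServiceAPI.PrepareTopologyChange",
#Fun<json_rpc_connection.0.86436583>,
#{timeout => 60000}},
60000]}}}}}}."""


def _first(pattern, text):
    """Return the first capture group of pattern in text, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def parse_rebalance_failure(reason):
    """Extract service, node, RPC name, and timeout (ms) from a reason term."""
    timeout = _first(r"timeout => (\d+)", reason)
    return {
        "service": _first(r"service_rebalance_failed,(\w+)", reason),
        "node": _first(r"'([^'@]+@[^']+)'", reason),
        "rpc": _first(r'\{call,"([^"]+)"', reason),
        "timeout_ms": int(timeout) if timeout is not None else None,
    }


if __name__ == "__main__":
    print(parse_rebalance_failure(REASON))
```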

      After the failure, the following issues were observed:

      • The Rebalance button is not enabled in the UI
      • The REST endpoint reports that the cluster is balanced according to ns_server (but the auto-failover counter is not reset)

      "balanced": true,
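      The inconsistency described above (a cluster reported as balanced while the auto-failover counter is still non-zero) could be checked with a sketch like the following. It assumes the payloads come from `GET /pools/default` (fields `balanced`, `rebalanceStatus`) and `GET /settings/autoFailover` (field `count`); `find_state_mismatch` and the sample values are hypothetical and are not taken from the attached pools_default.txt.

```python
def find_state_mismatch(pools_default, auto_failover):
    """Return human-readable notes on contradictions between two payloads.

    pools_default: parsed JSON from GET /pools/default
    auto_failover: parsed JSON from GET /settings/autoFailover
    """
    notes = []
    balanced = pools_default.get("balanced", False)
    status = pools_default.get("rebalanceStatus", "none")
    count = auto_failover.get("count", 0)

    # A balanced cluster is not expected to carry a pending failover count.
    if balanced and count > 0:
        notes.append(
            f"cluster reports balanced=true but auto-failover count is {count}"
        )
    # A rebalance still marked as running contradicts balanced=true.
    if balanced and status != "none":
        notes.append(
            f"cluster reports balanced=true but rebalanceStatus is {status!r}"
        )
    return notes


if __name__ == "__main__":
    # Hypothetical payload fragments resembling the reported state.
    pools = {"balanced": True, "rebalanceStatus": "none"}
    failover = {"enabled": True, "timeout": 120, "count": 1}
    for note in find_state_mismatch(pools, failover):
        print(note)
```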

       

      Attachments

        1. pools_default.txt (195 kB)
        2. rebalance_issue.jpg (536 kB)
        3. Rest requests.png (116 kB)
        4. sys_cpu_utilization_rate.png (121 kB)

          People

            ashwin.govindarajulu Ashwin Govindarajulu
            ashwin.govindarajulu Ashwin Govindarajulu