  1. Couchbase Server
  2. MB-45939

[System Test Upgrade] Graceful failover + delta recovery + rebalance of indexing + kv node hangs


Details

    • Untriaged
    • Centos 64-bit
    • 1
    • No

    Description

      Steps to Repro
      Same as MB-45920.

      Deepkaran Salooja suggested the following workaround, which worked the first time:

      Balakumaran Gopal, to unblock you can cancel the rebalance, wait for 10 minutes (to allow the rolled-back indexes to rebuild), and run the rebalance again.
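
For anyone scripting this workaround, the three steps (cancel, wait, rebalance again) can be sketched roughly as below. This is only an illustration, not the procedure actually used in this test: it assumes the cluster manager REST endpoints `/controller/stopRebalance` and `/controller/rebalance` on the admin port (8091 by default), and the function names, credentials, and node list are hypothetical placeholders. `knownNodes` must list every node in the cluster, which this ticket does not enumerate.

```python
import base64
import time
import urllib.parse
import urllib.request


def make_post(user, password):
    """Return a post(url, form) callable that sends a form-encoded POST with basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()

    def post(url, form):
        body = urllib.parse.urlencode(form).encode()
        req = urllib.request.Request(
            url, data=body, headers={"Authorization": f"Basic {token}"}
        )
        with urllib.request.urlopen(req) as resp:
            return resp.status

    return post


def cancel_wait_rebalance(base_url, known_nodes, post, sleep=time.sleep, wait_seconds=600):
    """Cancel the running rebalance, wait, then start a new rebalance.

    wait_seconds defaults to 600 (the 10 minutes suggested in this ticket);
    post and sleep are injectable so the sequence can be exercised offline.
    """
    # Step 1: cancel the hung rebalance.
    post(f"{base_url}/controller/stopRebalance", {})
    # Step 2: give the indexer time to rebuild the rolled-back indexes.
    sleep(wait_seconds)
    # Step 3: start the rebalance again with the full node list and no ejections.
    post(
        f"{base_url}/controller/rebalance",
        {"knownNodes": ",".join(known_nodes), "ejectedNodes": ""},
    )


# Hypothetical usage (placeholder credentials and node list):
# post = make_post("Administrator", "password")
# cancel_wait_rebalance("http://172.23.105.102:8091", ["ns_1@172.23.105.102", "..."], post)
```

The fixed 10-minute wait is taken directly from the suggestion above; a more careful script would poll index state instead of sleeping for a fixed interval.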

      I then started one more cycle of graceful failover + delta recovery + rebalance of the indexing + kv node, which hung again. I tried the above workaround once more, but this time the rebalance failed as shown below.

      ns_1@172.23.105.102 1:34:46 AM 27 Apr, 2021

      Rebalance exited with reason {service_rebalance_failed,index,
      {{badmatch,
      {error,
      {bad_nodes,index,get_agent,
      [{'ns_1@172.23.105.109',
      {exit,
      {{timeout,
      {gen_server,call,
      [<29698.2821.1350>,
      {call,"ServiceAPI.GetTaskList",
      #Fun<json_rpc_connection.0.23897766>},
      60000]}},
      {gen_server,call,
      [{'service_agent-index',
      'ns_1@172.23.105.109'},
      get_agent,infinity]}}}}]}}},
      [{service_rebalancer,wait_for_agents,1,
      [{file,"src/service_rebalancer.erl"},
      {line,80}]},
      {service_rebalancer,run_rebalance,1,
      [{file,"src/service_rebalancer.erl"},
      {line,59}]},
      {proc_lib,init_p,3,
      [{file,"proc_lib.erl"},{line,234}]}]}}.
      Rebalance Operation Id = 26d6b05a176132a390d09479b6680f86
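
For triage, the key fields of the failure term above (service, unreachable node, timed-out RPC, timeout value) can be pulled out mechanically. The following is a rough sketch using regular expressions tuned to this particular message shape; the function name is hypothetical, and this is not a general Erlang term parser.

```python
import re


def _first(pattern, text):
    """Return the first capture group of pattern in text, or None if absent."""
    match = re.search(pattern, text, re.S)
    return match.group(1) if match else None


def summarize_rebalance_failure(text):
    """Extract the main fields from a service_rebalance_failed message."""
    timeout = _first(r"#Fun<[^>]*>\},\s*(\d+)\]", text)
    return {
        # Service whose rebalance failed, e.g. index.
        "service": _first(r"service_rebalance_failed,\s*(\w+)", text),
        # First node listed under bad_nodes.
        "bad_node": _first(r"bad_nodes,.*?\[\{'([^']+)'", text),
        # JSON-RPC method that the gen_server call was waiting on.
        "rpc": _first(r'\{call,\s*"([^"]+)"', text),
        # gen_server:call timeout, in milliseconds.
        "timeout_ms": int(timeout) if timeout is not None else None,
    }
```

Applied to the message above, this indicates that the index service agent on ns_1@172.23.105.109 did not answer a ServiceAPI.GetTaskList call within the 60000 ms timeout.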
      

      Retry 1 :
      Retry 2 :

      See rebalanceReport.json for the rebalance report. I also noticed that one of the indexes (o3_result_rating) has been stuck in the moving state for roughly 6-7 hours. See for details.

      cbcollect_info attached.


        Activity

          People

            kevin.cherkauer Kevin Cherkauer (Inactive)
            Balakumaran.Gopal Balakumaran Gopal
            Votes: 0
            Watchers: 4

