  1. Couchbase Server
  2. MB-51666

[magma, 4TB, 1%]: Rebalance exited with reason service_rebalance_failed,index, agent_died. ServiceAPI.StartTopologyChange timed out.

Details

    Description

      1. Create a 3 node cluster
      2. Create required buckets and collections.
      3. Create 20000000 items sequentially
      4. Update 20000000 RandomKey keys to create 50 percent fragmentation
      5. Create 20000000 items sequentially
      6. Update 20000000 RandomKey keys to create 50 percent fragmentation
      7. Rebalance in with Loading of docs
        Rebalance completed with progress: 100% in 14996.132 sec
      8. Rebalance Out with Loading of docs
        Rebalance completed with progress: 100% in 15814.6619999 sec
      9. Rebalance In_Out with Loading of docs
        Rebalance completed with progress: 100% in 22021.734 sec
      10. Swap with Loading of docs
        Rebalance completed with progress: 100% in 17247.155 sec
      11. Failover 1 node and RebalanceOut that node with loading in parallel
        Rebalance done. Taken 16233.78 seconds to complete
      12. Failover a node and FullRecovery that node
        Rebalance completed with progress: 100% in 26818.3310001 sec
      13. Failover a node and DeltaRecovery that node with loading in parallel
        Rebalance completed with progress: 100% in 740.840000153 sec
      14. Update the bucket replica count to 2. The replica update succeeded on the KV side, but the index service topology change appears to have failed after the successful KV rebalance because the ServiceAPI.StartTopologyChange call timed out.
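
      To make the timings in steps 7-13 easier to compare, the following is a minimal sketch that converts the reported durations to H:MM:SS. The step labels and the helper names (`to_hms`, `summarize`) are illustrative assumptions; only the second counts come from the steps above.

```python
# Rebalance durations in seconds, copied from steps 7-13 above.
# The dictionary keys are shorthand labels chosen for this sketch.
DURATIONS_SEC = {
    "Rebalance in": 14996.132,
    "Rebalance out": 15814.6619999,
    "Rebalance in/out": 22021.734,
    "Swap": 17247.155,
    "Failover + rebalance out": 16233.78,
    "Failover + full recovery": 26818.3310001,
    "Failover + delta recovery": 740.840000153,
}


def to_hms(seconds):
    """Format a duration in seconds as H:MM:SS (fractional part truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def summarize(durations):
    """Return a mapping of step label to H:MM:SS string."""
    return {name: to_hms(sec) for name, sec in durations.items()}


if __name__ == "__main__":
    for name, hms in summarize(DURATIONS_SEC).items():
        print(f"{name:<28} {hms}")
```

      Read this way, every rebalance before step 13 ran for more than four hours, while the delta recovery in step 13 finished in roughly twelve minutes.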

      Rebalance exited with reason {service_rebalance_failed,index,
      {agent_died,<33904.5851.0>,
      {linked_process_died,<33904.30386.431>,
      {'ns_1@172.23.107.126',
      {timeout,
      {gen_server,call,
      [<33904.5910.0>,
      {call,"ServiceAPI.StartTopologyChange",
      #Fun<json_rpc_connection.0.86436583>,
      #{timeout => 60000}},
      60000]}}}}}}.
      Rebalance Operation Id = 7dde3911b331195713c177a2f5c96c5a
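
      For triage, the key fields can be pulled out of the failure reason above. The following is a minimal sketch using regular expressions; `parse_failure` and the returned field names are hypothetical helpers for illustration, not part of any Couchbase tool. It assumes the message keeps the shape shown above and that the timeout is in milliseconds, as is standard for Erlang `gen_server:call`.

```python
import re

# Failure reason copied verbatim from the report above.
FAILURE_REASON = """Rebalance exited with reason {service_rebalance_failed,index,
{agent_died,<33904.5851.0>,
{linked_process_died,<33904.30386.431>,
{'ns_1@172.23.107.126',
{timeout,
{gen_server,call,
[<33904.5910.0>,
{call,"ServiceAPI.StartTopologyChange",
#Fun<json_rpc_connection.0.86436583>,
#{timeout => 60000}},
60000]}}}}}}."""


def parse_failure(text):
    """Extract service, node, RPC method, and timeout from a failure reason.

    Hypothetical helper: any field that does not match is returned as None.
    """
    service = re.search(r"\{service_rebalance_failed,\s*(\w+)", text)
    node = re.search(r"'(ns_1@[^']+)'", text)
    rpc = re.search(r'\{call,\s*"([^"]+)"', text)
    timeout = re.search(r"#\{timeout\s*=>\s*(\d+)\}", text)
    return {
        "service": service.group(1) if service else None,
        "node": node.group(1) if node else None,
        "rpc": rpc.group(1) if rpc else None,
        "timeout_ms": int(timeout.group(1)) if timeout else None,
    }


if __name__ == "__main__":
    for key, value in parse_failure(FAILURE_REASON).items():
        print(f"{key}: {value}")
```

      On this report, the sketch identifies the index service on node ns_1@172.23.107.126 and a 60000 ms limit on the ServiceAPI.StartTopologyChange call.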
      

      Note: The same test passed at a higher scale on better machines.

          People

            ritesh.agarwal Ritesh Agarwal