  1. Couchbase Server
  2. MB-46840

[System Test] Rebalance to remove a failed over index node failed with error inactivity_timeout

Details

    Description

      Build : 7.0.0-5274
      Test : -test tests/integration/cheshirecat/test_cheshirecat_kv_gsi_coll_xdcr_backup_sgw_fts_itemct_txns_eventing_cbas_scale3.yml -scope tests/integration/cheshirecat/scope_cheshirecat_with_backup.yml (Longevity)
      Scale : 3
      Iteration : 3rd
      Day : 4th

      The longevity system test includes a step that fails over an index node (172.23.120.58) and then rebalances it out of the cluster. That rebalance failed with the following error:

      Test console

      [2021-06-08T15:22:24-07:00, sequoiatools/couchbase-cli:7.0:34c71b] failover -c 172.23.97.74:8091 --server-failover 172.23.120.58:8091 -u Administrator -p password --hard
      [2021-06-08T15:22:33-07:00, sequoiatools/couchbase-cli:7.0:4ed641] rebalance -c 172.23.97.74:8091 -u Administrator -p password
       
      Error occurred on container - sequoiatools/couchbase-cli:7.0:[rebalance -c 172.23.97.74:8091 -u Administrator -p password]
       
      docker logs 4ed641
      docker start 4ed641
       
      Unable to display progress bar on this os
      ERROR: Rebalance failed. See logs for detailed reason. You can try again.
      [2021-06-08T15:34:02-07:00, sequoiatools/cmd:22c663] 60
      

      From error.log on the orchestrator node (172.23.97.74):

      [ns_server:error,2021-06-08T15:33:54.887-07:00,ns_1@172.23.97.74:service_rebalancer-index<0.4905.2860>:service_rebalancer:run_rebalance_worker:119]Worker terminated abnormally: {'EXIT',<0.2503.2864>,
                                            {rebalance_failed,inactivity_timeout}}
      [user:error,2021-06-08T15:33:54.890-07:00,ns_1@172.23.97.74:<0.25747.0>:ns_orchestrator:log_rebalance_completion:1416]Rebalance exited with reason {service_rebalance_failed,index,
                                    {worker_died,
                                     {'EXIT',<0.2503.2864>,
                                      {rebalance_failed,inactivity_timeout}}}}.
      Rebalance Operation Id = 9730844ea608e8291f72899afccfc5f4
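
When triaging a failure like this one, it can help to pull the failing service and the innermost reason out of the nested Erlang term in the ns_orchestrator entry. The following is a minimal sketch only: `summarize_rebalance_failure` and both regular expressions are hypothetical helpers written for this log excerpt, not part of any Couchbase tooling, and they assume the reason keeps the `{rebalance_failed,<reason>}` shape seen above.

```python
import re

# Hypothetical helper (an assumption, not Couchbase tooling): matches the
# outer term logged by ns_orchestrator, e.g.
# "Rebalance exited with reason {service_rebalance_failed,index,"
REBALANCE_EXIT = re.compile(
    r"Rebalance exited with reason \{(?P<kind>\w+),\s*(?P<service>\w+),"
)
# Matches the innermost tuple, e.g. "{rebalance_failed,inactivity_timeout}".
INNER_REASON = re.compile(r"\{rebalance_failed,\s*(?P<reason>\w+)\}")


def summarize_rebalance_failure(log_text):
    """Return (kind, service, reason) for the first failed rebalance, or None."""
    exit_match = REBALANCE_EXIT.search(log_text)
    if exit_match is None:
        return None
    # Search for the nested reason only after the "Rebalance exited" marker,
    # so earlier worker messages in the same log are not picked up instead.
    reason_match = INNER_REASON.search(log_text, exit_match.end())
    reason = reason_match.group("reason") if reason_match else None
    return exit_match.group("kind"), exit_match.group("service"), reason


# Excerpt copied from the error.log entry quoted in this report.
sample = """[user:error,2021-06-08T15:33:54.890-07:00,ns_1@172.23.97.74:<0.25747.0>:ns_orchestrator:log_rebalance_completion:1416]Rebalance exited with reason {service_rebalance_failed,index,
                              {worker_died,
                               {'EXIT',<0.2503.2864>,
                                {rebalance_failed,inactivity_timeout}}}}.
"""

print(summarize_rebalance_failure(sample))
# prints ('service_rebalance_failed', 'index', 'inactivity_timeout')
```

For this report the summary confirms that the index service rebalance worker, not the KV rebalance, ended with `inactivity_timeout`.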
      

      Subsequent rebalances completed successfully.

      Index nodes in the cluster : 172.23.120.58, 172.23.96.243, 172.23.97.105, 172.23.97.110, 172.23.97.148, 172.23.120.75

          People

            mihir.kamdar Mihir Kamdar (Inactive)