  1. Couchbase Server
  2. MB-46039

[System Test] : Index rebalance failed due to reason - Post http://127.0.0.1:9102/createIndexRebalance: EOF

Details

    Description

      Build : 7.0.0-5057
      Test : -test tests/integration/cheshirecat/test_cheshirecat_kv_gsi_coll_xdcr_backup_sgw_fts_itemct_txns_eventing_cbas_scale3.yml -scope tests/integration/cheshirecat/scope_cheshirecat_with_backup.yml
      Scale : 3
      Day : 1
      Iteration : 1st

      Rebalance to add a new index node to the cluster failed

      [2021-04-29T17:19:34-07:00, sequoiatools/couchbase-cli:7.0:7935f5] server-add -c 172.23.108.103:8091 --server-add https://172.23.99.11 -u Administrator -p password --server-add-username Administrator --server-add-password password --services index
      [2021-04-29T17:19:54-07:00, sequoiatools/couchbase-cli:7.0:55b2a3] rebalance -c 172.23.108.103:8091 -u Administrator -p password
       
      Error occurred on container - sequoiatools/couchbase-cli:7.0:[rebalance -c 172.23.108.103:8091 -u Administrator -p password]
       
      docker logs 55b2a3
      docker start 55b2a3
       
      Unable to display progress bar on this os
      ERROR: Rebalance failed. See logs for detailed reason. You can try again.
      [2021-04-29T18:13:25-07:00, sequoiatools/cmd:16a0b6] 60
      

      [ns_server:error,2021-04-29T18:13:18.857-07:00,ns_1@172.23.108.103:service_rebalancer-index<0.19273.479>:service_rebalancer:run_rebalance_worker:119]Worker terminated abnormally: {'EXIT',<0.18026.479>,
                                     {rebalance_failed,
                                      {service_error,
                                       <<"Post http://127.0.0.1:9102/createIndexRebalance: EOF">>}}}
      [user:error,2021-04-29T18:13:18.862-07:00,ns_1@172.23.108.103:<0.22663.0>:ns_orchestrator:log_rebalance_completion:1405]Rebalance exited with reason {service_rebalance_failed,index,
                                    {worker_died,
                                     {'EXIT',<0.18026.479>,
                                      {rebalance_failed,
                                       {service_error,
                                        <<"Post http://127.0.0.1:9102/createIndexRebalance: EOF">>}}}}}.
      Rebalance Operation Id = 0cd1574cb1098413302c87c18d910e92
      

      Indexer nodes : 172.23.104.67, 172.23.121.117, 172.23.96.252, 172.23.96.253, 172.23.99.11

      The indexer logs on 172.23.104.67 and 172.23.121.117 show the following:

      172.23.104.67

      indexer.log.8.gz:2021-04-29T18:13:18.838-07:00 [Info] Rebalancer::decodeTransferToken TransferToken TransferToken20:90:9c:87:88:d3:d1:fe  MasterId: c7fc96e8303672d2c60e8de4ea7b2508 SourceId: c7fc96e8303672d2c60e8de4ea7b2508 (172.23.121.117:8091) DestId: 4f019a09f1fed08431cfd7f88966f0c0 (172.23.96.253:8091) RebalId: 8cfdef6fa8c2c512128f4da4a22ee75c State: TransferTokenCreated BuildSource: Dcp TransferMode: Move Error: Post http://127.0.0.1:9102/createIndexRebalance: EOF InstId: 12883746746932021693 RealInstId: 17570484419532551099 Partitions: [3] Versions: [9] Inst:
      

      172.23.121.117

      indexer.log.8.gz:2021-04-29T18:13:18.831-07:00 [Info] Rebalancer::decodeTransferToken TransferToken TransferToken20:90:9c:87:88:d3:d1:fe  MasterId: c7fc96e8303672d2c60e8de4ea7b2508 SourceId: c7fc96e8303672d2c60e8de4ea7b2508 (172.23.121.117:8091) DestId: 4f019a09f1fed08431cfd7f88966f0c0 (172.23.96.253:8091) RebalId: 8cfdef6fa8c2c512128f4da4a22ee75c State: TransferTokenCreated BuildSource: Dcp TransferMode: Move Error: Post http://127.0.0.1:9102/createIndexRebalance: EOF InstId: 12883746746932021693 RealInstId: 17570484419532551099 Partitions: [3] Versions: [9] Inst:
      indexer.log.8.gz:2021-04-29T18:13:18.831-07:00 [Error] Rebalancer::processTokenAsMaster Detected TransferToken in Error state  MasterId: c7fc96e8303672d2c60e8de4ea7b2508 SourceId: c7fc96e8303672d2c60e8de4ea7b2508 (172.23.121.117:8091) DestId: 4f019a09f1fed08431cfd7f88966f0c0 (172.23.96.253:8091) RebalId: 8cfdef6fa8c2c512128f4da4a22ee75c State: TransferTokenCreated BuildSource: Dcp TransferMode: Move Error: Post http://127.0.0.1:9102/createIndexRebalance: EOF InstId: 12883746746932021693 RealInstId: 17570484419532551099 Partitions: [3] Versions: [9] Inst:
      indexer.log.8.gz:2021-04-29T18:13:18.831-07:00 [Info] Rebalancer::doFinish Cleanup Post http://127.0.0.1:9102/createIndexRebalance: EOF
      indexer.log.8.gz:2021-04-29T18:13:18.831-07:00 [Info] Rebalancer::processDropIndexQueue Done Received
      indexer.log.8.gz:2021-04-29T18:13:18.831-07:00 [Info] Rebalancer::observeRebalance exiting err <nil>
      indexer.log.8.gz:2021-04-29T18:13:18.831-07:00 [Info] Rebalancer::updateProgress Done Received
      indexer.log.8.gz:2021-04-29T18:13:18.831-07:00 [Info] ServiceMgr::onRebalanceDoneLOCKED Rebalance Done, cancel: false, err: Post http://127.0.0.1:9102/createIndexRebalance: EOF
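
      The TransferToken lines above pack many `Key: value` fields into one line, which makes it hard to spot the source node, destination node, and error at a glance. A minimal sketch for splitting such a line into fields follows. It assumes the field names are exactly those visible in the excerpts above; `parse_transfer_token` and `SAMPLE` are hypothetical names used for illustration and are not part of any Couchbase tooling.

```python
import re

# Field names as they appear in the TransferToken log lines quoted above.
# This list is an assumption drawn from the excerpts, not from indexer source.
KEYS = [
    "MasterId", "SourceId", "DestId", "RebalId", "State", "BuildSource",
    "TransferMode", "Error", "InstId", "RealInstId", "Partitions",
    "Versions", "Inst",
]

# A key counts only when it starts at a word boundary and is followed by ':'
# plus whitespace or end of line, so "[Error]" and "Error state" are skipped.
KEY_RE = re.compile(r"\b(" + "|".join(KEYS) + r"):(?:\s+|$)")


def parse_transfer_token(line):
    """Split one TransferToken log line into a dict of field -> value."""
    matches = list(KEY_RE.finditer(line))
    fields = {}
    for i, m in enumerate(matches):
        # A value runs from the end of its key to the start of the next key.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
        fields[m.group(1)] = line[m.end():end].strip()
    return fields


# Sample taken from the 172.23.121.117 excerpt above.
SAMPLE = (
    "2021-04-29T18:13:18.831-07:00 [Error] Rebalancer::processTokenAsMaster "
    "Detected TransferToken in Error state  "
    "MasterId: c7fc96e8303672d2c60e8de4ea7b2508 "
    "SourceId: c7fc96e8303672d2c60e8de4ea7b2508 (172.23.121.117:8091) "
    "DestId: 4f019a09f1fed08431cfd7f88966f0c0 (172.23.96.253:8091) "
    "RebalId: 8cfdef6fa8c2c512128f4da4a22ee75c State: TransferTokenCreated "
    "BuildSource: Dcp TransferMode: Move "
    "Error: Post http://127.0.0.1:9102/createIndexRebalance: EOF "
    "InstId: 12883746746932021693 RealInstId: 17570484419532551099 "
    "Partitions: [3] Versions: [9] Inst:"
)

if __name__ == "__main__":
    token = parse_transfer_token(SAMPLE)
    for key in ("SourceId", "DestId", "State", "Error"):
        print(f"{key}: {token[key]}")
```

      Parsed this way, the failing token reads as a move of partition 3 from 172.23.121.117 to 172.23.96.253 that was still in state TransferTokenCreated when the POST to /createIndexRebalance returned EOF.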
      

      Marking this as a regression, since this issue was not seen in the recent longevity system test runs that ran for 5 and 7 days.

          People

            varun.velamuri Varun Velamuri
            mihir.kamdar Mihir Kamdar (Inactive)