  1. Couchbase Server
  2. MB-44311

[System Test] : Index rebalance failures due to Index bucket3.idx3 already exists

Details

    Description

      Build : 7.0.0-4444
      Test : -test tests/2i/cheshirecat/test_idx_clusterops_cheshire_cat_basic.yml -scope tests/2i/cheshirecat/scope_idx_cheshire_cat.yml
      Scale : 2
      Iteration : 1st

      In the system test with cluster ops, a rebalance operation that adds 2 indexer nodes (172.23.97.77 and 172.23.97.82) to the cluster fails with the following error:

      [ns_server:error,2021-02-12T03:38:09.884-08:00,ns_1@172.23.104.16:service_rebalancer-index<0.22002.126>:service_rebalancer:run_rebalance_worker:125]Worker terminated abnormally: {'EXIT',<0.22562.126>,
                                     {rebalance_failed,
                                      {service_error,
                                       <<"Index bucket3.idx3 already exists">>}}}
      [user:error,2021-02-12T03:38:09.890-08:00,ns_1@172.23.104.16:<0.11450.0>:ns_orchestrator:log_rebalance_completion:1406]Rebalance exited with reason {service_rebalance_failed,index,
                                    {worker_died,
                                     {'EXIT',<0.22562.126>,
                                      {rebalance_failed,
                                       {service_error,
                                        <<"Index bucket3.idx3 already exists">>}}}}}.
      Rebalance Operation Id = 1a3e57a4f83e55941c23110ebdb5cc87
      

      On 172.23.97.82, the following is observed in the logs around the same time:

      2021-02-12T03:38:09.881-08:00 [Info] Rebalancer::decodeTransferToken TransferToken TransferTokena5:54:89:9c:48:68:fb:b1  MasterId: d8741c2cb61a493a36b1854435dfb5d7 SourceId:  DestId: 1d4b645fc5f87178298885f28a5acc62 RebalId: 1ee310480e082bf53ee4c8c0fa83344a State: TransferTokenCreated BuildSource: Dcp TransferMode: Copy Error: Index bucket3.idx3 already exists InstId: 6936193397592831477 RealInstId: 0 Partitions: [0] Versions: [1] Inst:
              InstId: 6936193397592831477
              Defn: DefnId: 12703198803255379773 Name: idx3 Using: plasma Bucket: bucket3 Scope/Id: scope_0/8 Collection/Id: coll_-1/9 IsPrimary: false NumReplica: 3 InstVersion: 1
                      SecExprs: <ud>([`free_breakfast` `free_parking` `country` `city`])</ud>
                      Desc: [false false false false]
                      PartitionScheme: SINGLE
                      HashScheme: CRC32 PartitionKeys: [] WhereExpr: <ud>()</ud> RetainDeletedXATTR: false
              State: INDEX_STATE_ACTIVE
              RState: RebalActive
              Stream: NIL_STREAM
              Version: 0
              ReplicaId: 0
              PartitionContainer: <nil>
      2021-02-12T03:38:09.881-08:00 [Info] Rebalancer::processTokens RebalanceToken /indexing/rebalance/RebalanceToken []
      2021-02-12T03:38:09.881-08:00 [Info] Rebalancer::processTokens Rebalance Token Deleted. Mark Done.
      2021-02-12T03:38:09.881-08:00 [Info] Rebalancer::doFinish Cleanup <nil>
      2021-02-12T03:38:09.881-08:00 [Info] Rebalancer::processDropIndexQueue Done Received
      2021-02-12T03:38:09.886-08:00 [Info] ServiceMgr::GetTaskList []
      2021-02-12T03:38:09.886-08:00 [Info] ServiceMgr::GetTaskList returns &{[0 0 0 0 0 0 0 1] [{[0 0 0 0 0 0 0 0] prepare/1ee310480e082bf53ee4c8c0fa83344a task-prepared task-running true 0 map[]   map[rebalanceId:1ee310480e082bf53ee4c8c0fa83344a]}]}
      2021-02-12T03:38:09.887-08:00 [Info] ServiceMgr::GetCurrentTopology []
      2021-02-12T03:38:09.887-08:00 [Info] ServiceMgr::GetCurrentTopology returns &{[0 0 0 0 0 0 0 1] [634bd42a7ecc6e83f9f52a695318759a 1d4b645fc5f87178298885f28a5acc62 c7a25dfedb477e1230ab323ddced94f6 56044fe58b829ee9496755b8a03abdde d8741c2cb61a493a36b1854435dfb5d7] true []}
      2021-02-12T03:38:09.888-08:00 [Info] ServiceMgr::CancelTask prepare/1ee310480e082bf53ee4c8c0fa83344a []
      2021-02-12T03:38:09.888-08:00 [Info] ServiceMgr::cleanupRebalanceRunning Cleanup
      2021-02-12T03:38:09.888-08:00 [Info] ClustMgr:handleDelLocalValue Key RebalanceRunning

      Similar entries appear in the logs for 172.23.97.77:

      2021-02-12T03:38:09.761-08:00 [Info] Rebalancer::decodeTransferToken TransferToken TransferTokena5:54:89:9c:48:68:fb:b1  MasterId: d8741c2cb61a493a36b1854435dfb5d7 SourceId:  DestId: 1d4b645fc5f87178298885f28a5acc62 RebalId: 1ee310480e082bf53ee4c8c0fa83344a State: TransferTokenCreated BuildSource: Dcp TransferMode: Copy Error: Index bucket3.idx3 already exists InstId: 6936193397592831477 RealInstId: 0 Partitions: [0] Versions: [1] Inst:
              InstId: 6936193397592831477
              Defn: DefnId: 12703198803255379773 Name: idx3 Using: plasma Bucket: bucket3 Scope/Id: scope_0/8 Collection/Id: coll_-1/9 IsPrimary: false NumReplica: 3 InstVersion: 1
                      SecExprs: <ud>([`free_breakfast` `free_parking` `country` `city`])</ud>
                      Desc: [false false false false]
                      PartitionScheme: SINGLE
                      HashScheme: CRC32 PartitionKeys: [] WhereExpr: <ud>()</ud> RetainDeletedXATTR: false
              State: INDEX_STATE_ACTIVE
              RState: RebalActive
              Stream: NIL_STREAM
              Version: 0
              ReplicaId: 0
              PartitionContainer: <nil>
      2021-02-12T03:38:09.761-08:00 [Info] Rebalancer::processTokens RebalanceToken /indexing/rebalance/RebalanceToken []
      2021-02-12T03:38:09.761-08:00 [Info] Rebalancer::processTokens Rebalance Token Deleted. Mark Done.

      Later in the test, 3 more rebalance failures of the same kind were observed.
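
      To locate every occurrence across the collected logs, something like the following could be used. It is only a sketch: the field layout (DestId, Error, InstId) is inferred from the decodeTransferToken excerpts in this report rather than from a documented log format, and the function name find_duplicate_index_errors is hypothetical.

```python
import re
from collections import Counter

# Field order is inferred from the decodeTransferToken excerpts in this report;
# it is an assumption, not a documented indexer log format.
TOKEN_ERROR = re.compile(
    r"Rebalancer::decodeTransferToken .*?"
    r"DestId: (?P<dest>\S+) .*?"
    r"Error: (?P<error>.*?) InstId: (?P<inst>\d+)"
)
ALREADY_EXISTS = re.compile(r"Index (?P<index>\S+) already exists")


def find_duplicate_index_errors(lines):
    """Count 'already exists' transfer-token errors per (index, InstId, DestId)."""
    hits = Counter()
    for line in lines:
        token = TOKEN_ERROR.search(line)
        if token is None:
            continue  # not a decodeTransferToken line
        error = ALREADY_EXISTS.fullmatch(token.group("error").strip())
        if error is None:
            continue  # token carries no error, or a different one
        key = (error.group("index"), token.group("inst"), token.group("dest"))
        hits[key] += 1
    return hits
```

      A possible use is to pass an open log file (for example the indexer log from each node's collected logs, path assumed) to find_duplicate_index_errors and print the resulting counter. Grouping by InstId and DestId makes it easier to tell whether the later failures involve the same index instance and destination node as the first one.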

      This could be a new regression, since the same issue was fixed and verified in https://issues.couchbase.com/browse/MB-42220.

          People

            varun.velamuri Varun Velamuri
            mihir.kamdar Mihir Kamdar (Inactive)