MB-60517: Dropped replicas not rebuilt in swap rebalance


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Fix Version/s: Morpheus, 7.6.2
    • Affects Version/s: 7.6.0
    • Component/s: secondary-index
    • Labels: None
    • Triage: Untriaged
    • 0
    • Yes

    Description

      While writing new tests, I found that dropped replicas are not rebuilt on the existing indexer nodes in the cluster when running rebalance (this holds for both DCP and shard-based rebalance). Logs are attached.

      Steps to recreate -

      • create a cluster with nodes n0 - kv + query, n1 - index, n2 - index
      • create "x" (where x > 1) partitioned indices with 1 replica
      • drop at least 1 replica from each index via the alter index command (see the sketch after this list)
      • run rebalance
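
      For manual reproduction, the statements behind steps 2 and 3 look roughly like the sketch below. It reuses the executeN1qlStmt helper and BUCKET constant from the test harness further down; the index and field names are illustrative only.

      	// Sketch only (hypothetical index/field names): create one partitioned index
      	// with a replica, then drop replica 1 via alter index. Repeat the create for
      	// x > 1 indexes. executeN1qlStmt and BUCKET come from the test harness below.
      	createStmt := fmt.Sprintf(
      		"create index idx_repro on `%v`(field1) partition by hash(Meta().id) "+
      			"with {\"num_partition\":5, \"num_replica\":1}", BUCKET)
      	executeN1qlStmt(createStmt, BUCKET, "repro", t)
       
      	dropStmt := fmt.Sprintf(
      		"alter index idx_repro on `%v` with {\"action\": \"drop_replica\", \"replicaId\": 1}",
      		BUCKET)
      	executeN1qlStmt(dropStmt, BUCKET, "repro", t)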

      The expectation is that the rebalance will recreate the dropped replicas, but that does not happen. Instead I see planner errors like the ones below, and the final layout is still missing the dropped replicas -

      // node :9001 is in the cluster, node :9002 coming in, node :9003 going out
       
      2024-01-01T05:40:58.105+00:00 [Info] Planner::Fail to create plan satisfying constraint. Re-planning. Num of Try=5.  Elapsed Time=352us, err: 
      MemoryQuota: 1572864000
      CpuQuota: 6
      --- Violations for index <TestReplicaRepairInMixedModeRebalance_5PTN_1RP__id_balance 5 (replica 1), default, _default, _default> (mem 1.95297M, cpu 0) at node 127.0.0.1:9003 
      	Cannot move to 127.0.0.1:9002: ReplicaViolation (free mem 463.377M, free cpu 6)
      	Cannot move to 127.0.0.1:9001: ExcludeNodeViolation (free mem 778.731M, free cpu 6)
      --- Violations for index <TestReplicaRepairInMixedModeRebalance_5PTN_1RP__id_balance 2 (replica 1), default, _default, _default> (mem 1.69818M, cpu 0) at node 127.0.0.1:9003 
      	Cannot move to 127.0.0.1:9002: ReplicaViolation (free mem 463.377M, free cpu 6)
      	Cannot move to 127.0.0.1:9001: ExcludeNodeViolation (free mem 778.731M, free cpu 6)
      --- Violations for index <TestReplicaRepairInMixedModeRebalance_5PTN_1RP__id_balance 1 (replica 1), default, _default, _default> (mem 2.0672M, cpu 0) at node 127.0.0.1:9003 
      	Cannot move to 127.0.0.1:9002: ReplicaViolation (free mem 463.377M, free cpu 6)
      	Cannot move to 127.0.0.1:9001: ExcludeNodeViolation (free mem 778.731M, free cpu 6)
      --- Violations for index <TestReplicaRepairInMixedModeRebalance_5PTN_1RP_docid_picture 5, default, _default, _default> (mem 2.44296M, cpu 0) at node 127.0.0.1:9003 
      	Cannot move to 127.0.0.1:9002: ReplicaViolation (free mem 463.377M, free cpu 6)
      	Cannot move to 127.0.0.1:9001: ExcludeNodeViolation (free mem 778.731M, free cpu 6)
      --- Violations for index <TestReplicaRepairInMixedModeRebalance_5PTN_1RP_docid_picture 1, default, _default, _default> (mem 2.57815M, cpu 0) at node 127.0.0.1:9003 
      	Cannot move to 127.0.0.1:9002: ReplicaViolation (free mem 463.377M, free cpu 6)
      	Cannot move to 127.0.0.1:9001: ExcludeNodeViolation (free mem 778.731M, free cpu 6)
      --- Violations for index <TestReplicaRepairInMixedModeRebalance_5PTN_1RP_docid_picture 3, default, _default, _default> (mem 2.46969M, cpu 0) at node 127.0.0.1:9003 
      	Cannot move to 127.0.0.1:9002: ReplicaViolation (free mem 463.377M, free cpu 6)
      	Cannot move to 127.0.0.1:9001: ExcludeNodeViolation (free mem 778.731M, free cpu 6)
      --- Violations for index <TestReplicaRepairInMixedModeRebalance_5PTN_1RP_guid_age 5, default, _default, _default> (mem 2.02009M, cpu 0) at node 127.0.0.1:9003 
      	Cannot move to 127.0.0.1:9002: ReplicaViolation (free mem 463.377M, free cpu 6)
      	Cannot move to 127.0.0.1:9001: ExcludeNodeViolation (free mem 778.731M, free cpu 6)
      --- Violations for index <TestReplicaRepairInMixedModeRebalance_5PTN_1RP_guid_age 3, default, _default, _default> (mem 2.03975M, cpu 0) at node 127.0.0.1:9003 
      	Cannot move to 127.0.0.1:9002: ReplicaViolation (free mem 463.377M, free cpu 6)
      	Cannot move to 127.0.0.1:9001: ExcludeNodeViolation (free mem 778.731M, free cpu 6)
      --- Violations for index <TestReplicaRepairInMixedModeRebalance_5PTN_1RP_balance_name 2 (replica 1), default, _default, _default> (mem 1.74794M, cpu 0) at node 127.0.0.1:9003 
      	Cannot move to 127.0.0.1:9002: ReplicaViolation (free mem 463.377M, free cpu 6)
      	Cannot move to 127.0.0.1:9001: ExcludeNodeViolation (free mem 778.731M, free cpu 6)
      --- Violations for index <TestReplicaRepairInMixedModeRebalance_5PTN_1RP_balance_name 1 (replica 1), default, _default, _default> (mem 1.79035M, cpu 0) at node 127.0.0.1:9003 
      	Cannot move to 127.0.0.1:9002: ReplicaViolation (free mem 463.377M, free cpu 6)
      	Cannot move to 127.0.0.1:9001: ExcludeNodeViolation (free mem 778.731M, free cpu 6)
      --- Violations for index <TestReplicaRepairInMixedModeRebalance_5PTN_1RP_picture_gender 5, default, _default, _default> (mem 1.8401M, cpu 0) at node 127.0.0.1:9003 
      	Cannot move to 127.0.0.1:9002: ReplicaViolation (free mem 463.377M, free cpu 6)
      	Cannot move to 127.0.0.1:9001: ExcludeNodeViolation (free mem 778.731M, free cpu 6)
      2024-01-01T05:40:58.105+00:00 [Info] Cannot rebuild lost replica due to resource constraint in cluster.  Will not rebuild lost replica.
      

      New test -

      func TestReplicaRepairInMixedModeRebalance(t *testing.T) {
      	// t.Skipf("Unstable test")
      	skipShardAffinityTests(t)
       
      	resetCluster(t)
      	addNodeAndRebalance(clusterconfig.Nodes[3], "index", t)
      	clusterutility.SetDataAndIndexQuota(kvaddress, clusterconfig.Username, clusterconfig.Password, "1024", "1024")
      	// clusterutility.SetDataAndIndexQuota(kvaddress, clusterconfig.Username, clusterconfig.Password, "1024", SHARD_AFFINITY_INDEXER_QUOTA)
       
      	status := getClusterStatus()
      	if len(status) != 3 || !isNodeIndex(status, clusterconfig.Nodes[1]) ||
      		!isNodeIndex(status, clusterconfig.Nodes[3]) {
      		t.Fatalf("%v Unexpected cluster configuration: %v", t.Name(), status)
      	}
       
      	// config - [0: kv n1ql] [1: index]            [3: index]
      	printClusterConfig(t.Name(), "entry")
       
      	log.Println("*********Setup cluster*********")
      	err := secondaryindex.DropAllNonSystemIndexes(clusterconfig.Nodes[1])
      	tc.HandleError(err, "Failed to drop all non-system indices")
       
      	log.Printf("********Updating `indexer.settings.enable_shard_affinity`=true with node 3 in simulated mixed mode**********")
      	configChanges := map[string]interface{}{
      		// "indexer.settings.enable_shard_affinity":          true,
      		// "indexer.planner.honourNodesInDefn":               true,
      		// "indexer.thisNodeOnly.ignoreAlternateShardIds":    true,
      		"indexer.settings.rebalance.redistribute_indexes": true,
      	}
      	err = secondaryindex.ChangeMultipleIndexerSettings(configChanges, clusterconfig.Username, clusterconfig.Password, clusterconfig.Nodes[3])
      	tc.HandleError(err, fmt.Sprintf("Failed to change config %v", configChanges))
       
      	defer func() {
      		configChanges := map[string]interface{}{
      			"indexer.settings.enable_shard_affinity":          false,
      			"indexer.planner.honourNodesInDefn":               false,
      			"indexer.settings.rebalance.redistribute_indexes": false,
      		}
      		err := secondaryindex.ChangeMultipleIndexerSettings(configChanges, clusterconfig.Username, clusterconfig.Password, clusterconfig.Nodes[1])
      		tc.HandleError(err, fmt.Sprintf("Failed to change config %v", configChanges))
      	}()
       
      	log.Printf("********Create indices**********")
      	indices := []string{}
      	// create non-deferred partitioned indices
      	for field1 := 0; field1 < 6; field1++ {
      		fieldName1 := fieldNames[field1%len(fieldNames)]
      		fieldName2 := fieldNames[(field1+4)%len(fieldNames)]
      		indexName := t.Name() + "_5PTN_1RP_" + fieldName1 + "_" + fieldName2
      		n1qlStmt := fmt.Sprintf(
      			"create index %v on `%v`(%v, %v) partition by hash(Meta().id) with {\"num_partition\":5, \"num_replica\":1}",
      			indexName, BUCKET, fieldName1, fieldName2)
      		executeN1qlStmt(n1qlStmt, BUCKET, t.Name(), t)
      		indices = append(indices, indexName)
      	}
      	log.Printf("%v %v indices are now active.", t.Name(), indices)
       
      	performClusterStateValidation(t, true)
       
      	dropIndicesMap := make(map[string]int)
       
      	node1meta, err := getLocalMetaWithRetry(clusterconfig.Nodes[1])
      	tc.HandleError(err, "Failed to getLocalMetadata from node 1")
       
      	for _, defn := range node1meta.IndexTopologies[0].Definitions {
      		if len(dropIndicesMap) == 3 {
      			break
      		}
      		if _, exists := dropIndicesMap[defn.Name]; !exists {
      			// pick the replica ID of the first instance
      			dropIndicesMap[defn.Name] = int(defn.Instances[0].ReplicaId)
      		}
      	}
       
      	node3meta, err := getLocalMetaWithRetry(clusterconfig.Nodes[3])
      	tc.HandleError(err, "Failed to getLocalMetadata from node 3")
       
      	log.Printf("********Drop replicas on node 1 and 3**********")
       
      	for _, defn := range node3meta.IndexTopologies[0].Definitions {
      		if len(dropIndicesMap) == 6 {
      			break
      		}
      		if _, exists := dropIndicesMap[defn.Name]; !exists {
      			// pick the replica ID of the first instance
      			dropIndicesMap[defn.Name] = int(defn.Instances[0].ReplicaId)
      		}
      	}
       
      	for idxName, replicaId := range dropIndicesMap {
      		stmt := fmt.Sprintf("alter index %v on %v with {\"action\": \"drop_replica\", \"replicaId\": %v}",
      			idxName, BUCKET, replicaId)
      		executeN1qlStmt(stmt, BUCKET, t.Name(), t)
      		if waitForReplicaDrop(idxName, fmt.Sprintf("%v:%v:%v", BUCKET, "_default", "_default"), replicaId) ||
      			waitForReplicaDrop(idxName, BUCKET, replicaId) {
      			t.Fatalf("%v couldn't drop index %v replica %v", t.Name(), idxName, replicaId)
      		}
      	}
       
      	log.Printf("%v dropped the following index:replica %v", t.Name(), dropIndicesMap)
       
      	performClusterStateValidation(t, true)
       
      	log.Printf("********Swap Rebalance node 3 <=> 2**********")
       
      	swapRebalance(t, 2, 3)
      	indexStatus, err := getIndexStatusFromIndexer()
      	tc.HandleError(err, "idiot")
      	for _, i := range indexStatus.Status {
      		log.Printf("godlog - index %v - %v", i.Name, i.AlternateShardIds)
      	}
       
      	performClusterStateValidation(t, false)
      }
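
      If it helps, a rough sketch of the assertion I would eventually want at the end of the test (just before the final performClusterStateValidation call), assuming, though I have not verified this, that getIndexStatusFromIndexer reports one Status entry per index instance, replicas included:

      	// Sketch only. Assumption (not verified): each index instance, including
      	// every replica, appears as its own entry in indexStatus.Status, so a fully
      	// repaired index created with num_replica=1 contributes 2 entries.
      	instancesPerIndex := make(map[string]int)
      	for _, idx := range indexStatus.Status {
      		instancesPerIndex[idx.Name]++
      	}
      	for _, indexName := range indices {
      		if got := instancesPerIndex[indexName]; got != 2 {
      			t.Fatalf("%v expected 2 instances for index %v after replica repair, found %v",
      				t.Name(), indexName, got)
      		}
      	}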
      

      Logs - [^norepair.tar]

            People

              Assignee: Dhruvil Shah (dhruvil.ketanshah)
              Reporter: Amit Kulkarni (amit.kulkarni)
              Votes: 0
              Watchers: 2


                Gerrit Reviews

                  There is 1 open Gerrit change
