Couchbase Server / MB-63219

Replica partition loss: existing partition skew in 7.2.5 is not resolved after upgrade to 7.2.6

Details

    • Bug
    • Resolution: User Error
    • Critical
    • 7.2.6
    • 7.2.6
    • fts
    • Enterprise Edition 7.2.5 build 7604
      Enterprise Edition 7.2.6 build 8105

    Description

      Create 3 indexes in 7.2.5 with 18, 3, and 3 partitions respectively (18:3:3).

      Update the partitions to 36:6:6. The partitions were updated in 7.2.5 solely to introduce partition skew; this avoided having to upgrade the cluster from 7.2.4 to 7.2.5 to introduce the skew.

      We see that the skew is present:

      Actives:
              172.23.121.140:8094 : 16
              172.23.122.68:8094 : 16
              172.23.120.78:8094 : 16
      Replicas:
              172.23.120.78:8094 : 14
              172.23.121.140:8094 : 14
              172.23.122.68:8094 : 14
      Actual number of index partitions in cluster: 90
      Expected number of index partitions in cluster: 90
      Indexes: 3
              default_index_1 :: maxPartitionsPerPIndex: 29, indexPartitions: 36, numReplicas: 1
              default_index_2 :: maxPartitionsPerPIndex: 171, indexPartitions: 6, numReplicas: 1
              default_index_3 :: maxPartitionsPerPIndex: 171, indexPartitions: 6, numReplicas: 0
      Index actives distribution:
              Index: default_index_1
                      172.23.121.140:8094 : 12
                      172.23.122.68:8094 : 12
                      172.23.120.78:8094 : 12
              Index: default_index_3
                      172.23.122.68:8094 : 2
                      172.23.121.140:8094 : 2
                      172.23.120.78:8094 : 2
              Index: default_index_2
                      172.23.122.68:8094 : 2
                      172.23.121.140:8094 : 2
                      172.23.120.78:8094 : 2
      Index replicas distribution:
              Index: default_index_1
                      172.23.120.78:8094 : 12
                      172.23.121.140:8094 : 12
                      172.23.122.68:8094 : 12
              Index: default_index_3
              Index: default_index_2
                      172.23.121.140:8094 : 2
                      172.23.120.78:8094 : 2
                      172.23.122.68:8094 : 2 
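The totals in the output above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming (the report does not confirm this) that the expected count is each index's indexPartitions multiplied by (1 + numReplicas), summed over all indexes. The function name `expected_partitions` is hypothetical; the numbers are copied from the output.

```python
# Index definitions copied from the output above: (indexPartitions, numReplicas).
indexes = {
    "default_index_1": (36, 1),
    "default_index_2": (6, 1),
    "default_index_3": (6, 0),
}


def expected_partitions(defs):
    """Assumed formula: one active copy per partition plus numReplicas replica copies."""
    return sum(parts * (1 + replicas) for parts, replicas in defs.values())


# Cluster-wide totals of active and replica partitions under the same assumption.
actives = sum(parts for parts, _ in indexes.values())
replicas = sum(parts * num_replicas for parts, num_replicas in indexes.values())

print(actives, replicas, expected_partitions(indexes))
```

Run as written, this prints 48 42 90, which agrees with the per-node totals above (3 x 16 actives and 3 x 14 replicas) and with the reported count of 90. Under this assumed formula, default_index_3 (numReplicas: 0) contributes no replica partitions to the total.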

      Start upgrade to 7.2.6

      After the upgrade, the skew isn't fixed and there is a replica partition loss for index default_index_3. It shouldn't matter where the skew in 7.2.5 comes from; 7.2.6 should eventually fix it, which didn't happen.

      Actives:
              172.23.120.78:8094 : 16
              172.23.121.140:8094 : 16
              172.23.122.68:8094 : 16
      Replicas:
              172.23.122.68:8094 : 14
              172.23.120.78:8094 : 14
              172.23.121.140:8094 : 14
      Actual number of index partitions in cluster: 90
      Expected number of index partitions in cluster: 90
      Indexes: 3
              default_index_2 :: maxPartitionsPerPIndex: 171, indexPartitions: 6, numReplicas: 1
              default_index_3 :: maxPartitionsPerPIndex: 171, indexPartitions: 6, numReplicas: 0
              default_index_1 :: maxPartitionsPerPIndex: 29, indexPartitions: 36, numReplicas: 1
      Index actives distribution:
              Index: default_index_2
                      172.23.121.140:8094 : 2
                      172.23.120.78:8094 : 2
                      172.23.122.68:8094 : 2
              Index: default_index_1
                      172.23.122.68:8094 : 12
                      172.23.120.78:8094 : 12
                      172.23.121.140:8094 : 12
              Index: default_index_3
                      172.23.122.68:8094 : 2
                      172.23.120.78:8094 : 2
                      172.23.121.140:8094 : 2
      Index replicas distribution:
              Index: default_index_1
                      172.23.120.78:8094 : 12
                      172.23.122.68:8094 : 12
                      172.23.121.140:8094 : 12
              Index: default_index_3
              Index: default_index_2
                      172.23.120.78:8094 : 2
                      172.23.121.140:8094 : 2
                      172.23.122.68:8094 : 2
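To compare the pre- and post-upgrade runs without reading the listings by eye, the replica section can be tallied per index. This is only a sketch: the parser assumes the text layout shown above ("Index: name" headers followed by "host:port : count" lines), and `replica_totals` is a hypothetical helper, not part of any Couchbase tool.

```python
from collections import defaultdict

# Replica section copied from the post-upgrade output above.
REPLICA_SECTION = """\
Index: default_index_1
        172.23.120.78:8094 : 12
        172.23.122.68:8094 : 12
        172.23.121.140:8094 : 12
Index: default_index_3
Index: default_index_2
        172.23.120.78:8094 : 2
        172.23.121.140:8094 : 2
        172.23.122.68:8094 : 2
"""


def replica_totals(section):
    """Sum the per-node counts listed under each 'Index:' header."""
    totals = defaultdict(int)
    current = None
    for raw in section.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Index:"):
            current = line.split(":", 1)[1].strip()
            totals[current] += 0  # keep indexes that list no nodes
        elif current is not None:
            # Node lines look like 'host:port : count'; split on the last ' : '.
            _, count = line.rsplit(" : ", 1)
            totals[current] += int(count)
    return dict(totals)


totals = replica_totals(REPLICA_SECTION)
for name, count in totals.items():
    print(f"{name}: {count} replica partitions")
```

Feeding the same function the pre-upgrade section gives a second dictionary that can be compared with `==` to see whether the replica placement totals changed across the upgrade.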
       

      The recovery rebalance paths were used during the upgrade:

      root@s72526-deb11:~# cat /opt/couchbase/var/lib/couchbase/logs/fts.log | grep 'recovery rebalance for index'
      2024-08-19T22:48:39.680-07:00 [INFO]   calcBegEndMaps: recovery rebalance for index: default_index_1
      2024-08-19T22:52:40.172-07:00 [INFO]   calcBegEndMaps: recovery rebalance for index: default_index_2
      2024-08-19T22:53:19.763-07:00 [INFO]   calcBegEndMaps: recovery rebalance for index: default_index_3 
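The same check can be scripted when several nodes' logs need to be inspected. A minimal sketch, assuming the log line layout shown above; the regular expression and the helper name `recovery_rebalances` are illustrative and not taken from the FTS code base.

```python
import re
from datetime import datetime

# Log lines copied from the grep output above.
LOG_LINES = """\
2024-08-19T22:48:39.680-07:00 [INFO]   calcBegEndMaps: recovery rebalance for index: default_index_1
2024-08-19T22:52:40.172-07:00 [INFO]   calcBegEndMaps: recovery rebalance for index: default_index_2
2024-08-19T22:53:19.763-07:00 [INFO]   calcBegEndMaps: recovery rebalance for index: default_index_3
"""

# Capture the leading timestamp and the trailing index name.
PATTERN = re.compile(
    r"^(\S+)\s+\[INFO\]\s+calcBegEndMaps: recovery rebalance for index: (\S+)"
)


def recovery_rebalances(text):
    """Return (timestamp, index_name) pairs for every matching log line."""
    events = []
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            events.append((datetime.fromisoformat(match.group(1)), match.group(2)))
    return events


events = recovery_rebalances(LOG_LINES)
for when, index_name in events:
    print(when.isoformat(), index_name)
```

In practice the text would be read from fts.log rather than from an inline string; lines that do not match the pattern are skipped.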

          People

            sarthak.dua Sarthak Dua