Couchbase Server / MB-46806

FTS: parallel partitions move feature is not working.


Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Not a Bug
    • None
    • None
    • fts

    Description

      Build 7.0.0-5247

      Create cluster: {kv}, {fts}, with an additional {fts} node kept aside

      Perform the following steps:

      • Create bucket 'default'
      • Load 500,000 docs into 'default'
      • Create an FTS index on 'default' with default settings and index partitions = 20

      {
       "name": "default_index_1",
       "type": "fulltext-index",
       "params": {
        "doc_config": {
         "docid_prefix_delim": "",
         "docid_regexp": "",
         "mode": "type_field",
         "type_field": "type"
        },
        "mapping": {
         "default_analyzer": "standard",
         "default_datetime_parser": "dateTimeOptional",
         "default_field": "_all",
         "default_mapping": {
          "dynamic": true,
          "enabled": true
         },
         "default_type": "_default",
         "docvalues_dynamic": true,
         "index_dynamic": true,
         "store_dynamic": false,
         "type_field": "_type"
        },
        "store": {
         "indexType": "scorch",
         "mossStoreOptions": {},
         "segmentVersion": 15
        }
       },
       "sourceType": "gocbcore",
       "sourceName": "default",
       "sourceUUID": "b794cde760b14301a27aff739b303acc",
       "sourceParams": {},
       "planParams": {
        "maxPartitionsPerPIndex": 52,
        "indexPartitions": 20,
        "numReplicas": 0
       },
       "uuid": "2a8d8156d6d1df16"
      }
      

      • Perform a swap rebalance of the fts nodes and measure the rebalance time
      • Update the cluster setting: maxConcurrentPartitionMovesPerNode = 6
      • Perform a swap rebalance of the fts nodes once again and measure the time
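The ticket does not say how the setting was updated. A minimal sketch, assuming it is applied through the FTS node's `/api/managerOptions` REST endpoint on port 8094 and that the value is passed as a string (both are assumptions, as are the host and the helper name); the request is only built here, not sent, and credentials are omitted:

```python
import json
import urllib.request


def build_manager_options_request(host, moves_per_node, port=8094):
    """Build (but do not send) a PUT request for the partition-move setting.

    The endpoint path, port, and string-valued payload are assumptions,
    not details taken from this ticket.
    """
    url = f"http://{host}:{port}/api/managerOptions"
    payload = {"maxConcurrentPartitionMovesPerNode": str(moves_per_node)}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_manager_options_request("127.0.0.1", 6)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending it would additionally need administrator credentials and a call such as `urllib.request.urlopen(req)` against a live node.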

      With the default maxConcurrentPartitionMovesPerNode value, the rebalance takes 220 seconds; with maxConcurrentPartitionMovesPerNode=6 it takes 414 seconds.
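To put the two timings side by side, a short calculation using only the numbers reported above (variable names are illustrative):

```python
default_secs = 220  # rebalance time with the default setting
tuned_secs = 414    # rebalance time with maxConcurrentPartitionMovesPerNode=6

extra_secs = tuned_secs - default_secs
slowdown = tuned_secs / default_secs

print(f"extra time: {extra_secs} s, slowdown: {slowdown:.2f}x")
```

So the run with six concurrent moves took 194 seconds longer, roughly 1.88 times the default run.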

       


        Activity

          Evgeny Makarenko, I would really encourage you to get used to the fts logs and to familiarise yourself with some basic patterns.

          If you wish, we may talk over this whenever our next sync up happens.

          If we glance into the logs, it is very evident that the app_herder is kicking in during the swap rebalance partition build-ups and hence the slowness is expected.

          2021-06-08T15:25:21.868-07:00 [INFO] app_herder: indexing over indexQuota: 760320000, memUsed: 977029274, preIndexingMemory: 2967202, indexes: 16, waiting: 14
          2021-06-08T15:25:21.901-07:00 [INFO] app_herder: indexing over indexQuota: 760320000, memUsed: 976774156, preIndexingMemory: 2712084, indexes: 16, waiting: 15
          2021-06-08T15:25:21.914-07:00 [INFO] app_herder: indexing over indexQuota: 760320000, memUsed: 976784042, preIndexingMemory: 2721970, indexes: 16, waiting: 16
          2021-06-08T15:25:22.294-07:00 [INFO] ctl/manager: revNum: 67, progress: 0.072125
          2021-06-08T15:25:22.294-07:00 [INFO] %%%                                       720d7fbe86843dfe657e77126aee96e8 d2c9fcbcd0c0315c6a59994785df1775
           %                  default_index_1_2a8d8156d6d1df16_04886d3f 0 + 7.2%                         1 .                             
           %                  default_index_1_2a8d8156d6d1df16_17739eb2   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_2135c587 0 .                              1 .                             
           %                  default_index_1_2a8d8156d6d1df16_2fbeac50   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_31482ff4 0 + 0.0%                         1 .                             
           %                  default_index_1_2a8d8156d6d1df16_353cc8ba   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_3661af3c 0 + 0.0%                         1 .                             
           %                  default_index_1_2a8d8156d6d1df16_441ec72f   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_4853bc7b 0 .                              1 .                             
           %                  default_index_1_2a8d8156d6d1df16_4d66dd25   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_55ba9666 0 .                              1 .                             
           %                  default_index_1_2a8d8156d6d1df16_6650a3fd   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_67624462 0 .                              1 .                             
           %                  default_index_1_2a8d8156d6d1df16_6a1493fd   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_7b8e3941 0 + 0.0%                         1 .                             
           %                  default_index_1_2a8d8156d6d1df16_8a53903c   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_999f1d7d 0 + 0.0%                         1 .                             
           %                  default_index_1_2a8d8156d6d1df16_e19205be   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_fd1a648f 0 + 0.0%                         1 .                             
           %                  default_index_1_2a8d8156d6d1df16_fec711fa   .                                .        

           

          `app_herder: indexing over indexQuota` shows the back-pressure the system applies to keep memory usage within the configured memory quota. Indexing as a whole is heavily slowed down by this. More than 5,000 occurrences of this message are seen in the logs.
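The pressure can be quantified from a single log line. A minimal sketch that parses the `app_herder` message format quoted above and reports how far `memUsed` is over `indexQuota`; the regular expression and helper names are illustrative, not part of FTS:

```python
import re

# Matches the app_herder message format quoted in the log excerpt above.
HERDER_RE = re.compile(
    r"app_herder: indexing over indexQuota: (?P<quota>\d+), "
    r"memUsed: (?P<used>\d+), preIndexingMemory: (?P<pre>\d+), "
    r"indexes: (?P<indexes>\d+), waiting: (?P<waiting>\d+)"
)


def parse_herder_line(line):
    """Return the numeric fields of an app_herder line, or None if no match."""
    m = HERDER_RE.search(line)
    if m is None:
        return None
    return {key: int(value) for key, value in m.groupdict().items()}


def overshoot_percent(entry):
    """Percentage by which memUsed exceeds indexQuota."""
    return 100.0 * (entry["used"] - entry["quota"]) / entry["quota"]


sample = (
    "2021-06-08T15:25:21.868-07:00 [INFO] app_herder: indexing over "
    "indexQuota: 760320000, memUsed: 977029274, preIndexingMemory: 2967202, "
    "indexes: 16, waiting: 14"
)
entry = parse_herder_line(sample)
print(entry["waiting"], round(overshoot_percent(entry), 1))
```

For the first quoted line this gives 14 waiting batches with memory use about 28.5% over the quota.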

          The partition progress charts above and below show that 6 partitions are being built concurrently.

          2021-06-08T15:27:44.293-07:00 [INFO] %%%                                       720d7fbe86843dfe657e77126aee96e8 d2c9fcbcd0c0315c6a59994785df1775
           %                  default_index_1_2a8d8156d6d1df16_04886d3f 0 + 91.3%                        1 .                             
           %                  default_index_1_2a8d8156d6d1df16_17739eb2   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_2135c587 0 .                              1 .                             
           %                  default_index_1_2a8d8156d6d1df16_2fbeac50   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_31482ff4 0 + 82.2%                        1 .                             
           %                  default_index_1_2a8d8156d6d1df16_353cc8ba   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_3661af3c 0 + 53.5%                        1 .                             
           %                  default_index_1_2a8d8156d6d1df16_441ec72f   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_4853bc7b 0 .                              1 .                             
           %                  default_index_1_2a8d8156d6d1df16_4d66dd25   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_55ba9666 0 .                              1 .                             
           %                  default_index_1_2a8d8156d6d1df16_6650a3fd   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_67624462 0 .                              1 .                             
           %                  default_index_1_2a8d8156d6d1df16_6a1493fd   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_7b8e3941 0 + 55.4%                        1 .                             
           %                  default_index_1_2a8d8156d6d1df16_8a53903c   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_999f1d7d 0 + 73.3%                        1 .                             
           %                  default_index_1_2a8d8156d6d1df16_e19205be   .                                .                             
           %                  default_index_1_2a8d8156d6d1df16_fd1a648f 0 + 78.5%                        1 .                             
           %                  default_index_1_2a8d8156d6d1df16_fec711fa   .                                .                             
          2021-06-08T15:27:44.293-07:00 [INFO] ctl/manager: revNum: 3283, progress: 0.723860 
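The concurrency can also be counted mechanically from such a chart. A sketch, assuming that a `+` followed by a percentage on a row marks a partition currently being built (an interpretation of the chart, not a documented format); the sample rows are a shortened subset of the chart above:

```python
import re

# A row looks like: " %   <pindex name> 0 + 91.3%   1 ."
ROW_RE = re.compile(
    r"^\s*%\s+(?P<pindex>\S+)\s.*?\+\s*(?P<pct>\d+(?:\.\d+)?)%"
)


def partitions_in_progress(chart_lines):
    """Map each partition that shows a '+ N%' marker to its progress."""
    progress = {}
    for line in chart_lines:
        m = ROW_RE.match(line)
        if m:
            progress[m.group("pindex")] = float(m.group("pct"))
    return progress


chart = [
    " %   default_index_1_2a8d8156d6d1df16_04886d3f 0 + 91.3%   1 .",
    " %   default_index_1_2a8d8156d6d1df16_17739eb2   .           .",
    " %   default_index_1_2a8d8156d6d1df16_2135c587 0 .         1 .",
    " %   default_index_1_2a8d8156d6d1df16_31482ff4 0 + 82.2%   1 .",
]
building = partitions_in_progress(chart)
print(len(building), sorted(building.values()))
```

Run over all 20 rows of either chart, the same count comes to 6, matching the configured maxConcurrentPartitionMovesPerNode.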

           

          So the parallel partition build is really happening, but to actually observe a faster rebalance we need:

          1. Enough RAM quota to accommodate the concurrent partition build-ups.
          2. In certain cases, an override of the DCP sharing parameter `maxDCPAgents` on the FTS nodes.

           

          Closing this ticket as this is not an issue.

          Comment added by Sreekanth Sivasankaran.

          People

            evgeny.makarenko Evgeny Makarenko (Inactive)
            evgeny.makarenko Evgeny Makarenko (Inactive)
            Votes: 0
            Watchers: 3
