Couchbase Server / MB-44422

FTS: indexing of a fairly small number of docs is very slow


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Bug
    • Cheshire-Cat
    • 7.0.0
    • Component: fts
    • Triaged
    • Environment: Centos 64-bit
    • 1
    • Yes

    Description

      Build: 7.0.0-4454

      Indexing of 20,000 docs was not finished after 20 minutes.

      index definition:

      {
       "name": "default_index_1",
       "type": "fulltext-index",
       "params": {
        "doc_config": {
         "docid_prefix_delim": "",
         "docid_regexp": "",
         "mode": "type_field",
         "type_field": "type"
        },
        "mapping": {
         "default_analyzer": "standard",
         "default_datetime_parser": "dateTimeOptional",
         "default_field": "_all",
         "default_mapping": {
          "dynamic": true,
          "enabled": true
         },
         "default_type": "_default",
         "docvalues_dynamic": true,
         "index_dynamic": true,
         "store_dynamic": false,
         "type_field": "_type"
        },
        "store": {
         "indexType": "scorch",
         "mossStoreOptions": {},
         "segmentVersion": 15
        }
       },
       "sourceType": "gocbcore",
       "sourceName": "default",
       "sourceUUID": "8ea95aa4403e85942487013cfa93da67",
       "sourceParams": {},
       "planParams": {
        "maxPartitionsPerPIndex": 171,
        "numReplicas": 1,
        "indexPartitions": 6
       },
       "uuid": "5d694a19b13c4a75"
      }
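
      As a point of reference, an index with a definition like the one above is normally created through the FTS REST API on port 8094 (PUT /api/index/<indexName>). A minimal Python sketch, assuming the definition is saved to index_def.json and using placeholder host and credentials:

       # create_index.py - sketch only; host, credentials and file name are placeholders
       import json
       import requests

       with open("index_def.json") as f:
           index_def = json.load(f)

       # when creating a brand-new index, leave the "uuid" field out of the body
       index_def.pop("uuid", None)

       # PUT /api/index/<indexName> against the Search (FTS) service, port 8094
       resp = requests.put(
           "http://172.23.123.101:8094/api/index/default_index_1",
           auth=("Administrator", "password"),
           json=index_def,
       )
       print(resp.status_code, resp.text)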
      

      Logs are attached.

      A fragment of the test execution log:

      [2021-02-14 10:26:56,981] - [rest_client:2314] INFO - http://172.23.123.101:8091/pools/default/buckets/default/stats?zoom=minute
      [2021-02-14 10:26:57,661] - [fts_base:4421] INFO - Docs in bucket = 20000, docs in FTS index 'default_index_1': 15993
      [2021-02-14 10:27:03,922] - [rest_client:2314] INFO - http://172.23.123.101:8091/pools/default/buckets/default/stats?zoom=minute
      [2021-02-14 10:27:04,592] - [fts_base:4421] INFO - Docs in bucket = 20000, docs in FTS index 'default_index_1': 15993
      [2021-02-14 10:27:10,831] - [rest_client:2314] INFO - http://172.23.123.101:8091/pools/default/buckets/default/stats?zoom=minute
      [2021-02-14 10:27:11,512] - [fts_base:4421] INFO - Docs in bucket = 20000, docs in FTS index 'default_index_1': 15993
      [2021-02-14 10:27:11,513] - [fts_base:4469] INFO - FTS index count not matching bucket count even after 20 tries: Docs in bucket = 20000, docs in FTS index 'default_index_1': 15993
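
      The check the test performs amounts to polling the FTS index's document count until it matches the number of docs in the bucket. A rough approximation in Python, with placeholder host, credentials and expected count:

       # poll_index_count.py - sketch only; host, credentials and expected count are placeholders
       import time
       import requests

       EXPECTED = 20000  # docs loaded into the bucket
       URL = "http://172.23.123.101:8094/api/index/default_index_1/count"

       for attempt in range(20):
           # GET /api/index/<indexName>/count returns {"status": "ok", "count": <n>}
           count = requests.get(URL, auth=("Administrator", "password")).json()["count"]
           print(f"try {attempt + 1}: docs in FTS index = {count}")
           if count >= EXPECTED:
               break
           time.sleep(6)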
      

       

      Attachments

        1. test_4.zip
          76.56 MB
        2. test_4-1.zip
          44.05 MB

        Activity

          Sreekanth Sivasankaran added a comment:

          Charles Dixon, the cluster is still live. So you may explore in case that helps you.

          Abhinav Dangeti added a comment:

          There is certainly something very weird going on here.

          The last logs on the node .56:

          2021-04-08T02:14:39.640-07:00 [INFO] app_herder: persister progress, indexes: 1, waiting: 2
          2021-04-08T02:14:39.640-07:00 [INFO] app_herder: indexing proceeding, indexes: 1, waiting: 1, usage: 736968952
          2021-04-08T02:14:39.641-07:00 [INFO] app_herder: indexing proceeding, indexes: 1, waiting: 0, usage: 736968952
          2021-04-08T02:14:42.777-07:00 [WARN] (GOCBCORE) Pipeline Client 0xc0168cc2a0 failed to bootstrap: dial tcp 172.23.107.35:11210: connect: connection refused -- cbgt.GocbcoreLogger.Log() at gocbcore_utils.go:615
          2021-04-08T02:14:42.777-07:00 [WARN] (GOCBCORE) Pipeline Client 0xc0168cd3e0 failed to bootstrap: dial tcp 172.23.107.35:11210: connect: connection refused -- cbgt.GocbcoreLogger.Log() at gocbcore_utils.go:615

          Looking at the UI logs, the bucket was created at:

          Bucket "default" loaded on node 'ns_1@172.23.107.35' in 0 seconds.      ns_1@172.23.107.35        2:14:45 AM   8 Apr, 2021

          So Evgeny Makarenko, would you be able to give us the test.log, with a list of events showing how you reproduced this situation?

          Abhinav Dangeti added a comment:

          I just deleted the index and recreated it to see what's going on.

          With a single partition, ingest is very slow - data from 5 connections is funneled into 1 partition - because the throttler slows ingest down quite a bit.

          When I updated the index to 5 partitions (1 partition per node), ingest was a lot faster - the throttler was in play a bit here as well.
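
          Repartitioning like that only changes the planParams section of the index definition ("indexPartitions"), and the updated definition is pushed back through the same PUT /api/index/<name> endpoint. A rough Python sketch, with placeholder host and credentials, assuming the current definition is fetched first:

           # repartition_index.py - sketch only; host and credentials are placeholders
           import requests

           AUTH = ("Administrator", "password")
           BASE = "http://172.23.123.101:8094/api/index/default_index_1"

           # GET /api/index/<name> returns the current definition under "indexDef"
           index_def = requests.get(BASE, auth=AUTH).json()["indexDef"]

           # spread the index over 5 partitions, e.g. one per FTS node
           index_def["planParams"]["indexPartitions"] = 5

           # keep the existing "uuid" in the body so the PUT is treated as an update
           # of this index rather than an attempt to create a new one
           resp = requests.put(BASE, auth=AUTH, json=index_def)
           print(resp.status_code, resp.text)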
          Abhinav Dangeti added a comment (edited):

          Evgeny Makarenko I see that this is also a community edition of the server you're using. Could you try the exact same scenario on enterprise edition and let me know of the behavior you see?

          Nevermind that.

          So gocbcore@9.1.3+ creates a reusable buffer for every connection - there's a bit of allocation done at the start. With 5 connections that pushed memory consumption up, and FTS' memory usage went over the quota as soon as it began ingesting items, which caused the very slow ingest.

          I just raised the FTS quota of your node to 1024MB, and restarted the same index - it completed very quickly.

          So this comes back down to sizing - for this test, I'd recommend either:

          • raising the FTS quota if you want to create an index with a single partition (see the sketch after this list), or
          • partitioning your index so it is distributed across all the nodes in the cluster.
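
          For the first option, the FTS quota is the cluster-level ftsMemoryQuota setting (in MB) on the admin port 8091. A minimal sketch, with placeholder host and credentials:

           # raise_fts_quota.py - sketch only; host, credentials and the quota value are placeholders
           import requests

           # POST /pools/default with ftsMemoryQuota=<MB> raises the Search service memory quota
           resp = requests.post(
               "http://172.23.123.101:8091/pools/default",
               auth=("Administrator", "password"),
               data={"ftsMemoryQuota": "1024"},
           )
           print(resp.status_code, resp.text)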

          And as for why we didn't see logs from your initial test - I've no idea.


          Mihir Kamdar added a comment:

          Bulk closing non-fixed issues. Please reopen if necessary.

          People

            Assignee: Evgeny Makarenko (Inactive)
            Reporter: Evgeny Makarenko (Inactive)
            Votes: 0
            Watchers: 6

