MB-37579: [BP 6.0.x] Number of scorch index files grows suspiciously high


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 6.0.4
    • Affects Version/s: 6.0.1
    • Component/s: fts
    • Triage: Untriaged
    • Is this a Regression?: No

    Description

      For tracking the issue observed with CBSE-7175.

      • There are 7 FTS indexes; only 3 of them are of the scorch type, while the remaining 4 are of the older upside_down/moss index type.
      • Even though the bucket contains only about 2.5M documents, it looks like mutations are happening all the time.
      • The FTS service is provisioned with a meagre 2GB of RAM for the ~7 partitions residing on every node.
      • At a quick glance, every index has a rich definition - most of them index 5-10 fields.
      • From the slow queries logged, they always perform field-scoped queries with no custom field sorting on the results, yet the `include_all` and `store` options are enabled for the field mappings in the index definitions, which is unnecessary (see the sketch after this list).
      • As Steve Yen already pointed out, a conjunction query with 275 sub-queries doesn't look like the best/right way to perform a search, and it can result in slow queries given the heavy DGM (2GB RAM) situation here.
      • There are many 429 query-reject errors happening due to "lack of resources/out of memory".
      • The `id` field is indexed as text, as per the index definition.
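
      Since the logged queries are always field-scoped and never fetch stored fields or sort on them, the `store` and `include_all` options only inflate the index. A minimal sketch of what a leaner mapping and a field-scoped search look like with the bleve API that backs FTS (the `name` field, the `example.bleve` path, and the query term are illustrative, not taken from the customer's index definitions):

```go
package main

import (
	"fmt"
	"log"

	"github.com/blevesearch/bleve"
)

func main() {
	// Text field that is searchable but neither stored nor folded into
	// the composite _all field -- enough for field-scoped queries that
	// never fetch stored fields or sort on them.
	nameField := bleve.NewTextFieldMapping()
	nameField.Store = false
	nameField.IncludeInAll = false

	docMapping := bleve.NewDocumentMapping()
	docMapping.AddFieldMappingsAt("name", nameField) // illustrative field

	indexMapping := bleve.NewIndexMapping()
	indexMapping.DefaultMapping = docMapping

	// Scorch index type, matching the affected indexes.
	idx, err := bleve.NewUsing("example.bleve", indexMapping, "scorch", "scorch", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer idx.Close()

	// A field-scoped query, the pattern seen in the slow-query log.
	q := bleve.NewMatchQuery("smith")
	q.SetField("name")
	res, err := idx.Search(bleve.NewSearchRequest(q))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(res.Total)
}
```

      With `Store` and `IncludeInAll` turned off, the field-scoped matches still work; only stored-field retrieval and unscoped `_all`-field queries would stop working, and these queries do not use either.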

       

      Another wrinkle noted is that the number of files on disk keeps growing for the scorch indexes. The customer might soon hit a file-handle limit if this trend continues.
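
      A rough way to cross-check that stat against what is actually on disk is to count the files under each partition (pindex) directory on the node. A minimal sketch, assuming the default FTS data layout under `@fts` (the directory path and the `.pindex` suffix check are assumptions about this deployment):

```go
package main

import (
	"fmt"
	"io/fs"
	"log"
	"path/filepath"
	"strings"
)

func main() {
	// Assumed default FTS data directory -- adjust to the node's actual data path.
	ftsDir := "/opt/couchbase/var/lib/couchbase/data/@fts"

	counts := map[string]int{} // pindex directory -> number of files under it
	err := filepath.WalkDir(ftsDir, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil || d.IsDir() {
			return walkErr
		}
		// Attribute every regular file to its enclosing *.pindex directory.
		for dir := filepath.Dir(path); dir != ftsDir; dir = filepath.Dir(dir) {
			if strings.HasSuffix(dir, ".pindex") {
				counts[filepath.Base(dir)]++
				break
			}
			if dir == filepath.Dir(dir) { // reached the filesystem root
				break
			}
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
	for pindex, n := range counts {
		fmt.Printf("%s: %d files\n", pindex, n)
	}
}
```

      The per-index stats captured below show the same trend.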

      "permanent:**:num_files_on_disk": 68152, 
      "permanent:**:num_root_filesegments": 21, 
      "permanent:**:num_root_memorysegments": 82, 
      "permanent:**:total_queries": 2, 
      "permanent:**:total_queries_error": 0, 
      "permanent:**:total_queries_slow": 0, 
      "permanent:**:total_queries_timeout": 0, 
      "permanent:**:total_request_time": 84485245, 
      "permanent:**:total_term_searchers": 6,
       
       
      CurRootEpoch : 15142197 
      LastMergedEpoch : 15141839 
      LastPersistedEpoch : 15141839 
      term_searchers_finished : 6 
      term_searchers_started : 6
       
       
      "permanent:**:num_files_on_disk": 66390, 
      "permanent:**:num_root_filesegments": 19, 
      "permanent:**:num_root_memorysegments": 2, 
      "permanent:**:total_queries": 83730, 
      "permanent:**:total_queries_error": 346, 
      "permanent:**:total_queries_slow": 1, 
      "permanent:**:total_queries_timeout": 0, 
      "permanent:**:total_request_time": 878390507089, 
      "permanent:**:total_term_searchers": 903179
       
       
      CurRootEpoch : 15135855 
      LastMergedEpoch : 15135849 
      LastPersistedEpoch : 15135846 
      term_searchers_finished : 903179 
      term_searchers_started : 903179
       
       
      "permanent:**:num_files_on_disk": 55091, 
      "permanent:**:num_root_filesegments": 16, 
      "permanent:**:num_root_memorysegments": 75, 
      "permanent:**:total_queries": 2068, 
      "permanent:**:total_queries_error": 13, 
      "permanent:**:total_queries_slow": 0, 
      "permanent:**:total_queries_timeout": 0, 
      "permanent:**:total_request_time": 46350320854, 
      "permanent:**:total_term_searchers": 407159
       
       
      CurRootEpoch : 14705448 
      LastMergedEpoch : 14705004 
      LastPersistedEpoch : 14705004 
      term_searchers_finished : 407159 
      term_searchers_started : 407159 
      

      The number of file segments at root is always below 25 here, and it doesn't look like it ever touches 30 for any of the scorch indexes. Yet the total number of files on disk keeps mounting slowly and has reached this massive figure.

      - Ongoing searches don't look like a problem at this point; the term-searcher started/finished counts match.

      - The merger and persister are working nearly in step with each other in all the indexes, but they lag a little behind the current root epoch.

      So this needs to be analysed further; there is no explanation yet for this growth in the number of files.
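
      Until the root cause is understood, the symptom itself can be watched from the node stats: files on disk climbing while the root file-segment count stays flat and the searcher counters stay balanced. A minimal sketch of such a check, assuming the node exposes the flat stats map quoted above at `/api/nsstats` on the FTS port (the endpoint, port, and the 100x threshold are assumptions for illustration):

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"strings"
)

func main() {
	// Assumed FTS stats endpoint on the node (port 8094); authentication
	// is omitted here and would be needed on a real cluster.
	resp, err := http.Get("http://127.0.0.1:8094/api/nsstats")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var stats map[string]interface{}
	if err := json.NewDecoder(resp.Body).Decode(&stats); err != nil {
		log.Fatal(err)
	}

	// Flag indexes where the on-disk file count dwarfs the number of file
	// segments in the current root -- the pattern reported above.
	for key, val := range stats {
		if !strings.HasSuffix(key, ":num_files_on_disk") {
			continue
		}
		onDisk, _ := val.(float64)
		prefix := strings.TrimSuffix(key, "num_files_on_disk")
		rootSegs, _ := stats[prefix+"num_root_filesegments"].(float64)
		if rootSegs > 0 && onDisk > 100*rootSegs {
			fmt.Printf("%snum_files_on_disk=%.0f vs num_root_filesegments=%.0f\n",
				prefix, onDisk, rootSegs)
		}
	}
}
```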

       

      People

        Sreekanth Sivasankaran (Inactive)
        Abhi Dangeti