  1. Couchbase Server
  2. MB-61942

[System Test] FTS service exited with OOM error

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • 7.6.2
    • 7.6.2
    • fts
    • None
    • 7.6.2-3635
    • Untriaged
    • 0
    • Unknown

    Description

      The fts service exited with error code 137 while running the queries.
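For triage, it can help to decode the exit status. The sketch below assumes the reported status follows the common shell convention in which a value above 128 means the process was terminated by signal (status - 128); under that assumption, 137 corresponds to signal 9 (SIGKILL), which is what the Linux OOM killer sends. The helper name `describe_exit_status` is hypothetical and is not part of any Couchbase tooling.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status.

    Assumption: statuses above 128 encode "killed by signal (status - 128)".
    """
    if status > 128:
        signum = status - 128
        try:
            # Resolve the signal number to its symbolic name where the platform knows it.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited normally with code {status}"


print(describe_exit_status(137))
```

On Linux this reports SIGKILL, which is consistent with the OOM symptom in the title but does not by itself prove that the kernel OOM killer was responsible.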

       

      Service 'fts' exited with status 137. Restarting. Messages:
      2024-05-16T16:53:12.956-07:00 [INFO] app_herder: querying over queryQuota: 13804339200, estimated size: 48, runningQueryUsed: 9000, memUsed: 25646663600
      2024-05-16T16:53:12.956-07:00 [INFO] app_herder: querying over queryQuota: 13804339200, estimated size: 56, runningQueryUsed: 9000, memUsed: 25646663608
      2024-05-16T16:53:12.971-07:00 [INFO] app_herder: querying over queryQuota: 13804339200, estimated size: 88, runningQueryUsed: 9000, memUsed: 25646663640
      2024-05-16T16:53:12.972-07:00 [INFO] app_herder: querying over queryQuota: 13804339200, estimated size: 24, runningQueryUsed: 9000, memUsed: 25646663576
      2024-05-16T16:53:12.972-07:00 [INFO] app_herder: querying over queryQuota: 13804339200, estimated size: 16, runningQueryUsed: 9000, memUsed: 25646663568
      2024-05-16T16:53:12.973-07:00 [INFO] app_herder: querying over queryQuota: 13804339200, estimated size: 9000, runningQueryUsed: 9000, memUsed: 25646672552
      2024-05-16T16:53:12.973-07:00 [INFO] app_herder: indexing over indexQuota: 11216025600, memUsed: 26107934072, preIndexingMemory: 461270520, indexes: 212, waiting: 123
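To quantify how far over quota the service was, the `app_herder` messages can be parsed and `memUsed` compared against the reported quota. This is a minimal sketch based only on the message layout visible in the log above; the regular expression and the helper name `quota_overages` are assumptions for illustration, and the units of the values are not stated in the log.

```python
import re

# Two of the messages from the log above, copied verbatim.
LOG = """\
2024-05-16T16:53:12.973-07:00 [INFO] app_herder: querying over queryQuota: 13804339200, estimated size: 9000, runningQueryUsed: 9000, memUsed: 25646672552
2024-05-16T16:53:12.973-07:00 [INFO] app_herder: indexing over indexQuota: 11216025600, memUsed: 26107934072, preIndexingMemory: 461270520, indexes: 212, waiting: 123
"""

# Captures the activity, the quota name and value, and the memUsed value.
PATTERN = re.compile(
    r"app_herder: (querying|indexing) over (\w+Quota): (\d+),.*?memUsed: (\d+)"
)


def quota_overages(text):
    """Return (activity, quota_name, quota, mem_used, ratio) for each matching line."""
    rows = []
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match:
            activity, quota_name = match.group(1), match.group(2)
            quota, mem_used = int(match.group(3)), int(match.group(4))
            rows.append((activity, quota_name, quota, mem_used, mem_used / quota))
    return rows


for activity, quota_name, quota, mem_used, ratio in quota_overages(LOG):
    print(f"{activity}: memUsed={mem_used} {quota_name}={quota} ratio={ratio:.2f}")
```

For the two lines shown, `memUsed` comes out at about 1.86 times `queryQuota` and about 2.33 times `indexQuota`.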

       

      Steps followed to run the System test:

      1. Created an on-prem cluster with 12 nodes, 5 of which are FTS nodes.
      2. Each node has at least 12 GB of memory.
      3. Created one scope and one collection under that scope
      4. Loaded 5 million documents for normal vectors and 10k documents with xattrs using the sift dataset
      5. Created two indexes with 90 partitions across 5 FTS nodes. One index covers the normal vector data and the other covers the xattrs vectors.
      6. Ran KNN queries
      7. Mutated the documents and then ran the KNN queries again
      8. Rebalanced in an FTS node
      9. Performed mutations and then ran the KNN queries again
      10. Rebalanced out an FTS node

       

      The above steps run in a loop, and the index count increases with each iteration. The test ran for roughly 8-10 hours before the FTS service hit the OOM error. Logs for the cluster are attached.

       

       


            People

              koushal.sharma Koushal Sharma