  1. Couchbase Server
  2. MB-46671

[System Test] 1 query node running with very high CPU and memory


Details

    • Bug
    • Resolution: Not a Bug
    • Major
    • 7.0.0
    • Cheshire-Cat
    • query

    Description

      Build : 7.0.0-5247 (RC3)
      Test : -test tests/integration/cheshirecat/test_cheshirecat_kv_gsi_coll_xdcr_backup_sgw_fts_itemct_txns_eventing_cbas_scale3.yml -scope tests/integration/cheshirecat/scope_cheshirecat_with_backup.yml
      Scale : 3
      Iteration : 3rd

      In the longevity test run, one of the ALTER INDEX queries failed with error 500.

      Excerpt from the test console :

      [2021-06-01T20:21:38-07:00, sequoiatools/cbq:fd5335] -e=http://172.23.97.149:8093 -u=Administrator -p=password -script=ALTER INDEX `default`.default_claims WITH {"action":"replica_count","num_replica": 3}
       
      Error occurred on container - sequoiatools/cbq:[-e=http://172.23.97.149:8093 -u=Administrator -p=password -script=ALTER INDEX `default`.default_claims WITH {"action":"replica_count","num_replica": 3}]
       
      docker logs fd5335
      docker start fd5335
       
      l ERROR 100 : Unable to connect to http://172.23.97.149:8093/. Request failed with error code 500. 
      
      [2021-06-01T20:22:39-07:00, sequoiatools/cmd:d12523] 300
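The cbq shell output above reduces the failure to "ERROR 100 ... error code 500". A minimal sketch of re-issuing the same statement directly against the query service, so that the full HTTP status and response body can be captured, might look like the following. It assumes the standard `/query/service` REST endpoint on port 8093 and basic authentication; the helper names `build_query_request` and `run` are hypothetical and are not part of the test framework.

```python
import base64
import urllib.error
import urllib.parse
import urllib.request

# Statement copied from the failing test step.
STATEMENT = (
    'ALTER INDEX `default`.default_claims '
    'WITH {"action":"replica_count","num_replica": 3}'
)


def build_query_request(host, statement, user, password, port=8093):
    """Build (but do not send) a POST request for the query service.

    Assumption: the node serves N1QL over http://<host>:8093/query/service
    and accepts a form-encoded 'statement' parameter.
    """
    url = f"http://{host}:{port}/query/service"
    body = urllib.parse.urlencode({"statement": statement}).encode("utf-8")
    req = urllib.request.Request(url, data=body, method="POST")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    return req


def run(req, timeout=75):
    """Send the request and return (status, body), including error responses."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # A 500 arrives here; its body usually carries the service's own error text.
        return err.code, err.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    request = build_query_request(
        "172.23.97.149", STATEMENT, "Administrator", "password"
    )
    status, text = run(request)
    print(status)
    print(text)
```

Running this against the affected node while CPU and memory are high would show whether the 500 comes with an error payload from the query service itself.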
      

      Looking at the CPU and Memory consumption pattern for 172.23.97.149, it has been consistently high for quite some time.

      On the other query node, 172.23.97.150, memory usage has been fairly constant and not too high, while CPU usage has fluctuated, which can probably be attributed to the constant load.

      CPU, Memory, Heap profiles and goroutine dumps for both the nodes are attached (collected at 6/1 ~11 PM PST). query_dumps0601.zip
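For anyone wanting to collect comparable dumps at a different timestamp, here is a hedged sketch that lists the URLs to fetch. It assumes the query service exposes the standard Go `net/http/pprof` handlers under `/debug/pprof` on port 8093 and that administrator credentials are required to read them; the function name `pprof_urls` is hypothetical, and this is not necessarily how the attached archive was produced.

```python
# Query nodes named in this report.
QUERY_NODES = ["172.23.97.149", "172.23.97.150"]


def pprof_urls(host, port=8093, cpu_seconds=30):
    """Return profile name -> URL for one query node.

    Assumption: the paths follow Go's net/http/pprof conventions.
    """
    base = f"http://{host}:{port}/debug/pprof"
    return {
        "goroutine": f"{base}/goroutine?debug=2",         # full goroutine dump as text
        "heap": f"{base}/heap",                           # heap profile
        "cpu": f"{base}/profile?seconds={cpu_seconds}",   # CPU profile over N seconds
    }


if __name__ == "__main__":
    for node in QUERY_NODES:
        for name, url in pprof_urls(node).items():
            print(f"{node} {name}: {url}")
```

Each URL can then be fetched with any HTTP client and saved per node, which keeps the two nodes' profiles easy to compare side by side.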

      The attached logs were collected around 10:30 PM PST. Let me know if you need logs from an earlier timestamp.

      Logging this bug because we have not seen Error 500 for a query request in the longevity test for quite some time. High memory and CPU usage might be the cause, so it should be investigated.


          People

            mihir.kamdar Mihir Kamdar (Inactive)