  1. Couchbase Server
  2. MB-45948

Uneven CPU usage across index nodes after creating indexes


Details

    Description

      Summary
      Not sure if there is an issue here, but index nodes show very different CPU usage after creating 3000 GSI indexes (each with 3 replicas): some are at 98% CPU, while a couple are at 10%.

      Steps to Reproduce:
      1. Create a cluster with 2 kv, 1 n1ql, and 10 index nodes.
      2. Create a bucket with a total of 1000 collections. Flush all items.
      3. Create 3K indexes (3 indexes per collection), each with replicas=2 and defer_build=true, and build them on their respective collections.
      4. Add 10 items into each collection.
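
      Step 3 can be sketched as a script that only generates the N1QL statements. This is a minimal sketch, not the test's actual code: the bucket, scope, collection, index, and field names are hypothetical, and it assumes "replicas=2" corresponds to the `num_replica` option of CREATE INDEX.

```python
import json

# Hypothetical names: the report does not give the bucket, scope,
# collection, index, or field names used by the test.
BUCKET = "bucket1"
SCOPE = "scope1"
FIELDS = ["field_a", "field_b", "field_c"]  # one indexed field per index

NUM_COLLECTIONS = 1000   # from step 2
NUM_REPLICA = 2          # assumed mapping of "replicas=2" in step 3


def index_statements(collection):
    """Return (CREATE INDEX statements, BUILD INDEX statement) for one collection."""
    keyspace = f"`{BUCKET}`.`{SCOPE}`.`{collection}`"
    with_clause = json.dumps({"defer_build": True, "num_replica": NUM_REPLICA})
    names = [f"idx_{collection}_{i}" for i in range(len(FIELDS))]
    creates = [
        f"CREATE INDEX `{name}` ON {keyspace}(`{field}`) WITH {with_clause}"
        for name, field in zip(names, FIELDS)
    ]
    # Deferred indexes of one collection are built together in one statement.
    quoted = ", ".join(f"`{name}`" for name in names)
    build = f"BUILD INDEX ON {keyspace}({quoted})"
    return creates, build


def all_statements():
    """Generate every statement for step 3, collection by collection."""
    statements = []
    for c in range(NUM_COLLECTIONS):
        creates, build = index_statements(f"coll_{c}")
        statements.extend(creates)
        statements.append(build)
    return statements


if __name__ == "__main__":
    stmts = all_statements()
    print(stmts[0])
    print(stmts[3])
    print(len(stmts), "statements in total")
```

      Under these assumptions the script emits 3000 CREATE INDEX statements and 1000 BUILD INDEX statements. With num_replica=2, each index definition has 3 copies, so the 10 index nodes would host 9000 index instances in total.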

      Observations

      Everything works fine. RAM usage is high on most of the index nodes, which I think is expected given that the VMs have 24 GB RAM and 8 cores. But CPU usage varies a lot across index nodes after step 4. For example, 172.23.107.45 is at 10% CPU while 172.23.121.78 is maxed out at 99%. Filing this in case there is an issue here.

      Note that the same test on bigger nodes (256 GiB RAM per node, 72*2 threads per node) did not hit this uneven CPU usage.

      Screenshots, logs, consoleText attached.
      (As seen in node_78_cpu.png, CPU usage spikes at the end of step 4.
      From servers.png, nodes .74, .54, and .45 all have low CPU usage.)

      top on .78 node

         PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                           
       48471 couchba+  20   0   25.2g  20.7g  10092 S 782.7 88.6   1948:45 indexer                                                                           
       47560 couchba+  20   0 4944088 382044   3124 S   2.3  1.6 288:56.30 beam.smp                                                                          
       48314 couchba+  20   0 1399740 282216   5320 S   1.7  1.2  27:31.52 prometheus 
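
      The indexer line above can be related to the per-node CPU figures quoted earlier. This is a rough sketch that assumes the node has the 8 cores mentioned under Observations and that top is in its default mode, where %CPU is relative to a single core and can therefore exceed 100%. The helper name is hypothetical.

```python
def per_core_cpu(top_line, cores):
    """Normalize the %CPU column of a top process line by the core count."""
    fields = top_line.split()
    cpu_percent = float(fields[8])  # %CPU is the 9th column in top's default layout
    return cpu_percent / cores


# Indexer line copied from the top output on the .78 node.
INDEXER_LINE = "48471 couchba+  20   0   25.2g  20.7g  10092 S 782.7 88.6   1948:45 indexer"

if __name__ == "__main__":
    # 8 cores is taken from the Observations section; adjust for other hardware.
    print(f"indexer uses about {per_core_cpu(INDEXER_LINE, 8):.1f}% of each core")
```

      Under those assumptions the indexer alone accounts for nearly all of the CPU on the .78 node.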

      Attachments

        1. 121_78.indexer_cprof.svg
          143 kB
        2. consoleText.txt
          190 kB
        3. new_107.44_indexer_cprof.svg
          141 kB
        4. new_107.58_indexer_cprof.svg
          146 kB
        5. node_78_cpu.png
          208 kB
        6. servers.png
          310 kB

        Activity

          ritam.sharma Ritam Sharma added a comment - Current testing is blocked at the first run. https://issues.couchbase.com/secure/attachment/138025/Screenshot%202021-04-29%20at%209.23.45%20PM.png https://docs.google.com/spreadsheets/d/1Tx71SPJwC8lfRZTSyAKJcQERkinwuc0UUSRdK0RmvuY/edit#gid=1278811408 Lines 15-20. The plan is to get to 10K indexes.
          jliang John Liang added a comment -

          Ritam Sharma If you are blocked re-running the test on 5108, can you provide logs or file a new MB? The issue may be something else.

          jliang John Liang added a comment -

          I have merged all my fixes to master. I am resolving this for now. Feel free to re-open it or file a new MB.


          build-team Couchbase Build Team added a comment -

          Build couchbase-server-7.0.0-5116 contains plasma commit eb4abb7 with commit message:
          MB-45948: Reduce frequency in calling jemalloc mm_free2os

          sumedh.basarkod Sumedh Basarkod (Inactive) added a comment -

          Verified on 7.0.0-5157. Closing.

          People

            sumedh.basarkod Sumedh Basarkod (Inactive)
            sumedh.basarkod Sumedh Basarkod (Inactive)
            Votes: 0
            Watchers: 7
