Couchbase Server / MB-7552

CPU usage of beam.smp is too high during initial loading of system test for 2.0.1 build

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.0.1
    • Fix Version/s: 2.0.1
    • Component/s: ns_server
    • Security Level: Public
    • Labels:
      None
    • Environment:
      couchbase-server-community_x86_64_2.0.1-129-rel on CentOS

      Description

      15-node cluster; each node has an HDD, 8 GB of RAM, and a 4-core CPU. During initial loading, I drive 1k "set" ops per second to each node. Some of the nodes in the cluster consume all of the CPU resources:
      memcached (200%~250%), beam.smp (100%~150%). Only compaction is running; there are no views and no rebalance. It is just a key-value load.

      I am using couchbase-python-client to do the multi-set loading.
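
      For reference, the bulk-loading pattern looks roughly like this (a minimal sketch assuming the couchbase Python SDK's Couchbase.connect and set_multi bulk API; the host, bucket, key names, and pacing are illustrative, and the exact client version used here may differ):

      import time
      from couchbase import Couchbase

      # Connect to one node of the cluster (host and bucket are illustrative).
      cb = Couchbase.connect(host='10.3.2.115', bucket='default')

      i = 0
      while True:
          # One bulk multi-set of 1,000 items per second per loader,
          # matching the ~1k set ops/sec per node described above.
          batch = dict(('key-%d' % (i + n), {'n': i + n}) for n in range(1000))
          cb.set_multi(batch)
          i += 1000
          time.sleep(1)  # crude pacing to roughly 1k ops/sec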

      I have attached the log from the node that suffers from this issue.

      1. 01-16-10.3.2.115.diags.txt (8.72 MB, attached by Chisheng)

        Activity

        Chisheng Hong (Inactive) added a comment -

        Cannot reproduce it in the EC2 environment. After disabling memory ballooning on the previous cluster, which caused this issue, the CPU usage of beam.smp is back to normal.
        Farshid Ghods (Inactive) added a comment -

        The specific test has now been moved to EC2.

        Will reopen if this is observed in the EC2 environment.
        Farshid Ghods (Inactive) added a comment -

        Last update:
        We changed the VMware settings so that no VM is oversubscribed with CPU more than the others, and each VM uses only as much CPU as is available on the host.

        Chisheng, however, is still seeing the same behavior, and we will have to investigate this further in the system test. We will assign this ticket back to engineering if we learn more about the environment.
        Hide
        Aliaksey Artamonau Aliaksey Artamonau added a comment -

        I ran the measure-sched-delays program on one of the machines while it was experiencing high CPU load:

        1358816552.787212133 7704073
        1358816553.781080246 1571073
        1358816554.851312160 71804073
        1358816556.044983149 265474073
        1358816556.781110048 1602073
        1358816557.781501055 1993073

        So sometimes it does not get a CPU share for more than two seconds. This is most likely an indication of environment problems. I talked with Farshid, and he will be working with Chisheng to understand whether this is really a virtualization issue.
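
        For context, the measurement technique amounts to the following (a minimal sketch, assuming measure-sched-delays sleeps a fixed interval in a loop and prints a wall-clock timestamp plus the sleep overshoot in nanoseconds; the tool's actual implementation may differ):

        import time

        INTERVAL = 1.0  # seconds between samples

        while True:
            start = time.time()
            time.sleep(INTERVAL)
            now = time.time()
            # Overshoot beyond the requested interval, in nanoseconds; a large
            # value means the process was starved of CPU for that long.
            delay_ns = int((now - start - INTERVAL) * 1e9)
            print('%.9f %d' % (now, delay_ns))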

        Chisheng Hong (Inactive) added a comment -

        It is reproduced in build 2.0.1-139 with Erlang R15 on the same cluster with the same load.

        Aliaksey Artamonau added a comment -

        Per previous discussion, please try to reproduce with the latest build.


          People

          • Assignee:
            Chisheng Hong (Inactive)
            Reporter:
            Chisheng Hong (Inactive)
          • Votes:
            0
            Watchers:
            3

            Dates

            • Created:
              Updated:
              Resolved:

              Gerrit Reviews

              There are no open Gerrit changes