  1. Couchbase Server
  2. MB-32645

[high-bucket] High CPU utilisation during kv rebalance

Details

    Description

      Build 6.0.0-1693

      As discussed in the high bucket density sync-up meeting, logging this issue for investigation.
      Observed CPU utilisation spiking up to 80% on the 24-core orchestrator machine during a KV rebalance with 30 buckets present in the cluster.

      CPU utilisation graph during rebalance-

      cbmonitor link- http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=arke_basic_600-1693_run_kv_rebalance_dd46

      Logs: 

      KV node- https://s3.amazonaws.com/bugdb/jira/index_reb_multibucket/collectinfo-2019-01-08T151840-ns_1%40172.23.97.12.zip
      KV node- https://s3.amazonaws.com/bugdb/jira/index_reb_multibucket/collectinfo-2019-01-08T151840-ns_1%40172.23.97.13.zip
      KV node- https://s3.amazonaws.com/bugdb/jira/index_reb_multibucket/collectinfo-2019-01-08T151840-ns_1%40172.23.97.14.zip

      Attachments

        1. 1_bucket_ns_server.png (496 kB)
        2. 1_bucket.png (667 kB)
        3. 10_bucket_ns_server.png (472 kB)
        4. 10_bucket.png (411 kB)
        5. 30_buckets_kv_cpu_util.png (51 kB)
        6. 30_buckets_ns_server.png (559 kB)
        7. 30_buckets.png (592 kB)
        8. 5_buckets_ns_server.png (610 kB)
        9. 5_buckets.png (509 kB)
        10. 6.6.2_CPU.png (497 kB)
        11. 7.0.0_CPU.png (282 kB)
        12. 8cores_24cores.png (411 kB)
        13. eventing.png (532 kB)
        14. fts.png (372 kB)
        15. image-2019-01-15-12-11-15-521.png (640 kB)
        16. index_query_15.png (465 kB)
        17. index_query_19.png (655 kB)
        18. index_query_20.png (554 kB)
        19. kv_12.png (588 kB)
        20. kv_13.png (621 kB)
        21. kv_14.png (577 kB)
        22. new_30_bucket_ns_server.png (610 kB)
        23. new_30_bucket.png (640 kB)
        24. oc_and_cbas.png (463 kB)
        25. orchestrator_vs_fts_only_node.png (611 kB)
        26. Screen Shot 2019-01-15 at 09.23.26.png (90 kB)

        Activity

          shivani.gupta Shivani Gupta added a comment - edited

          Bo-Chun Wang, pinging you again on this. Is this something you can do?

          There is one more test we would like to have results for. Can you please run the same 30 bucket test on 7.0 with the CPU for Data Service nodes limited to 8 cores? I believe the data service nodes are 24 core machines, but can you limit them to 8 cores only? Don't change anything on the Index/Query nodes. Thanks much for running these tests.

          bo-chun.wang Bo-Chun Wang added a comment -

          Shivani Gupta

          It's possible. However, to limit CPU for the data service nodes without touching the other nodes, I have to apply some settings manually. We are running the 6.6.2 and 7.0 weekly runs right now, so the cluster is busy. I will do it later this week, after we finish the weekly runs.

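The thread does not say which manual settings were used to cap the data service nodes at 8 cores. As a rough sketch of one possibility, Linux lets you take logical CPUs offline by writing `0` to `/sys/devices/system/cpu/cpuN/online`. The helper names below (`cpus_to_offline`, `offline_commands`) are hypothetical, and the choice to keep the lowest-numbered CPUs online is an assumption for illustration only.

```python
from typing import List

# Standard Linux sysfs location for per-CPU hotplug controls.
SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def cpus_to_offline(total_cores: int, keep_cores: int) -> List[int]:
    """Return the logical CPU ids to take offline so only keep_cores stay online.

    Assumption: CPUs 0..keep_cores-1 are kept online (cpu0 often cannot be offlined).
    """
    if not 1 <= keep_cores <= total_cores:
        raise ValueError("keep_cores must be between 1 and total_cores")
    return list(range(keep_cores, total_cores))


def offline_commands(total_cores: int, keep_cores: int) -> List[str]:
    """Build (but do not run) the shell commands that would offline the extra CPUs."""
    return [
        f"echo 0 > {SYSFS_CPU_ROOT}/cpu{cpu}/online"
        for cpu in cpus_to_offline(total_cores, keep_cores)
    ]


if __name__ == "__main__":
    # 24-core data service node limited to 8 cores, as requested in the thread.
    for command in offline_commands(24, 8):
        print(command)
```

The sketch only prints the commands; running them requires root on each data service node, and the Index/Query nodes would be left untouched.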

          shivani.gupta Shivani Gupta added a comment -

          Thanks Bo-Chun Wang.

          bo-chun.wang Bo-Chun Wang added a comment -

          Shivani Gupta

          I have finished a 30-bucket run with 7.0.0-4678. The number of CPU cores on data service nodes is limited to 8 cores.

          Job: http://perf.jenkins.couchbase.com/job/themis_multibucket/83/

          Log:

          https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-themis_multibucket-83/172.23.96.15.zip
          https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-themis_multibucket-83/172.23.96.19.zip
          https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-themis_multibucket-83/172.23.96.20.zip
          https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-themis_multibucket-83/172.23.96.23.zip
          https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-themis_multibucket-83/172.23.97.177.zip
          https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-themis_multibucket-83/172.23.99.157.zip
          https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-themis_multibucket-83/172.23.99.158.zip
          https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-themis_multibucket-83/172.23.99.159.zip
          https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-themis_multibucket-83/172.23.99.160.zip
          https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-themis_multibucket-83/172.23.99.161.zip

          cbmonitor link: http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=source_cluster_700-4678_run_kv_rebalance_1df8

          Comparison between 8 cores and 24 cores:

          http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=source_cluster_700-4678_run_kv_rebalance_455d&label=24cores&snapshot=source_cluster_700-4678_run_kv_rebalance_1df8&label=8cores
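The 8-core versus 24-core comparison link appears to be the single-snapshot report URL with repeated `snapshot` and `label` query parameters, one pair per run. A minimal sketch of assembling such a link follows. It assumes the report page accepts the pairs in this order, which is inferred only from the link in this thread, and `comparison_url` is a hypothetical helper, not part of cbmonitor.

```python
from typing import Iterable, Tuple
from urllib.parse import urlencode

# Report endpoint as it appears in the links posted in this issue.
CBMONITOR_REPORT_URL = "http://cbmonitor.sc.couchbase.com/reports/html/"


def comparison_url(runs: Iterable[Tuple[str, str]],
                   base: str = CBMONITOR_REPORT_URL) -> str:
    """Build a report URL from (snapshot, label) pairs, preserving their order."""
    params = []
    for snapshot, label in runs:
        params.append(("snapshot", snapshot))
        params.append(("label", label))
    # urlencode keeps repeated keys when given a list of tuples.
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(comparison_url([
        ("source_cluster_700-4678_run_kv_rebalance_455d", "24cores"),
        ("source_cluster_700-4678_run_kv_rebalance_1df8", "8cores"),
    ]))
```

Passing the two snapshot names from this run reproduces the comparison link posted above.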

          shivani.gupta Shivani Gupta added a comment -

          Thanks Bo-Chun Wang, this is very helpful.

          People

            bo-chun.wang Bo-Chun Wang
            mahesh.mandhare Mahesh Mandhare (Inactive)