  Couchbase Server / MB-7972

Performance dip during rebalance on bucket not involved in rebalance

    Details

    • Sprint:
      PCI Team - Sprint 3, PCI Team - Sprint 4, PCI Team - Sprint 5, PCI Team - Sprint 6, CB Bucket - 08/12 - 08/29, 12/Aug - 30/Aug

      Description

      Environment:
      -memcachetest running through client-side Moxi.
      -memcachetest workload: /root/mctest/Users/ingenthr/tmp/memcachetest-centos_x86 -h localhost -i 1000 -t 4 -K test -M 10 -F -S -l

      -Moxi config: CLI: /opt/moxi/bin/moxi -u nobody -Z usr=Administrator,pwd=password http://10.197.1.147:8091/pools/default/bucketsStreaming/beer-sample -vv -O /var/log/moxi,
      config:
      port_listen=11211,
      default_bucket_name=default,
      downstream_max=1024,
      downstream_conn_max=4,
      downstream_conn_queue_timeout=200,
      downstream_timeout=5000,
      wait_queue_timeout=200,
      connect_max_errors=5,
      connect_retry_interval=30000,
      connect_timeout=400,
      auth_timeout=100,
      cycle=200

      -4 nodes of Couchbase Server 2.0.1. c1.medium EC2 instances, beer-sample database bucket
      -2 nodes of same config added to cluster and rebalanced
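
The Moxi settings listed above are a comma-separated run of key=value pairs. As a minimal sketch for anyone comparing configurations between test runs, the snippet below parses that string into a dictionary and pulls out the timeout-related keys. The helper name `parse_moxi_config` is hypothetical and not part of Moxi; the values are copied from the description.

```python
# Sketch only: parse_moxi_config is a hypothetical helper, not a Moxi API.
# The settings are copied verbatim from the ticket description.
MOXI_CONFIG = """
port_listen=11211,
default_bucket_name=default,
downstream_max=1024,
downstream_conn_max=4,
downstream_conn_queue_timeout=200,
downstream_timeout=5000,
wait_queue_timeout=200,
connect_max_errors=5,
connect_retry_interval=30000,
connect_timeout=400,
auth_timeout=100,
cycle=200
"""


def parse_moxi_config(text):
    """Split a comma-separated key=value string into a dict.

    Purely numeric values are converted to int; everything else stays a string.
    """
    settings = {}
    for item in text.replace("\n", "").split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        value = value.strip()
        settings[key.strip()] = int(value) if value.isdigit() else value
    return settings


config = parse_moxi_config(MOXI_CONFIG)

# Collect only the settings whose name ends in "timeout".
timeouts = {k: v for k, v in config.items() if k.endswith("timeout")}

for name, value in sorted(timeouts.items()):
    print(f"{name} = {value}")
```

Having the settings in a dictionary makes it easy to diff two runs (for example, with and without a changed `downstream_conn_max`) when trying to isolate the cause of a dip.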

      Observation:
      -Workload very steady around 8k ops/sec before rebalance
      -Workload drops to around 4k ops/sec during rebalance. Both gets and sets are affected. Compaction is not running when the load initially drops (it kicks in eventually). The workload is very choppy during rebalance.
      -Workload returns to 8k ops/sec after rebalance
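
For reference, the observed figures can be expressed as a relative drop. This is a small sketch using only the approximate numbers reported above (8k ops/sec baseline, 4k ops/sec during rebalance); the function name `relative_drop` is an illustrative choice.

```python
def relative_drop(baseline_ops, observed_ops):
    """Return the throughput drop as a fraction of the baseline."""
    if baseline_ops <= 0:
        raise ValueError("baseline must be positive")
    return (baseline_ops - observed_ops) / baseline_ops


# Approximate figures from the observation above.
drop = relative_drop(8000, 4000)
print(f"Throughput drop during rebalance: {drop:.0%}")
```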

      Client memcachetest and moxi logs:
      https://s3.amazonaws.com/customers.couchbase.com/demo_performance/client1.zip
      https://s3.amazonaws.com/customers.couchbase.com/demo_performance/client2.zip

      Cluster logs:
      https://s3.amazonaws.com/customers.couchbase.com/demo_performance/node1.zip
      https://s3.amazonaws.com/customers.couchbase.com/demo_performance/node2.zip
      https://s3.amazonaws.com/customers.couchbase.com/demo_performance/node3.zip
      https://s3.amazonaws.com/customers.couchbase.com/demo_performance/node4.zip
      https://s3.amazonaws.com/customers.couchbase.com/demo_performance/node5.zip (added to first 4)
      https://s3.amazonaws.com/customers.couchbase.com/demo_performance/node6.zip (added to first 4)

      1. Couchbase2Test.properties
        1 kB
        Pavel Paulau
      2. pillowfight-libcouchbase-with-moxi-timings.txt
        133 kB
        Sergey Avseyev
      3. pillowfight-libcouchbase-without-moxi-timings.txt
        2.37 MB
        Sergey Avseyev
      4. stats.tar
        226 kB
        Pavel Paulau
      1. aot_wrath_200_beginning.png
        553 kB
      2. aot_wrath_200_end.png
        527 kB
      3. aot_wrath_200_middle.png
        598 kB
      4. aot_wrath_201_beginning.png
        521 kB
      5. aot_wrath_201_end.png
        498 kB
      6. aot_wrath_201_middle.png
        556 kB
      7. aot_wrath_202_beginning.png
        502 kB
      8. aot_wrath_202_end.png
        471 kB
      9. aot_wrath_202_middle.png
        499 kB
      10. mb-7972-sergey-pillowfight-test.png
        136 kB
      11. pillowfight-libcouchbase-load.png
        136 kB
      12. pillowfight-libcouchbase-rebalance-after-remove-performance-drop.png
        555 kB
      13. pillowfight-libcouchbase-with-moxi.png
        525 kB
      14. pillowfight-libcouchbase-without-moxi.png
        518 kB
      15. roadrunner_200_beginning.png
        582 kB
      16. roadrunner_200_end.png
        542 kB
      17. roadrunner_200_midlle.png
        605 kB
      18. roadrunner_201_beginning.png
        574 kB
      19. roadrunner_201_end.png
        540 kB
      20. roadrunner_201_midlle.png
        607 kB
      21. Screen Shot 2013-04-26 at 11.59.39 AM.png
        41 kB
      22. Screen Shot 2013-04-26 at 12.02.08 PM.png
        42 kB
      23. ycsb_01.png
        576 kB
      24. ycsb_02.png
        573 kB

        Activity

        avsej Sergey Avseyev added a comment -

        Chiyoung, in your script there should be

        LIBCOUCHBASE_VERSION=2.0.6_16_ge05afd4

        instead of

        LIBCOUCHBASE_VERSION=2.0.6_10_g84c6187

        You also need ./src/gethrtime.c when compiling pillowfight.

        See my latest comment, from "06/Jun/13 4:39 PM", about how to build the pillowfight tool with the --dumb option:
        http://www.couchbase.com/issues/browse/MB-7972?focusedCommentId=60332&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-60332
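
A quick way to catch a stale version string in a build script is to compare the two values programmatically. The sketch below assumes the strings follow a git-describe-like layout (base version, number of commits since the tag, then `g` plus an abbreviated commit hash, joined by underscores). That layout is an assumption inferred from the two values quoted above, and the helper names are hypothetical.

```python
import re

# Assumed layout: <base version>_<commits since tag>_g<abbreviated hash>.
# This is inferred from the two strings in the comment, not from libcouchbase docs.
BUILD_RE = re.compile(r"^(\d+(?:\.\d+)*)_(\d+)_g([0-9a-f]+)$")


def parse_build(version):
    """Return ((major, minor, ...), commits_since_tag, short_hash)."""
    match = BUILD_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    base = tuple(int(part) for part in match.group(1).split("."))
    return base, int(match.group(2)), match.group(3)


def is_at_least(candidate, required):
    """True if candidate is the same build as, or newer than, required.

    Only the base version and commit count are compared; the hash is ignored.
    """
    return parse_build(candidate)[:2] >= parse_build(required)[:2]


REQUIRED = "2.0.6_16_ge05afd4"   # version named as correct in the comment
IN_SCRIPT = "2.0.6_10_g84c6187"  # version that appeared in the script

print(is_at_least(IN_SCRIPT, REQUIRED))
```

Comparing on commit count only makes sense when both builds descend from the same tag on the same branch, which is assumed here.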

        chiyoung Chiyoung Seo added a comment -

        Sergey,

        Yes, I used the correct version and was able to use the --dumb option in my tests. The instructions in my previous comments were not up to date.

        avsej Sergey Avseyev added a comment -

        I guess Chiyoung was using the correct version; it was just a copy/paste issue.

        chiyoung Chiyoung Seo added a comment -

        The ep-engine side fix was just merged into the 2.2.0 branch. The 2.2.0 RC2 build should include this fix:

        http://review.couchbase.org/#/c/28554/

        I am closing this bug because the above fix addresses the significant drop, although we still see a 5 - 15% drop, and the thread on this ticket has grown too long. I created a separate ticket, MB-9004, to address the remaining drop in the next release.

        perry Perry Krug added a comment -

        Thank you very much again, Chiyoung. I verified that this does produce a meaningful improvement in performance during a rebalance. The second fix also provides a noticeable benefit, so I eagerly look forward to that in the future release.

        Thanks to everyone who participated in this long thread of reproduction, isolation, investigation, resolving and testing.


          People

          • Assignee:
            chiyoung Chiyoung Seo
            Reporter:
            perry Perry Krug