Couchbase Server / MB-20133

Major regression in DCP performance

Details

    • Untriaged
    • Centos 64-bit
    • Yes

    Description

      This is a temporary "umbrella" ticket for several issues that we observe in weekly performance tests.

      Our regular performance benchmarks indicate a 10-20% regression in the following cases:

      • Initial and incremental views indexing
      • Views query latency and throughput
      • KV rebalance
      • Swap rebalance with views
      • Max. KV throughput (PillowFight workload)

      Rebalance-in with views doesn't finish at all. Logs are available here:

      http://perf.jenkins.couchbase.com/view/Weekly%20Linux/job/hestia/105/artifact/

      We noticed those issues in build 4.5.1-2750. The previous test cycle for build 4.5.1-2743 was OK.

      According to Wayne Siu, it could be a DCP issue, so we ran additional PillowFight tests to pinpoint the build in which the regression was introduced.

      It looks like the problems showed up in build 4.5.1-2748. That build had only one patch from Jim Walker:

      * Commit: e22c9ebeda1aac2fc8f4325cc39a93c3bcefffab (in build: 2748)
         Author: Jim Walker
         MB-18453: Make task scheduling fairer
         
         The MB identified that we can starve tasks by scheduling
         a higher priority task via ExecutorPool::wake().
         
         This occurs because ExecutorPool::wake() pushes tasks
         into the readyQueue enabling frequent wakes to trigger
         the starvation bug.
         
         The fix is to remove readyQueue.push from wake, so that we only
         push to the futureQueue. The fetch side of scheduling only looks at
         the futureQueue once the readyQueue is empty, thus the identified
         starvation won't happen.
         
         A unit-test demonstrates the fix using the single-threaded harness and
         expects that two tasks of differing priorities get executed, rather
         than the wake() starving the low-priority task.
         
         This test drives:
          - ExecutorPool::schedule
          - ExecutorPool::reschedule
          - ExecutorPool::wake
         
         These are all the methods which can add tasks into the scheduler
         queue.
         
         The fetch side is also covered:
          - ExecutorPool::fetchNextTask
         
         Change-Id: Ie797a637ce4e7066e3155751ff467bc65d083646
         Reviewed-on: http://review.couchbase.org/65385
         Well-Formed: buildbot 
         Tested-by: buildbot 
         Reviewed-by: Dave Rigby 
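
The scheduling behaviour described in the commit message can be illustrated with a toy model. This is a minimal sketch, not the ep-engine code: the `MiniScheduler` class, the `run` helper, and the `wake_pushes_to_ready` flag are all hypothetical names introduced for illustration. It assumes a single thread, a ready queue ordered by priority (lower value runs first), a future queue ordered by wake time, and a fetch step that only consults the future queue once the ready queue is empty, as the commit message states.

```python
import heapq
import itertools


class MiniScheduler:
    """Toy single-threaded model of a two-queue task scheduler (hypothetical)."""

    def __init__(self, wake_pushes_to_ready):
        # True models the behaviour before the fix, False the behaviour after it.
        self.wake_pushes_to_ready = wake_pushes_to_ready
        self.ready = []   # heap of (priority, seq, name); lower priority value runs first
        self.future = []  # heap of (wake_time, seq, priority, name)
        self.seq = itertools.count()  # tie-breaker so heap entries never compare equal
        self.now = 0

    def schedule(self, name, priority, wake_time=0):
        heapq.heappush(self.future, (wake_time, next(self.seq), priority, name))

    def wake(self, name, priority):
        if self.wake_pushes_to_ready:
            # Pre-fix model: the woken task bypasses the future queue.
            heapq.heappush(self.ready, (priority, next(self.seq), name))
        else:
            # Post-fix model: the woken task is due "now" but still goes
            # through the future queue like every other task.
            self.schedule(name, priority, wake_time=self.now)

    def fetch_next_task(self):
        # The future queue is only consulted once the ready queue is empty.
        if not self.ready:
            while self.future and self.future[0][0] <= self.now:
                _, _, priority, name = heapq.heappop(self.future)
                heapq.heappush(self.ready, (priority, next(self.seq), name))
        if not self.ready:
            return None
        priority, _, name = heapq.heappop(self.ready)
        return name, priority


def run(wake_pushes_to_ready, steps=10):
    """Run a frequently woken high-priority task next to a one-shot low-priority task."""
    pool = MiniScheduler(wake_pushes_to_ready)
    pool.schedule("high", priority=0, wake_time=0)
    pool.schedule("low", priority=5, wake_time=1)
    executed = []
    for _ in range(steps):
        task = pool.fetch_next_task()
        if task is None:
            break
        name, priority = task
        executed.append(name)
        if name == "high":
            pool.wake(name, priority)  # the high-priority task is woken after every run
        pool.now += 1
    return executed


if __name__ == "__main__":
    print("wake pushes to ready queue: ", run(True))
    print("wake pushes to future queue:", run(False))
```

In this model, when `wake()` pushes directly to the ready queue, that queue never drains, so the low-priority task is never moved out of the future queue. When `wake()` goes through the future queue, both tasks are moved to the ready queue together and both get executed. The model says nothing about the throughput cost of the extra queue hop, which is the concern of this ticket.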
      

      cc Dave Finlay, Dave Rigby.

          People

            jwalker Jim Walker
            pavelpaulau Pavel Paulau (Inactive)