Couchbase Server / MB-23074

Performance issues when running Couchbase Server on CentOS 7.3 with kernel 3.10.0-514.6

Details

    Description

      We upgraded our performance clusters to CentOS 7.3 a few days ago.
      Unfortunately, that upgrade caused a lot of trouble:

      • There was a ~60% performance drop in DGM (disk-greater-than-memory) cases.
      • KV latency in non-DGM cases became more inconsistent.

      I started by analyzing the most basic case: the initial data load. I noticed that the drain rate became choppier on 3 boxes (see screenshot), while one server was working just fine.

      I tried to examine IO performance using standalone benchmarks but I didn't manage to find anything interesting. Only read and write performance of Couchbase Server was affected.

      Eventually I noticed a tiny difference between those boxes. The "bad" machines had kernel 3.10.0-514.6.2 and the "good" machine had 3.10.0-514.2.2. A few experiments confirmed that the upgrade from *.514.2.2 to *.514.6.2 caused all those problems.
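      For anyone who wants to check which of the two kernels a box is running, here is a minimal sketch. It is not part of the original investigation. The function names are hypothetical, and the `.el7.x86_64` suffix in the usage note is an assumption about how the release string typically looks on CentOS/RHEL 7. The script only recognises the two exact releases named above and makes no claim about any other kernel.

```python
import platform
import re

# The two releases named in this report. Anything else is reported as "unknown".
BAD_RELEASE = (3, 10, 0, 514, 6, 2)   # boxes with the choppy drain rate
GOOD_RELEASE = (3, 10, 0, 514, 2, 2)  # the box that worked fine


def parse_release(release):
    """Return the leading numeric fields of a kernel release string as a tuple.

    Trailing non-numeric parts (for example a distribution tag) are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+(?:\.\d+)*)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    head = [int(match.group(i)) for i in (1, 2, 3)]
    tail = [int(part) for part in match.group(4).split(".")]
    return tuple(head + tail)


def classify(release):
    """Label a release as "bad", "good" or "unknown" relative to this report."""
    fields = parse_release(release)
    if fields == BAD_RELEASE:
        return "bad"
    if fields == GOOD_RELEASE:
        return "good"
    return "unknown"


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    running = platform.release()
    try:
        print(running, "->", classify(running))
    except ValueError as exc:
        print(exc)
```

      Called with a string such as `3.10.0-514.6.2.el7.x86_64`, `classify` returns `"bad"`; with `3.10.0-514.2.2.el7.x86_64` it returns `"good"`.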

      I downgraded our servers all the way to 3.10.0-317 and relaxed, until I started working with a setup provided by one of our partners. That setup has RHEL 7.3 with 3.10.0-514.6.2 and I am supposed to run some heavy DGM workloads...

      RHEL/CentOS is a very conservative distribution, so who knows how long this issue will remain open. I think we had better find out what exactly happened before other people start hitting the same problem.

      Attachments

        1. drain.png
          801 kB
          Pavel Paulau

            People

              wayne Wayne Siu
              pavelpaulau Pavel Paulau (Inactive)