Couchbase Server / MB-23074

Performance issues when running Couchbase Server on CentOS 7.3 with kernel 3.10.0-514.6

    Details

      Description

      We upgraded our performance clusters to CentOS 7.3 a few days ago.
      Unfortunately, that upgrade caused a lot of trouble:

      • There was a ~60% drop in DGM cases.
      • KV latency in non-DGM cases became more inconsistent.

      I started analyzing the most basic case with the initial data load. I noticed that the drain rate became more choppy on 3 boxes (see screenshot) while one server was working just fine.

      I tried to examine IO performance using standalone benchmarks but I didn't manage to find anything interesting. Only read and write performance of Couchbase Server was affected.

      Eventually I noticed a tiny difference between those boxes. The "bad" machines had kernel 3.10.0-514.6.2 and the "good" machine had 3.10.0-514.2.2. A few experiments confirmed that the upgrade from *.514.2.2 to *.514.6.2 caused all those problems.
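
      The comparison above can be automated when checking many boxes. The following is a minimal sketch, not part of the original investigation: the function names and the `KNOWN_BAD` / `KNOWN_GOOD` constants are hypothetical, the `.el7.x86_64` suffix in the examples is assumed from the usual CentOS 7 `uname -r` format, and only the two kernel builds named in this ticket are classified. Every other build is reported as "unknown" rather than guessed.

```python
import platform
import re

# Builds named in this ticket (hypothetical constant names).
KNOWN_BAD = ((3, 10, 0), (514, 6, 2))   # drain rate became choppy
KNOWN_GOOD = ((3, 10, 0), (514, 2, 2))  # worked fine


def parse_kernel_release(release):
    """Split a release string into (version, build) integer tuples.

    Example: '3.10.0-514.6.2.el7.x86_64' -> ((3, 10, 0), (514, 6, 2)).
    Returns None if the string does not look like 'X.Y.Z-A.B.C...'.
    """
    match = re.match(r"^(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)", release)
    if match is None:
        return None
    version = tuple(int(part) for part in match.group(1).split("."))
    build = tuple(int(part) for part in match.group(2).split("."))
    return version, build


def classify(release):
    """Return 'bad', 'good', or 'unknown' for a kernel release string."""
    parsed = parse_kernel_release(release)
    if parsed == KNOWN_BAD:
        return "bad"
    if parsed == KNOWN_GOOD:
        return "good"
    # Anything else was not tested in this ticket, so make no claim.
    return "unknown"


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r` on Linux.
    running = platform.release()
    print(running, "->", classify(running))
```

      Run on each node, this prints the running kernel release and whether it matches one of the two builds compared here.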

      I downgraded our servers all the way to 3.10.0-317 and relaxed. Until I started working with a setup provided by one of our partners. That setup has RHEL 7.3 with 3.10.0-514.6.2 and I am supposed to run some heavy DGM workloads...

      RHEL/CentOS is a very conservative distribution. Who knows how long this issue will remain open. I think we had better find out what exactly happened before other people start hitting the same problem.


            Activity

            amarantha.kulkarni Amarantha Kulkarni added a comment -

            Sure, I created DOC-2386 to track the release notes update; it is currently planned for Beta 2. Please let me know if you want this added to the Beta 1 release notes.

            pavelpaulau Pavel Paulau (Inactive) added a comment -

            According to RH, this is expected behavior.

            The above-mentioned patch addresses a critical integrity issue. The fix comes with a price.

            amarantha.kulkarni Amarantha Kulkarni added a comment -

            Description for release notes:

            Summary: Performance issues may be observed when running Couchbase Server on CentOS 7.3 with kernel 3.10.0-514.6.

            prainer Peter Rainer added a comment -

            Is this problem limited to kernel version 3.10.0-514.6, or does it also occur in newer kernel versions (e.g. 3.10.0-862.2.3)?

            lynn.straus Lynn Straus added a comment -

            Found this as an open ticket in spock.next, which is not active. Added the Mad-Hatter fix version so that the ticket can be triaged.


              People

              • Assignee:
                pavelpaulau Pavel Paulau (Inactive)
                Reporter:
                pavelpaulau Pavel Paulau (Inactive)
              • Votes:
                1
                Watchers:
                14
