  1. Couchbase Server
  2. MB-32956

KV throughput regression in SSL scenarios


Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Cannot Reproduce
    • 6.5.0
    • None
    • couchbase-bucket
    • Untriaged
    • Centos 64-bit
    • Unknown

    Description

      We have several SSL scenarios, and all of them show a throughput regression between builds 6.5.0-2185 and 6.5.0-2201. I can't identify the exact build or change because none of the builds in between work on CentOS.

      Scenarios and results:

      Max ops/sec, cbc-pillowfight, 2 nodes, 80/20 R/W, 512B JSON items, 1K batch size, TLS
      http://showfast.sc.couchbase.com/#/timeline/Linux/kv/max_ops/all

      This is a basic pillowfight test run with a TLS certificate (--certpath root.pem)
      6.5.0-2185 - 314K ops/sec 
      6.5.0-2201 - 292K ops/sec
      Logs: http://perf.jenkins.couchbase.com/job/ares/9352/console
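The exact command line used by the perf harness is not quoted in this issue. As a rough sketch only, the parameters in the scenario title (80/20 R/W, 512B JSON items, 1K batch size, TLS with --certpath root.pem) could map onto cbc-pillowfight options as below. The host name, bucket name, and the omission of credentials, thread count, and item count are assumptions made for illustration.

```python
import shlex


def pillowfight_tls_command(host, bucket, cert_path="root.pem"):
    """Build an argv list for a TLS-enabled cbc-pillowfight run.

    `host` and `bucket` are hypothetical placeholders; they do not come
    from the issue. Credentials and thread/item counts are left out.
    """
    return [
        "cbc-pillowfight",
        "--spec", f"couchbases://{host}/{bucket}",  # couchbases:// scheme selects TLS
        "--certpath", cert_path,                    # root certificate, as in the issue
        "--batch-size", "1000",                     # 1K batch size
        "--set-pct", "20",                          # 20% writes, i.e. 80/20 R/W
        "--min-size", "512",                        # fixed 512-byte items
        "--max-size", "512",
        "--json",                                   # JSON documents
    ]


cmd = pillowfight_tls_command("kv-node-1.example.com", "bucket-1")
print(shlex.join(cmd))
```

Building the argument list rather than a shell string keeps each parameter visible next to the part of the scenario title it is meant to reflect.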

       

      Avg Throughput (ops/sec), Workload A, 3 nodes, 12 vCPU, SSL

      http://showfast.sc.couchbase.com/#/timeline/Linux/kv/ycsb/all

      YCSB Workload A test with SSL enabled when creating the connection object (Java SDK, builder.sslEnabled(true))

      6.5.0-2185 - 2158K ops/sec
      6.5.0-2201 - 172K ops/sec

      6.5.0-2185 logs and CPU profiles: http://perf.jenkins.couchbase.com/view/Weekly/job/hebe/2911/
      6.5.0-2201 logs and CPU profiles: http://perf.jenkins.couchbase.com/job/hebe/2909/console

       

      Avg Throughput (queries/sec), Workload E, MOI, SSL, 4 nodes

      http://showfast.sc.couchbase.com/#/timeline/Linux/n1ql/ycsb/MOI

      Another YCSB test, Workload E

      6.5.0-2158 - 10K ops/sec
      6.5.0-2214 - 9K ops/sec
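To put the three scenarios on a common scale, the relative change between the earlier and later build can be computed from the figures above. This is a small sketch, not part of the perf tooling; the numbers (in K ops/sec) are copied exactly as written in the description, including the Workload A baseline of 2158K, and the scenario labels are shortened by hand.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (scenario, earlier build K ops/sec, later build K ops/sec), as reported above
results = [
    ("pillowfight max ops/sec, TLS", 314, 292),
    ("YCSB Workload A, SSL", 2158, 172),
    ("YCSB Workload E, MOI, SSL", 10, 9),
]

for name, before, after in results:
    print(f"{name}: {pct_change(before, after):+.1f}%")
```

With the figures as reported this prints -7.0%, -92.0%, and -10.0% respectively.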

       

       

       

       


        Activity

          oleksandr.gyryk Alex Gyryk (Inactive) created issue -
          wayne Wayne Siu made changes -
          Affects Version/s Mad-Hatter [ 15037 ]
          wayne Wayne Siu made changes -
          Fix Version/s Mad-Hatter [ 15037 ]
          oleksandr.gyryk Alex Gyryk (Inactive) made changes -
          Component/s couchbase-bucket [ 10173 ]
          Operating System Centos 64-bit [ 10020 ]
          Assignee Alex Gyryk [ oleksandr.gyryk ] Dave Rigby [ drigby ]
          Labels TLS performance
          drigby Dave Rigby added a comment -

          Thanks Alex. Quick look at the changelog between the good / bad build numbers shows a number of kv_engine changes: http://172.23.123.43:8000/getchangelog?product=couchbase-server&fromb=6.5.0-2185&tob=6.5.0-2201

          Interestingly, there are no obvious SSL changes that I could see. I'll investigate further tomorrow.

          drigby Dave Rigby added a comment -

          Trond Norbye - could you take a look at this please?

          drigby Dave Rigby made changes -
          Assignee Dave Rigby [ drigby ] Trond Norbye [ trond ]
          trond Trond Norbye added a comment -

          We haven't changed anything related to how we handle SSL connections. I assume we haven't installed any OS patches that could have changed anything?

          trond Trond Norbye added a comment - - edited

          Alex Gyryk - Can I get access to the machines so I can manually run the client and the server to reproduce this (and run git bisect to figure out which commit introduced the regression)? (I'm located in Norway, working 8-16 CET.)

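The bisection idea mentioned above can be sketched as a binary search over build numbers. This is an illustration under stated assumptions, not the procedure actually used: git bisect works on commits rather than build numbers, the `is_bad` predicate stands in for a real benchmark run, and the breakpoint in the example is made up and is not a finding from this issue. The sketch also does not handle builds that cannot be tested at all, which the description says was the case for the intermediate builds on CentOS.

```python
def first_bad_build(good, bad, is_bad):
    """Return the earliest build in (good, bad] for which is_bad() is true.

    Assumes `good` is known to pass, `bad` is known to fail, and every
    build after the first bad one is also bad.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid    # regression is at mid or earlier
        else:
            good = mid   # regression is after mid
    return bad


# Hypothetical predicate: pretend the regression landed in build 2193.
# The number 2193 is invented purely to exercise the search.
tested = []


def fake_is_bad(build):
    tested.append(build)
    return build >= 2193


print(first_bad_build(2185, 2201, fake_is_bad), tested)
```

For the 16-build range between 2185 and 2201, the search needs at most four benchmark runs.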

          oleksandr.gyryk Alex Gyryk (Inactive) added a comment -

          Trond Norbye - sure, both the hebe and ares clusters will be available for you tomorrow (your day).
          oleksandr.gyryk Alex Gyryk (Inactive) added a comment -

          BTW, the recent weekly run on build 2282 confirmed the regression in SSL scenarios:
          http://showfast.sc.couchbase.com/#/timeline/Linux/kv/ycsb/all
          http://showfast.sc.couchbase.com/#/timeline/Linux/kv/max_ops/all
          trond Trond Norbye added a comment -

          In build 2360 it is back where it used to be. Marking it resolved with "cannot reproduce" (we've not changed anything related to SSL for a long time, so this is most likely a regression elsewhere in the system which is now fixed)

          trond Trond Norbye made changes -
          Resolution Cannot Reproduce [ 5 ]
          Status Open [ 1 ] Resolved [ 5 ]
          trond Trond Norbye made changes -
          Actual End 2019-02-19 20:32 (issue has been resolved)
          wayne Wayne Siu made changes -
          Assignee Trond Norbye [ trond ] Alex Gyryk [ oleksandr.gyryk ]
          wayne Wayne Siu made changes -
          Fix Version/s Mad-Hatter [ 15037 ]
          wayne Wayne Siu added a comment -

          Closing; the issue was not reproducible.

          wayne Wayne Siu made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

          People

            oleksandr.gyryk Alex Gyryk (Inactive)
            oleksandr.gyryk Alex Gyryk (Inactive)