  1. Couchbase C client library libcouchbase
  2. CCBC-1525

68% decrease in throughput for KV collections tests

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.2.0, 3.2.1, 3.2.2, 3.2.3, 3.2.4
    • Fix Version/s: 3.2.5
    • Component/s: library
    • 1

    Description

      In our 50/50 read/write KV throughput tests that use collections, we have been seeing very large regressions with LCB 3.2.3 and also LCB 3.2.4. For example: http://showfast.sc.couchbase.com/#/runs/kv_max_ops_balanced_512_1000s_1000c_ares/7.1.0-1695

       

      Some testing confirms that it is a regression introduced with LCB 3.2.0, and is different to the regression that led to CCBC-1515 (which was fixed in LCB 3.2.4). The following three runs demonstrate this:

       

       

      It would appear that the issue is similar to that in CCBC-1515, wherein we are using std::stringstream somewhat unnecessarily. This time it looks like the culprit is in collection_qualifier.hh:

      https://github.com/couchbase/libcouchbase/blob/master/src/capi/collection_qualifier.hh#L51

       

      Changing the construction of spec_ to use simple string appending instead of a stringstream would help, given that the collection_qualifier is used for every KV operation when using collections. This quickbench link demonstrates that it would be worth doing this:

      https://quick-bench.com/q/W-yAn-1scyinkDlyd-yKFvSUdhc
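
      As a rough illustration of the change proposed above, here is a minimal sketch of the two ways of building the spec string. The function names and the "scope.collection" layout are assumptions made for this example; this is not the actual code from collection_qualifier.hh.

      ```cpp
      #include <cassert>
      #include <sstream>
      #include <string>

      // Hypothetical stand-in for the stream-based construction: every call
      // builds and tears down a std::stringstream, then copies the result out.
      std::string make_spec_stream(const std::string& scope, const std::string& collection)
      {
          std::stringstream ss;
          ss << scope << '.' << collection;
          return ss.str();
      }

      // Hypothetical append-based replacement: reserve the final size once,
      // then append the pieces directly into the string.
      std::string make_spec_append(const std::string& scope, const std::string& collection)
      {
          std::string spec;
          spec.reserve(scope.size() + 1 + collection.size());
          spec.append(scope);
          spec.push_back('.');
          spec.append(collection);
          return spec;
      }

      // Sanity check that both variants produce the same spec for given inputs.
      void check_same_output(const std::string& scope, const std::string& collection)
      {
          assert(make_spec_stream(scope, collection) == make_spec_append(scope, collection));
      }
      ```

      Both functions return the same string; the append version simply avoids creating a stream object on each call, which matters when the code runs once per KV operation.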

       

      I fully understand this is something I should have found when investigating CCBC-1515, so I apologise that this can now only be fixed for LCB 3.2.5.

        Activity

          Daniel.nagy Daniel Nagy added a comment -

          I've created a patch that replaces the use of stringstream in collection_qualifier.hh with a plain string append. I've run one test with this patch so far, and the results are promising:

          http://perf.jenkins.couchbase.com/job/ares-sdk/184 Throughput: 2,877,418

          I'd like to run a few more tests to confirm that this patch is enough, or whether more needs to be done.

          Daniel.nagy Daniel Nagy added a comment - - edited

          I've rerun a few of the tests that were performing poorly with LCB 3.2.4 before the above patch. Here are the results:

           

           

          Test                                      Throughput (GA 3.2.4)  Throughput (patched 3.2.4)  Link (GA 3.2.4)                                   Link (patched 3.2.4)
          50/50 RW, 1000 scopes, 1000 collections   961183                 2913686                     http://perf.jenkins.couchbase.com/job/ares/25441  http://perf.jenkins.couchbase.com/job/ares-sdk/191/
          50/50 RW, 1 scope, 1000 collections       1168809                2867797                     http://perf.jenkins.couchbase.com/job/ares/25442  http://perf.jenkins.couchbase.com/job/ares-sdk/192
          50/50 RW, 1 scope, 1 collection           977414                 3007279                     http://perf.jenkins.couchbase.com/job/ares/25443  http://perf.jenkins.couchbase.com/job/ares-sdk/193
          50/50 RW, default collection              977131                 3075142                     http://perf.jenkins.couchbase.com/job/ares/25444  http://perf.jenkins.couchbase.com/job/ares-sdk/194
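
          To put the throughput figures above in relative terms, here is a small sketch that computes the speedup and the percentage shortfall for a pair of results. The helper names are made up for this example, and the inputs are just the numbers reported in the table.

          ```cpp
          #include <cstdio>

          // Factor by which throughput rose from `before` to `after`.
          double speedup(double before, double after)
          {
              return after / before;
          }

          // How far `before` sits below `after`, as a percentage of `after`.
          double percent_below(double before, double after)
          {
              return (1.0 - before / after) * 100.0;
          }

          // Prints both figures for one row of the results table.
          void report(const char* test, double ga, double patched)
          {
              std::printf("%-42s %.2fx, GA %.1f%% below patched\n",
                          test, speedup(ga, patched), percent_below(ga, patched));
          }
          ```

          For instance, report("50/50 RW, 1000 scopes, 1000 collections", 961183, 2913686) shows roughly a threefold improvement, with the GA build about two thirds below the patched one.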

           

          All of these new results are back in line with what we get with LCB 3.0.7 on newer Couchbase Server builds, as can be seen on ShowFast: http://showfast.sc.couchbase.com/#/timeline/Linux/kv/max_ops/all

           

          Overall I'm confident that the patch has adequately addressed the performance regression, so I think we can close the ticket.

           


          build-team Couchbase Build Team added a comment -

          Build couchbase-server-7.1.0-2277 contains libcouchbase commit 45dcd8f with commit message:
          CCBC-1525: remove stringstream in collection_qualifier

          People

            avsej Sergey Avseyev
            Daniel.nagy Daniel Nagy