Couchbase C client library libcouchbase / CCBC-1477

CPU time x2.2 when upgrading to Couchbase Server 7.0.0?


Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 3.2.1
    • Fix Version/s: 3.3.2
    • Component/s: library
    • Labels: None
    • Environment: Ubuntu 20.04 64 bits
    • Flagged: Impediment

    Description

      On upgrading to 7.0.0, we are seeing client-side CPU time increase by roughly 2.2x, which is huge. That CPU time covers the application plus libcouchbase.

      Anything we're doing wrong?

      We're not using 7.0.0 features, and the application mostly saves documents. Nothing fancy.

      The number of writing threads is set to 8 on a 32-core machine.

      We're not using any compression explicitly, and even when setting the minimum compression size to a very large number (which should inhibit compression), the CPU cost stays the same.
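      A minimal sketch of disabling compression outright via the connection string, assuming lcb 3.x's documented compression options (compression=off, compression_min_size); host and bucket names are placeholders:

          #include <libcouchbase/couchbase.h>
          #include <string.h>

          int main(void)
          {
              lcb_INSTANCE *instance;
              lcb_CREATEOPTS *options = NULL;
              /* "compression=off" disables client-side compression entirely;
               * "compression_min_size" only raises the threshold. */
              const char *connstr = "couchbase://127.0.0.1/bucket?compression=off";

              lcb_createopts_create(&options, LCB_TYPE_BUCKET);
              lcb_createopts_connstr(options, connstr, strlen(connstr));
              if (lcb_create(&instance, options) != LCB_SUCCESS) {
                  return 1;
              }
              lcb_createopts_destroy(options);
              /* ... bootstrap with lcb_connect()/lcb_wait() and run as usual ... */
              lcb_destroy(instance);
              return 0;
          }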

      Thanks for your help!


        Activity

          ingenthr Matt Ingenthron added a comment:

          Vincent Lextrait: can you give us some detail on how lcb is configured? And what are you comparing specifically? Did you upgrade just the cluster side, or both the cluster and the SDK side?

          Note that among the behavioral changes from earlier SDKs (2.x to 3.x) are built-in retries and Response Time Observability (tracing). Quite a bit of effort went into ensuring that tracing doesn't use much CPU, but it uses some, and something could be different between our environment and yours. If it is enabled, you may want to try disabling it to see whether it's the cause.

          Any other profiling info would be quite useful as well.
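          A minimal sketch of switching tracing off per instance, assuming the LCB_CNTL_ENABLE_TRACING setting from the lcb 3.x settings list (verify the constant against your cntl.h):

              #include <libcouchbase/couchbase.h>

              /* Disable Response Time Observability tracing on an existing
               * instance to rule out its CPU cost. */
              static void disable_tracing(lcb_INSTANCE *instance)
              {
                  int enable = 0; /* 0 = off, 1 = on */
                  lcb_cntl(instance, LCB_CNTL_SET, LCB_CNTL_ENABLE_TRACING, &enable);
              }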


          vincent.lextrait Vincent Lextrait added a comment:

          Hello Matt, I finally nailed it down to the RAM quota on the bucket. CPU time on the client seems to be affected by a shortage of RAM in the bucket quota on the server side. Maybe when it is too small, it triggers retries on the client side?

          vincent.lextrait Vincent Lextrait added a comment:

          We automate the creation of buckets, and we use cbc bucket-create. It assigns only 100MB of RAM quota by default.
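          The quota can also be set explicitly at creation time. A sketch, assuming cbc bucket-create accepts a --ram-quota option in MB (check cbc bucket-create --help for the exact flag; credentials and bucket name are placeholders):

              # create the bucket with a 10GB RAM quota instead of the default
              cbc bucket-create -U couchbase://127.0.0.1 -u Administrator -P password \
                  --ram-quota 10240 mybucket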


          avsej Sergey Avseyev added a comment:

          Vincent Lextrait, do you see it on an empty but small bucket, or on any bucket whose utilization reaches 100%?


          vincent.lextrait Vincent Lextrait added a comment:

          Hello Sergey, we create an empty bucket with cbc bucket-create. It assigns a 100MB RAM quota (which I had forgotten to increase, my bad), and we start saving a lot of documents to that bucket (~500,000). It's the only bucket in the instance.

          What confused me was that the server still completed the task; only CPU consumption on the libcouchbase side increased (by x2-x2.5). So my eyes were only on that side.

          The pattern we use with libcouchbase is: async-save ~5,000 documents, then wait on the instance, then proceed with 5,000 more, until we have saved all 500,000.
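          In lcb 3.x terms that loop is roughly the following (a sketch; the keys/values and the callback body are placeholders):

              #include <libcouchbase/couchbase.h>
              #include <string.h>

              /* installed once with:
               * lcb_install_callback(instance, LCB_CALLBACK_STORE,
               *                      (lcb_RESPCALLBACK)store_cb); */
              static void store_cb(lcb_INSTANCE *instance, int cbtype,
                                   const lcb_RESPSTORE *resp)
              {
                  (void)instance;
                  (void)cbtype;
                  if (lcb_respstore_status(resp) != LCB_SUCCESS) {
                      /* count or log failures here */
                  }
              }

              /* schedule one batch of async upserts, then block until it drains */
              static void save_batch(lcb_INSTANCE *instance, const char **keys,
                                     const char **docs, size_t n)
              {
                  for (size_t i = 0; i < n; ++i) {
                      lcb_CMDSTORE *cmd;
                      lcb_cmdstore_create(&cmd, LCB_STORE_UPSERT);
                      lcb_cmdstore_key(cmd, keys[i], strlen(keys[i]));
                      lcb_cmdstore_value(cmd, docs[i], strlen(docs[i]));
                      lcb_store(instance, NULL, cmd);
                      lcb_cmdstore_destroy(cmd);
                  }
                  lcb_wait(instance, LCB_WAIT_DEFAULT); /* the "wait on the instance" step */
              }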

          When assigning 10GB RAM to the bucket, everything is back to normal.

          Thanks!


          vincent.lextrait Vincent Lextrait added a comment:

          Clarification: the x2-x2.5 is for the entire application, including libcouchbase, so the extra cost within libcouchbase must be huge, as the remainder stays the same.


          vincent.lextrait Vincent Lextrait added a comment:

          We have a theory that you might be able to confirm (and that might explain another ticket we opened). libcouchbase uses libev, libuv, or libevent for asynchronous TCP I/O. These libraries probably have "active loops", meaning that they put sockets in async mode and loop around recv and sendmsg, attempting to receive or send whatever the socket can handle. So (especially on the receive side) the slower the peer, the more CPU is used on the client. Counter-intuitive but true.

          So the client CPU consumption might be unrelated to the way libcouchbase is written; it may just be how the async library works.

          Is this possible? We're witnessing the same thing with Boost.Asio, and at least in that case that's the explanation. Asio is state of the art and has active loops (infinite for loops); if it could do otherwise, I guess it would. libev et al. must do the same?
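          Concretely, the "active loop" pattern being described looks roughly like this (an illustration of the theory, not lcb's actual I/O code):

              #include <errno.h>
              #include <sys/types.h>
              #include <sys/socket.h>

              /* Drain a non-blocking socket until it would block; each event-loop
               * wakeup may cost several recv() calls when data arrives in small
               * chunks. */
              static void drain(int fd, void (*consume)(const char *, ssize_t))
              {
                  char buf[8192];
                  for (;;) {
                      ssize_t nr = recv(fd, buf, sizeof buf, 0);
                      if (nr > 0) { consume(buf, nr); continue; }
                      if (nr < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                          break; /* nothing left to read right now */
                      break;     /* 0 = peer closed, or a hard error */
                  }
              }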

          People

            Assignee: Sergey Avseyev (avsej)
            Reporter: Vincent Lextrait (vincent.lextrait)
            Votes: 0
            Watchers: 3
