  1. Couchbase C client library libcouchbase
  2. CCBC-170

After upgrading to 2.0.2 with CBC-163, I now get timeouts again!!!

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.0.2
    • Fix Version/s: 2.2.0
    • Component/s: library
    • Security Level: Public
    • Labels:
      None
    • Environment:
      CentOS 5.x, 64-bit, libev version

      Description

      I was using 2.0.0 beta3 with the patch for TIMEOUTS manually applied. This was working fine, but I had high CPU. It ran for weeks (if not a month) without any
      timeouts at all, and I was the happiest with lcb that I had been in a long time. However, after getting some complaints from my server admins and NOC, I upgraded to
      2.0.2 with the fix for high CPU.

      I rebuilt all modules and rolled to production. It has now been about 4 hours and I have received 2 timeouts, whereas I had NONE for weeks without this change.

      The load is now normal, but somehow something has been broken by this fix.

      I looked at the change with CBC-163, but I don't see how this could have broken the code. But it has, I promise. Can someone please take a look?

      HELP!


        Activity

        mleib Michael Leib created issue -
        mleib Michael Leib added a comment -

        I have changed my code so that an ev_timer throttles the millions of lcb_get() requests that I need to make: they are now issued in chunks from a queue, rather than all in a single loop that then leaves libev to handle everything. That appears to be working. There is still a problem with the latest code if you make millions of requests within a single function call before control is returned to the libev event loop. This doesn't happen every time, but it does happen frequently enough.

        Keeping the number of lcb_get() calls sitting on the event stack low makes a BIG difference.

        Once again, the version that didn't have the high-load fix implemented worked great (except for the load). With the fix applied, large numbers of pending events cause TIMEOUT returns from lcb_get().
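
        The throttling described above could look roughly like the sketch below. It is not the reporter's code: `key_queue`, `drain_chunk`, `queue_drained`, and `count_issue` are hypothetical names, and the `issue` hook stands in for whatever schedules a single lcb_get() in the real program. The idea is that a repeating ev_timer callback would call `drain_chunk` once per tick, so only a bounded number of requests are queued before control returns to the libev event loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the application's queue of pending keys. */
typedef struct {
    const char **keys;  /* keys waiting to be fetched */
    size_t count;       /* total number of keys in the array */
    size_t next;        /* index of the next key to issue */
} key_queue;

/* Hypothetical hook: the real program would schedule one lcb_get() here. */
typedef void (*issue_fn)(const char *key, void *cookie);

/* Issue at most max_per_tick keys, then return so the event loop can run.
 * Intended to be called from a repeating timer callback.
 * Returns the number of keys issued on this tick. */
size_t drain_chunk(key_queue *q, size_t max_per_tick, issue_fn issue, void *cookie)
{
    size_t issued = 0;

    assert(q != NULL && issue != NULL);
    while (issued < max_per_tick && q->next < q->count) {
        issue(q->keys[q->next], cookie);
        q->next++;
        issued++;
    }
    return issued;
}

/* Non-zero once every key has been handed to the issue hook. */
int queue_drained(const key_queue *q)
{
    assert(q != NULL);
    return q->next >= q->count;
}

/* Example hook used for illustration: counts issued keys via the cookie. */
void count_issue(const char *key, void *cookie)
{
    (void)key;
    (*(size_t *)cookie)++;
}
```

        In a real program the timer callback would stop the timer once `queue_drained` reports that nothing is left. The chunk size and the timer interval together cap throughput, so both would need tuning.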

        mleib Michael Leib added a comment -

        And, of course, even with an ev_timer throttle of 0.01 secs, performance is about 1/4 of what it was when I just threw everything on the queue:
        about 5k/sec throughput vs 20k/sec, just focusing on lcb_get() returns. My CB cluster (2 servers) is reporting 50K ops/sec consistently
        without issue.
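
        A rough back-of-the-envelope check on these numbers, assuming the timer interval is the only thing limiting throughput (which this report does not confirm): a 0.01 sec timer fires about 100 times per second, so 5k/sec corresponds to roughly 50 requests per tick, and matching the earlier 20k/sec would take roughly 200 per tick. The helper name `per_tick_needed` is hypothetical.

```c
/* Requests that must be issued on each timer tick to reach a target rate,
 * assuming the timer is the only limit on throughput. Rounds up.
 * ticks_per_sec must be non-zero. */
unsigned long per_tick_needed(unsigned long target_per_sec,
                              unsigned long ticks_per_sec)
{
    return (target_per_sec + ticks_per_sec - 1) / ticks_per_sec;
}
```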

        mnunberg Mark Nunberg added a comment -

        I'm going to assume most timeout issues were fixed in 2.2.0 and will close this.

        mnunberg Mark Nunberg made changes -
        Field: Fix Version/s | Original Value: (none) | New Value: 2.2.0 [ 11003 ]
        avsej Sergey Avseyev made changes -
        Field: Assignee | Original Value: Sergey Avseyev [ avsej ] | New Value: Mark Nunberg [ mnunberg ]
        mnunberg Mark Nunberg made changes -
        Field: Status | Original Value: Open [ 1 ] | New Value: Resolved [ 5 ]
        Field: Resolution | Original Value: (none) | New Value: Incomplete [ 4 ]
        brett19 Brett Lawson made changes -
        Field: Workflow | Original Value: jira [ 23766 ] | New Value: Couchbase SDK Workflow [ 44010 ]

          People

          • Assignee:
            mnunberg Mark Nunberg
            Reporter:
            mleib Michael Leib
          • Votes:
            0
            Watchers:
            2

            Dates

            • Created:
              Updated:
              Resolved:

              Gerrit Reviews

              There are no open Gerrit changes