Couchbase .NET client library / NCBC-3712

Throughput drops to 0 on Subdoc workload on rebalancing cluster

Details

    • Bug
    • Resolution: Fixed
    • Major
    • SDK12: Scp Fnc, FIT, Misc, SDK14: CLoE + Others, SDK16: Clmnr Proto & C++

    Description

      While running the SDKD jobs, we found that a number of the subdoc workload tests have begun to fail:

      http://sdkqe-testresults.couchbase.com.s3.amazonaws.com/SDK-SDK/CB-7.2.4-7070/Rb2OutEpt-SUBDOC/03-08-24/050401/d6247cae2c76fdaa338a1ac449e833ef-SD.html

      .NET Logs:

      SdkdConsole.log

      In this run, the cluster starts with nodes: http://172.23.109.150:8091, http://172.23.109.146:8091, http://172.23.109.144:8091, http://172.23.109.142:8091 

      And these 2 are rebalanced out:  http://172.23.109.150:8091, http://172.23.109.146:8091

       

      Note that any rebalance, in or out of the cluster, has the same effect.

      The logs appear to contain many `Couchbase.Core.Exceptions.KeyValue.DocumentNotFoundException` errors. This is seen with 7.2 and 7.1; I am checking 7.6.

       

      I would also add that performance in the KV tests has gotten worse, with failing operations where previously we didn't have any failures:

      RC2 : http://sdkqe-testresults.couchbase.com.s3.amazonaws.com/SDK-SDK/CB-7.2.4-7070/Rb2Out-KV/03-08-24/036385/44fd280c00f44874e943f63da3dc8a34-MC.html

      55e03c59: http://sdkqe-testresults.couchbase.com.s3.amazonaws.com/SDK-SDK/CB-7.2.4-7070/Rb2Out-KV/03-07-24/073139/1df353d7854c6efbbdf3fcf2835cf4cc-MC.html

      In these tests, though, the operations do seem to recover after some time, whereas in the subdoc tests they do not.

       

      Note that I ran some tests last night in preparation, and they passed with .NET commit
      55e03c598bce6bc7fd5814856b377720a87206e6,
      so the regression came from one of the 3 commits since then.

      Passing test logs from 55e03c598bce (if useful):
      SdkdConsole-passing.log.zip

       

      Edit: Matthew Bray has also seen DocumentNotFound errors in FIT-SIT Capella tests, i.e. KV ops against a Capella cluster that undergoes any scaling.

      https://performance-sdk.couchbase.com:8080/situationalSingle?situationalRunId=f8c941ac-216d-4843-85f9-8ba1615a06f0&runId=958d34d5-bfc3-4a6b-9625-1f86acec8ff0

      So if you want to reproduce the issue, this might be the easiest way.

       

       

      Attachments

        1. myapp-20240308.txt
          28.75 MB
        2. SdkdConsole.log
          95.13 MB
        3. SdkdConsole-passing.log.zip
          3.92 MB

            People

              jmorris Jeff Morris
              will.broadbelt Will Broadbelt