Couchbase .NET client library / NCBC-3376

Operation Elapsed property is not calculated properly for completed operations with retriable response status

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • 3.4.7
    • 3.4.5
    • None
    • None
    • 0

    Description

      I reproduced the issue in a scenario where the server responds with VBucketBelongsToAnotherServer, so I will use that as the example.

      Note: to reproduce it, you need to simulate an operation that completes without exceptions but with a retriable status code, and keep retrying that operation until the timeout elapses.

      When such a response is received, the SDK calls HandleOperationCompleted. Regardless of which branch it takes, this method always calls StopRecording, which stops the internal Stopwatch that the operation uses to track how long it took to complete.

      _dispatchSpan?.Dispose();
      var prevCompleted = Interlocked.Exchange(ref _isCompleted, 1);
       
      // First branch
      if (prevCompleted == 1)
      {
          StopRecording();
          data.Dispose();
          return;
      }
       
      // ...
       
      // Second branch
      try
      {
          // ...
          _valueTaskSource.SetResult(status);
      }
      catch (Exception ex)
      {
          // ...
      }
      finally
      {
          StopRecording();
      }
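
      The effect of stopping the stopwatch on the first completion can be seen in isolation. The following is a minimal sketch, not SDK code: a plain System.Diagnostics.Stopwatch stands in for the operation's internal stopwatch, and the sleep durations are arbitrary values chosen for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical stand-in for the operation's internal stopwatch.
var stopwatch = Stopwatch.StartNew();

// First attempt: a response arrives and the completion handler stops recording.
Thread.Sleep(50);
stopwatch.Stop(); // what StopRecording() is described as doing
var afterFirstAttempt = stopwatch.Elapsed;

// Later attempts: wall-clock time keeps passing, but a stopped stopwatch ignores it.
Thread.Sleep(200);
var afterRetries = stopwatch.Elapsed;

Console.WriteLine($"after first attempt: {afterFirstAttempt.TotalMilliseconds:F0} ms");
Console.WriteLine($"after retries:       {afterRetries.TotalMilliseconds:F0} ms");
Console.WriteLine($"unchanged: {afterFirstAttempt == afterRetries}");
```

      Once Stop() has been called, Elapsed keeps returning the same value, so any time spent in later attempts is not reflected in it.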

      Assuming no exception is thrown in the try block above, it eventually calls:

      _valueTaskSource.SetResult(status); 

      This completes the task, so the rest of the ClusterNode.ExecuteOp code can run.

      Since this is not a transport failure or anything similar, ExecuteOp completes successfully and returns the response status.

      Going up the stack, we reach RetryOrchestrator.RetryAsync, which checks whether the status is retriable. In our case it is, so the operation is attempted again. The operation therefore keeps being retried until the provided cancellation token is cancelled by the timeout.
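
      The retry behaviour described above can be sketched as a loop that only ends when the token is cancelled. This is an illustration under assumptions, not the actual RetryOrchestrator code: ExecuteOpAsync and IsRetriable are hypothetical helpers, the status is modelled as a string, and the 200 ms timeout is arbitrary.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for ClusterNode.ExecuteOp: it never fails on its own
// and always reports a retriable status.
async Task<string> ExecuteOpAsync(CancellationToken token)
{
    // Task.Delay throws TaskCanceledException once the token is cancelled.
    await Task.Delay(20, token);
    return "VBucketBelongsToAnotherServer";
}

// Hypothetical retriable-status check.
bool IsRetriable(string candidate) => candidate == "VBucketBelongsToAnotherServer";

var timeout = TimeSpan.FromMilliseconds(200);
using var cts = new CancellationTokenSource(timeout); // cancels itself after the timeout
var attempts = 0;
var cancelledByTimeout = false;

try
{
    while (true)
    {
        attempts++;
        var status = await ExecuteOpAsync(cts.Token);
        if (!IsRetriable(status)) break; // a final status would end the loop
    }
}
catch (OperationCanceledException)
{
    // With an always-retriable status, cancellation is the only way out.
    cancelledByTimeout = true;
}

Console.WriteLine($"attempts: {attempts}, cancelled by timeout: {cancelledByTimeout}");
```

      Because every attempt returns a retriable status, the loop never exits normally; it ends only when the timeout cancels the token and the awaited call throws.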

      After the timeout, execution enters the catch block in RetryAsync, which contains the following code:

      if (operation.Elapsed < operation.Timeout)
      {
          // not a true timeout.
          throw;
      } 

      As shown earlier, the operation's stopwatch was stopped when the operation completed for the first time, so Elapsed reflects only the first attempt. This check therefore rethrows the TaskCanceledException, when in fact a TimeoutException should be thrown.
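
      To make the misclassification concrete, the sketch below runs the same comparison against two measurements: one taken from a stopwatch that was stopped after the first attempt, and one from a stopwatch that keeps running across all attempts. The Classify helper, the 150 ms timeout, and the sleep durations are assumptions made for illustration. Keeping the stopwatch running is only one possible remedy and is not necessarily the change that resolved this issue.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

var timeout = TimeSpan.FromMilliseconds(150);

// Hypothetical helper mirroring the "Elapsed < Timeout" check in the catch block.
string Classify(TimeSpan elapsed) =>
    elapsed < timeout ? "rethrow cancellation" : "timeout";

var stoppedEarly = Stopwatch.StartNew(); // stopped when the first attempt completes
var keptRunning = Stopwatch.StartNew();  // measures all attempts together

Thread.Sleep(20);    // first attempt
stoppedEarly.Stop(); // the first completion stops the recording
Thread.Sleep(200);   // retries continue past the timeout

var reported = Classify(stoppedEarly.Elapsed);
var expected = Classify(keptRunning.Elapsed);

Console.WriteLine($"stopped after first attempt: {reported}");
Console.WriteLine($"measured across attempts:    {expected}");
```

      With the early-stopped measurement the check treats the cancellation as "not a true timeout", while the measurement that spans all attempts exceeds the timeout and would be classified as one.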

            People

              jmorris Jeff Morris
              eugeneshcherbo Eugene Shcherbo