Couchbase .NET client library / NCBC-2549

Subdoc failures after restarting CB server


Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 3.0.2
    • None
    • 1

    Description

      With SDK 3.0.2 RC: NCBC-2548 (http://review.couchbase.org/c/couchbase-net-client/+/130287)

      Testing Against: CB Server 6.5.1 - 4 node cluster with services 1x{kv}, 3 x {kv,index,fts,n1ql}

      Subdoc ops fail after all (or some) of the CB nodes are stopped and restarted, as seen in the test results below.

      http://sdkqe-testresults.couchbase.com.s3.amazonaws.com/SDK-SDK/CB-6.5.1-6299/SvcRestartAll-SUBDOC/06-11-20/043535/4ade3a166e84d5ceed8047c7575dd5c6-SD.html

      Logs: SdkdConsole.log.zip

      Also occurs with CB 6.0.4.

      Attachments

        1. consoleFull.log.zip (496 kB)
        2. SdkdConsole_6.0_Cancel.log.zip (2.74 MB)
        3. SdkdConsole_cancel.log.zip (2.27 MB)
        4. SdkdConsole_sd_1T.log.zip (1.06 MB)
        5. sdkdconsole_SD_fail.log.zip (12 kB)
        6. SdkdConsole_SD_NCBC2549.log.zip (4.35 MB)
        7. SdkdConsole_sd.log.zip (4.78 MB)
        8. SdkdConsole.log.zip (3.48 MB)
        9. SdkdConsole2.log.zip (4.19 MB)
        10. SdkdConsole3.log.zip (4.94 MB)

        Activity

          will.broadbelt Will Broadbelt added a comment (edited)

          I think I have 'fixed' the issue when running against 6.5 by increasing the WaitForReady timeout to 20s, though I'm still running tests for it now. UPDATE: Still getting timeouts with 20s.

          For CB <6.5 I can't use this, though. Looking at the logs, I think the issue is that GetClusterConfig occasionally returns BucketNotConnected, so connecting to the cluster fails. Filed as NCBC-2552.
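
          To illustrate why a longer wait alone may not help, here is a minimal sketch of a bounded readiness poll. It is not the SDK's WaitForReady implementation; `wait_until_ready`, `probe`, and the interval are hypothetical names and values chosen for illustration. If the probe keeps failing (for example, because a config request keeps reporting that the bucket is not connected), the loop times out whether the limit is 10s or 20s.

          ```python
          import time


          def wait_until_ready(probe, timeout_s=20.0, interval_s=0.5,
                               clock=time.monotonic, sleep=time.sleep):
              """Poll probe() until it returns True or timeout_s elapses.

              probe is a hypothetical zero-argument callable standing in for
              "can I fetch a cluster config and reach the bucket?".
              Returns True if the probe succeeded, False on timeout.
              """
              deadline = clock() + timeout_s
              while True:
                  if probe():
                      return True
                  if clock() >= deadline:
                      # The probe never succeeded inside the window; a larger
                      # timeout only helps if the probe eventually succeeds.
                      return False
                  sleep(interval_s)
          ```

          The clock and sleep functions are injectable so the behaviour can be checked without real waiting.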

          will.broadbelt Will Broadbelt added a comment (edited)

          Jeff Morris:

          Running with http://review.couchbase.org/c/couchbase-net-client/+/130602/1 , I see there is a CircuitBreakerException happening now.

          Graph: http://sdkqe-testresults.couchbase.com.s3.amazonaws.com/SDK-SDK/CB-6.0.4-3082/SvcRestartAll-SUBDOC/06-16-20/043638/5ba23646c7bbb59f8908099e49a646d8-SD.html

          Logs: SdkdConsole_SD_NCBC2549.log.zip

          Also, here are some of the SDKD stack traces that are being thrown:

          [293.14 INFO] (SDKD log:137) fail: Sdkd.Subdoc.SDRunCommand[0] 
          [293.14 INFO] (SDKD log:137)       Exception:One or more errors occurred. (Exception of type 'Couchbase.Core.CircuitBreakers.CircuitBreakerException' was thrown.) 
          [293.14 INFO] (SDKD log:137) System.AggregateException: One or more errors occurred. (Exception of type 'Couchbase.Core.CircuitBreakers.CircuitBreakerException' was thrown.) 
          [293.14 INFO] (SDKD log:137)  ---> Couchbase.Core.CircuitBreakers.CircuitBreakerException: Exception of type 'Couchbase.Core.CircuitBreakers.CircuitBreakerException' was thrown. 
          [293.14 INFO] (SDKD log:137) +++ Received exception of ID 26 
          [293.14 INFO] (SDKD log:137)    --- End of inner exception stack trace --- 
          [293.14 INFO] (SDKD log:137) +++ Received exception of ID 8 
          [293.14 INFO] (SDKD log:137) fail: Sdkd.Subdoc.SDRunCommand[0] 
          [293.14 INFO] (SDKD log:137)       Exception:One or more errors occurred. (Exception of type 'Couchbase.AuthenticationFailureException' was thrown.) 
          [293.14 INFO] (SDKD log:137) System.AggregateException: One or more errors occurred. (Exception of type 'Couchbase.AuthenticationFailureException' was thrown.) 
          [293.14 INFO] (SDKD log:137)  ---> Couchbase.AuthenticationFailureException: Exception of type 'Couchbase.AuthenticationFailureException' was thrown. 
          [293.14 INFO] (SDKD log:137) +++ Received exception of ID 18 
          [293.14 INFO] (SDKD log:137) -----------------------Context Info--------------------------- 
          [293.14 INFO] (SDKD log:137) {"DispatchedFrom":null,"DispatchedTo":null,"DocumentKey":"7256","ClientContextId":"159196","Cas":0,"Status":36,"BucketName":"default","CollectionName":null,"ScopeName":null,"Message":"KV Error: {Name=\"EACCESS\", Description=\"Not authorized for command\", Attributes=\"support\"}"} 
          [293.14 INFO] (SDKD log:137)  
          [293.14 INFO] (SDKD log:137)    --- End of inner exception stack trace --- 
          [293.14 INFO] (SDKD log:137) +++ Received exception of ID 8 
          [293.14 INFO] (SDKD log:137) fail: Sdkd.Subdoc.SDRunCommand[0] 
          [293.14 INFO] (SDKD log:137)       Exception:One or more errors occurred. (Exception of type 'Couchbase.AuthenticationFailureException' was thrown.) 
          [293.14 INFO] (SDKD log:137) System.AggregateException: One or more errors occurred. (Exception of type 'Couchbase.AuthenticationFailureException' was thrown.) 
          [293.14 INFO] (SDKD log:137)  ---> Couchbase.AuthenticationFailureException: Exception of type 'Couchbase.AuthenticationFailureException' was thrown. 
          [293.14 INFO] (SDKD log:137) +++ Received exception of ID 18 
          [293.14 INFO] (SDKD log:137) -----------------------Context Info--------------------------- 
          [293.14 INFO] (SDKD log:137) {"DispatchedFrom":null,"DispatchedTo":null,"DocumentKey":"6877","ClientContextId":"159159","Cas":0,"Status":36,"BucketName":"default","CollectionName":null,"ScopeName":null,"Message":"KV Error: {Name=\"EACCESS\", Description=\"Not authorized for command\", Attributes=\"support\"}"} 
          [293.14 INFO] (SDKD log:137)  
          [293.14 INFO] (SDKD log:137)    --- End of inner exception stack trace --- 
          [293.14 INFO] (SDKD log:137) +++ Received exception of ID 8 
          [293.14 INFO] (SDKD log:137) fail: Sdkd.Subdoc.SDRunCommand[0] 
          [293.14 INFO] (SDKD log:137)       Exception:One or more errors occurred. (Exception of type 'Couchbase.CouchbaseException' was thrown.) 
          [293.14 INFO] (SDKD log:137) System.AggregateException: One or more errors occurred. (Exception of type 'Couchbase.CouchbaseException' was thrown.) 
          [293.14 INFO] (SDKD log:137)  ---> Couchbase.CouchbaseException: Exception of type 'Couchbase.CouchbaseException' was thrown. 
          [293.14 INFO] (SDKD log:137) +++ Received exception of ID 18 
          [293.14 INFO] (SDKD log:137) -----------------------Context Info--------------------------- 
          [293.14 INFO] (SDKD log:137) {"DispatchedFrom":null,"DispatchedTo":null,"DocumentKey":"4545","ClientContextId":"159161","Cas":0,"Status":8,"BucketName":"default","CollectionName":null,"ScopeName":null,"Message":"KV Error: {Name=\"NO_BUCKET\", Description=\"Not connected to any bucket\", Attributes=\"conn-state-invalidated\"}"} 
          [293.14 INFO] (SDKD log:137)  
          [293.14 INFO] (SDKD log:137)    --- End of inner exception stack trace --- 
          [293.14 INFO] (SDKD log:137) +++ Received exception of ID 8 
          [293.14 INFO] (SDKD log:137) fail: Sdkd.Subdoc.SDRunCommand[0] 
          [293.14 INFO] (SDKD log:137)       Exception:One or more errors occurred. (Exception of type 'Couchbase.CouchbaseException' was thrown.) 
          [293.14 INFO] (SDKD log:137) System.AggregateException: One or more errors occurred. (Exception of type 'Couchbase.CouchbaseException' was thrown.) 
          [293.14 INFO] (SDKD log:137)  ---> Couchbase.CouchbaseException: Exception of type 'Couchbase.CouchbaseException' was thrown. 
          [293.14 INFO] (SDKD log:137) +++ Received exception of ID 20 
          [293.14 INFO] (SDKD log:137) -----------------------Context Info--------------------------- 
          [293.15 INFO] (SDKD log:137) {"DispatchedFrom":null,"DispatchedTo":null,"DocumentKey":"5844","ClientContextId":"159163","Cas":0,"Status":8,"BucketName":"default","CollectionName":"_default","ScopeName":null,"Message":"KV Error: {Name=\"NO_BUCKET\", Description=\"Not connected to any bucket\", Attributes=\"conn-state-invalidated\"}"} 
          [293.15 INFO] (SDKD log:137)  
          [293.15 INFO] (SDKD log:137)    --- End of inner exception stack trace --- 
          [293.15 INFO] (SDKD log:137) +++ Received exception of ID 12 
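
          When triaging logs like the excerpt above, it can help to tally the KV errors by status and name. The following is a rough sketch that assumes the line format shown in this excerpt (an SDKD prefix followed by a JSON "Context Info" object); the function name `tally_kv_errors` and both regular expressions are illustrative and not part of SDKD or the SDK.

          ```python
          import json
          import re
          from collections import Counter

          # Matches the SDKD line prefix, e.g. "[293.14 INFO] (SDKD log:137) "
          PREFIX = re.compile(r"^\s*\[\d+\.\d+ \w+\] \(SDKD log:\d+\)\s?")
          # Pulls the Name field out of a 'KV Error: {Name="...", ...}' message
          KV_NAME = re.compile(r'Name="([^"]+)"')


          def tally_kv_errors(lines):
              """Count (Status, KV error name) pairs found in context-info lines."""
              counts = Counter()
              for line in lines:
                  payload = PREFIX.sub("", line).strip()
                  if not payload.startswith("{"):
                      continue  # not a context-info JSON line
                  try:
                      ctx = json.loads(payload)
                  except json.JSONDecodeError:
                      continue  # truncated or otherwise malformed line
                  match = KV_NAME.search(ctx.get("Message") or "")
                  name = match.group(1) if match else "UNKNOWN"
                  counts[(ctx.get("Status"), name)] += 1
              return counts
          ```

          Fed the lines of the excerpt above, this should report two entries for status 36 (EACCESS) and two for status 8 (NO_BUCKET). The CircuitBreakerException entry carries no context JSON and is therefore not counted.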

          will.broadbelt Will Broadbelt added a comment

          Running with amended patch: SdkdConsole_sd.log.zip

          http://sdkqe-testresults.couchbase.com.s3.amazonaws.com/SDK-SDK/CB-6.5.1-6299/SvcRestartAll-SUBDOC/06-16-20/071012/f601b9632c46ef2de95dbc310077152d-SD.html

          will.broadbelt Will Broadbelt added a comment

          With one thread: http://sdkqe-testresults.couchbase.com.s3.amazonaws.com/SDK-SDK/CB-6.5.1-6299/SvcRestartAll-SUBDOC/06-16-20/072953/fd9e0950709e31920a254a6b1fd340f0-SD.html

          Logs: SdkdConsole_sd_1T.log.zip

          Also included the Jenkins console log as it has stack traces: consoleFull.log.zip

          will.broadbelt Will Broadbelt added a comment

          Jeff Morris -
          Patchset 3 has fixed it! Running the whole suite now to check for regressions, and I'll resolve the ticket once that is done.

          People

            jmorris Jeff Morris
            will.broadbelt Will Broadbelt