Uploaded image for project: 'Couchbase Gateway'
  1. Couchbase Gateway
  2. CBG-284

Retry loops for bucket update ops have no backoff and no max retry count

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Cobalt
    • Component/s: SyncGateway
    • Security Level: Public
    • Labels:
      None
    • Sprint:
      CBG Sprint 18, CBG Sprint 19
    • Story Points:
      5

      Description

      The update operations seen in bucket_gocb.go use a naive for loop, with breaks only for unrecoverable errors, or successful updates.

      • func (bucket CouchbaseBucketGoCB) Update
      • func (bucket CouchbaseBucketGoCB) WriteUpdate
      • func (bucket CouchbaseBucketGoCB) WriteUpdateWithXattr

      In the case of network timeout errors, these for loops can retry forever with no backoff mechanism, which has the potential to make these network timeout issues even more likely to occur.

        Attachments

          Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

            Activity

            Hide
            ben.brooks Ben Brooks added a comment -

            Adam Fraser Need to discuss the behaviour of retry loops on update functions - Do we really want to retry forever?

            We need some kind of throttling or backoff on the retry loop. Need to determine if we should use the same retry logic as seen in Get, GetAndTouch, Touch, Incr, etc. (which do have max retries based on bucket spec).

            Show
            ben.brooks Ben Brooks added a comment - Adam Fraser Need to discuss the behaviour of retry loops on update functions - Do we really want to retry forever ? We need some kind of throttling or backoff on the retry loop. Need to determine if we should use the same retry logic as seen in Get, GetAndTouch, Touch, Incr, etc. (which do have max retries based on bucket spec).
            Hide
            adamf Adam Fraser added a comment -

            There is additional discussion of this type of timeout-related enhancement on https://github.com/couchbase/sync_gateway/issues/3509, even for operations that only have a fixed number of retries.  I think that's the preferable approach, but it requires client uptake.  We could consider modifying the retry behaviour of update specifically for timeout errors in the meantime.

            The difference between update and the Get retry handling is that update is expected to retry indefinitely on CAS failure, so there's a bit more work to isolate retry limits for timeouts. 

             

            Show
            adamf Adam Fraser added a comment - There is additional discussion of this type of timeout-related enhancement on  https://github.com/couchbase/sync_gateway/issues/3509 , even for operations that only have a fixed number of retries.  I think that's the preferable approach, but it requires client uptake.  We could consider modifying the retry behaviour of update specifically for timeout errors in the meantime. The difference between update and the Get retry handling is that update is expected to retry indefinitely on CAS failure, so there's a bit more work to isolate retry limits for timeouts.   
            Hide
            build-team Couchbase Build Team added a comment -

            Build sync_gateway-2.6.0-16 contains sync_gateway commit 3fb082b with commit message:
            CBG-284 - Use MaxNumRetries in non-xattr gocb bucket Update/WriteUpdate (#4081)

            Show
            build-team Couchbase Build Team added a comment - Build sync_gateway-2.6.0-16 contains sync_gateway commit 3fb082b with commit message: CBG-284 - Use MaxNumRetries in non-xattr gocb bucket Update/WriteUpdate (#4081)
            Hide
            build-team Couchbase Build Team added a comment -

            Build sync_gateway-2.6.0-16 contains sync_gateway commit 3fb082b with commit message:
            CBG-284 - Use MaxNumRetries in non-xattr gocb bucket Update/WriteUpdate (#4081)

            Show
            build-team Couchbase Build Team added a comment - Build sync_gateway-2.6.0-16 contains sync_gateway commit 3fb082b with commit message: CBG-284 - Use MaxNumRetries in non-xattr gocb bucket Update/WriteUpdate (#4081)
            Hide
            build-team Couchbase Build Team added a comment -

            Build sync_gateway-2.5.1-9 contains sync_gateway commit d7b0a34 with commit message:
            CBG-320 - Backport CBG-284: Use MaxNumRetries in non-xattr gocb bucket Update/WriteUpdate (#4081) (#4181)

            Show
            build-team Couchbase Build Team added a comment - Build sync_gateway-2.5.1-9 contains sync_gateway commit d7b0a34 with commit message: CBG-320 - Backport CBG-284 : Use MaxNumRetries in non-xattr gocb bucket Update/WriteUpdate (#4081) (#4181)
            Hide
            build-team Couchbase Build Team added a comment -

            Build sync_gateway-2.5.1-9 contains sync_gateway commit d7b0a34 with commit message:
            CBG-320 - Backport CBG-284: Use MaxNumRetries in non-xattr gocb bucket Update/WriteUpdate (#4081) (#4181)

            Show
            build-team Couchbase Build Team added a comment - Build sync_gateway-2.5.1-9 contains sync_gateway commit d7b0a34 with commit message: CBG-320 - Backport CBG-284 : Use MaxNumRetries in non-xattr gocb bucket Update/WriteUpdate (#4081) (#4181)

              People

              • Assignee:
                ben.brooks Ben Brooks
                Reporter:
                ben.brooks Ben Brooks
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:

                  Gerrit Reviews

                  There are no open Gerrit changes

                    PagerDuty

                    Error rendering 'com.pagerduty.jira-server-plugin:PagerDuty'. Please contact your Jira administrators.