  1. Couchbase Go SDK
  2. GOCBC-905

'WaitUntilReady' is not correctly returning errors

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 2.1.3
    • 2.1.1
    • library
    • None
    • 1

    Description

      What's the problem?
      When using 'WaitUntilReady' to wait for the gocbcore agent to connect to the cluster, we see an 'unambiguous timeout' error when in reality the server disconnected us because we were using the 'backfill_order' control flag (not yet merged in CC).

      What do we expect to see?
      When we get disconnected from the server, the error should be bubbled up to cbbackupmgr so that it can be handled correctly and returned to the user. I imagine that this isn't the only case in which a timeout will be masking an error that has occurred behind the scenes.

      Steps to reproduce
      Patrick Varley has posted a concise set of steps to reproduce this issue with cbbackupmgr in MB-39653. To briefly recap:
      1) Install CC build 2208 onto a CentOS 7 vagrant
      2) Configure a one node cluster with only the data service
      3) Create a bucket
      4) Load some data in the bucket using cbworkloadgen
      5) Run a backup

      If we look in the memcached logs we will see:

      2020-05-29T18:17:45.505412+00:00 INFO 44: DCP connection opened successfully. PRODUCER, INCLUDE_XATTRS [ [::1]:57896 - [::1]:11210 (<ud>Administrator</ud>) ]
      2020-05-29T18:17:45.505588+00:00 WARNING 44: (default) DCP (Producer) eq_dcpq:cbbackupmgr_2020-05-29T18:17:20Z_19653_0 - Invalid ctrl parameter 'sequential' for backfill_order
      2020-05-29T18:17:45.505734+00:00 INFO 44: (No Engine) DCP (Producer) eq_dcpq:cbbackupmgr_2020-05-29T18:17:20Z_19653_0 - Removing connection [ [::1]:57896 - [::1]:11210 (<ud>Administrator</ud>) ]
      

      However, cbbackupmgr displays:

       /opt/couchbase/bin/cbbackupmgr backup -u Administrator -p password -c localhost -a backup -r MB-39653
      Backing up to '2020-05-29T18_17_20.039976728Z'
      Copying at 0B/s (about 0s remaining) - Transferring key value data for 'default'                                                                                                                                                                                             0 items / 0B
      [===============================================================================================================================================================================================================================================================================] 100.00%
      Error backing up cluster: operation has timed out
      Backed up bucket "default" failed
      Mutations backed up: 0, Mutations failed to backup: 0
      Deletions backed up: 0, Deletions failed to backup: 0
      Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
      

            People

              charles.dixon Charles Dixon
              james.lee James Lee