Couchbase Server / MB-47144

Failed to restore bucket with error "ambiguous timeout" in AWS

Details

    • Untriaged
    • Ubuntu 64-bit
    • 1
    • Unknown

    Description

      Problem
      Restore fails with error "ambiguous timeout" for bucket default:

      2021-06-27T14:00:39.029+00:00 (Cmd) Error restoring cluster: failed to execute cluster operations: failed to execute bucket operation for bucket 'default': failed to transfer bucket data for bucket 'default': failed to transfer key value data: failed to get sink data callbacks: failed to initialise worker 11: failed to create Memcached agent: agent failed to connect to the cluster: ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"CMD_HELLO","Opaque":"0x1","TimeObserved":72644,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"172.31.36.182:11210","LastDispatchedFrom":"172.31.20.241:59798","LastConnectionID":"6884dbfd5cc2bcb5/0589e63b647fd400"}
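For triage, the error line above can be split into its wrapped error chain and the trailing JSON context. The following is a minimal Python sketch, assuming the log format shown in this report (timestamp, tag and message, then ` | ` followed by a JSON object). The helper name `parse_error_line` is hypothetical and not part of cbbackupmgr, and the sample line is an abbreviated copy of the error above.

```python
import json


def parse_error_line(line):
    """Split one log line into (timestamp, error_chain, context)."""
    # The JSON context, when present, follows the first " | " separator.
    message, sep, raw_context = line.partition(" | ")
    context = json.loads(raw_context) if sep else {}
    # The timestamp is the first space-delimited token of the message.
    timestamp, _, rest = message.strip().partition(" ")
    # Wrapped errors are joined with ": "; the last element is the root cause.
    chain = [part.strip() for part in rest.split(": ")]
    return timestamp, chain, context


if __name__ == "__main__":
    # Abbreviated sample taken from the error reported above.
    sample = (
        "2021-06-27T14:00:39.029+00:00 (Cmd) Error restoring cluster: "
        "failed to create Memcached agent: "
        "agent failed to connect to the cluster: ambiguous timeout | "
        '{"OperationID":"CMD_HELLO","LastDispatchedTo":"172.31.36.182:11210"}'
    )
    timestamp, chain, context = parse_error_line(sample)
    print(timestamp)
    print("root cause:", chain[-1])
    print("operation:", context.get("OperationID"))
    print("node:", context.get("LastDispatchedTo"))
```

Reading the chain from the end gives the root cause first, which here is the timeout on the `CMD_HELLO` operation rather than a failure in the data transfer itself.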
      

      Steps to reproduce

      1. Install Couchbase Server 7.0.0-5295 (RC5) on 11 AWS instances.
      2. Create 2 buckets: default and other.
      3. Create a total of ~600 scopes and collections on each bucket.
      4. Disable data compaction on the bucket.
      5. Load ~6 billion docs, each 512 to 1024 bytes in size, into the other bucket and ~7 billion keys into the default bucket, for a total of around 13 billion keys and 14 TB of data across the 2 buckets.
      6. During loading, rebalance a node in and out using swap and add-in rebalances.
      7. After reaching ~13 billion keys across both buckets, stop the loader.
      8. Run compaction on the default bucket. => done
      9. Run a backup with cbbackupmgr. => done
      10. Run another backup without any further loading. => done
      11. Merge the 2 backups; after a few minutes, kill the merge process (Ctrl-C).
      12. Delete the default and other buckets from the cluster.
      13. Restore the 2 buckets to the cluster. Bucket other is restored; bucket default fails to restore with the error "ambiguous timeout":

        2021-06-27T14:00:13.963+00:00 (Plan) (Data) Deciding which key value data to transfer for bucket 'default'
        2021-06-27T14:00:15.077+00:00 (Plan) (Data) Successfully decided which key value data to transfer for bucket 'default' | {"number":17,"duration":"1.114307521s"}
        2021-06-27T14:00:15.077+00:00 (Plan) (Data) Transferring new key value data for bucket 'default'
        2021-06-27T14:00:15.078+00:00 (REST) (Attempt 1) (GET) Dispatching request to 'http://172.31.34.157:8091/pools/default/buckets/default'
        2021-06-27T14:00:15.092+00:00 (REST) (Attempt 1) (GET) (200) Received response from 'http://172.31.34.157:8091/pools/default/buckets/default'
        2021-06-27T14:00:24.094+00:00 (Gocbcore) CCCPPOLL: Failed to retrieve CCCP config. ambiguous timeout
        2021-06-27T14:00:33.693+00:00 (Gocbcore) CCCPPOLL: Failed to retrieve CCCP config. ambiguous timeout
        2021-06-27T14:00:33.693+00:00 (Gocbcore) CCCPPOLL: Failed to retrieve CCCP config. ambiguous timeout
        2021-06-27T14:00:33.693+00:00 (Gocbcore) CCCPPOLL: Failed to retrieve CCCP config. ambiguous timeout
        2021-06-27T14:00:33.696+00:00 (Gocbcore) CCCPPOLL: Failed to retrieve CCCP config. ambiguous timeout
        2021-06-27T14:00:33.704+00:00 (Gocbcore) Pipeline Client 0xc160359ce0 failed to bootstrap: ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"CMD_HELLO","Opaque":"0x1","TimeObserved":167432,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"172.31.36.38:11210","LastDispatchedFrom":"172.31.20.241:59376","LastConnectionID":"6884dbfd5cc2bcb5/cfb38096eb0de094"}
        2021-06-27T14:00:33.704+00:00 (Gocbcore) Pipeline Client 0xc160359f20 failed to bootstrap: ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"CMD_HELLO","Opaque":"0x1","TimeObserved":120236,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"172.31.42.32:11210","LastDispatchedFrom":"172.31.20.241:56050","LastConnectionID":"6884dbfd5cc2bcb5/0f5e6ca56875da76"}
        2021-06-27T14:00:33.705+00:00 (Gocbcore) Pipeline Client 0xc160359f80 failed to bootstrap: request canceled
        2021-06-27T14:00:33.798+00:00 (Gocbcore) Pipeline Client 0xc160359e60 failed to bootstrap: ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"CMD_HELLO","Opaque":"0x0","TimeObserved":34823,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"","LastDispatchedFrom":"","LastConnectionID":""}
        2021-06-27T14:00:33.861+00:00 (Gocbcore) Pipeline Client 0xc160359ec0 failed to bootstrap: ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"CMD_HELLO","Opaque":"0x1","TimeObserved":99753,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"172.31.42.159:11210","LastDispatchedFrom":"172.31.20.241:40620","LastConnectionID":"6884dbfd5cc2bcb5/0b56e1a47fb3c6a3"}
        2021-06-27T14:00:33.861+00:00 (Gocbcore) Pipeline Client 0xc160359da0 failed to bootstrap: ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"CMD_HELLO","Opaque":"0x1","TimeObserved":52763,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"172.31.39.206:11210","LastDispatchedFrom":"172.31.20.241:34590","LastConnectionID":"6884dbfd5cc2bcb5/3e221ee7526bb378"}
        2021-06-27T14:00:33.864+00:00 (Gocbcore) Pipeline Client 0xc160359d40 failed to bootstrap: ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"CMD_HELLO","Opaque":"0x1","TimeObserved":93401,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"172.31.37.223:11210","LastDispatchedFrom":"172.31.20.241:47916","LastConnectionID":"6884dbfd5cc2bcb5/1ff9a3a502862dc1"}
        2021-06-27T14:00:33.885+00:00 (Gocbcore) Pipeline Client 0xc160359c20 failed to bootstrap: ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"CMD_HELLO","Opaque":"0x1","TimeObserved":103466,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"172.31.34.157:11210","LastDispatchedFrom":"172.31.20.241:36610","LastConnectionID":"6884dbfd5cc2bcb5/5e41bd111a0737e3"}
        2021-06-27T14:00:33.892+00:00 (Gocbcore) Pipeline Client 0xc160359c80 failed to bootstrap: ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"CMD_HELLO","Opaque":"0x1","TimeObserved":72644,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"172.31.36.182:11210","LastDispatchedFrom":"172.31.20.241:59798","LastConnectionID":"6884dbfd5cc2bcb5/0589e63b647fd400"}
        2021-06-27T14:00:36.698+00:00 (Gocbcore) CCCPPOLL: Failed to retrieve CCCP config. ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"CMD_HELLO","Opaque":"0x1","TimeObserved":103466,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"172.31.34.157:11210","LastDispatchedFrom":"172.31.20.241:36610","LastConnectionID":"6884dbfd5cc2bcb5/5e41bd111a0737e3"}
        2021-06-27T14:00:39.029+00:00 (Cmd) Error restoring cluster: failed to execute cluster operations: failed to execute bucket operation for bucket 'default': failed to transfer bucket data for bucket 'default': failed to transfer key value data: failed to get sink data callbacks: failed to initialise worker 11: failed to create Memcached agent: agent failed to connect to the cluster: ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"CMD_HELLO","Opaque":"0x1","TimeObserved":72644,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"172.31.36.182:11210","LastDispatchedFrom":"172.31.20.241:59798","LastConnectionID":"6884dbfd5cc2bcb5/0589e63b647fd400"}
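To see which data nodes the timed-out requests were dispatched to, the JSON contexts in the log above can be tallied per `LastDispatchedTo` address. The following is a Python sketch, assuming the ` | `-separated JSON context format shown above. `timed_out_nodes` is a hypothetical helper name, and the sample lines are shortened copies of entries from this log.

```python
import json
from collections import Counter


def timed_out_nodes(log_lines):
    """Count 'ambiguous timeout' entries per LastDispatchedTo address."""
    counts = Counter()
    for line in log_lines:
        message, sep, raw_context = line.partition(" | ")
        # Skip lines without a JSON context or without the timeout message.
        if not sep or "ambiguous timeout" not in message:
            continue
        try:
            context = json.loads(raw_context)
        except json.JSONDecodeError:
            continue  # e.g. a truncated line
        if not isinstance(context, dict):
            continue
        # An empty address means the request was never dispatched to a node.
        node = context.get("LastDispatchedTo") or "<not dispatched>"
        counts[node] += 1
    return counts


if __name__ == "__main__":
    # Shortened sample lines in the same shape as the log above.
    sample = [
        "2021-06-27T14:00:24.094+00:00 (Gocbcore) CCCPPOLL: Failed to retrieve CCCP config. ambiguous timeout",
        '2021-06-27T14:00:33.704+00:00 (Gocbcore) Pipeline Client 0xc160359ce0 failed to bootstrap: ambiguous timeout | {"OperationID":"CMD_HELLO","LastDispatchedTo":"172.31.36.38:11210"}',
        '2021-06-27T14:00:33.798+00:00 (Gocbcore) Pipeline Client 0xc160359e60 failed to bootstrap: ambiguous timeout | {"OperationID":"CMD_HELLO","LastDispatchedTo":""}',
    ]
    for node, count in timed_out_nodes(sample).most_common():
        print(node, count)
```

Timeouts spread across many distinct addresses would point at a cluster-wide or client-side cause, while timeouts concentrated on one address would point at that node.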
        

      Logs from the cluster are attached.

          People

            thuan Thuan Nguyen
            thuan Thuan Nguyen