Couchbase Server / MB-50536

[System Test][CBM] backups failed with error - http2: timeout awaiting response headers -- couchbase.(*DCPAsyncWorker).handleDCPError() at dcp_async_worker.go:620

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 7.1.0
    • 7.1.0
    • tools
    • Untriaged
    • 1
    • Unknown
    • Tools 2022-Jan

    Description

      Build: 7.1.0-2097

      Test:
      -test tests/integration/neo/test_neo_couchstore_milestone4.yml -scope tests/integration/neo/scope_couchstore.yml
      Scale 3
      Iteration 3

      Added a backup to GCP on the longevity cluster. The first two backups failed with this error:

      {
        "task_name": "backup-1",
        "status": "failed",
        "start": "2022-01-20T18:00:47.418767682-08:00",
        "end": "2022-01-20T19:32:08.337480905-08:00",
        "node_runs": [
          {
            "node_id": "34e6902271c90b80be588f7ac0ed0ec0",
            "status": "failed",
            "start": "2022-01-20T18:02:33.636684128-08:00",
            "end": "2022-01-20T19:32:08.298731212-08:00",
            "error": "exit status 1",
            "progress": 72.70256050508593,
            "stats": {
              "id": "5569f67a-f1d0-4983-bf77-182fa6a7c18b",
              "current_transfer": 1,
              "total_transfers": 1,
              "transfers": [
                {
                  "description": "Backing up to 2022-01-20T18_02_41.44362674-08_00",
                  "stats": {
                    "started_at": 1642730560435979500,
                    "buckets": {
                      "N1QL_SYSTEM_BUCKET": {
                        "estimated_total_items": 16735,
                        "total_items": 461,
                        "total_vbuckets": 1024,
                        "vbuckets_complete": 1024,
                        "bytes_received": 458178,
                        "snapshot_markers_received": 1024,
                        "failover_logs_received": 1024,
                        "mutations_received": 211,
                        "deletions_received": 250,
                        "started_at": 1642734598843575600,
                        "finished_at": 1642734615075832300,
                        "complete": true
                      },
                      "bucket1": {
                        "estimated_total_items": 150802814,
                        "total_items": 7293660,
                        "total_vbuckets": 1024,
                        "vbuckets_complete": 1024,
                        "bytes_received": 4231650298,
                        "snapshot_markers_received": 1024,
                        "failover_logs_received": 1024,
                        "mutations_received": 3282371,
                        "deletions_received": 4019638,
                        "started_at": 1642735019835822800,
                        "finished_at": 1642735668467576300,
                        "complete": true
                      },
                      "bucket4": {
                        "estimated_total_items": 15475156,
                        "total_items": 1750522,
                        "total_vbuckets": 1024,
                        "vbuckets_complete": 1024,
                        "bytes_received": 937051427,
                        "snapshot_markers_received": 1024,
                        "failover_logs_received": 1024,
                        "mutations_received": 1157943,
                        "deletions_received": 592579,
                        "started_at": 1642730566302167000,
                        "finished_at": 1642730763357324300,
                        "complete": true
                      },
                      "bucket6": {
                        "estimated_total_items": 121551768,
                        "total_items": 1660639,
                        "total_vbuckets": 1024,
                        "vbuckets_complete": 1024,
                        "bytes_received": 932629534,
                        "snapshot_markers_received": 1024,
                        "failover_logs_received": 1024,
                        "mutations_received": 1157943,
                        "deletions_received": 502696,
                        "started_at": 1642734059752157000,
                        "finished_at": 1642734198519519000,
                        "complete": true
                      },
                      "bucket7": {
                        "estimated_total_items": 151553352,
                        "total_items": 616652,
                        "total_vbuckets": 1024,
                        "vbuckets_complete": 1024,
                        "bytes_received": 31219253,
                        "snapshot_markers_received": 1024,
                        "failover_logs_received": 1024,
                        "deletions_received": 616652,
                        "started_at": 1642731159534413800,
                        "finished_at": 1642731178354317800,
                        "complete": true
                      },
                      "bucket8": {
                        "estimated_total_items": 2039404,
                        "total_items": 43812,
                        "total_vbuckets": 1024,
                        "vbuckets_complete": 1024,
                        "bytes_received": 83504,
                        "snapshot_markers_received": 1024,
                        "failover_logs_received": 1024,
                        "started_at": 1642731585380897800,
                        "finished_at": 1642731603489330700,
                        "complete": true
                      },
                      "bucket9": {
                        "estimated_total_items": 2021984,
                        "total_items": 78588,
                        "total_vbuckets": 1024,
                        "vbuckets_complete": 1024,
                        "bytes_received": 83504,
                        "snapshot_markers_received": 1024,
                        "failover_logs_received": 1024,
                        "started_at": 1642732003977234200,
                        "finished_at": 1642732022603883800,
                        "complete": true
                      },
                      "default": {
                        "estimated_total_items": 836256538,
                        "total_items": 8248421,
                        "total_vbuckets": 1024,
                        "vbuckets_complete": 1024,
                        "bytes_received": 9425295144,
                        "snapshot_markers_received": 1024,
                        "failover_logs_received": 1024,
                        "mutations_received": 2564595,
                        "deletions_received": 5731040,
                        "started_at": 1642732422539028000,
                        "finished_at": 1642733529691775000,
                        "complete": true
                      }
                    }
                  },
                  "progress": 72.70256050508593,
                  "eta": "2022-01-20T20:05:40.992052589-08:00"
                }
              ],
              "progress": 72.70256050508593,
              "eta": "2022-01-20T20:05:40.992052589-08:00"
            },
            "error_code": 2
          }
        ],
        "error": "exit status 1",
        "error_code": 2,
        "type": "BACKUP",
        "show": true
      }
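
The task status above reports `"complete": true` and 1024 of 1024 vBuckets for every bucket, yet the run ends as `failed` at about 72.7% progress. A minimal sketch for summarizing such a status document per bucket follows. It assumes that `started_at` and `finished_at` are Unix timestamps in nanoseconds (an inference from their magnitude, not something the report states). `SAMPLE_STATUS` is an abbreviated copy holding two of the eight buckets, and `summarize_buckets` is a hypothetical helper, not part of any Couchbase tool.

```python
import json

# Abbreviated copy of the task status: two of the eight buckets, values verbatim.
SAMPLE_STATUS = """
{
  "task_name": "backup-1",
  "status": "failed",
  "node_runs": [
    {
      "status": "failed",
      "error": "exit status 1",
      "progress": 72.70256050508593,
      "stats": {
        "transfers": [
          {
            "stats": {
              "buckets": {
                "bucket4": {
                  "estimated_total_items": 15475156,
                  "total_items": 1750522,
                  "total_vbuckets": 1024,
                  "vbuckets_complete": 1024,
                  "started_at": 1642730566302167000,
                  "finished_at": 1642730763357324300,
                  "complete": true
                },
                "default": {
                  "estimated_total_items": 836256538,
                  "total_items": 8248421,
                  "total_vbuckets": 1024,
                  "vbuckets_complete": 1024,
                  "started_at": 1642732422539028000,
                  "finished_at": 1642733529691775000,
                  "complete": true
                }
              }
            }
          }
        ]
      }
    }
  ]
}
"""


def summarize_buckets(task):
    """Return one summary row per bucket (sorted by name within each transfer)."""
    rows = []
    for run in task.get("node_runs", []):
        for transfer in run.get("stats", {}).get("transfers", []):
            buckets = transfer.get("stats", {}).get("buckets", {})
            for name, stats in sorted(buckets.items()):
                # Assumption: started_at/finished_at are Unix nanoseconds.
                elapsed_ns = stats["finished_at"] - stats["started_at"]
                rows.append({
                    "bucket": name,
                    "duration_s": elapsed_ns / 1e9,
                    "vbuckets_done": stats["vbuckets_complete"] == stats["total_vbuckets"],
                    "complete": stats["complete"],
                })
    return rows


if __name__ == "__main__":
    task = json.loads(SAMPLE_STATUS)
    print(task["task_name"], task["status"])
    for row in summarize_buckets(task):
        print(f'{row["bucket"]}: {row["duration_s"]:.1f}s, '
              f'all vBuckets streamed={row["vbuckets_done"]}, complete={row["complete"]}')
```

Running the same helper over the full status lists all eight buckets, which makes it easier to see that the failure is not attributed to any single bucket in these stats.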
      

      From the CBM logs, we can see this error:

      2022-01-21T03:27:45.304-08:00 WARN: (DCP) (default) (vb 380) Received an unexpected error from the sink callback, beginning teardown: failed to close vBucket: failed to commit final chunk data to disk: failed to commit data store: failed to stop multipart upload worker: failed to upload part: Post "https://storage.googleapis.com/upload/storage/v1/b/longevity_testing/o?alt=json&name=centos1%2F3a44d445-dcbd-4c30-9172-ddd5504b702b%2F2022-01-21T02_21_46.160649561-08_00%2Fdefault-4d3954c5005e735fb976a075bd4947e0%2Fdata%2Fdata_380.rift.0-mpu-5fbfcb26-dfff-40d1-b876-7b2c59743cbc-75be403d-bf66-4adb-8ef0-5f9e4a9e4b2d&prettyPrint=false&projection=full&uploadType=multipart": http2: timeout awaiting response headers -- couchbase.(*DCPAsyncWorker).handleDCPError() at dcp_async_worker.go:620
      2022-01-21T03:27:45.315-08:00 (DCP) (default) (vb 347) Stream closed because all items were streamed | {"uuid":72275754736113,"snap_start":0,"snap_end":1044635,"last_seqno":1034972,"retries":0}
      2022-01-21T03:27:45.351-08:00 (DCP) (default) (vb 22) Stream closed because all items were streamed | {"uuid":102992160682906,"snap_start":0,"snap_end":1250627,"last_seqno":1228884,"retries":0}
      2022-01-21T03:27:47.185-08:00 (DCP) (default) (vb 360) Stream closed because all items were streamed | {"uuid":229627156948217,"snap_start":0,"snap_end":1250075,"last_seqno":1231698,"retries":0}
      2022-01-21T03:27:49.110-08:00 WARN: (DCP) (default) (vb 522) Received an unexpected error from the sink callback, beginning teardown: failed to close vBucket: failed to commit final chunk data to disk: failed to commit data store: failed to stop multipart upload worker: failed to upload part: Post "https://storage.googleapis.com/upload/storage/v1/b/longevity_testing/o?alt=json&name=centos1%2F3a44d445-dcbd-4c30-9172-ddd5504b702b%2F2022-01-21T02_21_46.160649561-08_00%2Fdefault-4d3954c5005e735fb976a075bd4947e0%2Fdata%2Fdata_522.rift.0-mpu-cffcd7a5-8899-451d-b59a-a04cc02e2546-1d47cb3e-2b2b-426b-a0fd-6e9b43bf9a76&prettyPrint=false&projection=full&uploadType=multipart": http2: timeout awaiting response headers -- couchbase.(*DCPAsyncWorker).handleDCPError() at dcp_async_worker.go:620
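
The WARN lines above carry the bucket, the vBucket ID, the percent-encoded object name, and the underlying HTTP error in one long message. A minimal sketch for pulling those fields out is shown below. The regular expression is derived only from the two WARN lines in this report, so it is an assumption about the log format rather than a documented schema, and `parse_sink_error` is a hypothetical helper name.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Pattern inferred from the WARN lines in this report (assumed, not a documented format).
WARN_RE = re.compile(
    r'^(?P<ts>\S+) WARN: \(DCP\) \((?P<bucket>[^)]+)\) \(vb (?P<vb>\d+)\) .*?'
    r'Post "(?P<url>[^"]+)": (?P<cause>.*?) -- '
)

# First WARN line from the report, copied verbatim.
SAMPLE_WARN = (
    '2022-01-21T03:27:45.304-08:00 WARN: (DCP) (default) (vb 380) Received an unexpected error '
    'from the sink callback, beginning teardown: failed to close vBucket: failed to commit final '
    'chunk data to disk: failed to commit data store: failed to stop multipart upload worker: '
    'failed to upload part: Post "https://storage.googleapis.com/upload/storage/v1/b/'
    'longevity_testing/o?alt=json&name=centos1%2F3a44d445-dcbd-4c30-9172-ddd5504b702b%2F'
    '2022-01-21T02_21_46.160649561-08_00%2Fdefault-4d3954c5005e735fb976a075bd4947e0%2Fdata%2F'
    'data_380.rift.0-mpu-5fbfcb26-dfff-40d1-b876-7b2c59743cbc-75be403d-bf66-4adb-8ef0-5f9e4a9e4b2d'
    '&prettyPrint=false&projection=full&uploadType=multipart": http2: timeout awaiting response '
    'headers -- couchbase.(*DCPAsyncWorker).handleDCPError() at dcp_async_worker.go:620'
)

# Informational line from the report; it should not match the WARN pattern.
SAMPLE_INFO = (
    '2022-01-21T03:27:45.315-08:00 (DCP) (default) (vb 347) Stream closed because all items '
    'were streamed | {"uuid":72275754736113,"snap_start":0,"snap_end":1044635,'
    '"last_seqno":1034972,"retries":0}'
)


def parse_sink_error(line):
    """Extract bucket, vBucket, decoded object name and cause from a WARN line, else None."""
    match = WARN_RE.match(line)
    if match is None:
        return None
    # parse_qs percent-decodes values, turning %2F back into "/".
    query = parse_qs(urlsplit(match.group("url")).query)
    return {
        "timestamp": match.group("ts"),
        "bucket": match.group("bucket"),
        "vbucket": int(match.group("vb")),
        "object_name": query.get("name", [""])[0],
        "cause": match.group("cause"),
    }


if __name__ == "__main__":
    for line in (SAMPLE_WARN, SAMPLE_INFO):
        print(parse_sink_error(line))
```

Grouping the parsed results by `cause` or `vbucket` across a full log gives a quick count of how many multipart uploads hit the same timeout.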
      

      Cluster config:

      ########## Cluster config ##################
      ######  n1ql : 2 ===== > [172.23.104.137:8091 172.23.99.11:8091]  ###########
      ######  index : 6 ===== > [172.23.104.155:8091 172.23.104.70:8091 172.23.120.245:8091 172.23.123.28:8091 172.23.96.251:8091 172.23.96.252:8091]  ###########
      ######  cbas : 4 ===== > [172.23.104.157:8091 172.23.105.168:8091 172.23.120.107:8091 172.23.97.239:8091]  ###########
      ######  fts : 2 ===== > [172.23.104.5:8091 172.23.105.111:8091]  ###########
      ######  eventing : 3 ===== > [172.23.104.67:8091 172.23.96.148:8091 172.23.97.122:8091]  ###########
      ######  backup : 1 ===== > [172.23.104.69:8091]  ###########
      ######  kv : 10 ===== > [172.23.105.107:8091 172.23.108.103:8091 172.23.121.117:8091 172.23.96.253:8091 172.23.97.119:8091 172.23.97.121:8091 172.23.98.135:8091 172.23.99.20:8091 172.23.99.21:8091 172.23.99.25:8091]  ###########
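
The cluster config block above can be turned into a service-to-nodes mapping for scripting. The sketch below assumes the line layout is exactly as printed in this report (`###### <service> : <count> ===== > [<nodes>] ###########`). `parse_cluster_config` is a hypothetical helper, and the sample holds only three of the seven service lines.

```python
import re

# Layout inferred from the cluster config block in this report (assumed format).
SERVICE_RE = re.compile(
    r'^#+\s+(?P<service>\w+) : (?P<count>\d+) =+ > \[(?P<nodes>[^\]]*)\]'
)

# Header plus three service lines copied verbatim from the report.
SAMPLE_CONFIG = """\
########## Cluster config ##################
######  n1ql : 2 ===== > [172.23.104.137:8091 172.23.99.11:8091]  ###########
######  fts : 2 ===== > [172.23.104.5:8091 172.23.105.111:8091]  ###########
######  backup : 1 ===== > [172.23.104.69:8091]  ###########
"""


def parse_cluster_config(text):
    """Map each service name to its node list, checking the declared node count."""
    services = {}
    for line in text.splitlines():
        match = SERVICE_RE.match(line.strip())
        if match is None:
            continue  # Header or unrelated line.
        nodes = match.group("nodes").split()
        if len(nodes) != int(match.group("count")):
            raise ValueError(f"node count mismatch for {match.group('service')}")
        services[match.group("service")] = nodes
    return services


if __name__ == "__main__":
    for service, nodes in parse_cluster_config(SAMPLE_CONFIG).items():
        print(service, nodes)
```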
      

      Attaching CBM logs

          People

            arunkumar Arunkumar Senthilnathan (Inactive)